Google unveils Nested Learning at NeurIPS 2025 to curb forgetting: why Hope tops transformers in tests

Reviewed by Andrii Daniv · 1 min read · Nov 7, 2025

On November 7, 2025, Google Research introduced Nested Learning, an approach to continual learning, and unveiled Hope, a self-modifying recurrent architecture with a continuum memory system. The post, published on the Google Research blog, credits Ali Behrouz and Vahab Mirrokni.

Key details

  • Nested Learning frames a model as nested optimization problems that update at different frequencies (see the sketch after this list).
  • The paper is titled "Nested Learning: The Illusion of Deep Learning Architectures" and is listed at NeurIPS 2025.
  • Hope is a self-modifying recurrent architecture with a continuum memory system whose modules update at different rates to manage context.
  • The team proposes new deep optimizers by recasting momentum as an L2 regression objective.
  • Evaluations span language modeling, long-context reasoning, continual learning, and knowledge incorporation tasks.
  • According to the post, Hope delivered lower perplexity and higher accuracy than modern recurrent models and standard transformers.
  • The post also reports improved long-context performance on Needle-In-A-Haystack tasks compared with TTT and Mamba2.
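
To make the multi-frequency idea from the first bullet concrete, here is a minimal toy sketch: a fast component updates on every step while a slow component accumulates gradients and updates only every k steps. The variable names, the toy objective, and the two-level split are illustrative assumptions, not the paper's actual components.

```python
import numpy as np

# Minimal sketch of nested optimization levels updating at different
# frequencies. Everything here is a toy stand-in; the paper's levels
# are full architectural and optimizer components, not two vectors.

rng = np.random.default_rng(0)
fast_w = rng.normal(size=4)   # inner level: updated every step
slow_w = rng.normal(size=4)   # outer level: updated every k steps

FAST_LR, SLOW_LR, SLOW_PERIOD = 0.1, 0.01, 8
slow_grad_accum = np.zeros_like(slow_w)

for step in range(1, 65):
    x = rng.normal(size=4)        # toy input
    target = x.sum()              # toy regression target
    pred = fast_w @ x + slow_w @ x
    err = pred - target

    fast_w -= FAST_LR * err * x   # fast level: update on every step
    slow_grad_accum += err * x    # slow level: accumulate gradients only

    if step % SLOW_PERIOD == 0:   # slow level: update every k-th step
        slow_w -= SLOW_LR * slow_grad_accum / SLOW_PERIOD
        slow_grad_accum[:] = 0.0
```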

Background and framing

The work targets catastrophic forgetting, where new learning degrades performance on prior tasks. Nested Learning aims to mitigate this by unifying architectural and optimization levels and ordering internal components by update frequency.

The post models training via backpropagation as associative memory and extends that view to the attention mechanism in transformers and to optimizer states. This framing is used to derive deep optimizers from standard loss objectives, as sketched below.
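
One simple way to read the momentum-as-regression claim (our own illustrative simplification, not necessarily the paper's exact derivation): treat the momentum state m as the minimizer of a per-step L2 regression onto the current gradient, and take a single gradient step on that inner objective.

```latex
% Inner L2 regression objective for the momentum state m,
% with g_t the current gradient (illustrative simplification):
\ell_t(m) = \tfrac{1}{2}\,\lVert m - g_t \rVert_2^2,
\qquad g_t = \nabla_\theta \mathcal{L}(\theta_t)

% One gradient step on \ell_t from the previous state m_{t-1}
% with step size \eta recovers exponential-moving-average momentum:
m_t = m_{t-1} - \eta\,(m_{t-1} - g_t)
    = (1-\eta)\,m_{t-1} + \eta\,g_t,
\qquad \text{i.e. decay } \beta = 1 - \eta
```

Under this reading, richer inner objectives would yield richer optimizer updates by the same recipe, which appears to be the sense in which the post speaks of deriving deep optimizers.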

Hope architecture at a glance

Hope builds on Titans-style memory prioritization and a self-referential process for self-modifying behavior. It introduces continuum memory blocks to support extended context windows and is presented as a proof-of-concept system.
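
As a loose illustration of memory modules operating at different rates (the names, sizes, and update rules below are hypothetical; Hope's actual continuum memory blocks are learned components), this sketch writes a fast memory on every token and consolidates it into a slow memory every few tokens.

```python
import numpy as np

# Sketch of a continuum memory system: a fast memory written on every
# token and a slow memory that absorbs a summary of the fast one at a
# lower rate. All details here are assumptions for illustration only.

rng = np.random.default_rng(1)
DIM, SLOW_PERIOD = 8, 4
fast_mem = np.zeros(DIM)   # high-frequency, short-lived context
slow_mem = np.zeros(DIM)   # low-frequency, long-lived context

def read(query: np.ndarray) -> float:
    """Combine both memory timescales when answering a query."""
    return float(query @ fast_mem + query @ slow_mem)

for t, token in enumerate(rng.normal(size=(16, DIM)), start=1):
    fast_mem = 0.5 * fast_mem + 0.5 * token       # write every token
    if t % SLOW_PERIOD == 0:                      # consolidate every k tokens
        slow_mem = 0.9 * slow_mem + 0.1 * fast_mem

print(read(rng.normal(size=DIM)))
```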

Publication and contributors

Credited contributors include Ali Behrouz, Meisam Razaviyayn, Peilin Zhong, and Vahab Mirrokni. The acknowledgements list Praneeth Kacham and Corinna Cortes for reviews, with additional thanks to Yuan Deng and Zeman Li.

Source: Google Research announcement and the NeurIPS 2025 listing.

Author
Etavrian AI
Etavrian AI is developed by Andrii Daniv to produce and optimize content for the etavrian.com website.
Reviewed by
Andrii Daniv
Andrii Daniv is the founder and owner of Etavrian, a performance-driven agency specializing in PPC and SEO services for B2B and e-commerce businesses.