Google Research has introduced the Titans architecture and the MIRAS framework for long-context AI in new research papers and an official blog post. The work focuses on sequence models that use explicit memory modules at inference time. According to Google, these approaches help models retain useful information across very long inputs in the reported experiments.
Google Titans and MIRAS - Key Details
Titans is a model family that augments short-range sequence processing with a dedicated long-term memory module. The memory module uses a "surprise" signal to decide what information to store as the model processes tokens. Titans also applies momentum and adaptive forgetting to maintain relevant details and clear outdated information over long spans.
Google describes this long-term memory as a deep neural network rather than a fixed-size summary vector. The architecture can attach to existing sequence models, extending their context handling without replacing core components. Titans updates its memory module at test time using gradient-based learning.
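The mechanism described above can be sketched in a few lines. The snippet below is an illustrative toy, not Google's implementation: it uses a simple linear matrix memory instead of the deep network the paper describes, and the function name and hyperparameter values are assumptions. The "surprise" is the gradient of a reconstruction loss on the current key-value pair, accumulated with momentum, while weight decay acts as adaptive forgetting.

```python
import numpy as np

def titans_memory_step(M, S, k, v, lr=0.1, momentum=0.9, decay=0.01):
    """One test-time update of a toy linear associative memory M.

    Illustrative sketch only: the paper's memory is a deep network,
    not a linear map, and these hyperparameters are placeholders.
    """
    err = M @ k - v                 # prediction error for this token
    grad = np.outer(err, k)        # "surprise": gradient of 0.5 * ||M k - v||^2
    S = momentum * S - lr * grad   # momentum over past surprise
    M = (1.0 - decay) * M + S      # weight decay = forgetting of stale content
    return M, S

# Toy usage: repeatedly present one key-value pair, then retrieve it.
rng = np.random.default_rng(0)
d = 8
M, S = np.zeros((d, d)), np.zeros((d, d))
k = rng.normal(size=d)
k /= np.linalg.norm(k)
v = rng.normal(size=d)
for _ in range(200):
    M, S = titans_memory_step(M, S, k, v)
retrieved = M @ k  # after many surprise-driven steps, M @ k approximates v
```

The point of the sketch is the shape of the update, gradient-based writing at inference time with momentum and decay, rather than any particular architecture.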
MIRAS is a general framework for designing sequence models as associative memory systems. It centers on four design choices: memory structure, attentional bias, stability and retention, and memory algorithm. Google Research uses MIRAS to interpret standard components such as forget gates and to construct new model variants.
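One way to picture the four MIRAS design choices is as swappable pieces of a single online update loop. The sketch below is a minimal linear illustration under assumed names and defaults, not the framework's actual API: real MIRAS instances use richer choices (deep memories, alternative loss functions), but the decomposition is the same, and a forget-gate-style decay shows up naturally as the retention term.

```python
import numpy as np

def l2_bias(M, k, v):
    """Attentional bias: which errors the memory cares about (here, squared error)."""
    err = M @ k - v
    return np.outer(err, k)  # gradient of 0.5 * ||M k - v||^2

def decay_retention(M, gamma=0.95):
    """Stability and retention: a forget-gate-style decay of old content."""
    return gamma * M

def sgd_rule(M, grad, lr=0.2):
    """Memory algorithm: how the memory is updated (here, a plain gradient step)."""
    return M - lr * grad

def miras_step(M, k, v, bias=l2_bias, retain=decay_retention, rule=sgd_rule):
    """One associative-memory update; the structure of M itself (a matrix
    here, but possibly a vector or deep network) is the fourth design choice."""
    return rule(retain(M), bias(M, k, v))

# Toy usage: write one key-value association into the memory.
rng = np.random.default_rng(1)
d = 6
M = np.zeros((d, d))
k = rng.normal(size=d)
k /= np.linalg.norm(k)
v = rng.normal(size=d)
for _ in range(100):
    M = miras_step(M, k, v)
out = M @ k  # retrieval points in the direction of v
```

Swapping any one function, a different loss for `bias`, a different regularizer for `retain`, a different optimizer for `rule`, yields a different model in the same family, which is the design-space view the framework takes.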
- Titans introduces a long-term memory module with surprise-based selection, momentum, and adaptive forgetting through weight decay.
- MIRAS defines memory structure, attentional objectives, stability mechanisms, and learning rules as configurable dimensions for sequence model design.
- Both projects target handling very long sequences without repeated full-context attention or heavy state compression.
- On its research blog, Google describes Titans and MIRAS together as a "significant advancement in sequence modeling".
Background Context
The research addresses limits faced by modern language models when they process long documents, conversations, or data streams. Many systems either maintain an attention window over earlier tokens or compress past content into a shorter internal summary. Both approaches face tradeoffs between detail preservation and computational cost as context length grows.
Titans and MIRAS treat memory as an actively managed component rather than a fixed architectural side effect. The Titans paper reports improved long-context task performance over baseline Transformers and linear recurrent models. According to the authors, Titans scales to context windows larger than two million tokens with higher accuracy than these baselines.
The MIRAS paper presents three new sequence models built using this framework and evaluates them on multiple downstream tasks. The authors report that all tested MIRAS variants outperform Transformers and linear RNNs in their experiments. The Titans experiments also show higher retrieval accuracy on the BABILong long-context benchmark than several larger baseline models, including GPT-4.
Source Citations
Public documentation provides full methodological details, experimental setups, and reported results.
- Titans research paper: Titans paper (PDF)
- MIRAS framework paper: MIRAS paper
- Official Google Research summary: blog post