Google Research introduced PASTA, a reinforcement learning agent for text-to-image generation, on October 2, 2025. The announcement, authored by Guy Tennenholtz and Craig Boutilier, was published on the Google Research blog. PASTA iteratively refines images over multiple user interactions by combining real human ratings with large-scale simulated feedback.
What is PASTA
PASTA (Preference Adaptive and Sequential Text-to-image Agent) reframes image generation as a multi-turn selection task that adapts to user preferences. In each turn, the agent selects four prompt expansions, presents a slate of candidate images, observes the user’s choice, and uses that choice to build its next slate.
The system uses Gemini Flash to produce prompt expansions and Stable Diffusion XL to render images.
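The turn-by-turn loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PASTA implementation: `expand_prompt`, `render`, `select_slate`, `run_session`, and the `STYLES` pool are hypothetical names, the expander and renderer are string stubs standing in for Gemini Flash and Stable Diffusion XL, and `score` stands in for whatever value estimate the trained agent uses.

```python
from typing import Callable, List

SLATE_SIZE = 4  # the article states four prompt expansions per turn

# Hypothetical style pool; a real system would obtain expansions from an LLM.
STYLES = ["watercolor", "photorealistic", "line art", "oil painting",
          "pixel art", "isometric 3D", "charcoal sketch", "pastel"]


def expand_prompt(prompt: str) -> List[str]:
    """Stand-in for the LLM prompt expander (hypothetical helper)."""
    return [f"{prompt}, {style}" for style in STYLES]


def render(expansion: str) -> str:
    """Stand-in for the image model; returns a placeholder string."""
    return f"<image: {expansion}>"


def select_slate(candidates: List[str],
                 history: List[str],
                 score: Callable[[List[str], str], float]) -> List[str]:
    """Keep the SLATE_SIZE candidates with the highest score given the history."""
    ranked = sorted(candidates, key=lambda c: score(history, c), reverse=True)
    return ranked[:SLATE_SIZE]


def run_session(prompt: str,
                choose: Callable[[List[str]], int],
                score: Callable[[List[str], str], float],
                turns: int = 3) -> List[str]:
    """Run a multi-turn session and return the expansion picked at each turn."""
    history: List[str] = []
    current = prompt
    for _ in range(turns):
        slate = select_slate(expand_prompt(current), history, score)
        images = [render(expansion) for expansion in slate]
        picked = choose(images)        # index of the image the user selects
        history.append(slate[picked])  # record the choice
        current = slate[picked]        # the next turn builds on the pick
    return history
```

Here `choose` can be a real user interface callback or a simulated user, and `score` is the only place where learning would enter: swapping a trained value function in for a toy heuristic changes which four expansions are shown without changing the loop.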
Training and Data
- Data collection: Over 7,000 human rater interactions and more than 30,000 simulated interaction trajectories.
- User simulator: Utility and choice models built on pre-trained CLIP encoders, with latent user types discovered via expectation-maximization.
- Learning setup: Value-based reinforcement learning trained with implicit Q-learning.
- Interaction design: The agent proposes a slate of four prompt expansions per turn.
- Open data: Sequential rater data and simulated trajectories are available on Kaggle (Foundational dataset).
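For readers unfamiliar with implicit Q-learning (IQL), its core is an asymmetric squared loss (expectile regression) for the value function, combined with a standard temporal-difference target for the Q-function. The sketch below shows those two pieces in plain Python. The function names and the values `tau=0.7` and `gamma=0.99` are illustrative assumptions, not settings reported for PASTA, and a real implementation would apply these losses to neural networks over batches of logged interactions.

```python
from typing import Sequence


def expectile_loss(diff: float, tau: float = 0.7) -> float:
    """Asymmetric squared loss |tau - 1(diff < 0)| * diff**2.

    With tau > 0.5, cases where Q exceeds V (diff > 0) are penalized more,
    which pushes V toward an upper expectile of the Q-values.
    """
    weight = tau if diff > 0 else 1.0 - tau
    return weight * diff * diff


def value_loss(q_values: Sequence[float],
               v_values: Sequence[float],
               tau: float = 0.7) -> float:
    """Mean expectile loss between Q(s, a) and V(s) over a batch."""
    losses = [expectile_loss(q - v, tau) for q, v in zip(q_values, v_values)]
    return sum(losses) / len(losses)


def td_target(reward: float,
              next_value: float,
              gamma: float = 0.99,
              done: bool = False) -> float:
    """Regression target for Q(s, a): r + gamma * V(s'), or r at episode end."""
    return reward if done else reward + gamma * next_value
```

Because the Q-target uses V(s') rather than a maximum over actions, the update never queries actions absent from the logged data, which is what makes this family of methods suitable for learning from a fixed set of rater and simulated trajectories.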
Evaluation and Results
PASTA was evaluated on preference prediction and ranking tasks, using four metrics: Pick-a-Pic accuracy, Spearman's rank correlation, choice accuracy, and cross-turn accuracy. Tests also drew on public preference datasets such as the HPS test set.
- Training with both real and simulated interactions outperformed the baseline pipeline, which received no additional training.
- In head-to-head comparisons, 85% of human raters preferred PASTA’s final images over the baseline.
- Baseline configuration: Gemini Flash for prompt expansion and SDXL for image creation.
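Two of the metrics named in this section, Spearman's rank correlation and choice accuracy, are easy to compute by hand. The sketch below is a generic, dependency-free illustration rather than the evaluation code used for PASTA. The function names are hypothetical, and treating the predicted choice as the slate item with the highest predicted utility is an assumption about how choice accuracy is scored.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (sx * sy)


def choice_accuracy(predicted_utilities: Sequence[Sequence[float]],
                    chosen_indices: Sequence[int]) -> float:
    """Fraction of slates where the top-utility item matches the user's pick."""
    hits = 0
    for utilities, chosen in zip(predicted_utilities, chosen_indices):
        predicted = max(range(len(utilities)), key=lambda i: utilities[i])
        hits += int(predicted == chosen)
    return hits / len(chosen_indices)
```

In this framing, `spearman` would compare a model's predicted scores against human preference scores over the same images, while `choice_accuracy` would compare the model's top-ranked item in each four-image slate against the item the rater actually selected.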
Why it matters
The project targets limitations of single-prompt generation by learning from sequential user choices. Treating image creation as a multi-turn process aims to better align outputs with evolving user preferences.
Availability and Sources
- Announcement: Google Research blog (October 2, 2025)
- Paper: PASTA - Preference Adaptive and Sequential Text-to-image Agent
- Data release: PASTA datasets on Kaggle
Contributors
Ofir Nabati, Guy Tennenholtz, ChihWei Hsu, Moonkyung Ryu, Deepak Ramachandran, Yinlam Chow, Xiang Li, and Craig Boutilier.