Google Research announced StreetReaderAI, a context-aware Street View accessibility prototype, on October 29, 2025. The work is detailed in a blog post and an accompanying arXiv paper, and was presented at UIST’25.
What is StreetReaderAI?
StreetReaderAI is a proof-of-concept system that generates context-aware descriptions of Street View scenes and answers questions in real time for blind and low-vision users. It combines multimodal AI with accessible navigation controls.
- The prototype includes two agents: AI Describer and AI Chat, both powered by Gemini.
- AI Describer produces audio descriptions using Street View imagery and geographic data.
- AI Chat answers questions about the current panorama, past views, and nearby places.
- Gemini Live supports real-time, interactive conversation about scenes and local features.
- The agent uses Google's Multimodal Live API, function calling, and session memory (see the sketch after this list).
- The context window is set to 1,048,576 input tokens.
- Users can navigate by voice or keyboard, pan and move, and jump between locations. The interface announces cardinal headings and whether forward movement is available.
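The bullets above describe the plumbing without showing it. As a rough illustration only, the sketch below wires up a Gemini Live session with function calling using the google-genai Python SDK; the model id and the pan/move_forward tool declarations are placeholders of ours, not StreetReaderAI's actual functions or code.

```python
import asyncio

from google import genai
from google.genai import types

client = genai.Client()  # assumes GEMINI_API_KEY is set in the environment

# Hypothetical navigation tools; StreetReaderAI's real function set is not public.
nav_tool = types.Tool(function_declarations=[
    types.FunctionDeclaration(
        name="pan",
        description="Rotate the current panorama by the given number of degrees.",
        parameters=types.Schema(
            type=types.Type.OBJECT,
            properties={"degrees": types.Schema(type=types.Type.NUMBER)},
            required=["degrees"],
        ),
    ),
    types.FunctionDeclaration(
        name="move_forward",
        description="Step to the next panorama along the current heading.",
    ),
])

config = types.LiveConnectConfig(
    response_modalities=["TEXT"],  # StreetReaderAI speaks; TEXT keeps the sketch simple
    tools=[nav_tool],
)

async def main():
    async with client.aio.live.connect(
        model="gemini-2.0-flash-live-001",  # placeholder Live-capable model id
        config=config,
    ) as session:
        # One user turn; a real client would also stream panorama frames and geo context.
        await session.send_client_content(
            turns=types.Content(
                role="user",
                parts=[types.Part(text="Is there a crosswalk ahead of me?")],
            )
        )
        async for message in session.receive():
            if message.text:
                print(message.text, end="")
            if message.tool_call:
                # The model asked the client to run a navigation function;
                # a real app would execute it against the Street View UI.
                for call in message.tool_call.function_calls:
                    print(f"\n[requested {call.name}({call.args})]")

asyncio.run(main())
```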
Project materials: Google Research announcement and the paper StreetReaderAI: Making Street View Accessible Using Context-Aware Multimodal AI.
Study results
Researchers ran an in-person lab study with eleven blind screen reader users. Participants visited more than 350 panoramas and made over 1,000 AI requests.
- AI Chat was used six times more often than AI Describer.
- Participants rated overall usefulness at a mean of 6.4 on a 1-7 Likert scale (median 7, standard deviation 0.9).
- Researchers analyzed 917 AI Chat interactions across 23 question types. Top categories: orientation (27.0 percent), object existence (26.5 percent), general description (18.4 percent), and location (14.9 percent).
- Of 816 AI Chat questions assessed for accuracy, 86.3 percent were answered correctly; 3.9 percent were incorrect, 3.2 percent partially correct, and 6.6 percent were refusals.
- Among incorrect answers, 62.5 percent were false negatives and 37.5 percent were misidentifications.
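For scale, the reported percentages can be converted back into approximate answer counts; the tally below is our own back-of-the-envelope arithmetic, not figures published in the paper.

```python
# Implied counts out of the 816 assessed AI Chat questions (rounded).
total = 816
rates = {"correct": 0.863, "incorrect": 0.039, "partial": 0.032, "refusal": 0.066}
counts = {k: round(total * r) for k, r in rates.items()}
print(counts)                # {'correct': 704, 'incorrect': 32, 'partial': 26, 'refusal': 54}
print(sum(counts.values()))  # 816 -- the four categories tile the full set

# Of the ~32 incorrect answers: 62.5% false negatives, 37.5% misidentifications.
print(round(32 * 0.625), round(32 * 0.375))  # 20 12
```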
Why it matters
Interactive street view tools are standard in major mapping services, but the image-based scenes are opaque to screen readers and carry no alt text. Google reports more than 220 billion Street View images across more than 110 countries and territories, creating a large opportunity for accessible exploration.
StreetReaderAI was designed by blind and sighted accessibility researchers and builds on prior work such as Shades of Doom, BlindSquare, and Soundscape. Authors credited in the study include Jon E. Froehlich, Alexander J. Fiannaca, Nimer Jaber, Victor Tsaran, Shaun K. Kane, and Philip Nelson. The project is also listed on the Google Research publication page.
See it in action
Demo videos on YouTube highlight AI-generated scene descriptions and real-time conversations about local features, including a recorded participant session. A compilation of diverse user questions and responses is also available.