In a recent episode of Google's "Search Off the Record" podcast, Google Search representatives Gary Illyes and Martin Splitt outlined how Googlebot processes HTML during crawling. They explained why browser resource hints do not influence Googlebot, why key metadata must stay in the head, and why HTML validity is not used as a ranking factor.
How Googlebot Handles Resource Hints
Illyes said Googlebot ignores browser-focused resource hints such as dns-prefetch, preconnect, prefetch, and preload when it crawls pages. These hints exist to mitigate network latency in user browsers, a constraint Google's internal systems typically do not face.
- Google's DNS resolution is already fast, so additional dns-prefetch directives are unnecessary for crawling.
- Googlebot caches page resources separately instead of fetching all assets in real time like a typical browser.
- This caching strategy reduces bandwidth use and server load for sites that are crawled.
Illyes added that Googlebot processes pages asynchronously, so it does not rely on preload hints when deciding what to fetch next. He contrasted this with Chrome's Speculation Rules API, which can prefetch results in the browser to speed up user clicks.
Both Illyes and Splitt stressed that resource hints still matter for users, because they can improve perceived page speed in browsers. These optimizations affect user experience, not how Googlebot crawls or indexes content.
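To make the distinction concrete, the sketch below scans a hypothetical head for the four hint types Illyes named. The sample markup, URLs, and the HintFinder class are illustrative only; the point is that these link elements are directives for browsers, which Googlebot skips during crawling.

```python
from html.parser import HTMLParser

# Hypothetical sample head containing the hint types Illyes mentioned.
# Browsers act on these to hide network latency; per the episode,
# Googlebot ignores all of them while crawling.
SAMPLE_HEAD = """
<head>
  <link rel="dns-prefetch" href="//cdn.example.com">
  <link rel="preconnect" href="https://fonts.example.com">
  <link rel="prefetch" href="/next-page.html">
  <link rel="preload" href="/styles/main.css" as="style">
  <link rel="stylesheet" href="/styles/main.css">
</head>
"""

RESOURCE_HINTS = {"dns-prefetch", "preconnect", "prefetch", "preload"}

class HintFinder(HTMLParser):
    """Collect <link> elements whose rel value is a resource hint."""
    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") in RESOURCE_HINTS:
            self.hints.append((a["rel"], a.get("href")))

    # Treat self-closing <link/> the same as <link>.
    def handle_startendtag(self, tag, attrs):
        self.handle_starttag(tag, attrs)

finder = HintFinder()
finder.feed(SAMPLE_HEAD)
for rel, href in finder.hints:
    # Hints a browser would act on; Googlebot skips them.
    print(f"{rel}: {href}")
```

Note that the ordinary stylesheet link is not collected: it is a real resource reference, not a hint, and is handled through Googlebot's separate resource caching described above.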
Metadata Placement And HTML Validity
Splitt described a case where a standards-compliant script tag in the head injected an iframe, causing the browser to close the head section early. As a result, hreflang link elements ended up in the body, where Google's systems ignored them.
Illyes said Google follows the HTML living standard, which restricts meta name="robots" and rel="canonical" elements to the head. Accepting those directives in the body could allow attackers to inject markup that changes canonicalization or robots behavior.
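The head-closing behavior Splitt described can be sketched with a simplified audit. This is not Google's parser; it is a minimal model, with hypothetical markup, of the rule that an element not allowed in the head (such as an injected iframe) implicitly closes it, so that later hreflang or canonical link elements land in the body and are ignored.

```python
from html.parser import HTMLParser

# Elements the HTML standard allows inside <head>; anything else
# implicitly closes the head, as browsers do (simplified model).
HEAD_ALLOWED = {"title", "base", "link", "meta", "style",
                "script", "noscript", "template"}

class HeadAudit(HTMLParser):
    """Flag head-only metadata that ends up outside the head after
    the head was closed early (e.g. by a script-injected iframe)."""
    def __init__(self):
        super().__init__()
        self.in_head = False
        self.misplaced = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
            return
        if self.in_head and tag not in HEAD_ALLOWED:
            self.in_head = False  # browser closes <head> implicitly
        if not self.in_head and tag in ("link", "meta"):
            a = dict(attrs)
            if a.get("rel") in ("canonical", "alternate") or a.get("name") == "robots":
                self.misplaced.append((tag, a))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False

# Hypothetical page: an <iframe> appears before the hreflang
# and canonical links, closing the head early.
doc = """
<html><head>
  <title>Example</title>
  <iframe src="/widget"></iframe>
  <link rel="alternate" hreflang="de" href="https://example.com/de/">
  <link rel="canonical" href="https://example.com/">
</head><body></body></html>
"""

audit = HeadAudit()
audit.feed(doc)
for tag, attrs in audit.misplaced:
    print(f"ignored (outside head): <{tag}> {attrs}")
```

Running the audit reports both link elements as misplaced, matching the case Splitt described: the markup looks fine in source view, but the parser has already moved those elements into the body.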
He also reiterated previous guidance that site owners should use full URLs in canonical tags to avoid parser ambiguity. That advice aligns with Google's preference for clearly placed, unambiguous metadata in the head.
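The ambiguity of relative canonical values can be seen with the standard library's URL resolution. The URLs below are hypothetical; the same relative value resolves to different addresses depending on the page URL, while a full URL leaves nothing for the parser to resolve.

```python
from urllib.parse import urljoin

# The same relative canonical value resolves differently depending
# on whether the page URL ends with a trailing slash.
print(urljoin("https://example.com/blog/post", "post"))
# -> https://example.com/blog/post
print(urljoin("https://example.com/blog/post/", "post"))
# -> https://example.com/blog/post/post

# A full URL removes the ambiguity entirely.
print(urljoin("https://example.com/blog/post/", "https://example.com/blog/post"))
# -> https://example.com/blog/post
```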
Illyes confirmed that HTML validity is not used as a ranking signal in Google Search. Validators typically treat validity as a binary pass or fail property, which he said limits its value for ranking decisions. A missing closing span tag, for example, can make HTML invalid without changing what users see.
Splitt added that heading hierarchy and HTML5 layout elements help accessibility and general structure but contribute little direct weight as ranking signals in Google Search.
Illyes' comments on separate caching of page resources are consistent with Google's published crawler guidance, which encourages using ETag headers to cut down on unnecessary crawling. ETags allow Googlebot to avoid redundant downloads while still checking content freshness.
The discussion appears in the episode titled "How browsers really parse HTML and what that means for SEO" on Google's official "Search Off the Record" podcast, available on the Google Search Central YouTube channel and on Apple Podcasts.