Cloudflare removed Perplexity AI’s crawler from its Verified Bots programme and began blocking its traffic on 26 June 2024 after the security firm found repeated violations of robots.txt rules and attempts to disguise the crawler’s identity.
Cloudflare blocks Perplexity crawler
According to Cloudflare, the crawler rotated IP addresses, switched between Autonomous System Numbers (ASNs), and sent requests from an undisclosed network while posing as a standard Chrome browser. Cloudflare said these actions broke its Verified Bots requirements, which mandate declared IP ranges and strict adherence to robots.txt.
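To illustrate why browser impersonation matters for robots.txt, the sketch below uses Python's standard `urllib.robotparser` to evaluate a hypothetical robots.txt that disallows `PerplexityBot`. The file contents and the `example.com` URL are assumptions made for illustration and are not taken from Cloudflare's report. A rule addressed to a named bot only applies when the crawler announces that name, so a request presenting the generic Chrome string falls through to the wildcard rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; not taken from any real site.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

# The generic Chrome user agent string quoted in the article.
GENERIC_CHROME_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://example.com/article"  # placeholder URL

# The declared bot name matches the first rule group and is disallowed.
declared_allowed = parser.can_fetch("PerplexityBot", url)

# The generic browser string matches no named group, so the "*" group applies.
generic_allowed = parser.can_fetch(GENERIC_CHROME_UA, url)

print("PerplexityBot allowed:", declared_allowed)
print("Generic Chrome UA allowed:", generic_allowed)
```

In this sketch the site owner's opt-out only takes effect when the crawler identifies itself honestly, which is why robots.txt relies on declared user agents.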
Key details from Cloudflare’s report
- Blog post date: 26 June 2024
- Undeclared user agent observed: “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36”
- Official identifiers: “PerplexityBot” and “Perplexity-User”
- Verified Bots participants must publish IP ranges and respect robots.txt
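The difference between the declared identifiers and the undeclared string can be shown with a simple token check. This is a minimal sketch and not Cloudflare's detection logic: the function name `classify_user_agent` and the sample `PerplexityBot/1.0` string are hypothetical. It also shows the limit of user agent checks, since a crawler sending the Chrome string is indistinguishable from an ordinary browser on this signal alone.

```python
# Official identifiers listed in the article.
DECLARED_TOKENS = ("PerplexityBot", "Perplexity-User")

# The undeclared user agent string quoted in the article.
UNDECLARED_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def classify_user_agent(user_agent: str) -> str:
    """Return "declared" if the string contains an official token.

    Hypothetical helper: a case-insensitive substring match only.
    """
    lowered = user_agent.lower()
    for token in DECLARED_TOKENS:
        if token.lower() in lowered:
            return "declared"
    return "undeclared"


# Hypothetical sample string carrying a declared token.
print(classify_user_agent("PerplexityBot/1.0"))
# The generic Chrome string carries no declared token.
print(classify_user_agent(UNDECLARED_UA))
```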
Stealth crawling methods identified
Cloudflare’s telemetry showed the crawler switching Autonomous System Numbers after each block, a tactic commonly used to evade network filters. Investigators also logged browser impersonation designed to bypass user agent checks, which Cloudflare said conflicts with the trust standards required for Verified Bots.
Perplexity’s response
In a statement dated 27 June 2024, Perplexity said the traffic originated from “user-driven AI assistants” rather than a rogue scraper. The company argued that Cloudflare cannot reliably differentiate its legitimate automated requests from malicious traffic, claiming the block harms end users who rely on its service.
How Cloudflare’s Verified Bots programme works
Introduced in 2022, the Verified Bots list allows trusted crawlers, such as those operated by Google, Bing, and LinkedIn, to access Cloudflare-protected sites without triggering security rules. To remain on the list, participants must:
- Register and publish their IP ranges
- Respect crawl delays and bandwidth limits
- Comply fully with robots.txt directives
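The published IP range requirement is what lets a site confirm that a request claiming a bot's name really comes from that operator. The following sketch uses Python's standard `ipaddress` module. It is an illustration under stated assumptions, not Cloudflare's implementation: the ranges are reserved documentation blocks from RFC 5737, and `ExampleBot`, `is_from_published_range`, and `looks_verified` are hypothetical names.

```python
import ipaddress

# Hypothetical published ranges (RFC 5737 documentation blocks, not real bot ranges).
PUBLISHED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_from_published_range(ip: str) -> bool:
    """Check whether an IP address falls inside any published range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in PUBLISHED_RANGES)


def looks_verified(user_agent: str, ip: str, token: str = "ExampleBot") -> bool:
    """Require both the declared token and a source IP in a published range."""
    return token.lower() in user_agent.lower() and is_from_published_range(ip)


# Declared name from a published range: passes both checks.
print(looks_verified("ExampleBot/1.0", "192.0.2.15"))
# Declared name from an unlisted address: fails the IP check.
print(looks_verified("ExampleBot/1.0", "203.0.113.9"))
# Generic browser string from a published range: fails the name check.
print(looks_verified("Mozilla/5.0 Chrome/124.0.0.0", "192.0.2.15"))
```

Pairing the two checks means a spoofed name from an unlisted address fails, and so does traffic from a listed address that does not declare itself.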
Removal from the list triggers automatic blocking for customers using Cloudflare’s default bot management settings.