Software engineer Wilson Lin released a demo search engine aimed at reducing SEO spam. You can try the demo here and read the full technical write-up here.

What he built
Over two months, Lin built a prototype that retrieves results using neural embeddings, chunking pages at the sentence level for precision. He trained a DistilBERT classifier to identify which preceding sentences a given sentence depends on, so that retrieved chunks include the context needed to make sense of them.
"I would follow the 'chain' backwards to ensure all dependents were also provided in context."
Main content extraction focused on HTML tags such as blockquote, dl, ol, p, pre, table, and ul.
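As a rough illustration of extraction keyed on those tags, the sketch below uses BeautifulSoup; the write-up does not say which parser or library was used, and the boilerplate-stripping here is deliberately simplified.

from bs4 import BeautifulSoup

CONTENT_TAGS = ["blockquote", "dl", "ol", "p", "pre", "table", "ul"]

def extract_main_content(html: str) -> list[str]:
    soup = BeautifulSoup(html, "html.parser")
    # Drop obvious non-content elements before collecting text.
    for tag in soup(["script", "style", "nav", "header", "footer"]):
        tag.decompose()
    blocks = []
    # Note: this naive version can emit nested tags (e.g. a p inside a
    # blockquote) twice; a real extractor would deduplicate.
    for element in soup.find_all(CONTENT_TAGS):
        text = element.get_text(" ", strip=True)
        if text:
            blocks.append(text)
    return blocks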
Crawling and canonicalization
The crawler fetched only HTTPS URLs with valid eTLDs and hostnames, and disallowed ports, usernames, and passwords in URLs. Canonicalization decoded and re-encoded URL components, normalized query parameters, and lowercased origins. Lin noted that DNS failures, very long URLs, and unusual characters caused downstream issues.
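The sketch below shows canonicalization along those lines with Python's urllib.parse: HTTPS only, no userinfo or port, lowercased origin, decoded-then-re-encoded path, and sorted query parameters. Lin's actual rules (including proper eTLD validation and parameter stripping) are more involved and are not reproduced here.

from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode, unquote, quote

def canonicalize(url: str) -> str | None:
    parts = urlsplit(url)
    if parts.scheme.lower() != "https":
        return None                      # HTTPS only
    if parts.port is not None or parts.username or parts.password:
        return None                      # disallow ports and userinfo
    host = (parts.hostname or "").lower()
    if "." not in host:
        return None                      # crude stand-in for eTLD validation
    # Decode then re-encode the path so equivalent escapings compare equal.
    path = quote(unquote(parts.path or "/"), safe="/")
    # Normalize query parameters by sorting key/value pairs; fragment dropped.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit(("https", host, path, query, ""))

print(canonicalize("HTTPS://Example.COM/a%20b/?b=2&a=1"))
# -> https://example.com/a%20b/?a=1&b=2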
Infrastructure and scale
Lin initially ran the infrastructure on Oracle Cloud, citing its 10 TB of free egress per month. As the system scaled, he moved from PostgreSQL to 64 RocksDB shards. At peak, ingestion reached about 200,000 writes per second across thousands of clients. Each page stored raw HTML, normalized data, contextual chunks, hundreds of embeddings, and metadata.
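Spreading writes across 64 shards implies a deterministic key-to-shard mapping so every writer and reader agrees on where a page's data lives. The routing sketch below is illustrative; the write-up states only that 64 RocksDB shards were used, not how keys were hashed.

import hashlib

NUM_SHARDS = 64

def shard_for(url: str) -> int:
    """Map a canonical URL to a shard deterministically (assumed scheme)."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

print(shard_for("https://example.com/post"))  # same URL -> same shard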
Embedding generation began with OpenAI’s API and later shifted to self-hosted inference on Runpod GPUs, including RTX 4090 instances. Lin said Runpod offered lower per-hour rates than AWS and Lambda, along with more stable networking.
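Since each page produces hundreds of chunk-level embeddings, batching the embedding calls is the practical concern. The sketch below uses the current OpenAI Python client as the prototype initially did; the model name, batch size, and client version are assumptions, and the later self-hosted GPU path is not shown.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed_chunks(chunks: list[str], batch_size: int = 128) -> list[list[float]]:
    """Embed text chunks in batches rather than one request per chunk."""
    vectors: list[list[float]] = []
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        response = client.embeddings.create(
            model="text-embedding-3-small",  # assumed model, for illustration
            input=batch,
        )
        vectors.extend(item.embedding for item in response.data)
    return vectors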
Results
In tests such as "best programming blogs" and paragraph-length queries, Lin reported fewer spammy results compared with typical engines.
Takeaways
Lin’s key lessons include the importance of index coverage for result quality and the difficulty of crawling and filtering at scale. He noted coverage gaps as a constraint for independent engines and highlighted the challenge of automatically assessing trust, originality, and accuracy. In a future iteration, he would prioritize evaluation methods earlier. The system’s architecture evolved as scale increased, moving from a managed PostgreSQL database to sharded RocksDB and self-hosted GPU inference.