The Codex Blog

Commentary and research notes on the frontier of applied AI.

Reviewing GraphMERT

[Figure: GraphMERT arXiv teaser]

In the relentless race toward ever-larger Large Language Models, we've come to equate scale with capability. But what if, for a critical class of enterprise problems, a smaller, more specialized tool isn't just better — it's in a different league entirely?

A recent paper from Princeton University, "GraphMERT: Efficient and Scalable Distillation of Reliable Knowledge Graphs from Unstructured Data," delivers a quiet bombshell. It introduces an ~80M-parameter, encoder-only model that distills reliable, domain-specific knowledge graphs (KGs) from unstructured text.