The Bridge, Not the Brain

In 2016, AlphaGo beat Lee Sedol. It combined Monte Carlo tree search with deep neural networks: a mathematical engine that explores possible futures through raw computation, no language required. It was a machine that could reason.

In 2017, “Attention Is All You Need” dropped. It became the blueprint for everything. Scale the transformer, add data, and machines could write, translate, argue. The entire field pivoted.

What are they good at? Language models are great at one specific thing: taking messy, ambiguous human language and compressing it into structure. Subject. Relation. Object.

That’s what they are. Extraction engines.
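To make "extraction engine" concrete, here is a minimal sketch of the output contract. The `Triple` type and the toy `extract` function are hypothetical stand-ins: a real system would put a model behind this interface, but what comes out is the same shape either way, structured triples instead of prose.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    obj: str

# Toy stand-in for the extraction step. A real system would use a model
# here; the point is the contract: prose in, structured triples out.
def extract(sentence: str, relations: list[str]) -> list[Triple]:
    triples = []
    for rel in relations:
        marker = f" {rel} "
        if marker in sentence:
            subj, _, obj = sentence.partition(marker)
            triples.append(Triple(subj.strip(), rel, obj.strip().rstrip(".")))
    return triples

print(extract("AlphaGo defeated Lee Sedol.", ["defeated"]))
# → [Triple(subject='AlphaGo', relation='defeated', obj='Lee Sedol')]
```

Everything downstream, storage, traversal, inference, only ever sees triples. That is the whole point of the bridge.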

The mistake is asking them to also be the thing on the other side.

A transformer predicts the next token. That’s it. It has no idea whether what it said three paragraphs ago still holds. It doesn’t build a chain of evidence, it just sounds like it does. So we scale the model, throw more data at it, and hope something true comes out the other side. Sometimes it does. Often enough that we stop asking if the method actually works.

We’ve been treating language models like brains. They’re not brains. They’re bridges.

Human language on one side. Logic on the other. And for the longest time, no reliable way across. The language model is the bridge. Not the brain. The bridge.

You don’t need a trillion parameters to extract the bones of a sentence. For clean, well-formed text, a small model can do this efficiently. The harder cases, the ambiguous, context-heavy, domain-specific ones, still need muscle. But the job is extraction, and extraction is a hard but bounded problem.

So what goes on the other side?

Once information is structured, you can store it. Index it. Traverse it. Run inference rules that produce new facts with traceable proof chains. Detect contradictions not by asking a model if something “seems wrong,” but by finding two paths that arrive at opposite conclusions.
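The paragraph above can be sketched in a few dozen lines. This is not any particular engine, just an illustrative toy: a fact store keyed on triples, one forward-chaining transitivity rule that records its proof chain, and contradiction detection that finds two derivations landing on opposite conclusions. All names here (`FactStore`, `close_transitive`, the `not_subclass_of` relation) are invented for the example.

```python
from itertools import product

class FactStore:
    def __init__(self):
        # fact -> list of supporting facts (empty list = directly asserted)
        self.proofs = {}

    def assert_fact(self, fact):
        self.proofs.setdefault(fact, [])

    def close_transitive(self, relation):
        # Forward-chain: (a, r, b) and (b, r, c) derive (a, r, c),
        # recording the two parent facts as the proof chain.
        changed = True
        while changed:
            changed = False
            facts = [f for f in self.proofs if f[1] == relation]
            for (a, _, b), (b2, _, c) in product(facts, facts):
                derived = (a, relation, c)
                if b == b2 and a != c and derived not in self.proofs:
                    self.proofs[derived] = [(a, relation, b), (b, relation, c)]
                    changed = True

    def contradictions(self, rel, neg_rel):
        # Not "seems wrong": two paths arriving at opposite conclusions.
        return [(s, o) for (s, r, o) in self.proofs
                if r == rel and (s, neg_rel, o) in self.proofs]

store = FactStore()
store.assert_fact(("cat", "subclass_of", "mammal"))
store.assert_fact(("mammal", "subclass_of", "animal"))
store.assert_fact(("cat", "not_subclass_of", "animal"))
store.close_transitive("subclass_of")

print(store.proofs[("cat", "subclass_of", "animal")])
# → [('cat', 'subclass_of', 'mammal'), ('mammal', 'subclass_of', 'animal')]
print(store.contradictions("subclass_of", "not_subclass_of"))
# → [('cat', 'animal')]
```

The derived fact carries its own evidence, and the contradiction is found mechanically, by collision of conclusions, not by asking anything whether it "seems wrong."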

There’s a deeper layer. Flat space doesn’t capture how knowledge actually works. Concepts are hierarchical. They nest. Some spaces are built for hierarchy the way a tree is built for branching, and the flat embeddings we’ve been using aren’t one of them. Put entities in a space that understands nesting natively, and you can represent in a handful of dimensions what flat geometry needs thousands for.
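One concrete version of "a space that understands nesting" is the Poincaré ball, where distance is hyperbolic rather than Euclidean. A rough sketch, using the standard Poincaré distance formula; the layout of "root" and "leaf" points is illustrative, not learned:

```python
import math

def poincare_distance(u, v):
    # Standard distance in the Poincaré ball:
    # d(u, v) = arcosh(1 + 2|u-v|^2 / ((1-|u|^2)(1-|v|^2)))
    # Points near the boundary (norm -> 1) grow exponentially far apart,
    # which is exactly the room a branching hierarchy needs.
    diff2 = sum((a - b) ** 2 for a, b in zip(u, v))
    nu2 = sum(a * a for a in u)
    nv2 = sum(b * b for b in v)
    return math.acosh(1 + 2 * diff2 / ((1 - nu2) * (1 - nv2)))

root = (0.0, 0.0)      # a general concept sits near the origin
leaf_a = (0.9, 0.0)    # specific concepts sit near the boundary
leaf_b = (-0.9, 0.0)

# Euclidean distance between the leaves is only 1.8, but hyperbolically
# they are far more separated than either is from the root.
print(poincare_distance(root, leaf_a))
print(poincare_distance(leaf_a, leaf_b))
```

That exponential growth of room toward the boundary is why a few hyperbolic dimensions can hold a tree that flat geometry needs thousands of dimensions to keep untangled.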

A model extracts. A graph stores. A geometry organizes. An engine reasons.

None of this matters if the way we build it is wasteful. If running a model burns through resources just to sound convincing, we’ve gotten the priorities wrong. The point of the technology is to make things better, not to make things. If it doesn’t serve people, it shouldn’t ship. If it can’t be both efficient and truthful, it isn’t worth building.

```mermaid
flowchart LR
    A["Model Extracts"] --> B["Graph Stores"]
    B --> C["Geometry Organizes"]
    C --> D["Engine Reasons"]
    style A fill:#1a1a2e,stroke:#e94560,color:#fff,stroke-width:2px
    style B fill:#0f3460,stroke:#e94560,color:#fff,stroke-width:2px
    style C fill:#0f3460,stroke:#e94560,color:#fff,stroke-width:2px
    style D fill:#1a1a2e,stroke:#e94560,color:#fff,stroke-width:3px
```