

You’re describing trust dynamics right, and that’s exactly why this project doesn’t ask you to trust the model. It asks you to trust observable outputs: provenance labels, deterministic lanes, fail-loud behaviour.
When it fails, you can see exactly which layer failed and why. Then you can fix it yourself. That’s more than you get right now (and in part why LLMs are considered toxic).
The correction mechanism is explicit rather than hoped for (“it learns” or “it earns my trust back”): you encode the fix via cheatsheets, memory, or lane contracts and it sticks permanently.
The model can’t drift back to the wrong answer. That’s not the model earning trust back - it’s you patching the ground truth it reasons from. Progress is measured in artifacts, not vibes.
Until someone makes better AI, that’s all we’ve got. Generally, we don’t get even this much.
Sadly, AI isn’t “one mind learning”; it can’t be. So trust is earned by shrinking failure classes and proving the fix stuck, again and again and again (aka making sure the tool does what it should be doing).
Whether that’s satisfying in the way a person earning trust back is satisfying - look, honestly, probably not. But it’s more auditable.
LLMs aren’t people and I’m ok with meeting them where they are.



FWIW Extra shit I cooked last night. It’s live now, so deserves a PS: of its own
PPS: I built in a spam blocker as well.
Enjoy :) Blurb below
“But what if it just… Googled it?”
We can do that. But better.
You: Who won best picture at the 97th Academy Awards?
Model: Anora won best picture at the 97th Academy Awards. See: https://www.wdsu.com/article/2025-oscars-biggest-moments/64003102 Confidence: medium | Source: Web

Without >>web, that same 4B model said “The Fabelmans.” Then when I pushed it, “Cannes Film Festival.” With web retrieval, the router searches the internet, scores every result deterministically (phrase match + token overlap + domain trust), and only accepts evidence that passes a hard threshold. Garbage results get rejected, not served. The model never touches the answer - it’s extracted straight from the evidence.

Retrieval cascade:
Cheatsheets → Wiki → Web → Model. Each step fires only if the previous one missed. The model is last resort, not first call. Sound familiar?

I asked it who wrote a paper and it invented “Dr. David J. G. Smith” - a person who does not exist. After wiring a DOI/Crossref fallback for academic metadata:
You: Who wrote The Anatomy of a Large-Scale Hypertextual Web Search Engine?
Model: Sergey Brin and Lawrence Page. See: https://research.google/pubs/the-anatomy-of-a-large-scale-hypertextual-web-search-engine/ Confidence: medium | Source: Web

Deterministic extraction from metadata. No model synthesis.
>>web is provider-agnostic - it ships with DuckDuckGo (no API key, no account) and supports Tavily, SearxNG, or your own adapter. Add your own trusted domains in one config line (a bunch are baked in already, like PubMed). Every answer comes with a See: URL so you can verify with one click. Receipts, not pinky promises.

PS: I even cooked in allow-list / deny-list domain filters, a junk-domain blocklist, and ad/tracker URL rejection so your results don’t get fouled with low-quality spam shit.
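That allow/deny filtering could be sketched like so - the domain lists and tracker parameters here are illustrative placeholders, not the baked-in ones:

```python
# Sketch of allow/deny domain filtering plus ad/tracker URL rejection.
# All lists here are illustrative, not the shipped config.
from urllib.parse import urlparse, parse_qs

DENY = {"spam.example", "content-farm.example"}
ALLOW = set()  # empty allow-list means "allow anything not denied"
TRACKER_PARAMS = {"utm_source", "utm_medium", "gclid", "fbclid"}

def url_passes(url: str) -> bool:
    parts = urlparse(url)
    host = parts.netloc.removeprefix("www.")
    if any(host == d or host.endswith("." + d) for d in DENY):
        return False  # deny-list wins unconditionally
    if ALLOW and not any(host == d or host.endswith("." + d) for d in ALLOW):
        return False  # allow-list is active and this host isn't on it
    if TRACKER_PARAMS & set(parse_qs(parts.query)):
        return False  # ad/tracker-laden URLs rejected outright
    return True
```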
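Zooming back out, the whole Cheatsheets → Wiki → Web → Model cascade boils down to first-match-wins over an ordered list of lanes. A minimal sketch with stand-in lane functions (the real lanes obviously do more):

```python
# First-match-wins cascade: each lane either answers or misses (None).
# Lane names and signatures are my stand-ins for the project's real lanes.
from typing import Callable, Optional

Lane = Callable[[str], Optional[str]]

def run_cascade(query: str, lanes: list[tuple[str, Lane]]) -> tuple[Optional[str], str]:
    """Try each lane in order; a lane returning None means 'missed'."""
    for name, lane in lanes:
        answer = lane(query)
        if answer is not None:
            return answer, name  # provenance label travels with the answer
    return None, "none"  # fail loud: no lane answered
```

The model sits last in the list, so it only ever fires when everything deterministic has missed.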