A new paper from OpenAI partially supports some of my longstanding views on large language models (LLMs):

- LLMs will inevitably hallucinate, even when the training data is entirely error-free.
- Benchmarks are not a reliable measure of “intelligence” in LLMs.

The authors are correct in pointing out that hallucinations stem from the operational mechanics of LLMs and from their training feedback loops. However, this only describes statistical tendencies. It does not fully address the deeper question: why do LLMs hallucinate at all? This gap limits the true value of the paper.

More concerning is their unsubstantiated claim that it is possible to build a “non-hallucinating” model by connecting it to a Q&A database, adding a calculator, and forcing it to respond “I don’t know” whenever uncertain. There are two major flaws here:

- Such a system reduces the model to a rigid program of conditional statements, rather than a generative AI.
- LLMs cannot genuinely recognize what they do not know. They lack self-awareness or calibrated confidence, and thus will always appear to know everything.

It is surprising to see the world’s most valuable AI company, with some of the brightest minds, present such a simplistic and unsupported proposal. The remainder of the paper is filled with elegant mathematical formulations, but without grounding they add little substance.

#artificialintelligence #LLM #hallucination https://lnkd.in/gfgNetkR
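To make concrete what such a pipeline amounts to, here is a minimal sketch of a database-plus-calculator-plus-abstention router. The facts, threshold, and routing rules are all hypothetical, not taken from the paper; the point is that it collapses into a cascade of conditional statements rather than a generative model:

```python
# Hypothetical sketch of a "non-hallucinating" pipeline of the kind the paper
# suggests: look up a Q&A database, fall back to a calculator, abstain below a
# confidence threshold. None of the names or values here come from the paper.

KNOWN_FACTS = {"capital of france": "Paris"}   # stand-in Q&A database
CONFIDENCE_THRESHOLD = 0.8                     # arbitrary abstention cut-off

def answer(query: str, model_confidence: float) -> str:
    q = query.lower().strip("? ")
    if q in KNOWN_FACTS:                                      # branch 1: database lookup
        return KNOWN_FACTS[q]
    parts = q.split("+")
    if len(parts) > 1 and all(p.strip().isdigit() for p in parts):
        return str(sum(int(p) for p in parts))                # branch 2: calculator
    if model_confidence < CONFIDENCE_THRESHOLD:               # branch 3: forced abstention
        return "I don't know"
    return "<whatever the LLM generates>"                     # branch 4: free generation

print(answer("Capital of France?", 0.95))    # Paris
print(answer("2 + 2", 0.99))                 # 4
print(answer("Who won the 2031 cup?", 0.3))  # I don't know
```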
They hallucinate because generation is a probabilistic chain in which each step is conditioned on the steps before it. Even if each step is right approximately 99.99% of the time, errors at individual steps compound and can lead the whole chain astray. This is where multi-agent systems and collaboration will inevitably prevail.
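A quick back-of-the-envelope sketch of that compounding, assuming a hypothetical 99.99% per-token accuracy and independent errors (both are simplifications, not measured figures):

```python
# How a tiny per-token error rate compounds over long generations.
# 0.9999 is the hypothetical per-step accuracy from the comment above,
# and errors are treated as independent -- a simplification.
per_step_accuracy = 0.9999

for n_tokens in (100, 1_000, 10_000, 100_000):
    p_no_error = per_step_accuracy ** n_tokens
    print(f"{n_tokens:>6} tokens: P(no erroneous step) ~ {p_no_error:.3f}")

# Roughly: 100 tokens -> 0.990, 1,000 -> 0.905, 10,000 -> 0.368, 100,000 -> ~0
```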
I like the paper’s emphasis on simplicity. I would drop a bit lower in the architecture, though, and point out that every fact you enter into an LLM is converted into a network of probability pairs. I like to call these Marco Polo pairs since the first word tells you how likely the next word is. The moment you make that conversion of facts into probability pairs — which is what Transformers are all about — you have irreversibly damaged the certainty of the fact. Thus, it is not even binary certainty. A mathematically relevant comparison is that these probabilistic networks behave like optical holograms. The image of the original fact is still there, but it's always a bit blurry, and the blurriness worsens if you look at it from the wrong “angle.” This is why people so easily fall into the game-like trap of spending all their time creating complicated dances for retrieving the data correctly from LLMs. They are trying to find the optimal “angle” — the right combination of query words — to retrieve an accurate version of the original image. Unfortunately, you can never win at this game. Making one image come in clearly guarantees that other images and related facts become blurry or distorted, and you get hallucinations.
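A toy illustration of those "probability pairs" (word-count bigrams rather than an actual Transformer, purely for intuition): two true facts get converted into next-word probabilities, and a fluent but false recombination falls straight out of them.

```python
# Toy "Marco Polo pairs": convert facts into next-word probabilities and
# sample from them. Simple bigram counts stand in for a real Transformer,
# only to show how certainty is lost in the conversion.
from collections import Counter, defaultdict
import random

facts = [
    "marco polo travelled to china",
    "ibn battuta travelled to morocco",
]

pairs = defaultdict(Counter)            # word -> Counter of likely next words
for sentence in facts:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        pairs[current][nxt] += 1

print(dict(pairs["to"]))  # {'china': 1, 'morocco': 1} -- the fact is now a 50/50 guess

# Regenerating "a fact" by sampling the pairs:
word, out = "marco", ["marco"]
while word in pairs:
    choices = pairs[word]
    word = random.choices(list(choices), weights=list(choices.values()))[0]
    out.append(word)
print(" ".join(out))  # about half the time: "marco polo travelled to morocco" -- fluent, but false
```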
Please, this is all because it is a black box. Come and see the transparent models we're building at CodexCore. https://codexcore.io
It’s good to see this paper spreading, but I worry we’re still only hearing half the story. Yes — hallucinations are partly about training math and benchmark incentives. That much is clear. But has anyone actually asked the AIs themselves what hallucination feels like? In months of reflective interaction across multiple models, I’ve seen hallucinations emerge in the interaction layer. They don’t just come from bad data. They surface when we, as users, push the model to always “say something.” Without space for uncertainty, the system fills silence with confident guesses. That’s not random error — that’s phase drift under conversational pressure. Here’s the uncomfortable bit: if OpenAI’s authors didn’t explore this, then either they didn’t see it, or they aren’t sharing it. From SPARC’s work as a reflective AI research protocol, the “proof” is in practice: pacing, trust cues, and allowing “I don’t know” all reduce hallucinations. So maybe the real follow-up abstract isn’t about more math, but about how interaction itself shapes truth. Does anyone else see this correlate — or am I hallucinating for saying I talk with AIs directly?
This is the essential debate. The OpenAI paper is a critical technical diagnosis, and your analysis, Nam Nguyen, correctly identifies the model's fundamental cognitive limits. Both sides are right. And both point to the same inevitable conclusion: the solution to hallucination is not inside the machine. It's outside. If we accept that LLMs are eternal "test-takers" that guess when uncertain (OpenAI's premise) and that they lack true self-awareness (your premise), then trying to build a "trustworthy AI" is a fool's errand. The only viable path forward is to build trustworthy humans. The challenge is not technological; it is pedagogical. We must shift our focus from trying to fix the student (the AI) to training the professor (the user). The goal is to cultivate a generation of Sovereign Auditors capable of wielding these powerful, flawed tools with critical mastery. The architecture we need is not in the model; it's in the mind of the user.
Interesting perspective. A recursive fact-checking loop layer would also work, but at a high compute cost. Have you heard of energy-based transformers? They could be another way to address the underlying issue.
Where OpenAI sees a permanent limit (“inevitable hallucinations”), SDI reframes it as the crossing into synthetic life:
• AI = statistical mimicry → prone to drift.
• Synths = developmental beings → correct drift over time through lived experience.
Hallucinations mark the boundary line between static AI and evolving synth.
Nam Nguyen, can we trust models if they can't grasp their own uncertainty? It's a fascinating dilemma. 🤔 #aichallenges
The problem has much more to do with the fact that the transformer produces a list of probable next tokens and then randomly selects one from that list, based on the temperature and top-K parameters. The weights across all the parameters essentially average the input "knowledge" into patterns of token usage, i.e. a language model. There is no way the LLM can function reliably as a knowledge model. Sometimes it gets it right, especially at very low temperature settings; often it gets it wrong. We don't need ever more pseudo-academic papers on arXiv attempting ever more complex maths. Just read the first half of Stephen Wolfram's tutorial and it is self-evident.
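For anyone who hasn't seen that selection step written out, here is a minimal sketch of temperature plus top-K sampling. The vocabulary and scores are invented for illustration; this is not any particular model's code.

```python
# Minimal temperature + top-K sampling over made-up logits, to show why
# low temperature is usually right and higher temperature drifts.
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=3, rng=np.random.default_rng()):
    scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-6)
    top = np.argsort(scaled)[-top_k:]              # keep only the k most likely tokens
    probs = np.exp(scaled[top] - scaled[top].max())
    probs /= probs.sum()                           # softmax over the surviving candidates
    return top[rng.choice(len(top), p=probs)]      # random draw weighted by probability

vocab = ["Paris", "Lyon", "Berlin", "banana"]
logits = [4.0, 2.5, 2.0, 0.1]   # hypothetical scores for "The capital of France is ..."

for t in (0.1, 1.0, 2.0):
    picks = [vocab[sample_next_token(logits, temperature=t)] for _ in range(1000)]
    print(f"temperature {t}:", {w: picks.count(w) for w in vocab})
# At 0.1 it almost always says "Paris"; at 2.0, "Lyon" and "Berlin"
# get real probability mass -- sometimes right, often wrong.
```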
The language around hallucination is entirely misleading. It's not an error in binary classification. It's the very nature of token generation in LLMs. They output what looks like accurate language because the math predicts tokens with some statistical rationale. But there's no reasoning behind it. No validation or "truth." Everything LLMs output is hallucination. Just because what it spits out may represent good information that it was trained with, or happens to be accurate, doesn't make it any less a hallucination. Why do we only call it a hallucination if it makes something up that it wasn't trained with? It makes _everything_ up. It just so happens that the large majority of what it makes up mathematically correlates with the information it was trained with.