Jonathan Staude’s Post

Jonathan Staude

Mathematician and Entrepreneur | AI & Data Strategy | Software Developer

LLMs hallucinate, by design. This is something OpenAI itself now openly communicates. For everyone in Analytics this means: LLMs cannot "analyze" data in the sense of solving mathematical equations reliably. Every analytical system built on LLMs therefore needs an intermediate layer (a SQL generator, a Python code generator, or similar) that ensures deterministic results.
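
A minimal sketch of what such an intermediate layer can look like, assuming a hypothetical generate_sql wrapper around the LLM (everything here is illustrative, not a specific product): the model only drafts the query, while the arithmetic is done by a deterministic database engine.

```python
import sqlite3

def generate_sql(question: str, schema: str) -> str:
    """Placeholder for the LLM call (hypothetical): the model only
    translates a natural-language question into SQL, nothing more."""
    # A real system would prompt the LLM with the question and the schema here.
    return "SELECT region, SUM(revenue) AS total FROM sales GROUP BY region"

def answer(question: str, conn: sqlite3.Connection, schema: str):
    sql = generate_sql(question, schema)   # non-deterministic step (LLM)
    return conn.execute(sql).fetchall()    # deterministic step (database)

# Toy usage: the summation is done by SQLite, not by the LLM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EU", 120.0), ("EU", 80.0), ("US", 200.0)])
print(answer("Total revenue per region?", conn, "sales(region, revenue)"))
```

The point of the split: even if the LLM drafts a slightly different but equivalent query next time, the numbers come from the database engine and stay reproducible.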

Nam Nguyen

Technical Support Engineer at Tek Experts

A new paper from OpenAI partially supports some of my longstanding views on large language models (LLMs):

- LLMs will inevitably hallucinate, even when the training data is entirely error-free.
- Benchmarks are not a reliable measure of "intelligence" in LLMs.

The authors are correct in pointing out that hallucinations stem from the operational mechanics of LLMs and from their training feedback loops. However, this only describes statistical tendencies. It does not fully address the deeper question: why do LLMs hallucinate at all? This gap limits the true value of the paper.

More concerning is their unsubstantiated claim that it is possible to build a "non-hallucinating" model by connecting it to a Q&A database, adding a calculator, and forcing it to respond "I don't know" whenever uncertain. There are two major flaws here:

- Such a system reduces the model to a rigid program of conditional statements, rather than a generative AI.
- LLMs cannot genuinely recognize what they do not know. They lack self-awareness or calibrated confidence, and thus will always appear to know everything.

It is surprising to see the world's most valuable AI company, with some of the brightest minds, present such a simplistic and unsupported proposal. The remainder of the paper is filled with elegant mathematical formulations, but without grounding they add little substance.

#artificialintelligence #LLM #hallucination https://lnkd.in/gfgNetkR
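
For context, a minimal sketch of the abstention idea criticized above, assuming the model exposes token log-probabilities for its answer (all names and numbers are illustrative): the answer is returned only when the model's average token confidence clears a threshold, otherwise the system says "I don't know".

```python
import math

def answer_or_abstain(answer: str, token_logprobs: list[float],
                      threshold: float = 0.85) -> str:
    """Return the answer only if the average token probability clears the
    threshold; otherwise abstain. `token_logprobs` would come from the LLM."""
    avg_prob = math.exp(sum(token_logprobs) / len(token_logprobs))
    return answer if avg_prob >= threshold else "I don't know"

# Toy usage with made-up log-probabilities for two candidate answers.
print(answer_or_abstain("Paris", [-0.01, -0.02]))       # confident -> "Paris"
print(answer_or_abstain("42 km", [-1.2, -0.9, -1.5]))   # uncertain -> "I don't know"
```

Whether this helps in practice hinges on exactly the objection raised above: the probabilities have to be reasonably calibrated for the threshold to separate what the model knows from what it does not.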
