Interesting post. Lots of talk about AGI and whether and when we'll hit it. It's all beside the point. Understanding theoretical ceilings is interesting, but the more urgent conversation is what today's systems already deliver, and how fast that frontier is moving.

A Look at the Data: Super-Human Performance Is Here

GPQA Diamond (graduate-level benchmark)
- Human PhD experts: 65–74%
- OpenAI o3 (May 2025 preview): 83%

Humanity's Last Exam
- Single human expert: effectively 0%
- Frontier agents (Grok-4 Heavy, Grok 4, ChatGPT Agent): 44.4%, 38.6%, and 44.4% respectively

Beyond Exams: The Exponential Growth of AI Capability

METR's longitudinal study measures how long a real-world task an AI agent can complete reliably. That horizon has doubled roughly every 7 months for six straight years. In practice that means:
- 2023: agents could finish ~5-minute tasks
- 2024: ~15-minute tasks
- 2025: on track for hour-plus workflows, with days-long tasks in sight
(A back-of-the-envelope version of this extrapolation is sketched at the end of this comment.)

See: https://lnkd.in/eYZTeiSc

Products, Processes & People: Where the Value Lies Today

What we can build with today's AI is nothing short of extraordinary: entire product categories, fully automated workflows, and new ways to amplify human talent are already within reach.
- Products – Entire markets (e.g., bespoke legal research, on-demand drug-discovery catalysts) remain untapped because founders haven't asked "What if the agent owned the workflow?"
- Processes – Most firms still deploy LLMs as glorified chatbots. The bigger prize is end-to-end orchestration: marketing funnels that refine themselves, R&D pipelines that iterate overnight.
- People – Upskilling teams to partner with these tools will create orders-of-magnitude more value than squeezing another point of perplexity out of next-token prediction.

The Business Operating System: Agents × Humans, Integrated

What follows logically is a new business operating system: a layer that fuses AI agents with human judgment, data pipelines, and legacy systems to run whole value chains autonomously. Think:
1. Cognitive backends powered by frontier LLMs and specialized models.
2. Agentic orchestration handling multi-step processes: data ingestion → reasoning → action.
3. Human-in-the-loop controls for governance, escalation, and continuous improvement.
4. Domain apps & APIs that let industries, from finance to manufacturing, plug in with minimal lift.
(A toy version of this loop is also sketched below.)

All of this is doable today. We're building it now alongside partners in finance, healthcare, manufacturing, and other sectors. The limiting factor is organizational courage, not technical feasibility.

The Real Bottleneck

Whether or not scaled transformers ever reach "AGI" is a sideshow. The biggest constraint on prosperity right now is our own imagination. My strong recommendation: spend more energy experimenting with the capabilities we already have, and worry less about theoretical walls that ongoing empirical progress keeps bulldozing.
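To make the "doubling every 7 months" claim concrete, here is a quick back-of-the-envelope extrapolation. The 7-month doubling period comes from the METR figure quoted above; the 5-minute 2023 baseline and the reference date are my own assumptions, purely for illustration.

```python
# Illustrative extrapolation of the task-horizon doubling trend.
# Assumptions (mine, not METR's exact numbers): ~5-minute horizon at the
# start of 2023, and a constant 7-month doubling period.
from datetime import date

DOUBLING_MONTHS = 7
BASELINE_MINUTES = 5
BASELINE_DATE = date(2023, 1, 1)

def horizon_minutes(on: date) -> float:
    """Reliable-task horizon (in minutes) under a fixed doubling time."""
    months = (on.year - BASELINE_DATE.year) * 12 + (on.month - BASELINE_DATE.month)
    return BASELINE_MINUTES * 2 ** (months / DOUBLING_MONTHS)

for d in [date(2023, 1, 1), date(2024, 1, 1), date(2025, 1, 1), date(2026, 1, 1)]:
    print(d.year, f"~{horizon_minutes(d):.0f} min")
```

Under those assumptions the horizon lands around 16 minutes in 2024, roughly an hour in 2025, and about three hours in 2026, broadly consistent with the bullets above.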
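And here is a minimal sketch of what the agents × humans loop described in the numbered list might look like in code. Every name in it (the Step class, the call_llm stub, the escalate handler, the 0.8 review threshold) is hypothetical scaffolding for illustration, not a real product API.

```python
# Minimal sketch of an ingestion -> reasoning -> action loop with a
# human-in-the-loop control. All names and thresholds are hypothetical.
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    confidence: float   # agent's self-reported confidence in this step
    output: str

def call_llm(prompt: str) -> Step:
    """Stand-in for the cognitive backend (a frontier LLM or specialist model)."""
    return Step(name="draft", confidence=0.72, output=f"agent answer to: {prompt}")

def escalate(step: Step) -> Step:
    """Governance control: route low-confidence steps to a human reviewer."""
    print(f"[review queue] {step.name}: {step.output}")
    return Step(step.name, 1.0, step.output + " (human-approved)")

def run_pipeline(task: str, review_threshold: float = 0.8) -> str:
    step = call_llm(task)                     # reasoning over ingested task
    if step.confidence < review_threshold:    # escalation policy
        step = escalate(step)
    return step.output                        # action: hand off to the domain app / API

print(run_pipeline("summarize Q3 supplier risk"))
```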
Fintech Professional | AI Solution Architect | Real Time Data, Ontologies & Knowledge Graphs | Exploring AI Beyond LLMs
The wall is real. I am not sure who needs to hear it right now, but the transformer wall is real. Another day, another paper confirms what many of us already know too well. And like the others, it will probably be ignored by those still high on hyperscale hopium.

But Peter Coveney from University College London and Sauro Succi from the Italian Institute of Technology just put the wall into hard math. They formalized it using scaling laws, entropy bounds, and statistical mechanics. They show that LLMs cannot escape a built-in limit. It is not something temporary. It is not due to a lack of data, bad data, or insufficient tuning. It is structural (as has been shown a million times already).

Transformers work by predicting the next token based on learned statistical patterns. They are trained to minimize the divergence between the LLM/LRM's probability distribution and the distribution of the training data. But that divergence, the Kullback-Leibler divergence, cannot be reduced to zero. There is always a nonzero lower bound, and no amount of scaling can push through it. (A toy illustration of this floor is sketched at the end of this post.)

As it scales, the transformer runs into diminishing returns because it lacks the representational capacity to fully capture the structure of natural language or the world it references. It cannot reliably compress rich semantic content into a stable latent form and recover its full meaning during decoding. The LLM behaves as a lossy compressor, amplifying noise and incoherence under the illusion of fluency, which fuels hallucinations. The asymptotic flattening of improvement is a direct result of the model approximating statistical relationships rather than learning grounded semantic meaning. The hallucinations and confident errors are not outliers. They are structural consequences of this limitation (duhhhhhh).

The transformer was not built to model human cognition. It was built to solve language-to-language translation. It was designed to map between two structured domains of text, with some stochastic variability added to allow for flexibility in phrasing. That is it.

Yet here we are, doing stupid shit with it. Scaling a system that was never intended to understand. Wrapping retrieval hacks around it. Feeding it synthetic reinforcement loops and curated context. Meh.

What we are building are high-resolution interpolative approximators. Useful in narrow, closed-world applications or for brainstorming ideas. Dangerous in anything that requires actual reasoning, consistency, or correctness.

The LLM wall is real. It is measurable and it is being hit by frontier labs. And pretending that another gazillion parameters will unlock cognition is not innovation. It is delulu bullshit, and yet most of us gobble it all up. We are not advancing toward general intelligence. We are just hyperscaling a tool designed for translation and acting surprised when it fails to actually think, when all it can really do is context-conditioned approximate pattern retrieval.

#ai
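To make the "nonzero lower bound" point concrete, here is a toy example (mine, not from the paper by Coveney and Succi): when the model family cannot represent the data distribution, the best achievable KL divergence has a strictly positive floor. The two-symbol "language" and the factorized model below are illustrative assumptions.

```python
# Toy illustration of an irreducible KL floor. Assumption: the "model"
# is constrained to a factorized form q(x, y) = q1(x) * q2(y), while the
# data distribution p(x, y) has correlated tokens. The best factorized
# fit uses the marginals of p, and the residual KL(p || q) equals the
# mutual information I(X; Y) > 0 -- no amount of fitting removes it.
import numpy as np

# Correlated two-symbol "language": x and y agree 90% of the time.
p = np.array([[0.45, 0.05],
              [0.05, 0.45]])   # joint distribution p[x, y]

px = p.sum(axis=1)             # marginal over x
py = p.sum(axis=0)             # marginal over y
q = np.outer(px, py)           # best independence (factorized) model

kl_floor = float(np.sum(p * np.log(p / q)))   # = I(X; Y), in nats
print(f"Irreducible KL for the factorized model: {kl_floor:.4f} nats")
```

The point of the toy: the floor is a property of the model family's representational capacity, which is exactly the structural argument above, not a data or tuning problem.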