AGI won't come from current LLMs
AI needs physics to evolve

Prof. Yann LeCun is one of the most influential figures in artificial intelligence and machine learning. A pioneer of deep learning, he is best known for his work on convolutional neural networks (CNNs), which have had a major impact on computer vision and AI. LeCun received the Turing Award in 2018, together with Geoffrey Hinton and Yoshua Bengio, for their contributions to deep learning, and was more recently awarded the Queen Elizabeth Prize for Engineering, recognizing his work in advancing AI technologies. In an interview with Dr. Matt Kawecki, a former digital ambassador of the EU, he emphasized that current transformer-based LLMs will not lead to AGI, contrary to the promises many model developers have made to investors to secure billions of dollars in funding.

According to Yann LeCun, we are often misled into perceiving AI systems as intelligent simply because they can manipulate natural language. Their lack of persistent memory and genuine reasoning ability, however, places them at a significant disadvantage. Despite these limitations, he argues, the next paradigm of AI will still be built on deep learning.

The field of machine learning has primarily focused on three approaches:

  • Supervised Learning: The model is trained on labeled examples; shown thousands of labeled variations of an object, it learns to classify new instances with high accuracy.

  • Reinforcement Learning: The system receives feedback on whether its output is correct or incorrect and adjusts accordingly. While highly effective in structured environments such as chess, it struggles with the complexity of the physical world: training robots through reinforcement learning yields some success but remains extremely sample-inefficient.

  • Self-Supervised Learning: The system learns the interdependencies within its input data, with no labels required. This approach has driven the recent advances in chatbots by training models to predict missing words in text, with remarkable results (a toy illustration follows this list).
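
To make the self-supervised objective concrete, here is a deliberately tiny, hypothetical sketch in Python: mask one word and score candidate fillers purely by co-occurrence with the surrounding context. Real LLMs learn this with transformers over trillions of tokens, but the training signal, predicting the missing piece of the input from the rest, is the same idea.

```python
# Toy self-supervised "fill in the masked word" model built from
# co-occurrence counts. The corpus and window size are illustrative.
from collections import Counter, defaultdict

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat lay on the rug",
]

# Count how often each word appears near each other word (+/-2 window).
cooc = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - 2), min(len(words), i + 3)):
            if i != j:
                cooc[words[j]][w] += 1

def predict_masked(words, masked_index):
    """Score candidate fillers by summed co-occurrence with the context."""
    scores = Counter()
    for j, ctx in enumerate(words):
        if j != masked_index:
            scores.update(cooc[ctx])  # Counter.update adds counts
    return scores.most_common(3)

print(predict_masked(["the", "cat", "[MASK]", "on", "the", "mat"], 2))
```

No labels were needed: the text itself supplies both the input and the target, which is exactly what makes this paradigm scale.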

However, all these methods fall short when it comes to truly understanding the physical world. Language draws on a finite vocabulary, which is what makes predicting the next word tractable; the physical world is continuous and high-dimensional, and real events cannot be predicted the same way. Humans and animals, by contrast, grasp fundamental concepts like gravity intuitively and within a short time, an ability AI struggles to replicate. This gap is an instance of Moravec's paradox: tasks that are easy for humans, such as learning to drive with about 20 hours of practice, remain extraordinarily difficult for machines, including self-driving cars.

A typical LLM is trained on an immense dataset of roughly 20 trillion tokens scraped from the internet; it would take a human hundreds of thousands of years to read that much text. The visual system, however, absorbs information at a far higher rate: in roughly four years, a young child's eyes take in about as much raw data as an LLM's entire training corpus, yet humans extract meaning and knowledge from it in a fundamentally different way.
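
To see why those two quantities are comparable, here is a rough back-of-envelope calculation. The constants (about 3 bytes of text per token, roughly 2 MB/s of optic-nerve bandwidth, and about 16,000 waking hours in a child's first four years) are illustrative assumptions in line with figures LeCun has used in talks, not measured values.

```python
# Back-of-envelope: LLM training text vs. a child's visual input.
# All constants are rough, illustrative assumptions.
tokens = 20e12                    # ~20 trillion training tokens
bytes_per_token = 3               # a few bytes of text per token
text_bytes = tokens * bytes_per_token                  # ~6e13 bytes

optic_nerve_bps = 2e6             # ~2 MB/s of visual data
waking_hours = 16_000             # ~first four years of life
visual_bytes = optic_nerve_bps * waking_hours * 3600   # ~1.2e14 bytes

print(f"text corpus  ~ {text_bytes:.1e} bytes")
print(f"4y of vision ~ {visual_bytes:.1e} bytes")
# Both land around 10^14 bytes: the same order of magnitude,
# which is the point of the comparison.
```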

The Limitations of Current AI

The amount of information one extracts from a message depends heavily on interpretation. Present-day LLMs reason and plan in a highly rudimentary manner: they generate many candidate token sequences through probabilistic sampling, after which another neural network selects the best one. This process is computationally expensive and very different from human cognition. Humans, by contrast, excel at predicting and planning sequences of actions to accomplish a task, a key distinction from current LLM capabilities (a sketch of the generate-then-select pattern follows).
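
The following is a hedged sketch of that generate-then-select pattern, often called best-of-N sampling. Here `generate` and `score` are hypothetical stand-ins for a stochastic LLM decode and a learned verifier or reward model; no real API is assumed.

```python
# Sketch of best-of-N sampling: sample several candidate answers,
# then let a separate scorer pick the best one.
import random

def generate(prompt: str) -> str:
    """Stand-in for one stochastic decode of an LLM."""
    fillers = ["plan A", "plan B", "plan C", "plan D"]
    return f"{prompt} -> {random.choice(fillers)}"

def score(candidate: str) -> float:
    """Stand-in for a learned verifier that rates a full sequence."""
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    # Note the cost: n full decodes plus n verifier calls per answer.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

print(best_of_n("solve the task"))
```

The expense is visible in the structure itself: every answer requires N full generations, which is why this style of "reasoning" scales so poorly compared with how humans plan.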

While business process automation is relatively straightforward with AI, achieving true intelligence requires system-level thinking, something current models lack.

The Future of AI and Robotics

For AI to power advanced robotics, it must develop a deeper understanding of the physical world, possess persistent memory, and demonstrate reasoning and planning abilities. Robotics companies are betting on rapid AI advancements in the coming years to make their products commercially viable.

However, the unpredictable nature of the real world presents a significant challenge. Predicting the next word in a sentence, LLMs' primary function, will not lead to Artificial General Intelligence. Similarly, treating video frames as tokens within a transformer model is insufficient for developing a comprehensive understanding of the physical world.

One promising approach is Joint Embedding Predictive Architecture (JEPA). Instead of merely predicting the next token, JEPA learns an abstract representation of input data and makes predictions within that representational space. This methodology could serve as a stepping stone toward more advanced AI systems. Moreover, JEPA itself is a macro-architecture built upon transformers, indicating that the foundation of deep learning remains central to AI's evolution.
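
As a concrete, heavily simplified illustration, here is a minimal JEPA-style training step in PyTorch. The tiny MLP encoders, random data, and frozen target branch are assumptions made for brevity; real instantiations such as I-JEPA use transformer encoders over image patches and additional machinery to prevent representational collapse. The point is only that the prediction and the loss live in representation space, not in pixel or token space.

```python
# Minimal JEPA-style step: predict the target's *representation*,
# not its raw content. Networks and data are illustrative only.
import torch
import torch.nn as nn

dim_in, dim_z = 32, 16
context_encoder = nn.Sequential(
    nn.Linear(dim_in, dim_z), nn.ReLU(), nn.Linear(dim_z, dim_z))
target_encoder = nn.Sequential(
    nn.Linear(dim_in, dim_z), nn.ReLU(), nn.Linear(dim_z, dim_z))
predictor = nn.Linear(dim_z, dim_z)

opt = torch.optim.Adam(
    list(context_encoder.parameters()) + list(predictor.parameters()),
    lr=1e-3)

x = torch.randn(64, dim_in)             # context view (e.g., visible patches)
y = x + 0.1 * torch.randn(64, dim_in)   # target view (e.g., masked region)

s_x = context_encoder(x)                # abstract representation of context
with torch.no_grad():                   # stop-gradient on the target branch
    s_y = target_encoder(y)

loss = nn.functional.mse_loss(predictor(s_x), s_y)  # loss in latent space
opt.zero_grad()
loss.backward()
opt.step()
print(f"latent prediction loss: {loss.item():.4f}")
```

Because the model only has to predict an abstract summary of the target, it can ignore unpredictable low-level detail, which is what makes this direction attractive for modeling the physical world.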

Watch the full interview here:

https://www.youtube.com/watch?v=RUnFgu8kH-4

Conflict of Interest Disclaimer:

Prof. Yann LeCun serves as the Chief AI Scientist at Meta and a professor at NYU. His views on AI, machine learning, and AGI may be shaped by his professional affiliations. Readers should take potential conflicts of interest into account when assessing his statements on AI research, industry trends, and funding matters.