DeepSeek, OpenAI and the Jevons Paradox
This is the 12th article of Beyond Entropy, a space where the chaos of the future, the speed of emerging technologies and the explosion of opportunities are slowed down, allowing us to turn (qu)bits into our dreams.
The outline of today's post will be:
Let’s start!
DeepSeek, Qwen, and OpenAI-o3: Reasoning Capabilities
In the past 15 days, the pace of new AI model releases has accelerated dramatically. This has disrupted not only the tech community, but the entire market and even some geopolitical balances. It clearly shows that the race to dominate AI is becoming strategic and is high on the political agenda of major countries. Terms such as Open Source, LLMs, and Reinforcement Learning are suddenly on almost everyone's lips.
In particular, three AI models were released (DeepSeek-R1, Qwen 2.5 Max, and OpenAI o3-mini) that compete on one of the most strategic and important tasks for model supremacy: Reasoning Capabilities. These capabilities cover mathematical skills, logical problem solving, coding, and PhD-level knowledge of certain topics, especially analytical and scientific ones.
Reasoning capabilities are measured on benchmarks such as AIME 2024, GPQA Diamond, FrontierMath, SWE-bench, and others. Some preliminary results (check this blog post) show that:
What about Energy Efficiency?
We have all heard that DeepSeek has been able to compete with the best OpenAI and Anthropic models at a significantly lower cost (although it is not yet entirely clear by how many orders of magnitude). This applies to both training and inference costs.
How is this possible? In recent years, AI research has focused on developing ever more efficient techniques. Among these, the most popular are Pruning, Quantization, Distillation, LoRA, and Experts Routing. In particular, the DeepSeek team has exploited:
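To make one of the techniques above concrete, here is a minimal sketch of LoRA (Low-Rank Adaptation): instead of fine-tuning a full weight matrix, you freeze it and learn a small low-rank correction. The class name and hyperparameters below are mine, chosen for illustration; this is a generic textbook version, not DeepSeek's actual training recipe.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update.

    LoRA replaces fine-tuning of the full weight W (d_out x d_in)
    with learning W + (alpha / r) * B @ A, where A (r x d_in) and
    B (d_out x r) have rank r << min(d_in, d_out).
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pre-trained weights stay frozen
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))  # start at zero: no initial change
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen full-rank path + scaled low-rank trainable correction
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(4096, 4096), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # ~65k trainable parameters vs ~16.8M frozen ones
```

The appeal is the ratio in the last line: you update roughly 0.4% of the parameters while leaving the pre-trained model untouched.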
The DeepSeek team also claims to take advantage of Knowledge Distillation, which consists of transferring knowledge from a large pre-trained model (the teacher) to a smaller, more efficient model (the student) by training the latter to mimic the former. However, it seems that DeepSeek is not exactly leveraging distillation but rather Supervised Fine-Tuning (SFT), as noted by Andriy Burkov in this interesting discussion.
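For readers who want to see what "mimicking the teacher" means in practice, here is a minimal sketch of the classic distillation objective (Hinton-style soft targets). To be clear, this is the textbook formulation, not necessarily DeepSeek's approach, which, as noted above, may be closer to SFT on teacher-generated outputs.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between the teacher's and student's softened distributions."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    # The t^2 factor keeps gradient magnitudes comparable across temperatures
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * t * t

# Toy usage: a big frozen "teacher" supervises a small trainable "student"
teacher_logits = torch.randn(4, 32_000)                       # e.g. a 32k-token vocabulary
student_logits = torch.randn(4, 32_000, requires_grad=True)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```

The temperature softens the teacher's distribution so the student also learns from the relative probabilities of "wrong" tokens, not just the top prediction.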
How does energy efficiency translate into final costs for using these models?
Therefore, to date DeepSeek is 4 times cheaper than OpenAI o3-mini, which is in turn 63% cheaper than OpenAI o1-mini (Pricing: DeepSeek, OpenAI).
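Chaining the two ratios quoted above gives a feel for the overall gap. The snippet below uses normalized prices (o1-mini = 1.0), not actual dollar figures, which differ between input and output tokens:

```python
# Relative per-token cost, using only the ratios quoted above
o1_mini = 1.00
o3_mini = o1_mini * (1 - 0.63)   # o3-mini is 63% cheaper than o1-mini
deepseek = o3_mini / 4           # DeepSeek is 4x cheaper than o3-mini

print(f"o3-mini:  {o3_mini:.2f}")    # 0.37
print(f"DeepSeek: {deepseek:.3f}")   # ~0.093, i.e. roughly 10x cheaper than o1-mini
```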
The Jevons Paradox and Market Fall
The race toward LLM efficiency (at constant performance) naively suggests less and less need for computational power. For this reason, the release of the highly efficient DeepSeek models triggered a heavy sell-off in the shares of US big tech companies, NVIDIA in particular.
Only the most tech-savvy and careful observers did not sell their NVIDIA stock, knowing full well that, counterintuitively, the computational power required by AI can only increase. This is the famous Jevons Paradox, which states:
Any increase in resource efficiency generates an increase in long-term resource consumption, rather than a decrease.
In the context of LLMs, this means that as models become cheaper and cheaper (for both training and inference), they can be adopted by more companies, startups, and users. Each of these entities will necessarily consume computational resources of its own. An extreme example: if today only one player can afford to train an LLM with 10 thousand GPUs, tomorrow 10 million players may each train an LLM with a single GPU. If you do the math (see the snippet below), the total number of GPUs can only increase.
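Here is that extreme example worked out explicitly, using only the numbers from the paragraph above:

```python
# Aggregate GPU demand in the "extreme example" above
today = 1 * 10_000             # 1 player x 10,000 GPUs
tomorrow = 10_000_000 * 1      # 10 million players x 1 GPU each

print(today, tomorrow, tomorrow // today)  # 10000 10000000 1000
```

Even though each player needs 10,000 times fewer GPUs, aggregate demand grows 1,000-fold: Jevons in action.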
Digital & Tech news from Europe
In this section I want to highlight some tech news from Europe (to stay updated, consider following the official EU Digital & Tech channels):
Opportunities, talks, and events
I share some opportunities from my network that you might find interesting:
🚀 Job opportunities:
🔬 Research opportunities:
📚 Other opportunities:
Thanks for reading and thanks for sharing if you found this content useful! Until the next post!