The Artificial Investor Issue #46 - February 2025 recap
My name is Aris Xenofontos and I am an investor at Seaya Ventures. This is the monthly version of the Artificial Investor that covers the top AI developments of the previous month.
The scaling laws are dead. Long live the scaling laws! Deep Research: the new commodity. The OpenAI soap opera continues. Can Europe catch up? Software developer: a legacy job? Humanity’s Last Exam. All this and much more in today’s market update post.
Let’s dive in.
🚀 Jumping on the bandwagon
We had a megaround last month with Anthropic raising 3.5 billion dollars. Other large rounds highlight investor appetite for verticalised AI (Abridge's 250-million and Harrison's 110-million fundraises in Healthcare, Eudia's 100-million fundraise in Legal Services), automation (Tines' 125-million round), the infrastructure layer (DDN's 300-million raise in data infrastructure, Krutrim's 230-million raise in AI Cloud infrastructure, EnCharge's 100-million raise in chips, ElevenLabs' 180-million raise in Voice AI) and humanoid robotics (Apptronik raised 350 million dollars and Neura Robotics 125 million dollars).
In terms of exits, we saw a couple of notable, mostly smaller acquisitions of AI players by large data and hardware companies. MongoDB acquired Voyage for 220 million dollars to enhance its AI integration and data retrieval capabilities, and LG acquired Bear Robotics for 300 million dollars to expand its robotics offering. Finally, CoreWeave, the AI datacentre infrastructure provider, announced it is filing for an IPO seeking a 35-billion-dollar valuation, with the intention to go public in March.
📈 On pink paper
The scaling laws are dead. Long live the scaling laws! Many will remember February 2025 as the month that the launch of a Chinese reasoning model, Deepseek R1, turned the AI world upside down. The main noise was around the fact that the Chinese frontier lab (built as an internal project out of a hedge fund that had spare GPU capacity!) released a model that matched OpenAI's top reasoning model in various benchmarks while claiming that it cost only 6 million dollars to train. This was initially compared to the billions of dollars that OpenAI has raised to train its models and that the hyperscalers have invested in AI infrastructure. How can a Chinese model be 100x cheaper than OpenAI's model at the same quality?
The world panicked: everyone started talking about how China has caught up with the US on AI, how much money the US firms have wasted and how AI chip demand has been overestimated. Nvidia was down more than 16% on the next trading day, losing 600 billion dollars of value. Of course, reason (no pun intended) prevailed as the story got broken down and analysed calmly.
Specifically:
As a result, we are really talking about Deepseek delivering 30%-50% efficiency gains vs. OpenAI's SOTA reasoning model, which had been released 8 months before Deepseek's.
Our conclusions, which we also published in the Artificial Investor Issue 44 (DeepSeek and AI myth), are:
So, what happened after that? We observed three types of reactions:
We believe the market in general overreacted to the news, looking for a reason to argue that AI model scaling is either over or alive and well, depending on which side you were on. The scaling laws by definition have always had a ceiling (per dimension), and it looks like we are reaching a ceiling for the dimension of public text data and model parameters. But there are more dimensions to scale on (private text data, video data, inference, reasoning), plus we expect to observe the Jevons paradox once again: the cheaper intelligence gets, the more demand there will be.
Deep Research: the new commodity
Another big piece of news in February was the creation of a new AI product category: Deep Research. Among hyperscalers, the pioneer was Google with its Deep Research product, which is itself newsworthy, since Google has been a follower in the GenAI wave in terms of model and product releases. At the same time, let's not forget that Google was not the pioneer in this category - yes, I'm talking to you…You.com!
So Google launched Deep Research, which effectively uses a reasoning model (Gemini 2.0 Flash Thinking), and what it does is an evolution of what Perplexity has been doing. The difference is that Perplexity has been using first-wave LLMs (without reasoning capabilities) combined with web search to give a one-shot answer to a research question, drawing on 5-10 sources. Products in the Deep Research category go further than that.
Let's take the example of “drinking and driving regulations between the US and Europe” as a research question. The model does the following:
While the original version of Perplexity would look at 10-15 sources and will come up with an answer that is maximum one page, Google Deep Research looks at 100-200 sources and comes up with four pages that, in our example, would include an introduction (perhaps with an explanation of the topic and why it is important), a summary of what the differences are, a deep dive into the US (perhaps looking at each state and telling you what states are stricter vs. more tolerant) and Europe separately, and finally a conclusion (that would not only summarise the findings, but also talk about why the topic is important, what have been the effects of strict laws on alcohol consumption, etc).
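For readers who want a feel for the mechanics, here is a minimal sketch of what a Deep Research-style loop could look like: plan sub-questions, fan out web searches across many sources, then synthesise a structured report. The call_reasoning_model and web_search helpers are hypothetical stand-ins (not any vendor's actual API); in practice you would wire them to a reasoning model and a search service.

```python
from dataclasses import dataclass

@dataclass
class Source:
    url: str
    snippet: str

def call_reasoning_model(prompt: str) -> str:
    # Hypothetical stand-in for a reasoning-model API call.
    return f"[model output for: {prompt[:60]}...]"

def web_search(query: str, k: int = 20) -> list[Source]:
    # Hypothetical stand-in for a search API returning the top-k results.
    return [Source(f"https://example.com/{i}", f"Result {i} for '{query}'") for i in range(k)]

def deep_research(question: str) -> str:
    # 1. Plan: ask the reasoning model to break the question into sub-questions.
    plan = call_reasoning_model(f"Break this research question into 5-10 sub-questions: {question}")
    sub_questions = [line.strip() for line in plan.splitlines() if line.strip()]

    # 2. Gather: search the web per sub-question; breadth (100-200 sources)
    #    is what separates this category from a 10-15-source one-shot answer.
    sources: list[Source] = []
    for sq in sub_questions:
        sources.extend(web_search(sq, k=20))

    # 3. Synthesise: ask the model for a structured, multi-page report.
    corpus = "\n".join(f"- {s.url}: {s.snippet}" for s in sources)
    return call_reasoning_model(
        "Write a report with an introduction, a summary of differences, "
        "regional deep dives and a conclusion.\n"
        f"Question: {question}\nSources:\n{corpus}"
    )

if __name__ == "__main__":
    print(deep_research("How do drinking and driving regulations differ between the US and Europe?"))
```

The design choice that matters here is breadth: collecting an order of magnitude more sources before synthesis is what turns a one-page answer into a multi-page report.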
Within a month, Perplexity and OpenAI launched their own products. They are all very similar, and they are actually piggybacking on something that You.com (a startup with a lesser-known brand and less distribution power) did first a few months earlier. Finally, a team published on Hugging Face and GitHub an open-source version of a Deep Research product that can call whichever AI reasoning model the user prefers (the product is free, but one would of course have to pay the API fees of the reasoning models). On the back of that, OpenAI changed its Deep Research pricing model: it had started by making it available only to 200-dollar-per-month subscribers, but switched to two free uses per month for free subscribers and 10 free uses for 20-dollar-per-month subscribers, most likely in an attempt to remove some friction.
Our thoughts on this product category are:
The OpenAI soap opera continues
Another month, another episode in the OpenAI drama. OpenAI made some noise with the first big AI Super Bowl ad, which was part of the 330 million dollars spent on advertising by AI companies in total last year. The Super Bowl ad was quite impressive and raised a lot of eyebrows.
At the same time, SoftBank is investing 40 billion dollars in OpenAI at a valuation of 240 billion dollars, leaving many of us wondering how much upside there still is in the investment. SoftBank would have to believe that OpenAI will soon belong to the exclusive club of 1-trillion-dollar companies, like Apple, Nvidia, Meta, Microsoft, Amazon, Tesla and Saudi Aramco! In the meantime, Elon Musk made an offer to buy OpenAI and OpenAI's board declined it. This is likely part of the testosterone battle we have witnessed between Elon and Sam (Altman), which, more recently, also includes Zuck. CNBC reported that Meta will release a standalone AI app to compete with ChatGPT, and Sam posted on X that they should probably launch a social network app soon (obviously being sarcastic about it).
Anyway, on a more serious note, ChatGPT, the killer AI app at the moment, reached sixth position in US mobile traffic in February. Google.com remained number one with a 29% share of traffic, while ChatGPT took a 2.3% share. Still a long way to go, but quite impressive to already be number six, above X.com, Reddit and Netflix, and very close to Instagram and WhatsApp.
Also, OpenAI filed a long patent application that gives a glimpse of the company's product roadmap. There are two main topics there:
⚔️ A double-edged sword
Can Europe catch up?
The biggest piece of news from Europe was the change in sentiment and stance on the trade-off between AI innovation and safety. Last year the political line was safety first, but this year, as witnessed at the AI Action Summit in Paris, it is about putting innovation first. There, French President Macron announced a 110-billion-dollar investment in AI (which, like other political announcements, bundles a mix of public and private initiatives into a relatively big headline number). This is positive for the region, albeit a fifth of the number Trump announced for the US in January. What's more interesting, from a more practical perspective, is that a couple of EU regulations that were being prepared are getting shelved: the new ePrivacy legislation related to online advertising, and the AI Liability legislation. Instead, the 2025 EU legislative roadmap is focused on the Innovation Act and the Cloud and AI Development Act, which aim to support startups and other innovative companies and enhance access to data.
Innovation finds a way
Another episode in the US-China AI Cold War was the announcement that Apple is partnering with Alibaba to integrate Alibaba's AI models into iPhones sold in China (given it cannot use any of the American AI models there due to export bans). This is an interesting twist, because it demonstrates once more that innovation eventually finds a way around legal barriers, and people and companies get access to the innovation they demand. Which poses the question: what's the point of export bans and regulations on technology and AI? Do they really have any effect? And if they don't, should we be focusing our energy somewhere else?
Regardless of the effectiveness of restrictions, the two sides continue to march on in the US-China AI Cold War. China reacted to Trump's tariff announcements by tightening the rules around investment and M&A between Chinese and foreign Tech companies. On the other side, in reaction to the Deepseek news, US lawmakers pushed for a ban on using the Chinese application in US government-related entities, amid security concerns about the Chinese government getting access to user data.
Software developer: a legacy job?
A new report came out about the impact of AI on employment, and the sentiment seems to be shifting slowly from AI augmentation towards AI automation and replacement. This is something we expected as AI capabilities grow. What is also special about this report is that it is practical and bottom-up: the Anthropic Economic Index is based on an analysis of the nature and volume of user prompts on the company's models. The study showed that more than one in three jobs use some AI for at least 25% of their tasks, and that about 60% of workers use AI for augmentation whilst as many as 40% use AI for automation. In addition, the analysis showed that the highest AI adoption rate is in the software engineering role, which takes up about 37% of all user queries, and the lowest adoption is in physical jobs, which take up 0.1% of queries. This is something we have also seen in the companies we have invested in. We carried out a survey across more than 20 companies of different sizes, sectors and business models, and the feedback was quite consistent: software development is the number one use case currently.
Talking about how software engineering jobs are changing, it was reported that the leading AI developer tool, Cursor, reached 100 million dollars of annual recurring revenue faster than any other Tech company in history: in less than two years from its founding. Cursor's rapid growth is all the more impressive because, let's not forget, it targets the most tech-savvy users of all: software developers.
Also, OpenAI's model used for coding performs better than 99.8% of developers in various benchmarks. This means that out of about 30 million developers worldwide, only the top 0.2%, or about 60,000 developers, match its performance. Given that Google has about 30,000 and Microsoft about 60,000 software developers, an AI developer agent should be able to get a job in one of these two companies.
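As a back-of-the-envelope check on that arithmetic, using only the figures quoted above:

```python
developers_worldwide = 30_000_000   # rough global developer count cited above
top_share = 0.002                   # the top 0.2% said to match the model's benchmark performance
print(int(developers_worldwide * top_share))  # prints 60000
```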
We are using various AI software development tools, such as Bolt and Cursor, on a daily basis and plan to write a detailed post about these tools and their impact on software development. Here are a few spoilers:
We'll share more details in our upcoming blog post.
In other AI-risk-related news
🧩 Laying the groundwork
🔢 Models
The most notable model launches last month came from China. We have been giving a fair amount of real estate to Chinese model launches on this blog in the past; hopefully, after the Deepseek-R1 launch, everyone will pay a bit more attention now 🙂 Alibaba took the lion's share this time around, introducing new versions of its Qwen open-source model series:
We also saw the launch of YuE, an open-source lyrics-to-song generation model. It's fun to play with!
US BigTech companies also launched some new models: the previously announced Phi-4 small-model series from Microsoft; the already-previewed Gemini 2.0 Pro from Google (which takes the top spot in all categories in the relevant benchmarks and connects with YouTube, Maps and Search); and new models such as Google's open-source VLM PaliGemma 2 Mix, NVIDIA's Llama Nemotron models for Agentic AI development and IBM's Granite 3.2, an open-source model that combines reasoning and vision. It seems like you are no one in the Tech world nowadays if you haven't launched your own LLMs.
Other recently-released models include Mistral's Le Chat and Small 3, Allen Institute's Tülu 3 and ByteDance's OmniHuman-1.
In February a lot of publicity was given to a new AI model benchmark called “Humanity’s Last Exam” (HLE), which we would like to comment on briefly. The benchmark was developed collaboratively by nearly 1,000 subject-matter experts from over 500 institutions across 50 countries and aims to tackle “benchmark saturation,” wherein top LLMs get near-perfect results on existing benchmarks, such as MMLU, making it challenging to gauge genuine AI progress. HLE consists of 3,000 difficult questions drawn from subjects like mathematics, humanities and natural sciences, that go beyond easy web lookups or simple database retrievals. The format of the questions is multi-modal: about 10% of the questions require both text and image comprehension and the rest are text-based, consisting of multiple-choice and exact-answer questions. Most LLMs have scored below 10%, but recently OpenAI Deep Research broke the barrier of a 25% score. Humans score below 5%.
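To make the format a bit more concrete, here is a minimal sketch of how an HLE-style mix of multiple-choice and exact-answer questions could be scored. The Question record and the ask_model call are illustrative assumptions, not the official HLE evaluation harness.

```python
from dataclasses import dataclass

@dataclass
class Question:
    prompt: str
    gold: str    # a choice letter for multiple-choice, or an exact string
    kind: str    # "multiple_choice" or "exact_answer"

def ask_model(prompt: str) -> str:
    # Hypothetical stand-in for a model call; returns the model's raw answer.
    return "A"

def normalise(text: str) -> str:
    # Exact-answer grading leaves little room for fuzziness: trim and lowercase only.
    return text.strip().lower()

def accuracy(questions: list[Question]) -> float:
    # Both question formats are graded by exact match against the gold answer.
    correct = sum(normalise(ask_model(q.prompt)) == normalise(q.gold) for q in questions)
    return correct / len(questions)

if __name__ == "__main__":
    demo = [
        Question("Which letter comes first in the Latin alphabet? A) A  B) B", "A", "multiple_choice"),
        Question("What is 17 * 23?", "391", "exact_answer"),
    ]
    print(f"Accuracy: {accuracy(demo):.1%}")
```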
Below is an example question. Good luck! 🙂
🧱 Infrastructure
Deepseek what? Many leading AI companies around the world don't seem to be very fussed about the Deepseek efficient-AI news. Apple is planning to invest 500 billion dollars in expanding its AI infrastructure, including a new factory in Texas, Meta is in talks for a 200-billion-dollar data centre project, Alibaba plans to invest 52 billion dollars in AI Cloud over the next three years and AWS announced an 11-billion-dollar investment in Georgia, US, for an AI datacentre.
🤖 Robotics
The year of robotics. 2025 looks like it's going to be the (first) year of robotics. Robot dexterity is going to the next level, with humanoids imitating signature movements of famous athletes, such as Cristiano Ronaldo, and robot dogs running 100 meters in less than 10 seconds. It also looks like Meta is planning to jump on the humanoid bandwagon with a strategic investment in the space, managed by new leadership. The year of robotics includes autonomous vehicles too. Things are heating up in the US, where Lyft has partnered with Mobileye to launch robotaxis in Dallas by 2026 and May Mobility launched the first driverless commercial ride-hailing service in Georgia. Chinese firms are not standing still, as BYD launched a new generation of its God's Eye autonomous driving system.
In other hardware-related news:
🍽️ Fun things to impress at the dinner table
Read my mind. Meta researchers have developed a system that decodes someone's brain activity and converts it into text. At the same time, Neuralink unveiled a similar technology that converts thoughts into handwritten text.
Eye contact. This Zoom user leverages AI to appear to be participating in Zoom calls and making eye contact, while actually moving around the room.
Hear me out. G. Gerganov created his own AI language that allows voice AI models to talk to each other faster and with 100% reliability.
Sign up to receive free posts in your inbox: https://artificialinvestor.substack.com