Small Action Models Are the Future of AI Agents
2025 is the year of agents, & the key capability of agents is calling tools.
When using Claude Code, I can tell the AI to sift through a newsletter, find all the links to startups, and verify they exist in our CRM, all with a single command. This might involve two or three different tools being called.
But here’s the problem: using a large foundation model for this is expensive, often rate-limited, & overpowered for a selection task.
What is the best way to build an agentic system with tool calling?
The answer lies in small action models. NVIDIA released a compelling paper arguing that “Small language models (SLMs) are sufficiently powerful, inherently more suitable, & necessarily more economical for many invocations in agentic systems.”
I’ve been testing different local models to validate a cost reduction exercise. I started with Qwen3 30B, which works but can be quite slow because it’s such a big model, even though only 3 billion of its 30 billion parameters are active at any one time.
The NVIDIA paper recommends the Salesforce xLAM model – a different architecture called a large action model specifically designed for tool selection.
So, I ran a test of my own, each model calling a tool to list my Asana tasks.
The results were striking: xLAM completed tasks in 2.61 seconds with 100% success, while Qwen took 9.82 seconds with 92% success – nearly four times as long.
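A test like this can be sketched as a simple harness. The model call below is a stand-in (the `toy_selector` function and tool names are illustrative, not a real endpoint), but the latency and success accounting mirror what I measured:

```python
import time

def benchmark(select_tool, prompts, expected):
    """Time a tool-selection function over a set of prompts and
    report (mean latency in seconds, success rate)."""
    latencies, successes = [], 0
    for prompt, want in zip(prompts, expected):
        start = time.perf_counter()
        choice = select_tool(prompt)  # a real harness would call the local model here
        latencies.append(time.perf_counter() - start)
        successes += (choice == want)
    return sum(latencies) / len(latencies), successes / len(prompts)

# Hypothetical stand-in for a model-backed tool selector.
def toy_selector(prompt):
    return "asana_list_tasks" if "task" in prompt else "unknown"

mean_s, success_rate = benchmark(
    toy_selector,
    ["list my Asana tasks", "show my open tasks"],
    ["asana_list_tasks", "asana_list_tasks"],
)
```

Swapping `toy_selector` for calls to xLAM and Qwen over the same prompt set is what produced the numbers above.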
This experiment shows the speed gain, but there’s a trade-off: how much intelligence should live in the model versus in the tools themselves.
With larger models like Qwen, tools can be simpler because the model has better error tolerance & can work around poorly designed interfaces. The model compensates for tool limitations through brute-force reasoning.
With smaller models, the model has less capacity to recover from mistakes, so the tools must be more robust & the selection logic more precise. This might seem like a limitation, but it’s actually a feature.
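One way to make tools more robust for a small model is to validate its arguments before execution and return a corrective message rather than failing silently. A minimal sketch, where the tool name and schema are illustrative rather than from any real API:

```python
# Minimal argument validation for a tool call produced by a small model.
# The tool name and schema here are illustrative assumptions.
SCHEMA = {
    "asana_list_tasks": {"required": {"project"}, "allowed": {"project", "limit"}},
}

def validate_call(tool, args):
    """Return (ok, message). On failure, the message can be fed back
    to the model so it can retry with corrected arguments."""
    spec = SCHEMA.get(tool)
    if spec is None:
        return False, f"unknown tool: {tool}"
    missing = spec["required"] - args.keys()
    extra = args.keys() - spec["allowed"]
    if missing:
        return False, f"missing arguments: {sorted(missing)}"
    if extra:
        return False, f"unexpected arguments: {sorted(extra)}"
    return True, "ok"

ok, msg = validate_call("asana_list_tasks", {"project": "Inbox"})
bad, why = validate_call("asana_list_tasks", {"limit": 5})
```

Catching a malformed call at the boundary is exactly the kind of precision the tool layer must supply when the model itself can’t reason its way around a mistake.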
This constraint eliminates the compounding error rate of chained LLM tool calls. When large models make sequential tool calls, per-step error rates compound: each additional call multiplies the chance of failure.
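The arithmetic is simple: if each call succeeds independently with probability p, a chain of n calls succeeds with probability p**n. Plugging in the per-call success rates from the test above:

```python
def chain_success(p, n):
    """Probability an n-step tool chain completes, assuming each step
    succeeds independently with probability p."""
    return p ** n

# Three chained calls at 92% per-call success drop to about 78% end-to-end.
qwen_chain = chain_success(0.92, 3)
# 100% per-call success stays at 100% -- in this test, at least;
# no model is perfect over a larger sample.
xlam_chain = chain_success(1.00, 3)
```

This is why a small per-call reliability gap becomes a large gap in multi-step agent workflows.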
Small action models force better system design, keeping the best of LLMs and combining it with specialized models.
This architecture is more efficient, faster, & more predictable.