Small Action Models Are the Future of AI Agents

2025 is the year of agents, & the key capability of agents is calling tools.

When using Claude Code, I can tell the AI to sift through a newsletter, find all the links to startups, & verify they exist in our CRM, all with a single command. This might involve two or three different tools being called.

But here’s the problem: using a large foundation model for this is expensive, often rate-limited, & overpowered for a selection task.

What is the best way to build an agentic system with tool calling?

The answer lies in small action models. NVIDIA released a compelling paper arguing that “Small language models (SLMs) are sufficiently powerful, inherently more suitable, & necessarily more economical for many invocations in agentic systems.”

I’ve been testing different local models as part of a cost-reduction exercise. I started with Qwen3:30b, a 30-billion-parameter model, which works but can be quite slow because it’s such a big model, even though only 3 billion of those 30 billion parameters are active at any one time.

The NVIDIA paper recommends the Salesforce xLAM model – a different architecture called a large action model specifically designed for tool selection.

So I ran a test of my own, with each model calling a tool to list my Asana tasks.

The results were striking: xLAM completed tasks in 2.61 seconds with 100% success, while Qwen took 9.82 seconds with 92% success – nearly four times as long.
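A comparison like this can be reproduced with a small timing harness. The sketch below is not the harness used for the numbers above; it is a minimal illustration that times repeated attempts and reports mean latency and success rate. The `benchmark` function, the `runs` default, and the `fake_model_call` stand-in are all hypothetical names chosen for the example; a real run would replace the stand-in with a call that prompts a local model and checks whether the right tool was invoked.

```python
import statistics
import time
from typing import Callable


def benchmark(call_tool: Callable[[], bool], runs: int = 10) -> dict:
    """Time repeated tool-calling attempts; report mean latency and success rate."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    latencies = []
    successes = 0
    for _ in range(runs):
        start = time.perf_counter()
        try:
            ok = bool(call_tool())
        except Exception:
            ok = False  # a crash counts as a failed attempt
        latencies.append(time.perf_counter() - start)
        successes += ok
    return {
        "mean_seconds": statistics.mean(latencies),
        "success_rate": successes / runs,
    }


# Hypothetical stand-in for "ask the model to list my tasks and verify the call".
def fake_model_call() -> bool:
    return True


result = benchmark(fake_model_call, runs=5)
print(result)
```

Running the same harness against each model with the same prompt and the same tool keeps the comparison fair, since only the model changes between runs.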

This experiment shows the speed gain, but there’s a trade-off: how much intelligence should live in the model versus in the tools themselves.

With larger models like Qwen, tools can be simpler because the model has better error tolerance & can work around poorly designed interfaces. The model compensates for tool limitations through brute-force reasoning.

With smaller models, the model has less capacity to recover from mistakes, so the tools must be more robust & the selection logic more precise. This might seem like a limitation, but it’s actually a feature.
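One way to make a tool "more robust" is to validate the model's arguments strictly and fail with a message that names the exact problem, rather than guessing at intent. The sketch below assumes a hypothetical task-listing tool with `project`, `status`, & `limit` arguments; these names and limits are illustrative and are not taken from Asana's API.

```python
from dataclasses import dataclass

# Hypothetical argument schema for a task-listing tool.
ALLOWED_STATUSES = {"open", "completed"}
ALLOWED_KEYS = {"project", "status", "limit"}


@dataclass(frozen=True)
class ListTasksArgs:
    project: str
    status: str = "open"
    limit: int = 20


def parse_list_tasks_args(raw: dict) -> ListTasksArgs:
    """Validate raw model output; raise ValueError naming the exact problem."""
    unknown = set(raw) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")

    project = raw.get("project")
    if not isinstance(project, str) or not project.strip():
        raise ValueError("'project' must be a non-empty string")

    status = raw.get("status", "open")
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        raise ValueError(f"'status' must be one of {sorted(ALLOWED_STATUSES)}")

    limit = raw.get("limit", 20)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 100:
        raise ValueError("'limit' must be an integer between 1 and 100")

    return ListTasksArgs(project=project.strip(), status=status, limit=limit)


args = parse_list_tasks_args({"project": "Pipeline", "limit": 5})
print(args)
```

Rejecting unknown or malformed arguments up front means a small model's mistake surfaces as one clear error instead of a silently wrong result further down the chain.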

This constraint eliminates the compounding error rate of LLM-chained tool calls. When large models make sequential tool calls, errors compound: the chance that every call in a chain succeeds shrinks geometrically with the number of calls.
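The arithmetic behind compounding errors is simple if we assume each call succeeds independently with the same probability, which is a simplification. The sketch below uses the 92% figure from the test above purely as an illustrative per-call rate; `chain_success_probability` is a name made up for this example.

```python
def chain_success_probability(per_call_success: float, num_calls: int) -> float:
    """Probability that every call in a chain succeeds, assuming independent calls."""
    if not 0.0 <= per_call_success <= 1.0:
        raise ValueError("per_call_success must be between 0 and 1")
    if num_calls < 0:
        raise ValueError("num_calls must be non-negative")
    return per_call_success ** num_calls


for n in (1, 3, 5, 10):
    print(n, round(chain_success_probability(0.92, n), 3))
```

Under that assumption, a 92% per-call success rate falls to roughly 78% over three chained calls & to roughly 43% over ten, which is why per-call reliability matters so much in multi-step agents.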

Small action models force better system design, keeping the best of LLMs and combining it with specialized models.
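One possible shape for such a system is a router: a small action model picks the tool, deterministic code runs it, & a larger model is consulted only when no tool fits. The sketch below is one interpretation of that idea, not a description of any specific product; `run_agent`, `select_tool`, & `fallback` are hypothetical names, and the model calls are replaced with plain functions so the example runs on its own.

```python
from typing import Callable, Dict, Optional, Tuple

Tool = Callable[[dict], str]
# A selector returns (tool_name, arguments), or None when no tool applies.
Selector = Callable[[str], Optional[Tuple[str, dict]]]


def run_agent(
    request: str,
    select_tool: Selector,
    tools: Dict[str, Tool],
    fallback: Callable[[str], str],
) -> str:
    """Route a request: small model selects a tool, larger model handles the rest."""
    choice = select_tool(request)
    if choice is None:
        return fallback(request)
    name, args = choice
    tool = tools.get(name)
    if tool is None:
        # The selector named a tool that is not registered; do not guess.
        return fallback(request)
    return tool(args)


# Stand-ins for the real components (assumptions for illustration only).
def toy_selector(request: str) -> Optional[Tuple[str, dict]]:
    if "tasks" in request.lower():
        return ("list_tasks", {"limit": 3})
    return None


def list_tasks(args: dict) -> str:
    return f"listing {args['limit']} tasks"


def toy_fallback(request: str) -> str:
    return "escalated to large model"


tools = {"list_tasks": list_tasks}
print(run_agent("Show my tasks", toy_selector, tools, toy_fallback))
print(run_agent("Summarize this memo", toy_selector, tools, toy_fallback))
```

Keeping selection, execution, & fallback as separate pieces makes each one testable on its own, which is where the predictability comes from.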

This architecture is more efficient, faster, & more predictable.

Yuri Narciss (那悠瑞)

Co-Founder AlphaNeural AI; Executive coach for founders and leaders at fast growing start-ups

3d

We are building a marketplace (complete with infra) where developers can monetise proprietary models. Check it out at https://app.alphaneural.io/

Opeyemi Awoyemi

Building next gen AI career platform at hello.cv // Investor @ Fast Forward Fund. Founder, largest jobsite and web hosting company in Nigeria. CS OAU, Wharton MBA. Follow me for AI, Fintech and Future of Work insights.

5d

This makes a lot of sense intuitively. An agent to click through a specific task or to extract leads needs far fewer parameters.

Valerii Gorbanov

Product Leader | Fintech & Crypto & Web3 | LatAm Expert

5d

I completely agree that the future lies in agentic systems equipped with efficient tool calling. Streamlining these processes could significantly reduce costs and open up new possibilities for lean startups to leverage AI without breaking the bank.

Val Bercovici

Building AI Factories, Open Source & Cloud Native

6d

AI tokenomics are driving sophisticated engineering decisions like these model price/performance tradeoffs. I'm excited to be driving down token costs overall with 1000x KV Cache boosting via software-defined memory.

Srivasudhevan R

Grow Revenue | Reduce Costs | AI Agents | Logistics | Manufacturing | Finance | Supply Chain | Customer Service

6d

Moreover, SLMs can be installed on a local server within the organisation's firewall so that security is guaranteed and they can be trained with the firm's own data such as emails, files, etc.
