Crunching the Numbers: A Cost Analysis of AI Agents in the Enterprise

2025 is shaping up to be a game-changer in Enterprise AI, with agent automation leading the charge. Sixteen months ago, I published "Decoding the True Cost of Generative AI for Your Enterprise," and now it's time to revisit that framework through the lens of AI agents.

Custom enterprise agents are no longer theoretical - they're already transforming businesses. From HR agents that streamline employee onboarding to sales agents that automate prospect follow-ups to IT agents that autonomously handle support tickets, the impact is tangible. 99% of enterprise AI developers in the US are exploring or developing AI agents today. But what's the real cost of integrating these agents into your enterprise? Let's take a closer look. First, let's talk about what an AI agent is.


Enterprise AI Agents: A primer

While the first wave of generative AI centered primarily on LLMs, the future belongs to agents that can "think, act, and observe". These agents tackle complex problems through methodical, step-by-step approaches (chain of thought), where the agent perceives the input, thinks using its LLM capabilities, and plans a set of possible actions (the use of tools) or responses. The agent then decides on the most suitable action or response, executes it, and observes the outcome. This outcome is used to learn and update its knowledge and context for future interactions. Here is a simple architecture for how a single enterprise agent works.

[Figure: Simplified agentic workflow]

By incorporating planning capabilities, iterative refinement, and the use of tools, agentic workflows are able to complete more sophisticated tasks compared to traditional LLM inference. LLM-based agents can integrate with your existing workflows, bringing automation to every corner of your enterprise.

Consider a simple example: asking about the weather where Warren Buffett lives. Simple as it sounds, this is a multi-step problem that can be solved using an agentic workflow. First, an agent breaks down the problem into discrete steps: (1) looking up where Warren Buffett lives, (2) querying a weather API for the current weather conditions there, and (3) using the data returned by the API to produce a natural language response to your query. In this way, the agent is able to answer a question in a way that both incorporates up-to-date data (via tool use and API calling) and produces a fluent response. Imagine what can be done when the agent is able to query databases, write and execute code (in safe containers), or access your enterprise's specialized APIs.
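To make the think/act/observe loop concrete, here is a minimal sketch of the weather example. Everything in it is an assumption for illustration: the tool names (lookup_residence, get_weather), the canned return values, and the hard-coded think() function, which stands in for the LLM planning step. A real agent would call an LLM and live APIs instead.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stub tools. A real agent would call a search service and a
# weather API here; these return canned values so the sketch runs offline.
def lookup_residence(person: str) -> str:
    return {"Warren Buffett": "Omaha, Nebraska"}.get(person, "unknown")

def get_weather(city: str) -> str:
    return f"(stubbed) clear skies in {city}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup_residence": lookup_residence,
    "get_weather": get_weather,
}

def think(observations: List[str]) -> Optional[Tuple[str, str]]:
    """Stand-in for the LLM planning step (hard-coded for this one question).

    Returns (tool name, argument) for the next action, or None when done.
    """
    if len(observations) == 0:
        return ("lookup_residence", "Warren Buffett")
    if len(observations) == 1:
        return ("get_weather", observations[0])
    return None  # enough information gathered; stop acting

def run_agent(question: str) -> str:
    observations: List[str] = []
    while True:
        step = think(observations)           # think
        if step is None:
            break
        tool_name, argument = step
        result = TOOLS[tool_name](argument)  # act
        observations.append(result)          # observe
    return f"Answer to '{question}': {observations[-1]}"

print(run_agent("What is the weather where Warren Buffett lives?"))
```

Each pass through the loop corresponds to one think/act/observe cycle, which is the unit the cost estimates later in this article are built on.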

Understanding the Agent Lifecycle

The agent lifecycle consists of two distinct phases: build and runtime. Let's examine the cost implications of each.

[Figure: Agent lifecycle]

Build Phase Costs

Agent building costs typically manifest in one of two ways:

1. Upfront development costs to create a custom agent for your enterprise

2. One-time payment or subscription fees for accessing a pre-built, 3rd-party agent

[Figure: Agent Build Phase Costs]

Build phase costs are only incurred once during build time and don't scale with production usage. Once an agent is created and ready for production, it can be deployed using a deployment service, such as the one-click deployment offered by watsonx.ai. Deployment costs are primarily associated with development labor (e.g., DevOps engineers).

In the future, agents may be added and registered in a central agent hub or directory, which may incur additional costs.


Runtime Costs

Runtime costs require careful consideration as they do scale with usage.

Some companies are leveraging outcome-based pricing for their agents (e.g., $x per solved customer query, meaning a customer only pays for the value achieved by the agent, not simply for its usage or access). Regardless of whether pricing is usage-based or outcome-based, it's important to understand the components of agent runtime costs:

  1. Think and Plan Costs. These costs consist of the computational expenses from the LLM inference used by the agent's think-act-observe process. If this process is not designed, monitored, and managed well, the costs can quickly rise. Prompting frameworks like ReAct, Reflexion, and ReWOO each propose a different variant of think-act-observe that can impact the amount of LLM inference used by the agent. The latest state-of-the-art models (e.g., OpenAI's o1) leverage inference-time compute to improve response quality, and the inference costs of these models can be substantial when they are used within agentic applications. Another important factor to consider is the potential of being throttled by 3rd party model providers. Although an agent may perform well during the proof of concept phase, it may struggle to process data quickly enough when faced with heavy demand in a production environment. This can lead to significant unexpected costs, such as needing to upgrade to the next pricing tier to reduce throttling. Choose the prompting framework and the underlying LLM carefully to strike the right balance of performance vs. inference cost.

  2. Orchestration and Tool Execution Costs. These costs are related to an agent's use of tools (i.e., tool selection & tool execution) and multi-agent orchestration. Tool execution and 3rd party API costs include the cost of any external API calls and the costs of the additional LLM token overhead used to process the results of those API calls. This overhead stems from how the results of a tool/function call are passed back to the LLM with the previous input context, leading to a higher level of token consumption.

  3. Agent Management Costs. Agent behaviors need to be observed, managed, monitored, and governed, which results in additional costs for monitoring and logging tools, debugging interfaces, governance and compliance frameworks, and evaluation metrics and dashboards. These costs vary based on service providers and sophistication levels.

These three costs are unique to agentic workflows. Additional labor costs for development, maintenance and support should also be considered when pricing out an agentic application.
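The token overhead described in item 2 can be sketched with a simple model: if every cycle re-sends the full context so far, and each cycle appends the model's output plus the tool result to that context, billed input tokens grow faster than the number of cycles. The token counts below (500 and 300) are hypothetical values chosen for illustration, and the function name is an assumption; the price of $0.0006 per 1K tokens is the one used in the case study later in this article.

```python
def cumulative_input_tokens(base_prompt_tokens: int,
                            tokens_added_per_cycle: int,
                            cycles: int) -> int:
    """Total input tokens billed when each cycle re-sends the whole context."""
    total = 0
    context = base_prompt_tokens
    for _ in range(cycles):
        total += context                   # full context is sent as input this cycle
        context += tokens_added_per_cycle  # model output + tool result get appended
    return total

PRICE_PER_1K_TOKENS = 0.0006  # USD, same rate as the case study in this article

# Hypothetical sizes: a 500-token prompt that grows by 300 tokens per cycle.
for cycles in (1, 2, 4, 8):
    tokens = cumulative_input_tokens(500, 300, cycles)
    cost = tokens / 1000 * PRICE_PER_1K_TOKENS
    print(f"{cycles} cycle(s): {tokens} input tokens, ${cost:.5f}")
```

Under these assumptions, doubling the number of cycles more than doubles the input tokens, which is why the number of think/act/observe cycles is worth monitoring.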

[Figure: Agent Runtime Costs]

By understanding these cost components, enterprises can better plan their agent implementation strategy and avoid unexpected expenses as they scale their AI operations.


Case study: Evaluating the Cost of AI Meeting Agents

Let's revisit Emma, the head of product at a tech startup, who previously evaluated the ROI of integrating LLMs within her firm. In the last article, we calculated the cost of using LLMs to automatically generate one-page summaries for each meeting in her firm, highlighting action items and important decisions. This time, we'll explore the cost of implementing an AI meeting agent that not only automates summarization but also integrates meeting insights into internal workflows like CRMs and project management platforms within Emma's firm. The goal is to help Emma analyze the added cost of an AI meeting agent to her firm and whether the resulting revenue increase justifies the investment.

Emma's firm has 700 employees. Each employee attends an average of 5 meetings per day, each lasting 30 minutes and involving 2 colleagues (3 employees per meeting). Each meeting generates an average of 2 follow-up actions. The meeting agent is a ReAct-based agent that uses 2 cycles of think/act/observe to complete a task.

Assumptions

  • Number of employees: 700

  • Meetings per day per employee: 5

  • Employees per meeting: 3

  • Follow-up actions per meeting: 2

  • LLM pricing for the underlying LLM used by the agent: $0.0006 per 1K tokens (prompt and completion)

  • Tool calling cost: $0.001 per API call

Step 1: Estimate the number of summarization requests per day

Let N = number of summarization requests per day (one per meeting).

N = Number of employees * Meetings per day per employee / Employees per meeting

= 700 * 5 / 3

≈ 1,166 meetings per day (1,166.67, rounded down)

Step 2: Estimate the summarization cost per meeting

The summarization cost per meeting is estimated to be $0.004 (see the earlier article, "Decoding the True Cost of Generative AI for Your Enterprise," for the calculation).

Step 3: Calculate the cost of generating follow-up agent actions for a single meeting

The meeting agent uses 2 cycles of think/act/observe to complete a task, with each cycle consuming 1,000 input tokens, producing 1,000 output tokens, and making one API call. The total cost is:

Cost of one think/act/observe cycle = Token cost of think/observe + Cost of API calls in the action phase

= $0.0006 (input tokens) + $0.0006 (output tokens) + $0.001 (API call cost)

= $0.0022

Total cost per action = 2 (number of think/act/observe cycles) * $0.0022 = $0.0044

Average agent cost to follow up on a given meeting with 2 follow-up actions = 2 * $0.0044 = $0.0088
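The Step 3 arithmetic can be written as a short sketch. The constants come from the assumptions listed above; the function and parameter names are illustrative, not part of any library.

```python
PRICE_PER_1K_TOKENS = 0.0006  # USD, prompt and completion
API_CALL_COST = 0.001         # USD per tool/API call

def cycle_cost(input_tokens: int = 1000,
               output_tokens: int = 1000,
               api_calls: int = 1) -> float:
    """Cost of one think/act/observe cycle: token cost plus API call cost."""
    token_cost = (input_tokens + output_tokens) / 1000 * PRICE_PER_1K_TOKENS
    return token_cost + api_calls * API_CALL_COST

def follow_up_cost_per_meeting(actions: int = 2,
                               cycles_per_action: int = 2) -> float:
    """Agent cost to follow up on one meeting's actions."""
    return actions * cycles_per_action * cycle_cost()

print(f"Per cycle:   ${cycle_cost():.4f}")
print(f"Per action:  ${2 * cycle_cost():.4f}")
print(f"Per meeting: ${follow_up_cost_per_meeting():.4f}")
```

Keeping the cycle count and token counts as parameters makes it easy to see how sensitive the per-meeting cost is to the agent's design.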

Step 4: Calculate the total cost of producing meeting summaries and automatically following up on meeting actions for a single day

The total cost of an AI meeting agent that summarizes and follows up on meeting actions is the sum of the summarization cost and the agent follow-up action cost: $0.004 + $0.0088 = $0.0128 per meeting

With 1,166 meetings per day, the total daily cost is: 1,166 * $0.0128 = $14.9

The annual cost is: 365 * $14.9 = $5,438

Emma's firm would spend approximately $5,500 annually to produce meeting summaries and automated follow-up actions using an AI meeting agent. By understanding the costs associated with implementing an AI meeting agent, Emma can make an informed decision about whether to adopt this technology or stick with LLM summarization alone.

Analysis

Recall from our prior case study that Emma's firm had $40K in revenue in 2023. Revenue has been growing at 15% year over year (YoY) since then, and she expects to deliver $70.4K in revenue in 2025. The AI meeting agent will free up employee time that can be redirected to growing revenue, and as a result she expects a 30% boost to her revenue projections.

Let's look at her income statement:

[Figure: Ratio analysis for Emma's case when she implements a meeting agent]

An annual spend of $5,500 yields a very healthy Rule of 40 score of 80%, which makes this a borderline but justifiable investment (it improves the Rule of 40 by 1%).

Now how can Emma optimize her spend on that meeting agent to improve her margins?

Try the following two scenarios and recalculate the math:

1. Emma switches to a much smaller LLM, such as IBM Granite, priced at $0.0001 per 1K tokens (prompt and completion).

2. Emma limits the number of follow-up actions per meeting to 1 instead of 2.
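One way to work through these scenarios is a small parameterized sketch like the one below. The Scenario class and its field names are illustrative. One assumption is worth flagging: the $0.004 summarization cost is held fixed in every scenario, because this article does not give the token breakdown needed to reprice it for a smaller model.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Scenario:
    price_per_1k_tokens: float = 0.0006   # USD, prompt and completion
    api_call_cost: float = 0.001          # USD per tool call
    actions_per_meeting: int = 2
    cycles_per_action: int = 2
    tokens_per_cycle: int = 2000          # 1,000 input + 1,000 output
    summarization_cost: float = 0.004     # USD per meeting; held fixed (assumption)
    meetings_per_day: int = 700 * 5 // 3  # 1,166, rounded down as in Step 1

def annual_cost(s: Scenario) -> float:
    """Annual cost of summaries plus agent follow-up actions, in USD."""
    cycle = s.tokens_per_cycle / 1000 * s.price_per_1k_tokens + s.api_call_cost
    follow_up = s.actions_per_meeting * s.cycles_per_action * cycle
    per_meeting = s.summarization_cost + follow_up
    return 365 * s.meetings_per_day * per_meeting

baseline = Scenario()
scenarios = {
    "Baseline": baseline,
    "Smaller LLM ($0.0001 per 1K tokens)": replace(baseline, price_per_1k_tokens=0.0001),
    "One follow-up action per meeting": replace(baseline, actions_per_meeting=1),
}

for name, scenario in scenarios.items():
    print(f"{name}: ${annual_cost(scenario):,.0f} per year")
```

The baseline from this sketch lands a few dollars above the $5,438 computed in Step 4, because Step 4 rounds the daily cost to $14.9 before multiplying by 365.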

Key takeaways

When choosing an agent for your task, cost will be a major determining factor.

  • Accurate cost estimation is crucial: Understanding the costs associated with implementing an AI agent can help you make informed decisions about adoption and customization. In the two exercises above, we saw how actions such as choosing a smaller model or limiting the number of actions taken by the agent can significantly reduce the cost of the agent.

  • Costs can add up quickly: The cost of AI agents can scale rapidly with usage, so it's essential to consider the potential costs and benefits before implementation.

  • Customization and complexity impact costs: The cost of AI agents can vary significantly depending on the complexity of the use case, the level of customization required, and the number of API calls made. 
