A Comprehensive Guide
to Agentic AI
Debmalya Biswas, PhD
Introduction to AI Agents
Reference Architecture
Agents Discovery & Marketplace
Personalizing UX for Agentic AI
Agent Observability & Memory Management
Agentic AI Scenarios: Agentic RAGs, Reinforcement Learning Agents
Responsible AI Agents
Introduction to
Agentic AI
AI Agents
In the Generative AI context, Agents refer to Autonomous Agents that can execute complex tasks, e.g.:
- make a sale,
- plan a trip,
- make a flight booking,
- book a contractor to do a house job,
- order a pizza.
Agentic AI in the News
Agentic AI Evolution
Agentic AI capabilities – Task Decomposition
Agentic AI capabilities – Memory Management
Agentic AI capabilities – Reflect & Adapt
Agentic AI Use-case: Funds Email Marketing Campaign
Agentic AI
Reference
Architecture
Generative AI Lifecycle
Gen AI Architecture Patterns – APIs & Embedded Gen AI
While Enterprise LLM Apps have the potential to accelerate LLM adoption by providing an enterprise-ready solution, the same caution needs to be exercised as you would before using a 3rd-party ML model — validate LLM/training data ownership, IP, and liability clauses.
Black-box LLM APIs: This is the classic ChatGPT example, where we have black-box access to an LLM API/UI. Prompts are the primary interaction mechanism for such scenarios.
* D. Biswas. Generative AI – LLMOps Architecture Patterns. Data Driven Investor, 2023 (link)
Gen AI Architecture Patterns – Fine-tuning
LLMs are generic in nature. To
realize the full potential of LLMs for
Enterprises, they need to be
contextualized with enterprise
knowledge captured in terms of
documents, wikis, business
processes, etc.
This is achieved by fine-tuning an LLM with enterprise knowledge / embeddings to develop a context-specific LLM.
Gen AI Architecture Patterns – Retrieval-Augmented-
Generation (RAG)
Fine-tuning is a computationally intensive process. RAG provides a viable alternative by providing additional context with the prompts — grounding the retrieval / responses in the given context.
Given a user query, a RAG pipeline consists of the three phases below (see the sketch that follows):
- Retrieve: transform the user query into an embedding and compare its similarity score with other content.
- Augment: with search results / context retrieved from a vector store that is kept current and in sync with the underlying document repository.
- Generate: contextualized responses by making the retrieved chunks part of the prompt template, which provides additional context to the LLM on how to answer the query.
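The three phases can be expressed as a minimal pipeline. The sketch below is illustrative only: the `embed`, `vector_store.search`, and `llm.generate` helpers are assumptions standing in for whatever encoder, vector store, and LLM client are used.

```python
# Minimal RAG sketch (hypothetical embed / vector_store / llm helpers).
def rag_answer(query: str, vector_store, llm, embed, top_k: int = 5) -> str:
    # Retrieve: embed the query and fetch the most similar chunks.
    query_vec = embed(query)
    chunks = vector_store.search(query_vec, top_k=top_k)

    # Augment: place the retrieved chunks into the prompt template.
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

    # Generate: the LLM produces a response grounded in the retrieved context.
    return llm.generate(prompt)
```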
Agentic AI Platform Reference Architecture
* D. Biswas. Stateful Monitoring and Responsible Deployment of AI Agents. 17th
International Conference on Agents and Artificial Intelligence (ICAART), 2025 (link)
We envision a future where enterprises will be able to develop new Enterprise AI Apps by orchestrating / composing multiple existing AI Agents.
AI Agents Marketplace
& Discovery for Multi-
agent Systems
(Complex) Agentic AI Task Decomposition
A high-level approach to solving complex tasks:
- decomposition of the given complex task into a (hierarchy or workflow of) simple tasks, followed by
- composition of agents able to execute the simpler tasks.
This can be achieved in a dynamic or static manner (see the sketch below):
- Dynamic: given a complex user task, the system comes up with a plan to fulfil the request depending on the capabilities of available agents at run-time.
- Static: given a set of agents, composite agents are defined manually at design-time, combining their capabilities.
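A minimal sketch of the dynamic variant, assuming a hypothetical planner LLM and agent registry; the static variant would simply replace the run-time planning call with a manually defined workflow.

```python
# Dynamic decomposition sketch: plan at run-time against available agents.
# `planner_llm` and `registry` are hypothetical components, not a specific API.
def solve(task: str, planner_llm, registry) -> list:
    capabilities = registry.list_capabilities()        # what the available agents can do
    plan = planner_llm.plan(task, capabilities)         # ordered list of simpler sub-tasks
    results = []
    for step in plan:
        agent = registry.find(step.capability)          # discovery (see the next slides)
        results.append(agent.execute(step, context=results))
    return results
```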
Agent Marketplace & Discovery of AI Agents
Agent decomposition
and planning (be it static
or dynamic) requires a
discovery module to
identify the agent(s)
capable of executing a
given task.
This implies that there
exists a marketplace
with a registry of agents,
with a well-defined
description of the agent
capabilities and
constraints.
Hierarchical Agent
Composition
In LangGraph (for example), hierarchical agents are captured as agent nodes that can be LangGraph objects (sub-graphs) themselves, connected by supervisor nodes.
• LangGraph: Multi-Agent Workflows,
https://blog.langchain.dev/langgraph-multi-agent-workflows/
Hierarchical Finite State Machine (FSM)
representation of a Travel Funds Service
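A minimal LangGraph-style sketch of a supervisor routing between two worker agents. Exact class and method names may differ across LangGraph versions, and the supervisor / worker functions are placeholders (a real supervisor would call an LLM, and each worker node could itself be a compiled sub-graph).

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class State(TypedDict):
    task: str
    result: str
    next: str

def supervisor(state: State) -> State:
    # Placeholder routing logic: a real supervisor would call an LLM here.
    return {**state, "next": "flight_agent" if "flight" in state["task"] else "hotel_agent"}

def flight_agent(state: State) -> State:
    return {**state, "result": "flight booked", "next": "end"}

def hotel_agent(state: State) -> State:
    return {**state, "result": "hotel booked", "next": "end"}

graph = StateGraph(State)
graph.add_node("supervisor", supervisor)
graph.add_node("flight_agent", flight_agent)   # could itself be a compiled sub-graph
graph.add_node("hotel_agent", hotel_agent)
graph.set_entry_point("supervisor")
graph.add_conditional_edges("supervisor", lambda s: s["next"],
                            {"flight_agent": "flight_agent", "hotel_agent": "hotel_agent"})
graph.add_edge("flight_agent", END)
graph.add_edge("hotel_agent", END)
app = graph.compile()
# app.invoke({"task": "book a flight to Zurich", "result": "", "next": ""})
```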
Limitations of LLMs as execution engines for Agentic AI
Current Agentic AI platforms leverage LLMs for both task decomposition and execution
of the identified tasks / agents.
- The overall execution occurs within the context of a single LLM, or each task can be routed to a different LLM.
- In short, each task execution corresponds to an LLM invocation at run-time.
- Unfortunately, this approach is neither scalable nor practical for complex tasks.
LLMs cannot be expected to come up with the most efficient (agent) execution approach for a given task at run-time every time, esp. those requiring integration with enterprise systems.
Agentic AI platforms need to learn
over multiple execution runs (meta-
learning): involving a combination of
user prompts, agents, and their
relevant skills (capabilities).
Non-determinism in Agentic AI Systems
There are two non-deterministic
operators in the execution plan:
‘Check Credit’ and ‘Delivery Mode’.
The choice ‘Delivery Mode’ indicates that the user can either pick up the order directly from the store or have it shipped to their address.
Given this, shipping is a non-deterministic choice and may not be invoked during the actual execution.
L2R for Agent Discovery based on Natural
Language Descriptions
Learning-to-rank (L2R) algorithm
to select top-k agents given a user
prompt:
- We first convert agent (class)
descriptions to semantic
embeddings offline and use them to
train the L2R model.
- The user prompts and the agents
use the same generic embedding
model.
- The inference results, including the agent description embeddings used during training and inference, are cached to enable the meta-learning process for the L2R algorithm.
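A simplified sketch of the retrieval step: agent descriptions are embedded offline and a user prompt is matched to the top-k agents by cosine similarity. The `embed` function is an assumption, and the final learned re-ranking step of the trained L2R model is only indicated in a comment.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def top_k_agents(prompt: str, agent_descriptions: dict, embed, k: int = 3) -> list:
    """Rank agents by similarity between the prompt and their (cached) description embeddings."""
    # Offline step (cached in practice): embed each agent description once.
    agent_vecs = {name: embed(desc) for name, desc in agent_descriptions.items()}
    prompt_vec = embed(prompt)                      # same generic embedding model
    scores = {name: cosine(prompt_vec, vec) for name, vec in agent_vecs.items()}
    # A trained L2R model would re-rank these candidates using logged feedback.
    return sorted(scores, key=scores.get, reverse=True)[:k]
```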
Agent Discovery based on a Constraints Model
The constraints are specified as logic
predicates in the service description of
the corresponding service published by
its agent.
An agent P provides a set of services
{S1,S2, … , Sn}. Each service S in turn has
a set of associated constraints {C1,C2, …
,Cm}. For each constraint C of a service
S, the constraint values may be
- a single value (e.g., price of a service),
- a list of values (e.g., list of destinations served by an airline), or
- a range of values (e.g., minimum, maximum).
Capability: connects City A to B
Constraint: Flies only on certain
days a week; Needs payment by
Credit Card
* D. Biswas. Constraints Enabled Autonomous Agent Marketplace:
Discovery and Matchmaking. 16th International Conference on Agents
and Artificial Intelligence (ICAART), 2024 (link)
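A sketch of how such constraints could be represented and matched against a request. The value handling follows the single-value / list / range cases above; the class and the airline example values are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Constraint:
    name: str
    value: Any            # single value, list/set of values, or (min, max) range

def satisfies(constraint: Constraint, requested: Any) -> bool:
    v = constraint.value
    if isinstance(v, tuple) and len(v) == 2:      # range, e.g., (min_price, max_price)
        return v[0] <= requested <= v[1]
    if isinstance(v, (list, set)):                # list, e.g., destinations served
        return requested in v
    return requested == v                         # single value

# Example: an airline agent's service constraints (illustrative values).
flight_constraints = [
    Constraint("destination", {"ZRH", "GVA", "BSL"}),
    Constraint("days_of_week", {"Mon", "Wed", "Fri"}),
    Constraint("payment", "Credit Card"),
]
request = {"destination": "ZRH", "days_of_week": "Mon", "payment": "Credit Card"}
matched = all(satisfies(c, request[c.name]) for c in flight_constraints)
```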
Personalizing UX
for Agentic AI
AI Agent Personalization
Analogous to the fine-tuning of large language models (LLMs) into domain-specific LLMs / SLMs, we argue that personalization / fine-tuning of (marketplace) AI agents will be needed with respect to enterprise-specific context (of applicable user personas and use-cases) to drive their enterprise adoption.
Key benefits of AI agent personalization include:
- Personalized interaction: The AI agent adapts its language, tone, and
complexity based on user preferences and interaction history. This ensures that
the conversation is more aligned with the user’s expectations and
communication style.
- Use-case context: The AI agent is aware of the underlying enterprise use-case
processes, so that it can prioritize or highlight process features, relevant pieces
of content, etc. — optimizing the interaction to achieve the use-case goal more
efficiently.
- Proactive Assistance: The AI agent anticipates the needs of different users and
offers proactive suggestions, resources, or reminders tailored to their specific
profiles or tasks.
AI Agent Personalization Architecture
In this talk, we highlight that UI/UX for AI agents is critical as the last mile to enterprise adoption.
User Persona based Agent Personalization
Enterprise AI agent personalization remains challenging due to scale, performance, and privacy challenges.
* D. Biswas. Personalizing UX for Agentic AI. AI Advances, 2024 (link)
User persona-based agent personalization segments the end-users of a service into a manageable set of user categories, which represent the demographics and preferences of the majority of users.
The fine-tuning process consists of
first parameterizing (aggregated) user data and
conversation history and storing it as memory in
the LLM via adapters, followed by fine-tuning
the LLM for personalized response generation.
The agent — user persona router helps in
performing user segmentation (scoring) and
routing the tasks / prompts to the most
relevant agent persona.
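A minimal sketch of the user persona router: score the user against each persona and route prompts to the closest persona-specific agent. The embedding helper and persona profiles are assumptions; in practice the scoring would use aggregated user data and the fine-tuned adapters described above.

```python
import numpy as np

def route_to_persona(user_profile_text: str, persona_agents: dict, embed):
    """persona_agents maps persona description -> persona-specific agent (adapter)."""
    user_vec = embed(user_profile_text)
    best_agent, best_score = None, -1.0
    for persona_desc, agent in persona_agents.items():
        vec = embed(persona_desc)
        score = float(user_vec @ vec /
                      (np.linalg.norm(user_vec) * np.linalg.norm(vec) + 1e-9))
        if score > best_score:
            best_agent, best_score = agent, score
    return best_agent   # prompts are then served by this persona's fine-tuned agent
```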
User Data Embeddings
Fine-tuning AI agents on raw user data is often too complex, even if it is at
the (aggregated) persona level.
This is primarily due to the following reasons:
- Agent interaction data usually spans multiple journeys with sparse data points, various interaction types (multimodal), and potential noise or inconsistencies with incomplete queries / responses.
- Moreover, effective personalization often requires a deep understanding
of the latent intent / sentiment behind user actions, which can pose
difficulties for generic (pre-trained) LLMs.
- Finally, fine-tuning is computationally intensive. Agent-user interaction
data can be lengthy. Processing and modeling such long sequences (e.g.,
multi-years’ worth of interaction history) with LLMs can be practically
infeasible.
User Data Embeddings (USER-LLM)
USER-LLM distills compressed
representations from diverse and noisy
user interactions, effectively capturing the
essence of a user’s behavioral patterns
and preferences across various interaction
modalities.
* L. Liu, L. Ning. USER-LLM: Efficient LLM Contextualization with User Embeddings. Google Research, 2024 (link)
Reinforcement Learning based Personalization
We show how LLM generated responses can be personalized based on a Reinforcement Learning
(RL) enabled Recommendation Engine (RE).
At a high level, the RL-based LLM response / action RE works as follows:
- The (current) user sentiment and agent
interaction history are combined to quantify the
user sentiment curve and discount any sudden
changes in user sentiment;
- leading to the aggregate reward value
corresponding to the last LLM response provided
to the user.
- This reward value is then provided as feedback
to the RL agent — to choose the next optimal
LLM generated response / action to be provided
to the user.
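A sketch of the reward computation described above: the current sentiment is smoothed against the interaction history to discount sudden swings, and the resulting reward is attributed to the last LLM response. The smoothing factor and the [-1, 1] sentiment scale are illustrative assumptions.

```python
def aggregate_reward(current_sentiment: float, sentiment_history: list,
                     alpha: float = 0.3) -> float:
    """Sentiments in [-1, 1]; exponential smoothing discounts sudden changes."""
    smoothed = sentiment_history[0] if sentiment_history else current_sentiment
    for s in sentiment_history[1:]:
        smoothed = alpha * s + (1 - alpha) * smoothed
    # Blend the (noisy) current sentiment with the smoothed sentiment curve.
    return alpha * current_sentiment + (1 - alpha) * smoothed

# The reward is fed back to the RL agent (e.g., a bandit or Q-learning policy)
# to choose the next optimal LLM-generated response / action for the user.
```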
D. Biswas. Delayed Rewards in the Context of Reinforcement Learning based Recommender Systems. AAI4H@ECAI 2020: 49-53, (link)
E. Ricciardelli, D. Biswas. Self-improving Chatbots based on Reinforcement Learning. RLDM 2019 (link)
Agent Observability
& Memory
Management
Observability Challenges for Agentic AI
Observability for AI Agents is
challenging:
- No global observer: Due to their
distributed nature, we cannot assume
the existence of an entity having
visibility over the entire execution. In
fact, due to their privacy and
autonomy requirements, even the
composite agent may not have
visibility over the internal processing
of its component agents.
- Parallelism: AI agents allow parallel
composition of processes.
- Dynamic configuration: The agents
are selected incrementally as the
execution progresses (dynamic
binding). Thus, the “components” of
the distributed system may not be
known in advance.
Stateful execution for AI Agents
AgentOps monitoring is critical given the
complexity and long running nature of AI
agents. We define observability as the
ability to find out where in the process the
execution is and whether any
unanticipated glitches have appeared.
- Local queries: Queries which can be
answered based on the local state
information of an agent.
- Composite queries: Queries expressed
over the states of several agents.
- Historical queries: Queries related to the
execution history of the composition.
- Relationship queries: Queries based on
the relationship between states.
* D. Biswas. Stateful Monitoring and Responsible Deployment of AI Agents. 17th
International Conference on Agents and Artificial Intelligence (ICAART), 2025 (link)
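A sketch of how the four query types could be expressed over agent execution states; the state record and query functions are illustrative, not part of a specific AgentOps product.

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    agent: str
    step: str                       # where in the process the execution is
    status: str                     # e.g., "running", "failed", "done"
    history: list = field(default_factory=list)

def local_query(state: AgentState) -> str:
    # Answered from the local state information of a single agent.
    return f"{state.agent} is at step '{state.step}' ({state.status})"

def composite_query(states: list) -> bool:
    # Expressed over the states of several agents, e.g., "has any component agent failed?"
    return any(s.status == "failed" for s in states)

def historical_query(state: AgentState) -> list:
    # Execution history of the composition.
    return state.history

def relationship_query(a: AgentState, b: AgentState) -> bool:
    # e.g., "did agent a complete 'payment' before agent b started 'shipping'?"
    return "payment" in a.history and b.step == "shipping"
```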
Conversational Memory Management using Vector DBs
Vector DBs are currently the primary
medium to store and retrieve data
(memory) corresponding to
conversational agents.
- This involves selecting an encoder
model that performs offline data
encoding as a separate process,
converting various forms of raw data,
such as text, audio, and video, into
vectors.
- During a chat, the conversational agent can query the long-term memory system by encoding the query and searching for relevant information within the Vector DB. The retrieved information is then used to answer the query.
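A sketch of the described flow with a toy in-memory vector store; a production setup would use a managed Vector DB and a dedicated encoder model (the `embed` callable here is an assumption).

```python
import numpy as np

class SimpleVectorMemory:
    """Toy long-term memory: offline encoding + similarity search at chat time."""
    def __init__(self, embed):
        self.embed = embed          # encoder model (raw data -> vector)
        self.items = []             # list of (vector, raw_text)

    def add(self, text: str):
        # Offline encoding step, run as a separate process in practice.
        self.items.append((self.embed(text), text))

    def query(self, question: str, top_k: int = 3) -> list:
        # Encode the query and return the most similar stored items.
        q = self.embed(question)
        scored = [(float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9)), t)
                  for v, t in self.items]
        return [t for _, t in sorted(scored, reverse=True)[:top_k]]
```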
Human Memory Understanding
We need to consider the following memory types.
- Semantic memory: general knowledge with facts, concepts,
meanings, etc.
- Episodic memory: personal memory with respect to specific
events and situations from the past.
- Procedural memory: motor skills like driving a car, with the
corresponding procedures to achieve the task.
- Emotional memory: feelings associated with experiences.
Agentic Memory Management
By default, the memory router always routes to the long-term memory (LTM) module first, to check whether an existing pattern can respond to the given user prompt. If so, it retrieves the pattern and responds immediately, personalizing it as needed.
* D. Biswas. Long-term Memory for AI Agents. AI Advances, 2024 (link)
If the LTM lookup fails, the memory router routes the prompt to the short-term memory (STM) module, which then uses its retrieval processes (APIs, etc.) to get the relevant context into the STM (working memory) — leveraging applicable data services.
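A sketch of the routing logic; the `ltm` / `stm` modules, the `personalize` step, and the confidence threshold are placeholders for the components described above.

```python
def route_prompt(prompt: str, ltm, stm, personalize, threshold: float = 0.8) -> str:
    """Try long-term memory first; fall back to short-term (working) memory retrieval."""
    pattern, confidence = ltm.lookup(prompt)          # existing response pattern?
    if pattern is not None and confidence >= threshold:
        return personalize(pattern)                   # respond immediately from LTM
    # LTM miss: pull fresh context into working memory via retrieval APIs / data services.
    context = stm.retrieve(prompt)
    return stm.respond(prompt, context)
```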
Agentic Memory Management (2)
The STM — LTM transformer module is always active: it continuously processes the retrieved context, extracts recipes from it (e.g., see the concepts of teachable agents and recipes in AutoGen), and stores them in a semantic layer (implemented via a Vector DB).
* D. Biswas. Long-term Memory for AI Agents. AI Advances, 2024 (link)
At the same time, it also collects other associated properties (e.g., no. of tokens, cost of executing the response, state of the system, etc.) and
- creates an episode, which is then stored in a knowledge graph,
- with the underlying procedure stored in a finite state machine (FSM).
Agentic AI Scenarios:
- Agentic RAGs
- Reinforcement
Learning Agents
Agentic RAGs: extending RAGs to SQL Databases
We present an Agentic AI framework to build RAG pipelines that work seamlessly over both structured and unstructured data stored in Snowflake.
* D. Biswas. Agentic RAGs: extending RAGs to SQL Databases. AI Advances, 2024 (link)
The SQL & Document query agents
leverage the respective Snowflake
Cortex Analyst and Search
components detailed earlier to
query the underlying SQL and
Document repositories.
Finally, to complete the RAG pipeline, the retrieved data is added to the original prompt — leading to the generation of a contextualized response (see the sketch below).
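A sketch of the routing between the SQL and Document query agents. The Snowflake Cortex Analyst / Search calls are represented by hypothetical `sql_agent` / `doc_agent` objects rather than the actual Cortex APIs, and the router LLM interface is likewise an assumption.

```python
def agentic_rag(query: str, router_llm, sql_agent, doc_agent, llm) -> str:
    # Route: decide whether the question targets structured or unstructured data.
    target = router_llm.classify(query, choices=["sql", "documents"])
    retrieved = (sql_agent.query(query) if target == "sql"
                 else doc_agent.search(query))
    # Complete the RAG pipeline: add the retrieved data to the original prompt.
    prompt = f"Context:\n{retrieved}\n\nQuestion: {query}\nAnswer:"
    return llm.generate(prompt)
```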
Reinforcement Learning Agents
When we talk about AI agents today, we mostly talk about LLM agents, which loosely translates to invoking (prompting) an LLM to perform natural language processing (NLP) tasks.
Some agentic tasks might be
better suited to other ML
techniques, e.g., Reinforcement
Learning (RL), predictive
analytics, etc. — depending on
the use-case objectives.
* D. Biswas. LLM based fine-tuning of Reinforcement Learning Agents. AI Advances, 2024 (link)
LLM based fine-tuning of Reinforcement Learning Agents
* D. Biswas. LLM based fine-tuning of Reinforcement Learning Agents. AI Advances, 2024 (link)
We focus on RL agents, and
show how LLMs can be used
to fine-tune the RL agent
reward / policy functions.
Reinforcement Learning Agents applied to HVAC
Optimization
* D. Biswas. Reinforcement Learning based Energy Optimization in Factories, in proc. of the 11th ACM Conference on Future
Energy Systems (e-Energy), 2020. (link)
We show a concrete
example of applying
the fine-tuning
methodology to a real-
life industrial control
system — designing
the RL based controller
for HVAC optimization
in a building setting.
Responsible AI
Agents
Data Quality Issues with respect to LLMs, esp.
Vector DBs
From a data quality point of view,
we see the following challenges
w.r.t. LLMs, esp. Vector DBs:
- Accuracy of the encodings in vector stores, measured in terms of the correctness and groundedness of the generated LLM responses.
- Incorrect and/or inconsistent vectors: due to issues in the embedding process, some vectors may end up corrupted, incomplete, or generated with a different dimensionality.
- Missing data can be in the form of
missing vectors or metadata.
- Timeliness issues w.r.t. outdated
documents impacting the vector
store.
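A sketch of basic data-quality checks matching the issues above (dimensionality, missing vectors / metadata, staleness); the record layout and freshness threshold are assumptions.

```python
from datetime import datetime, timedelta

def check_vector_record(record: dict, expected_dim: int, max_age_days: int = 180) -> list:
    """Return a list of data-quality issues for one vector-store record."""
    issues = []
    vec, meta = record.get("vector"), record.get("metadata")
    if vec is None:
        issues.append("missing vector")
    elif len(vec) != expected_dim:
        issues.append(f"inconsistent dimensionality: {len(vec)} != {expected_dim}")
    if not meta:
        issues.append("missing metadata")
    elif "updated_at" in meta and \
            datetime.utcnow() - meta["updated_at"] > timedelta(days=max_age_days):
        issues.append("stale source document")     # timeliness issue
    return issues
```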
* D. Biswas. Long-term Memory for AI Agents. AI Advances, 2024 (link)
Explainability
Explainable AI is an umbrella term for a range of tools, algorithms, and methods that accompany AI model predictions with explanations.
- Explainability of AI models ranks
high among the list of ‘non-
functional’ AI features to be
considered by enterprises.
- For example, this implies having
to explain why an ML model
profiled a user to be in a specific
segment — which led him/her to
receiving an advertisement.
[Diagram: (Labeled) Data → Train ML Model → Predictions → Explanation Model → Explainable Predictions]
Fairness & Bias
Bias creeps into AI models, primarily
due to the inherent bias already
present in the training data.
So the ‘data’ part of AI model
development is key to addressing
bias.
- Historical Bias: arises due to
historical inequality of human
decisions captured in the training
data
- Representation Bias: arises due to
training data that is not
representative of the actual
population.
*H. Suresh, J. V. Guttag. A Framework for Understanding Unintended Consequences of Machine Learning,
2020 (link)
ML Privacy Risks
Two broad categories of
privacy inference attacks:
• Membership inference (if a
specific user data item was
present in the training
dataset) and
• Property inference
(reconstruct properties of a
participant’s dataset)
attacks.
Black-box attacks are still possible when the attacker only has access to the APIs: they invoke the model and observe the relationships between inputs and outputs.
[Diagram: the Attacker has access to the Inference API of the ML Model (Classification, Prediction) and wants access to its Training dataset]
* D. Biswas. Privacy Preserving Chatbot Conversations. IEEE AIKE 2020: 179-182 (link)
*D. Biswas, K. Vidyasankar. A Privacy Framework for Hierarchical Federated Learning. CIKM Workshops 2021 (link)
Gen AI Privacy Risks – novel challenges
From a privacy point of view, we
need to consider the following
additional / different LLM privacy
risks:
- Membership and property
leakage from pre-training data
- Model features leakage from
pre-trained LLM
- Privacy leakage from
conversations (history) with
LLMs
- Compliance with privacy intent
of users
* D. Biswas. Privacy Risks of Large Language Models. AI Advances, 2024 (link)
Responsible deployment of AI Agents
* D. Biswas. Stateful Monitoring and Responsible Deployment of AI Agents. 17th International Conference on Agents and Artificial Intelligence (ICAART), 2025 (link)
Use-case specific Evaluation of LLMs
Need for a comprehensive LLM evaluation strategy with targeted
success metrics specific to the use-cases.
* D. Biswas. Use Case-Based Evaluation Strategy for LLMs. AI Advances, 2024 (link)
LLM Safety Leaderboard
*Hugging Face LLM Safety Leaderboard (link)
* B. Wang, et al. DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models, 2024 (link)
Thanks
&
Questions
Debmalya Biswas
https://www.linkedin.com/in/debmalya-
biswas-3975261/
https://medium.com/@debmalyabiswas