Agentic Techniques in Retrieval-Augmented Generation with Azure AI Search

Agentic Techniques in
Retrieval-Augmented
Generation with Azure
AI Search
Maxim Salnikov
Digital and App Innovation
Business Lead at Microsoft

I’m Maxim Salnikov
 Building on web platform since 90s
 Organizing developer communities
and technical conferences
 Speaking, training, blogging: Webdev,
Cloud, Generative AI, Prompt
Engineering
Helping developers to succeed with the Dev Tools, Cloud & AI in Microsoft

How to build high-quality
search in LLM-infused apps

Well-known limitations of LLMS
Outdated public knowledge
No internal knowledge

Incorporating domain knowledge
Fine
tuning
Learn new skills
(permanently)
High cost, time
Retrieval
augmentation
Learn new facts
(temporarily)

RAG: Retrieval Augmented Generation
User
Question
What is included in
my Northwind
Health Plus plan
that is not in
standard?
[Benefit_Options.pdf#page=3]
Both plans offer coverage for
medical services. Northwind
Health Plus offers coverage for
hospital stays, doctor visits, lab
tests, and X-rays. Northwind
Standard only offers coverage
for doctor visits and lab tests. …
Large
Language
Model
medical services, such as doctor
visits and lab tests, but the
Health Plus plan also covers
hospital stays and X-rays.1
Search

Components of a high-quality RAG
1
Sophisticated
LLM
Adheres to instructions
Supports function calling
2
Well prepared
data
Reasonably sized text
Meaningful vectors
3
Powerful search
functionality
Vector search
Hybrid search
Semantic re-ranking
Filtering

Azure AI Search
State-of-the-art RAG
Rigorously tested
retrieval
Streamlined
data pipeline
Developer
integrations
Enterprise-ready
foundation

Streamlined RAG data pipeline
Ingest
• OneLake
• Blob Storage
• ADLSv2
• SQL DB
• CosmosDB
+ Incremental
change tracking
Extract
• PDFs, Office docs
• Image files
• Nested images in
PDFs, Office docs
• Document layout
JSON, CSVs,
markdown parsing
Chunk
• Split text
into passages
(for text portions):
• Fixed-size
• AI Document
Intelligence layout
output
• Propagate
document
metadata
Embed
• Image → vector
• Text → vector
• Embeddings
from OpenAI,
AI Vision, or
custom models
Index
• Document index
• Chunk index
• Both
Query
• Boosting
• Weighting
• Thresholding

Azure AI Foundry
Foundry Models
Foundry Agent Service
Azure AI
Search
Azure AI
Services
Azure
Machine Learning
Azure AI
Content Safety
Foundry Observability
Security • Identity • Management
Copilot
Studio
Visual
Studio
GitHub
Foundry
SDK
Cloud
Azure
Azure Arc
Foundry Local
Edge
Serverless
Control Azure Kubernetes Service Azure Container Apps
Azure App Service Azure Functions

Classic search workflow
User query L1 L2 → LLM
Results as-is
from L1
Single-shot
retrieval
Reranking
e.g. cross-encoder
Linear
ranked list
Hybrid
RRF(keyword, vector)
No
interpretation
Vector search
embeddings +
vector index
Keyword search
segmentation +
inverted index

A complete stack gives you optimal retrieval
Question: "What underwater activities can I do in the Bahamas?"
Keyword
results
1
Scuba Diving
in Bahamas
Vector
results
1
Scuba Diving in
the Carribean
2
Water skiing in
Seychelles
Fusion
(RRF)
1
Scuba Diving in
the Carribean
2
Scuba Diving
in Bahamas
3
Water skiing
in Seychelles
Reranking
with cutoff
1
Scuba Diving in
Bahamas
2
Scuba Diving in
the Carribean

RAG can struggle with complex queries
Works
security updates KB4048959
Doesn’t work
what does KB4048959 fix
and what systems is the
patch compatibel with?
copay cost
whats the difference in
costs for copays vs split
pays

Introducing agentic
retrieval in Azure AI Search

Announcing
Agentic retrieval
Agentic methods applied to retrieval
Query planning Parallel query execution Results merging
User query
Conversation turns
Query
planning
Search query 1 L1 L2
… L1 L2
Search query n L1 L2
Merged
results
Single LLM call Use conversation history for context • Correct spellings in context • Decompose queries as needed • Paraphrase queries

Query planning & decomposition
what does KB4048959 fix and
what systems is the patch
compatible with?
Correcting spelling is much
more effective if done in
sentence context
Use chat history for context
(the user said earlier it was a
security fix)
Split queries for different
information requirements,
resolving co-references

What is agentic retrieval in Azure AI Search?
What is included in
my Northwind
Health Plus plan that
is not in standard?
medical services, such as doctor
visits and lab tests, but the Health
Plus plan also covers hospital stays
and X-rays1
User
Question
Large
Language
Model
Search
Northwind Health
Plus plan benefits
Search
Standard Northwind
Health plan benefits
Large
Language
Model

Introducing Knowledge Agent
What is included in my Northwind
Health Plus plan that is not in standard?
User
Question
Knowledge
Agent
Northwind
Health Plus
plan benefits
Standard Northwind
Health
plan benefits
Search
Index
Activity Log
Query Results
Northwind Health
Plus plan benefits
3
Standard
Northwind Health
plan benefits
3
Document References
Northwind Health Plus
Summary.pdf
Northwind Health
Comparison.pdf
Northwind Health Standard
Summary.pdf

Query planning in agentic retrieval
User
Question
What is included in my
plan that is not in
standard?
LLM
Reply
Northwind Health
Plus adds several types
of coverage that the
Standard plan does
not have…
User
Question
How much more
does Northwind
Health Plus cost?
LLM
Query
Planning
Payroll deduction
Payroll deduction
Northwind Health
Standard

Activity log in agentic retrieval
LLM
Query
Planning
Input Tokens: 200
Output Tokens: 50
Search
Northwind Health
Plus plan benefits
Results: 3
Duration: 200ms
Search
Standard Northwind
Health plan benefits
Results: 3
Duration: 250ms

How do we know if it’s actually better?

How do we know if it’s actually better?
Query
Answer relevance
Is the answer relevant
to the query?
Context relevance
Is the retrieved context
relevant to the query?
Response Context
Groundedness
Is the response supported by the context?
Source: The RAG Triad | www.trulens.org/getting_started/core_concepts/rag_triad/

Agentic retrieval evals
+40% RAG answer
relevance +30% RAG answer
result rate
For complex queries requiring information from multiple documents
0 10 20 30 40 50 60
Answer score
Content score
Traditional search vs agentic retrieval
Complex Agentric Retrieval Complex Search
Full evals results and methodology description: aka.ms/aisearch-arevals

Resources and
next steps
Get Started with Azure AI Search
aka.ms/AISearch-new
Agentic retrieval free in first phase
of public preview
aka.ms/AgentRAG
GitHub Sample – Quickstart
aka.ms/AISearch-ar-pyn
GitHub Sample – Agent Service Integration
aka.ms/AISearch-ar-agent
Take the Azure AI Learn Courses
aka.ms/CreateAgenticAISolutions

Thank you!
Connect with me on LinkedIn:
• Message me to get a link to this slidedeck
with all links
• Follow me to get the latest AI
announcements
• Invite me to deliver a technical session or
training on Gen AI, Agents, RAG for your
company or conference

Agentic Techniques in Retrieval-Augmented Generation with Azure AI Search

More Related Content

More from Maxim Salnikov (19)

Recently uploaded (20)

Agentic Techniques in Retrieval-Augmented Generation with Azure AI Search