SlideShare a Scribd company logo
Agentic Techniques in
Retrieval-Augmented
Generation with Azure
AI Search
Maxim Salnikov
Digital and App Innovation
Business Lead at Microsoft
I’m Maxim Salnikov
 Building on web platform since 90s
 Organizing developer communities
and technical conferences
 Speaking, training, blogging: Webdev,
Cloud, Generative AI, Prompt
Engineering
Helping developers to succeed with the Dev Tools, Cloud & AI in Microsoft
How to build high-quality
search in LLM-infused apps
Well-known limitations of LLMS
Outdated public knowledge
No internal knowledge
Incorporating domain knowledge
Fine
tuning
Learn new skills
(permanently)
High cost, time
Retrieval
augmentation
Learn new facts
(temporarily)
RAG: Retrieval Augmented Generation
User
Question
What is included in
my Northwind
Health Plus plan
that is not in
standard?
[Benefit_Options.pdf#page=3]
Both plans offer coverage for
medical services. Northwind
Health Plus offers coverage for
hospital stays, doctor visits, lab
tests, and X-rays. Northwind
Standard only offers coverage
for doctor visits and lab tests. …
Large
Language
Model
Both plans offer coverage for
medical services, such as doctor
visits and lab tests, but the
Health Plus plan also covers
hospital stays and X-rays.1
Search
Components of a high-quality RAG
1
Sophisticated
LLM
Adheres to instructions
Supports function calling
2
Well prepared
data
Reasonably sized text
Meaningful vectors
3
Powerful search
functionality
Vector search
Hybrid search
Semantic re-ranking
Filtering
Azure AI Search
State-of-the-art RAG
Rigorously tested
retrieval
Streamlined
data pipeline
Developer
integrations
Enterprise-ready
foundation
Streamlined RAG data pipeline
Ingest
• OneLake
• Blob Storage
• ADLSv2
• SQL DB
• CosmosDB
+ Incremental
change tracking
Extract
• PDFs, Office docs
• Image files
• Nested images in
PDFs, Office docs
• Document layout
JSON, CSVs,
markdown parsing
Chunk
• Split text
into passages
(for text portions):
• Fixed-size
• AI Document
Intelligence layout
output
• Propagate
document
metadata
Embed
• Image → vector
• Text → vector
• Embeddings
from OpenAI,
AI Vision, or
custom models
Index
• Document index
• Chunk index
• Both
Query
• Boosting
• Weighting
• Thresholding
Azure AI Foundry
Foundry Models
Foundry Agent Service
Azure AI
Search
Azure AI
Services
Azure
Machine Learning
Azure AI
Content Safety
Foundry Observability
Security • Identity • Management
Copilot
Studio
Visual
Studio
GitHub
Foundry
SDK
Cloud
Azure
Azure Arc
Foundry Local
Edge
Serverless
Control Azure Kubernetes Service Azure Container Apps
Azure App Service Azure Functions
Classic search workflow
User query L1 L2 → LLM
Results as-is
from L1
Single-shot
retrieval
Reranking
e.g. cross-encoder
Linear
ranked list
Hybrid
RRF(keyword, vector)
No
interpretation
Vector search
embeddings +
vector index
Keyword search
segmentation +
inverted index
A complete stack gives you optimal retrieval
Question: "What underwater activities can I do in the Bahamas?"
Keyword
results
1
Scuba Diving
in Bahamas
Vector
results
1
Scuba Diving in
the Carribean
2
Water skiing in
Seychelles
Fusion
(RRF)
1
Scuba Diving in
the Carribean
2
Scuba Diving
in Bahamas
3
Water skiing
in Seychelles
Reranking
with cutoff
1
Scuba Diving in
Bahamas
2
Scuba Diving in
the Carribean
RAG can struggle with complex queries
Works
security updates KB4048959
Doesn’t work
what does KB4048959 fix
and what systems is the
patch compatibel with?
copay cost
whats the difference in
costs for copays vs split
pays
Introducing agentic
retrieval in Azure AI Search
Announcing
Agentic retrieval
Agentic methods applied to retrieval
Query planning Parallel query execution Results merging
User query
Conversation turns
Query
planning
Search query 1 L1 L2
… L1 L2
Search query n L1 L2
Merged
results
Single LLM call Use conversation history for context • Correct spellings in context • Decompose queries as needed • Paraphrase queries
Query planning & decomposition
what does KB4048959 fix and
what systems is the patch
compatible with?
Correcting spelling is much
more effective if done in
sentence context
Use chat history for context
(the user said earlier it was a
security fix)
Split queries for different
information requirements,
resolving co-references
What is agentic retrieval in Azure AI Search?
What is included in
my Northwind
Health Plus plan that
is not in standard?
Both plans offer coverage for
medical services, such as doctor
visits and lab tests, but the Health
Plus plan also covers hospital stays
and X-rays1
User
Question
Large
Language
Model
Search
Northwind Health
Plus plan benefits
Search
Standard Northwind
Health plan benefits
Large
Language
Model
Introducing Knowledge Agent
What is included in my Northwind
Health Plus plan that is not in standard?
User
Question
Knowledge
Agent
Northwind
Health Plus
plan benefits
Standard Northwind
Health
plan benefits
Search
Index
Activity Log
Query Results
Northwind Health
Plus plan benefits
3
Standard
Northwind Health
plan benefits
3
Document References
Northwind Health Plus
Summary.pdf
Northwind Health
Comparison.pdf
Northwind Health Standard
Summary.pdf
Query planning in agentic retrieval
User
Question
What is included in my
Northwind Health Plus
plan that is not in
standard?
LLM
Reply
Northwind Health
Plus adds several types
of coverage that the
Standard plan does
not have…
User
Question
How much more
does Northwind
Health Plus cost?
LLM
Query
Planning
Payroll deduction
Northwind Health Plus
Payroll deduction
Northwind Health
Standard
Activity log in agentic retrieval
LLM
Query
Planning
Input Tokens: 200
Output Tokens: 50
Search
Northwind Health
Plus plan benefits
Results: 3
Duration: 200ms
Search
Standard Northwind
Health plan benefits
Results: 3
Duration: 250ms
How do we know if it’s actually better?
How do we know if it’s actually better?
Query
Answer relevance
Is the answer relevant
to the query?
Context relevance
Is the retrieved context
relevant to the query?
Response Context
Groundedness
Is the response supported by the context?
Source: The RAG Triad | www.trulens.org/getting_started/core_concepts/rag_triad/
Agentic retrieval evals
+40% RAG answer
relevance +30% RAG answer
result rate
For complex queries requiring information from multiple documents
0 10 20 30 40 50 60
Answer score
Content score
Traditional search vs agentic retrieval
Complex Agentric Retrieval Complex Search
Full evals results and methodology description: aka.ms/aisearch-arevals
Resources and
next steps
Get Started with Azure AI Search
aka.ms/AISearch-new
Agentic retrieval free in first phase
of public preview
aka.ms/AgentRAG
GitHub Sample – Quickstart
aka.ms/AISearch-ar-pyn
GitHub Sample – Agent Service Integration
aka.ms/AISearch-ar-agent
Take the Azure AI Learn Courses
aka.ms/CreateAgenticAISolutions
Thank you!
Connect with me on LinkedIn:
• Message me to get a link to this slidedeck
with all links
• Follow me to get the latest AI
announcements
• Invite me to deliver a technical session or
training on Gen AI, Agents, RAG for your
company or conference

More Related Content

PDF
Azure AI Foundry: The AI app and agent factory
PDF
Reimagining Software Development and DevOps with Agentic AI
PDF
Meet the Agents: How AI Is Learning to Think, Plan, and Collaborate
PDF
Privacy-first in-browser Generative AI web apps: offline-ready, future-proof,...
PDF
Evaluation as an Essential Component of the Generative AI Lifecycle
PDF
From Traction to Production Maturing your LLMOps step by step
PDF
Privacy-first in-browser Generative AI web apps: offline-ready, future-proof,...
PDF
Real-world coding with GitHub Copilot: tips & tricks
Azure AI Foundry: The AI app and agent factory
Reimagining Software Development and DevOps with Agentic AI
Meet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Privacy-first in-browser Generative AI web apps: offline-ready, future-proof,...
Evaluation as an Essential Component of the Generative AI Lifecycle
From Traction to Production Maturing your LLMOps step by step
Privacy-first in-browser Generative AI web apps: offline-ready, future-proof,...
Real-world coding with GitHub Copilot: tips & tricks

More from Maxim Salnikov (19)

PDF
AI-assisted development: how to build and ship with confidence
PDF
Prompt Engineering - an Art, a Science, or your next Job Title?
PDF
If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...
PDF
Building Generative AI-infused apps: what's possible and how to start
PDF
Prompt Engineering - an Art, a Science, or your next Job Title?
PDF
ChatGPT and not only: how can you use the power of Generative AI at scale
PDF
Using the power of OpenAI with your own data: what's possible and how to start?
PDF
If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...
PDF
Prompt Engineering - an Art, a Science, or your next Job Title?
PDF
ChatGPT and not only: How to use the power of GPT-X models at scale
PDF
How Azure helps to build better business processes and customer experiences w...
PDF
Using the power of Generative AI at scale
PDF
Web Push Notifications done right
PDF
The Status of Angular v13
PPTX
Azure cloud for the web frontend developers
PDF
[Russian] Сервис-воркеры: используем накопленные знания для светлого будущего...
PDF
[Russian] Прогрессивные веб-приложения: по-настоящему кросс-платформенный опыт
PDF
Securing Connected Cars Requires Digital Identity
PDF
How to Make Your IoT Devices Secure, Act Autonomously & Trusted Subjects
AI-assisted development: how to build and ship with confidence
Prompt Engineering - an Art, a Science, or your next Job Title?
If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...
Building Generative AI-infused apps: what's possible and how to start
Prompt Engineering - an Art, a Science, or your next Job Title?
ChatGPT and not only: how can you use the power of Generative AI at scale
Using the power of OpenAI with your own data: what's possible and how to start?
If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...
Prompt Engineering - an Art, a Science, or your next Job Title?
ChatGPT and not only: How to use the power of GPT-X models at scale
How Azure helps to build better business processes and customer experiences w...
Using the power of Generative AI at scale
Web Push Notifications done right
The Status of Angular v13
Azure cloud for the web frontend developers
[Russian] Сервис-воркеры: используем накопленные знания для светлого будущего...
[Russian] Прогрессивные веб-приложения: по-настоящему кросс-платформенный опыт
Securing Connected Cars Requires Digital Identity
How to Make Your IoT Devices Secure, Act Autonomously & Trusted Subjects
Ad

Recently uploaded (20)

PDF
Navsoft: AI-Powered Business Solutions & Custom Software Development
PDF
Upgrade and Innovation Strategies for SAP ERP Customers
PPT
Introduction Database Management System for Course Database
PPTX
Agentic AI : A Practical Guide. Undersating, Implementing and Scaling Autono...
PPTX
Transform Your Business with a Software ERP System
PDF
Internet Downloader Manager (IDM) Crack 6.42 Build 41
PPTX
Embracing Complexity in Serverless! GOTO Serverless Bengaluru
PDF
System and Network Administration Chapter 2
PDF
How to Migrate SBCGlobal Email to Yahoo Easily
PDF
Odoo Companies in India – Driving Business Transformation.pdf
PDF
Nekopoi APK 2025 free lastest update
PDF
Design an Analysis of Algorithms I-SECS-1021-03
PPTX
Agentic AI Use Case- Contract Lifecycle Management (CLM).pptx
PDF
Raksha Bandhan Grocery Pricing Trends in India 2025.pdf
PDF
Designing Intelligence for the Shop Floor.pdf
PDF
wealthsignaloriginal-com-DS-text-... (1).pdf
PDF
T3DD25 TYPO3 Content Blocks - Deep Dive by André Kraus
PDF
top salesforce developer skills in 2025.pdf
PDF
Adobe Illustrator 28.6 Crack My Vision of Vector Design
PDF
Adobe Premiere Pro 2025 (v24.5.0.057) Crack free
Navsoft: AI-Powered Business Solutions & Custom Software Development
Upgrade and Innovation Strategies for SAP ERP Customers
Introduction Database Management System for Course Database
Agentic AI : A Practical Guide. Undersating, Implementing and Scaling Autono...
Transform Your Business with a Software ERP System
Internet Downloader Manager (IDM) Crack 6.42 Build 41
Embracing Complexity in Serverless! GOTO Serverless Bengaluru
System and Network Administration Chapter 2
How to Migrate SBCGlobal Email to Yahoo Easily
Odoo Companies in India – Driving Business Transformation.pdf
Nekopoi APK 2025 free lastest update
Design an Analysis of Algorithms I-SECS-1021-03
Agentic AI Use Case- Contract Lifecycle Management (CLM).pptx
Raksha Bandhan Grocery Pricing Trends in India 2025.pdf
Designing Intelligence for the Shop Floor.pdf
wealthsignaloriginal-com-DS-text-... (1).pdf
T3DD25 TYPO3 Content Blocks - Deep Dive by André Kraus
top salesforce developer skills in 2025.pdf
Adobe Illustrator 28.6 Crack My Vision of Vector Design
Adobe Premiere Pro 2025 (v24.5.0.057) Crack free
Ad

Agentic Techniques in Retrieval-Augmented Generation with Azure AI Search

  • 1. Agentic Techniques in Retrieval-Augmented Generation with Azure AI Search Maxim Salnikov Digital and App Innovation Business Lead at Microsoft
  • 2. I’m Maxim Salnikov  Building on web platform since 90s  Organizing developer communities and technical conferences  Speaking, training, blogging: Webdev, Cloud, Generative AI, Prompt Engineering Helping developers to succeed with the Dev Tools, Cloud & AI in Microsoft
  • 3. How to build high-quality search in LLM-infused apps
  • 4. Well-known limitations of LLMS Outdated public knowledge No internal knowledge
  • 5. Incorporating domain knowledge Fine tuning Learn new skills (permanently) High cost, time Retrieval augmentation Learn new facts (temporarily)
  • 6. RAG: Retrieval Augmented Generation User Question What is included in my Northwind Health Plus plan that is not in standard? [Benefit_Options.pdf#page=3] Both plans offer coverage for medical services. Northwind Health Plus offers coverage for hospital stays, doctor visits, lab tests, and X-rays. Northwind Standard only offers coverage for doctor visits and lab tests. … Large Language Model Both plans offer coverage for medical services, such as doctor visits and lab tests, but the Health Plus plan also covers hospital stays and X-rays.1 Search
  • 7. Components of a high-quality RAG 1 Sophisticated LLM Adheres to instructions Supports function calling 2 Well prepared data Reasonably sized text Meaningful vectors 3 Powerful search functionality Vector search Hybrid search Semantic re-ranking Filtering
  • 8. Azure AI Search State-of-the-art RAG Rigorously tested retrieval Streamlined data pipeline Developer integrations Enterprise-ready foundation
  • 9. Streamlined RAG data pipeline Ingest • OneLake • Blob Storage • ADLSv2 • SQL DB • CosmosDB + Incremental change tracking Extract • PDFs, Office docs • Image files • Nested images in PDFs, Office docs • Document layout JSON, CSVs, markdown parsing Chunk • Split text into passages (for text portions): • Fixed-size • AI Document Intelligence layout output • Propagate document metadata Embed • Image → vector • Text → vector • Embeddings from OpenAI, AI Vision, or custom models Index • Document index • Chunk index • Both Query • Boosting • Weighting • Thresholding
  • 10. Azure AI Foundry Foundry Models Foundry Agent Service Azure AI Search Azure AI Services Azure Machine Learning Azure AI Content Safety Foundry Observability Security • Identity • Management Copilot Studio Visual Studio GitHub Foundry SDK Cloud Azure Azure Arc Foundry Local Edge Serverless Control Azure Kubernetes Service Azure Container Apps Azure App Service Azure Functions
  • 11. Classic search workflow User query L1 L2 → LLM Results as-is from L1 Single-shot retrieval Reranking e.g. cross-encoder Linear ranked list Hybrid RRF(keyword, vector) No interpretation Vector search embeddings + vector index Keyword search segmentation + inverted index
  • 12. A complete stack gives you optimal retrieval Question: "What underwater activities can I do in the Bahamas?" Keyword results 1 Scuba Diving in Bahamas Vector results 1 Scuba Diving in the Carribean 2 Water skiing in Seychelles Fusion (RRF) 1 Scuba Diving in the Carribean 2 Scuba Diving in Bahamas 3 Water skiing in Seychelles Reranking with cutoff 1 Scuba Diving in Bahamas 2 Scuba Diving in the Carribean
  • 13. RAG can struggle with complex queries Works security updates KB4048959 Doesn’t work what does KB4048959 fix and what systems is the patch compatibel with? copay cost whats the difference in costs for copays vs split pays
  • 15. Announcing Agentic retrieval Agentic methods applied to retrieval Query planning Parallel query execution Results merging User query Conversation turns Query planning Search query 1 L1 L2 … L1 L2 Search query n L1 L2 Merged results Single LLM call Use conversation history for context • Correct spellings in context • Decompose queries as needed • Paraphrase queries
  • 16. Query planning & decomposition what does KB4048959 fix and what systems is the patch compatible with? Correcting spelling is much more effective if done in sentence context Use chat history for context (the user said earlier it was a security fix) Split queries for different information requirements, resolving co-references
  • 17. What is agentic retrieval in Azure AI Search? What is included in my Northwind Health Plus plan that is not in standard? Both plans offer coverage for medical services, such as doctor visits and lab tests, but the Health Plus plan also covers hospital stays and X-rays1 User Question Large Language Model Search Northwind Health Plus plan benefits Search Standard Northwind Health plan benefits Large Language Model
  • 18. Introducing Knowledge Agent What is included in my Northwind Health Plus plan that is not in standard? User Question Knowledge Agent Northwind Health Plus plan benefits Standard Northwind Health plan benefits Search Index Activity Log Query Results Northwind Health Plus plan benefits 3 Standard Northwind Health plan benefits 3 Document References Northwind Health Plus Summary.pdf Northwind Health Comparison.pdf Northwind Health Standard Summary.pdf
  • 19. Query planning in agentic retrieval User Question What is included in my Northwind Health Plus plan that is not in standard? LLM Reply Northwind Health Plus adds several types of coverage that the Standard plan does not have… User Question How much more does Northwind Health Plus cost? LLM Query Planning Payroll deduction Northwind Health Plus Payroll deduction Northwind Health Standard
  • 20. Activity log in agentic retrieval LLM Query Planning Input Tokens: 200 Output Tokens: 50 Search Northwind Health Plus plan benefits Results: 3 Duration: 200ms Search Standard Northwind Health plan benefits Results: 3 Duration: 250ms
  • 21. How do we know if it’s actually better?
  • 22. How do we know if it’s actually better? Query Answer relevance Is the answer relevant to the query? Context relevance Is the retrieved context relevant to the query? Response Context Groundedness Is the response supported by the context? Source: The RAG Triad | www.trulens.org/getting_started/core_concepts/rag_triad/
  • 23. Agentic retrieval evals +40% RAG answer relevance +30% RAG answer result rate For complex queries requiring information from multiple documents 0 10 20 30 40 50 60 Answer score Content score Traditional search vs agentic retrieval Complex Agentric Retrieval Complex Search Full evals results and methodology description: aka.ms/aisearch-arevals
  • 24. Resources and next steps Get Started with Azure AI Search aka.ms/AISearch-new Agentic retrieval free in first phase of public preview aka.ms/AgentRAG GitHub Sample – Quickstart aka.ms/AISearch-ar-pyn GitHub Sample – Agent Service Integration aka.ms/AISearch-ar-agent Take the Azure AI Learn Courses aka.ms/CreateAgenticAISolutions
  • 25. Thank you! Connect with me on LinkedIn: • Message me to get a link to this slidedeck with all links • Follow me to get the latest AI announcements • Invite me to deliver a technical session or training on Gen AI, Agents, RAG for your company or conference