From Traction to Production: Maturing your LLMOps step by step
Maxim Salnikov
Digital & App Innovation Business Lead at Microsoft
I’m Maxim Salnikov
• Building on the web platform since the 90s
• Organizing developer communities and technical conferences
• Speaking, training, blogging: Webdev, Cloud, Generative AI, Prompt Engineering
Helping developers succeed with Dev Tools, Cloud & AI at Microsoft
For every $1 a company invests in AI, it realizes an average return of $3.50.
14 months: the average time it takes organizations to realize a return on their AI investment.
Source: IDC, The Business Opportunity of AI, November 2023
What slows down Generative AI adoption?
Getting started: The state of the art evolves so quickly that it is hard to decide what to use, and guidance and documentation are difficult to find.
Development: Applications often require multiple cutting-edge products and frameworks, which demands specialized expertise and new tools to stitch the components together.
Context: Large Language Models don't know anything about your data.
Evaluation: It is hard to figure out which model to use and how to optimize it for your use case.
Operationalization: Concerns around privacy, security, and grounding. Developers lack the experience and tools to evaluate, improve, and validate their proofs of concept, and to scale and operate them in production.
Introducing LLMOps == how to bring LLM apps to production
Bring together people, process, and platform to automate LLM-infused software delivery and provide continuous value to users.
People, Process, Platform
LLMOps benefits
Automation + Collaboration + Reproducibility == VELOCITY and SECURITY (for LLMs)
The paradigm shift: from MLOps to LLMOps
Traditional MLOps → LLMOps
• Audiences: ML Engineers, Data Scientists → ML Engineers, App Developers
• Assets to share: model, data, environments, features → LLMs, agents, plugins, prompts, chains, APIs
• Metrics/evaluations: accuracy → quality (accuracy, similarity), harm (bias, toxicity), honesty (groundedness), cost (tokens per request), latency (response time, RPS)
• Models: built from scratch → pre-built or fine-tuned, served as an API (MaaS)
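The cost and latency dimensions above are straightforward to measure per request. Below is a minimal sketch, assuming the openai v1 Python SDK against an Azure OpenAI deployment; the endpoint, key, deployment name, and per-1K-token prices are placeholders, not real values.

```python
# Rough sketch of tracking the new LLMOps metrics (cost per request, latency)
# for a single call. Endpoint, key, deployment name, and prices are placeholders.
import time

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<api-key>",
    api_version="2024-06-01",
)

PRICE_PER_1K_PROMPT = 0.005       # hypothetical rates; use your contracted pricing
PRICE_PER_1K_COMPLETION = 0.015

start = time.perf_counter()
response = client.chat.completions.create(
    model="gpt-4o",               # your Azure OpenAI deployment name
    messages=[{"role": "user", "content": "Summarize LLMOps in one sentence."}],
)
latency = time.perf_counter() - start

usage = response.usage
cost = (usage.prompt_tokens * PRICE_PER_1K_PROMPT
        + usage.completion_tokens * PRICE_PER_1K_COMPLETION) / 1000

print(f"latency={latency:.2f}s tokens={usage.total_tokens} cost=${cost:.4f}")
```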
LLM Lifecycle in the real world
Managing
Prototyping
Find LLMs
Hypo
t
h
e
s
i
s
T
r
y
p
r
o
m
p
t
s
BUSINESS
NEED
Ideating/
exploring
PREPARE FOR APP DEPLOYMENT
SEND FEEDBACK
D
e
p
l
o
y
L
L
M
A
p
p
/
U
I
Q
u
o
ta and cost management
C
ontent Filtering
M
o
n
i
t
o
r
i
n
g
Deploying /
Monitoring
Operationalizing
Saf
e
R
o
l
l
o
u
t
/
S
t
a
g
i
n
g
Prompt Engine
e
r
i
n
g
o
r
F
i
n
e
-
t
u
n
i
n
g
Optimizing
ADVANCE PROJECT
REVERT PROJECT
Evalua
t
i
o
n
E
x
c
e
p
t
i
o
n
Handling
Building/
augmenting
R
e
t
r
i
e
v
a
l
A
u
g
m
e
n
t
e
d
Generation
1. Ideating/exploring: identify the business use case → discover your model → test sample prompts → compare different models.
2. Building/augmenting: connect to your data and build LLM flows → run the flow against sample data → evaluate the prompt flow → satisfied? If not, modify the flow (prompts, tools, etc.) and repeat; if yes, run the flow against a larger dataset → evaluate again → satisfied? If not, keep modifying; if yes, move on (a minimal sketch of this loop follows below).
3. Operationalizing: deploy the endpoint → integrate into the application → add monitoring and alerts.
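The "run → evaluate → satisfied?" loop in step 2 can be captured in a tiny harness. This is a minimal sketch with stand-in run_flow and score helpers (hypothetical names); in practice these would be a Prompt Flow batch run and an evaluation flow or evaluator model.

```python
# Sketch of the "run flow -> evaluate -> satisfied?" loop. `run_flow` and `score`
# are hypothetical stand-ins for your flow runner and evaluator.
from statistics import mean

def run_flow(question: str) -> str:
    # Placeholder: invoke your LLM flow here (e.g. a Prompt Flow standard flow).
    return f"Stub answer to: {question}"

def score(question: str, answer: str) -> float:
    # Placeholder: plug in an evaluator (groundedness, relevance, similarity...).
    return 4.2

def evaluate(dataset: list[dict], threshold: float = 4.0) -> bool:
    scores = [score(row["question"], run_flow(row["question"])) for row in dataset]
    average = mean(scores)
    print(f"{len(dataset)} rows, average score {average:.2f}")
    return average >= threshold

sample_data = [{"question": "What is LLMOps?"}]   # small, hand-picked set
larger_data = sample_data * 50                    # stand-in for a bigger dataset

if evaluate(sample_data) and evaluate(larger_data):
    print("Satisfied on both datasets: proceed to operationalizing")
else:
    print("Not satisfied: modify the flow (prompts, tools, etc.) and re-run")
```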
01 INITIAL
The Foundation of Explorations
Discovery of models and testing prompts.
Basic evaluation and monitoring.
02 DEFINED
Systematizing LLM Apps Development
Iterative model augmentation with prompt engineering and RAG.
Structured deployment and prompt-based evaluations.
03 MANAGED
Advanced LLM Workflows and Proactive Monitoring
Comprehensive prompt management, evaluation, and real-time deployment.
Advanced monitoring and automated alerts.
04 OPTIMIZED
Operational Excellence and Continuous Improvement
Seamless, collaborative environment for CI/CD.
Fully automated monitoring and model/prompt refinement.
LLMOps Maturity Model
Achieve generative AI operational excellence with the LLMOps maturity model | Microsoft Azure Blog
Get to know Azure AI
Azure AI Foundry (ex-Studio), https://ai.azure.com: one place for building and deploying AI solutions (model catalog, complete AI toolchain, responsible AI practices, enterprise-grade production at scale).
Azure Machine Learning: full-lifecycle tools for designing and managing responsible AI models (responsible model design, Prompt Flow orchestration, model fine-tuning, model training).
Cutting-edge models: access to the latest foundation and open-source models.
Azure AI Services: pre-trained, turnkey solutions for intelligent applications (Azure OpenAI Service, Azure AI Search, Azure AI Speech, Azure AI Vision, Azure AI Content Safety, Azure AI Document Intelligence, Azure AI Language, Azure AI Translator).
Azure AI Infrastructure: state-of-the-art silicon and systems for AI workloads (high-bandwidth networking, microfluidic cooling, Azure Maia silicon).
LLM Lifecycle in the real world (diagram recap): business need → ideating/exploring → building/augmenting → operationalizing → managing.
Discover the power of small language models: Orca (Orca 1, Orca 2), Phi (Phi-1, Phi-2, Phi-3)
Access a catalog full of pre-built and customizable frontier and open-source models; explore 1800+ foundation models. Choose, compare, test.
• Azure OpenAI Service: GPT-4o, GPT-4o-mini, GPT-4o-realtime, o1-preview, DALL-E 3
• Mistral AI: Mistral Large, Mixtral 8x7B (Mixture of Experts), Mistral 7B
• Meta: Llama-3, Llama-2, CodeLlama
• Cohere: Command R+, Command R, Embed-v3
• Open-source and partner models: Falcon (TII), Stable Diffusion (Stability AI), Dolly (Databricks), CLIP (OpenAI)
Several of these models are also available as Model as a Service (serverless APIs).
Comprehensive comparison view for benchmarking foundation models across metrics: accuracy, groundedness, relevance, coherence, fluency, cost, latency, throughput.
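A simple way to feed such a comparison is to run the same prompts against candidate deployments and record latency and token usage; quality metrics need reference answers or an evaluator. A sketch, assuming two Azure OpenAI deployments and the openai v1 SDK (names and keys are placeholders):

```python
# Compare candidate deployments on the same prompts: latency and token usage
# are measured directly; quality scoring is left as a placeholder.
import time

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<api-key>",
    api_version="2024-06-01",
)

DEPLOYMENTS = ["gpt-4o", "gpt-4o-mini"]   # your deployment names
PROMPTS = ["Explain RAG in two sentences.", "What is a prompt flow?"]

for deployment in DEPLOYMENTS:
    latencies, total_tokens = [], 0
    for prompt in PROMPTS:
        start = time.perf_counter()
        response = client.chat.completions.create(
            model=deployment,
            messages=[{"role": "user", "content": prompt}],
        )
        latencies.append(time.perf_counter() - start)
        total_tokens += response.usage.total_tokens
        # Accuracy/groundedness/fluency would be scored here against references
        # or with an evaluator model.
    print(f"{deployment}: avg latency {sum(latencies) / len(latencies):.2f}s, "
          f"tokens {total_tokens}")
```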
LLM Lifecycle in the real world (diagram recap): business need → ideating/exploring → building/augmenting → operationalizing → managing.
Retrieval Augmented Generation (RAG): anatomy of the workflow
The user asks a question in the app UX; the orchestrator queries a retriever over the knowledge base (e.g. Azure AI Search) for relevant content. Data sources (files, databases, etc.) have been transformed into embeddings and indexed beforehand. The search results are combined with the prompt (R: retrieve), the augmented prompt plus knowledge is sent to the model (A: augment, e.g. Azure OpenAI Service), and the grounded response is returned to the user as the answer (G: generate).
Retrieval techniques: chunking, vectorization, indexing, ranking.
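A minimal end-to-end sketch of the R-A-G steps, assuming an Azure AI Search index named "docs" that already contains chunked, vectorized content in a "content" field, plus the azure-search-documents and openai SDKs (endpoints, keys, and names are placeholders):

```python
# R-A-G in ~20 lines: retrieve from Azure AI Search, augment the prompt,
# generate with Azure OpenAI. Index name, field name, endpoints, and keys are
# placeholders for illustration.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from openai import AzureOpenAI

search = SearchClient(
    endpoint="https://<your-search>.search.windows.net",
    index_name="docs",
    credential=AzureKeyCredential("<search-key>"),
)
llm = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<aoai-key>",
    api_version="2024-06-01",
)

question = "How do I rotate my API keys?"

# R: retrieve relevant chunks (keyword search here; vector/hybrid also possible)
results = search.search(search_text=question, top=3)
knowledge = "\n".join(doc["content"] for doc in results)

# A: augment the prompt with the retrieved knowledge
messages = [
    {"role": "system", "content": "Answer only from the provided sources.\n" + knowledge},
    {"role": "user", "content": question},
]

# G: generate a grounded answer
completion = llm.chat.completions.create(model="gpt-4o", messages=messages)
print(completion.choices[0].message.content)
```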
Prompt Flow for LLMOps!
Create and iteratively develop flow
• Create executable flows that link LLMs, prompts,
Python code and other tools together.
• Debug and iterate your flows, especially the interaction with LLMs, with ease.
Evaluate flow quality and performance
• Evaluate your flow's quality and performance with
larger datasets.
• Integrate the testing and evaluation into your
CI/CD system to ensure quality of your flow.
Streamlined development cycle for production
• Deploy your flow to the serving platform you
choose or integrate into your app's code base
easily.
• Collaborate with your team by leveraging the
cloud version of Prompt flow in Azure AI.
Code-first!
https://github.com/microsoft/promptflow
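For illustration, here is what a single Python tool node of a flow might look like, plus the kind of pf CLI calls used to test and batch-run it locally. This is a sketch: the import path and CLI flags are assumptions that vary between promptflow versions.

```python
# A minimal Python tool node for a Prompt Flow flow (sketch; the import path
# differs between promptflow versions, e.g. `from promptflow import tool`).
from promptflow.core import tool

@tool
def format_context(search_results: list, max_chars: int = 4000) -> str:
    """Concatenate retrieved chunks into the context passed to the LLM node."""
    joined = "\n---\n".join(item["content"] for item in search_results)
    return joined[:max_chars]

# Typical local workflow with the pf CLI (assumed syntax):
#   pf flow test --flow ./my_flow --inputs question="What is LLMOps?"
#   pf run create --flow ./my_flow --data ./eval_data.jsonl
```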
Create your LLM flows
Develop your LLM flow from scratch
• Orchestrate executable flows with LLMs,
prompts, Python tools, and other APIs through
a visualized graph and code-first experiences
• Add new files, edit existing files, and import files locally for authoring flows
• Set conditional controls for the execution of
any node in a flow
Manage API connections
Manage APIs and external data sources
• Seamless integration with pre-built LLMs like
Azure OpenAI Service, Mistral Large, the Llama
family, and the Phi family.
• Built-in safety system with Azure AI Content
Safety
• Effectively manage credentials or secrets for
APIs
• Create your own connections in Python tools
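As a sketch of how a connection keeps credentials out of flow code, a Python tool can receive a custom connection as a parameter; the import paths, attribute names, and CLI command below are assumptions to check against your promptflow version.

```python
# Sketch: a Python tool that reads endpoint and key from a Prompt Flow custom
# connection instead of hard-coding them (import paths/attributes assumed).
import requests
from promptflow.core import tool
from promptflow.connections import CustomConnection

@tool
def call_weather_api(city: str, conn: CustomConnection) -> dict:
    resp = requests.get(
        conn.configs["endpoint"],        # non-secret settings live in configs
        params={"q": city},
        headers={"Authorization": f"Bearer {conn.secrets['api_key']}"},  # secrets stay managed
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

# The connection itself is created once, e.g. (assumed CLI syntax):
#   pf connection create --file ./weather_connection.yaml
```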
Compare different prompt variants
Test different variants
• Create dynamic prompts using external data and few-shot samples
• Edit your complex prompts in full screen
• Quickly tune prompt and LLM configuration
with variants
• Run all variants with a single row of data and
check output
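Variants can also be exercised programmatically; a sketch assuming the promptflow SDK's PFClient and a node named summarize_node with two prompt variants (all names, and the exact SDK signatures, are assumptions):

```python
# Sketch: run the same flow and data against two prompt variants and inspect
# the row-level outputs. PFClient import path and signatures may vary by version.
from promptflow.client import PFClient

pf = PFClient()
runs = {}
for variant in ("variant_0", "variant_1"):
    runs[variant] = pf.run(
        flow="./my_flow",
        data="./eval_data.jsonl",
        variant="${summarize_node." + variant + "}",  # node_name.variant_name
    )

for variant, run in runs.items():
    print(f"--- {variant} ---")
    print(pf.get_details(run).head())   # outputs per input row, as a DataFrame
```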
Fine-tune models in the Azure AI model catalog
• Ready-to-use fine-tuning pipelines to get started quickly: no need to spend time installing frameworks and dependencies
• Optimizations to reduce fine-tuning resources and time
• Fine-tune using the UI, a Notebook (Python SDK), or the CLI (YAML)
• Serverless fine-tuning available in Models as a Service
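Whichever entry point you use, fine-tuning a chat model starts from training data, typically a JSONL file of chat-formatted examples. A small sketch of preparing such a file (the exact schema requirements depend on the model you fine-tune):

```python
# Prepare a chat-format JSONL training file, the usual input for fine-tuning
# chat models. Example content is illustrative only.
import json

examples = [
    {"q": "How do I reset my password?", "a": "Go to Settings > Security > Reset password."},
    {"q": "Where can I find my invoices?", "a": "Open Billing > Invoices in the portal."},
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        record = {
            "messages": [
                {"role": "system", "content": "You are the Contoso support assistant."},
                {"role": "user", "content": ex["q"]},
                {"role": "assistant", "content": ex["a"]},
            ]
        }
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# The resulting file is then referenced from the fine-tuning UI, the Python SDK,
# or the CLI/YAML pipeline mentioned above.
```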
LLM Lifecycle in the real world (diagram recap): ideating/exploring → building/augmenting → operationalizing → managing.
Code-first LLMOps for production with developer tools
Use code to define flows: file-based flows organized in a well-defined folder structure, with CLI/SDK support.
Smooth transition between cloud and local: download flows to local, import flows to the cloud; develop, test, debug, and deploy locally; submit runs from local to the cloud; manage runs and evaluations in the cloud.
Integrate with OSS frameworks: LangChain, Semantic Kernel, AutoGen.
Automate with CI/CD pipelines: SDK/CLI to init, execute, evaluate, and visualize flows and metrics; AZD template integration.
Local development with the VS Code extension: flow editor, local connection management, tracing and run history.
Collaboration on experiment management and productivity: submit flow runs to the cloud from your repo (anywhere), consume cloud resources (compute, data, storage, etc.), and transition iterative local development into a code base for version control.
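In a CI/CD pipeline the same SDK can act as a quality gate: run the app flow over an evaluation dataset, score it with an evaluation flow, and fail the build if metrics regress. A sketch assuming the promptflow SDK's PFClient and a metric named groundedness (names, signatures, and the threshold are assumptions):

```python
# CI quality gate sketch: batch-run the flow, evaluate the outputs, block the
# deployment if a metric drops below a threshold. Adapt names to your setup.
import sys

from promptflow.client import PFClient

pf = PFClient()

base_run = pf.run(flow="./chat_flow", data="./eval_data.jsonl")
eval_run = pf.run(
    flow="./eval_flow",                              # an evaluation flow producing metrics
    data="./eval_data.jsonl",
    run=base_run,                                    # evaluate the base run's outputs
    column_mapping={"answer": "${run.outputs.answer}"},
)

metrics = pf.get_metrics(eval_run)                   # e.g. {"groundedness": 4.3, ...}
print(metrics)

if metrics.get("groundedness", 0) < 4.0:             # hypothetical gate threshold
    sys.exit("Groundedness below threshold: blocking deployment")
```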
Generative AI monitoring
Helps you track and improve the performance of LLM applications in production.
Key benefits
• Utilize the Azure AI instrumentation SDK for effortless production data logging
• Enable monitoring of operational metrics (error rate, latency), token and cost metrics (token usage), and quality metrics (groundedness, coherence, etc.) for prompt flow deployments
• View extensive monitoring results for your prompt flow deployment within a comprehensive UI
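If you are not using the Azure AI instrumentation SDK, a generic fallback is to emit the same operational and token metrics as OpenTelemetry spans exported to Application Insights. A sketch using the azure-monitor-opentelemetry package (the connection string, attribute names, and the stubbed LLM call are placeholders):

```python
# Generic telemetry sketch: record latency and token usage for each LLM call as
# OpenTelemetry span attributes exported to Application Insights.
import time

from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

configure_azure_monitor(connection_string="<app-insights-connection-string>")
tracer = trace.get_tracer("llm-app")

def answer(question: str) -> str:
    with tracer.start_as_current_span("chat_completion") as span:
        start = time.perf_counter()
        response_text, total_tokens = "stubbed answer", 123   # replace with a real LLM call
        span.set_attribute("gen_ai.usage.total_tokens", total_tokens)
        span.set_attribute("latency_seconds", time.perf_counter() - start)
        return response_text

print(answer("What is LLMOps?"))
```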
Mature LLMOps to accelerate Generative AI adoption
• Assess current maturity stage
• Review LLM lifecycle for your solution
• Pick the right tools
Thank you! I kindly prompt you:
+ Special Offer: Identify where you are in the LLMOps journey in 5 minutes!