MLOps for Compositional AI
Debmalya Biswas
NeurIPS 2022 Workshop: Challenges in Deploying and Monitoring
Machine Learning Systems (DMML)
Enterprise AI
Enterprise AI/ML use-cases are pervasive.
They are broadly categorized by the three core
AI/ML capabilities enabling them:
Natural Language Processing (NLP),
Computer Vision, and Predictive Analytics.
The majority of AI/ML models are still
developed with the goal of solving a
single task, e.g., prediction or classification.
Compositional AI Scenario
Consider the online Repair Service of a
luxury goods vendor.
The service consists of a Computer Vision
(CV) model capable of assessing the repairs
needed, given a picture of the product
uploaded by the customer.
[Figure: Repair Ordering Service, composed of the Product Repair Assessment CV Model and the Chatbot Ordering App]
The assessment is followed by an Ordering
Chatbot conversation that captures
additional details required to process the
user’s repair request, e.g., damage details,
username, contact details, etc.
Compositional AI Scenario (2)
In the future, when the enterprise is looking for models
to develop a Product Recommendation service, the
Repair Service is considered for reuse.
The data gathered by the Repair Service, i.e., the state of
products owned by the users (gathered by the CV
assessment model) together with their demographics
(gathered by the Ordering Chatbot), provides
additional training data for the Recommender Service.
However, privacy policies may prevent these datasets from being
combined, so that they cannot be used to profile
customers: “data used for a different purpose than
originally intended”.
[Figure: Repair Ordering Service (Product Repair Assessment CV Model + Chatbot Ordering App) supplies damaged product images, text descriptions, and customer demographics to the Product Recommendation Service, which holds products purchased and demographics]
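The purpose-limitation constraint above can be made explicit before any data is reused. What follows is a minimal sketch, assuming each dataset carries the set of purposes its owners agreed to; the `Dataset` class, the `can_combine` function, and the purpose names are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    fields: frozenset
    allowed_purposes: frozenset  # assumption: purposes recorded at collection time


def can_combine(datasets, purpose):
    """Return True only if every dataset permits the requested purpose."""
    return all(purpose in d.allowed_purposes for d in datasets)


# Hypothetical datasets gathered by the Repair Service
repair_images = Dataset(
    "repair_images",
    frozenset({"image", "damage_text"}),
    frozenset({"repair", "defect_detection"}),
)
demographics = Dataset(
    "chatbot_demographics",
    frozenset({"age_band", "region"}),
    frozenset({"repair"}),
)

print(can_combine([repair_images, demographics], "repair"))
print(can_combine([repair_images, demographics], "recommendation"))
```

Under these assumed policies the recommendation use is rejected, while reusing the images alone for defect detection would pass.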
Compositional AI Scenario (3)
The enterprise further wants to develop a CV
app to detect defective products during
manufacturing.
The Repair Service can help here as it has
labeled images of damaged products (with
the product damage descriptions provided
to the Chatbot acting as ‘labels’).
[Figure: Repair Ordering Service (Product Repair Assessment CV Model + Chatbot Ordering App) supplies damaged product images and text descriptions to the Manufacturing Defect Detection App; customer demographics are not passed on]
Compositional AI
[Figure: building blocks (Labeled) Data, (Train) ML Model, and API Endpoint combined into a (New) Composite Service; arbitrary composition of Data, Model, API]
Compositional AI envisions seamless composition of existing AI/ML services, to provide a new
(composite) AI/ML service, capable of addressing complex multi-domain use-cases.
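At the API level, the composition described above might look like the following. This is only a sketch: the `Service` wrapper, the `compose` helper, and the two stub functions are hypothetical stand-ins for real model endpoints, and they chain services sequentially, which is just one of the possible composition patterns.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Service:
    name: str
    run: Callable[[Dict], Dict]  # takes the request state, returns fields to add


def compose(name: str, services: List[Service]) -> Service:
    """Build a composite service that runs the given services in order."""
    def run(request: Dict) -> Dict:
        state = dict(request)
        for service in services:
            state.update(service.run(state))
        return state
    return Service(name, run)


# Stub for the CV assessment model (a real one would score the image)
def assess_damage(state: Dict) -> Dict:
    damage = "scratched strap" if "strap" in state["image_id"] else "unknown"
    return {"damage": damage}


# Stub for the ordering chatbot (a real one would hold a conversation)
def order_chat(state: Dict) -> Dict:
    return {"order": {"customer": state["customer"], "repair": state["damage"]}}


repair_service = compose(
    "Repair Ordering Service",
    [Service("cv_assessment", assess_damage), Service("ordering_chatbot", order_chat)],
)
result = repair_service.run({"image_id": "strap_001.jpg", "customer": "alice"})
print(result["order"])
```

Because `compose` returns a `Service`, a composite can itself be reused as a building block of a larger composite.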
MLOps
MLOps, also known as ModelOps,
combines DevOps with ML to
manage ML models in production.
End-to-end ML lifecycle: Data and
(Serving) API aspects are also
considered.
MLOps manages model versions and
parameters; however, the model
composition aspect is missing.
* D. Sculley et al. Hidden Technical Debt in Machine Learning Systems. NIPS 2015: 2503-2511
DataOps
“DataOps is an automated, process-oriented methodology,
used by analytic and data teams, to improve the quality and
reduce the cycle time of data analytics.”
- Wikipedia
Source data, both structured and unstructured, is ingested into the Bronze
layer, where it is cleansed and standardized into the Silver layer, with
further modeling and transformation into the Gold layer. The data is now
ready for consumption by both BI/Reporting tools and ML pipelines.
Reality: this curated / processed data is moved to another location,
e.g., cloud storage buckets, or another data lake, where it is further
transformed as part of ML training and deployment.
This results in redundancy and a fragmentation of the DataOps and
MLOps pipelines.
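The Bronze, Silver, and Gold layering can be illustrated with plain Python. The records, field names, and the choice of aggregation below are assumptions made for illustration; a real pipeline would run on a data platform rather than on in-memory lists.

```python
def to_silver(bronze):
    """Cleanse and standardize: drop incomplete rows, normalize text and types."""
    silver = []
    for row in bronze:
        if row.get("product") is None or row.get("price") is None:
            continue  # data-quality filter
        silver.append({
            "product": row["product"].strip().lower(),
            "price": float(row["price"]),
        })
    return silver


def to_gold(silver):
    """Model and aggregate: total price per product."""
    totals = {}
    for row in silver:
        totals[row["product"]] = totals.get(row["product"], 0.0) + row["price"]
    return totals


# Hypothetical raw records as they might land in the Bronze layer
bronze = [
    {"product": " Watch ", "price": "120.0"},
    {"product": "watch", "price": 80},
    {"product": None, "price": 10},
    {"product": "Bag", "price": "300"},
]
gold = to_gold(to_silver(bronze))
print(gold)
```

If the ML pipeline copies the Gold output elsewhere and applies its own cleaning again, the two pipelines duplicate work, which is the fragmentation the slide refers to.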
Bridging DataOps & MLOps - Challenges
The data (pre-)processing part of
MLOps includes a series of
transformations that support a
learning algorithm – which are
more complex than those supported
by traditional ETL tools.
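One plausible reading of this complexity, offered here as an assumption rather than as the author's stated argument, is that ML transformations such as encoding and normalization are fitted on the training data and must then be applied identically at inference time. The `Preprocessor` class and its field names below are hypothetical.

```python
class Preprocessor:
    """One-hot encoding plus min-max normalization, fitted on training rows."""

    def fit(self, rows):
        # State learned from the training data only
        self.categories = sorted({r["material"] for r in rows})
        prices = [r["price"] for r in rows]
        self.min_price, self.max_price = min(prices), max(prices)
        return self

    def transform(self, row):
        # Unseen categories map to an all-zero vector
        one_hot = [1.0 if row["material"] == c else 0.0 for c in self.categories]
        span = self.max_price - self.min_price
        scaled = (row["price"] - self.min_price) / span if span else 0.0
        return one_hot + [scaled]


train = [
    {"material": "leather", "price": 100.0},
    {"material": "steel", "price": 300.0},
]
prep = Preprocessor().fit(train)
print(prep.transform({"material": "steel", "price": 200.0}))
```

Unlike a stateless ETL step, the fitted `prep` object has to be versioned and shipped together with the model it was trained with.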
Bridging DataOps & MLOps - Solutions
Snowflake recently announced the Snowpark Python API, which
allows ML models to be trained and deployed within
Snowflake, with Snowpark allowing data scientists to use
Python (rather than writing code in SQL).
Google Cloud Platform (GCP) provides BigQuery ML,
a GCP tool that allows ML models to be trained
purely using SQL within GCP’s Data Warehouse
environment.
AWS Redshift Data API makes it easy for any application
written in Python to interact with Redshift. This allows a
SageMaker notebook to connect to the Redshift cluster
and run Data API commands in Python.
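To give a flavor of training inside the warehouse, the sketch below assembles a BigQuery ML `CREATE MODEL` statement as a string. The model, table, and column names are hypothetical, and the helper only builds the SQL text; submitting it through a client library or console is left out.

```python
def create_model_sql(model, source_table, label_col, feature_cols):
    """Build a BigQuery ML statement that trains a logistic regression model."""
    features = ", ".join(feature_cols)
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'logistic_reg', input_label_cols = ['{label_col}']) AS\n"
        f"SELECT {features}, {label_col}\n"
        f"FROM `{source_table}`"
    )


# Hypothetical dataset, table, and column names
sql = create_model_sql(
    model="repairs.damage_classifier",
    source_table="repairs.gold_repair_requests",
    label_col="needs_repair",
    feature_cols=["product_age_months", "material"],
)
print(sql)
```

The training data stays in the Gold layer, so no export to a separate ML store is needed for this step.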
ML Model Inferences as a new Data Source
Inferences made by a deployed ML model
can be provided as a feedback loop to
augment the existing training dataset of the
deployed model, or as a training dataset for a
new model.
This leads to the scenario where a deployed ML
model generates new data and thereby acts as a Data
Source for the DataOps pipeline. Synthetic data
can also be considered as an additional data
source here.
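The feedback loop can be sketched as follows: each inference is written back as a Bronze-layer record with its provenance, and a later step decides which records may enter a training set. The record layout, the function names, and the confidence threshold are assumptions for illustration, not part of the talk.

```python
from datetime import datetime, timezone


def inference_to_bronze(model_name, model_version, features, prediction, confidence):
    """Wrap a model inference as a raw record for the DataOps pipeline."""
    return {
        "source": "ml_inference",  # provenance: distinguishes model output from observed data
        "model": model_name,
        "model_version": model_version,
        "features": features,
        "label": prediction,
        "confidence": confidence,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }


def select_for_training(bronze_rows, min_confidence=0.9):
    """Keep observed data, plus model-generated rows above the threshold."""
    return [
        r for r in bronze_rows
        if r["source"] != "ml_inference" or r["confidence"] >= min_confidence
    ]


rows = [
    {"source": "customer_upload", "label": "scratch", "confidence": None},
    inference_to_bronze("repair-cv", "1.2", {"image_id": "a"}, "dent", 0.95),
    inference_to_bronze("repair-cv", "1.2", {"image_id": "b"}, "dent", 0.40),
]
selected = select_for_training(rows)
print(len(selected))
```

Keeping the `source` and `model_version` fields makes it possible to exclude or down-weight model-generated labels later, for example when retraining the same model that produced them.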
[Figure: integrated DataOps-MLOps pipeline.
Sources: Structured, Unstructured.
Layers: Raw / Staging (Bronze), Cleansed / Standardized (Silver), Transformed / Modeled (Gold), consumed by BI / Reporting and AI/ML.
DataOps steps: DQ/Validation, Filtering, Historization, Aggregation.
Data pre-processing within MLOps: DQ/Cleaning, Encoding, Selection, Normalization.
MLOps stages: Exploratory Data Analysis, Feature extraction, Training dataset, Test dataset, Model Training, Model Serving (Inference), Model Monitoring.
ML Outputs (Inferences, Predictions) are fed back as a data source.]
Conclusion
In this paper, we highlighted two aspects of MLOps needed to enable
Compositional AI: an integrated DataOps-MLOps pipeline, and the use of
model inferences to augment/generate new training
data (for new models).
As ML models proliferate in the enterprise, Compositional AI has the
potential to enable reuse of (training) data and models, improving
agility and efficiency in ML model development.
Thanks for your attention
Contact: Debmalya Biswas
(LinkedIn) (Medium)
