Starting Your Modern DataOps Journey
Lucas Stone
Solutions Engineer
01.12.2020
The DataOps Story
Waterfall (pre-2000)
Linear approach
Good for projects where the end state is well-defined (e.g. physical infrastructure)
Not as good where the product is continually changing / developing (e.g. software)
Agile (2000)
Began with the “Agile Manifesto”
Designed for production of software products
Response to how rapidly business requirements could change
Emphasised iteration
DevOps (2007)
Even given the rise of Agile, dev and ops teams remained siloed
DevOps aimed to bring them together
Set of practices to release high-quality code faster
DataOps (2014)
Brings the DevOps approach to the data function
Aligns data science / management and operations teams
Ensures the business can fully leverage data and convert it into actionable insight
Where is DataOps in the “Hype Cycle”?
DataOps Search Trends: Past 5 Years
DataOps
Data + Operations
Ensure that data value is delivered to the business as soon as possible
Intersection of the Value and Innovation Pipelines
[Diagram labels: Value Pipeline, Innovation Pipeline, Idea, Data, Development, Production, Value, Quality]
DataOps Misconceptions
DataOps is not…
A technology
Although there is a set of technologies commonly used to support its implementation within an organisation
Restricted
Either to 1) “big data” (the scale and complexity of the data do not preclude benefits) or 2) only advanced data science applications (e.g. ML)
DataOps is…
More than just DevOps for Data!
It brings together 3 distinct elements: Agile development, DevOps, and Statistical Process Control (Lean)
A methodology
It brings together a number of principles and practices around the way an organisation manages and processes data
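Statistical Process Control, one of the three elements named on this slide, can be illustrated with a small sketch. The example below is not part of the original deck: it assumes a hypothetical pipeline metric (daily row counts), derives three-sigma control limits from a baseline, and flags new values that fall outside them. The function names and sample figures are invented for illustration only.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) control limits derived from baseline values."""
    centre = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return centre - sigmas * spread, centre + sigmas * spread


def out_of_control(values, limits):
    """Return the values that fall outside the control limits."""
    lower, upper = limits
    return [v for v in values if v < lower or v > upper]


# Hypothetical daily row counts from a pipeline that is behaving normally.
baseline_row_counts = [1000, 1010, 990, 1005, 995, 1002, 998]
limits = control_limits(baseline_row_counts)

# New runs: the second one loaded far fewer rows than usual.
flagged = out_of_control([1001, 400, 1003], limits)
print(flagged)
```

In a real pipeline the same check could run after every load, so that an unusual value stops bad data from reaching consumers instead of being discovered later.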
DataOps – What are the benefits?
Faster Deployment & Feedback
Consumers of data analytics will get what they need faster and be able to give feedback more often, creating a virtuous circle with business requirements as the starting point
Happier Colleagues
Introduction of DataOps will mean those involved in the process can more quickly see the positive impact of their work, leading to more engaged and productive teams
Higher Data Quality
Increased automation (particularly of testing) and standardised processes will lead to higher data quality, which will in turn lead to better insights generated from Machine Learning models
Collaboration
DataOps promotes collaboration, communication, and coordination between teams that might otherwise remain siloed
Which companies use DataOps?
Please see Qubole’s Creating a Data-Driven Enterprise with DataOps ebook for further
information on how each of these organisations implements DataOps
Case Study: Facebook & Apache Hive
Stage 1
No structure around data requests; the business would request data on an ad-hoc basis and the data function would act as a service
Stage 2
Created a Hadoop data lake and developed Hive to make it more accessible; the data team evolved from a service team to building self-service platforms for data extraction
Stage 3
Combined metadata services with Hive, allowing users to look at data and metadata; however, data was still not accessible to non-technical users
Stage 4
Developed UIs that were easy for business users to understand and use to independently extract data; data becomes fully democratised
What is required to begin implementing DataOps?
1. People & Culture
The foundation for introducing DataOps lies firstly with buy-in from the key stakeholder groups, particularly the business, so that business requirements are understood
2. Processes
It is then important to establish clear processes, including who is Responsible, Accountable, Consulted, and Informed (“RACI”). Those responsible should receive appropriate training. Measuring process effectiveness with appropriate KPIs is also crucial
3. Technologies
Once the correct culture and processes have been established, an organisation can introduce tooling to support related activities, notably automation, testing, and orchestration
People: Stakeholder Groups
Data Consumers
Those who will use data to perform
analysis and extract insights to then
deliver to those in the business who
can use these to drive value
Data Suppliers
Those managing the integrity of
Authoritative Data Sources to ensure
data quality and availability
Data Preparers
Those who build data pipelines
linking one source to another as well
as managing its transformation into a
usable format for Data Consumers
Business
Other parts of the organisation would
not use DataOps – rather they rely
on and benefit from better outputs in
terms of insights / BI / analytics and
convey Business Requirements
DataOps builds two crucial bridges: firstly between the business and technology functions, secondly within the data function itself
People: Ingraining a DataOps Culture
Push from the Top
Cultural changes must be endorsed by
senior management both within and
outside of the data function before being
pushed down to individual teams
Embrace the Process
Acknowledge that change won’t happen overnight and that improvements will be incremental; allow a realistic timeframe for the process of implementing DataOps
Remove silos
Breaking down organisational barriers between
Data Suppliers, Preparers, Consumers, and the
Business will be crucial to the smooth flow of
data to those making decisions
Emphasise Data
Data should be front and centre of
strategic decision making for DataOps to
realise its full potential – this should be
embedded as a company value
Invest in Tools
Carefully selecting a complementary set of technologies to underpin the implementation of DataOps is essential, as is providing the relevant training to upskill your teams
Processes: Building a “Data Supply Chain”
Data Suppliers: Source Owner, DBA, Infrastructure and Ops Personnel, Application Admins + Developers
Data Preparers: Data Engineers, Data Architects, Data Stewards, Integration Architects + Developers, Data Modelers
Data Consumers: Machine Learning Model Developers, Data Scientists
The Business: HR, Finance, Strategy, Operations etc.
Also involved: Data Product Managers, Business Analysts, Data Security Teams, Data Privacy Officers
Technology: Agile, Collaboration, Automation, Infrastructure as Code
Agile
Small but frequent deliveries of new features
Constant feedback loop between the business and tech
Collaboration
Version control to decrease risk and increase productivity
Job / issue tracking to ensure even minor feedback is captured
Automation
Continuous Integration, Deployment, Delivery
Automate testing and speed up getting code into production
Infrastructure as Code
Manage IT infrastructure using code
Make changes to existing infrastructure much more easily
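The idea of managing infrastructure with code can be sketched as follows. This is a tool-agnostic illustration rather than any specific product's API: infrastructure is described as plain data that could be kept under version control, and a hypothetical `plan_changes` function compares the desired description with the current state to work out what to create, update, or delete. All resource names and fields are made up.

```python
def plan_changes(current, desired):
    """Compare two {name: spec} mappings and return the changes needed."""
    return {
        "create": sorted(name for name in desired if name not in current),
        "update": sorted(
            name for name in desired
            if name in current and current[name] != desired[name]
        ),
        "delete": sorted(name for name in current if name not in desired),
    }


# Hypothetical current environment.
current_state = {
    "etl-server": {"cpus": 2, "memory_gb": 8},
    "staging-db": {"engine": "postgres", "storage_gb": 50},
}

# Desired environment, as it might be declared in a versioned file.
desired_state = {
    "etl-server": {"cpus": 4, "memory_gb": 16},
    "staging-db": {"engine": "postgres", "storage_gb": 50},
    "scheduler": {"cpus": 1, "memory_gb": 2},
}

plan = plan_changes(current_state, desired_state)
print(plan)
```

Because the desired state is just text, a change to existing infrastructure becomes an edit that can be reviewed, versioned, and rolled back like any other code change.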
CloverDX & DataOps
Increased deployment frequency
Package, share, and reuse any
functionality you design
Automated testing
Incorporate data quality tests and build in error
handling to your data pipelines
Consistent metadata and version control
CloverDX is easy to integrate with most version control tools, and metadata can easily be tracked and visualised
Monitoring
The CloverDX server has a monitoring suite that
can be applied to individual jobs or whole
business processes
Collaboration across all stakeholders
CloverDX’s visual design allows technical and
non-technical users to “speak the same language”
Gartner identified 5 “key techniques” that will support the delivery of DataOps; CloverDX can support “Data Preparers” with each
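Building data quality tests and error handling into a pipeline, as described under automated testing on this slide, might look like the sketch below. It is plain Python and does not use or represent CloverDX's own components; the record fields (`customer_id`, `amount`), the rules, and the function names are assumptions chosen for illustration. Records that fail a rule are routed to a reject list with the reasons attached, rather than stopping the whole run.

```python
from dataclasses import dataclass, field


@dataclass
class QualityReport:
    """Collects records that passed and records that were rejected."""
    passed: list = field(default_factory=list)
    rejected: list = field(default_factory=list)


def check_record(record):
    """Return a list of rule violations for one record (empty if clean)."""
    problems = []
    if not record.get("customer_id"):
        problems.append("missing customer_id")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        problems.append("amount must be a non-negative number")
    return problems


def run_quality_gate(records):
    """Split records into passed and rejected, keeping the rejection reasons."""
    report = QualityReport()
    for record in records:
        problems = check_record(record)
        if problems:
            report.rejected.append((record, problems))
        else:
            report.passed.append(record)
    return report


# Hypothetical input: one clean record and two that break a rule.
sample = [
    {"customer_id": "C1", "amount": 10.0},
    {"customer_id": "", "amount": 5.0},
    {"customer_id": "C3", "amount": -2},
]
report = run_quality_gate(sample)
print(len(report.passed), "passed;", len(report.rejected), "rejected")
```

Keeping the reasons alongside each rejected record gives Data Suppliers something concrete to act on, which supports the feedback loop the deck describes.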
Upcoming Webinar
Code Management with
Version Control in CloverDX
December 8th
11am EST / 4pm GMT / 5pm CET
Register
Q&A

