Building an enterprise-ready analytics and operational ecosystem on DC/OS
Ignacio Mulas
Index:
● Data Centric Overview
● Non-functional Requirements
● Functional Case:
○ Data Exploration
○ Data Preparation
○ Data Validation
○ Productionalization
○ Evaluation
Overview
THE ROOT OF THE PROBLEM OF PHYSICAL COMPANIES HAS BEEN IDENTIFIED: SILOS & APPLICATION CENTRIC
Applications: SAP (ERP), Mobile App, Campaign Manager, CRM, Call center, E-commerce, TPV (point-of-sale) app
Data stores: Big Data Lake, Data Marts, Data Warehouse
Problems:
● Lost data
● No real time
● 10X data replication
● Low TPO/TCO
● 10X costs
● Day-1 analytics
● Non-integrated vision
● Silos between departments
● Not a real AI
SOLUTION: STRATIO DATACENTRIC
Operationalizing Big Data
Applications around the data: Mobile APP, Campaign Management, Digital Marketing, Legacy Applications, Call center, Core Application, ATG, TPV APP, CRM, E-commerce
Layers: DATA; Data intelligence; API DaaS (Data as a Service); Microservices of the Data Intelligence layer; MultiDataStore & Multiprocessing; DC/OS (infrastructure and container manager)
● Unique data at the center, with applications around it using it in real time with maximum intelligence
● Operational and informational applications use the microservices of the Data as a Service layer
● New applications are developed through microservice orchestration, reducing code in half
Outer look....
Stratio DataCentric
Products: Stratio EOS, Stratio XData, Stratio Sparta, Stratio Discovery, Stratio Governance, Stratio GoSec
● Deploy and manage all your services with a single click
● Gain a centralized vision of all your data and easily govern its access and management
● Apply real-time and batch processing across multiple engines in distributed environments
● Become a truly data-driven company with AI
● Turn difficult concepts into something simple
● Protect your data against security breaches and maintain compliance
● Stratio Intelligence: begin the journey from data to knowledge
● Microservices Framework: design, develop and manage applications easily
Non-functional requirements
Key non-functional requirements for a data-centric platform
1. Security levels & profiling: in this scenario, we need to support encrypted communications, authentication and authorization mechanisms, auditing, and a centralized, easy-to-use security manager that enforces complex policies on applications and data.
2. Isolation of resources: we should guarantee that each application or user has what it needs to work properly without stepping on others' resources. Mixing different workloads (e.g., operational microservices and big data frameworks) should not affect the correct functioning of the most critical services.
3. Data governance tools: bringing everything together imposes new levels of data management requirements, where data is not modelled but auto-discovered and enriched with business context.
4. DevOps productionalization mechanisms: in the cloud and container era, maintenance and operations are reduced to a minimum thanks to automation. Scaling, upgrading, and deploying are day-to-day tasks, so we need easy mechanisms to perform and manage them.
1. Security levels & profiling
Diagram labels: µs, µs-2, SSO, Policies, Audit, Secrets
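The policy and audit parts of the security requirement can be illustrated with a small sketch. It is not Stratio GoSec's API: the `Policy` and `SecurityManager` classes, the subject name, and the resource name are all hypothetical, and the sketch assumes the caller's identity has already been verified (for example through SSO).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: which actions a subject may perform on a resource."""
    subject: str
    resource: str
    actions: frozenset


@dataclass
class SecurityManager:
    """Hypothetical central manager: evaluates policies and audits every decision."""
    policies: List[Policy] = field(default_factory=list)
    audit_log: List[dict] = field(default_factory=list)

    def is_authorized(self, subject: str, resource: str, action: str) -> bool:
        allowed = any(
            p.subject == subject and p.resource == resource and action in p.actions
            for p in self.policies
        )
        # Record granted and denied requests alike so they can be audited later.
        self.audit_log.append(
            {"subject": subject, "resource": resource, "action": action, "allowed": allowed}
        )
        return allowed


manager = SecurityManager(
    policies=[Policy("scoring-service", "clients_table", frozenset({"read"}))]
)
read_ok = manager.is_authorized("scoring-service", "clients_table", "read")
write_ok = manager.is_authorized("scoring-service", "clients_table", "write")
```

Keeping the decision and the audit record in one place is what makes the manager "centralized": every microservice asks the same component, and every answer leaves a trace.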
2. Isolation of resources
Diagram labels: µs, Big Data Process, ...
- Network isolation
- CPU, RAM, Disk isolation
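The bookkeeping behind CPU, RAM, and disk isolation can be sketched as a quota check per workload group. On a real cluster the scheduler and container runtime enforce these limits. The group names, quota values, and the `fits_quota` helper below are assumptions made for illustration only.

```python
from typing import Dict, List

# Hypothetical per-group quotas; resource kinds follow the slide (CPU, RAM, disk).
QUOTAS: Dict[str, Dict[str, float]] = {
    "operational": {"cpus": 8.0, "mem_mb": 16384, "disk_mb": 51200},
    "big-data": {"cpus": 32.0, "mem_mb": 131072, "disk_mb": 512000},
}


def fits_quota(group: str, running: List[Dict[str, float]], request: Dict[str, float]) -> bool:
    """Return True if `request` fits in what is left of the group's quota."""
    quota = QUOTAS[group]
    for resource, limit in quota.items():
        used = sum(app.get(resource, 0) for app in running)
        if used + request.get(resource, 0) > limit:
            return False
    return True


# One big data job is already running in the "big-data" group.
running_big_data = [{"cpus": 24.0, "mem_mb": 98304, "disk_mb": 200000}]
small_job = {"cpus": 4.0, "mem_mb": 16384, "disk_mb": 10000}
large_job = {"cpus": 16.0, "mem_mb": 16384, "disk_mb": 10000}

small_fits = fits_quota("big-data", running_big_data, small_job)
large_fits = fits_quota("big-data", running_big_data, large_job)
```

Because each group is checked only against its own quota, an oversized big data job is rejected without ever touching the resources reserved for operational microservices.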
3. Data governance tools
Diagram labels: Big Data Tool, Data Dictionary, Business glossary, Lineage
A process or application needs data to work properly, but we need to maintain certain guarantees:
- Data security:
  - Who are you?
  - Are you authorized to read/write data from here?
- Data process development:
  - Where can I read a trusted source of information containing my clients' emails?
  - Is this personal data? I need to follow GDPR!
  - Can I delete this record? I do not think it is used in our business…
  - Who created this?
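The governance questions above map naturally onto a data dictionary that carries ownership, personal-data flags, and lineage. The following is a minimal sketch under that assumption. The `DatasetEntry` structure, the dataset names, and the helper functions are hypothetical and do not describe Stratio Governance itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetEntry:
    """Hypothetical data dictionary entry with business context and lineage."""
    name: str
    owner: str            # answers "Who created this?"
    trusted: bool         # marks a trusted source of information
    columns: List[str]
    personal_columns: List[str] = field(default_factory=list)  # GDPR-relevant columns
    consumers: List[str] = field(default_factory=list)         # downstream lineage


catalog: Dict[str, DatasetEntry] = {
    "crm_clients": DatasetEntry(
        name="crm_clients", owner="crm-team", trusted=True,
        columns=["client_id", "email"], personal_columns=["email"],
        consumers=["client_scoring"],
    ),
    "tmp_export": DatasetEntry(
        name="tmp_export", owner="analyst-1", trusted=False,
        columns=["client_id", "email"], personal_columns=["email"],
    ),
}


def trusted_sources_with(catalog: Dict[str, DatasetEntry], column: str) -> List[str]:
    """Where can I read a trusted source containing this column?"""
    return [e.name for e in catalog.values() if e.trusted and column in e.columns]


def is_personal(catalog: Dict[str, DatasetEntry], dataset: str, column: str) -> bool:
    """Is this personal data?"""
    return column in catalog[dataset].personal_columns


def safe_to_delete(catalog: Dict[str, DatasetEntry], dataset: str) -> bool:
    """Can I delete this? Only if lineage shows no downstream consumers."""
    return not catalog[dataset].consumers


email_sources = trusted_sources_with(catalog, "email")
creator = catalog["tmp_export"].owner
```

In the auto-discovered setting the slide describes, entries like these would be populated by tooling rather than written by hand; the sketch only shows how the questions become simple lookups once that metadata exists.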
4. DevOps productionalization mechanisms
Deployment: different deployment models
● Replace version
● Blue/Green
● Canary testing
Management:
● Versioning and history
● Rollback mechanisms
● Model retraining
Evaluation:
● Functioning evaluation
● Metrics tracking
● Version comparison
Monitoring: applications are monitored on several metrics
● Application metrics
● Business metrics
● Computational metrics
Functional case: Client Scoring for a financial institution
1. Data exploration: occurs early in a project; may include viewing sample data, running queries for statistical profiling, exploratory analysis, and visualizing data.
2. Data preparation: iterative task; may include cleaning, standardizing, transforming, denormalizing, and aggregating data; typically the most time-intensive task of a project.
3. Data validation: recurring task; may include viewing sample data, running queries for statistical profiling and aggregate analysis, and visualizing data; typically occurs as part of the data exploration, data preparation, development, pre-deployment, and post-deployment phases.
4. Productionalization: occurs late in a project; may include deploying code to production, backfilling datasets, training models, validating data, and scheduling workflows.
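The first three phases can be illustrated on a toy client dataset. This is a sketch only: the records, field names, and validation rules below are invented for the example and are not taken from the financial institution's data.

```python
from statistics import mean
from typing import Dict, List, Optional

# Hypothetical raw client records; field names are illustrative only.
raw_clients: List[Dict[str, Optional[str]]] = [
    {"client_id": "1", "country": " es ", "monthly_income": "2500"},
    {"client_id": "2", "country": "ES", "monthly_income": None},
    {"client_id": "3", "country": "fr", "monthly_income": "4100"},
    {"client_id": "3", "country": "fr", "monthly_income": "4100"},  # duplicate row
]


def explore(rows: List[dict]) -> dict:
    """Data exploration: a simple profile of row count and missing values."""
    missing = sum(1 for r in rows if r["monthly_income"] is None)
    return {"rows": len(rows), "missing_income": missing}


def prepare(rows: List[dict]) -> List[dict]:
    """Data preparation: drop incomplete rows, standardize values, de-duplicate."""
    seen = set()
    cleaned = []
    for r in rows:
        if r["monthly_income"] is None or r["client_id"] in seen:
            continue
        seen.add(r["client_id"])
        cleaned.append({
            "client_id": int(r["client_id"]),
            "country": r["country"].strip().upper(),
            "monthly_income": float(r["monthly_income"]),
        })
    return cleaned


def validate(rows: List[dict]) -> List[str]:
    """Data validation: return a list of problems; an empty list means the data passed."""
    problems = []
    ids = [r["client_id"] for r in rows]
    if len(ids) != len(set(ids)):
        problems.append("duplicate client_id")
    if any(r["monthly_income"] < 0 for r in rows):
        problems.append("negative income")
    if any(len(r["country"]) != 2 for r in rows):
        problems.append("country is not a 2-letter code")
    return problems


profile = explore(raw_clients)
clients = prepare(raw_clients)
issues = validate(clients)
average_income = mean(r["monthly_income"] for r in clients)
```

Because validation is a recurring task, `validate` is written as a standalone function that can be rerun after preparation, before deployment, and again on production data.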
Data Exploration
Data Preparation
Data Validation
Productionalization - Workflow
Productionalization - Workflow Versioning
Productionalization - Workflow Deployment
Evaluation
BIG DATA CHILD'S PLAY
Questions? :)
Mesos Meetup - Building an enterprise-ready analytics and operational ecosystem on DC/OS
New Capabilities…
● Facial Recognition: ability to correctly identify a high percentage of known individuals, given an image of a face. Ability to learn new faces.
● Emotion Classification: ability to correctly classify above 65% of a person's emotions, given an image of a face. The emotions identified are happiness, sadness, surprise, and anger.
● Object Recognition: ability to segment and classify objects from images.
● Natural Interaction Agent: ability to talk to humans in a natural way (typing or through voice using a phone terminal). Ability to trigger basic actions based on the identified intent, e.g., "show a document" or "switch on a light bulb".
● Semantic Document Retrieval: ability to find documents based on their content. Querying is based on natural interaction using standard text.
● Question Answering: ability to answer specific questions from a text or a document, e.g., "When was Peter born?" => "May 20th, 2001".
● Awareness: ability to manage any amount of data almost instantaneously in order to reach conclusions, create warnings, or trigger actions. The data managed by this ability could come from the previous abilities and/or any other external feed.