Customer Success Story
Cloudera & Xpand IT
Nuno Barreto
Associate Partner & Big Data Lead
nuno.barreto@xpand-it.com
Proprietary & Confidential www.xpand-it.com
THE PROBLEM
How is process Y progressing?
Who are the main cluster users/departments?
Which engines does each department use?
Do I need to plan on an upgrade?
How much is process X costing me?
Are there available time slots?
THE SOLUTION
TELEMETRY
ETL FLOW CONTROL
DATA PREPARATION
ARCHITECTURE
[Architecture diagram. Components: CORE AGENT(s), QUEUE, REAL-TIME, ONLINE DB, ANALYTICS REPO, ETL (PDI extension), ANALYTICS, ANALYTICS DB. Flows: start/stop jobs, log, flow control, status check, metadata access, data access, analytics data, analytical queries, operational data.]
THIS INVOLVES A NUMBER OF CONCEPTS
NEAR REAL-TIME
CLOUDERA INTEGRATION
LAMBDA ARCHITECTURE
STREAMING
NEAR REAL-TIME AND STREAMING
REAL-TIME & STREAMING
[Architecture diagram, repeated from the ARCHITECTURE slide]
REMOTE AGENTS
FINE GRAINED CONTROL
ETL TOOL SPECIFIC
REAL TIME LOGGING
ASYNC EXECUTION
PDI EXTENSION POINTS
CAPTURE LOG START/END
CAPTURE CONNECTION TYPE
CAPTURE STEP LINEAGE DETAIL
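A minimal sketch of what one captured execution event might look like, assuming a JSON payload. The slides do not specify the record layout produced by the PDI extension, so every field name here (job_id, step, event, connection_type, upstream_steps) is hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class ExecutionEvent:
    """One log event emitted when a step starts or ends (hypothetical layout)."""
    job_id: str
    step: str
    event: str                                # "start" or "end"
    timestamp: float                          # seconds since the epoch
    connection_type: Optional[str] = None     # e.g. "jdbc"; None if no connection is used
    upstream_steps: List[str] = field(default_factory=list)  # step lineage detail

    def to_json(self) -> str:
        # sort_keys keeps the serialised form stable across runs
        return json.dumps(asdict(self), sort_keys=True)


def duration_seconds(start: ExecutionEvent, end: ExecutionEvent) -> float:
    """Elapsed time between a matching start/end pair."""
    if (start.job_id, start.step) != (end.job_id, end.step):
        raise ValueError("events belong to different steps")
    if start.event != "start" or end.event != "end":
        raise ValueError("expected a start event followed by an end event")
    return end.timestamp - start.timestamp
```

Pairing the start and end events of a step is one way to derive run times once both records have been captured.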
GATHERING EXECUTION DATA
USE KAFKA AS A LOG SINK
FAULT TOLERANT
REAL TIME
CONSISTENT
COLLECT LOG DATA IN (AS) REAL TIME (AS POSSIBLE)
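One common way to use Kafka as a log sink is to key each message by job, so that all events for a job land in the same partition and keep their order. The sketch below illustrates that idea with an in-memory stand-in rather than a real Kafka client; the class name, the CRC32 hash, and the use of a job id as the key are assumptions for illustration, not details taken from the slides.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple


class InMemoryLogSink:
    """Stand-in for a Kafka topic: a fixed number of append-only partitions."""

    def __init__(self, num_partitions: int = 3) -> None:
        self.num_partitions = num_partitions
        self.partitions: Dict[int, List[Tuple[str, str]]] = defaultdict(list)

    def partition_for(self, key: str) -> int:
        # A real Kafka producer also picks the partition by hashing the key;
        # CRC32 is used here only to keep the sketch dependency-free.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions

    def send(self, key: str, value: str) -> int:
        """Append a record and return the partition it was written to."""
        partition = self.partition_for(key)
        self.partitions[partition].append((key, value))
        return partition

    def records_for(self, key: str) -> List[str]:
        """Values for one key, in the order they were sent."""
        partition = self.partition_for(key)
        return [v for k, v in self.partitions[partition] if k == key]
```

Because the partition depends only on the key, the start and end events of one job are read back in the order they were written, even when other jobs log at the same time.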
SPARK AS KAFKA COLLECTOR
REAL TIME LOG PARSING
ETL TOOL ADAPTABLE
DATA DUMPS IN IMPALA AND HBASE
GENERATES NOTIFICATIONS
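The collector's parsing and notification steps can be sketched in plain Python. This is a stand-in for the Spark job, not Spark code: the log line format, the parser registry, and the rule that ERROR lines trigger a notification are all assumptions made for illustration, and writing to Impala and HBase is left out.

```python
import re
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical line format: "<timestamp> <LEVEL> <job_id> <message>"
PDI_LINE = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<job_id>\S+) (?P<message>.*)$")


def parse_pdi_line(line: str) -> Optional[dict]:
    """Parse one line; return None when the line does not match the format."""
    match = PDI_LINE.match(line.strip())
    return match.groupdict() if match else None


# One parser per ETL tool keeps the collector adaptable: supporting another
# tool means registering another function, not changing the pipeline.
PARSERS: Dict[str, Callable[[str], Optional[dict]]] = {"pdi": parse_pdi_line}


def process_batch(tool: str, lines: Iterable[str]) -> Tuple[List[dict], List[str]]:
    """Parse a micro-batch and return (parsed records, notifications)."""
    parse = PARSERS[tool]
    records: List[dict] = []
    notifications: List[str] = []
    for line in lines:
        record = parse(line)
        if record is None:
            continue  # unparseable lines are skipped in this sketch
        records.append(record)
        if record["level"] == "ERROR":
            notifications.append(f"{record['job_id']}: {record['message']}")
    return records, notifications
```

In a streaming job the same function would be applied to each micro-batch read from the queue, with the parsed records passed on to storage.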
LAMBDA ARCHITECTURE
LAMBDA ARCHITECTURE
[Architecture diagram, repeated from the ARCHITECTURE slide]
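In a Lambda architecture, a query is answered by combining a batch view (complete but computed periodically) with a real-time view covering data that arrived since the last batch run. The sketch below shows that merge for simple per-job counters. The function name and the counter example are hypothetical, and mapping the two views onto the ANALYTICS DB and ONLINE DB in the diagram is an assumption, not something the slides state.

```python
from typing import Dict


def merged_view(batch_view: Dict[str, int], realtime_view: Dict[str, int]) -> Dict[str, int]:
    """Serving-layer query: batch totals plus increments not yet in the batch view."""
    result = dict(batch_view)  # copy so the batch view is left untouched
    for key, increment in realtime_view.items():
        result[key] = result.get(key, 0) + increment
    return result
```

For example, with a batch view of {"job-1": 10, "job-2": 4} and a real-time view of {"job-2": 1, "job-3": 2}, the merged result is {"job-1": 10, "job-2": 5, "job-3": 2}.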
DISCLAIMER
What you are about to see is a work in progress, so be gentle in case…
• the demo doesn’t work
• features don’t work as described
• the connection goes down
DEMO
REAL-TIME AND STREAMING
CLOUDERA INTEGRATION
HOW TO MANAGE ALL THESE COMPONENTS
LOTS OF MOVING PARTS
OPERATIONS
LOADS OF CONFIG FILES
THE ANSWER
EXTENSIBLE ARCHITECTURE
SEAMLESS INTEGRATION
MONITORING
CONFIGURATION MANAGEMENT
DEPENDENCIES MANAGEMENT
LOG CHECK
SETUP AND ADMIN
DEMO
CLOUDERA INTEGRATION
SUMMARY
NOT EVERYTHING WE DO IS THIS COMPLEX
HADOOP STACK CHOICE MATTERS
RE-USABLE DESIGN PATTERNS
QUESTIONS?

Cloudera Customer Success Story