© Cloudera, Inc. All rights reserved.
Cloudera Altus: Big Data in the Cloud Made Easy
David Tishgart | Product Marketing
Jennifer Wu | Product Management
We believe data can make what is impossible today, possible tomorrow
We empower people to transform complex data into clear and actionable insights
• DRIVE CUSTOMER INSIGHTS
• CONNECT PRODUCTS & SERVICES (IoT)
• PROTECT BUSINESS
We deliver the modern platform for machine learning and advanced analytics
• RUNS ANYWHERE: Cloud, Multi-cloud, On-prem
• SCALABLE: Elastic, Cost-effective, Lower TCO
• ENTERPRISE GRADE: Secure, Performant, Compliant
The data-driven enterprise
IoT explosion of new data
• 30B connected devices
• 440x more data
Enterprises re-architect to modernize IT infrastructure
• open source
• cloud
• machine learning
• <1% of an organization’s unstructured data is analyzed or used at all
• <50% of an organization’s structured data is actively used in making decisions
• 80% of analysts’ time is spent simply discovering and preparing data
• Move processing to the cloud without risk
• Focus on your workload, not cluster operations
• Simplify and unify your analytics
A platform for enabling data-driven decisions
• Data Engineering: Modern data processing (ETL) and data governance at scale
• Analytic Database: Explore, analyze, and understand all your data
• Operational Database: Data-driven applications to deliver real-time insights
• Data Science: Exploratory data science and machine learning for the enterprise
Multi-Storage, Multi-Environment
Data Engineering use cases
Ingest, Process, and Deliver Insights to Drive Decisions
Ingest from data sources: leverage any source and format
• Unstructured data
• Structured data
• Social data
• Machine data
• IoT data
Process and transform data: large-scale data processing
• Stream or batch
• Choice of engine: MapReduce, Spark, Hive, Hive-on-Spark
• Job SLAs
• Data storage
Deliver insights: analyze, build, and train models
• BI analytic engines
• Data science and BI tools/libraries
• Real-time analysis
• Report generation
Make business decisions: using data to grow the business
• Report consumption
• Apply judgement
• Inform business decisions
Example: Data Engineering in ML Pipeline
Raw Data
● formats
● sources
● volume
Data Engineering: clean, merge, filter
Processed data
● training
● validation
● test
Data Science: model, train, tune
Validated model
● model
● parameters
Live data ingest
Data Engineering: processing, execution
End results
● insights
● predictions
● results
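The processed data in this pipeline is handed to Data Science as training, validation, and test subsets. A minimal sketch of such a split, assuming a shuffled 70/15/15 partition. The function name `split_dataset`, the fractions, and the seed are illustrative assumptions; the slides do not specify how the split is made.

```python
import random


def split_dataset(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle records and partition them into training, validation, and test lists.

    The fractions and seed are illustrative defaults, not values from the slides.
    """
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # local RNG keeps the split reproducible
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains after the first two cuts
    return train, validation, test


if __name__ == "__main__":
    train, validation, test = split_dataset(range(100))
    print(len(train), len(validation), len(test))
```

Fixing the seed means the same records land in the same subset on every run, which keeps model comparisons across transient clusters reproducible.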
Goals of an Organization: Data Engineers and IT
Goals of Data Engineers
Agility
● Time to market for new use cases
● Reliably meet SLAs for processed data
Ease-of-use
● Production: run operationalized workflows; monitor and troubleshoot
● Development: interactive workflows and tools
Goals of IT Administrators
● Cost
● Security
● Standardization
● Self-service LOBs
Bridging the Gap Between Users and Cloud
IT and DE goals
- Agility
- Ease of use
- Cost effectiveness
- Standardization
Public Cloud Properties
- on-demand infrastructure resourcing
- hyperscale storage
- data durable + highly available
Cloudera Altus: providing a bridge since 2017
Cloudera Altus for data engineering workloads
PaaS for ETL, machine learning, and data processing on AWS
● Managed transient clusters in customer VPC
● Support for MR2, Hive, Spark, Hive-on-Spark
● Workload analytics and troubleshooting
Easy
https://console.altus.cloudera.com
Everything you don’t have to do
• Install any software to start working
• Install any hardware
• Worry about cluster configuration
• Upgrade/reconfigure clusters
• OS upgrades/patching
• Resource Management
Focus on Workloads
• Jobs/Workloads as first class entities
• Clusters as supporting entities
• Abstracts away most infrastructure and cluster details
Workload troubleshooting and analytics
● Troubleshoot jobs after cluster termination through job log and configuration browsing
● Insight into causes of job failure
● Identification and root cause analysis of slow jobs
Agile
One Platform Everywhere
• The same CDH is present regardless of deployment model
• Simplified application migration
• Minimizes cloud migration risk
• Core components open-source
• Reduces risk for lock-in
• Reduces unknowns for third-party partners
Cloud-native architecture
• Embrace Transience for Lower Costs
• Compartmentalize for Optimal Performance
• Grow Storage and Compute Discretely for Efficiency
Hyperscale Cloud Store
STORE
COMPUTE
Unified
No Data Silos
• Specialized clusters provide optimized computation
• Data and Metadata need to be consistently accessible across all clusters
• Whether CDH clusters are managed by Altus or not
Object Store
Multi-cluster data and metadata consistency
• S3 eventual consistency model can lead to errors
• S3Guard provides a consistent view of S3 data across clusters
• Table schemas are often relevant across clusters
• For simple scenarios, inline DDL in jobs is sufficient
• For complex scenarios, share a persistent backing database for the metadata store
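For the simple scenario, "inline DDL" can mean that each job re-declares the tables it needs before it runs, so a fresh transient cluster does not depend on a metastore that outlives it. A minimal sketch that builds such a HiveQL statement follows. The helper name `build_inline_ddl`, the table, the columns, and the `s3a://example-bucket/...` path are hypothetical; the delimited-text storage format is also an assumption.

```python
def build_inline_ddl(table, columns, location):
    """Return a HiveQL statement declaring an external table over existing files.

    columns is a list of (name, hive_type) pairs; location is the object-store path.
    """
    if not columns:
        raise ValueError("at least one column is required")
    column_sql = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table}` (\n"
        f"  {column_sql}\n"
        f")\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Hypothetical table, columns, and bucket, used only for illustration.
    ddl = build_inline_ddl(
        "hospital_scores",
        [("state", "STRING"), ("average_score", "DOUBLE"), ("gdp", "DOUBLE")],
        "s3a://example-bucket/processed/hospital_scores/",
    )
    print(ddl)
```

Because the table is declared EXTERNAL with IF NOT EXISTS, running the statement at the start of every job is harmless, and dropping the table later leaves the files in the object store untouched. Once many clusters need the same evolving schemas, repeating DDL in every job becomes hard to keep in sync, which is where the shared backing database comes in.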
Altus feature overview
Low cost
• Per-node/per-hour pricing
• Terminate clusters when not in use
• Spot with self-healing
End-user focused
• Manages your cluster so you don’t have to
• Job submission CLI/API
• Workload troubleshooting
Easy to use
• Self-service for end-users
• Cloud console + familiar tools
• Cluster provisioning in minutes
Cloud-native
• Decouple storage and compute
• R/W to/from Amazon S3
• Spin EC2 clusters up and down
Integrated Platform
• Same Cloudera platform on-premises and in the cloud
• Feed cleaned data into Impala clusters for BI analytics
• Share metadata across clusters
Secure
• Integrated with AWS security
• No Cloudera access to customer data
• Support for multi-AWS accounts
• Admin and end-user accounts
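The "Low cost" points rest on simple arithmetic: with per-node/per-hour pricing, a cluster that is terminated when idle is billed only for the hours it runs. A rough sketch of that comparison follows. The rate, node count, and hours are made-up illustrative numbers, not Altus or AWS prices, and `cluster_cost` is a hypothetical helper.

```python
def cluster_cost(nodes, hours, rate_per_node_hour):
    """Cost under per-node/per-hour pricing: nodes x hours x hourly rate."""
    if nodes < 0 or hours < 0 or rate_per_node_hour < 0:
        raise ValueError("inputs must be non-negative")
    return nodes * hours * rate_per_node_hour


if __name__ == "__main__":
    RATE = 0.5  # hypothetical price per node-hour, not an actual Altus or AWS rate
    # A 10-node cluster left running for a 30-day month...
    always_on = cluster_cost(nodes=10, hours=24 * 30, rate_per_node_hour=RATE)
    # ...versus the same cluster started for a 2-hour job each day, then terminated.
    transient = cluster_cost(nodes=10, hours=2 * 30, rate_per_node_hour=RATE)
    print(f"always-on: {always_on:.2f}, transient: {transient:.2f}")
```

The sketch ignores storage charges and spot discounts; it only shows why billing by node-hour rewards transient clusters.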
Altus Demo
Demo “Analysis: Hospital Death Rates vs. State GDP”
Hospital data: Death Rates, Complications
Per-capita State GDP
Data Engineering: cleaning, merging, filtering
Well-formatted data: [State, Condition, Hospital, Type, Average Score, GDP]
Data Science (Data Science Workbench): Linear Regression, Statistical Analysis
Insights
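The demo flow (clean and merge hospital scores with per-capita state GDP, filter, then fit a linear regression) can be sketched in a few lines. This is not the demo's code: the function names are hypothetical, the rows are made-up placeholder values rather than real hospital or GDP figures, and plain least squares stands in for whatever the Data Science Workbench session actually ran.

```python
def merge_and_filter(hospital_rows, gdp_by_state):
    """Join hospital scores to per-capita GDP by state, dropping incomplete rows."""
    merged = []
    for row in hospital_rows:
        score = row.get("average_score")
        gdp = gdp_by_state.get(row.get("state"))
        if score is None or gdp is None:
            continue  # cleaning step: skip rows that cannot be used in the fit
        merged.append((gdp, score))
    return merged


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept over (x, y) pairs."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Made-up placeholder values; these are not the demo's data or results.
    hospital_rows = [
        {"state": "AA", "average_score": 10.0},
        {"state": "BB", "average_score": 12.0},
        {"state": "CC", "average_score": 14.0},
        {"state": "DD", "average_score": None},  # dropped by the filter
    ]
    gdp_by_state = {"AA": 40.0, "BB": 50.0, "CC": 60.0}
    slope, intercept = fit_line(merge_and_filter(hospital_rows, gdp_by_state))
    print(slope, intercept)
```

The split mirrors the slide: `merge_and_filter` is the data engineering half that produces well-formatted rows, and `fit_line` is the data science half that consumes them.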
Your choice for the cloud
CLOUDERA
OPEN ARCHITECTURE
● Portable for hybrid and multi-cloud
● Rich ISV partner ecosystem
● Open source, supported by community
SIMPLIFIED & OPTIMIZED
● Low TCO on cloud-native infrastructure
● Deployed as-a-service with minimal cluster management
UNIFIED PLATFORM
● Minimize data silos
● Simplify operations
● Improve ease of development and cloud migration
BUILT FOR THE ENTERPRISE
● Security
● Availability
● Performance
● Governance
OTHER CLOUD DATA PLATFORMS
● Vendor lock-in
● Confusing and disjointed services, data, and pricing models
● Disjointed services create data silos
● No common metadata, security, lineage across apps
Visit www.cloudera.com/altus for access, and ask about our free trial
Questions?
Thank you
David Tishgart
Jennifer Wu
