1© Cloudera, Inc. All rights reserved.
Security Implementation on Hadoop
Dr. Wei-Chiu Chuang | Software Engineer

$ whoami
Software Engineer, Cloudera
Apache Hadoop Committer/PMC

Unguarded data stores are the victims

Regulatory Compliance
Organizations can be fined up to 4% of annual global turnover or €20 million, whichever is greater, for breaching GDPR.
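As a worked example of the fine ceiling: under GDPR Article 83(5), the upper bound for the most serious infringements is the greater of €20 million or 4% of total worldwide annual turnover. The function name and the turnover figures below are illustrative assumptions, not numbers from the talk.

```python
def gdpr_max_fine_eur(annual_turnover_eur: float) -> float:
    """Upper bound of a GDPR fine: the greater of EUR 20 million or 4% of turnover."""
    flat_cap = 20_000_000
    turnover_cap = annual_turnover_eur * 4 / 100
    return max(flat_cap, turnover_cap)


# Hypothetical companies, for illustration only.
small = gdpr_max_fine_eur(100_000_000)    # 4% is 4 million, so the flat cap applies
large = gdpr_max_fine_eur(2_000_000_000)  # 4% is 80 million, which exceeds the flat cap
```

For large organizations the percentage term dominates, which is why the exposure grows with company size.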

Security Implementation

Disclaimer
This talk serves as a general guideline for security implementation on Hadoop. The actual implementation procedures and scope of implementation vary on a case-by-case basis, and should be assessed by Cloudera’s Professional Services team or certified Cloudera SI Partners.

Non-secure #0
Data Free for All

[Diagram components: Firewall, Datacenter, ActiveDirectory/KDC, Hadoop cluster, Cloudera Manager, Cloudera Navigator, Gateway node, Applications]

High Availability made Easy

Identity Management
Simple Authentication
File group ownership
• AD integration
• SSSD or Centrify
Consideration in large enterprises.

System Diagram #0
[Diagram components: Firewall, ActiveDirectory, Cloudera Manager, two Master nodes, three Worker nodes, SSSD/Centrify]

Simple authentication = no authentication

Minimal Security #1
Reduce Risk Exposure
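
Kerberos
Strong Authentication
KDC
• MIT
• ActiveDirectory (more common)
Principal: primary@REALM, for example user@EXAMPLE.COM (primary user, realm EXAMPLE.COM)
Hadoop maps user@EXAMPLE.COM to the short name user
[Diagram: a user and Hadoop authenticating through the KDC of realm EXAMPLE.COM]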
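The mapping from a full Kerberos principal to a short user name can be sketched as follows. This is a simplified illustration that assumes the common default behaviour of keeping only the first component of a principal in the cluster's own realm; real clusters configure the mapping through `hadoop.security.auth_to_local` rules, and the names `Principal`, `parse_principal`, and `short_name` are hypothetical.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str
    instance: Optional[str]
    realm: str


def parse_principal(name: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its parts."""
    local, sep, realm = name.rpartition("@")
    if not sep or not local or not realm:
        raise ValueError(f"not a full principal: {name!r}")
    primary, _, instance = local.partition("/")
    return Principal(primary, instance or None, realm)


def short_name(name: str, default_realm: str) -> str:
    """Return the primary as the short name, but only for the default realm."""
    principal = parse_principal(name)
    if principal.realm != default_realm:
        raise ValueError(f"no mapping rule for realm {principal.realm!r}")
    return principal.primary


user = short_name("user@EXAMPLE.COM", default_realm="EXAMPLE.COM")
service = parse_principal("hdfs/nn1.example.com@EXAMPLE.COM")
```

Service principals usually carry a host name as the instance, which is why the sketch keeps the instance separate from the primary.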

Kerberos
Considerations in large corporations
Time synchronization
CM Kerberos Wizard
• Configure AD to create a Kerberos principal for the CM server, and to delegate to CM the ability to create/manage Kerberos principals
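Time synchronization matters because Kerberos rejects requests whose timestamps differ too much from the KDC's clock. A minimal sketch of that check, assuming the MIT Kerberos default tolerance of 300 seconds (the `clockskew` setting in `krb5.conf`); the function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# MIT Kerberos default tolerance; sites may configure a different value.
MAX_CLOCK_SKEW = timedelta(seconds=300)


def within_skew(client_time: datetime, kdc_time: datetime,
                max_skew: timedelta = MAX_CLOCK_SKEW) -> bool:
    """True if the two clocks are close enough for authentication to proceed."""
    return abs(client_time - kdc_time) <= max_skew


kdc_now = datetime(2017, 10, 3, 12, 0, tzinfo=timezone.utc)
ok = within_skew(kdc_now + timedelta(minutes=4), kdc_now)        # inside tolerance
too_far = within_skew(kdc_now - timedelta(minutes=6), kdc_now)   # outside tolerance
```

In practice every cluster host and the KDC should sync against the same time source (for example via NTP) so this check never becomes the failure point.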

LDAP Authentication
• LDAP over SSL

Authorization/Access Control
Access Control Lists (ACLs)
• HDFS file ACLs
• YARN job submission
• HBase ACLs
• Oozie ACLs
Sentry Managed (RBAC)
• Hive
• Impala
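The role-based model behind Sentry can be sketched as a chain of lookups: users belong to groups, groups are granted roles, and roles carry privileges on objects. This is a toy in-memory model for illustration only; it does not use Sentry's API, and all user, group, role, and table names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Privilege:
    action: str  # for example "SELECT" or "INSERT"
    obj: str     # for example "sales.orders"


# Hypothetical assignments, for illustration only.
user_groups = {"alice": {"analysts"}, "bob": {"etl"}}
group_roles = {"analysts": {"read_sales"}, "etl": {"load_sales"}}
role_privileges = {
    "read_sales": {Privilege("SELECT", "sales.orders")},
    "load_sales": {Privilege("SELECT", "sales.orders"),
                   Privilege("INSERT", "sales.orders")},
}


def is_allowed(user: str, action: str, obj: str) -> bool:
    """Check user -> groups -> roles -> privileges for a matching grant."""
    wanted = Privilege(action, obj)
    for group in user_groups.get(user, ()):
        for role in group_roles.get(group, ()):
            if wanted in role_privileges.get(role, ()):
                return True
    return False
```

Because privileges attach to roles rather than to individual users, changing what a team may do means editing one role instead of many per-user ACL entries.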

Auditing

Backup/Disaster Recovery
Cloudera Backup/Disaster Recovery (BDR)
• A high-performance data replicator
• Copies data incrementally from the source cluster on a specified schedule
Supports
• Kerberos
• Data encryption
• HDFS replication to cloud

Kerberized BDR Best Practice
[Diagram: Cloudera BDR replicating between the Production cluster (realm PROD.EXAMPLE.COM, with its own KDC) and the DR cluster (realm DR.EXAMPLE.COM, with its own KDC); the two KDCs are linked by a cross-realm trust]
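A cross-realm trust is established by creating matching `krbtgt` principals in both KDCs: the principal `krbtgt/B@A` lets clients from realm A obtain tickets for services in realm B. The sketch below only builds the principal names for the two realms on the slide; the function name is hypothetical, and whether a deployment needs one-way or two-way trust is an assumption to verify against the replication setup.

```python
from typing import List


def cross_realm_principals(realm_a: str, realm_b: str, two_way: bool = True) -> List[str]:
    """Names of the krbtgt principals that must exist (with identical keys) in both KDCs."""
    names = [f"krbtgt/{realm_b}@{realm_a}"]  # realm_a clients can reach realm_b services
    if two_way:
        names.append(f"krbtgt/{realm_a}@{realm_b}")  # and the reverse direction
    return names


trust = cross_realm_principals("PROD.EXAMPLE.COM", "DR.EXAMPLE.COM")
```

Each listed principal has to be created in both KDCs with the same password and encryption types, otherwise ticket requests across the realms fail.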

System Diagram #1
[Diagram components: Firewall, ActiveDirectory/KDC, Kerberos, Cloudera Manager, two Master nodes, three Worker nodes, SSSD/Centrify, DR]

More Security #2
Managed, Secure, Protected

Data In-Transit Encryption
RPC encryption
Data transport encryption
• Supports AES-CTR with key lengths up to 256 bits
HTTP TLS/SSL encryption
• No self-signed certificates in production
[Diagram: an Application, two Master nodes, and three Worker nodes, with links labeled RPC encryption, Transport encryption, and TLS/SSL]
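The RPC and data transport settings above correspond to a handful of Hadoop configuration properties. The sketch below renders them in Hadoop's XML configuration format; the property names are the standard Hadoop ones, while the helper `to_hadoop_xml` is hypothetical. On a Cloudera-managed cluster these values would normally be set through Cloudera Manager rather than by hand, and HTTP TLS/SSL additionally needs keystores and certificates that are not shown here.

```python
import xml.etree.ElementTree as ET

# core-site.xml: "privacy" enables authentication, integrity, and encryption for RPC.
CORE_SITE = {"hadoop.rpc.protection": "privacy"}

# hdfs-site.xml: encrypt the DataNode block transfer protocol with AES-CTR, 256-bit keys.
HDFS_SITE = {
    "dfs.encrypt.data.transfer": "true",
    "dfs.encrypt.data.transfer.cipher.suites": "AES/CTR/NoPadding",
    "dfs.encrypt.data.transfer.cipher.key.bitlength": "256",
}


def to_hadoop_xml(props: dict) -> str:
    """Render a dict as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


core_xml = to_hadoop_xml(CORE_SITE)
hdfs_xml = to_hadoop_xml(HDFS_SITE)
```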

Data At-Rest Encryption
Transparent encryption
Supports any Hadoop application
Encryption Zone
$ hadoop key create mykey
$ hadoop fs -mkdir /zone
$ hdfs crypto -createZone -keyName mykey -path /zone
[Diagram: directory tree rooted at / with /tmp and /zone, and files foo and bar; /zone is marked as the encryption zone]

Key Management Server Deployment (non-prod)
[Diagram components: HDFS NameNode, Client, Java Keystore KMS, Keystore file]
Separation of duties
• The Encryption Zone Key (EZK) is stored in the KMS server
• The HDFS superuser cannot decrypt files
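The separation of duties rests on envelope encryption: each file gets its own data encryption key (DEK), HDFS stores only the DEK encrypted under the zone key (the EDEK), and only the KMS can turn an EDEK back into a DEK, subject to its own access control. The sketch below is a toy model of that flow under loud assumptions: XOR stands in for AES and is not secure, `ToyKMS` is a hypothetical class rather than the Hadoop KMS API, and the user names are made up.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy cipher (XOR with a repeating key). NOT secure; it only stands in for AES."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class ToyKMS:
    """Holds encryption zone keys (EZKs) and never hands them out."""

    def __init__(self, decrypt_acl):
        self._ezks = {}
        self._decrypt_acl = set(decrypt_acl)

    def create_key(self, key_name: str) -> None:
        self._ezks[key_name] = secrets.token_bytes(16)

    def generate_edek(self, key_name: str) -> bytes:
        """Create a fresh DEK and return it only in encrypted form (EDEK)."""
        dek = secrets.token_bytes(16)
        return xor_bytes(dek, self._ezks[key_name])

    def decrypt_edek(self, key_name: str, edek: bytes, user: str) -> bytes:
        """Return the DEK, but only to users the KMS itself authorizes."""
        if user not in self._decrypt_acl:
            raise PermissionError(f"{user} may not decrypt keys")
        return xor_bytes(edek, self._ezks[key_name])


kms = ToyKMS(decrypt_acl={"alice"})
kms.create_key("mykey")
edek = kms.generate_edek("mykey")                    # kept with the file's metadata
dek = kms.decrypt_edek("mykey", edek, user="alice")  # the client asks the KMS for the DEK
ciphertext = xor_bytes(b"secret record", dek)        # the client encrypts before writing
```

In this model the storage side holds only `edek` and `ciphertext`, so an administrator of the file system who is not on the KMS access list cannot recover the plaintext.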

Key Management Server/Key Trustee Server Deployment
[Diagram components: HDFS NameNode, Client, two (or more) Key Trustee KMS instances, Firewall, Key Trustee Server (Active) and Key Trustee Server (Passive) linked by synchronization]

KMS+KTS+HSM Deployment
[Diagram components: HDFS NameNode, Client, two (or more) HSM KMS instances, Firewall, Key Trustee Server (Active) and Key Trustee Server (Passive) linked by synchronization, two Key HSM instances, two HSMs]

Encryption Performance

Troubleshooting: Encryption Performance Anomaly
• Configuration
• AES-NI hardware acceleration
• OpenSSL library
• Entropy
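Two of the checks above can be scripted on a Linux host: AES-NI shows up as the `aes` flag in `/proc/cpuinfo`, and the kernel reports its entropy estimate in `/proc/sys/kernel/random/entropy_avail`. The sketch below covers only the flag parsing, on sample text so that it runs anywhere; the function name and the sample strings are hypothetical.

```python
def has_aes_ni(cpuinfo_text: str) -> bool:
    """True if any 'flags' line of /proc/cpuinfo-style text lists the 'aes' flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "aes" in value.split():
            return True
    return False


# Sample text in the format of /proc/cpuinfo, for illustration only.
WITH_AES = "model name\t: Example CPU\nflags\t\t: fpu vme sse2 aes avx\n"
WITHOUT_AES = "model name\t: Example CPU\nflags\t\t: fpu vme sse2\n"
```

On a real host, pass the contents of `/proc/cpuinfo` to `has_aes_ni`. The `hadoop checknative` command additionally reports whether Hadoop managed to load the native OpenSSL library, which is needed for hardware-accelerated AES.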

Fine-Grained Access Control with Apache Sentry

System Diagram #2
[Diagram components: Firewall, ActiveDirectory/KDC, Kerberos, Cloudera Manager, two Master nodes, three Worker nodes, SSSD/Centrify, two KMS instances, a second Firewall, two KeyTrustee instances]

Most Security #3
Secure Data Vault

Data Redaction
Personally Identifiable Information (PII)
• PCI-DSS, HIPAA
Best practice
Passwords
• Store in credential files, not in configuration
Logs, queries
• Cloudera Manager
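Redaction of logs and queries boils down to pattern-based masking before the text is stored or displayed. A minimal sketch of the idea, assuming a simple 16-digit card-number pattern; it does not use Cloudera Manager's redaction rule format, and the pattern, replacement string, and function name are hypothetical.

```python
import re

# Matches 16 digits, optionally grouped in fours by spaces or dashes (illustrative only).
CARD_NUMBER = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def redact(line: str) -> str:
    """Mask anything that looks like a card number in a log line or query string."""
    return CARD_NUMBER.sub("XXXX-XXXX-XXXX-XXXX", line)


masked = redact("SELECT * FROM t WHERE card = '4111-1111-1111-1111'")
```

Real rules need to be tuned against the data formats actually present, since an overly broad pattern hides useful diagnostics and an overly narrow one leaks sensitive values.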

Full Encryption
Encrypt Data Spills
• MapReduce
• Impala
• Hive
• Flume
OS-level encryption
• Navigator Encrypt

Security Vulnerabilities

Vulnerability Response and Process
[Diagram: vulnerability reports (upstream, internal, external) lead to a fix, which is then published as a CVE and a Cloudera TSB]

Cloudera Certified Technology

Cloudera Certified Technology Partners
[Diagram categories: Data Sources (Connected Machines/Data sources, Other Data Sources), Data Ingest, Process, Refine & Prep, Data Discovery, Advanced Analytics]

Certification ensures that a product integrates with a secure cluster
Authentication
• Authenticates via Kerberos or LDAP
Authorization
• Handles Apache Sentry with Hive, Impala, Search, HDFS
Encryption
• Supports HDFS transport encryption, at-rest encryption, and SSL/TLS connection encryption

Cloudera SDX

Cloudera Enterprise
The modern platform for machine learning and analytics optimized for the cloud
[Diagram labels: Extensible Services, Core Services, Data Engineering, Data Science, Operational Database, Analytic Database, Data Catalog, Ingest & Replication, Security, Governance, Workload Management, Storage Services (S3, ADLS, HDFS, Kudu)]

Open platform services
Built for multi-function analytics | Optimized for cloud
• Unified security – protects sensitive data with consistent controls, even for transient and recurring workloads
• Consistent governance – enables secure self-service access to all relevant data and increases compliance
• Easy workload management – increases user productivity and boosts job predictability
• Flexible ingest and replication – aggregates a single copy of all data, provides disaster recovery, and eases migration
• Shared catalog – defines and preserves structure and business context of data for new applications and partner solutions

Successful use cases

Cloudera Overview & Financial Services Focus
• 2000+ strong partner ecosystem
• 1600+ employees globally
Strong Focus & Momentum in Financial Services
• 19 of the 30 G-SIBs run on Cloudera
• 3 of the Fortune 500 top 5 insurers run on Cloudera
• 5 of the top 6 asset management firms run on Cloudera
• 200+ financial services customers

Building a Fantastic Customer Experience
• Improved customer experience
• 80 percent reduction in operating costs through a wide range of customer service and operational improvements
• Decrease in cost to service customers while increasing revenue through better service
CUSTOMER 360
FINANCIAL SERVICES
» PREDICTIVE ANALYTICS
» 360 CUSTOMER VIEW
» OPERATIONAL ANALYTICS

Large healthcare provider enables practitioners to recommend at-home actions to prevent hospital visits
• Flexible, automatic data classification for diverse medical ontologies
• Self-service data discovery for real-time, data-driven decisions

Thank you
Wei-Chiu Chuang | weichiu@cloudera.com

More information on Hadoop Security

Books authored by Clouderans

Hadoop security implementationon 20171003
