© Hortonworks Inc. 2016
Protecting Enterprise Data
in Apache Hadoop
Oct 2016
Owen O’Malley
owen@hortonworks.com
@owen_omalley
Security
Threat: Accidental File Deletion
Threat: Accidentally Killing Tasks
Threat: Pretending to be a User
Threat: User accesses private data
Threat: Pretending to be a Service
Threat: Remote Access
Security Architecture
Threat: Eavesdropping Inside Data Center
Threat: Eavesdropping Outside Data Center
Threat: Physical access
Threat: Bad Hadoop Admin in Cluster
HDFS Encryption
KeyProvider API
Encryption Scheme
Original Hive Architecture
Threat: User Accesses DB directly
Hive Architecture with Metastore
Threat: User Deletes Hive tables
Hive Architecture with Storage-Based Auth
Threat: User reads private columns
Hive Architecture with Hive Server 2
Threat: User reads private columns
Threat: User isn’t Allowed to see Details
Caution: Shadow Security
Resources
Thank You!