© Hortonworks Inc. 2015
Protecting Enterprise Data
in Apache Hadoop
April 2015
Owen O’Malley
owen@hortonworks.com
@owen_omalley
Security
Security Architecture
Attack Vectors
Attack Vectors
Threat: Accidental Damage
Threat: Remote Access
Threat: Eavesdropping
Threat: User accesses private data
Threat: Physical access
Threat: Hadoop Admin in Cluster
HDFS Encryption
KeyProvider API
Encryption Scheme
Threat: User reads private columns
ORC File Layout
[Figure: ORC file layout. The file is a sequence of 256 MB stripes, each made up of Index Data, Row Data, and a Stripe Footer, followed by the File Footer and the Postscript. Inside a stripe, the index data and row data are laid out by column (Column 1 through Column 8), and one column's data is shown split into streams (Stream 2.1 through Stream 2.4).]
Threat: User reads hidden values
Threat: Shadow Security
Resources
Thank You!
