Big Data Security
Top 5 Security Risks and Best Practices
Jitendra Chauhan
Head R&D, iViZ Security
jitendra.chauhan@gmail.com
Agenda
• Key Insights of Big Data Architecture
• Top 5 Big Data Security Risks
• Top 5 Best Practices
Key Insights of Big Data
Architecture
Distributed Architecture
(Hadoop as example)
Data Partition, Replication
and Distribution
Auto-tiering
Move the Code
Real Time, Streaming and Continuous
Computation
Integration Patterns
Real time
Variety of Input Sources
Adhoc Queries
Parallel & Powerful Programming
Framework
Example:
• 10 TB Data
• 128 MB Chunks
• 82000 Maps
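The map count follows from dividing the input size by the chunk size. The sketch below assumes one map task per chunk and binary units; the function name is hypothetical. Under those assumptions, roughly 82,000 maps at 128 MB per chunk corresponds to about 10 TB of input, while 16 TB would produce 131,072 maps.

```python
def estimate_map_tasks(input_bytes: int, chunk_bytes: int) -> int:
    """Estimate map tasks, assuming one map task per input chunk."""
    # Integer ceiling division, so a partial last chunk still gets a task.
    return -(-input_bytes // chunk_bytes)


MB = 1024 ** 2
TB = 1024 ** 4

print(estimate_map_tasks(10 * TB, 128 * MB))  # 81920, roughly 82,000
print(estimate_map_tasks(16 * TB, 128 * MB))  # 131072
```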
Java vs SQL / PLSQL
Frameworks:
• MapReduce
• Storm Topology
(Spouts & Bolts)
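To make the MapReduce model concrete, here is a minimal single-process sketch of its three phases (map, shuffle, reduce) as a word count. It is plain Python for illustration only, not the Hadoop API; all function names are hypothetical. In a real cluster each chunk would be mapped on the node that stores it.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(chunk: str) -> Iterator[Tuple[str, int]]:
    """Emit (word, 1) for every word in one input chunk."""
    for word in chunk.split():
        yield word.lower(), 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values seen for one key."""
    return key, sum(values)


def run_job(chunks: Iterable[str]) -> Dict[str, int]:
    """Run map over every chunk, shuffle, then reduce each key."""
    pairs = (pair for chunk in chunks for pair in map_phase(chunk))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


print(run_job(["big data", "Big security"]))
```

The map function is user-supplied code that runs next to the data, which is why untrusted computation becomes a security concern.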
Big Data Architecture
No Single Silver Bullet
• Hadoop is already unsuitable for many Big
Data problems
• Real-time analytics
o Cloudscale, Storm
• Graph computation
o Giraph and Pregel (examples of graph
computation include Shortest Paths, Degrees of
Separation, etc.)
• Low latency queries
o Dremel
Top 5 Security Risks
Insecure Computation
Sensitive Info
• Information Leak
• Data Corruption
• DoS
Health Data
Untrusted Computation Program
Input Validation and Filtering
• Input Validation
o What kind of data is untrusted?
o What are the untrusted data sources?
• Data Filtering
o Filter Rogue or malicious data
• Challenges
o GBs or TBs of continuous data
o Signature-based data filtering has limitations
▪ How to filter on the behavioral aspects of data?
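The three ideas above (validation, signature-based filtering, and a behavioral check) can be sketched together. This is a minimal illustration under stated assumptions: the record layout, the trusted-source list, the two signatures, and the per-batch threshold are all hypothetical, and a production filter would have to work on a stream rather than an in-memory batch.

```python
import re
from collections import Counter
from typing import Any, Dict, Iterable, List

TRUSTED_SOURCES = {"sensor-a", "sensor-b"}          # hypothetical allow-list
SIGNATURES = [                                      # hypothetical known-bad patterns
    re.compile(r"<script", re.IGNORECASE),
    re.compile(r";\s*drop\s+table", re.IGNORECASE),
]
MAX_RECORDS_PER_SOURCE = 3                          # hypothetical behavioral limit per batch

Record = Dict[str, Any]


def is_valid(record: Record) -> bool:
    """Input validation: required fields exist and have the expected types."""
    value = record.get("value")
    return (
        isinstance(record.get("source"), str)
        and isinstance(value, (int, float))
        and not isinstance(value, bool)
        and isinstance(record.get("note", ""), str)
    )


def matches_signature(record: Record) -> bool:
    """Signature-based filtering: look for known malicious patterns."""
    return any(sig.search(record.get("note", "")) for sig in SIGNATURES)


def filter_batch(records: Iterable[Record]) -> List[Record]:
    """Keep records that are valid, trusted, clean, and not part of a flood."""
    valid = [r for r in records if is_valid(r)]
    per_source = Counter(r["source"] for r in valid)
    return [
        r for r in valid
        if r["source"] in TRUSTED_SOURCES
        and not matches_signature(r)
        and per_source[r["source"]] <= MAX_RECORDS_PER_SOURCE
    ]
```

The per-source count is only a crude stand-in for behavioral filtering: it catches a source that suddenly floods the pipeline, which no content signature would detect.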
Granular Access Controls
• Designed for performance, with almost no
security in mind
• Security in Big Data is still ongoing research
• Table, row or cell level access control is
missing
• Adhoc queries pose additional challenges
• Access control is disabled by default
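As an illustration of what row and cell level control means in practice, the sketch below filters rows by department and masks columns by role, with deny as the default. The policy table, column names, and function names are hypothetical; real platforms enforce this inside the query engine rather than in application code.

```python
from typing import Any, Dict, Iterable, List, Set

# Hypothetical policy: column -> roles allowed to read it.
# Columns missing from the policy are denied by default.
COLUMN_POLICY: Dict[str, Set[str]] = {
    "dept": {"doctor", "analyst"},
    "zip_code": {"doctor", "analyst"},
    "diagnosis": {"doctor"},
}

Row = Dict[str, Any]


def mask_row(row: Row, roles: Set[str]) -> Row:
    """Cell-level control: drop every column the caller's roles may not read."""
    return {
        col: val for col, val in row.items()
        if COLUMN_POLICY.get(col, set()) & roles
    }


def authorized_view(rows: Iterable[Row], roles: Set[str], dept: str) -> List[Row]:
    """Row-level control: only rows of the caller's department, then masked."""
    return [mask_row(r, roles) for r in rows if r.get("dept") == dept]
```

Applying the same view function to every adhoc query result is one way to keep free-form queries from bypassing the policy.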
Insecure Data Storage
• Data resides on various nodes; Authentication,
Authorization & Encryption are challenging
• Autotiering moves cold data to less secure
media
o What if cold data is sensitive?
• Encryption of real-time data can have
performance impacts
• Secure communication among nodes,
middleware and end users is disabled by
default
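One possible answer to the cold-data question is to make the tiering decision sensitivity-aware. The sketch below is an assumption about how such a policy could look, not a description of any product: the `Block` fields, the tier names, and the access threshold are hypothetical.

```python
from dataclasses import dataclass

HOT_THRESHOLD = 100  # hypothetical: reads in the last 30 days that count as "hot"


@dataclass
class Block:
    block_id: str
    reads_last_30_days: int
    sensitive: bool
    encrypted: bool


def choose_tier(block: Block) -> str:
    """Pick a storage tier without moving unprotected sensitive data downward."""
    if block.reads_last_30_days >= HOT_THRESHOLD:
        return "hot"
    if block.sensitive and not block.encrypted:
        # Cold but sensitive and unencrypted: keep it on the more secure
        # tier until it has been encrypted.
        return "hot"
    return "cold"
```

With this rule, plain autotiering still applies to non-sensitive data, and sensitive data becomes eligible for the cheaper tier only once it is encrypted.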
Privacy Concerns in Data Mining
and Analytics
• Monetization of Big Data generally involves
Data Mining and Analytics
• Sharing of results involves multiple
challenges
o Invasion of Privacy
o Invasive Marketing
o Unintentional Disclosure of Information
• Examples
o AOL's release of anonymized search logs: users could
easily be identified
o Netflix faced a similar problem
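Removing names is not enough when the remaining attributes single a person out. A common way to measure this risk is k-anonymity, which is not named in the slides and is used here only as an illustration: every combination of quasi-identifier values should be shared by at least k records. The column names below are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List, Sequence

Record = Dict[str, Any]


def smallest_group(records: List[Record], quasi_identifiers: Sequence[str]) -> int:
    """Size of the smallest group of records sharing the same quasi-identifiers."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def is_k_anonymous(records: List[Record], quasi_identifiers: Sequence[str], k: int) -> bool:
    """True if no quasi-identifier combination is shared by fewer than k records."""
    return smallest_group(records, quasi_identifiers) >= k


released = [
    {"zip_code": "12345", "age_band": "30-39", "query": "a"},
    {"zip_code": "12345", "age_band": "30-39", "query": "b"},
    {"zip_code": "99999", "age_band": "70-79", "query": "c"},  # unique, so re-identifiable
]
print(is_k_anonymous(released, ["zip_code", "age_band"], k=2))  # False
```

A check like this can run before any dataset or analytics result is shared outside the organization.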
Top 5 Best Practices
• Secure your Computation Code
o Implement access control, code signing and dynamic
analysis of computation code
o Have a strategy to protect data in case of untrusted code
• Implement Comprehensive Input Validation
and Filtering
o Implement validation and filtering of input data from
internal or external sources
o Evaluate the input validation and filtering of your Big Data
solution
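The code signing idea can be sketched as a gate in front of job submission: computation code is accepted only if its signature verifies. The example uses an HMAC with a shared key as a stand-in, which is an assumption made to stay within the Python standard library; real code signing would normally use public-key signatures. The function names are hypothetical.

```python
import hashlib
import hmac


def sign_job(code: bytes, key: bytes) -> str:
    """Produce a signature for computation code (HMAC-SHA256 stand-in)."""
    return hmac.new(key, code, hashlib.sha256).hexdigest()


def verify_job(code: bytes, signature: str, key: bytes) -> bool:
    """Check the signature using a constant-time comparison."""
    return hmac.compare_digest(sign_job(code, key), signature)


def submit_job(code: bytes, signature: str, key: bytes) -> str:
    """Refuse to schedule code that is unsigned or was modified after signing."""
    if not verify_job(code, signature, key):
        raise PermissionError("unsigned or modified computation code")
    return "accepted"
```

A mapper that passes review is signed once; any later change to its bytes makes `submit_job` reject it.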
Top 5 Best Practices
• Implement Granular Access Control
o Review Role and Privilege Matrix
o Review permission to execute Adhoc queries
o Enable Access Control
• Secure your Data Storage and Computation
o Sensitive Data should be segregated
o Enable Data encryption for sensitive data
o Audit Administrative Access on Data Nodes
o API Security
Top 5 Best Practices
• Review and Implement Privacy Preserving
Data Mining and Analytics
o Analytics data should not disclose sensitive
information
• Get the Big Data Audited
Thank You
jitendra.chauhan@ivizsecurity.com
http://www.ivizsecurity.com/blog/
Big Data Architecture
Key Insights
• Distributed Architecture & Auto Tiering
• Real Time, Streaming and Continuous
Computation
• Adhoc Queries
• Parallel and Powerful Computation
Language
• Move the Code, Not the data
• Non Relational Data
• Variety of Input Sources
Top 5 Security Risks
• Insecure Computation
• End Point Input Validation and
Filtering
• Granular Access Control
• Insecure Data Storage and
Communication
• Privacy Preserving Data Mining and
Analytics


Big data security challenges and recommendations!


Editor's Notes

  • #5: Data is partitioned, distributed and replicated among multiple Data Nodes (1000s of Data Nodes). Autotiering: moving the hottest data to high-performance drives and the coldest data to low-performance, less secure drives.