Hadoop users understand that the following barriers must be overcome to have a secure, functioning Data Lake.
1. Hadoop strips Cataloguing/Metadata from files as they enter the ecosystem.
- Until the Cataloguing/Metadata information is rebuilt, the Data Lake is of little value.
2. Big Data software products lack the sophisticated security mechanisms available with legacy databases.
- As a result, Hadoop Data Lakes are soft targets for intruders to penetrate.
3. Locating Personally Identifiable Information and other Sensitive data is difficult and may require many person-hours from Data Scientists.
- Therefore, many Data Lakes contain undiscovered Personally Identifiable Information and Sensitive data fields that are vulnerable to attackers.
Hadoop databases are designed to capture and organize enormous volumes of raw data, reaching into the petabytes. A properly built Data Lake can provide a
company with a 360-degree view of its activities, customers and machinery, but can also supply hackers with the same bounty of information if not properly managed.
BigDataRevealed was designed to address each of the above major weaknesses in Hadoop with limited effort from Data Scientists and Data Management staff.
This is why you want BigDataRevealed on your side to help create a useful and protected Data Lake.
- As data streams or is imported into your Data Lake, BigDataRevealed’s Intelligent Catalogue re-creates the catalogue data and metadata that were stripped away and determines the business classification for more precise columnar naming.
- As data streams into your Data Lake, BDR’s Intelligent Catalogue also identifies PII and other Sensitive data and Sequesters/Encrypts the fields before writing them to HDFS or HBase. The decryption key is safely stored outside of Hadoop. PII and Sensitive Data are never exposed.
- The same processes can be run against ‘data at rest’ as well as streaming data with little effort.
- BDR provides a graphical interface for connecting IoT and Social Media data feeds directly to your Data Lake, eliminating the need for Data Scientists to build a unique connector for every data feed you wish to process. This saves many hours of coding and testing while automating the SecureSequester/Encrypt of Personally Identifiable Information.
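To make the "protect fields before they are written" idea concrete, here is a minimal Python sketch. It is not BigDataRevealed's implementation: the pattern set, the function names and the demo key are all assumptions made for illustration, and a keyed HMAC token (which cannot be reversed) stands in for real encryption so that the example needs only the standard library.

```python
import hashlib
import hmac
import re

# Hypothetical patterns; a real deployment would need a far broader set.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def tokenize(value, key):
    """Replace a value with a keyed, non-reversible token (stand-in for encryption)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


def protect_record(record, key):
    """Return a protected copy of the record and the names of the fields changed."""
    protected, flagged = {}, []
    for field_name, value in record.items():
        text = str(value)
        if any(p.search(text) for p in PII_PATTERNS.values()):
            protected[field_name] = tokenize(text, key)
            flagged.append(field_name)
        else:
            protected[field_name] = value
    return protected, flagged


# The key would be held outside the cluster; it is hard-coded only for this demo.
demo_key = b"demo-key-kept-outside-hadoop"
safe, flagged = protect_record(
    {"device_id": "sensor-17", "owner": "jane@example.com", "reading": 21.5},
    demo_key,
)
```

Only the protected copy would be handed to the storage layer; the original record never reaches disk in clear text.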
BigDataRevealed addresses the weaknesses inherent in Hadoop and makes IoT as easy as 1-2-3, all with SecureSequester/Encrypt
BigDataRevealed employs 3 components
• The producer is the tool that tells BigDataRevealed about the stream of data that you wish to include in the data lake
• The Intelligent Catalogue (SecureSequester Facility) defines what patterns are to be detected as potential Personally Identifiable Information (PII) and which ones have been deemed false positives
• The Intelligent Catalogue (SecureSequester Facility) then takes the configuration information and applies it to streams of information intended for your data lake
BigDataRevealed employs three components to protect streams of data made available by IoT and other
devices that write in your data lake
IoT as easy as 1-2-3
Protection for EU GDPR – US and worldwide Data Protection, Sequester / Encryption
1. Producer is used to register potential streams of information to BigDataRevealed. Connections are automatically generated.
2. The SecureSequester Administrative workbench is used to define PII patterns and known false positives. Set the duration and parameters for the Stream Job.
3. SecureSequester will interrogate streams as they are introduced to the data lake, encrypt potential PII, dispatch alerts for review and sequester encrypted source data.
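The three steps above can be sketched roughly as follows. Everything here is hypothetical: `StreamJob`, `interrogate`, the phone-number pattern and the `max_records` limit (a simple stand-in for a job duration) are invented for illustration and do not reflect the product's actual interfaces.

```python
import re
from dataclasses import dataclass, field


@dataclass
class StreamJob:
    """Hypothetical job settings; the field names are illustrative only."""
    patterns: dict
    false_positives: set = field(default_factory=set)
    max_records: int = 1000  # simple stand-in for a job duration


def interrogate(stream, job):
    """Yield (record, alerts) pairs; alerts lists (field, pattern_name) hits."""
    for count, record in enumerate(stream):
        if count >= job.max_records:
            break
        alerts = []
        for name, value in record.items():
            text = str(value)
            if text in job.false_positives:
                continue  # known false positive, skip it
            for label, pattern in job.patterns.items():
                if pattern.search(text):
                    alerts.append((name, label))
        yield record, alerts


# Step 1: register a stream (here, just a named list of records).
streams = {"thermostats": [
    {"id": "t-1", "note": "call 847-555-0100"},
    {"id": "t-2", "note": "000-000-0000"},
]}
# Step 2: define patterns and known false positives.
job = StreamJob(
    patterns={"us_phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b")},
    false_positives={"000-000-0000"},
)
# Step 3: interrogate the stream and collect alerts for review.
results = list(interrogate(streams["thermostats"], job))
```

Keeping the false-positive list separate from the patterns means a reviewer can suppress a noisy value without loosening the pattern itself.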
Cataloguing / Metadata / Columnar Naming
1. Executive Summary of Discovery Process and Patterns Detected.
2. BigDataRevealed creating Business Column Classification.
3. Creating the File Headers, Catalog Info and Collaborative usage.
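As a rough illustration of rebuilding column names for a headerless file, the sketch below labels each column by its dominant value type. The classifier patterns, the 80% threshold and the naming scheme are assumptions chosen for the example, not the product's classification logic.

```python
import re
from collections import Counter

# Hypothetical value classifiers, checked in order.
CLASSIFIERS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$"),
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "integer": re.compile(r"^-?\d+$"),
}


def classify_value(value):
    """Return the first matching classifier label, or 'text' if none match."""
    for label, pattern in CLASSIFIERS.items():
        if pattern.match(value):
            return label
    return "text"


def infer_headers(rows, threshold=0.8):
    """Name each column after its dominant value type, or 'unknown_N' if none dominates."""
    headers = []
    for index, column in enumerate(zip(*rows)):
        label, hits = Counter(classify_value(v) for v in column).most_common(1)[0]
        if hits / len(column) >= threshold:
            headers.append(f"{label}_{index}")
        else:
            headers.append(f"unknown_{index}")
    return headers


rows = [
    ["1001", "ann@example.com", "2017-03-01"],
    ["1002", "bob@example.com", "2017-03-04"],
    ["1003", "cy@example.com", "n/a"],
]
headers = infer_headers(rows)
```

Columns that fall below the threshold are left as `unknown_N` so that a person can review them rather than trusting a weak guess.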
Pattern Discovery of Private Data then Sequester / Encrypt
Protection for EU GDPR
1. View the File/Column where Personal Data or Discovered Data was found and select Sequester.
2. Select individual columns by data type to Encrypt, or select the entire file to Encrypt. Then Run Process.
3. View the results, inspecting the actual data to see which Columns, or whether the entire file, were encrypted.
Pattern Discovery of Private Data then Consolidate into one Folder for further Analysis and Remediation
1. View the File/Column where Personal Data or Discovered Sensitive Data was found and select Sequester / Encrypt for the sensitive data.
2. Select files to be copied into a new folder containing like data for further analytics and remediation.
3. View the results of the files that were written into the New Folder in step 2.
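The consolidation step can be pictured with a small Python sketch. It uses a temporary local directory in place of HDFS, and the `consolidate` function, the SSN pattern and the folder names are hypothetical, chosen only to show the idea of copying flagged files into one review folder.

```python
import re
import shutil
import tempfile
from pathlib import Path

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # hypothetical example pattern


def consolidate(source_dir, target_dir, pattern):
    """Copy every CSV file whose text matches the pattern into target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source_dir.glob("*.csv")):
        if pattern.search(path.read_text(encoding="utf-8")):
            shutil.copy2(path, target_dir / path.name)
            copied.append(path.name)
    return copied


# A local temporary directory stands in for the data lake in this demo.
with tempfile.TemporaryDirectory() as tmp:
    lake = Path(tmp) / "lake"
    lake.mkdir()
    (lake / "hr.csv").write_text("name,ssn\nAnn,123-45-6789\n", encoding="utf-8")
    (lake / "sensors.csv").write_text("id,temp\n7,21.5\n", encoding="utf-8")
    copied = consolidate(lake, Path(tmp) / "review", SSN)
    review_files = sorted(p.name for p in (Path(tmp) / "review").iterdir())
```

The files are copied rather than moved, so the source folder stays intact while the review folder is analysed and remediated.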
info@bigdatarevealed.com
847-791-7838

BigDataRevealed SecureSequesterEncrypt - iot easy as 1-2-3 - catalog-metadata discovery
