Big Data Governance in Hadoop
with Cloudera Navigator
Emre Sevinç
emre.sevinc@bigindustries.be
Agenda
● Introduction
● What is data governance and why should you care about it?
● What is Cloudera Navigator and how does it fit in?
● Cloudera Navigator Demonstration
● What’s new in the latest release of Cloudera 5.10?
Where are You with Hadoop? 1/2
Your relationship with Hadoop ...
● Still learning
● Evaluating distributions
● Testing / Development / Prototyping
● In production
Where are You with Hadoop? 2/2
You are using / planning to use Hadoop in ...
● Banking
● Telecom
● Healthcare
● Media / entertainment
● Internet services ...
Data Governance...
“... refers to the overall management of the availability,
usability, integrity, auditability, and security of the data
employed in an enterprise.
A sound data governance program includes a governing
body, a defined set of procedures and policies, and a plan
to execute them.
Data governance is used by organizations to exercise
control over processes and methods used by their data
stewards in order to improve data quality.”
Cloudera Big Data Maturity Survey 2016
https://goo.gl/d3A0ps
Data Governance & Challenges
● Compliance Officers: how to track, understand, and protect access to sensitive data?
○ Am I prepared for an audit?
○ Who’s accessing what data?
○ What are they doing with the data?
○ Is sensitive data governed and protected?
● Data Stewards and Curators: how to manage and organize data assets at Hadoop
scale?
○ How to efficiently manage the data lifecycle from ingest to purge?
○ How to classify data efficiently?
○ How to make data available to end users efficiently?
● Data Scientists and BI Users: how to effortlessly find and trust the data that matter
the most?
○ How can I explore data on my own?
○ Can I trust what I find?
○ How to find related data sets?
● Hadoop Administrators and DBAs: how to boost user productivity and cluster
performance?
○ How is data being used today?
○ How can I optimize for future workloads?
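The compliance questions above ("Who's accessing what data?", "What are they doing with the data?") can be illustrated with a minimal sketch. The event records and field names below are hypothetical and chosen only for illustration; they are not Cloudera Navigator's actual audit schema or API.

```python
from collections import defaultdict

# Hypothetical audit events. Field names are illustrative assumptions,
# not Cloudera Navigator's real audit record format.
AUDIT_EVENTS = [
    {"user": "alice", "operation": "open", "resource": "/data/customers/pii.csv", "allowed": True},
    {"user": "bob", "operation": "delete", "resource": "/data/customers/pii.csv", "allowed": False},
    {"user": "alice", "operation": "open", "resource": "/data/logs/web.log", "allowed": True},
    {"user": "carol", "operation": "rename", "resource": "/data/customers/pii.csv", "allowed": True},
]


def access_by_resource(events, path_prefix):
    """Group (user, operation, allowed) tuples by resource under path_prefix."""
    summary = defaultdict(list)
    for event in events:
        if event["resource"].startswith(path_prefix):
            summary[event["resource"]].append(
                (event["user"], event["operation"], event["allowed"])
            )
    return dict(summary)


def denied_attempts(events):
    """Return the events where access was refused."""
    return [event for event in events if not event["allowed"]]


# Who touched the (assumed) sensitive directory, and what did they do?
sensitive = access_by_resource(AUDIT_EVENTS, "/data/customers/")
# Which attempts were refused?
denied = denied_attempts(AUDIT_EVENTS)
```

In a real deployment the events would come from the cluster's audit log rather than a hard-coded list; the grouping and filtering logic is the part this sketch is meant to show.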
Your Hadoop data management concern
is...
● Compliance, e.g. EU General Data Protection Regulation
(GDPR)
● Stewardship (lifecycle management)
● Curation (metadata tagging)
● Enabling end-user self-service
● Administration (optimization)
● Other
What is Cloudera Navigator?
How does Cloudera Navigator fit into
the Big Data Governance picture?
Cloudera Navigator Governance Foundation
● Unified Auditing
● Comprehensive Lineage
● Unified Metadata
● Universal Policies
Cloudera Navigator
● Trusted for production: deployed at 100s of customers in
various industries, running in production for 4 years
● Compliance-ready: Cloudera is the first Hadoop
distribution that passed an independent PCI audit
● Integrates well with industry-leading partner solutions
Integration with Others 1/2
Integration with Others 2/2
https://github.com/cloudera/navigator-sdk
Lineage
Metadata - Business & Technical
Cloudera Navigator Demo
Unified Auditing
What’s new in Cloudera 5.10 (1/3)
● Comprehensive Governance for the Cloud
○ Cataloging, metadata management, and
comprehensive lineage for data on Amazon S3
○ The only big data governance solution for data
stored on-premises as well as in the cloud
What’s new in Cloudera 5.10 (2/3)
Comprehensive
Governance for the Cloud
What’s new in Cloudera 5.10 (3/3)
● Policy-based business metadata assignment and validation
● Major performance optimizations
● Refreshed look-and-feel for increased data stewardship
productivity
● Solr indexing has been optimized to improve search speed
and reduce memory requirements.
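To make "policy-based business metadata assignment and validation" more concrete, here is a minimal sketch of the general idea: a rule that tags matching entities and reports missing required properties. The entity layout, the policy fields, and the `apply_policy` helper are all hypothetical and do not represent Navigator's policy engine or its API.

```python
import re

# Hypothetical catalog entities. The structure is an illustrative assumption.
ENTITIES = [
    {"name": "customers_pii", "type": "table", "tags": set(), "properties": {}},
    {"name": "web_logs", "type": "table", "tags": set(), "properties": {"owner": "ops"}},
]

# Hypothetical policy: tag entities whose name ends in "_pii" as sensitive
# and require them to carry an "owner" property.
POLICY = {
    "match": re.compile(r"_pii$"),
    "assign_tags": {"sensitive"},
    "required_properties": ["owner"],
}


def apply_policy(entity, policy):
    """Tag a matching entity in place and return its missing required properties.

    Entities that do not match the policy are left unchanged and yield [].
    """
    if not policy["match"].search(entity["name"]):
        return []
    entity["tags"] |= policy["assign_tags"]
    return [p for p in policy["required_properties"] if p not in entity["properties"]]


# Assignment and validation in one pass: entity name -> missing properties.
violations = {entity["name"]: apply_policy(entity, POLICY) for entity in ENTITIES}
```

Here assignment (adding tags) and validation (listing missing properties) happen in a single pass; a real system would more likely run them as separate, scheduled steps.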
Thanks for attending!
Questions? Comments?
emre.sevinc@bigindustries.be

Big Data Governance in Hadoop Environments with Cloudera Navigatorfeb2017meetu
