Webinar: Decoding the Mystery - How to Know if You Need a Data Catalog, a Data Dictionary or a Business Glossary
Amichai Fenner, Product Lead, Octopai
With over 7 years of experience working as a full-stack
BI expert, Amichai has expertise in BI methodology
and architecture, as well as technical skills in various
BI tools, from ETL to reporting and analytics. He
currently manages Octopai’s automated data catalog.
Malcolm Chisholm, Ph.D., President,
Data Millennium
A thought leader, author, and speaker in data
governance and data management, Malcolm has over
25 years of experience in data-related disciplines and
has worked in a variety of sectors, including finance,
manufacturing, government, pharmaceuticals, and
telecoms. Malcolm has been awarded the prestigious
DAMA International Professional Achievement Award
for contributions to Master Data Management and
Reference Data Management.
The Shift to Data-Centricity
High-Level Metadata Storage
Business Glossary
• Manage Terminology for both Information and Data Concepts
• Manage Definitions
• Manage Classifications
Data Dictionary
• Schema > Table > Column Structural Metadata
• Data Profiling Information
• Data Universe Information
• Other Relational Data Objects, e.g. Views
Data Catalog
• Information on Files, Datasets
• Information on Reports, Other Data Assets
• Attaches definitions to data assets
Diagram arrow labels: "Provides Terminology and Semantics for"; "Provides Data Structures / Profiles for"
Traditional Usage by Role
Matrix of capability by role (cell values not preserved)
Capabilities (rows): Business Glossary, Data Catalog, Data Dictionary
Roles (columns): Business User, Self-Service User, Data Architect, Data Engineer, DBA, Data Governance Professional
Data Catalogs Need Content
Chart axes: Time, Level of Content
• Data Catalog based on automation
• Data Catalog based on user input
• Minimum level of content needed for business adoption
• Production rollout of Data Catalog with automation
Data Universes
Metadata Consolidation

Database: table CUST_MSTR
CFN | CMI | CL
Immanuel | | Kant
Georg | W | Hegel

Report: Customer Profile
Customer First Name | Customer Middle Initial | Customer Last Name
Immanuel | | Kant
Georg | W | Hegel

Report: Daily Customer Tracking
First Name | MI | Last Name
Immanuel | | Kant
Georg | W | Hegel

Business Glossary
Business Term | Synonym of
Customer First Name |
First Name | Customer First Name
Customer Middle Initial |
MI | Customer Middle Initial
Customer Last Name |
Last Name | Customer Last Name

Data Catalog Functionality: Consolidated View (Data Catalog)
Business Term | Synonym of | Report | Database Column | Database Table
Customer First Name | | Customer Profile | CFN | CUST_MSTR
First Name | Customer First Name | Daily Customer Tracking | CFN | CUST_MSTR
Customer Middle Initial | | Customer Profile | CMI | CUST_MSTR
MI | Customer Middle Initial | Daily Customer Tracking | CMI | CUST_MSTR
Customer Last Name | | Customer Profile | CL | CUST_MSTR
Last Name | Customer Last Name | Daily Customer Tracking | CL | CUST_MSTR
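
The consolidation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not how any particular catalog product works: the dictionary layouts, the names GLOSSARY, DICTIONARY, REPORTS, and the function consolidate are all assumptions made for the example. Only the terms, report names, and column names come from the slide.

```python
# Illustrative data taken from the slide; the dict layouts are assumptions.

# Business glossary: term -> preferred term it is a synonym of (None = preferred term).
GLOSSARY = {
    "Customer First Name": None,
    "First Name": "Customer First Name",
    "Customer Middle Initial": None,
    "MI": "Customer Middle Initial",
    "Customer Last Name": None,
    "Last Name": "Customer Last Name",
}

# Data dictionary link: preferred term -> (database table, database column).
DICTIONARY = {
    "Customer First Name": ("CUST_MSTR", "CFN"),
    "Customer Middle Initial": ("CUST_MSTR", "CMI"),
    "Customer Last Name": ("CUST_MSTR", "CL"),
}

# Reports: report name -> field labels shown on that report.
REPORTS = {
    "Customer Profile": [
        "Customer First Name", "Customer Middle Initial", "Customer Last Name",
    ],
    "Daily Customer Tracking": ["First Name", "MI", "Last Name"],
}


def consolidate(glossary, dictionary, reports):
    """Build catalog rows: (term, synonym of, report, column, table)."""
    rows = []
    for report, labels in reports.items():
        for label in labels:
            synonym_of = glossary[label]
            # Resolve a synonym to its preferred term before the dictionary lookup.
            preferred = synonym_of or label
            table, column = dictionary[preferred]
            rows.append((label, synonym_of, report, column, table))
    return rows


if __name__ == "__main__":
    for row in consolidate(GLOSSARY, DICTIONARY, REPORTS):
        print(row)
```

The point of the sketch is that the glossary supplies the synonym resolution, the dictionary supplies the physical location, and the catalog row is the join of the two with the report that uses the term.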
Problems
1. How do you collect the metadata?
2. How does all the metadata get related (how do you establish relationships among it)?
3. How do you keep it updated?
Technical Metadata Needs Automation to Gather It
• The scale and complexity of data ecosystems are just too large for human effort
• How do you find the relations among the metadata?
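
As a rough sketch of what automated harvesting of technical metadata can look like, the example below reads table and column structure from a SQLite database and adds simple profiling counts. SQLite and Python's standard sqlite3 module are used only because they need no setup; the function names harvest and demo_connection, the output field names, and the sample rows are assumptions for illustration, and a real ecosystem would involve many more source types.

```python
import sqlite3


def harvest(conn):
    """Return one entry per column: structure plus simple profiling counts."""
    entries = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, column, col_type, *_rest in conn.execute(
            f'PRAGMA table_info("{table}")'
        ).fetchall():
            nulls, distinct = conn.execute(
                f'SELECT SUM("{column}" IS NULL), COUNT(DISTINCT "{column}") '
                f'FROM "{table}"'
            ).fetchone()
            entries.append({
                "table": table,
                "column": column,
                "type": col_type,
                "row_count": row_count,
                "null_count": nulls or 0,  # SUM is NULL for an empty table
                "distinct_count": distinct,
            })
    return entries


def demo_connection():
    """Hypothetical sample database mirroring the CUST_MSTR example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE CUST_MSTR (CFN TEXT, CMI TEXT, CL TEXT)")
    conn.executemany(
        "INSERT INTO CUST_MSTR VALUES (?, ?, ?)",
        [("Immanuel", None, "Kant"), ("Georg", "W", "Hegel")],
    )
    return conn


if __name__ == "__main__":
    for entry in harvest(demo_connection()):
        print(entry)
```

Re-running a harvest like this on a schedule is one simple way to address the "how do you keep it updated" question; relating the harvested columns to each other is the part that lineage analysis has to supply.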
Data Lineage
Data Lineage can harvest metadata and build the relationships within it
At a Very High Level
Business Glossary
• Manage Terminology for both Information and Data Concepts
• Manage Definitions
• Manage Classifications
Data Dictionary
• Schema > Table > Column Structural Metadata
• Data Profiling Information
• Data Universe Information
• Other Relational Data Objects, e.g. Views
Data Catalog
• Information on Files, Datasets
• Information on Reports, Other Data Assets
• Attaches definitions to data assets
Diagram arrow labels: "Provides Terminology and Semantics for"; "Provides Data Structures / Profiles for"
Data Traceability
• There are well-understood use cases for data traceability in impact analysis (if something is changed, what will be impacted?)
• Similarly, data lineage is well understood (where did the data in this report come from, especially if it seems to be in error? Or what broke my ETL process?)
• But data traceability is becoming a general data governance requirement, as in BCBS 239, where you have to prove that data in reports comes from operational environments
Diagram elements: Dataset Processing Environment, Risk Data Mart, Risk Reports, Manual Adjustment
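
Both questions on this slide, lineage (where did this come from?) and impact analysis (what is affected if this changes?), reduce to walking a directed graph in opposite directions. The sketch below shows that idea under stated assumptions: the node names are the diagram elements from the slide, but the edge directions and the "Operational System" node are hypothetical, added only so the example has a source to trace back to.

```python
from collections import defaultdict, deque

# (source, target) data flows. Directions are assumed for illustration.
EDGES = [
    ("Operational System", "Dataset Processing Environment"),  # hypothetical node
    ("Dataset Processing Environment", "Risk Data Mart"),
    ("Risk Data Mart", "Risk Reports"),
    ("Manual Adjustment", "Risk Reports"),
]


def build(edges):
    """Index the edges in both directions."""
    downstream, upstream = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        downstream[src].add(dst)
        upstream[dst].add(src)
    return downstream, upstream


def reach(start, adjacency):
    """Breadth-first search: every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


DOWNSTREAM, UPSTREAM = build(EDGES)


def lineage(node):
    """Everything the node's data comes from."""
    return reach(node, UPSTREAM)


def impact(node):
    """Everything affected if the node changes."""
    return reach(node, DOWNSTREAM)


if __name__ == "__main__":
    print("Lineage of Risk Reports:", sorted(lineage("Risk Reports")))
    print("Impact of Risk Data Mart:", sorted(impact("Risk Data Mart")))
```

In this toy graph, tracing lineage from the risk reports surfaces the manual adjustment alongside the automated flow, which is the kind of evidence a traceability requirement asks for.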
Conclusion
• Business Glossary, Data Dictionary, and Data Catalog each have a different focus in terms of the types of metadata they manage
• But there are relationships between them
• The Business Glossary gives business meaning to the technical metadata, which is not otherwise understandable by businesspeople
• Automation is needed to harvest metadata
• Data Lineage is a great way to do this and establish the needed relationships in the metadata
• Data Lineage is essential for creating trust in the data by providing full traceability
• The Data Catalog then becomes the place where all this information is integrated, and becomes the one-stop shop for understanding and collaborating on data
Data teams face major challenges
Lack of visibility & control of data and business knowledge scattered throughout the data eco-system
Main challenges in the data eco-system
• Loss of tribal knowledge
• Inefficient use of data & lack of independence in using data
• Single Source Of Truth
• Increased pressure on the data team for analytics & reports
• Ever-growing amount of data in the organization
A day in the life of the data ecosystem
Achieving Data Literacy
Leverage automation to create one source of truth for your data
Data Lineage: Trace any data end-to-end through your entire data eco-system, in seconds.
Data Discovery: Find the data you need anywhere in your data eco-system, in seconds.
Data Catalog: Create company-wide consistency with a self-creating, self-updating data catalog.
Let’s see what we’re talking about
An effective data catalog will help your users answer questions such as:
o Where should I look for my data?
o Does this data matter?
o What does this data represent?
o Is this data relevant and important?
o How can I use this data?
Data Catalog connects all data citizens in your eco-system (Before / After illustration)
Q&A
THANK YOU
Got any questions?
Malcolm Chisholm, President of Data Millennium
mchisholm@datamillennium.com
Amichai Fenner, Product Lead, Octopai
amichaif@octopai.com
