Improving Data Management through 
Utilizing Big Data:
Mapping a Technology to a Data Concept
March 10, 2015
Mike Jennings – Walgreens Boots Alliance
©2015 Walgreens Boots Alliance. All rights reserved. Confidential and proprietary information
Big Data
Defining
Describes any voluminous amount of structured, 
semi‐structured and unstructured data that has 
the potential to be analyzed for information.
From www.bizcubed.com.au
Enterprise Data Management Framework
Starting EDM Definition
Enterprise Data Management Framework
Context with the DMBOK Framework
Enterprise Data Management Framework
Alternative EDM Framework
Metadata Management
Data Context
Data Model/Classification
Data Structure and Framework
Structured Data Management
Unstructured Data Management
Master Data & Reference Data Management
Business Intelligence & Data Warehousing
Data Quality Management
Data Security Management
Data Integration Management
Data Delivery Management
Data Governance: Policies, Processes, Standards, Organization, and Stewardship
DMBOK Functions & Big Data Projects
Data Storage & Operations
The technologies and processes organizations use to maximize or improve 
the performance of their data storage resources.
File system that provides the ability to store large volumes of 
structured and unstructured data
Operations, resource (node), and scheduling management for 
writing to and reading from the cluster
Workflow scheduling component for data transformations
Manages services, configurations, and their synchronization 
across the cluster
DMBOK Functions & Big Data Projects
Data Integration & Interoperability
The combination of technical and business processes used to combine 
data from disparate sources into a meaningful and unified view, according 
to business requirements and accepted practices.
Provides real‐time processing of data streams for monitoring 
and alerts.
Provides ability to import data from an RDBMS to HDFS. 
Provides ability to collect, aggregate, and move huge log files 
into HDFS (e.g., apps, GPS, social, sensors, other).
Provides high volume fault tolerant publish & subscribe 
messaging for real‐time analysis.
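The publish & subscribe pattern named above can be sketched in a few lines. This is a minimal in-memory illustration, not the API of any particular messaging product; the TopicLog class, topic name, and sample readings are all hypothetical. It shows the key property for real-time analysis: each subscriber keeps its own read position, so several consumers can process the same stream independently.

```python
from collections import defaultdict


class TopicLog:
    """Append-only in-memory log per topic; each subscriber tracks its own offset."""

    def __init__(self):
        self._messages = defaultdict(list)  # topic -> ordered messages
        self._offsets = defaultdict(int)    # (topic, subscriber) -> next index to read

    def publish(self, topic, message):
        """Append a message and return its position in the topic log."""
        self._messages[topic].append(message)
        return len(self._messages[topic]) - 1

    def poll(self, topic, subscriber):
        """Return messages the subscriber has not yet seen and advance its offset."""
        start = self._offsets[(topic, subscriber)]
        unseen = self._messages[topic][start:]
        self._offsets[(topic, subscriber)] = start + len(unseen)
        return unseen


log = TopicLog()
log.publish("sensor-readings", {"device": "gps-1", "speed": 42})  # hypothetical data
log.publish("sensor-readings", {"device": "gps-2", "speed": 97})

# Two independent subscribers each receive every message.
alerts = [m for m in log.poll("sensor-readings", "alerting") if m["speed"] > 90]
archive = log.poll("sensor-readings", "archiver")
```

A production system would add persistence, partitioning, and replication for fault tolerance; none of that is modeled here.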
DMBOK Functions & Big Data Projects
Data Quality
A measure of the degree to which data satisfies the information needs of its 
consumers, reflects the nature and state of the real world concepts to which it 
relates, is coherent within itself, and provides value in the decision‐making 
processes for which it is to be utilized.
Provides relational structure to HDFS data. File 
formats can be applied to data from HDFS or local file 
system.
Provides ability to import data from an RDBMS to 
HDFS. Imported data can be constrained through 
import control arguments and basic SQL execution.
Provides ability to collect, aggregate, and move huge 
log files into HDFS (e.g., apps, GPS, social, sensors, 
other). A Flume agent can be used with predefined data 
patterns (sinks) to ensure data format.
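As a rough illustration of constraining an import through import control arguments, the sketch below assembles a command line for Apache Sqoop. That the slide refers to Sqoop is an assumption, since the tool is not named in the text. The connection string, table, columns, filter, and target directory are hypothetical; the helper only builds the argument list and does not run anything.

```python
import shlex


def build_import_command(jdbc_url, table, target_dir, columns=None, where=None, mappers=1):
    """Build an argument list for a constrained `sqoop import` run."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ]
    if columns:
        # Restrict the import to the listed columns.
        args += ["--columns", ",".join(columns)]
    if where:
        # Restrict the import to rows matching a SQL condition.
        args += ["--where", where]
    return args


command = build_import_command(
    jdbc_url="jdbc:mysql://db.example.com/sales",  # hypothetical source database
    table="orders",                                 # hypothetical table
    target_dir="/data/raw/orders",                  # hypothetical HDFS path
    columns=["order_id", "store_id", "order_date"],
    where="order_date >= '2015-01-01'",
)
print(shlex.join(command))
```

Limiting columns and rows at import time keeps records that are already known to be out of scope from landing in the cluster, which is one way the import step can support data quality.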
DMBOK Functions & Big Data Projects
Meta‐data
All the physical data and knowledge about the business and technical 
processes used by an organization.  Meta‐data is knowledge about the 
organization’s data. 
Provides data lineage between data sources and the cluster 
including integration with the metastore/catalog (e.g., Hive 
HCatalog).
Provides relational structure to HDFS data. File formats can be 
applied to data from HDFS or local filesystem.
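Giving relational structure to raw files amounts to keeping the table definition as metadata and applying it when the data is read. The sketch below illustrates that idea in plain Python under stated assumptions: the SCHEMA list, the pipe delimiter, and the sample rows are hypothetical, and no real metastore or catalog is involved.

```python
import csv
import io
from datetime import date

# Hypothetical table definition: column name -> converter applied at read time.
SCHEMA = [
    ("store_id", int),
    ("sale_date", date.fromisoformat),
    ("amount", float),
]


def read_with_schema(raw_text, schema, delimiter="|"):
    """Yield one dict per delimited line, typed according to the schema."""
    for fields in csv.reader(io.StringIO(raw_text), delimiter=delimiter):
        if len(fields) != len(schema):
            continue  # skip rows that do not match the declared structure
        yield {name: convert(value) for (name, convert), value in zip(schema, fields)}


raw_file = "101|2015-03-10|19.99\n102|2015-03-10|5.25\n"  # hypothetical file content
rows = list(read_with_schema(raw_file, SCHEMA))
total = sum(row["amount"] for row in rows)
```

Because the schema lives apart from the file, the same raw data can be described again later with a different definition without rewriting it.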
DMBOK Functions & Big Data Projects
Documents & Content
The management of documents and non‐structured content found 
in audio, video, email, images, etc. and the meta‐data associated 
with this material
Provides ability to collect, aggregate, and move huge log files 
into HDFS (e.g., apps, GPS, social, sensors, email, other).
Provides ability to search data in the cluster by indexing it to 
enable full-text search.
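Indexing for full-text search typically rests on an inverted index, a map from each term to the documents that contain it. The following is a minimal sketch of that data structure, not the implementation of any specific search product; the document ids and contents are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


documents = {  # hypothetical content
    "email-1": "Store inventory report for March",
    "log-7": "Inventory sync failed on node 4",
    "memo-3": "March staffing report",
}
index = build_index(documents)
hits = search(index, "inventory report")
```

The index is built once and then answers queries without rescanning the documents, which is what makes search over large volumes of content practical.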
DMBOK Functions & Big Data Projects
Data Warehousing & Business Intelligence
A data warehouse is a subject‐oriented, integrated, time‐variant and non‐volatile 
collection of data in support of management's decision making process.  Business 
Intelligence is the collection of activities that allow an organization to analyze data 
and make decisions based on facts from historical and predictive data sets. 
Provides fast big table access to large quantities of data 
typically on top of the cluster.
Provides compute algorithm typically used to produce output data 
from a large volume of data in the cluster for consumption.
Provides semantic layer for accessing data in the cluster.
Provides an enhanced compute approach typically used to produce 
output data from a large volume of data in the cluster for consumption.
Provides an in‐memory compute method typically used to produce 
output data from a large volume of data in the cluster for consumption 
(e.g., machine learning algorithms).
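The batch compute pattern of producing a small output data set from a large input can be illustrated with a map, shuffle, and reduce pass. The sketch below runs in a single Python process and only mirrors the shape of the computation; that the slide refers to a MapReduce-style engine is an assumption, and the function names and sample records are hypothetical.

```python
from collections import defaultdict
from functools import reduce


def map_phase(record):
    """Emit (key, value) pairs; here, one (store, amount) pair per sale record."""
    store, amount = record.split(",")
    yield store, float(amount)


def shuffle(pairs):
    """Group all values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single output value."""
    return key, reduce(lambda a, b: a + b, values)


records = ["store-1,10.0", "store-2,4.5", "store-1,2.5"]  # hypothetical input
pairs = (pair for record in records for pair in map_phase(record))
totals = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

On a cluster the map and reduce steps run in parallel across nodes; the per-key totals that come out are the kind of summarized output a warehouse or BI layer would consume.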
DMBOK Functions & Big Data Projects
Data Security
Data security concerns the protection of data from accidental or intentional 
but unauthorized modification, destruction or disclosure through the use of 
physical security, administrative controls, logical controls, and other 
safeguards to limit accessibility.
Provides security authorization (grant/revoke), policy 
administration, and audit for the cluster.
Provides service level authorization for users/groups.
Provides semantic layer (table) for accessing data in the cluster 
that can be secured.
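Authorization with grant/revoke, policy administration, and audit can be pictured as a small policy store that records every change and every access check. This is a minimal sketch under stated assumptions, not the interface of any cluster security product; the PolicyStore class, the group name, and the table name are hypothetical.

```python
from datetime import datetime, timezone


class PolicyStore:
    """Tracks which principal may perform which action on which resource."""

    def __init__(self):
        self._grants = set()  # (principal, action, resource)
        self.audit_log = []   # one entry per administrative change or access check

    def _record(self, event, principal, action, resource, outcome):
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "event": event, "principal": principal,
            "action": action, "resource": resource, "outcome": outcome,
        })

    def grant(self, principal, action, resource):
        self._grants.add((principal, action, resource))
        self._record("grant", principal, action, resource, "ok")

    def revoke(self, principal, action, resource):
        self._grants.discard((principal, action, resource))
        self._record("revoke", principal, action, resource, "ok")

    def is_allowed(self, principal, action, resource):
        allowed = (principal, action, resource) in self._grants
        self._record("check", principal, action, resource, "allow" if allowed else "deny")
        return allowed


policies = PolicyStore()
policies.grant("analysts", "select", "sales.orders")  # hypothetical group and table
before = policies.is_allowed("analysts", "select", "sales.orders")
policies.revoke("analysts", "select", "sales.orders")
after = policies.is_allowed("analysts", "select", "sales.orders")
```

Keeping the audit trail in the same component that makes the decision means every allow and deny is captured, including checks that happen after a revoke.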
DMBOK Functions & Big Data Projects
Data Governance – Potential Opportunity Areas
Provides the organizational oversight, processes and methods to 
effectively manage data as an asset across the organization
Provides data lineage between data sources and the cluster 
including integration with the metastore/catalog (e.g., Hive 
HCatalog).
Provides relational structure to HDFS data. File formats can be 
applied to data from HDFS or local filesystem.
Provides ability to search data in the cluster by indexing it to 
enable full-text search.
Provides security authorization (grant/revoke), policy 
administration, and audit for the cluster.
DMBOK Functions & Big Data Projects
Reference & Master Data – Potential Opportunity Areas
Data about core business entities and concepts, independent of 
transactions, and data that defines the set of permissible values to be 
used by other data fields.
Provides ability to import data from an RDBMS to HDFS. 
Provides semantic layer for accessing data in the cluster.
Bio
Michael Jennings
Senior Director, Enterprise Data Architecture
Walgreens Boots Alliance
1419 Lake Cook Road, MS: L497
Deerfield, IL 60015  USA
847 964 7692
Mike.Jennings@Walgreens.com
www.linkedin.com/in/micahelfjennings
Michael Jennings is a recognized industry expert in enterprise architecture and information
management with more than twenty-five years of experience in various industries. Mike speaks
frequently on enterprise architecture and information management concepts and practices at major
industry conferences.
He is a co-author of the book "Universal Meta Data Models" (2004) and a contributing author to the
books "Building and Managing the Meta Data Repository" (2000) and “The DAMA Guide to the Data
Management Body of Knowledge - DMBOK” (2009).
Mike was recognized with the 2013 DAMA International Professional Achievement Award and as
one of Information Management Magazine’s 25 Top Information Managers for 2012.
He currently serves as VP of Programs for the Wisconsin DAMA Chapter and as VP of Operations
for DAMA International.
