Modern Data Integration
William McKnight
Jake Freivald
Information Builders
McKnight Consulting Group
Expert Sessions
Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 2
Unlock Potential
William McKnight
www.mcknightcg.com
214-514-1444
Modern Data Integration - Expert Sessions
@williammcknight
Data is the Most Important Asset
in the World
• We trade it for services instead of money
• Our information is exploding
• Business is moving to real-time, all the time
• Our information differentiates us from our
competitors
• Information is a key business asset
Corporate Initiatives
▪ 80% of Initiatives That Matter are about DATA
• Budget
• Energy
▪ 80% of Initiatives should be Business-Focused
• ROI
• Resource-Leveled
Data Maturity is Highly Correlated to
Business Success
[Chart: Data Maturity plotted against Business Success]
The Money Tree Doesn’t
Exist
Hitch your Architecture and Maturity Efforts to an Application Budget
AI is disruptive
Data is the Foundation
Choosing a Platform: 3
Major Decisions
▪ Decision #1: The Data Store Type
• The data profile is the largest factor in choosing between a database and a file-based scale-out system. The latter is
best for data that fits the loose label of 'unstructured' (or semi-structured) data, while more traditional data -- and smaller
volumes of all data -- still belong in a relational database.
▪ Decision #2: Data Store Placement
• You must also decide where to place your data store -- on-premises or in the cloud (and which cloud). In the past, the only clear
choice for most organizations was on-premises. However, the costs of scale are gnawing away at the notion that this
remains the best approach for a data platform.
▪ Decision #3: The Workload Architecture
• Finally, you must keep in mind the distinction between operational and analytical workloads. Short transactional requests and
more complex (often longer) analytics requests demand different architectures. Analytics databases, though quite diverse, are
the preferred platforms for the analytics workload.
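As a rough illustration only, the three decisions could be written down as a small Python helper. The names `DataProfile` and `choose_platform`, and the 10 TB cutoff for "smaller volumes", are assumptions made for this sketch; the slide gives no numbers and treats placement as a judgment call, so placement is simply passed through as an input.

```python
from dataclasses import dataclass

# Hypothetical cutoff for "smaller volumes"; the slide gives no number.
SMALL_VOLUME_TB = 10.0


@dataclass
class DataProfile:
    structured: bool          # False for unstructured or semi-structured data
    volume_tb: float          # expected data volume in terabytes
    analytical: bool          # True for analytics requests, False for short transactions
    placement: str = "cloud"  # Decision #2 is taken as an input: "cloud" or "on-premises"


def choose_platform(p: DataProfile) -> dict:
    # Decision #1: traditional data, and smaller volumes of all data, stay relational.
    if p.structured or p.volume_tb <= SMALL_VOLUME_TB:
        store = "relational database"
    else:
        store = "file-based scale-out system"
    # Decision #3: the workload picks the architecture.
    workload = "analytical" if p.analytical else "operational"
    return {"store_type": store, "placement": p.placement, "workload": workload}


clickstream = choose_platform(DataProfile(structured=False, volume_tb=500, analytical=True))
orders = choose_platform(DataProfile(structured=True, volume_tb=2, analytical=False,
                                     placement="on-premises"))
```

In practice each decision involves more factors than a boolean can carry; the sketch only shows how the data profile and the workload drive the first and third choices.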
Data Everywhere
And in Numerous Technical Forms
And in Numerous Clouds
Low Maturity Data
Integration
Leverageable Vehicles
▪ Data Warehouse
▪ Master Data Management
▪ Data Lake
Points of Data Integration
• Into the Data Warehouse(s)
• Into the Data Marts/cubes that do not integrate with the data warehouse
• Into the Data Marts/cubes that do integrate with the data warehouse
• Into Big Data platforms from sensor, clickstream, other systems
• Into Big Data platforms from Data Stream Processing
• Into the Master Data Management Hub from publishing/master systems
• From the Master Data Management Hub to every subscribing system (ERPs, NoSQL, Hadoop, data
warehouse, analytical databases, etc.)
• Between analytical stores
• Between operational stores
• Summaries of Big Data for the data warehouse and other analytical stores
• Data migrations for setting up new environments
• Etc.!
Modern Realities of Data
Integration
▪ Desire for consolidated methods for data integration
▪ New types of data sources
• Logs, sensors, etc.
▪ We have more than OLTP and OLAP
• Distributed data platforms
▪ Desire for real-time data
▪ High-velocity data increasingly needs integration
▪ Traditional approaches, without Stream Processing, turn
into ETL + custom scripts + middleware + MQ
Real-Time Data
▪ A.k.a. messaging, live feeds, real-time, event-driven
▪ Comes in continuously and often quickly, so we call
it streaming data.
▪ Needs special attention and can be of immense
value, but only if we are alerted in time.
▪ Foundation for Artificial Intelligence
• Stream data forms the core of data for artificial
intelligence
Message-Oriented Middleware / Message
Queueing Technology
▪ An architectural component that deals with messages
▪ Manages and distributes streaming data
• A message is any kind of data wrapped in a neat package with a very simple
header
• Messages are sent by “producers” (systems, sensors, or devices that generate
the messages) toward a “broker”
• The broker routes them into queues according to the information enclosed in the
message header or its own routing process
• “Consumers” retrieve the messages from the queues to which they
subscribe
• Open the messages and perform some kind of action on them.
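The producer, broker, queue, and consumer roles above can be sketched with a toy in-memory broker. This is a minimal Python illustration, not any real message-queueing product: the `Broker` class, the `make_message` helper, and the `topic` header field are all hypothetical names chosen for the example.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory broker: routes each message to a queue named in its header."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def publish(self, message: dict) -> None:
        # Route on the header only; the payload is opaque to the broker.
        self.queues[message["header"]["topic"]].append(message)

    def consume(self, topic: str):
        # Hand over the oldest message in the subscribed queue, or None if empty.
        q = self.queues[topic]
        return q.popleft() if q else None


def make_message(topic: str, payload) -> dict:
    # A message is any payload wrapped with a very simple header.
    return {"header": {"topic": topic}, "payload": payload}


broker = Broker()
# Producers: a sensor and an application send messages toward the broker.
broker.publish(make_message("temperature", {"sensor": "s1", "celsius": 21.5}))
broker.publish(make_message("orders", {"order_id": 42}))
broker.publish(make_message("temperature", {"sensor": "s1", "celsius": 22.0}))

# Consumer: retrieves messages from the queue it subscribes to and acts on them.
readings = []
while (msg := broker.consume("temperature")) is not None:
    readings.append(msg["payload"]["celsius"])
```

A production broker adds persistence, acknowledgements, and delivery guarantees; the sketch shows only the routing-by-header idea.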
Streaming Architecture
[Figure labels: Apps; Streaming Platform; Change logs; Streaming data pipelines; Messaging or Stream processing; Request-Response; DW; Hadoop]
Every Project is Burdened
(with Grander Opportunity)
Data Success Measurement
• User Satisfaction
• Business ROI and growth instigated
• Data Maturity (long-term user satisfaction and business ROI)
• Misc.
“Beyond the Mountain is
another mountain.”
Champion Initiatives That
Matter
▪ Every single item on a company mission statement
relates to data at some level
▪ It is from the position of data expertise that the
mission will be executed and company leadership
will emerge
▪ In this information economy, the company’s performance
rests on the data professional, who has an obligation to
demonstrate the possibilities and originate the
architecture, data, and projects that will deliver
▪ Being responsive to urgent requests is not enough
to be the data leader that companies need
Modern Data Integration
Problems with Normal Data Integration Processes
Data modeling. Too much time spent coping with slight changes
in our business data
Business/IT alignment. Data architects, DBAs, and others can’t
communicate with businesspeople
Processes. Too much detail lost by handing off responsibility for
business data to different people
Problem: Data Modeling
Too much time spent coping with slight changes in our business data
Given    Middle     Family
Johann   Sebastian  Bach
[Figure: further names (Dmitri Dmitriyevich Shostakovich, Mohamed el / Muhammad al Qasabgi, Ludwig van Beethoven, Chen Yi, Mougi) do not fit the Given / Middle / Family columns, so columns such as Hon., Patronymic, and Art. are added]
Repeated changes in operational systems’ row-and-column structures
Problem: Data Modeling
Ripple effects of changes in one system lead to changes in others
[Figure: the same name data held in three stores: operational (designed for transactions), data warehouse (designed for abstractions), and data mart (designed for analysis). Changes to name columns such as Hon., Patronymic, and Art. in one store ripple into the others.]
Problem: Business/IT Alignment
Data people often can’t communicate with businesspeople
Data architect thinks
▪ Model the data
▪ Govern the data
▪ Watch out for “quick fixes”
That modeling stuff we just talked about
IT: Gets it
Business: Hates it
Business thinks
▪ Modeling and metadata are hindrances
▪ Analytical tools work best without governance
▪ IT slows them down
Problem: Processes
Too much information lost by distributing responsibility for business data
Cleansing occurs in transformation step: Different rules being fired
Different tools and metadata being used by platform
Loss of timestamps, context, before-and-after: No cross-platform auditability
No comprehensive rollback, alternate history, what-if
[Figure: the Family column passes from the operational application to the data warehouse and to the cloud application, each time through its own Transformation / Cleansing / Standardization step]
Modern Data Integration
How much time do we spend mapping one set of rows and columns to another?
A modern solution:
post-relational, subject-oriented, rich documents for data capture, transformation,
storage (perhaps), and exchange, instead of relational models
[Figure: Operational application, Data warehouse, Analytics]
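To make the contrast concrete, here is a small Python sketch of the name example from the data modeling slides. It assumes a fixed three-column layout on the relational side and plain JSON documents on the other; the key names (`given`, `patronymic`, `article`, and so on) and the assignment of name parts to keys are illustrative guesses, not taken from any product.

```python
import json

# Fixed relational-style layout: every name must fit these three columns.
COLUMNS = ("given", "middle", "family")


def to_row(name: dict) -> tuple:
    # Parts without a matching column (patronymic, article, ...) are silently dropped.
    return tuple(name.get(col) for col in COLUMNS)


# Document-style records: each name carries only the parts it actually has.
names = [
    {"given": "Johann", "middle": "Sebastian", "family": "Bach"},
    {"given": "Dmitri", "patronymic": "Dmitriyevich", "family": "Shostakovich"},
    {"given": "Ludwig", "article": "van", "family": "Beethoven"},
    {"family": "Chen", "given": "Yi"},
    {"family": "Mougi"},
]

rows = [to_row(n) for n in names]                       # what the fixed layout keeps
lost = [sorted(set(n) - set(COLUMNS)) for n in names]   # what it drops per record
documents = [json.dumps(n) for n in names]              # what a document store keeps
```

With the fixed layout, a new name part means a schema change that ripples downstream; with documents, the new part is simply another key on the records that need it.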
Modern Data Integration
A modern solution:
ELT capture/integrate, time-stamped, subject-oriented, to capture data as it is,
apply trustworthy processes to it, and share it in trusted ways
How much info do we lose by distributing ETL processes?
[Figure: Operational application and Analytics connected through a Data Capture/Transformation Hub (Transformation, Cleansing, Standardization), with application to business use cases]
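One way to picture a capture/transformation hub is a store that keeps every record exactly as received, with a timestamp, and logs a before-and-after entry for each rule applied afterwards. The Python sketch below is a toy under that assumption; `CaptureHub`, its methods, and the `standardize_family` rule are hypothetical names and do not describe how any particular product works.

```python
from datetime import datetime, timezone


class CaptureHub:
    """Toy hub: stores records as received, then logs every rule applied to them."""

    def __init__(self):
        self.raw = []      # data captured as it is, with a capture timestamp
        self.audit = []    # before-and-after trail for each applied rule

    def capture(self, source: str, record: dict) -> int:
        self.raw.append({"source": source, "record": dict(record),
                         "captured_at": datetime.now(timezone.utc)})
        return len(self.raw) - 1

    def current(self, index: int) -> dict:
        # Latest version of a record: the last audited result, else the raw capture.
        versions = [a["after"] for a in self.audit if a["index"] == index]
        return dict(versions[-1]) if versions else dict(self.raw[index]["record"])

    def apply(self, index: int, rule_name: str, rule) -> dict:
        # Work on a copy of the latest version; the raw capture is never overwritten.
        before = self.current(index)
        after = rule(dict(before))
        self.audit.append({"index": index, "rule": rule_name, "before": before,
                           "after": after, "applied_at": datetime.now(timezone.utc)})
        return dict(after)


def standardize_family(rec: dict) -> dict:
    # Hypothetical standardization rule: trim and title-case the family name.
    rec["family"] = rec["family"].strip().title()
    return rec


hub = CaptureHub()
i = hub.capture("operational_app", {"given": "Ludwig", "family": "  BEETHOVEN "})
clean = hub.apply(i, "standardize_family", standardize_family)
```

Because the raw capture and every intermediate version stay in one place, rollback and what-if comparisons become lookups in the audit trail rather than cross-platform reconstruction.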
Modern Data Integration: The Omni-Gen Approach
We built software to make ourselves successful
▪ Immediate capture in automatically generated data hub
▪ Master data: business-user-oriented, subject-oriented
▪ Rapid, integrated data quality rules
▪ Mastered and transactional subjects
▪ Rapid cycle times to keep the business engaged
▪ Support and automatically apply best practices
Modern Data Integration: The Omni-Gen Approach
Extending Value
We built “persona models” for customer and supplier
Everything you get in Omni-Gen, plus
▪ Pre-built models
▪ Pre-built data quality rules
▪ Pre-built match/merge rules
▪ Pre-built data governance
▪ Immediate 360° core view, unlimited extensions
▪ Supports different consumers with different, but trusted, data
Omni-Gen: More Value in Far Less Time
Project timeline, in months: 1-3, 4-6, 12-18
Traditional
Data management tools
Build-it-yourself development environment
Omni-Gen
Software solution with built-in best practices
MDM, DQ, integration software with rules,
automatically generated data vault, remediation portal,
360° viewer, history, data interfaces, APIs, and feeds
Omni for Persona
Software solution with built-in best practices and complete master data models
Data vault model, data onramps; MDM, data quality, and integration software; MDM
and data quality rules, remediation portal, 360° viewer; Data interfaces, APIs,
history, & feeds; Analytical foundation for dashboarding, advanced analytics, more.
More Related Content

PPTX
Embedded Analytics Expert Session Webinar
 
PPTX
Predictive and Prescriptive Analytics Expert Session Webinar
 
PDF
Accelerating Fast Data Strategy with Data Virtualization
PPTX
Make data simple in the cognitive era
PPTX
Big and fast data strategy 2017 jr
PDF
Reveal the Intelligence in your Data with Talend Data Fabric
PPTX
The Importance of DataOps in a Multi-Cloud World
PPTX
DataOps - Big Data and AI World London - March 2020 - Harvinder Atwal
Embedded Analytics Expert Session Webinar
 
Predictive and Prescriptive Analytics Expert Session Webinar
 
Accelerating Fast Data Strategy with Data Virtualization
Make data simple in the cognitive era
Big and fast data strategy 2017 jr
Reveal the Intelligence in your Data with Talend Data Fabric
The Importance of DataOps in a Multi-Cloud World
DataOps - Big Data and AI World London - March 2020 - Harvinder Atwal

What's hot (20)

PDF
An Overview of the Neo4j Cloud Strategy and the Future of Graph Databases in ...
PPTX
Crowdsourcing Data Governance
PDF
Getting down to business on Big Data analytics
PPTX
Making big data work
PDF
Building Your Data Hub to Support Digital
PPTX
Asking the Right Questions of Your Data
PDF
Big Data – From Strategy to Production
PPTX
Multi Cloud Data Integration- Retail
PDF
Self -Service Data preparation for Data-Driven marketing
PDF
Data is cheap; strategy still matters by Jason Lee
PDF
Data Lake: A simple introduction
PPTX
Building Confidence in Big Data - IBM Smarter Business 2013
PDF
Talend Summer 16 launch présentation: Open Data Preparation for Everyone
PDF
DAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
PPTX
Unlock Data-driven Insights in Databricks Using Location Intelligence
PDF
The State of Big Data Adoption: A Glance at Top Industries Adopting Big Data ...
PDF
Gain 3 Benefits with Delta Sharing
PDF
3 Steps to Turning CCPA & Data Privacy into Personalized Customer Experiences
PPTX
Business Analytics & Big Data Trends and Predictions 2014 - 2015
PDF
Alignment: Office of the Chief Data Officer & BCBS 239
An Overview of the Neo4j Cloud Strategy and the Future of Graph Databases in ...
Crowdsourcing Data Governance
Getting down to business on Big Data analytics
Making big data work
Building Your Data Hub to Support Digital
Asking the Right Questions of Your Data
Big Data – From Strategy to Production
Multi Cloud Data Integration- Retail
Self -Service Data preparation for Data-Driven marketing
Data is cheap; strategy still matters by Jason Lee
Data Lake: A simple introduction
Building Confidence in Big Data - IBM Smarter Business 2013
Talend Summer 16 launch présentation: Open Data Preparation for Everyone
DAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization
Unlock Data-driven Insights in Databricks Using Location Intelligence
The State of Big Data Adoption: A Glance at Top Industries Adopting Big Data ...
Gain 3 Benefits with Delta Sharing
3 Steps to Turning CCPA & Data Privacy into Personalized Customer Experiences
Business Analytics & Big Data Trends and Predictions 2014 - 2015
Alignment: Office of the Chief Data Officer & BCBS 239
Ad

Similar to Modern Data Integration Expert Session Webinar (20)

PPTX
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
PDF
When and How Data Lakes Fit into a Modern Data Architecture
PPTX
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
PDF
What Is My Enterprise Data Maturity 2021
PPTX
Real-Time With AI – The Convergence Of Big Data And AI by Colin MacNaughton
PDF
Big Data LDN 2017: The New Dominant Companies Are Running on Data
PDF
Big Data LDN 2017: The New Dominant Companies Are Running on Data
PDF
Four Key Considerations for your Big Data Analytics Strategy
PDF
The Maturity Model: Taking the Growing Pains Out of Hadoop
PPTX
The new dominant companies are running on data
PDF
Bimodal IT and EDW Modernization
PDF
Die Big Data Fabric als Enabler für Machine Learning & AI
PDF
Capgemini Leap Data Transformation Framework with Cloudera
PPTX
There are 250 Database products, are you running the right one?
PPTX
Hadoop 2015: what we larned -Think Big, A Teradata Company
PPTX
When SAP alone is not enough
PDF
Rethinking Data Availability and Governance in a Mobile World
PDF
Rethinking Data Availability and Governance in a Mobile World
PPTX
Klarna Tech Talk - Mind the Data!
PDF
Where the Warehouse Ends: A New Age of Information Access
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
When and How Data Lakes Fit into a Modern Data Architecture
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
What Is My Enterprise Data Maturity 2021
Real-Time With AI – The Convergence Of Big Data And AI by Colin MacNaughton
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Four Key Considerations for your Big Data Analytics Strategy
The Maturity Model: Taking the Growing Pains Out of Hadoop
The new dominant companies are running on data
Bimodal IT and EDW Modernization
Die Big Data Fabric als Enabler für Machine Learning & AI
Capgemini Leap Data Transformation Framework with Cloudera
There are 250 Database products, are you running the right one?
Hadoop 2015: what we larned -Think Big, A Teradata Company
When SAP alone is not enough
Rethinking Data Availability and Governance in a Mobile World
Rethinking Data Availability and Governance in a Mobile World
Klarna Tech Talk - Mind the Data!
Where the Warehouse Ends: A New Age of Information Access
Ad

More from ibi (20)

PDF
Data Monetization Expert Session Webinar
 
PDF
Internet of Things (IoT) Expert Session Webinar
 
PPT
Artificial Intelligence Expert Session Webinar
 
PPTX
Celebrating Women Today and Everyday
 
PDF
The Value of Improved Clinical Information Management for Payers
 
PPTX
Five Hot Trends for 2018
 
PPTX
What Employees Think of Working at Information Builders
 
PPTX
What Customers Are Saying About Information Builders
 
PPTX
Accelerating Your Move to Value-Based Care
 
PPTX
Top 10 Reasons to Work at Information Builders
 
PPTX
Five Critical Success Factors for Embedded Analytics
 
PDF
Why Attend Summit 2017?
 
PDF
Data Discovery and Governance
 
PPTX
Solving the BI Adoption Challenge With Report Consolidation
 
PPTX
5 Hot Trends for Data and Analytics in 2017
 
PPTX
What the Data Says...About Elections
 
PPTX
Transforming Healthcare: Improving Decision Support with Your Partners
 
PPTX
Transforming Healthcare: Build vs Buy
 
PPTX
UX & Design Thinking for BI Applications
 
PPTX
Summer Shorts: Big Data Integration
 
Data Monetization Expert Session Webinar
 
Internet of Things (IoT) Expert Session Webinar
 
Artificial Intelligence Expert Session Webinar
 
Celebrating Women Today and Everyday
 
The Value of Improved Clinical Information Management for Payers
 
Five Hot Trends for 2018
 
What Employees Think of Working at Information Builders
 
What Customers Are Saying About Information Builders
 
Accelerating Your Move to Value-Based Care
 
Top 10 Reasons to Work at Information Builders
 
Five Critical Success Factors for Embedded Analytics
 
Why Attend Summit 2017?
 
Data Discovery and Governance
 
Solving the BI Adoption Challenge With Report Consolidation
 
5 Hot Trends for Data and Analytics in 2017
 
What the Data Says...About Elections
 
Transforming Healthcare: Improving Decision Support with Your Partners
 
Transforming Healthcare: Build vs Buy
 
UX & Design Thinking for BI Applications
 
Summer Shorts: Big Data Integration
 

Recently uploaded (20)

PPTX
Acceptance and paychological effects of mandatory extra coach I classes.pptx
PPTX
Computer network topology notes for revision
PDF
Clinical guidelines as a resource for EBP(1).pdf
PPT
Miokarditis (Inflamasi pada Otot Jantung)
PPT
Reliability_Chapter_ presentation 1221.5784
PPTX
Supervised vs unsupervised machine learning algorithms
PDF
annual-report-2024-2025 original latest.
PPTX
IBA_Chapter_11_Slides_Final_Accessible.pptx
PPTX
Introduction to Knowledge Engineering Part 1
PDF
“Getting Started with Data Analytics Using R – Concepts, Tools & Case Studies”
PPTX
oil_refinery_comprehensive_20250804084928 (1).pptx
PPTX
mbdjdhjjodule 5-1 rhfhhfjtjjhafbrhfnfbbfnb
PPTX
iec ppt-1 pptx icmr ppt on rehabilitation.pptx
PDF
Mega Projects Data Mega Projects Data
PPTX
01_intro xxxxxxxxxxfffffffffffaaaaaaaaaaafg
PPTX
IB Computer Science - Internal Assessment.pptx
PPTX
STUDY DESIGN details- Lt Col Maksud (21).pptx
PPT
ISS -ESG Data flows What is ESG and HowHow
PPTX
Introduction-to-Cloud-ComputingFinal.pptx
Acceptance and paychological effects of mandatory extra coach I classes.pptx
Computer network topology notes for revision
Clinical guidelines as a resource for EBP(1).pdf
Miokarditis (Inflamasi pada Otot Jantung)
Reliability_Chapter_ presentation 1221.5784
Supervised vs unsupervised machine learning algorithms
annual-report-2024-2025 original latest.
IBA_Chapter_11_Slides_Final_Accessible.pptx
Introduction to Knowledge Engineering Part 1
“Getting Started with Data Analytics Using R – Concepts, Tools & Case Studies”
oil_refinery_comprehensive_20250804084928 (1).pptx
mbdjdhjjodule 5-1 rhfhhfjtjjhafbrhfnfbbfnb
iec ppt-1 pptx icmr ppt on rehabilitation.pptx
Mega Projects Data Mega Projects Data
01_intro xxxxxxxxxxfffffffffffaaaaaaaaaaafg
IB Computer Science - Internal Assessment.pptx
STUDY DESIGN details- Lt Col Maksud (21).pptx
ISS -ESG Data flows What is ESG and HowHow
Introduction-to-Cloud-ComputingFinal.pptx

Modern Data Integration Expert Session Webinar

  • 1. Modern Data Integration William McKnight Jake Freivald Information Builders McKnight Consulting Group Expert Sessions
  • 2. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 2 Unlock Potential William McKnight www.mcknightcg.com 214-514-1444 Modern Data Integration - Expert Sessions @williammcknight
  • 3. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 3 Data is the Most Important Asset in the World • We trade it for services instead of money • Our information is exploding • Business is moving to real-time, all the time • Our information differentiates us from our competitors • Information is a key business asset
  • 4. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 4 Corporate Initiatives  80% of Initiatives That Matter are about DATA • Budget • Energy  80% of Initiatives should be Business-Focused • ROI • Resource-Leveled
  • 5. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 5Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 5 Data Maturity is Highly Correlated to Business Success Data Maturity Business Success
  • 6. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 6Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 6 The Money Tree Doesn’t Exist Hitch your Architecture and Maturity Efforts to an Application Budget
  • 7. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 7Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 7 AI is disruptive Data is the Foundation
  • 8. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 8 Choosing a Platform: 3 Major Decisions  Decision #1: The Data Store Type • The largest factor for distinguishing between databases and file-based scale-out system utilization is the data profile. The latter is best for data that fits the loose label of 'unstructured' (or semi-structured) data, while more traditional data -- and smaller volumes of all data -- still belong in a relational database.  Decision #2: Data Store Placement • You must also decide where to place your data store -- on-premises or in the cloud (and which cloud). In the past, the only clear choice for most organizations was on-premises data. However, the costs of scale are gnawing away at the notion that this remains the best approach for a data platform. For more on why databases are moving to the cloud, please read this article.  Decision #3: The Workload Architecture • Finally, you must keep in mind the distinction between operational or analytical workloads. Short transactional requests and more complex (often longer) analytics requests demand different architectures. Analytics databases, though quite diverse, are the preferred platforms for the analytics workload. 8
  • 9. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 9Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 9 Data Everywhere And in Numerous Technical Forms And in Numerous Clouds
  • 10. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 10Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 10 , Low Maturity Data Integration
  • 11. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 11Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 11 Leverageable Vehicles  Data Warehouse  Master Data Management  Data Lake
  • 12. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 12Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 12 Points of Data Integration • Into the Data Warehouse(s) • Into the Data Marts/cubes that do not integrate with the data warehouse • Into the Data Marts/cubes that do integrate with the data warehouse • Into Big Data platforms from sensor, clickstream, other systems • Into Big Data platforms from Data Stream Processing • Into the Master Data Management Hub from publishing/master systems • From the Master Data Management Hub to every subscribing system (ERPs, NoSQL, Hadoop, data warehouse, analytical databases, etc.) • Between analytical stores • Between operational stores • Summaries of Big Data for the data warehouse and other analytical stores • Data migrations for setting up new environments • Etc.!
  • 13. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 13Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 13 Modern Realities of Data Integration  Desire for consolidated methods for data integration  New types of data sources • Logs, sensors, etc.  We have more than OLTP and OLAP • Distributed data platforms  Desire for real-time data  High-velocity data increasingly needs integration  Traditional approaches, without Stream Processing, turn into ETL+custom scripts+middleware+MQ
  • 14. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 14Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 14 Real-Time Data  A.k.a. messaging, live feeds, real-time, event-driven  Comes in continuously and often quickly, so we call it streaming data.  Needs special attention and can be of immense value, but only if we are alerted in time.  Foundation for Artificial Intelligence • Stream data forms the core of data for artificial intelligence
  • 15. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 15Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 15 Message-Oriented Middleware / Message Queueing Technology  An architectural component that deals with messages  Manage and distribute streaming data • Any kind of data wrapped in a neat package with a very simple header • Sent by “producers”—systems, sensors, or devices that generate the messages—toward a “broker”. • Routes them into queues according to the information enclosed in the message header or its own routing process • “Consumers” retrieve the messages from the queues to which they subscribe • Open the messages and perform some kind of action on them.
  • 16. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 16Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 16 Streaming Architecture Apps Streaming Platform Change logs Streaming data pipelines Messaging or Stream processing Request - Response DW Hadoop
  • 17. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 17Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 17 Every Project is Burdened (with Grander Opportunity)
  • 18. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 18Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 18 Data Success Measurement User Satisfaction Business ROI and growth instigated Data Maturity (Long-term User Sat and Bus ROI) Misc.
  • 19. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 19 “Beyond the Mountain is another mountain.”
  • 20. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 20Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 20 Champion Initiatives That Matter  Every single item on a company mission statement relates to data at some level  It is from the position of data expertise that the mission will be executed and company leadership will emerge  The data professional is absolutely sitting on the performance of the company in this information economy and has an obligation to demonstrate the possibilities and originate the architecture, data and projects that will deliver.  It’s not enough to be responsive to urgent requests and be the data leader that companies need.
  • 21. Copyright © 2018 McKnight Consulting Group, LLC All Rights Reserved Slide 21 Unlock Potential William McKnight www.mcknightcg.com 214-514-1444 Modern Data Integration - Expert Sessions @williammcknight
  • 23. Problems with Normal Data Integration Processes Data modeling. Too much time spent coping with slight changes in our business data Business/IT alignment. Data architects, DBAs, and others can’t communicate with businesspeople Processes. Too much detail lost by handing off responsibility for business data to different people
  • 24. Problem: Data Modeling. Too much time spent coping with slight changes in our business data. [Figure: name table with columns Given, Middle, Family; row: Johann, Sebastian, Bach]
  • 25. Problem: Data Modeling. Too much time spent coping with slight changes in our business data. Repeated changes in operational systems’ row-and-column structures. [Figure: the name table grows from Given / Middle / Family to add Hon., Patronymic, and Art. columns as new names arrive; names shown include Johann Sebastian Bach, Dmitri Dmitriyevich Shostakovich, Ludwig van Beethoven, Chen Yi, and names using the particles el/al (Mohamed, Muhammad, Qasabgi, Mougi)]
  • 26. Problem: Data Modeling. Ripple effects of changes in one system lead to changes in others. [Figure: the same name table repeated across three tiers: Operational, designed for transactions; Data warehouse, designed for abstractions; Data mart, designed for analysis. Each added column (Hon., Patronymic, Art.) propagates through every tier.]
  • 27. Problem: Business/IT Alignment. Data people often can’t communicate with businesspeople. Data architect thinks:  Model the data  Govern the data  Watch out for “quick fixes” Business thinks:  Modeling, metadata are hindrances  Analytical tools best without governance  IT slows them down. That modeling stuff we just talked about: IT gets it; Business hates it.
  • 28. Problem: Processes. Too much information lost by distributing responsibility for business data. Cleansing occurs in transformation step:  Different rules being fired  Different tools and metadata being used by platform. Loss of timestamps, context, before-and-after:  No cross-platform auditability  No comprehensive rollback, alternate history, what-if. [Figure: a Family record moving among Operational application, Data warehouse, and Cloud application, with two separate Transformation / Cleansing / Standardization steps]
  • 29. Modern Data Integration. How much time do we spend mapping one set of rows and columns to another? A modern solution: post-relational for data capture, transformation, subject-oriented storage (perhaps), and exchange; rich documents instead of relational models. [Figure: Family record shown across Operational application, Data warehouse, Analytics]
  • 30. Modern Data Integration. How much time do we spend mapping one set of rows and columns to another? A modern solution: post-relational for data capture, transformation, subject-oriented storage (perhaps), and exchange; rich documents instead of relational models. [Figure: Operational application, Data warehouse, Analytics]
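The "rich documents instead of relational models" idea on the slide above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the field names follow the column labels on the earlier name-table slides (Given, Middle, Family, Patronymic, Art.), and the `family_first` flag and `display_name` helper are hypothetical. It is not how any particular product stores data.

```python
import json

# Hypothetical name records captured as documents rather than fixed columns.
# Each document carries only the name parts it actually has, so a new part
# (for example a patronymic) does not force a schema change on older records.
records = [
    {"given": "Johann", "middle": "Sebastian", "family": "Bach"},
    {"given": "Dmitri", "patronymic": "Dmitriyevich", "family": "Shostakovich"},
    {"given": "Ludwig", "article": "van", "family": "Beethoven"},
    {"given": "Yi", "family": "Chen", "family_first": True},
]

# Order in which optional parts are rendered for given-name-first records.
PART_ORDER = ("honorific", "given", "middle", "patronymic", "article", "family")


def display_name(doc):
    """Build a display string from whichever name parts a document carries."""
    if doc.get("family_first"):
        parts = [doc.get("family"), doc.get("given")]
    else:
        parts = [doc.get(key) for key in PART_ORDER]
    return " ".join(part for part in parts if part)


names = [display_name(doc) for doc in records]

# Documents serialize directly for exchange between systems.
serialized = json.dumps(records)
```

In a row-and-column model, each of these variations would typically mean adding a column and remapping it in every downstream system; here the consumer simply ignores parts it does not find.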
  • 31. Modern Data Integration. How much info do we lose by distributing ETL processes? A modern solution: ELT capture/integrate to capture data as it is, time-stamped; apply trustworthy processes to it, subject-oriented; and share it in trusted ways. [Figure: Operational application, Data warehouse, Analytics]
  • 32. Modern Data Integration. How much info do we lose by distributing ETL processes? A modern solution: ELT capture/integrate to capture data as it is, time-stamped; apply trustworthy processes to it, subject-oriented; and share it in trusted ways. [Figure: Operational application and Analytics connected through a Data Capture/Transformation Hub that handles Transformation, Cleansing, Standardization, and Application to business use cases]
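The capture-then-transform pattern on the slide above (capture data as it is, time-stamped, then apply processes in one place) might look roughly like the following sketch. The `CaptureHub` class, its method names, and the two cleansing rules are all hypothetical and exist only to show the idea; this is not the Omni-Gen implementation or API.

```python
import copy
from datetime import datetime, timezone


class CaptureHub:
    """Hypothetical hub: keeps raw records as captured, applies rules centrally."""

    def __init__(self, rules):
        self.rules = rules  # ordered list of (rule_name, function) pairs
        self.raw = []       # records exactly as received, with capture time
        self.audit = []     # before/after entry for each rule that changed a record

    def capture(self, source, record):
        """Store an untouched copy of the record with its source and timestamp."""
        entry = {
            "source": source,
            "captured_at": datetime.now(timezone.utc),
            "record": copy.deepcopy(record),
        }
        self.raw.append(entry)
        return entry

    def transform(self, entry):
        """Apply every rule in order, logging before and after for each change."""
        current = copy.deepcopy(entry["record"])
        for name, rule in self.rules:
            before = copy.deepcopy(current)
            current = rule(current)
            if current != before:
                self.audit.append({
                    "rule": name,
                    "source": entry["source"],
                    "captured_at": entry["captured_at"],
                    "before": before,
                    "after": copy.deepcopy(current),
                })
        return current


def strip_whitespace(rec):
    """Illustrative cleansing rule: trim surrounding whitespace from strings."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}


def title_case_family(rec):
    """Illustrative standardization rule: normalize casing of the family name."""
    out = dict(rec)
    if isinstance(out.get("family"), str):
        out["family"] = out["family"].title()
    return out


hub = CaptureHub([
    ("strip_whitespace", strip_whitespace),
    ("title_case_family", title_case_family),
])
entry = hub.capture("operational_app", {"given": " Johann ", "family": "BACH"})
clean = hub.transform(entry)
```

Because the raw record and every before/after pair stay in one place, the timestamps and context that the earlier "Problem: Processes" slide describes as lost remain available for audit or replay.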
  • 33. Modern Data Integration: The Omni-Gen Approach. We built software to make ourselves successful:  Immediate capture in automatically generated data hub  Master data: business-user-oriented, subject-oriented  Rapid, integrated data quality rules  Mastered and transactional subjects  Rapid cycle times to keep the business engaged  Support and automatically apply best practices
  • 34. Modern Data Integration: The Omni-Gen Approach, Extending Value. We built “persona models” for customer and supplier. Everything you get in Omni-Gen, plus:  Pre-built models  Pre-built data quality rules  Pre-built match/merge rules  Pre-built data governance  Immediate 360° core view, unlimited extensions  Supports different consumers with different, but trusted, data
  • 35. Omni-Gen: More Value in Far Less Time. Project timeline, in months (figure labels: 12-18, 1-3, 4-6).  Traditional: Data management tools. Build-it-yourself development environment.  Omni-Gen: Software solution with built-in best practices. MDM, DQ, integration software with rules, automatically generated data vault, remediation portal, 360° viewer, history, data interfaces, APIs, and feeds.  Omni for Persona: Software solution with built-in best practices and complete master data models. Data vault model, data onramps; MDM, data quality, and integration software; MDM and data quality rules, remediation portal, 360° viewer; data interfaces, APIs, history, and feeds; analytical foundation for dashboarding, advanced analytics, more.
  • 36. Modern Data Integration William McKnight Jake Freivald Information Builders McKnight Consulting Group Expert Sessions