The First Step in Information Management
looker.com
Produced by:
MONTHLY SERIES
In partnership with:
Data-centric Analytics and Understanding the Full Data Supply Chain
May 3, 2018
Welcome to Today’s Discussion
• Understanding the data supply chain and how it impacts analytics
• Data-centric design considerations for the data supply chain
• Data supply chain features and components for the data lake
• Key roles and responsibilities
• How analytics must interact with the data supply chain
• Best practices and key takeaways
• Q&A
pg 2© 2018 First San Francisco Partners www.firstsanfranciscopartners.com
Background
• An insurance company believed it had found a strong predictor for policy renewal, only to discover that the model was based on an indicator that actually meant a policy was cancelled, not expiring.
• A real estate AI model was corrupted because, while only one record in a million was wrong, it was wrong by a factor of 1,000, and there was no way to tell whether it was correct or an error.
• “There are many downstream processes, including EHR configuration, data transport, aggregation, normalization, and reporting mechanisms, that through omission or commission can negatively impact data quality.” (Healthcare IT News)
• Postal addresses and emails change constantly, so 20–23% of all of this data is wrong as soon as it is received.
• There are new examples of incorrect conclusions and bad AI results every day. Example site: www.towardsdatascience.com
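The real estate example suggests a simple plausibility screen. The sketch below is illustrative only and is not from the presentation: the function name `flag_magnitude_outliers`, the sample prices and the default threshold of 100 are all assumptions. A screen like this cannot decide whether a flagged record is an error; it can only route the record to someone who can check it against the source.

```python
from statistics import median


def flag_magnitude_outliers(values, factor=100.0):
    """Return the indices of values that differ from the median by more than `factor` times.

    Hypothetical helper: `factor` is an assumed threshold, not a recommended one.
    Non-positive values are flagged as well, since a ratio is meaningless for them.
    """
    positive = [v for v in values if v > 0]
    if not positive:
        return []
    mid = median(positive)
    flagged = []
    for i, v in enumerate(values):
        if v <= 0 or v / mid > factor or mid / v > factor:
            flagged.append(i)
    return flagged


# Made-up sale prices; the fourth one is roughly 1,000 times larger than its peers.
prices = [310_000, 295_000, 305_000, 300_000_000, 289_000]
suspects = flag_magnitude_outliers(prices)
```

Here `suspects` holds the position of the record to review. Whether that record is a data entry error or a genuine outlier still has to be settled upstream, which is the point of managing the supply chain rather than the model.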
What is a Data Supply Chain?
• The data supply chain represents the sources, flows, management and distribution of data and information in an organization.
A regular supply chain coordinates material sourcing and assembly to fulfill and deliver goods to customers.
Regular supply chain:
• Raw Material: Steel, Aluminum
• Make/Acquire Parts: Supplier, Fabrication
• Assembly: Processes, Sequence
• Distribute: Ship, Store, Sell
Data supply chain:
• Raw Material: Internal Database, Derived Data, External File
• Make/Acquire Parts: Clean Data, Standardize and Position Data
• Assembly: Populate Sandbox, Develop and Run Models
• Distribute: Visualization, Publish and Pull, Sell Results
Why Data Supply Chains Are Important
• A well-run factory has a “parts and tools” crib or cage.
• Contents are well-managed and tracked.
• Distribution from that area depends on strong guidance, policy and lean management.
The tool crib has an inventory control system and an item master. A data supply chain and data lake use metadata and data governance.
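To make the item master analogy concrete, here is a minimal sketch of a catalog entry that records what each coded value means, which is the kind of metadata that would have exposed the cancelled-versus-expiring mix-up in the insurance example. Everything in it is an assumption for illustration: the class names `CatalogEntry` and `DataCatalog`, the field names, and the codes "C" and "E" are hypothetical and do not come from the presentation or from any particular catalog product.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One 'item master' record for a data element (all field names are illustrative)."""
    name: str
    source: str
    steward: str
    description: str
    code_values: dict = field(default_factory=dict)  # meaning of each coded value


class DataCatalog:
    """A toy in-memory catalog; a real one would be a governed, shared service."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def meaning(self, name, code):
        """Look up what a coded value means, or raise if it is undocumented."""
        entry = self._entries[name]
        if code not in entry.code_values:
            raise KeyError(f"{name}: code {code!r} is not documented")
        return entry.code_values[code]


catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="policy_status",
    source="policy_admin_system",  # hypothetical source system
    steward="underwriting",        # hypothetical data steward
    description="Current status of an insurance policy",
    code_values={"C": "cancelled", "E": "expiring"},
))
status = catalog.meaning("policy_status", "C")
```

Raising on an undocumented code is a deliberate choice in this sketch: a modeler who meets an unknown value is stopped and sent to the steward instead of guessing.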
Why Data Supply Chains Are Important
• Unknown bad data is like a hidden manufacturing fault, but instead of a recall, you get to explain that the model and AI are in error and have been putting out bad recommendations.
• Gathering external, internal and then blending in deduced data is like switching suppliers in a supply chain.
• Sometimes the pieces don’t fit in spite of the same specs.
This isn’t new, but data scientists who are new to this type of use of data are “discovering” things that have already been discovered.
As in modern manufacturing, you no longer fix the product after it is built. You fix the process, i.e., you build a data supply chain.
Data-centric Thinking for Design
• Treat and View Your Data as an Asset
• Sell to Suppliers, Not Consumers
• Integrate With Your Data Strategy
• Treat Like a Real Business Line
AI and analytics architectures are really logistical challenges. Pretend the analytics results are a product, even if internal. Design the data supply chain.
Manage Your Data Architecture
• Most large organizations have complicated architectures.
• Even smaller organizations need to balance COTS, homegrown and modern data assets.
[Diagram: Data Insight Architecture, showing Business Strategy, the Integration & Abstraction Layer, the Data Management Layer and the Data Access Layer across a Vintage Area and a Contemporary Area]
Manage Your Data Architecture
• Use the concept of a supply chain to understand, design and manage the balancing of the vintage and contemporary sides of your data architecture.
[Diagram: the Data Insight Architecture overlaid with the data supply chain stages: Source, Fabricate, Assembly, Distribute]
Features of a Modern Data Supply Chain
[Diagram: stages shown along the chain are Create, Use, Update, Measure, Model, Analyze, Delete]
Goods and Services Supply Chain
• Functions: Distribution, Purchasing, Operations, Integration
• Capabilities: Logistics, Compliance, Organization, Product Mgmt
Data Supply Chain
• Functions: Data Push and Pull, Data Management, Data Operations, Data Integration
• Capabilities: BI and Analytics, Governance, Engagement Model, Product Mgmt
Major Components of the Data Supply Chain
[Diagram: the data supply chain across the stages Create, Use, Update, Measure, Model, Analyze, Delete]
• Data Sources: Legacy Apps, ERP, External Files
• Data Lake: Landing Zone, Standardization Zone, Analytics Platforms
• Traditional BI and Reporting: Reporting, Data Mart, Data Warehouse
• Spanning the chain: Data Governance, Data Catalog (Metadata), Data Quality
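The flow from data sources through the landing zone and standardization zone to the analytics platforms can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `land`, `standardize` and `publish`, the `amount` field and the sample records are hypothetical, and a real data lake would use storage and processing services in place of in-memory lists.

```python
def land(raw_records):
    """Landing zone: keep records exactly as received, tagged with their source."""
    return [{"source": src, "payload": dict(rec)} for src, rec in raw_records]


def standardize(landed):
    """Standardization zone: normalize field names and types; set aside failures."""
    clean, rejected = [], []
    for item in landed:
        payload = {k.strip().lower(): v for k, v in item["payload"].items()}
        try:
            payload["amount"] = float(payload["amount"])
        except (KeyError, TypeError, ValueError):
            rejected.append(item)  # data quality gate: hold the record for review
            continue
        clean.append({"source": item["source"], **payload})
    return clean, rejected


def publish(clean):
    """Analytics platform: expose a simple total per source."""
    totals = {}
    for rec in clean:
        totals[rec["source"]] = totals.get(rec["source"], 0.0) + rec["amount"]
    return totals


# Made-up records from three kinds of sources named on the slide.
raw = [
    ("erp", {"Amount": "120.50"}),
    ("legacy_app", {"amount ": 30}),
    ("external_file", {"Amount": "n/a"}),
]
clean, rejected = standardize(land(raw))
totals = publish(clean)
```

The record that fails the quality gate is kept in `rejected` with its original payload, so it can be traced back to its source instead of silently disappearing from the analytics.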
Roles – Treat and View Your Data as an Asset
Roles and responsibilities:
• Information product management: Data Product Manager, quality control, etc. (Custodian, Steward)
• Architect, engineer: Define the supply chain, from source, production, marketing and shipping
• Governance and quality: Manage data quality (the data scientists cannot sustain the product without good raw materials). Govern data (there will be standards required for source and usage of data).
• Leadership and alignment: 90% of the same support mechanisms are shared. Integrate with the data strategy and vision for the organization.
Responsibilities – Treat Like a Real Business Line
Roles and responsibilities:
• Oversight of data use: Define revenue streams; monitor costs; measure effectiveness
• Manage the numbers: Measure the business; monitor costs and returns on sales
• Manage the customer: Engage, Transact, Fulfill and Service exist for both internal and external customers
• Alignment and acceptance: Attain peer status with finance, legal, other products/revenue streams
Best Practices
Sell to suppliers, not consumers
• Solve customer needs with data products. Provide data (or data access) that someone else can “productize.”
• Develop premium data products for sellers versus buyers. Sellers (and their agents) are more willing to pay than buyers. Direct-to-consumer tends to be fickle and higher-cost (Zillow and Uber, as examples).
• Develop a data exchange with customers on the platform of your customer’s preference. Don't make your customer invest heavily: standards are one thing, capital is another.
• Achieve market scale quickly. POCs are good, but make a call, or someone else will beat you.
Best Practices
Integrate with data strategy
• IaaS and internal data ecosystems tend to be intertwined. They will share the same data management and governance functions (it is a win-win).
• Metadata is your inventory management system. Indices, registries and lineage are vital functions for oversight and scaling.
• Data landscape management is key: keep a good inventory of data assets. The most important or biggest DB might be one you don’t own (e.g. external data). Fuse data from many sources and formats.
• Culture has strategy for lunch. Acceptance of new standards and capabilities is never smooth.
How Analytics Must Interact with the Data Supply Chain
Analytics will lie somewhere along your data supply chain.
[Diagram: Data Sources, Landing Zone, Standardization Zone and Analytics Platforms, with Data Governance, Data Operations, Data Management, Data Scientists and Data Consumers. Stages shown: Create, Use, Update, Measure, Model (Report in the second build of the slide), Delete]
Best Practices and Key Takeaways
• Understand the bigger picture of balancing old and new, many sources and many uses of data
• Understand there is a strong metaphor in the supply chain
• Realize lean management, logistics and compliance are models in the non-data world that can be applied to the management of data supply chains
• Ensure the end point is the data lake and analytics/insights
• Start with information requirements when you plan for the data supply chain
• Fully exploit the assembly line metaphor: raw data goes in, and an analytics conclusion comes out
• Remember well-established best practices along the way. Metadata and the data catalog are your item master and inventory management systems
Please Share Your Questions and Comments
Thank you for joining us today!
Our Thursday, June 7 #DIAnalytics webinar is: Top 5 Priorities of an Analytics Leader.
John Ladley @jladley
john@firstsanfranciscopartners.com
Kelle O’Neal @kellezoneal
kelle@firstsanfranciscopartners.com
