Is it sensible to use Data Vault at all?
Conclusions from a project.
Mainz, 15th March 2016
11. Oracle DWH Community Treffen
Alexander Mendle (Insights & Data, Capgemini)
Our customers’ business model.
Project setup.
Greenfield DW with 3 source systems: ERP, CMS, and a transaction system.
Data Vault was mandated by our customer's group, which also provided a Data Vault architect.
Capgemini supported the project in implementation and testing.
With over 11,000 professionals across 40+ countries …
… and being part of a multi-faceted group …
180,000 employees (1) in more than 40 countries
A promise that expresses our brand philosophy
Revenues (2): €10.573 billion
Operating margin: €486 million
Operating profit: €447 million
Net cash and cash equivalents: €1,464 million
6 strategic alliances: EMC2, HP, IBM, Microsoft, Oracle, SAP
7 values shared since the company’s creation in 1967: honesty, boldness, trust, freedom, team spirit, modesty, fun
A wide range of cutting-edge expertise for all our clients
Five strategic sectors: expertise in Automotive, Banking, Consumer Products & Retail, Energy and Utilities, Insurance
A unique way
(1) Headcount including IGATE
(2) For the FY15-16
Capgemini Insights & Data Services Model
Copyright © Capgemini 2013. All Rights Reserved
5DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX
Industry verticalization: Automotive; Consumer Products & Retail; Public Sector; Financial Services; Telco; Energy & Utilities; Life Sciences; Media & Entertainment
Core capabilities and offers: Data & Info Management; Master Data Management; Big Data (Hadoop/NoSQL); Optimized data warehouse; EPM (Enterprise Performance Management); BI & Data Visualization; Predictive + data science; Real-time analytics
Delivery models: BI Service Center; Cloud; Application management; Agile; as-a-service & BPO; IP Solutions; Rapid prototyping/POC
Business engagement: Strategic customer partnership; Digital transformation; Specific alliance initiatives; Performance management & strategy; Risk sharing
Governance, architecture and strategy: Data and information governance; Architecture; Privacy & security; Information strategy; Business Data Lake
Agenda
• What is Data Vault? Just a quick glimpse
• Impact on Architecture
• Impact on Implementation
• Impact on Project
• Summary
Data Vault is applicable in a multi-layer DWH architecture
Classic architecture: Stage, Core (mostly 3NF), Data Mart(s)
Data Vault architecture: Stage, Data Vault, Data Mart(s)
• Hub: list of business keys.
• Link: N:M relations between Hubs.
• Satellites: details for Hubs & Links (historized).
Data Vault is mostly associated with its unique data modelling approach. However, as of its newest version it is a comprehensive set of data modelling, project methodology and system architecture.
Source: Linstedt: “Super Charge Your Data Warehouse”, no publisher, no place, ISBN: 978-0-9866757-1-3, 2010
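The three table types can be pictured with a small Python sketch. This is only an illustration of the hub / link / satellite idea, not the project's model: the class fields, the customer key `C-1001` and the `city` attribute are all made up for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Hub:
    """One row per business key."""
    business_key: str
    load_ts: datetime
    record_source: str


@dataclass(frozen=True)
class Link:
    """One row per combination of hub business keys (N:M relation)."""
    hub_keys: tuple[str, ...]
    load_ts: datetime
    record_source: str


@dataclass(frozen=True)
class Satellite:
    """Descriptive attributes of a hub or link; every change adds a row (historized)."""
    parent_key: str
    load_ts: datetime
    attributes: tuple[tuple[str, str], ...]


def current_attributes(rows: list[Satellite], parent_key: str) -> dict[str, str]:
    """Return the most recently loaded attribute set for one parent key."""
    history = [r for r in rows if r.parent_key == parent_key]
    latest = max(history, key=lambda r: r.load_ts)
    return dict(latest.attributes)


# Hypothetical data: one customer whose city changed over time.
customer = Hub("C-1001", datetime(2016, 1, 1), "ERP")
history = [
    Satellite("C-1001", datetime(2016, 1, 1), (("city", "Mainz"),)),
    Satellite("C-1001", datetime(2016, 3, 1), (("city", "Munich"),)),
]
latest_city = current_attributes(history, customer.business_key)["city"]
```

The hub row never changes; only the satellite grows, which is where the history lives.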
The Data Vault proposition: agile, quick and cheap
These are the proposed enhancements:
Source: Linstedt, Olschimke: “Experiences from a Data Vault 2.0 pilot”, 13th European TDWI Conference, Munich, 2013
Reduction in Total Cost of Ownership | More agility
• Supports cross-functional areas of business
• Near zero change impacts to existing system
• Reduction in data acquisition costs
• Reduction in maintenance costs over the life of the EDW
• Reduction in implementation complexity
• Compliance with full audits (“all the data, all of the time”)
• Rapid turn-around for new requirements
• Parallel teams, all agile
• Scalable teams, with limited ramp-up necessary
• Automated ETL generation based on patterns
Architecture easily deals with changes
Existing structures are not changed, no matter whether it is just a new attribute, a whole new source system, or a change in business or in semantics. New things are simply added without side effects.
Source: Linstedt: “Super Charge Your Data Warehouse”, no publisher, no place, ISBN: 978-0-9866757-1-3, 2010
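A minimal sketch of what "added without side effects" can look like at table level, using an in-memory SQLite database. The table and column names (`hub_customer`, `sat_customer_erp`, `sat_customer_cms`, `newsletter_opt_in`) are hypothetical and are not taken from the project; the point is only that a new source contributes a new satellite while the existing definitions stay byte-for-byte the same.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE hub_customer (customer_key TEXT PRIMARY KEY, load_ts TEXT, record_source TEXT);
CREATE TABLE sat_customer_erp (customer_key TEXT, load_ts TEXT, name TEXT,
                               PRIMARY KEY (customer_key, load_ts));
""")


def table_definitions(connection):
    """Map table name -> CREATE statement as stored by SQLite."""
    rows = connection.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return dict(rows)


before = table_definitions(con)

# A new source system (here: a hypothetical CMS feed) arrives.
# We add a satellite next to the existing ones and alter nothing.
con.execute("""
CREATE TABLE sat_customer_cms (customer_key TEXT, load_ts TEXT, newsletter_opt_in INTEGER,
                               PRIMARY KEY (customer_key, load_ts))
""")

after = table_definitions(con)
unchanged = all(after[name] == sql for name, sql in before.items())
added = sorted(set(after) - set(before))
```

In a 3NF core the same requirement would more likely mean an `ALTER TABLE` on an existing entity, which is the kind of change this style tries to avoid.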
Unique construction of tables enables industrialization
• Hub tables: unique “list” of business keys
• Link tables: unique list of business key combinations
• Hub & Link satellite tables: attributes belonging to the hub / link entity
Scheduling: first the hubs, then hub satellites and links, then link satellites. There are no other dependencies. With Data Vault 2 this became even more flexible.
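The scheduling rule above can be sketched as three load stages that run one after another, with the tables inside each stage loaded in parallel. All table names and the `load_table` placeholder are assumptions made for this example; a real project would call its generated load jobs here.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table names; each stage depends only on the stages before it.
LOAD_STAGES = [
    ["hub_customer", "hub_product"],                # 1. hubs
    ["sat_customer", "sat_product", "link_order"],  # 2. hub satellites and links
    ["sat_link_order"],                             # 3. link satellites
]


def load_table(name: str) -> str:
    """Placeholder for running the load job of one table."""
    return name


def run_load(stages):
    """Run the stages in order; tables within one stage are loaded in parallel."""
    finished = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        for stage in stages:
            # sorted() consumes every result, so the next stage starts
            # only after all tables of this stage have been loaded.
            finished.append(sorted(pool.map(load_table, stage)))
    return finished


order = run_load(LOAD_STAGES)
```

Data Vault 2.0 derives its keys by hashing the business keys, which is commonly cited as the reason the stages can be relaxed further; that variant is not shown here.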
Effects of Data Vault on implementation
[Figure: number of manual ETLs and number of tables per layer (Stage | CORE | Marts)]
Quick implementation of Stage and Core using generators, which is a must as there is a vast number of objects.
Data Vault requires a huge number of objects, but allows highly industrialized implementation.
Considerations from an implementation viewpoint
Does your staff support that?
What will happen when implementing, by layer:
Stage Layer
• This is business as usual.
Data Vault Layer
• You will need a strong architect who masters Data Vault and also defends it to developers who are not familiar with it. Be prepared for arguments.
• Developers will use scripts, generators and configurations and will not build manual ETLs.
• Most time will be spent programming generators, a build toolchain and test automation.
Data Marts
• Here you might 1) integrate data into context, 2) homogenize data and 3) realize the data mart requirements, probably all at once.
• Take care not to build the same things over and over again. Make sure the team has a good understanding of the semantics of your business model. Use Business Vault or other helpers.
• Developers will implement lots of joins in their queries.
A generator-based DW requires a more software-development-oriented team, which is probably a slightly different story than a “BI consultant” team. Do you have a team that bridges the gap?
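To give a feel for the "lots of joins" remark, here is a small sketch against an in-memory SQLite database. The schema (two hubs, one link, two satellites) and the data are invented for the example; the query reads the current state of one relation and already needs four joins plus two correlated subqueries.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE hub_customer (customer_key TEXT PRIMARY KEY);
CREATE TABLE hub_product  (product_key  TEXT PRIMARY KEY);
CREATE TABLE link_order   (customer_key TEXT, product_key TEXT);
CREATE TABLE sat_customer (customer_key TEXT, load_ts TEXT, city TEXT);
CREATE TABLE sat_product  (product_key  TEXT, load_ts TEXT, title TEXT);

INSERT INTO hub_customer VALUES ('C-1');
INSERT INTO hub_product  VALUES ('P-9');
INSERT INTO link_order   VALUES ('C-1', 'P-9');
INSERT INTO sat_customer VALUES ('C-1', '2016-01-01', 'Mainz');
INSERT INTO sat_customer VALUES ('C-1', '2016-03-01', 'Munich');
INSERT INTO sat_product  VALUES ('P-9', '2016-02-01', 'Handbook');
""")

# Current view of "which customer ordered which product":
# link -> both hubs -> latest row of each satellite.
CURRENT_ORDERS = """
SELECT hc.customer_key, sc.city, hp.product_key, sp.title
FROM link_order l
JOIN hub_customer hc ON hc.customer_key = l.customer_key
JOIN hub_product  hp ON hp.product_key  = l.product_key
JOIN sat_customer sc ON sc.customer_key = hc.customer_key
 AND sc.load_ts = (SELECT MAX(load_ts) FROM sat_customer
                   WHERE customer_key = hc.customer_key)
JOIN sat_product  sp ON sp.product_key = hp.product_key
 AND sp.load_ts = (SELECT MAX(load_ts) FROM sat_product
                   WHERE product_key = hp.product_key)
"""
rows = con.execute(CURRENT_ORDERS).fetchall()
```

Helper structures such as a Business Vault are one way to hide this kind of query behind something the mart developers can reuse.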
The process for building Data Vault is straightforward, the CORE can be built quickly...
Process steps: Analysis, Design, ETL generation, Loading
• Analysis of the business as usual...
• ...but not much design effort is needed: the Data Vault modelling rules apply.
• ETLs can be generated from only four different templates (Hubs, Links, Hub Satellites and Link Satellites).
Data Vault can help you pick up pace with the CORE (raw Data Vault).
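How template-based ETL generation might look can be sketched with Python's `string.Template`. Only a hub template is shown; the SQL shape, the metadata rows and every table or column name are assumptions for illustration and not the generator used in the project.

```python
from string import Template

# Illustrative hub-load template: insert business keys not yet present in the hub.
HUB_TEMPLATE = Template(
    "INSERT INTO $hub ($key, load_ts, record_source)\n"
    "SELECT DISTINCT s.$key, CURRENT_TIMESTAMP, '$source'\n"
    "FROM $stage s\n"
    "WHERE NOT EXISTS (SELECT 1 FROM $hub h WHERE h.$key = s.$key)"
)

# Link, hub-satellite and link-satellite templates would be registered the same way.
TEMPLATES = {"hub": HUB_TEMPLATE}


def generate(kind: str, **params: str) -> str:
    """Fill the template of the given kind with one metadata row."""
    return TEMPLATES[kind].substitute(**params)


# Hypothetical metadata: one row per hub to be loaded.
METADATA = [
    {"hub": "hub_customer", "key": "customer_no", "stage": "stg_erp_customer", "source": "ERP"},
    {"hub": "hub_product", "key": "product_no", "stage": "stg_erp_product", "source": "ERP"},
]

statements = [generate("hub", **row) for row in METADATA]
```

Adding a new hub then means adding a metadata row, not writing another ETL by hand.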
The point is not to store lots of things easily; it is how to retrieve information from them.
Data Vault is a paradigm change in many ways.
Quick and large store.
“All the data – all of the time”
Needs experience for retrieval
High effort for systematic
Semantics built-in
...but may thwart you building the marts
Data Vault may help you pick up pace, up to the CORE layer (i.e. the “Data Vault layer”).
• Analysis of the business as usual, but...
• ...no consideration of “CORE” modelling, because all Data Vault rules apply
• “CORE” is done, but still:
• Lacking data homogenisation?
• “Business entities” (n:m)?
• Probably the proper analysis of all that?
• On top of that, you need to (technically) design your data mart
• Implementing all these complex rules in ETL
• Quick setup of the CORE model
• Quick implementation for STAGE and CORE: generated and automated build, test, rollout
[Slide graphic: scale from faster to slower]
That was our plan, looking at it in November
Timeline: Aug, Sep, Oct, Nov, Dec, Jan, Feb, Mar
Phase 1 deliverables: mappings and workflows for
• STAGE
• RAW DATA VAULT
Phase 2 deliverables: mappings and workflows for
• MASTER and MASTER CHECK
• BUSINESS VAULT
• MART
Staging and Raw Data Vault went quite fast.
Business rules AND requirements are implemented within the Data Mart, whereas the CORE can be generated!
Make use of helpers such as Business Vault.
Not the DW – the toolchain software is the deliverable
Do you need to write or request a special type of offer?
Consider buying or offering a generator toolchain instead of tables and ETL programs.
Customer
• Consider writing your next RfP not as an RfP for a DW, but as an RfP for generator software.
• Obtain control over the build tools in your project; the result is reproducible.
Service Provider
• Consider a bid offering the generator software. Your offer might look astonishingly compelling.
• Give your customer good arguments for choosing a generated solution, and refrain from it if you think there is no fit.
Summary
Putting these aspects together with a decision focus
Do I have a strong need to enable agile methodologies?
• Data Vault enables you to integrate new source systems quickly and realize new requirements with minimal dependencies.
Do I have the right people to support that?
• Data Vault requires somewhat different skills than a classic BI project.
• Data Vault differs from a classic BI environment in analysis and operation.
• Are you able to bring a better understanding of your business into the (BI) team?
Is there a strong need for any special Data Vault characteristic?
• Do you really need to be highly agile at CORE level?
• Do you really need high traceability and/or auditability?
• What other options do you have to realize your requirements?
How is my budget and time situation?
• Do you have a small, probably volatile budget for your BI initiative that cannot be foreseen for future phases?
• Do you need to quickly obtain a consolidated and flexible data layer (CORE-DW vs Data Vault)?
Operational: Data Vault allows for automation, puts analysis at two points of the architecture, and allows agility.
Tactical: Data Vault can help you get as much budget through the door as there is.
Strategic: Service providers can build new business models with Data Vault for the CORE layer.
Contact information
Alexander Mendle
Consultant Insights & Data
Alexander.Mendle@capgemini.com
Capgemini Deutschland GmbH
Olof-Palme-Str. 14
81829 München
www.capgemini.com
About Capgemini
With more than 120,000 people in 40 countries, Capgemini is one
of the world's foremost providers of consulting, technology and
outsourcing services. The Group reported 2011 global revenues
of EUR 9.7 billion.
Together with its clients, Capgemini creates and delivers
business and technology solutions that fit their needs and drive
the results they want. A deeply multicultural organization,
Capgemini has developed its own way of working, the
Collaborative Business Experience™, and draws on Rightshore®,
its worldwide delivery model.
Rightshore® is a trademark belonging to Capgemini
The information contained in this presentation is proprietary.
Copyright © 2013 Capgemini. All rights reserved.
Just a few Data Vault Tools
• Example: quipu – http://www.datawarehousemanagement.org/
• An engine to play around with: https://sourceforge.net/projects/pdidatavaultfw/ (Linux, MySQL, Kettle, configured via Excel)
Pictures
• https://commons.wikimedia.org/wiki/File:Gao-report-on-interchange.gif, 13.3.16, Public domain
• https://commons.wikimedia.org/wiki/File:KUKA_robot_for_flat_glas_handling.jpg, 9.3.16, Public domain
• https://pixabay.com/de/b%C3%BCro-ordner-regal-fenster-firma-638247/, 9.3.16, Public domain
• https://www.flickr.com/photos/nationalsecurityzone/8552562622/in/photostream/, 9.3.16, https://creativecommons.org/licenses/by/2.0/, By: MedillNSZ (https://www.flickr.com/photos/nationalsecurityzone/)
• https://commons.wikimedia.org/wiki/File:2010.07.21.152950_Abf%C3%BCllanlage_Gerolstein.jpg, 9.3.16, By Hermann Luyken (Own work) [Public domain], via Wikimedia Commons
• https://commons.wikimedia.org/wiki/File:Euplectes_progne_male_South_Africa_cropped.jpg, 9.3.16, Public domain
Refer to these Websites for more information.

More Related Content

PPTX
ETL Process
PDF
Oracle Analytics Cloud のご紹介【2021年3月版】
PDF
Democratizing Data Quality Through a Centralized Platform
PDF
【旧版】Oracle Database Cloud Service:サービス概要のご紹介 [2021年7月版]
PPTX
Building a modern data warehouse
PDF
PostgreSQL Tutorial For Beginners | Edureka
PDF
PostgreSQL and Benchmarks
PDF
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
ETL Process
Oracle Analytics Cloud のご紹介【2021年3月版】
Democratizing Data Quality Through a Centralized Platform
【旧版】Oracle Database Cloud Service:サービス概要のご紹介 [2021年7月版]
Building a modern data warehouse
PostgreSQL Tutorial For Beginners | Edureka
PostgreSQL and Benchmarks
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...

What's hot (20)

PPTX
Operational Data Vault
PDF
Analyze Virtual Machine Overhead Compared to Bare Metal with Tracing
PDF
Change Data Feed in Delta
PDF
Delta from a Data Engineer's Perspective
PPTX
Azure purview
PDF
Modern Data Flow
PDF
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
PDF
Oracle Cloud Infrastructure:2021年3月度サービス・アップデート
PDF
CDC patterns in Apache Kafka®
PPTX
Oracle SQL Developer Tips & Tricks
PPTX
Data Lake Overview
PDF
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
PPTX
From cache to in-memory data grid. Introduction to Hazelcast.
PDF
Oracle 資料庫建立
PDF
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB
PPT
Database migration
PPTX
Core Concepts in azure data factory
PDF
Oracle GoldenGate FAQ
PDF
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Operational Data Vault
Analyze Virtual Machine Overhead Compared to Bare Metal with Tracing
Change Data Feed in Delta
Delta from a Data Engineer's Perspective
Azure purview
Modern Data Flow
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Oracle Cloud Infrastructure:2021年3月度サービス・アップデート
CDC patterns in Apache Kafka®
Oracle SQL Developer Tips & Tricks
Data Lake Overview
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
From cache to in-memory data grid. Introduction to Hazelcast.
Oracle 資料庫建立
Data Integration Alternatives: When to use Data Virtualization, ETL, and ESB
Database migration
Core Concepts in azure data factory
Oracle GoldenGate FAQ
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Ad

Similar to Is it sensible to use Data Vault at all? Conclusions from a project. (20)

PDF
Introduction to Modern Data Virtualization 2021 (APAC)
PDF
The Essentials Of Project Management
PPTX
Open Source DWBI-A Primer
PDF
4SubseaEngLR
PDF
Microsoft Fabric Intro D Koutsanastasis
PDF
A Logical Architecture is Always a Flexible Architecture (ASEAN)
PDF
Big data presentation, explanations and use cases in industrial sector
PDF
Data virtualization an introduction
PDF
Real-life Customer Cases using Data Vault and Data Warehouse Automation
PDF
Accelerate and Scale Big Data Analytics and Machine Learning Pipelines with D...
PDF
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
PDF
Why Data Virtualization? An Introduction
PDF
IBM TS7610 ProtecTIER Deduplication Appliance Express – Enterprise Level Tech...
PDF
Sadas Engine + QlikView
PDF
The new EDW
DOC
SAP vs SAS - Comparison
PDF
The Growth Of Data Centers
PDF
Bridging the Last Mile: Getting Data to the People Who Need It
PDF
Unify Data at Memory Speed
PPTX
Data Virtualization – Gateway to a Digital Business - Barry Devlin
Introduction to Modern Data Virtualization 2021 (APAC)
The Essentials Of Project Management
Open Source DWBI-A Primer
4SubseaEngLR
Microsoft Fabric Intro D Koutsanastasis
A Logical Architecture is Always a Flexible Architecture (ASEAN)
Big data presentation, explanations and use cases in industrial sector
Data virtualization an introduction
Real-life Customer Cases using Data Vault and Data Warehouse Automation
Accelerate and Scale Big Data Analytics and Machine Learning Pipelines with D...
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
Why Data Virtualization? An Introduction
IBM TS7610 ProtecTIER Deduplication Appliance Express – Enterprise Level Tech...
Sadas Engine + QlikView
The new EDW
SAP vs SAS - Comparison
The Growth Of Data Centers
Bridging the Last Mile: Getting Data to the People Who Need It
Unify Data at Memory Speed
Data Virtualization – Gateway to a Digital Business - Barry Devlin
Ad

More from Capgemini (20)

PPTX
Top Healthcare Trends 2022
PPTX
Top P&C Insurance Trends 2022
PPTX
Commercial Banking Trends book 2022
PPTX
Top Trends in Payments 2022
PPTX
Top Trends in Wealth Management 2022
PPTX
Retail Banking Trends book 2022
PPTX
Top Life Insurance Trends 2022
PPTX
キャップジェミニ、あなたの『RISE WITH SAP』のパートナーです
PPTX
Property & Casualty Insurance Top Trends 2021
PPTX
Life Insurance Top Trends 2021
PPTX
Top Trends in Commercial Banking: 2021
PPTX
Top Trends in Wealth Management: 2021
PPTX
Top Trends in Payments: 2021
PPTX
Health Insurance Top Trends 2021
PPTX
Top Trends in Retail Banking: 2021
PDF
Capgemini’s Connected Autonomous Planning
PPTX
Top Trends in Retail Banking: 2020
PPTX
Top Trends in Life Insurance: 2020
PPTX
Top Trends in Health Insurance: 2020
PPTX
Top Trends in Payments: 2020
Top Healthcare Trends 2022
Top P&C Insurance Trends 2022
Commercial Banking Trends book 2022
Top Trends in Payments 2022
Top Trends in Wealth Management 2022
Retail Banking Trends book 2022
Top Life Insurance Trends 2022
キャップジェミニ、あなたの『RISE WITH SAP』のパートナーです
Property & Casualty Insurance Top Trends 2021
Life Insurance Top Trends 2021
Top Trends in Commercial Banking: 2021
Top Trends in Wealth Management: 2021
Top Trends in Payments: 2021
Health Insurance Top Trends 2021
Top Trends in Retail Banking: 2021
Capgemini’s Connected Autonomous Planning
Top Trends in Retail Banking: 2020
Top Trends in Life Insurance: 2020
Top Trends in Health Insurance: 2020
Top Trends in Payments: 2020

Recently uploaded (20)

PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
NewMind AI Monthly Chronicles - July 2025
PDF
Empathic Computing: Creating Shared Understanding
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PPTX
MYSQL Presentation for SQL database connectivity
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PDF
Unlocking AI with Model Context Protocol (MCP)
PPTX
Cloud computing and distributed systems.
PDF
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
PDF
Modernizing your data center with Dell and AMD
PPT
Teaching material agriculture food technology
PDF
CIFDAQ's Market Insight: SEC Turns Pro Crypto
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PDF
cuic standard and advanced reporting.pdf
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
NewMind AI Monthly Chronicles - July 2025
Empathic Computing: Creating Shared Understanding
Digital-Transformation-Roadmap-for-Companies.pptx
MYSQL Presentation for SQL database connectivity
“AI and Expert System Decision Support & Business Intelligence Systems”
Understanding_Digital_Forensics_Presentation.pptx
Chapter 3 Spatial Domain Image Processing.pdf
Unlocking AI with Model Context Protocol (MCP)
Cloud computing and distributed systems.
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
Modernizing your data center with Dell and AMD
Teaching material agriculture food technology
CIFDAQ's Market Insight: SEC Turns Pro Crypto
Building Integrated photovoltaic BIPV_UPV.pdf
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
cuic standard and advanced reporting.pdf
NewMind AI Weekly Chronicles - August'25 Week I
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Dropbox Q2 2025 Financial Results & Investor Presentation

Is it sensible to use Data Vault at all? Conclusions from a project.

  • 1. Is it sensible to use Data Vault at all? Conclusions from a project. Mainz, 15th March 2016 11. Oracle DWH Community Treffen Alexander Mendle (Insights & Data, Capgemini)
  • 2. Our customers’ business model. Project setup. Green field DW with 3 source systems – ERP, CMS, and a transaction system. Data Vault was preset by our customers group who also provided a Data Vault architect. Capgemini supported the project in implementation and testing.
  • 3. With over 11,000 professionals across 40+ countries …
  • 4. … and being part of a multi-faceted group … 180,000 employees(1) in more than 40 countries A promise that expresses our brand philosophy Revenues(2) €10.573 billion Operating margin €486 million Operating profit €447 million Net cash and cash equivalents €1,464 million 6 strategic alliances EMC2, HP, IBM, Microsoft, Oracle, SAP 7 values shared since the company’s creation in 1967 honesty/boldness/ trust/freedom/ team spirit/modesty/fun A wide range of cutting-edge expertise for all our clients Five strategic sectors Expertise in Automotive ,Banking Consumer Products & Retail Energy and Utilities Insurance A unique way 1 Headcount including IGATE 2 For the FY15-16
  • 5. Capgemini Insights & Data Services Model Copyright © Capgemini 2013. All Rights Reserved 5DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Industry verticalization Automotive Consumer Products & Retail Public Sector Financial Services Telco Energy & Utilities Life Sciences Media & Entertainment Core capabilities and offers Data & Info Management Master Data Management Big Data (Hadoop/NoSQL) Optimized data warehouse EPM (Enterprise Performance Management) BI & Data Visualization Predictive + data science Real-time analytics Delivery models BI Service Center Cloud Application management Agile as-a-service & BPO IP Solutions Rapid prototyping/ POC Business engagement Strategic customer partnership Digital transformation Specific alliance initiatives Performance management & strategy Risk sharing Governance, architecture and strategy Data and information governance Architecture Privacy & security Information strategy Business Data Lake
  • 6. Copyright © Capgemini 2013. All Rights Reserved 6DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Agenda  What is Data Vault? – just a quick glimpse  Impact on Architecture  Impact on Implementation  Impact on Project  Summary
  • 7. Data Vault is applicable in a multi-layer DWH architecture Copyright © Capgemini 2013. All Rights Reserved 7DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Stage Data Mart(s) Core (mostly 3NF) Source: Linstedt: “Super Charge Your Data Warehouse”, o. V., o. O., ISBN: 978-0-9866757-1-3, 2010 Data Vault is mostly tied to its unique data modelling approach. However, as of its newest version it’s a comprehensive set of data modelling, project methodology and system architecture. Stage Data Mart(s) Data Vault Hub: List of buiness keys. Link: N:M relations between Hubs. Satellites: details for Hubs & Links (historized)
  • 8. The Data Vault proposition: agile, quick and cheap Copyright © Capgemini 2013. All Rights Reserved 8DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX That’s the proposed enhancements... Source: Linstedt, Olschimke: “Experiences from a Data Vault 2.0 pilot”, 13th European TDWI Conference, Munich, 2013 Reduction in Total Cost of Ownership More agility • Supports cross-functional areas of business • Near zero change impacts to existing system • Reduction in data acquisition costs • Reduction in maintenance costs over the life of the EDW • Reduction in implementation complexitiy • Compliance with full audits (“all the data, all of the time”) • Rapid turn-around for new requirements • Parallel teams – all agile • Scalable teams – with limited ramp-up necessary • Automated ETL generation based on patterns.
  • 9. Copyright © Capgemini 2013. All Rights Reserved 9DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Agenda  What is Data Vault?  Impact on Architecture  Impact on Implementation  Impact on Project  Summary
  • 10. Architecture easily deals with changes Copyright © Capgemini 2013. All Rights Reserved 10DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Existing structures are not changed – no matter if its just a new attribute or a whole new source system, changes in business or in semantics. New things are just added without side effects. Source: Linstedt: “Super Charge Your Data Warehouse”, o. V., o. O., ISBN: 978-0-9866757-1-3, 2010
  • 11. Architecture easily deals with changes Copyright © Capgemini 2013. All Rights Reserved 11DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Existing structures are not changed – no matter if its just a new attribute or a whole new source system, changes in business or in semantics. New things are just added without side effects. Source: Linstedt: “Super Charge Your Data Warehouse”, o. V., o. O., ISBN: 978-0-9866757-1-3, 2010
  • 12. Copyright © Capgemini 2013. All Rights Reserved 12DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Agenda  What is Data Vault?  Impact on Architecture  Impact on Implementation  Impact on Project  Summary
  • 13. Unique construction of tables enables industrialization Copyright © Capgemini 2013. All Rights Reserved 13DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Hub tables Unique “list” of business keys And the scheduling: first the hubs, then hub satellites and links, then link satellites. No more dependencies. With Data Vault 2 this became even more flexible. Link tables Unique list of business key combinations Hub & Link satellite tables Attributes belonging to hub / link entity
  • 14. NumberofmanualETLs Numberoftables Effects of Data Vault on implementation Copyright © Capgemini 2013. All Rights Reserved 14DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Quick implementation of Stage and Core using Generators – must-be as there is a vast number of objects Data Vault requires a huge number of objects, but allows highly industrialized implementation Stage | CORE | Marts Stage | CORE | Marts
  • 15. Layers Stage Layer Data Vault Layer Data Marts Considerations from an implementation viewpoint Does your staff support that? Copyright © Capgemini 2013. All Rights Reserved 15DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX What will happen when implementing.  This is business as usual.  You will need a strong architect who masters Data Vault, and also advocates Data Vault against developers not familiar with Data Vault. Be prepared for arguments.  Developers will use scripts, generators, configurations and will not build manual ETLs  Most time will be spent programming generators, a build toolchain, test automation  You might do here: 1) integrate data into context, 2) homogenize data and 3) realize the data mart requirements. Probably all at a time.  Watch out that you do not build things over and over again. So be sure to have a good understanding of the semantics of your business model in the team. Use Business Vault or other helpers.  Developers will implement lots of joins in their queries. A generator-based DW requires a more software-development oriented team – that’s probably a slightly different story than a “BI consultant” team. Do you have the team that bridges the gap?
  • 16. Copyright © Capgemini 2013. All Rights Reserved 16DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Agenda  What is Data Vault?  Impact on Architecture  Impact on Implementation  Impact on Project  Summary
  • 17. Process for building Data Vault is straight forward, the CORE can be built quickly... Copyright © Capgemini 2013. All Rights Reserved 17DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX  Analysis of business as usual....  ...but not much efforts for design needed – Data Vault rules for modelling apply.  ETLs can be generated from only four different templates (Hubs, Links, Hub Satellites and Link Satellites) Data Vault can help you pick up pace with the CORE (raw Data Vault) Analysis Design ETL generation Loading
  • 18. The thing is not to easily store lots of things, it’s about how to retrieve information from it. Copyright © Capgemini 2013. All Rights Reserved 18DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Data Vault is a paradigm change in many ways. Quick and large store. „All the data – all of the time“ Needs experience for retrieval High effort for systematic Semantics built-in
  • 19. ...but may thwart you building the marts Copyright © Capgemini 2013. All Rights Reserved 19DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX Data Vault may help you pick up pace – up to the CORE layer (i. e. “DataVault Layer”)  Analysis of business as usual but  ....no consideration of a “CORE” modelling because all Data Vault rules apply  “CORE” is done, but still  Lacking data homogenisation?  “business entities” (n:m)?  probably the proper analysis of all that?  On top of that, you need to (technically) design your datamart  Implementing all these complex rules in ETL  Quick setup of CORE model  Quick implementation for STAGE and CORE -- generated and automated build, test, rollout faster slower
  • 20. That was our plan, looking at it in November. Timeline: Aug, Sep, Oct, Nov, Dec, Jan, Feb, Mar, split into Phase 1 and Phase 2.
      • Phase 1 deliverables: mappings and workflows for STAGE and RAW DATA VAULT.
      • Phase 2 deliverables: mappings and workflows for MASTER and MASTER CHECK, BUSINESS VAULT and MART.
    Staging and Raw Data Vault went quite fast. Business rules AND requirements are implemented within the data mart, whereas the CORE can be generated! Make use of helpers such as Business Vault.
  • 21. Not the DW: the toolchain software is the deliverable. Do you need to write or request a special type of offer? Consider buying or offering a generator toolchain instead of tables and ETL programs.
      • Customer: consider writing your next RfP not as an RfP for a DW, but as an RfP for generator software. Obtain control over the build tools in your project; the result is reproducible.
      • Service provider: consider a bid that offers the generator software. Your offer might look astonishingly compelling. Give your customer good arguments for choosing a generated solution, and refrain from it if you think there is no fit.
  • 22. Agenda
      • What is Data Vault?
      • Impact on Architecture
      • Impact on Implementation
      • Impact on Project
      • Summary
  • 23. Summary: putting these aspects together with a decision focus. Decision questions: Do I have a strong need to enable agile methodologies? Do I have the right people to support that? Is there a strong need for any special Data Vault characteristic? What will happen when implementing:
      • Data Vault enables you to integrate new source systems quickly and to realize new requirements with minimal dependencies.
      • Data Vault requires somewhat different skills than a classic BI project.
      • Data Vault differs from a classic BI environment in analysis and operation.
      • Are you able to bring a better understanding of your business into the (BI) team?
      • Do you really need to be highly agile at the CORE level?
      • Do you really need high traceability and/or auditability?
      • What other options do you have to realize your requirements?
    How is my budget and time situation?
      • Do you have a small, possibly volatile budget for your BI initiative that cannot be foreseen for future phases?
      • Do you need to obtain a consolidated and flexible data layer quickly (CORE DW vs. Data Vault)?
    Operational: Data Vault allows for automation, places analysis at two points in the architecture, and allows agility. Tactical: Data Vault can help you get as much budget through the door as there is. Strategic: service providers can build new business models with Data Vault for the CORE layer.
  • 24. Contact information: Alexander Mendle, Consultant, Insights & Data, Alexander.Mendle@capgemini.com. Capgemini Deutschland GmbH, Olof-Palme-Str. 14, 81829 München.
  • 25. About Capgemini (www.capgemini.com): With more than 120,000 people in 40 countries, Capgemini is one of the world's foremost providers of consulting, technology and outsourcing services. The Group reported 2011 global revenues of EUR 9.7 billion. Together with its clients, Capgemini creates and delivers business and technology solutions that fit their needs and drive the results they want. A deeply multicultural organization, Capgemini has developed its own way of working, the Collaborative Business Experience™, and draws on Rightshore®, its worldwide delivery model. Rightshore® is a trademark belonging to Capgemini. The information contained in this presentation is proprietary. Copyright © 2013 Capgemini. All rights reserved.
  • 26. Just a few Data Vault tools
      • Example: quipu, http://www.datawarehousemanagement.org/
      • An engine to play around with: https://sourceforge.net/projects/pdidatavaultfw/ (Linux, MySQL, Kettle, configured via Excel)
  • 27. Pictures
      • https://commons.wikimedia.org/wiki/File:Gao-report-on-interchange.gif, 13.3.16, Public domain
      • https://commons.wikimedia.org/wiki/File:KUKA_robot_for_flat_glas_handling.jpg, 9.3.16, Public domain
      • https://pixabay.com/de/b%C3%BCro-ordner-regal-fenster-firma-638247/, 9.3.16, Public domain
      • https://www.flickr.com/photos/nationalsecurityzone/8552562622/in/photostream/, 9.3.16, https://creativecommons.org/licenses/by/2.0/, By: MedillNSZ (https://www.flickr.com/photos/nationalsecurityzone/)
      • https://commons.wikimedia.org/wiki/File:2010.07.21.152950_Abf%C3%BCllanlage_Gerolstein.jpg, 9.3.16, By Hermann Luyken (Own work) [Public domain], via Wikimedia Commons
      • https://commons.wikimedia.org/wiki/File:Euplectes_progne_male_South_Africa_cropped.jpg, 9.3.16, Public domain
    Refer to these websites for more information.