Data Virtualization for Data Architects
21 October 2020
Allen Keyte, Director - Mero
Chris Day, Director Sales Engineering - Denodo
This Webinar - agenda
Mero & Denodo
Extending your Data Architecture
Questions
Next Steps
New Zealand partnership
Denodo
› Leader in data virtualization
› Combine disparate sources
› Consume with a data catalog
Mero
› Data engineering & analytics consulting
› Over 100 active clients
› Modern data platforms
Denodo Data Virtualization
Gartner – The Rise of Logical Architectures
This is a Second Major Cycle of Analytical Consolidation
[Figure: timeline of analytical architectures, from operational applications feeding cubes, through data warehouses and data lakes, to a Logical Data Warehouse (Data Warehouse, Data Lake, Marts, ODS, Staging/Ingest) fed by operational applications, IoT data and other new data. ©2018 Gartner, Inc. ID: 342254]
1980s, Pre EDW: Fragmented/nonexistent analysis
› Multiple sources
› Multiple structured sources
1990s, EDW: Unified analysis
› Consolidated data
› "Collect the data"
› Single server, multiple nodes
› More analysis than any one server can provide
2000s, Post EDW: Fragmented analysis
› "Collect the data" (into different repositories)
› New data types, processing, requirements
› Uncoordinated views
2010s, LDW: Unified analysis
› Logically consolidated view of all data
› "Connect and collect"
› Multiple servers, of multiple nodes
› More analysis than any one system can provide
Gartner – The Rise of Logical Architectures (continued)
[Figure: the same timeline as the previous slide, now highlighting Data Virtualization alongside the Logical Data Warehouse (Data Warehouse, Data Lake, Marts, ODS, Staging/Ingest). ©2018 Gartner, Inc. ID: 342254]
Data Virtualization
√ Improved time to market by 50 to 90%
√ Improved report consistency
√ Reduced duplication of data
√ Improved transparency
√ Reduced development cost
√ Future-proofs the architecture against technology changes
Data Virtualization as a Data Access Layer
1. Connect: DISPARATE DATA SOURCES
2. Combine: DATA VIRTUALIZATION
› Execution Engine & Optimizer
› Metadata Repository
› Relational Cache
› MPP Processing
› Corporate Security
› Monitoring & Auditing
3. Consume: DATA CONSUMERS
› SQL Queries (JDBC, ODBC, ADO.NET)
› Web Services (SOAP, REST, OData)
› Web-based catalog & search
› Secure delivery (SSL/TLS)
Data Virtualization in Action
1. Connect: DISPARATE DATA SOURCES
› Less Structured, Operational
› Base/Raw views
2. Combine
› Standardized views: Customer, Product, Order
› Business views: Finance, Operations, Sales
3. Consume: DATA CONSUMERS
› SQL Queries (JDBC, ODBC, ADO.NET)
› Web Services (SOAP, REST, OData)
› Web-based catalog & search
› Secure delivery (SSL/TLS)
Each layer of views provides more refined Single Views of Truth
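The layering on this slide can be pictured with ordinary SQL views. The following is a minimal sketch only: it uses an in-memory SQLite database as a stand-in for a virtualization layer, and every table, column and view name is hypothetical. It is not Denodo syntax.

```python
import sqlite3

# In-memory SQLite stands in for the source systems and the virtual layer.
# All table, column and view names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crm_cust (cust_id INTEGER, cust_nm TEXT, st TEXT);
    CREATE TABLE erp_orders (ord_id INTEGER, cust_id INTEGER, amt_cents INTEGER);
    INSERT INTO crm_cust VALUES (1, ' alice ', 'ca'), (2, 'Bob', 'NY');
    INSERT INTO erp_orders VALUES (10, 1, 2500), (11, 1, 1500), (12, 2, 4000);

    -- Layer 1: base/raw views mirror each source one-to-one.
    CREATE VIEW bv_customer AS SELECT * FROM crm_cust;
    CREATE VIEW bv_order AS SELECT * FROM erp_orders;

    -- Layer 2: standardized views fix names, formats and units.
    CREATE VIEW customer AS
        SELECT cust_id AS id, TRIM(cust_nm) AS name, UPPER(st) AS state
        FROM bv_customer;
    CREATE VIEW orders AS
        SELECT ord_id AS id, cust_id AS customer_id, amt_cents / 100.0 AS amount
        FROM bv_order;

    -- Layer 3: a business view built only on standardized views.
    CREATE VIEW sales_by_state AS
        SELECT c.state AS state, SUM(o.amount) AS total_amount
        FROM customer c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.state;
""")

sales_by_state = conn.execute(
    "SELECT state, total_amount FROM sales_by_state ORDER BY state"
).fetchall()
print(sales_by_state)
```

Because the business view depends only on the standardized layer, a change in a source schema would be absorbed in the base and standardized views without touching what consumers query.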
Platform Demonstration
Demo Scenario
How effective are our marketing campaigns?
▪ Historical sales data offloaded to Hadoop cluster for cheaper storage
▪ Marketing campaigns managed in an external cloud app (SaaS solution)
▪ Country is part of the customer details table, stored in the DW
[Figure: Sources (Sales, Campaign, Customer), each exposed as a Base View (source abstraction); Combine, Transform & Integrate (join, join, group by state); Consume]
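As a rough illustration of the combine step in this scenario, the sketch below joins three small in-memory collections and aggregates by campaign and state. The sample rows, field names and campaign names are invented for illustration; in the demo the three inputs live in Hadoop, a SaaS application and the data warehouse respectively.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the three sources in the demo.
sales = [  # historical sales (Hadoop in the demo)
    {"customer_id": 1, "campaign_id": "C1", "amount": 100.0},
    {"customer_id": 2, "campaign_id": "C1", "amount": 50.0},
    {"customer_id": 3, "campaign_id": "C2", "amount": 70.0},
    {"customer_id": 1, "campaign_id": "C2", "amount": 30.0},
]
campaigns = {"C1": "Spring Promo", "C2": "Loyalty Email"}  # SaaS app
customers = {1: "CA", 2: "NY", 3: "CA"}  # customer_id -> state (DW)


def sales_by_campaign_and_state(sales, campaigns, customers):
    """Join sales with campaign and customer, then total amounts per group."""
    totals = defaultdict(float)
    for row in sales:
        state = customers[row["customer_id"]]       # join with Customer
        campaign = campaigns[row["campaign_id"]]    # join with Campaign
        totals[(campaign, state)] += row["amount"]  # group by campaign, state
    return dict(totals)


report = sales_by_campaign_and_state(sales, campaigns, customers)
for (campaign, state), total in sorted(report.items()):
    print(f"{campaign:15} {state} {total:8.2f}")
```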
Personas
Denodo Developer
Business User & BI Analyst
Data Scientist
Application-to-Application
Administration & Operations
Unified Web Administration: Central Web Portal
Entry point for all
users to all Denodo
Environments.
SSO to all tools
with Kerberos, SAML
or OAuth
Key Takeaways
Data Virtualization:
1. Enables data re-use, reducing costs & increasing collaboration
2. Unifies disparate data sources in real time
3. Supports self-service & data discovery
4. Centralises governance & security of enterprise data assets
Questions
Next Steps
Webinar series continues | Wed Nov 11 | Data Virtualization for Business Consumption
Workshop | Hands-on virtual workshops - greg.laws@mero.co.nz | +64 21 875 875
Test Drive | Try it out on mero.co.nz/denodo/
What is the optimizer doing?
SELECT c.state, AVG(s.amount)
FROM customer c JOIN sales s
ON c.id = s.customer_id
GROUP BY c.state
Customer and Sales are in different sources. What is the best execution plan?
Option 1: Naïve Strategy
› Sales (300 M) and Customer (2 M) are both transferred in full; join and group by follow
› ‘Cost’ ~302 M
Option 2: Temporary Data Movement
› Create temp table (Temp_Customer), then join and group by
› Transfers of 2 M and 50 M
› ‘Cost’ ~52 M
Option 3: Partial Aggregation Pushdown
› Sales is grouped by ID first (2 M) and joined with Customer (2 M); then group by state
› ‘Cost’ ~4 M
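The three ‘Cost’ figures are just the sums of the transfer sizes shown for each plan. A minimal sketch of that arithmetic follows, assuming the cost model is "total data moved between systems" and using the slide's figures in millions; the constant and function names are hypothetical.

```python
# Sizes taken from the slide, in millions (the slide does not name the unit).
SALES = 300              # Sales, transferred in full
CUSTOMER = 2             # Customer, transferred in full
SALES_BY_CUSTOMER = 2    # Sales after grouping by customer ID at the source
TEMP_PLAN_TRANSFER = 50  # second transfer shown for the temp-table plan


def transfer_cost(*transfers):
    """Assumed cost model: total data moved between systems."""
    return sum(transfers)


plans = {
    "naive": transfer_cost(SALES, CUSTOMER),
    "temporary data movement": transfer_cost(CUSTOMER, TEMP_PLAN_TRANSFER),
    "partial aggregation pushdown": transfer_cost(SALES_BY_CUSTOMER, CUSTOMER),
}
best = min(plans, key=plans.get)

for name, cost in plans.items():
    print(f"{name:30} ~{cost} M")
print("cheapest:", best)
```

A real optimizer would weigh more than transfer volume (source load, indexes, statistics), so this only reproduces the comparison made on the slide.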
Why is this so important?
How Denodo works compared with other federation engines
SELECT c.state, AVG(s.amount)
FROM customer c JOIN sales s
ON c.id = s.customer_id
GROUP BY c.state
System | Execution Time | Data Transferred | Optimization Technique
Denodo | 9 sec. | 4 M | Aggregation push-down
Others | 125 sec. | 302 M | None: full scan
Others: Sales (300 M) and Customer (2 M) are transferred in full, then join and group by
In Denodo: Sales is grouped by ID (2 M) and joined with Customer (2 M), then grouped by state
To maximize push-down to the EDW, the aggregation is split in 2 steps:
• 1st by customerID
• 2nd by state
This significantly reduces network traffic and processing
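The two-step aggregation can be sketched as follows. The slide does not say how AVG is decomposed; carrying a SUM and a COUNT per customer through the first step is one common approach and is assumed here, since averaging per-customer averages would give the wrong answer. The data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: (customer_id, amount) rows and customer -> state.
sales = [(1, 10.0), (1, 30.0), (2, 50.0), (3, 60.0)]
customers = {1: "CA", 2: "CA", 3: "NY"}


def partial_by_customer(sales):
    """Step 1 (pushed down to the sales source): SUM and COUNT per customer."""
    partial = defaultdict(lambda: [0.0, 0])
    for customer_id, amount in sales:
        partial[customer_id][0] += amount
        partial[customer_id][1] += 1
    return {cid: (total, count) for cid, (total, count) in partial.items()}


def avg_by_state(partial, customers):
    """Step 2: join the partial results with customer and re-aggregate by state."""
    acc = defaultdict(lambda: [0.0, 0])
    for customer_id, (total, count) in partial.items():
        state = customers[customer_id]
        acc[state][0] += total
        acc[state][1] += count
    return {state: total / count for state, (total, count) in acc.items()}


def avg_by_state_naive(sales, customers):
    """Reference result: join every sales row first, then average by state."""
    amounts = defaultdict(list)
    for customer_id, amount in sales:
        amounts[customers[customer_id]].append(amount)
    return {state: sum(v) / len(v) for state, v in amounts.items()}


partial = partial_by_customer(sales)  # at most one row per customer
two_step = avg_by_state(partial, customers)
naive = avg_by_state_naive(sales, customers)
print(two_step, naive)
```

Only one partial row per customer leaves the sales source, which is why the transfer drops from the full sales table to roughly the size of the customer table.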
Denodo Performance Strategies
• Post-processing and Federation in the DV engine
• Delegation
▪ Process as much as possible in the data sources
• Temporary Tables
▪ Automatically move data to the biggest data source to optimize the execution
• Summaries
▪ Based on the query the Denodo optimizer can use a “summary” for accelerating the execution
• MPP Integration
▪ Move processing to an external MPP system on the fly
• Caching
▪ Persist data beforehand in a relational database

Data Virtualization for Data Architects (New Zealand)
