Hi-Speed DataWarehousing Jos van Dongen, Tholis Consulting
Agenda
  • Introduction: Why Hi-Speed DWH? Where do we ‘Hi-speed’ the DWH?
  • Part 1: Hi-Speed Strategies: Upgrade, Extend, Migrate
  • Part 2: New Hi-Speed DWH solutions: What’s new? Which products? How fast are they? What does it cost?
Database Systems 2008 THOLIS CONSULTING
Hi-Speed Why?
  • Growing data volume (Gartner Group):
      2007: 50% of DWHs > 10 TB
      2011 (forecast): 50% of DWHs > 50 TB
      < 5 TB is considered small (!)
  • Increasing workload: operational BI, pervasive BI, advanced analytics & mining
Hi-Speed Where?
  • Datawarehousing: development, ETL, query & analysis, maintenance (index, aggregate, backup, restore, authorization, etc.)
  • Presentation focus: query & analysis
Hi-Speed How?
  • Upgrade (speed-up factor 2-5x)
      Hardware: processing power, memory, disk
      Software: 64-bit, new OS versions, new RDBMS versions
  • Extend (speed-up factor 5-100x)
      Add datamarts: OLAP engines, (datamart) appliances
      ‘Buddy’ system
      Datastore replacement
  • Migrate (speed-up factor 10-400x)
      DWH appliances
      HW/SW packages
      SW (roll your own)
      Outsource
Upgrade Hardware: Cost ‘no issue’
  • Before 2000: solve performance problems in software (tuning, optimization)
  • 2008: hardware is cheaper than time!
Upgrade Hardware: Memory
  • Feb 2008 prices: 2 GB for €41 (PC); 4 GB for €160 (server)
  • Entry-level server: 32 GB for €1,280
Upgrade Hardware: CPU
  [Figure: CPUs in 2004 and 2008]
Upgrade Hardware: Disk
  • 2008: SSD disks
      + Access time 0.1 ms (vs. 4-5 ms for a SAS disk)
      + I/O 2-3x higher
      + Power consumption 10-20% of an HDD
      + Noise level 0 dB
      - Still very expensive: €2,300 for 128 GB
  • EMC Symmetrix: SSD as ‘ultra high performance’ option
Upgrade Software
  • A 64-bit OS is mandatory (32-bit: max 4 GB)
  • Oracle 11g: Cube Organized Materialized Views, (auto) partitioning options, data compression, Information Lifecycle Management, hot standby DB for real-time reporting
  • SQL Server 2008: partitioning, data compression
  • Win2008/SQL2008 doubles Win2003/SQL2005 performance!
Extend: Add datamart(s)
  • Default in the Inmon (CIF) architecture
  • Dimensional: often DMs as views on the DWH
  • Add an OLAP engine (or replace the RDBMS) for datamarts
  • Add an appliance for datamarts
      Netezza started here (but is scaling up to EDW level)
      Teradata scales ‘down’ to this level as well
      Competitive ‘sweet spot’ for appliance vendors
  • Use an alternative solution (see also ‘Roll your own’)
Extend: ‘Buddy’ system
  • ParAccel ‘Amigo’: the Q-Router handles all requests
      OLTP is executed on the Database of Record
      Analytical queries are executed on the ParAccel MPP Grid
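The routing idea on this slide can be sketched in a few lines. The slide does not say how the Q-Router tells the two kinds of request apart, so the keyword heuristic below, the function name `route`, and the backend labels are all assumptions made for illustration only.

```python
def route(sql: str) -> str:
    """Return a hypothetical backend label for one SQL statement."""
    text = sql.strip().lower()
    first_word = text.split(None, 1)[0] if text else ""

    # Writes always go to the system that owns the data.
    if first_word in {"insert", "update", "delete"}:
        return "database_of_record"

    # Assumed heuristic: a SELECT that aggregates is treated as analytical.
    analytic_markers = ("group by", " sum(", " avg(", " count(")
    if first_word == "select" and any(m in text for m in analytic_markers):
        return "mpp_grid"

    # Everything else (e.g. single-row lookups) stays on the record database.
    return "database_of_record"


analytic = route("SELECT nation, SUM(amount) FROM profit GROUP BY nation")
lookup = route("select * from orders where o_orderkey = 1")
write = route("UPDATE orders SET o_orderstatus = 'F' WHERE o_orderkey = 1")
```

A real router would more plausibly decide on the basis of a parsed query or an optimizer cost estimate rather than on keywords; the point here is only the split between the two execution targets.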
Extend: Datastore replacement
  • Dataupia ‘Satori’
Migrate: Appliances
  • Vendors: ‘traditional’ Teradata, HP Neoview, Kognitio, DATAllegro, Greenplum, Netezza
  • Characteristics: plug and play; combination of HW, SW, support & services
Migrate: ‘Roll your own’
  • Mostly column-based, MPP, Shared Nothing architectures
  • One established vendor: Sybase IQ, since 1993
  • Wide choice of closed and open source products:
      Open source: LucidDB, MonetDB
      Software only: Vertica*, ParAccel*, Brighthouse, ExaSol, Valentina, VectorStar, Tenbase, Sand, etc.#
      Soft/hardware: Dataupia
      ‘Lab’ware: Calpont
  * Also available as DWH appliance
  # Mostly special-purpose solutions, e.g. BigTable
Part 2: New Hi-Speed Solutions
  Since 2005, four new vendors on the market:
  • Vertica (Michael Stonebraker, $25 mln funding)
  • ParAccel (Barry Zane*, $20 mln funding)
  • Dataupia (Foster Hinshaw*, $16 mln funding)
  • Infobright (Warsaw University, $8 mln funding)
  * Netezza founders
What’s different?
  • Massively Parallel Processing (MPP): throw lots of commodity hardware at it (see ‘Upgrade’)
  • Column-based data organization:
      Limit I/O by ‘pruning’ (compare horizontal partitioning)
      One datatype per column allows for heavy compression
  • Data compression: CPU is not the bottleneck, I/O is
  • Read optimization
  • In-memory operation
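Why a single-datatype column compresses so well can be shown with run-length encoding. This is a minimal sketch, not a description of what any of the products listed actually implement; the helper names `rle_encode` and `rle_decode` and the sample data are made up.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse each run of equal neighbouring values into (value, run_length)."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


# A sorted, low-cardinality column, as is common for dimension attributes.
country = ["NL"] * 4 + ["BE"] * 3 + ["DE"] * 2

encoded = rle_encode(country)
print(encoded)  # [('NL', 4), ('BE', 3), ('DE', 2)]
assert rle_decode(encoded) == country
```

Nine stored values shrink to three pairs. The same trick does little for a row store, where neighbouring values on disk belong to different columns and rarely repeat.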
SMP vs MPP
  • Different storage approaches: Shared Disk (clustering) and Shared Nothing
  • All DWH appliance & new software vendors use a Shared Nothing architecture
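A Shared Nothing layout can be sketched as hash distribution: every row lives on exactly one node, each node aggregates only its own rows, and a coordinator combines the partial results. The function names, the CRC32 hash and the three-node setup below are assumptions for illustration, not any vendor's actual scheme.

```python
import zlib


def node_for(key, node_count):
    """Pick a node by hashing the distribution key (CRC32 is stable across runs)."""
    return zlib.crc32(str(key).encode()) % node_count


def distribute(rows, key_field, node_count):
    """Place each row on exactly one node; nodes share no disk and no memory."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for(row[key_field], node_count)].append(row)
    return nodes


def parallel_sum(nodes, field):
    """Each node sums its own slice; the coordinator adds the partial sums."""
    partials = [sum(row[field] for row in part) for part in nodes]
    return sum(partials)


orders = [{"order_id": i, "amount": 10 * i} for i in range(1, 101)]
nodes = distribute(orders, "order_id", node_count=3)
total = parallel_sum(nodes, "amount")
```

In this sketch the partial sums are computed in a loop; on a real MPP system they would run concurrently on separate machines, which is where the speed-up comes from.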
Rows vs Columns
  • Nothing new about column storage: Taxir, 1969
  • Conceptual view: rows vs. columns
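The conceptual difference can be made concrete with a toy table held in both layouts. The table, the field names and the "values touched" counter are illustrative assumptions; real engines read disk blocks, not Python objects.

```python
# Row layout: one record per order, all fields stored together.
rows = [
    {"order_id": 1, "country": "NL", "amount": 100},
    {"order_id": 2, "country": "BE", "amount": 250},
    {"order_id": 3, "country": "NL", "amount": 75},
]

# Column layout: one list per field, built from the same data.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def sum_from_rows(rows, field):
    """A row store reads whole records, so every field of every row is touched."""
    touched = sum(len(row) for row in rows)
    return sum(row[field] for row in rows), touched


def sum_from_columns(columns, field):
    """A column store reads only the one column the query asks for."""
    values = columns[field]
    return sum(values), len(values)


print(sum_from_rows(rows, "amount"))        # (425, 9)
print(sum_from_columns(columns, "amount"))  # (425, 3)
```

Both layouts give the same answer, but the column layout touches a third of the values here. The gap widens with every extra column in the table that the query does not need.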
Products: Vertica (1)
  • MPP, Shared Nothing, column storage, compression, read optimized
  • Architecture: WOS (Write Optimized Store) & ROS (Read Optimized Store)
  • Architecture: columns & projections
Products: Vertica (2)
Products: ParAccel (1)
  • MPP, Shared Nothing, compression, parallel loader
  • Two implementation modes: Amigo* & Maverick
  • Two versions: in-memory & disk-based (no hybrid solution yet)
  * SQL Server only; Oracle version in beta
Products: ParAccel (2)
  • High availability built in
  • Shattered TPC-H benchmark
  • Appliance partnership with Sun:
      Phoenix: all-in-memory DWH appliance
      Sedona: disk-based VLDB
Products: ExaSol
  • MPP, column-based, auto tuning, in-memory based, in-memory compression, ExaCluster OS
Products: BrightHouse (1)
  • Uses MySQL as DBMS
  • Not columns but 64K Data Packs
  • Knowledge Grid and DP nodes replace traditional indexes
  • Heavy compression (10:1)
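How small per-pack metadata can stand in for an index can be sketched as follows. The slide only names Data Packs and the Knowledge Grid; the assumption here is that each pack carries at least its minimum and maximum value, and the pack size is shrunk from 64K to 4 so the example stays readable. All names are hypothetical.

```python
PACK_SIZE = 4  # tiny for illustration; the slide mentions 64K Data Packs


def build_packs(values, pack_size=PACK_SIZE):
    """Split a column into packs and record min/max metadata for each pack."""
    packs = []
    for start in range(0, len(values), pack_size):
        chunk = values[start:start + pack_size]
        packs.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return packs


def count_greater_than(packs, threshold):
    """Count values > threshold, opening a pack only when metadata cannot decide."""
    matches = 0
    packs_opened = 0
    for pack in packs:
        if pack["max"] <= threshold:
            continue  # no value can match: skip the pack entirely
        if pack["min"] > threshold:
            matches += len(pack["values"])  # every value matches: metadata suffices
            continue
        packs_opened += 1  # mixed pack: the data itself must be inspected
        matches += sum(1 for v in pack["values"] if v > threshold)
    return matches, packs_opened


packs = build_packs(list(range(1, 13)))        # packs: 1-4, 5-8, 9-12
result = count_greater_than(packs, threshold=6)
print(result)  # (6, 1)
```

Of the three packs, one is skipped, one is answered from metadata alone and only one is actually read, without any conventional index on the column.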
Products: BrightHouse (2)
Products: Dataupia
  • Database appliance
  • Adds MPP capability to DB2, Oracle & SQL Server
  • ‘Invisible’ appliance
  • Lowest-cost solution on the market
  • Plug and play
How Fast: TPC-H Benchmark
  Typical BI queries, e.g.:
  • Top 10 of non-shipped orders on date x
  • Annual growth of market share
  • Profit per product type, year and country
  • Profit share of local suppliers
  • etc.
Query Example: TPC-H Q9
-- $ID$
-- TPC-H/TPC-R Product Type Profit Measure Query (Q9)
-- Functional Query Definition
-- Approved February 1998
select
  nation,
  o_year,
  sum(amount) as sum_profit
from (
  select
    n_name as nation,
    extract(year from o_orderdate) as o_year,
    l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
  from
    part,
    supplier,
    lineitem,
    partsupp,
    orders,
    nation
  where
    s_suppkey = l_suppkey
    and ps_suppkey = l_suppkey
    and ps_partkey = l_partkey
    and p_partkey = l_partkey
    and o_orderkey = l_orderkey
    and s_nationkey = n_nationkey
    and p_name like '%green%'
) as profit
group by
  nation,
  o_year
order by
  nation,
  o_year desc
Remember last year?
  • Queries 1-10 on SF2 (2 GB data)
  • Single CPU, single disk, 2 GB RAM, Windows 2003
How Fast: TPC-H 100 & 300 GB
How Fast: TPC-H 1 TB
  • But:
  • So: always verify your own workload against your own data!
Hi-Speed, what now?
  • Upgrading hardware might be the most time- and cost-effective short-term solution to performance problems
  • OK, this software is fast, but what about ETL/ELT? (Physical) design? Maintenance? Support?
  • When you hit the limits of your traditional DWH: evaluate & Proof of Value
  • When will Oracle, Microsoft and IBM enter this arena?
Questions?
