Big Data Insurance
Mike Johnson
Mike.Johnson@progress.com
© 2016 Progress Software Corporation and/or its subsidiaries or affiliates. All rights reserved.
Big Data is Here to Stay – Forbes Sept. 2015
§  Data volumes are exploding: more data has been created in the past two years than in the entire previous history of the human race.
§  By 2020, our accumulated digital universe of data will grow from 4.4 zettabytes today to around 44 zettabytes, or 44 trillion gigabytes.
§  Within five years there will be over 50 billion smart connected devices in the world, all developed to collect, analyze and share data.
§  The Hadoop … market is forecast to grow at a compound annual growth rate of 58%, surpassing $1 billion by 2020.
The Big Data Ecosystem Today
The Hadoop Ecosystem Continues to Grow Instead of Shrink
The Number of Versions of all the Hadoop Components is Staggering!
Big Data Release Cadences Continue to Cause ISVs Difficulty
Quarterly:
Monthly or More:
Yearly:
Multiple Times a Year:
To Make Things More Complicated…
§  There is real, valuable, important functionality in many of these releases
§  Examples include:
•  New data types in Hive (Varchar, Decimal, Timestamp, Binary, etc.)
•  Additional ability to push down queries in MongoDB
•  Metadata enhancements in newer versions of Hive
•  Cassandra is adding enhancements every other month
•  Etc.
This Amount of Change puts ISVs in a Difficult Position
Testing Nightmares
Inconsistencies of feature support
Keeping Up with the Industry
What do ISVs require today?
What ISVs need is a Vendor that takes care of all this for you!
§  Progress|DataDirect has been writing Connectivity for over 25 Years!
§  We have been working with Big Data sources since ????
§  Significant Investment in Testing Infrastructure
•  Over 150 Hadoop Servers
•  More than 30 Spark Servers
•  Over 250 Big Data Servers!
§  Day 1 Support Policy for New Versions
§  Dedicated team that configures new systems and runs certifications
Progress|DataDirect - Smoothing out the Rough Edges
§  Data types reported, and how they function, depend on the Hive version
•  Timestamp added in 0.8
•  Decimal added in 0.11
•  Date and varchar added in 0.12
•  Char added in 0.13
§  Syntax differences (HiveQL)
•  INSERT statements
•  Parameter arrays
§  Catalog Metadata functionality
•  Earlier versions of Hive didn't have metadata functions at all
•  Newer versions don't necessarily report metadata correctly
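The version-dependent data types listed above can be sketched as a small lookup. This is only an illustration of the idea, not DataDirect's implementation: the table copies the versions from this slide, and the names `HIVE_TYPE_INTRODUCED`, `parse_version` and `supported_types` are made up for the example.

```python
# Minimum Hive release for each data type, copied from the slide above.
HIVE_TYPE_INTRODUCED = {
    "timestamp": (0, 8),
    "decimal": (0, 11),
    "date": (0, 12),
    "varchar": (0, 12),
    "char": (0, 13),
}


def parse_version(text):
    """Turn a dotted version such as '0.12' or '0.13.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supported_types(hive_version):
    """Return the listed types that are available in the given Hive version."""
    version = parse_version(hive_version)
    return sorted(
        name
        for name, introduced in HIVE_TYPE_INTRODUCED.items()
        if version >= introduced
    )


print(supported_types("0.11"))  # ['decimal', 'timestamp']
```

A driver that reports types this way avoids advertising, for example, `varchar` to an application that is connected to a Hive 0.11 server.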
It’s not ALL about Connecting into Big Data Systems
§  Most of these systems want to be the core system in your environment
§  There is usually a great need to get data into these systems quickly, through tools such as:
•  Sqoop
•  Spark
•  Flume
§  The rest of the DataDirect portfolio of drivers plugs into these tools to broaden your reach
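As a rough sketch of what plugging a JDBC driver into one of these tools can look like, the snippet below assembles a `sqoop import` command line. The Sqoop options used (`--connect`, `--driver`, `--username`, `-P`, `--table`, `--target-dir`) are standard Sqoop 1 import arguments; the JDBC URL, driver class, table and paths are hypothetical placeholders, not real DataDirect values, and the helper `sqoop_import_args` is invented for this example.

```python
import shlex


def sqoop_import_args(jdbc_url, driver_class, table, target_dir, username):
    """Build the argument list for a `sqoop import` through a generic JDBC driver."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--driver", driver_class,
        "--username", username,
        "-P",  # prompt for the password instead of passing it on the command line
        "--table", table,
        "--target-dir", target_dir,
    ]


# Hypothetical values for illustration only; take the real URL and class
# name from the documentation of the driver you actually use.
args = sqoop_import_args(
    jdbc_url="jdbc:examplevendor://dbhost:1433;databaseName=sales",
    driver_class="com.example.jdbc.Driver",
    table="policies",
    target_dir="/data/raw/policies",
    username="etl_user",
)
print(shlex.join(args))
```

The command is only printed here, not executed; the driver's JAR would also need to be on Sqoop's classpath before such a command could run.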
The DataDirect Support Matrix
Component | Supported Versions
Amazon Elastic MapReduce (Amazon EMR) | 2.1.4 and Higher
Apache Hadoop Hive | 0.71 and Higher
Cloudera's Distribution Including Apache Hadoop (CDH) | CDH3 Update 4 and Higher
Hortonworks Distribution for Apache Hadoop | 1.3 and Higher
IBM BigInsights | 3.0 and Higher
MapR Distribution for Apache Hadoop | 1.2 and Higher
Pivotal HD Enterprise (PHD) | 2.0.1 and Higher
Cloudera Impala | 1.0 and Higher
Spark | 1.2 and Higher
Pivotal HAWQ | 1.1 and Higher
MongoDB | 2.2 and Higher
Cassandra | 1.2 and Higher
The DataDirect Certification Process
§  Relational DBs
•  We run all tests on each supported version before announcing certification
•  Add full test suite runs on all platforms to regular patch runs
•  Generally support 4-6 major versions of a Relational DB
•  The number of tests that we run for a Relational DB increases slowly over time
•  Occasionally phase out really old versions
§  Big Data
•  Always certify Apache
•  Cloudera versions generally release before Apache and don’t strictly follow Apache
•  Ensure that the Hive versions in other distros have already been certified
•  Certify a given distro with a given Hive version
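One possible reading of the Big Data steps above is that each certification is recorded per distro and per Hive version, so that a new distro can be checked against Hive versions that were already certified elsewhere. The sketch below illustrates that reading only. The class `CertificationLog`, the distro names and the "full run" / "distro-specific run" labels are all hypothetical and do not describe DataDirect's actual tooling.

```python
from dataclasses import dataclass, field


@dataclass
class CertificationLog:
    """Tracks which Hive versions were certified, and through which distros."""

    # Maps a Hive version string to the set of distros certified with it.
    certified: dict = field(default_factory=dict)

    def certify(self, distro, hive_version):
        """Record a certification and say whether this Hive version was new."""
        first_time = hive_version not in self.certified
        self.certified.setdefault(hive_version, set()).add(distro)
        # Hypothetical labels: a Hive version never seen before gets the full
        # suite, a known one only needs the distro-specific checks.
        return "full run" if first_time else "distro-specific run"


log = CertificationLog()
print(log.certify("Apache", "0.13"))   # full run
print(log.certify("DistroA", "0.13"))  # distro-specific run
print(log.certify("DistroB", "0.14"))  # full run
```

Keying the log by Hive version rather than by distro reflects the last bullet: what gets certified is always a pairing of a distro with a specific Hive version.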
Currently Supported Data Sources
§  Big Data / NoSQL
•  Apache Hadoop Hive
•  Cloudera
•  Hortonworks
•  MapR
•  Amazon EMR
•  Cloudera Impala
•  Pivotal HAWQ
•  MongoDB
•  IBM BigInsights
•  Oracle BDA
•  Cassandra
§  Relational
•  SAP HANA (Preview)
•  Microsoft SQL Server
•  Oracle DB
•  IBM DB2
•  Progress OpenEdge
•  SAP Sybase
•  MySQL
•  PostgreSQL
•  Pervasive SQL (Btrieve)
•  IBM Informix
•  Clipper
•  dBase
•  FoxPro
•  Paradox
•  Text Files
•  Excel
§  SaaS / Cloud
•  Salesforce.com
•  Database.com
•  FinancialForce
•  Veeva CRM
•  ServiceMax
•  Any Force.com App
•  Microsoft Dynamics CRM *
•  Microsoft SQL Azure
•  Oracle Eloqua *
•  Oracle Service Cloud
•  Marketo *
•  Google Analytics *
•  SugarCRM
•  Hubspot (Preview) *
•  Progress Rollbase *
§  EDI / XML / Text
•  EDIFACT
•  X12
•  IATA
•  Healthcare EDI: X12 (HIPAA), ICD-10, HL7
•  Flat Files: CSV, TXV, dBase
•  Text files
•  EDIG@S
•  EANCOM
§  Data Warehouses
•  Teradata
•  Amazon Redshift
•  Pivotal Greenplum
•  SAP Sybase IQ
§  Any Data Source
•  SDK
•  SequeLink Socket Server
•  Custom Engineering
* Available exclusively for DataDirect Cloud