Calpont InfiniDB®
Accelerating Data Insights

Where the Rubber Meets the Road –
Analytic Platforms in the Real World

Featuring Matt Aslett, 451Research

July 18, 2012
Today’s Presenters

    Matt Aslett
    • Research Manager,
      Data Management and Analytics
    • With 451 Research since 2007
    • www.twitter.com/maslett

    Information Management
    • Operational databases
    • Data warehousing
    • Data caching
    • Event processing

    Commercial Adoption of Open Source (CAOS)
    • Open source projects
    • Adoption of open source software
    • Vendor strategies


InfiniDB® Scalable. Fast. Simple.   2                        © 2012 Calpont. All Rights Reserved.
Today’s Presenters

    Bob Wilkinson
    • Calpont Vice President of Engineering
    • Formerly CTO for Tektronix
      Communications
    • 16 years of product development
    • Responsible for design, development,
      and support of InfiniDB®




Today’s Discussion

 • Matt Aslett
         o Total Data and the Rise of the Analytic Platform
         o Analytic Platforms in the Big Data ecosystem
         o Defining the Analytic Platform
 • Bob Wilkinson
         o InfiniDB Analytic Platform
         o InfiniDB in Action
            • Telecommunications
            • Online Advertising
 • Summary and Q&A

Overview


 The rise of the analytic platform
  What and why

 The analytic platform’s place in the ‘big data’ ecosystem
  Where and when

 The key characteristics of an analytic platform
  How and which







                      © 2012 by The 451 Group. All rights reserved
The 451 Group




Big Data – Implications for Data Management
 “Big data” - realization of greater business intelligence by
  storing, processing and analyzing data that was previously
  ignored due to the limitations of traditional data management
  technologies to handle its volume, velocity and/or variety.




 Volume: The volume of data is too large for traditional database
  software tools to cope with.

 Velocity: The data is being produced at a rate that is beyond the
  performance limits of traditional systems.

 Variety: The data lacks the structure to make it suitable for storage
  and analysis in traditional databases and data warehouses.




Total Data - Beyond ‘Big Data’
 The adoption of non-traditional data processing technologies is
   driven not just by the nature of the data, but also by the user’s
   particular data processing requirements.




 Totality: The desire to process and analyze data in its entirety,
  rather than analyzing a sample of data and extrapolating the results.

 Exploration: The interest in exploratory analytic approaches, in which
  schema is defined in response to the nature of the query.

 Frequency: The desire to increase the rate of analysis in order to
  generate more accurate and timely business intelligence.

 Dependency: The reliance on existing technologies and skills, and the
  need to balance investment in those existing technologies and skills
  with the adoption of new techniques.



Beyond the limitations of traditional data warehousing
 The EDW is supposed to be a single source of the ‘truth’ and avoid
  data silos.

 One of the most significant inefficiencies of data warehousing is
  that users have traditionally had to design their data-warehouse
  models to match their planned queries.

 This approach is too rigid in a world of rapidly changing business
  requirements and real-time decision-making.

 Its inflexibility serves to encourage the growth of data silos and
  the exact redundancy and duplication issues the EDW was
  apparently designed to avoid.

 A business analyst or executive unable to get the answers to queries
  they require from the EDW is likely to find their own ways to answer
  these queries.



The Rise of Specialist Platforms

    The alternative is to embrace dispersed data, adopting not silos but
     specialist data platforms that complement the EDW.

    ‘Total Data’ describes an approach that treats the various data
     management components as an integrated whole.

    eBay is a prime example of this approach in action, with its
     Singularity analytic platform, as well as an EDW and Hadoop.




Structured SQL analysis    Semi-structured SQL                            Unstructured analysis

Defining “Analytic Platform”
 Enterprises have used specialist data marts/warehouses for many
    years for departmental/application-specific use-cases.

 Analytic platforms are designed to enable different analytic
    approaches that complement traditional EDW workloads.

   Large data volumes
   Raw/close-to-raw data
   Multiple dimensions
   Complex variables
   Near real-time requirements
   Columnar storage
   SQL, user-defined functions
   MapReduce
   In-database analytics
   Flexible schema
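Columnar storage, one of the characteristics listed above, can be illustrated with a toy sketch. This is a generic illustration under stated assumptions, not how InfiniDB or any other product lays out data; the column names (region, bytes_used, duration_s) and values are hypothetical. It shows why an aggregate over a single column needs to scan less data when each column is stored contiguously.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# All names and values are hypothetical illustration data.

rows = [
    {"region": "north", "bytes_used": 120, "duration_s": 30},
    {"region": "south", "bytes_used": 300, "duration_s": 45},
    {"region": "north", "bytes_used": 80, "duration_s": 10},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_store(records, column):
    """Row layout: every whole record is visited to read one field."""
    return sum(record[column] for record in records)


def sum_column_store(columns, column):
    """Column layout: only the one requested column is scanned."""
    return sum(columns[column])


columns = to_columns(rows)
total_from_rows = sum_row_store(rows, "bytes_used")
total_from_columns = sum_column_store(columns, "bytes_used")
```

Both functions return the same total; the difference is how much data each one has to visit to get there.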



Flexible schema
 Apply structural patterns as the data is analyzed, rather than when
  it is loaded into the database.

 Schema on write:  Application → Schema → Data storage → Results
 Schema on read:   Application → Data storage → Schema → Results

 (Diagram; each flow also carries a "Query" label.)
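The difference between the two flows can be sketched in a few lines. This is a minimal illustration, assuming hypothetical JSON event records and field names (user, action, ms, campaign); it is not taken from the presentation. Schema on write fixes the structure when data is loaded, while schema on read keeps the raw records and applies a structure per query.

```python
import json

# Hypothetical raw events; field names are illustrative only.
raw_events = [
    '{"user": "a1", "action": "click", "ms": 120}',
    '{"user": "b2", "action": "view"}',
    '{"user": "a1", "action": "click", "ms": 95, "campaign": "spring"}',
]

WRITE_SCHEMA = ("user", "action", "ms")


def load_schema_on_write(lines):
    """Schema on write: shape each record at load time.

    Fields outside the fixed schema are dropped, and missing
    fields are stored as None.
    """
    table = []
    for line in lines:
        record = json.loads(line)
        table.append(tuple(record.get(field) for field in WRITE_SCHEMA))
    return table


def query_schema_on_read(lines, fields):
    """Schema on read: keep raw lines, apply structure per query."""
    for line in lines:
        record = json.loads(line)
        yield tuple(record.get(field) for field in fields)


# Load once with a fixed structure; "campaign" is lost.
table = load_schema_on_write(raw_events)

# Decide the structure at query time; "campaign" is still available.
campaign_view = list(query_schema_on_read(raw_events, ("user", "campaign")))
```

The trade-off in this sketch is that the schema-on-read path re-parses the raw data on every query, which is the price paid for being able to ask about fields nobody planned for at load time.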




“Exploratory Analytic Platform”
 The need for EAPs is not necessarily driven by the choice of storage
  platform (e.g., Hadoop or analytic database) or query language
  (e.g., SQL or MapReduce).

 Instead it is driven by the nature of the query or workload, or the
  skills and tools employed by the person interacting with the data.

 While data analysts are analyzing data to find answers to existing
  questions, data scientists are exploring patterns in data to prompt
  new questions.

 E.g. customer analysis, interactive marketing, targeted advertising,
  churn analysis, sentiment analysis, fraud analysis.

 An EAP should be flexible enough to enable the use of multiple
  techniques to support exploratory analysis.


EAP in larger Total Data landscape
 EDW retains core role for stable schema and structured SQL analytics
  on ERP, CRM apps etc.

 Hadoop for storage and processing of raw data, analysis of
  unstructured, schemaless data.

 EAP for flexible, exploratory analytics on rapidly updated data with
  evolving schema.


The Spectrum of Analytic Approaches

 Integration enables a ‘total data’ approach that treats the various
  platforms as points on a spectrum depending on the rigidity and
  importance of schema, rather than individual silos.








              Calpont InfiniDB
              • Columnar MPP
              • Vertical and horizontal range partitioning
              • Integrated MapReduce
              • Distributed user-defined functions

Considerations for Deploying an Analytic Platform
 Scalability – the ability to handle large volumes of data and expand
  as data volumes grow

 Performance – high performance processing is required to deliver
  rapid results

 Efficiency – in-database analytics approaches that take the query to
  the data

 Flexibility – no reliance on restrictive schema to deliver the desired
  performance

 Variability – support for multiple query approaches and advanced
  functions to enable exploratory analysis



Calpont Corporation
Calpont Mission
 To provide a highly scalable data platform that enables analytic
 business decisions as timely as customers and markets dictate.

• Software Company

• High-Performance / High-Availability (HA) Analytic Data Platform

• Dallas HQ, Silicon Valley

• Partners in North America, Europe, Japan

• Online Media, Digital Networks, Telco
What is InfiniDB?



 Simple, Powerful Platform for Big Data Analytics

 • Columnar Performance Efficiency
 • Widely used MySQL Interface
 • MPP, MapReduce style Query Execution
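As a rough illustration of what "MPP, MapReduce style" query execution means in general, the sketch below aggregates each partition independently and then merges the partial results. It is a toy under stated assumptions, not a description of InfiniDB internals: the partitions, the (service, bytes_used) records, and the use of threads as stand-ins for worker nodes are all hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data, already split into partitions as it might be
# across worker nodes. Each item is (service, bytes_used).
partitions = [
    [("video", 500), ("web", 120), ("video", 300)],
    [("web", 80), ("voice", 40)],
    [("video", 200), ("voice", 60), ("web", 100)],
]


def map_partition(partition):
    """Map step: each worker aggregates only its own partition."""
    partial = Counter()
    for service, bytes_used in partition:
        partial[service] += bytes_used
    return partial


def reduce_partials(partials):
    """Reduce step: merge the per-partition partial sums."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


# Threads stand in for separate nodes in this sketch.
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(map_partition, partitions))

totals = reduce_partials(partials)
```

The result is what a SQL query of the form `SELECT service, SUM(bytes_used) ... GROUP BY service` would return, computed without any worker needing to see another worker's rows.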

Benefits of InfiniDB



                     Real-time, Consistent Query Performance

                     Linear Scale for Massive Data

                     Removes Limits to Dimensions and Granularity

                     Easy to Deploy and Maintain


InfiniDB Analytic Platform – DW and Exploration
 [Diagram, four columns]

 Analytic Needs
 • Dimensional Analytics
 • Data Discovery
 • Predictive Analytics

 Analytic Platform
 • Data Warehouse
 • Analytic Data Store

 Data Integration
 • ETL
 • Hadoop
 • MDM
 • Direct Load Model

 Big Data Sources
 • Transactional
 • Operational
 • Legacy RDBMS
InfiniDB - Telecommunications
Telecommunications Market Challenges
 [Chart: Global Mobile Voice and Data Revenues/ARPU, 2007-2013.
  Axis: US $ Millions per Year. Series: Voice Revenue, Data Revenue,
  Total ARPU. Source: Informa Telecoms & Media]

 Macro Drivers:
 • Subscriber growth declining
 • ARPU declining
 • Revenue Growth vs. Cost to Carry

 Do carriers:
 • Attempt to control costs via throttling, etc.?
 • Increase revenue through monetization strategies?


The Telco Gold Mine
 Telco data is rich. Can it be fully leveraged?

 Quality
 • Meets CSP expectations?
 • Meets Subscriber expectations?

 Usage
 • What applications/services?
 • How much, how long, etc.

 Location
 • Where are they?
 • Movement patterns, etc.

 Data Sources
 • Element feeds
 • Probe feeds
 • Device agents
 • Log files
 • Care data


Challenge? or Opportunity?
  Multi-Dimensional Analysis

  [Diagram: five dimensions arranged around a central "Linkage?"]
  Dimensions: service, application, network, customer, kpi




Telco Success
         Representative data from Customer Experience (CEM) analytics:

                                     Legacy       InfiniDB     Improvement
         # of DRs                    15 billion   15 billion   n/a
         Database size               4 TB         < 1 TB       (75%)
         Load rates                  30K/sec      > 120K/sec   400%
         Typical analytics query     300 sec.     5 sec.       (98%)


       Benefits
        Game-changer for storage of and access to non-aggregated data
        Near linear scale out performance
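The Improvement column on this slide appears to mix two conventions: the parenthesized figures read as percentage reductions, while the load-rate figure reads as the new value expressed as a percentage of the old one. The sketch below reproduces the numbers under that reading. It uses the boundary values from the table, treating "< 1 TB" and "> 120K/sec" as exactly 1 TB and 120K/sec, which is an assumption; the function names are illustrative.

```python
def percent_reduction(before, after):
    """Percentage by which a value shrank relative to its start."""
    return (before - after) / before * 100


def ratio_percent(before, after):
    """New value expressed as a percentage of the old value."""
    return after / before * 100


# Boundary values taken from the table; "< 1 TB" and "> 120K/sec"
# are treated as exactly 1 TB and 120K/sec (an assumption).
size_reduction = percent_reduction(4, 1)        # TB
load_ratio = ratio_percent(30_000, 120_000)     # records/sec
query_reduction = percent_reduction(300, 5)     # seconds
query_speedup = 300 / 5                         # times faster
```

Read this way, the 400% load-rate figure is a 4x ratio (a 300% increase), and the 98% query-time reduction corresponds to a 60x speedup.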




InfiniDB - Online Advertising
Online Advertising – Market Challenges

  • Advertising Analytics (≠ Web Analytics)
          o Interactions and performance of ads on other sites
          o Attribution analysis - ad optimization, efficient targeting,
            and return on ad spend
  • Challenges
          o Massive daily data consumption – “Billions Served”
          o Ad targeting is not real-time with traditional data tech
          o Attribution analytics effectiveness




               Wide Dimensionality                    Granularity

Mobile Advertising – Analytic Data Environment
 [Diagram]

 Info Sources
 • Location Ads
 • Free WiFi Ad Share
 • WiFi Captive Display
 • App Embedded Ads

 Flow: Source Data → ETL → Analytic Platform → BI / Analytic Front End

 Special Needs
 • Latitudinal / Longitudinal Geospatial Functions
 • Military Grid Ref System (MGRS) Functions
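As one example of what a latitude/longitude geospatial function might compute for location-based advertising, the sketch below uses the standard haversine formula for great-circle distance. It is a generic illustration, not an InfiniDB function; the function names, the mean Earth radius of 6371 km, and the idea of matching a device to an ad by radius are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(ad_location, device_location, radius_km):
    """True if the device is within radius_km of the ad location."""
    return haversine_km(*ad_location, *device_location) <= radius_km
```

A call such as `within_radius((32.78, -96.80), (32.79, -96.80), 2.0)` checks whether two hypothetical points about one kilometre apart fall inside a 2 km targeting radius.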


                                            Non-Calpont product names are trademarks of their respective owners

Online Advertising Success

         Location-based Mobile Advertiser Funnels Big Data Insights

                                     Legacy                  InfiniDB      Improvement
         # of DRs                    300 million             300 million   n/a
         Database size               > 6 TB                  3 TB          (50%)
         Load rates                  100K/sec                1M+/sec       1000%
         Typical analytics query     20-30 min with cubes    15 sec.       (99.2%)

       Benefits
        Real-time analytics about niche segments
        Simple MySQL interface for easy use of Hadoop ETL extracts
        “Mobile Audience Insights” for segment affinity and
         engagement strategies

       [Figure: Mobile Audience Insights Report]

Key Takeaways
  A spectrum of analytic platforms addresses structured and
   unstructured needs, complementing the traditional EDW
  Proper choice of an analytics platform should depend on rigidity
   and importance of schema, as well as skills and tools of users
  InfiniDB is a scalable MPP columnar platform supporting
   exploratory analytics for structured data
  Calpont is helping partners create transformational solutions in
   Telco Customer Experience and Online Advertising




More Info on 451 Research and Calpont

   Matt Aslett
   451 Research
   www.451research.com
   @maslett @451research
   451 examines trends behind Big Data and the Total Data management
   approach

   Bob Wilkinson
   Calpont Corporation
   www.calpont.com
   @Calpont, @InfiniDB
   Calpont discusses why Big Data in online marketing needs modern data
   technology

Analytic Platforms in the Real World with 451Research and Calpont_July 2012

  • 1. Calpont InfiniDB® Accelerating Data Insights ® Where the Rubber Meets the Road – Analytic Platforms in the Real World Featuring Matt Aslett, 451Research July 18, 2012
  • 2. Today’s Presenters Matt Aslett • Research Manager, Data Management and Analytics • With 451 Research since 2007 • www.twitter.com/maslett Information Management Commercial Adoption of Open Source  Operational databases (CAOS)  Data warehousing  Open source projects  Data caching  Adoption of open source software  Event processing  Vendor strategies InfiniDB® Scalable. Fast. Simple. 2 © 2012 Calpont. All Rights Reserved.
  • 3. Today’s Presenters Bob Wilkinson • Calpont Vice President of Engineering • Formerly CTO for Tektronix Communications • 16 years of product development • Responsible for design, development, and support of InfiniDB ® InfiniDB® Scalable. Fast. Simple. 3 © 2012 Calpont. All Rights Reserved.
  • 4. Today’s Discussion • Matt Aslett o Total Data and the Rise of the Analytic Platform o Analytic Platforms in the Big Data ecosystem o Defining the Analytic Platform • Bob Wilkinson o InfiniDB Analytic Platform o InfiniDB in Action • Telecommunications • Online Advertising • Summary and Q&A InfiniDB® Scalable. Fast. Simple. 4 © 2012 Calpont. All Rights Reserved.
  • 5. Overview The rise of the analytic platform  What and why The analytic platform’s place in the ‘big data’ ecosystem  Where and when The key characteristics of an analytic platform  How and which 5 © 2012 by The 451 Group. All rights reserved
  • 6. The 451 Group 6 © 2012 by The 451 Group. All rights reserved
  • 7. Big Data – Implications for Data Management  “Big data” - realization of greater business intelligence by storing, processing and analyzing data that was previously ignored due to the limitations of traditional data management technologies to handle its volume, velocity and/or variety. Volume Velocity Variety The volume of data The data is being The data lacks the is too large for produced at a rate structure to make it traditional database that is beyond the suitable for storage software tools to performance limits and analysis in cope with of traditional traditional databases systems and data warehouses © 2012 by The 451 Group. All rights reserved
  • 8. Total Data - Beyond ‘Big Data’  The adoption of non-traditional data processing technologies is driven not just by the nature of the data, but also by the user’s particular data processing requirements. Totality Exploration Frequency Dependency The desire to process The interest in The desire to The reliance on and analyze data in exploratory analytic increase the rate of existing technologies its entirety, rather approaches, in which analysis in order to and skills, and the than analyzing a schema is defined in generate more need to balance sample of data and response to the accurate and timely investment in those extrapolating the nature of the query. business intelligence. existing technologies results. and skills with the adoption of new techniques. © 2012 by The 451 Group. All rights reserved
  • 9. Beyond the limitations of traditional data warehousing  The EDW is supposed to be a single source of the ‘truth’ and avoid data silos.  One of the most significant inefficiencies of data warehousing is that users have traditionally had to design their data-warehouse models to match their planned queries.  This approach is too rigid in a world of rapidly changing business requirements and real-time decision-making  And its inflexibility serves to encourage the growth of data silos and the exact redundancy and duplication issues the EDW was apparently designed to avoid.  A business analyst or executive unable to get the answers to queries they require from the EDW is likely to find their own ways to answer these queries. © 2012 by The 451 Group. All rights reserved
  • 10. The Rise of Specialist Platforms
      o The alternative is to embrace dispersed data, adopting not silos but specialist data platforms that complement the EDW.
      o ‘Total Data’ describes an approach that treats the various data management components as an integrated whole.
      o eBay is a prime example of this approach in action, with its Singularity analytic platform, as well as an EDW and Hadoop.
      [Diagram labels: Structured SQL analysis; Semi-structured SQL; Unstructured analysis]
  • 11. Defining “Analytic Platform”
      o Enterprises have used specialist data marts/warehouses for many years for departmental/application-specific use-cases.
      o Analytic platforms are designed to enable different analytic approaches that complement traditional EDW workloads.
          • Large data volumes
          • Raw/close-to-raw data
          • Multiple dimensions
          • Complex variables
          • Near real-time requirements
          • Columnar storage
          • SQL, user-defined functions
          • MapReduce
          • In-database analytics
          • Flexible schema
  • 12. Flexible schema
      o Apply structural patterns as the data is analyzed, rather than when it is loaded into the database.
      [Diagram: schema on write (Query, Application, Schema, Data storage, Results) versus schema on read (Query, Application, Data storage, Schema, Results)]
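The contrast between applying structure at load time and applying it at analysis time can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how InfiniDB or any product on these slides works: the event records, field names, and function names are all hypothetical.

```python
import json

# Hypothetical raw events; the field names are invented for illustration.
RAW_EVENTS = [
    '{"user": "a", "bytes": 120, "app": "video"}',
    '{"user": "b", "bytes": 80}',
    '{"user": "a", "bytes": 40, "app": "mail", "cell": "c17"}',
]

# Schema on write: the column list is fixed before any data is loaded.
WRITE_SCHEMA = ("user", "bytes")


def load_schema_on_write(lines):
    """Parse each record and conform it to WRITE_SCHEMA as it is stored."""
    table = []
    for line in lines:
        record = json.loads(line)
        # Fields outside the schema (app, cell) are discarded at load time.
        table.append(tuple(record[col] for col in WRITE_SCHEMA))
    return table


def load_schema_on_read(lines):
    """Store the raw text untouched; no structure is imposed yet."""
    return list(lines)


def query_schema_on_read(store, fields):
    """Apply whatever structure the query needs at analysis time."""
    for line in store:
        record = json.loads(line)
        # Missing fields become None instead of failing the load.
        yield tuple(record.get(field) for field in fields)


written = load_schema_on_write(RAW_EVENTS)
raw_store = load_schema_on_read(RAW_EVENTS)

# A question nobody planned for at load time: bytes per application.
bytes_by_app = {}
for app, nbytes in query_schema_on_read(raw_store, ("app", "bytes")):
    bytes_by_app[app] = bytes_by_app.get(app, 0) + nbytes
```

In the schema-on-write table the `app` field is gone, so the per-application question cannot be answered without reloading the data. The schema-on-read store can answer it, at the cost of parsing the raw records on every query.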
  • 13. “Exploratory Analytic Platform”
      o The need for EAPs is not necessarily driven by the choice of storage platform (e.g., Hadoop or analytic database) or query language (e.g., SQL or MapReduce).
      o Instead it is driven by the nature of the query or workload, or the skills and tools employed by the person interacting with the data.
      o While data analysts are analyzing data to find answers to existing questions, data scientists are exploring patterns in data to prompt new questions.
      o E.g. customer analysis, interactive marketing, targeted advertising, churn analysis, sentiment analysis, fraud analysis.
      o An EAP should be flexible enough to enable the use of multiple techniques to support exploratory analysis.
  • 14. EAP in larger Total Data landscape
      o EDW retains core role for stable schema and structured SQL analytics on ERP, CRM apps etc.
      o Hadoop for storage and processing of raw data, analysis of unstructured, schemaless data.
      o EAP for flexible, exploratory analytics on rapidly updated data with evolving schema.
  • 17. The Spectrum of Analytic Approaches
      o Integration enables a ‘total data’ approach that treats the various platforms as points on a spectrum depending on the rigidity and importance of schema, rather than individual silos.
      o Calpont InfiniDB
          • Columnar MPP
          • Vertical and horizontal range partitioning
          • Integrated MapReduce
          • Distributed user-defined functions
  • 18. Considerations for Deploying an Analytic Platform
      o Scalability: the ability to handle large volumes of data and expand as data volumes grow
      o Performance: high performance processing is required to deliver rapid results
      o Efficiency: in-database analytics approaches that take the query to the data
      o Flexibility: no reliance on restrictive schema to deliver the desired performance
      o Variability: support for multiple query approaches and advanced functions to enable exploratory analysis
  • 19. Calpont Corporation
      o Calpont Mission: To provide a highly scalable data platform that enables analytic business decisions as timely as customers and markets dictate.
      o Software Company
      o High Perf/HA Analytic Data Platform
      o Dallas HQ, Silicon Valley
      o Partners in North America, Europe, Japan
      o Online Media, Digital Networks, Telco
  • 20. What is InfiniDB?
      o Simple, Powerful Platform for Big Data Analytics
      o Columnar Performance Efficiency
      o Widely used MySQL Interface
      o MPP, MapReduce style Query Execution
  • 21. Benefits of InfiniDB
      o Real-time, Consistent Query Performance
      o Linear Scale for Massive Data
      o Removes Limits to Dimensions and Granularity
      o Easy to Deploy and Maintain
  • 22. InfiniDB Analytic Platform – DW and Exploration
      [Diagram with column headings Analytic Needs, Analytic Platform, Data Integration, Big Data Sources. Other labels: Data Warehouse, ETL, Transactional, Dimensional Analytics, Hadoop, Operational, Analytic Data Store, MDM, Data Discovery, Legacy, Direct Load, Model, RDBMS, Predictive Analytics]
  • 24. Telecommunications Market Challenges
      o Macro Drivers:
          • Subscriber growth declining
          • ARPU declining
          • Revenue growth vs. cost to carry
      o Do carriers:
          • Attempt to control costs via throttling, etc.?
          • Increase revenue through monetization strategies?
      [Chart: Global Mobile Voice and Data Revenues/ARPU, 2007-2013. Series: Voice Revenue, Data Revenue, Total ARPU. Axis: US $ Millions per Year. Source: Informa Telecoms & Media]
  • 25. The Telco Gold Mine
      o Telco data is rich. Can it be fully leveraged?
      o Data Sources
          • Element feeds
          • Probe feeds
          • Device agents
          • Log files
          • Care data
      o Quality
          • Meets CSP expectations?
          • Meets Subscriber expectations?
      o Usage
          • What applications/services?
          • How much, how long, etc.
      o Location
          • Where are they?
          • Movement patterns, etc.
  • 26. Challenge? or Opportunity?
      o Multi-Dimensional Analysis
      o Dimensions: service, application, network, customer, kpi
      o Linkage?
  • 27. Telco Success
      Representative data from Customer Experience (CEM) analytics:

                                  Legacy        InfiniDB      Improvement
      # of DRs                    15 billion    15 billion    n/a
      Database size               4 TB          < 1 TB        (75%)
      Load rates                  30k/sec       >120K/sec     400%
      Typical analytics query     300 sec.      5 sec.        (98%)

      Benefits
      o Game-changer for storage of and access to non-aggregated data
      o Near linear scale out performance
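The Improvement column above can be cross-checked with simple arithmetic. The sketch below rests on two assumptions that the slide does not state: parenthesized figures are read as percentage reductions, and the load-rate figure is read as the new rate expressed as a percentage of the old one. Because the slide gives "< 1 TB" and ">120K/sec" as bounds, the boundary values are used, and the helper names are hypothetical.

```python
def reduction_pct(before, after):
    """Percentage by which a quantity shrank relative to its old value."""
    return 100.0 * (before - after) / before


def ratio_pct(before, after):
    """New value expressed as a percentage of the old value."""
    return 100.0 * after / before


# Database size: 4 TB legacy vs. the 1 TB upper bound for InfiniDB.
size_cut = reduction_pct(4.0, 1.0)

# Load rate: 30k/sec legacy vs. the 120k/sec lower bound for InfiniDB.
load_ratio = ratio_pct(30_000, 120_000)

# Typical analytics query: 300 seconds legacy vs. 5 seconds.
query_cut = reduction_pct(300.0, 5.0)

print(f"size reduction:  {size_cut:.1f}%")
print(f"load rate ratio: {load_ratio:.0f}%")
print(f"query reduction: {query_cut:.1f}%")
```

Under that reading the table's 75%, 400%, and 98% figures are internally consistent. The 400% is a fourfold rate, which is a 300% increase over the legacy rate.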
  • 28. InfiniDB - Online Advertising
  • 29. Online Advertising – Market Challenges
      o Advertising Analytics (≠ Web Analytics)
          • Interactions and performance of ads on other sites
          • Attribution analysis: ad optimization, efficient targeting, and return on ad spend
      o Challenges
          • Massive daily data consumption: “Billions Served”
          • Ad targeting is not real-time with traditional data tech
          • Attribution analytics effectiveness
      Wide Dimensionality / Granularity
  • 30. Mobile Advertising – Analytic Data Environment
      [Diagram labels: Info Sources (Location Ads, Free WiFi Ad Share, WiFi Captive Display, App Embedded Ads); Source Data; ETL; Analytic Platform; BI / Analytic Front End]
      Special Needs
      o Latitudinal / Longitudinal Geospatial Functions
      o Military Grid Ref System (MGRS) Functions
      Non-Calpont product names are trademarks of their respective owners
  • 31. Online Advertising Success
      Location-based Mobile Advertiser Funnels Big Data Insights

                                  Legacy                  InfiniDB       Improvement
      # of DRs                    300 Million             300 Million    n/a
      Database size               >6 TB                   3 TB           (50%)
      Load rates                  100k/sec                1M+/sec        1000%
      Typical analytics query     20-30 min with cubes    15 sec.        (99.2%)

      Benefits
      o Real-time analytics about niche segments
      o Simple MySQL interface for easy use of Hadoop ETL extracts
      o “Mobile Audience Insights” for segment affinity and engagement strategies
      [Image: Mobile Audience Insights Report]
  • 32. Key Takeaways
      o A spectrum of analytic platforms addresses structured and unstructured needs that complement the traditional EDW
      o Proper choice of an analytics platform should depend on the rigidity and importance of schema, as well as the skills and tools of users
      o InfiniDB is a scalable MPP columnar platform supporting exploratory analytics for structured data
      o Calpont is helping partners create transformational solutions in Telco Customer Experience and Online Advertising
  • 33. More Info on 451 Research and Calpont
      o Matt Aslett, 451 Research
          • www.451research.com
          • @maslett, @451research
          • 451 examines trends behind Big Data and the Total Data management approach
      o Bob Wilkinson, Calpont Corporation
          • www.calpont.com
          • @Calpont, @InfiniDB
          • Calpont discusses why Big Data in online marketing needs modern data technology