The Road for Enterprise Data
From Traditional BI to Big Data

@LynnLangit
Practitioner, Author, Instructor
BI = ‘Current State’ Questions



Collecting transactional data:
• What did we sell?
• When did we sell it?
• Where did we sell it?
• What did we sell with it?
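The last question, "What did we sell with it?", is essentially a co-occurrence count over transactions. A minimal sketch, assuming a small in-memory list of baskets (the item names and the `pair_counts` helper are made up for illustration, not taken from the talk):

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each inner list is one basket of items sold together.
transactions = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "jam"],
]

def pair_counts(baskets):
    """Count how often each unordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicates and gives every pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

counts = pair_counts(transactions)
# Show the pairs that were sold together most often
print(counts.most_common(2))
```

A real BI stack would typically answer this with a query or a data mining model over the transactional store; the sketch only shows the shape of the question.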
BTW…Do you use Data Mining?
BI Data Landscape

• Storage
• Processing
• Query
• Presentation
Mix-in #1 -- the Cloud and…
• Host Data in the Cloud
• Process & Query Data in the Cloud
  – Click to query and (data) mine
  – Return the data locally
  – Use Self-service BI visualizers
• Mash-up Cloud data
  – Combine with local data
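A mash-up of this kind usually comes down to joining a cloud-hosted dataset with local data on a shared key. A rough sketch, assuming both datasets have already been loaded into dictionaries keyed by postal code (the data and the `sales_per_capita` helper are hypothetical):

```python
# Hypothetical local data: sales totals by postal code.
local_sales = {"98101": 1200.0, "98052": 800.0, "10001": 450.0}

# Hypothetical cloud-hosted reference data, as it might look after download.
cloud_population = {"98101": 12000, "98052": 60000}

def sales_per_capita(sales, population):
    """Join the two datasets on postal code; skip codes missing from either side."""
    return {
        code: sales[code] / population[code]
        for code in sales
        if code in population and population[code] > 0
    }

result = sales_per_capita(local_sales, cloud_population)
for code, value in sorted(result.items()):
    print(code, round(value, 4))
```

Postal code "10001" is dropped because the assumed cloud dataset has no entry for it, which is the usual behavior of an inner join.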
NoSQL and the Cloud
• The Elephant in the room…Hadoop
• 120+ types of NoSQL databases
  – http://nosql-database.org/
Can’t We All Play Together?
Data in the Cloud - Microsoft
Windows Azure DataMarket
Amazon AWS
Google App Engine Data
New on Google – MySQL++
Comparing RDBMS and MapReduce
  Reference: Tom White’s Hadoop: The Definitive Guide


                           Traditional RDBMS            MapReduce
Data Size                  Gigabytes (Terabytes)        Petabytes (Exabytes)
Access                     Interactive and Batch        Batch
Updates                    Read / Write many times      Write once, Read many times
Structure                  Static Schema                Dynamic Schema
Integrity                  High (ACID)                  Low
Scaling                    Nonlinear                    Linear
DBA Ratio                  1:40                         1:3000
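To make the MapReduce column concrete, here is a single-process sketch of the map, shuffle, and reduce phases, roughly what a SQL `GROUP BY region` with `SUM(amount)` would do in an RDBMS. It is plain Python, not Hadoop code; the records and function names are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical write-once sales records: (region, amount).
records = [("west", 10.0), ("east", 5.0), ("west", 2.5), ("east", 1.0)]

def map_phase(record):
    """Emit one (key, value) pair per input record."""
    region, amount = record
    yield region, amount

def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Collapse the values for one key into a single result."""
    return key, sum(values)

def run_job(input_records):
    mapped = (pair for record in input_records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

totals = run_job(records)
print(totals)
```

Because each map call looks at one record and each reduce call at one key, the work can be spread across many machines, which is where the "Linear" scaling entry in the table comes from. The job reads its input in one batch pass, matching the "Batch" and "Write once, Read many times" rows.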
BTW…NoSQL cloud storage is 50x CHEAPER than relational
BigData = ‘Next State’ Questions



Collecting behavioral data:
• What could happen?
• Why didn’t this happen?
• When will the next new thing happen?
• What will the next new thing be?
Splunk
Mining Log Files
Presenting the results
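Mining log files for behavioral data generally means parsing semi-structured lines and aggregating over them. The sketch below is not Splunk's search language or API; it is a plain-Python illustration that assumes a simplified web access-log layout and made-up sample lines.

```python
import re
from collections import Counter

# Hypothetical log lines in a simplified access-log layout.
LOG_LINES = [
    '10.0.0.1 - - [07/Dec/2011:10:00:00 +0000] "GET /products HTTP/1.1" 200 512',
    '10.0.0.2 - - [07/Dec/2011:10:00:01 +0000] "GET /cart HTTP/1.1" 500 0',
    '10.0.0.1 - - [07/Dec/2011:10:00:02 +0000] "GET /products HTTP/1.1" 200 498',
    'malformed line',
]

# Capture the request method, path, and three-digit status code.
LINE_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def summarize(lines):
    """Count requests per status code and per path, skipping unparseable lines."""
    status_counts = Counter()
    path_counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # tolerate noise instead of failing the whole run
        status_counts[match.group("status")] += 1
        path_counts[match.group("path")] += 1
    return status_counts, path_counts

status_counts, path_counts = summarize(LOG_LINES)
print(status_counts.most_common())
print(path_counts.most_common(1))
```

The two counters are the kind of result that would then be handed to a chart or dashboard for presentation.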
Freebase
Mix-in #2 - Data Scientists




• Who asks the ‘right’ questions now?
• Who understands the languages?
• Who can understand the results?
Is Data Science your next Career?
Becoming a Data Scientist
• Conferences
  – Strata
  – Data Scientist Summit
  – CloudCamps
• Practice
  – here
Mix-in #3 - Presentation




• New Devices – iPad, Kindle Fire
• New User Experiences – touch, Kinect
• EVERYTHING on the phone
HortonWorks, Cloudera…
Karmasphere Studio for Amazon Elastic MapReduce
More PowerPivot
Cloud-based Data Mining
Predixion
QlikView
QlikView on iPad
BI > BigData ‘To Do’ List
Store some (more) data on the cloud
• Relational and non-relational
• Transaction AND Behavioral
Process some data in the cloud
• Try data mining
• Learn about Data Science
Update your client tools
• New UI (touch, gestures)
• Click to Query
• New form factors (phone, tablet)
Hadoop Connector to Excel - Demo
www.TeachingKidsProgramming.org
• Do a Recipe → Teach a Kid (Ages 10 ++)
• Microsoft SmallBasic → Free Courseware (recipes)
Keep up with Big Data

Follow me @LynnLangit
RSS my blog: www.LynnLangit.com

Hire me
• To help build your BI/Big Data solution
• To teach your team next gen BI

Strata Online_road_to_enterprise_data_2011

Editor's Notes

  • #2: How to Get There from Here: The Road to Enterprise Data. Lynn Langit (Teaching Kids Programming), Wednesday, 12/07/2011. How will companies familiar with BI and SQL gradually embrace unstructured data and noSQL models? Will this be through a “layer” of SQL emulation? Through an Excel plug-in that generates Hadoop workloads? A rethinking on the part of database vendors? Or something else entirely? In this session, cloud and data expert Lynn Langit explores the roadmap to Big Data adoption by traditional enterprise IT and corporate software developers.
  • #7: http://hadoop.apache.org/ & http://www.mongodb.org/
  • #8: http://www.microsoft.com/download/en/details.aspx?id=27584 http://www.oracle.com/us/corporate/features/feature-oracle-loader-for-hadoop-505115.html
  • #9: http://windows.azure.com
  • #10: https://datamarket.azure.com/
  • #11: http://aws.amazon.com/
  • #12: http://code.google.com/appengine/ http://code.google.com/appengine/articles/datastore/overview.html
  • #13: http://code.google.com
  • #15: http://lynnlangit.wordpress.com/2011/11/09/relational-cloud-storage-is-50x-more-expensive-than-nosql/
  • #17: http://www.splunk.com/product
  • #20: http://www.freebase.com/
  • #22: http://www.romymisra.com/the-new-job-market-rulers-data-scientists/
  • #23: http://www.quora.com/Career-Advice/How-do-I-become-a-data-scientist Slide from: http://www.huffingtonpost.com/roger-ehrenberg/data-driven-startup_b_1088124.html?ref=tw
  • #24: http://www.web-designers-directory.org/articles/top-rated-android-applications-for-2011-20.html
  • #25: http://hortonworks.com/technology/hortonworksdataplatform/ http://www.cloudera.com/
  • #26: http://www.youtube.com/watch?v=gjsMDAcI1Mo
  • #28: http://www.predixionsoftware.com/predixion/
  • #30: http://www.qlikview.com/us
  • #32: http://dennyglee.com/
  • #33: From the blog: http://www.thisisthegreenroom.com/2011/data-science-vs-business-intelligence/
  • #34: Lynn