Welcome to Hadoop World: NYC 2009
Hadoop is Everywhere
                          Presents:




Christophe Bisciglia
Founder, christophe@cloudera.com
Hadoop World Details and Event Updates
Too Late to Print
▪   WiFi Details
    ▪   SSID: HadoopWorld
    ▪   Password: hadoop09
▪   Twitter: #hadoopworld
▪   Break Out Sessions
    ▪   Applications (This Room)
    ▪   Dev / Admin: Terrace Ballroom (Across Lobby)
    ▪   Extensions: Vanderbilt Suite (One Floor Up)
▪   UI BOF
    ▪   Lead: Philip Zeyliger, Cloudera
    ▪   Vanderbilt Suite, Afternoon Break
▪   HBase BOF
    ▪   Lead: Michael Stack, Microsoft
    ▪   Terrace Ballroom, Afternoon Break
Hadoop World Sponsors
Thanks!
Why Hadoop World?
Time to Upgrade Your Data Management Strategy
▪   Hadoop isn’t just for Web Companies anymore
    ▪   Terabytes are commonplace
    ▪   Enables consumption of all enterprise data
    ▪   Wide adoption across verticals
▪   Hadoop is driven by the Community
    ▪   Most registrants are new to Hadoop
    ▪   Sharing experience is critical - and incredibly valuable
    ▪   Users and Developers exchanging needs and ideas
Growing Up with Hadoop
You’ve come a long way baby...

▪   Early Days
    ▪   2004: Google Publishes MapReduce/GFS
    ▪   2005: Hadoop Prototype
        ▪   Doug Cutting and Mike Cafarella
    ▪   2006: Hadoop Running on 20 nodes
        ▪   Internet Archive and UW
Photo: Doug Cutting (Photo Credit: New York Times)

▪   Formative Years
    ▪   2006: Yahoo! Begins Major Investment
    ▪   2007: Yahoo! Runs Hadoop on 2000 nodes
    ▪   2008: Yahoo! uses Hadoop to claim Terasort Benchmark

▪   5 Major Releases for Hadoop in the Last Year
    ▪   More Reliable
    ▪   More Scalable
    ▪   More Manageable

▪   New Sub-Projects Embrace New Users
    ▪   Hive: SQL Data Warehouse for Hadoop
    ▪   Pig: Data Analysis Language

▪   Sqoop: Database import for Hadoop
    ▪   Developed by Aaron Kimball, Cloudera
    ▪   Works over JDBC
    ▪   Extensible for better performance

▪   RDBMS Vendors Embrace Hadoop
    ▪   MapReduce is great for Analytics
    ▪   Hadoop is the MapReduce Standard
    ▪           integrates directly with Hadoop

▪   Adoption Spanning the Globe
    ▪   Hadoop User Groups (HUGs) outside the US
    ▪   Over 10x Companies “PoweredBy”
    ▪   Not Just for Web Companies Anymore
Cloudera’s Distribution for Hadoop
Delivering Hadoop to a Larger Community
▪   Diagram (built up over several slides): Hadoop Community and the Distribution for Hadoop
    ▪   Latest Stable Hadoop Release
    ▪   Stable Upcoming Features (by customer request)
    ▪   Source Code Powering Y!
    ▪   Improvements for EC2 and S3
    ▪   New Features from Cloudera
▪   Contributed to Apache
    ▪   Cloudera Enhancements
    ▪   Bug Fixes
▪   Distribution for Hadoop
    ▪   Cross-Platform Packaging, Integration and Testing
    ▪   Hive, Pig, Sqoop, ...
    ▪   Support
▪   Private Cloud: Packages
▪   Public Cloud: Images
Comparing Growth Rates since March 2009
Standard Packaging Drives Adoption

▪   Consistent Downloads from Apache
▪   Cloudera Packages Drive New Usage
▪   Enables New Hadoop Applications

Chart (bars in order from March 2009 through Sept 09; axis labels: March 2009, May 2009, July 09, Aug 09, Sept 09)
▪   Cloudera Downloads: 100%, 238%, 384%, 762%, 1,026%, 1,392%, 1,835%
▪   Apache Downloads: 100%, 96%, 95%, 133%, 93%, 97%, 95%

Normalized by unique users accessing hadoop.apache.org/core/releases.html and Cloudera Package Repositories in March 2009
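
As a rough illustration of the normalization described above, the sketch below expresses each value in a series as a percentage of its first (March 2009) value, then reads the overall growth off the percentages printed on the slide. The raw counts are hypothetical and invented only to show the arithmetic; the slide does not publish raw user counts. The function names are also assumptions made for this example.

```python
# Percentages as printed on the slide, each relative to March 2009 (= 100%).
cloudera_pct = [100, 238, 384, 762, 1026, 1392, 1835]
apache_pct = [100, 96, 95, 133, 93, 97, 95]


def normalize(counts):
    """Express each count as a percentage of the first (baseline) count."""
    baseline = counts[0]
    return [round(100 * c / baseline) for c in counts]


def growth_multiple(pct_series):
    """Ratio of the last value in a series to its baseline value."""
    return pct_series[-1] / pct_series[0]


# Hypothetical raw unique-user counts (NOT from the slide).
hypothetical_counts = [200, 476, 768]
print(normalize(hypothetical_counts))

# Overall change from the first bar to the last bar in each series.
print(growth_multiple(cloudera_pct))
print(growth_multiple(apache_pct))
```

Read this way, the last Cloudera bar is roughly 18 times the March 2009 baseline, while the last Apache bar sits slightly below its own baseline.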
Cloudera’s Business to Date
Support, Training and Professional Services
▪   Dozens of Support Customers
    ▪   Using Hadoop for real enterprise workloads

▪   Training and Certification
    ▪   Hundreds of engineers trained
    ▪   Sysadmin and Manager programs launched at Hadoop World

▪   Professional Services
