The Elephant in the Room




                                   Jim O’Neil
               Developer Evangelist, Microsoft
         jim.oneil@microsoft.com  @jimoneil
DIY
Roll your own Hadoop cluster… welcome to DevOps
  – “Isotope”
  – Pallet
Appliances
Oracle Big Data Appliance
  – 18 servers / 12 cores each / 40 Gb/s InfiniBand
  – Partnering with Cloudera on the distribution
Greenplum HD Data Computing Appliance
  – 18 nodes, 12 cores each
  – Straight up Apache Hadoop
NetApp Open Solution for Hadoop
  – Storage arrays only (E2660 and FAS2040)
  – Partnership with Cloudera
The Elephant in the Cloud
Cloud: a Notional Definition
Deployment Models
  – Private Cloud
  – Hybrid Cloud
  – Community Cloud
  – Public Cloud
Service Models
  – Infrastructure as a Service
  – Platform as a Service
  – Software as a Service
Essential Characteristics
  – Broad network access
  – Rapid elasticity
  – Resource pooling
  – On-demand self-service
  – Measured service
Hadoop in the Cloud
Google App Engine
     appengine-mapreduce API   (not really Hadoop)



Amazon Web Services
     66 Public AMIs (including Cloudera)
     Elastic MapReduce
Windows Azure
     Hadoop on Azure
IBM SmartCloud
     InfoSphere BigInsights
Google App Engine
         MapreducePipeline Class

Experimental!

Mapreduce is an experimental, innovative, and rapidly
changing new feature for App Engine.
Unfortunately, being on the bleeding edge means that
we may make backwards-incompatible changes to
Mapreduce. We will inform the community when this
feature is no longer experimental.
Amazon EMR
Windows Azure
      http://HadoopOnAzure.com

Currently in Community Technology Preview (CTP)
Partnership with Hortonworks
     Windows updates to Apache
     JavaScript framework
     Hive ODBC connector
IBM SmartCloud
InfoSphere BigInsights
     IBM distribution of Hadoop (0.20.2)
     Jaql query language
     BigSheets
     BigInsights Scheduler
     “Hadoop ecosystem”
           Hive, Avro, HBase, Pig, Oozie, Flume
I meant what I said, and I said what I meant.
An elephant's faithful, one hundred percent.





