1
Data & Analytics Convergence
Keith Manthey, CTO Analytics
Property of EMC. Not for further distribution
2
Digital Universe 2014
2013: 4.4 ZETTABYTES
2020: 44 ZETTABYTES
10x MORE
Source: EMC Digital Universe with Research and Analysis by IDC, The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things, April 2014.
ZETTABYTE = 1,000,000,000,000,000,000,000 bytes
34.4 Billion 32GB Smartphones =1 ZETTABYTE
34.4 Billion Samsung S5’s end-to-end would circle the Earth 121.8 times
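As a rough check of the figures above, the sketch below redoes the arithmetic. It assumes a phone length of about 142 mm and an equatorial circumference of about 40,075 km; neither value appears on the slide, and the constant names are illustrative. It also suggests that the 34.4 billion figure fits binary units (2^70 bytes and 32 GiB) more closely than the decimal definition shown.

```python
# Rough check of the slide's zettabyte comparisons (illustrative only).
ZB_DECIMAL = 10**21                  # definition shown on the slide
ZB_BINARY = 2**70                    # assumption: binary zettabyte (zebibyte)
PHONE_BYTES_DECIMAL = 32 * 10**9     # 32 GB in decimal units
PHONE_BYTES_BINARY = 32 * 2**30      # 32 GiB in binary units

PHONE_LENGTH_M = 0.142               # assumption: approximate phone length
EARTH_CIRCUMFERENCE_M = 40_075_000   # assumption: approximate equatorial circumference


def phones_per_zettabyte(zb_bytes: int, phone_bytes: int) -> float:
    """Number of phones whose combined storage equals one zettabyte."""
    return zb_bytes / phone_bytes


def earth_laps(phone_count: float) -> float:
    """How many times the phones, laid end to end, would circle the Earth."""
    return phone_count * PHONE_LENGTH_M / EARTH_CIRCUMFERENCE_M


decimal_phones = phones_per_zettabyte(ZB_DECIMAL, PHONE_BYTES_DECIMAL)
binary_phones = phones_per_zettabyte(ZB_BINARY, PHONE_BYTES_BINARY)

print(f"decimal units: {decimal_phones / 1e9:.2f} billion phones")
print(f"binary units:  {binary_phones / 1e9:.2f} billion phones")
print(f"laps around the Earth (binary count): {earth_laps(binary_phones):.1f}")
```

Under these assumptions the binary count lands near the slide's 34.4 billion, and the lap count comes out at roughly 122.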
3
4
© Copyright 2015 EMC Corporation. All rights reserved.
2020: A NEW DIGITAL WORLD
30B DEVICES
7B PEOPLE
1M+ NEW BUSINESSES
Source: Gartner Group, 2014
5
DIGITIZATION IS ALREADY BEGINNING
• PRECISION FARMING
• DRESS THAT DISPLAYS HOW WE FEEL
• CONTACT LENS THAT CONTROLS BLOOD SUGAR
• THERMOSTAT THAT KNOWS YOU’RE AWAY
• FITNESS BAND THAT MEASURES ACTIVITY LEVEL
• GLASSES THAT DIRECT US WHERE TO GO
• DRONES THAT DELIVER OUR GROCERIES
6
Many Industries Face Structural Change
7
Analytics is about Data & Outcomes…
8
Macro Market Trends
Courtesy of Wikibon
Courtesy of Infoworld
9
Reasons for Change?
SKILLS
Operations
Growth of Data Type
10
Philosophical - Database
[Diagram: Traditional DB Instance, made up of Cache, Logs, System Processes (including Logical - Catalog + Physical Structures – Reader/Writers), and Data Storage]
Traditional DB
Assumes:
• Query < 5% of Data
• Schema on Write (Structured)
• All data conforms to Schema (changes to versioned data if schema changes)
• Limited in compute methods (SQL, UDF, and R soon*)
11
Philosophical - Hadoop
[Diagram: Hadoop stack, made up of Spark and MapReduce on YARN, over HDFS (including Logical – Name Node + Physical Structures – Data Node) and Data Storage]
Hadoop
Built for:
• Query 100% of Data each time
• Schema on Read (including multiple versions over time)
• Unlimited in compute methods (SQL, Programmatic, Tools (Spark, Storm, R…))
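The contrast between Schema on Write and Schema on Read can be sketched in a few lines. The example below is a toy illustration only: the class names, the two-column schema, and the sample records are all hypothetical and do not represent any database or Hadoop API. It shows a write path that rejects records that do not match the schema, and a read path that stores raw text and applies whichever schema version the reader supplies.

```python
import json

SCHEMA = {"id": int, "name": str}  # hypothetical two-column schema


class SchemaOnWriteTable:
    """Validates every record at insert time, as a traditional DB would."""

    def __init__(self, schema):
        self.schema = schema
        self.rows = []

    def insert(self, record: dict) -> None:
        if set(record) != set(self.schema):
            raise ValueError(f"columns do not match schema: {sorted(record)}")
        for column, expected_type in self.schema.items():
            if not isinstance(record[column], expected_type):
                raise ValueError(f"{column!r} is not {expected_type.__name__}")
        self.rows.append(record)


class SchemaOnReadStore:
    """Stores raw lines untouched; a schema is applied only when reading."""

    def __init__(self):
        self.raw_lines = []

    def append(self, line: str) -> None:
        self.raw_lines.append(line)

    def read(self, columns):
        for line in self.raw_lines:
            record = json.loads(line)
            # Project each raw record onto the columns the reader asks for;
            # fields a record lacks come back as None.
            yield {column: record.get(column) for column in columns}


table = SchemaOnWriteTable(SCHEMA)
table.insert({"id": 1, "name": "sensor-a"})
try:
    table.insert({"id": 2, "name": "sensor-b", "temp": 21.5})
    rejected = False
except ValueError:
    rejected = True  # the extra "temp" column violates the schema

store = SchemaOnReadStore()
store.append('{"id": 1, "name": "sensor-a"}')
store.append('{"id": 2, "name": "sensor-b", "temp": 21.5}')
v1 = list(store.read(["id", "name"]))           # older schema version
v2 = list(store.read(["id", "name", "temp"]))   # newer schema version
```

The same raw data serves both schema versions on the read side, while the write side forces the schema to change before the new record can be stored.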
12
Comparison
[Diagram: Hadoop stack, made up of Spark and MapReduce on YARN, over HDFS (including Logical – Name Node + Physical Structures – Data Node) and Data Storage]
[Diagram: Traditional DB Instance, made up of Cache, Logs, System Processes (including Logical - Catalog + Physical Structures – Reader/Writers), and Data Storage]
SCALE UP – More CPUS/Memory
Vs
SCALE OUT – More Nodes
SCALE OUT – More Nodes
13
Convergence
Single Execution Query Across Structured and Unstructured (“push down processing”) & Hadoop Query Integration
Erasure Coding (HDFS-EC) (enterprise patterns) & Better Operational Support & Skills gaps
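One way to see why erasure coding matters for enterprise deployments is its storage overhead. The sketch below compares usable capacity under N-way replication with a Reed-Solomon layout of k data blocks and m parity blocks. The 3x replication factor and the 6+3 layout are common HDFS settings used here as assumptions; the slide does not name specific values, and the function names are illustrative.

```python
def replication_efficiency(copies: int) -> float:
    """Fraction of raw capacity holding unique data when N full copies are kept."""
    return 1 / copies


def erasure_coding_efficiency(data_blocks: int, parity_blocks: int) -> float:
    """Fraction of raw capacity holding unique data with k data + m parity blocks."""
    return data_blocks / (data_blocks + parity_blocks)


rep = replication_efficiency(3)        # assumption: 3x replication
ec = erasure_coding_efficiency(6, 3)   # assumption: Reed-Solomon 6+3

print(f"3x replication: {rep:.1%} of raw capacity is usable")
print(f"RS(6,3):        {ec:.1%} of raw capacity is usable")
```

Under these assumptions the erasure-coded layout roughly doubles the usable share of raw disk while still surviving the loss of three blocks in a stripe.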
14
What is the DB Convergence Play?
Per Microsoft, “PolyBase is a T-SQL front end that allows customers to query data stored in HDFS”
Links: Microsoft PolyBase (original); IBM's Big SQL Product Overview
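The idea behind “push down processing” can be sketched without reference to any particular product. In the toy example below, a hypothetical RemoteStore class stands in for data held in HDFS; it is not the PolyBase or Big SQL API, and the sample rows are made up. The point is only that evaluating a filter where the data lives moves fewer rows to the database engine.

```python
from typing import Callable, Iterable, Optional


class RemoteStore:
    """Hypothetical stand-in for an external store such as HDFS."""

    def __init__(self, rows: Iterable[dict]):
        self._rows = list(rows)
        self.rows_transferred = 0

    def scan(self, predicate: Optional[Callable[[dict], bool]] = None) -> list:
        # With a predicate, filtering happens on the remote side (push down);
        # without one, every row is shipped back to the caller.
        selected = [r for r in self._rows if predicate is None or predicate(r)]
        self.rows_transferred += len(selected)
        return selected


def is_hot(row: dict) -> bool:
    return row["temp"] > 30


rows = [{"id": i, "temp": 20 + i} for i in range(20)]

# Without push down: pull everything, then filter locally.
no_pushdown = RemoteStore(rows)
local_result = [r for r in no_pushdown.scan() if is_hot(r)]

# With push down: the store applies the filter before returning rows.
pushdown = RemoteStore(rows)
remote_result = pushdown.scan(is_hot)

print(no_pushdown.rows_transferred, pushdown.rows_transferred)
```

Both paths return the same rows; only the number of rows transferred differs.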
15
But… Hadoop is about DAS
16
Data Locality – Per Eric Brewer…
Links: MSFT Research; U. Cal Berkeley (original link / paper)
17
Who is Eric Brewer?
• Eric Brewer is a UC Berkeley Professor who happens to be currently on sabbatical working with Google (VP of Infrastructure).
• He proposed the CAP Theorem as a conjecture in 2000
• Google records 40K hits on “Brewer’s Theorem Proofs”
18
It’s all Hadoop?
• Per Mike Olson at Strataconf 2015, Hadoop is disappearing from view; the discussion that matters is about the applications on top of the platform
• It’s about Outcomes and use cases. As a result, Machine Learning & Spark are gaining all the glory
– “How Old” Presentation from Strataconf
– IBM commits 3.5K associates to Apache Spark
– Microsoft buys Revolution Analytics to bring Machine Learning to Databases
19
What has transpired with Hadoop?
• Cloudera has cracked into the Operational Data Store and Data Warehouse Gartner Magic Quadrants, which have long been held by traditional RDBMS vendors.
• Increased investment from Hadoop vendors in items like Kudu and LLAP targeting OLTP workloads.
• Creation of a converged ACID-compliant RDBMS on Hadoop
20
Keith’s Predictions
• More Enterprise Patterns for Hadoop:
– Companies are running out of data center and network space. A push for denser footprints is emerging
– Operations drives better reference architectures that match their support model
– More focus on Interactive Queries and real-time processing
– More converged pushes from other parties like Splice
– More use cases driving more adoption, but less about Hadoop
• More Unstructured Data Support / Analytics for Databases & ACID compliance upon Hadoop.
– To Quote Willie Sutton: “It’s where the money is…”
21
Why does EMC Care?
• Enterprise Standard Storage Technology supporting the World’s Databases
• Largest Enterprise Storage Vendor for Hadoop Platforms (Isilon)
– Certified with Hortonworks and Cloudera, along with Pivotal and IBM BigInsights
• Bring ease of use to a difficult platform and ease of convergence with products like PolyBase.
22
Appendix
23
Hadoop Architecture – DAS vs Isilon
[Diagram: two layouts. DAS: a NameNode and combined Data Node + Compute Node servers connected over Ethernet. Isilon: separate Compute Nodes connected over Ethernet to storage nodes that provide the name node and datanode services.]
24
Traditional Hadoop POD vs Isilon vHadoop
~5PB Usable Hadoop Storage

Traditional Hadoop POD: Compute with Staging Storage
18 racks
Extended Time-to-Results
• Requires Additional “Data Staging” Storage
• Iterative Testing is Time Consuming
• Requires Copying of Data Several Times
Rigid Architecture
• Inefficient Floor Space
• Must Purchase Compute & Storage Together
• Storage Efficiency < 25%
Lacks Enterprise Features
• No Disaster Recovery, Snapshots
• Single Protocol (HDFS Only)
• Lacks Full Security Features

Isilon vHadoop (no staging needed)
8 racks
Faster Time-to-Results
• Data Stays on the Isilon Cluster
• Allows for Rapid Iterative Testing Process
• Simplifies Hosting Workflow
Flexible Architecture
• Efficient Floor Space, Power & Cooling
• Leverage VMs for Flexible Deployments
• Storage Efficiency > 78%
Enterprise Capabilities
• Disaster Recovery, Snapshots
• SyncIQ-Data Replication Offsite
• Highly Secure Hosting Environment

[Diagram: rows of 42U racks illustrating each configuration]
25
1TB Hadoop Job Cycle Comparison
Isilon Significantly Reduces Time To Results
Traditional Hadoop+DAS: 17:32, 30:18, 20:50, 20:50 (TTR: 89.3 Minutes!)
Isilon Enabled Hadoop: 18:51
Terasort Test on 1TB
                 DAS      Isilon   Benefit
MB/s Per Node    55.00    85.00    55%
Compute Min      30.18    18.51    -39%
TTR Min          89.30    18.51    -79%
Isilon Advantages
• Eliminates All Data Movement
• Allows for Virtualized Compute
• Significantly Less Cost
• 79% Faster TTR!
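The table's figures can be cross-checked with a short calculation. The sketch below assumes the four Traditional Hadoop+DAS timings are sequential phases whose sum is the time to results (the slide does not label them), and it follows the slide's apparent convention of treating values such as 30.18 as plain decimals when computing percentages. The helper names are illustrative.

```python
def to_seconds(mm_ss: str) -> int:
    """Convert a 'minutes:seconds' string to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mm_ss(total_seconds: int) -> str:
    """Convert seconds back to a 'minutes:seconds' string."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# Assumption: the four DAS timings are back-to-back phases of one job cycle.
das_phases = ["17:32", "30:18", "20:50", "20:50"]
das_ttr = to_mm_ss(sum(to_seconds(p) for p in das_phases))

throughput_gain = percent_change(55.00, 85.00)   # MB/s per node
compute_change = percent_change(30.18, 18.51)    # compute minutes, as decimals
ttr_change = percent_change(89.30, 18.51)        # time to results, as decimals

print(f"DAS time to results: {das_ttr}")
print(f"throughput: {throughput_gain:+.0f}%")
print(f"compute:    {compute_change:+.0f}%")
print(f"TTR:        {ttr_change:+.0f}%")
```

Converting the timings to seconds before taking percentages would shift the compute figure by about a point, so the rounded values here depend on the decimal convention.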
EMC Isilon Database Converged deck

Editor's Notes

  • #3: Just to put the exponential growth of the digital universe into context… Like the physical universe, the digital universe is large – doubling in size every two years, and by 2020 the digital universe – the data we create and copy annually – will reach 44 zettabytes, or 44 trillion gigabytes – containing nearly as many digital bits as there are stars in the universe. If the Digital Universe were represented by the memory in a stack of tablets, in 2013 it would have stretched two-thirds of the way to the Moon. By 2020, there would be 6.6 stacks from the Earth to the Moon. With this much data floating around, we need structure to sort it, make sense of it, and tell the story. That’s where data visualization comes in.
  • #5: We live in an amazing time, but looking forward to 2020: an estimated 30B to 200B devices, 7 billion people, and 1 million new businesses beyond where we are today. These people, using these devices within these businesses, are constantly connected. This gives rise to new ways of doing business: new disruptive technology, new disruptive business models. <CLICK>
  • #6: We’re already starting to see this today. Look at the likes of Nest: a thermostat that knows when you are in and out of your house, and can regulate the temperature in your home much more efficiently than was ever possible in the past. Then there are wearables such as Fitbits, Jawbones and the like. There are sports clothing companies that come to us and say they think that in 10 years they will be more of a software company, with clothing that contains embedded telemetric devices that communicate not just who you are and where you are, but what time you get up, when you eat, when you sweat. Sports companies will know almost everything about you, whereas in the past they’ve known almost nothing about you. Another example: contact lenses that regulate blood sugar. And another: intelligent machines. Let’s drill in on that for a bit… <CLICK>
  • #7: Many industries are facing massive change. The thing that is driving the change is software and new applications (mobile and web applications) that create new possibilities. These are just a few examples. Nest is a software-defined thermostat. A thermostat’s entire job is to measure temperature in a range and send a current to turn the furnace or AC on and off when the temperature is out of range. But Nest built a thermostat with a web application to control it from anywhere, and intelligence in the thermostat to recognize patterns and even know when you are home, so it can automatically adjust the temperature for you. That innovation is why Google bought Nest for $3.2B in cash in 2014. Tesla is a software-defined car. A mobile app allows you to control the car from anywhere, turning on the AC or heat before you arrive, opening and closing doors. They can also improve the car’s capabilities and efficiency by upgrading the car’s software instead of forcing you to get a new car. Uber allows you to call a town car from any location. You call the car, the car shows up, you get in, tell them where you are going and get out. Your credit card is automatically charged, then you rate the driver and they rate you. This has turned the taxi industry on its ear. The entertainment industry is another big change. In the 80s, we all went to Blockbuster and hoped our new release was available. In the 2000s, Redbox arrived, which really hurt Blockbuster. Now you simply sit at home and everyone can watch the new release on the same day it comes out, streaming into the home. Blockbuster is gone. Redboxes are fading. It’s all online streaming to your TV, your phone, your tablet…
  • #8: What all products have figured out is that it’s about Outcomes. A client won’t install Hadoop just to buy some servers. They are hoping (realistically or not) that they can improve something in their business.
  • #9: Why would Hadoop want to move towards Enterprise Storage Reference Architectures? Or why would Databases with Enterprise Storage Reference Architectures move towards Hadoop vendors? Adoption of Hadoop is based upon available skills and operational support.
  • #10: For SQL Databases, the data growth is mainly unstructured. For Hadoop, it’s hard for companies to get started due to operations and a lack of skills in their incumbent talent pool.