Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
A Comprehensive Approach to Building
Your Big Data Solution
We do Hadoop.
Speakers
	
Hortonworks
◦  Ali Bajwa, Senior Partner Solution Engineer
Red Hat
◦  Irshad Raihan, Senior Principal, Product Marketing
Cisco
◦  Ron Graham, Big Data Analytics Engineer
Partnership
100% open source Hadoop Distribution, Support and Training
Middleware, Storage, PaaS, IaaS
UCS Integrated Infrastructure for Big Data
Cisco, Hortonworks and Red Hat are partnering to help you build your Big Data solution and reach massive scalability, superior efficiency and dramatically lower total cost of ownership thanks to a validated joint architecture.
Traditional systems under pressure
Challenges
•  Constrains data to app
•  Can’t manage new data
•  Costly to Scale
Business Value
Clickstream
Geolocation
Web Data
Internet of Things
Docs, emails
Server logs
2012: 2.8 Zettabytes
2020: 40 Zettabytes
Traditional: ERP, CRM, SCM
New: New Data
Chart labels: Laggards, Industry Leaders
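The two data-volume figures quoted on this slide (2.8 zettabytes in 2012 and 40 zettabytes in 2020) imply a yearly growth rate. The Python sketch below works it out, assuming smooth compound growth between the two years. The function name is illustrative and does not come from the deck.

```python
def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must all be positive")
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted on the slide: 2.8 ZB in 2012 and 40 ZB in 2020.
# ASSUMPTION: growth is smooth and compounding between the two data points.
growth = compound_annual_growth_rate(start=2.8, end=40.0, years=2020 - 2012)
print(f"Implied growth: {growth:.1%} per year")
```

Under that assumption the data volume grows by a little under 40% per year, which is the pressure on traditional systems that the slide describes.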
Modern Data Architecture emerges to unify data & processing
Modern Data Architecture
•  Enable applications to have access to all your enterprise data through an efficient centralized platform
•  Supported with a centralized approach to governance, security and operations
•  Versatile to handle any applications and datasets no matter the size or type
SOURCES: Existing Systems (ERP, CRM, SCM), Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured
ANALYTICS: Data Marts, Applications, Business Analytics, Visualization & Dashboards
HDFS (Hadoop Distributed File System)
YARN: Data Operating System
Data access on YARN: Batch, Interactive, Real-Time, Partner ISV
Alongside Hadoop: EDW, MPP
Hadoop Driver: Cost optimization
Archive Data off EDW
Move rarely used data to Hadoop as active
archive, store more data longer
Offload costly ETL process
Free your EDW to perform high-value functions
like analytics & operations, not ETL
Enrich the value of your EDW
Use Hadoop to refine new data sources, such as
web and machine data for new analytical context
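The "active archive" idea above can be pictured as a simple rule: tables that have not been queried for a long time are candidates to move from the EDW to Hadoop. The Python sketch below is only an illustration. The `WarehouseTable` type, the table names and the 365-day threshold are all hypothetical and do not come from the deck or from any HDP tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Tuple


@dataclass
class WarehouseTable:
    """Hypothetical record of an EDW table and when it was last queried."""
    name: str
    last_queried: date


def split_hot_cold(
    tables: List[WarehouseTable], today: date, cold_after_days: int = 365
) -> Tuple[List[str], List[str]]:
    """Return (hot, cold) table names; cold tables are archive candidates."""
    cutoff = today - timedelta(days=cold_after_days)
    hot = [t.name for t in tables if t.last_queried >= cutoff]
    cold = [t.name for t in tables if t.last_queried < cutoff]
    return hot, cold


# Made-up inventory, for illustration only.
tables = [
    WarehouseTable("orders_2014", date(2014, 11, 30)),
    WarehouseTable("orders_2009", date(2010, 1, 15)),
    WarehouseTable("clickstream_2012", date(2013, 2, 1)),
]
hot, cold = split_hot_cold(tables, today=date(2015, 1, 1))
print("keep in EDW:", hot)
print("archive to Hadoop:", cold)
```

In practice the threshold would be a business decision, and the actual data movement would be done with an ingestion tool rather than in application code.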
HDP helps you reduce costs and optimize the value associated with your EDW
ANALYTICS: Data Marts, Business Analytics, Visualization & Dashboards
DATA SYSTEMS: HDP 2.2 (ELT; Cold Data, Deeper Archive & New Sources), Enterprise Data Warehouse (Hot), MPP, In-Memory
SOURCES: Existing Systems (ERP, CRM, SCM), Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured
Hadoop Driver: Enabling the data lake
SCALE
SCOPE
Data Lake Definition
•  Centralized Architecture
Multiple applications on a shared data set
with consistent levels of service
•  Any App, Any Data
Multiple applications accessing all data
affording new insights and opportunities.
•  Unlocks ‘Systems of Insight’
Advanced algorithms and applications
used to derive new value and optimize
existing value.
Drivers:
1.  Cost Optimization
2.  Advanced Analytic Apps
Goal:
•  Centralized Architecture
•  Data-driven Business
DATA
LAKE
Journey to the Data Lake with Hadoop
Systems of Insight
Only HDP delivers a Centralized Architecture
HDP is uniquely built around YARN serving as a data operating system that provides multi-tenant Resource
Management, consistent Governance & Security and efficient Operations services across Hadoop applications.
Hortonworks Data Platform
YARN
Data Operating System
•  A centralized architecture of consistent enterprise services for resource management, security, operations, and governance.
•  The versatility to support multiple applications and diverse workloads from batch to interactive to real-time, open source and commercial.
Key Benefits
•  Multiple applications on a shared data set with consistent levels of service: a multitenant data platform.
•  Provides a shared platform to enable new analytic applications.
•  Delivers maximum cost efficiency for cluster resource management: fewer servers, fewer nodes.
Storage
YARN: Data Operating System
Governance Security
Operations
Resource Management
Existing
Applications
New
Analytics
Partner
Applications
Data Access: Batch, Interactive & Real-time
HDP delivers a completely open data platform
Hortonworks Data Platform 2.2
Hortonworks Data Platform provides Hadoop for the Enterprise: a centralized architecture
of core enterprise services, for any application and any data.
Completely Open
•  HDP incorporates every element
required of an enterprise data
platform: data storage, data access,
governance, security, operations
•  All components are developed in
open source and then rigorously
tested, certified, and delivered as
an integrated open source platform
that’s easy to consume and use by
the enterprise and ecosystem.
YARN: Data Operating System (Cluster Resource Management)
HDFS (Hadoop Distributed File System)
GOVERNANCE: Apache Falcon, Apache Sqoop, Apache Flume, Apache Kafka
BATCH, INTERACTIVE & REAL-TIME DATA ACCESS: Apache Pig, Apache Hive, Cascading, Apache HBase, Apache Accumulo, Apache Solr, Apache Spark, Apache Storm
SECURITY: Apache Ranger, Apache Knox, Apache Falcon
OPERATIONS: Apache Ambari, Apache ZooKeeper, Apache Oozie
HDP: Any Data, Any Application, Anywhere
Any Application
•  Deep integration with ecosystem
partners to extend existing
investments and skills
•  Broadest set of applications through
the stable of YARN-Ready applications
Any Data
Deploy applications fueled by clickstream, sensor,
social, mobile, geo-location, server log, and other
new paradigm datasets with existing legacy
datasets.
Anywhere
Implement HDP naturally across the
complete range of deployment options
Data sources: Clickstream, Web & Social, Geolocation, Internet of Things, Server Logs, Files, emails, ERP, CRM, SCM
Deployment options: commodity, appliance, cloud, hybrid
Over 70 Hortonworks Certified YARN Apps
Open Source IS the standard for platform technology
Modern platform standards are defined by open communities
For Hadoop, the ASF provides guidelines and
a governance framework and the open
community defines the standards for Hadoop.
Roadmap matches user
requirements not vendor
monetization requirements
Hortonworks Open Source Development Model yields unmatched
efficiency
•  Infinite number of developers under governance of ASF applied to problem
•  End users motivated to contribute to Apache Hadoop as they are consumers
•  IT vendors motivated to align with Apache Hadoop to capture adjacent opportunities
Hortonworks Open Source Business Model de-risks investments
•  Buying behavior changed: enterprise wants support subscription license
•  Vendor needs to earn your business, every year is an election year
•  Equitable balance of power between vendor and consumer
•  IT vendors want platform technologies to be open source to avoid lock-in
Red Hat Big Data
Open the possibilities of your data
Big Data innovation cannot happen in a bubble
Strong partnerships with industry leaders and open source communities
The Old Data Lifecycle
Multiple Silos. Multiple Views. Multiple Goals.
Business User, Architect, Data Center Operator, App Developer
Manage, Build, Code, Query
The New Data Lifecycle
One Language. One View. One Goal.
Business User, Architect, Data Center Operator, App Developer
Ingest, Integrate, Act, Discover
Barriers to Big Data Success
Lack of agile, open, and cost-effective enterprise-grade solutions
"I want more than canned BI queries"
"I am locked into a vendor stack"
"I want to use my favorite dev framework"
"I need to integrate data across silos"
Business User, Architect, Data Center Operator, App Developer
Big Data Solutions from Red Hat
Business User, Architect, Data Center Operator, App Developer
Ingest, Integrate, Act, Discover
Integrated Big Data Platform
	
  
Legend: Hortonworks, Cisco, Red Hat
Cisco UCS Integrated Infrastructure for Big Data
Operating Environment
Operating System: Red Hat Enterprise Linux
Virtualization: Red Hat Enterprise Virtualization
Cloud: Red Hat Enterprise Linux OpenStack Platform, Cisco OpenStack
Hadoop Compatible File System: Red Hat Storage
Hadoop Distributed File System
Hadoop Data Processing: Map/Reduce, YARN, Pig, Spark, Storm, HBase, Tez, Hive
Analytics
Management: Ambari, Cisco UCS Director Express, Cisco Unified Management
Cisco Security Suite
Data Integration & Application Development
Application Platform-as-a-Service: OpenShift by Red Hat
Data Integration and Data Services: Red Hat JBoss Data Virtualization, Composite
Data Caching: Red Hat JBoss Data Grid
Business Rules Mgmt: Red Hat JBoss BRMS
Development: Red Hat JBoss Developer Studio
Software and Solutions Innovation
Empowering What’s Next
Ron Graham
Big Data Analytics Engineer
Hardware Architecture
Cisco UCS with Big Data
20© 2014 Cisco and/or its affiliates. All rights reserved. Cisco Confidential
Why Cisco UCS for Big Data?
•  Manageability
•  Save time with UCS Manager
•  Enables consistent and rapid deployments using UCS Service Profiles
•  Offers operational simplification
•  Delivers a modular solution
•  Scalability
•  Performance
SIM Card: identity for a phone
Service Profile: identity for a server
UCS Director Express for Big Data
End-to-end solution for Hadoop
•  End-to-end provisioning, installation, and monitoring tool for Hadoop clusters
•  Better business outcomes with faster time to value from Big Data
•  Provides an appliance-like experience without the inflexibility of an appliance
•  Centralized visibility across Hadoop and physical infrastructure
•  Powerful interface for further integration into third-party tools and services
Powering Big Data and Analytics
Complete and Industry-leading Portfolio
UCS B200
Scale-out Analytics
Big Data with EMC Isilon and VCE
Invicta (Fast Data)
UCS C240 (Hadoop, NoSQL, MPP)
C/B460 (In-memory Analytics)
UCS C3160, C3260 (Hadoop)
UCS C220 (real-time, streaming)
FlexPod Select with NetApp E-Series
UCS Mini (All-in-one at Edge)
UCS M-Series (Massive scale-out)
Ecosystem Partners
ISV Partners
Infrastructure Management: UCS Manager, Director, Express, Central, Red Hat, ACI
Data Management: Actian, DataStax, Hortonworks, MongoDB, Pivotal, SAP, SAS, Splunk
Applications: Cisco, Elastic Search, IBM, Informatica, Microsoft, MicroStrategy, Oracle, SAP, SAS and others
Cisco Validated Designs
Accelerate Deployment
Cisco Validated Designs for leading big data platforms can be found at: www.cisco.com/go/bigdata
Cisco UCS CPA for Big Data v3
Reference Architecture and Bundles
Starter
Server: 8x UCS C220 M4
CPU: 2 x Intel Xeon E5-2620 v3 (15M Cache, 2.40 GHz)
Memory: 256GB
Storage: 8 1.2-TB 10K SAS SFF HDD
High Performance
Server: 8x UCS C220 M4
CPU: 2 x Intel Xeon E5-2680 v3 (30M Cache, 2.50 GHz)
Memory: 384GB
Storage: 2 1.2-TB 10K SAS SFF HDD, 6 400-GB SAS SSD
Performance Optimized
Server: 16x UCS C240 M4
CPU: 2 x Intel Xeon E5-2680 v3 (30M Cache, 2.50 GHz)
Memory: 256GB
Storage: 2 120-GB SATA SSD, 24 1.2-TB 10K SAS SFF HDD
Capacity Optimized
Server: 16x UCS C240 M4
CPU: 2 x Intel Xeon E5-2620 v3 (15M Cache, 2.40 GHz)
Memory: 128GB
Storage: 2 120-GB SATA SSD, 12 4-TB 7.2K SAS SFF HDD
Extreme Capacity
Server: 2x UCS C3160
CPU: 2 x Intel Xeon E5-2695 v2 (30M Cache, 2.40 GHz)
Memory: 256GB
Storage: 2 120-GB SATA SSD, 60 4-TB 7.2K SAS SFF HDD
64 Node Cluster Configuration
2x UCS 6296 Series Fabric Interconnect
UCS Manager
•  UCS Domain (68 Servers)
•  Managed by UCS Manager
•  2.8 PB of storage
•  HDP 2.2
•  Tiered Storage
•  Tez
•  RHEL 6.5
•  Dual 10G Network
•  17 Servers Per Rack
UCS C240 M4
2x E5-2680 v3
256GB Memory
Cisco 12Gb/s SAS RAID Controller
2x 120GB SATA SSD
24x 1.2TB 10k SAS
2x Cisco UCS VIC 1227
UCS C3160
2x E5-2695 v2
256GB Memory
Cisco 12Gb/s SAS RAID Controller
2x 120GB SATA SSD
60x 4TB 7.2k SAS SFF
2x Cisco UCS VIC 1227
17x 10Gb Ethernet
17x 10Gb Ethernet
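The headline numbers on this slide (68 servers, 17 per rack, 2.8 PB) can be cross-checked from the per-server drive counts. The Python sketch below does the arithmetic. The slide does not state the rack layout, so the split of 16 UCS C240 M4 plus 1 UCS C3160 per rack across 4 racks is an assumption, chosen because it matches the 64-node title, the 68-server domain and the 17-servers-per-rack figure. The 120GB SSDs are left out, on the assumption that they are not data drives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerModel:
    """Data-drive configuration of one server, taken from the slide."""
    name: str
    data_drives: int
    drive_tb: float

    def raw_tb(self) -> float:
        return self.data_drives * self.drive_tb


C240_M4 = ServerModel("UCS C240 M4", data_drives=24, drive_tb=1.2)
C3160 = ServerModel("UCS C3160", data_drives=60, drive_tb=4.0)

# ASSUMPTION (not stated on the slide): 4 racks, each holding
# 16 x C240 M4 and 1 x C3160, which gives 17 servers per rack.
RACKS = 4
PER_RACK = {C240_M4: 16, C3160: 1}

servers = RACKS * sum(PER_RACK.values())
raw_tb = RACKS * sum(model.raw_tb() * count for model, count in PER_RACK.items())

print(f"servers in the UCS domain: {servers}")
print(f"raw data capacity: {raw_tb / 1000:.1f} PB (1 PB = 1000 TB)")
```

Under these assumptions the totals agree with the slide: 68 servers and roughly 2.8 PB of raw capacity, before HDFS replication is taken into account.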
Unified Management
Programmability, Scalability and Automation
UCSD Express
UCS 6200 Series Fabric Interconnect
UCS Manager
UCS C240 M4 Series Rack Server
UCS C3160 Rack Server
Apache Ambari
Multi-tiered Storage Architecture
Multi-temperature Policy
UCS 6200 Series Fabric Interconnect
UCS C240 M4 Series Rack Server
UCS C3160 Rack Server
Hot: all (n) replicas on Disk
Warm: 1 replica on Disk, n-1 on Archive
Cold: n replicas on Archive
Policy
Hot - for both storage and compute. The data that is popular and still being used for processing will stay in this policy. When a block is hot, all replicas are stored in DISK.
Warm - partially hot and partially cold. When a block is warm, some of its replicas are stored in DISK and the remaining replicas are stored in ARCHIVE.
Cold - only for storage with limited compute. The data that is no longer being used, or data that needs to be archived, is moved from hot storage to cold storage. When a block is cold, all replicas are stored in ARCHIVE.
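The three policies on this slide reduce to a small placement rule. The Python sketch below is an illustration written for this transcript. It follows the slide's figures (Hot: all n replicas on Disk; Warm: 1 on Disk and n-1 on Archive; Cold: all n on Archive) and is not taken from the HDFS source code. The function and type names are assumptions.

```python
from enum import Enum
from typing import List


class StorageType(Enum):
    DISK = "DISK"
    ARCHIVE = "ARCHIVE"


def replica_placement(policy: str, replication: int) -> List[StorageType]:
    """Return the storage type for each of the n replicas of a block."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    name = policy.upper()
    if name == "HOT":   # all (n) replicas on Disk
        return [StorageType.DISK] * replication
    if name == "WARM":  # 1 replica on Disk, n-1 on Archive
        return [StorageType.DISK] + [StorageType.ARCHIVE] * (replication - 1)
    if name == "COLD":  # n replicas on Archive
        return [StorageType.ARCHIVE] * replication
    raise ValueError(f"unknown policy: {policy}")


for policy in ("Hot", "Warm", "Cold"):
    placement = replica_placement(policy, replication=3)
    print(policy, [s.value for s in placement])
```

With a replication factor of 3, a warm block keeps one replica on the disk tier and two on the archive tier, which matches the "1 on Disk, n-1 on Archive" label in the diagram.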
Multi-tiered Storage Architecture
Multi-temperature Policy
Mover: a new data migration tool
It periodically scans the files in HDFS to check if the block placement satisfies the storage policy. For the blocks violating the storage policy, it moves the replicas to a different storage type in order to fulfill the storage policy requirement.
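The Mover's behavior as described on this slide (scan, find blocks whose placement violates the policy, move replicas to another storage type) can be modeled in a few lines. The Python sketch below is a toy model under stated assumptions, not the real HDFS Mover. The `Block` type, the block IDs and the planning function are hypothetical, and the sketch only decides which storage-type changes are needed rather than moving any data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple


def expected_storage(policy: str, replication: int) -> List[str]:
    """Storage type per replica, following the Hot/Warm/Cold definitions."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    name = policy.upper()
    if name == "HOT":
        return ["DISK"] * replication
    if name == "WARM":
        return ["DISK"] + ["ARCHIVE"] * (replication - 1)
    if name == "COLD":
        return ["ARCHIVE"] * replication
    raise ValueError(f"unknown policy: {policy}")


@dataclass
class Block:
    """Hypothetical view of a block: its policy and where replicas sit now."""
    block_id: str
    policy: str
    replicas: List[str]


def plan_moves(blocks: List[Block]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each violating block to a list of (from_type, to_type) moves."""
    plan: Dict[str, List[Tuple[str, str]]] = {}
    for block in blocks:
        current = Counter(block.replicas)
        wanted = Counter(expected_storage(block.policy, len(block.replicas)))
        surplus = sorted((current - wanted).elements())  # replicas in the wrong tier
        missing = sorted((wanted - current).elements())  # tiers that need a replica
        if surplus:
            plan[block.block_id] = list(zip(surplus, missing))
    return plan


blocks = [
    Block("blk_1", "HOT", ["DISK", "DISK", "DISK"]),      # already satisfies policy
    Block("blk_2", "COLD", ["DISK", "DISK", "DISK"]),     # every replica must move
    Block("blk_3", "WARM", ["DISK", "DISK", "ARCHIVE"]),  # one replica must move
]
plan = plan_moves(blocks)
for block_id, moves in plan.items():
    print(block_id, moves)
```

Blocks that already satisfy their policy do not appear in the plan, which mirrors the slide's point that only violating blocks are touched.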
Next Steps…
Download the Hortonworks Sandbox
Learn Hadoop
Build Your Analytic App
Try Hadoop
Learn more with our partnerships
http://hortonworks.com/partner/cisco/
http://hortonworks.com/partner/redhat/
Joint CVD bit.ly/Cisco-CVD
Meet us in person!
•  Cisco Live! in San Diego: June 7-11
•  Hadoop Summit in San Jose: June 9-11
•  Red Hat Summit in Boston: June 23-26
For more information about Red Hat's Big Data solutions, please visit:
•  redhat.com/bigdata
•  redhatstorage.redhat.com/category/big-data
•  redhat.com/en/insights/big-data
For more information about Cisco's Big Data and Analytics offers, please visit:
•  www.cisco.com/go/bigdata and www.cisco.com/go/bigdata_design
•  http://blogs.cisco.com/author/raghunathnambiar
•  bit.ly/Cisco-CVD

More Related Content

PDF
Supporting Financial Services with a More Flexible Approach to Big Data
PDF
Splunk-hortonworks-risk-management-oct-2014
PDF
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
PDF
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
PDF
Hp Converged Systems and Hortonworks - Webinar Slides
PDF
Discover HDP 2.1: Apache Solr for Hadoop Search
PDF
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
PDF
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Supporting Financial Services with a More Flexible Approach to Big Data
Splunk-hortonworks-risk-management-oct-2014
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Hp Converged Systems and Hortonworks - Webinar Slides
Discover HDP 2.1: Apache Solr for Hadoop Search
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance

What's hot (20)

PDF
Hortonworks - What's Possible with a Modern Data Architecture?
PDF
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
PPTX
Falcon Meetup
PDF
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
PPTX
Log Analytics Optimization
PDF
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
PDF
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
PDF
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
PPTX
Hortonworks Data In Motion Series Part 4
PDF
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
PDF
Discover.hdp2.2.storm and kafka.final
PPTX
Hortonworks Data In Motion Webinar Series Pt. 2
PPTX
Enabling the Real Time Analytical Enterprise
PDF
Discover.hdp2.2.h base.final[2]
PDF
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
PDF
Implementing a Data Lake with Enterprise Grade Data Governance
PPTX
YARN Ready: Integrating to YARN with Tez
PPTX
Don't Let Security Be The 'Elephant in the Room'
PPTX
Spark Summit EMEA - Arun Murthy's Keynote
PDF
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Hortonworks - What's Possible with a Modern Data Architecture?
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Falcon Meetup
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Log Analytics Optimization
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
Hortonworks Data In Motion Series Part 4
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover.hdp2.2.storm and kafka.final
Hortonworks Data In Motion Webinar Series Pt. 2
Enabling the Real Time Analytical Enterprise
Discover.hdp2.2.h base.final[2]
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Implementing a Data Lake with Enterprise Grade Data Governance
YARN Ready: Integrating to YARN with Tez
Don't Let Security Be The 'Elephant in the Room'
Spark Summit EMEA - Arun Murthy's Keynote
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Ad

Viewers also liked (20)

PDF
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
PDF
Hortonworks and Platfora in Financial Services - Webinar
PPTX
Digital Transformation and Data Protection in Automotive Industry
PDF
Smarter commerce partner presentation final
PDF
Smarter commerce overview
PDF
Bringing Big Data Analytics to Network Monitoring
PDF
Hadoop Summit 2013 : Continuous Integration on top of hadoop
PDF
Case study - Automotive DMS Connection to Salesforce.com
PDF
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
PDF
Qrious about Insights -- Big Data in the Real World
PPTX
Leveraging SAP, Hadoop, and Big Data to Redefine Business
PDF
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
PDF
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
PDF
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
PDF
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
PDF
Hortonworks and Voltage Security webinar
PDF
Hortonworks, Novetta and Noble Energy Webinar
PDF
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
PDF
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
PDF
Adoption de Hadoop : des Possibilités Illimitées - Hortonworks and Talend
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Hortonworks and Platfora in Financial Services - Webinar
Digital Transformation and Data Protection in Automotive Industry
Smarter commerce partner presentation final
Smarter commerce overview
Bringing Big Data Analytics to Network Monitoring
Hadoop Summit 2013 : Continuous Integration on top of hadoop
Case study - Automotive DMS Connection to Salesforce.com
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Qrious about Insights -- Big Data in the Real World
Leveraging SAP, Hadoop, and Big Data to Redefine Business
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
3 CTOs Discuss the Shift to Next-Gen Analytic Ecosystems
Hortonworks and Voltage Security webinar
Hortonworks, Novetta and Noble Energy Webinar
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
Adoption de Hadoop : des Possibilités Illimitées - Hortonworks and Talend
Ad

Similar to A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks and Red Hat (20)

PPTX
Supporting Financial Services with a More Flexible Approach to Big Data
PDF
Hortonworks & Bilot Data Driven Transformations with Hadoop
PDF
Meetup oslo hortonworks HDP
PDF
Hortonworks Hadoop @ Oslo Hadoop User Group
PDF
Introduction to Hadoop
PDF
Storm Demo Talk - Colorado Springs May 2015
PDF
Eliminating the Challenges of Big Data Management Inside Hadoop
PDF
Eliminating the Challenges of Big Data Management Inside Hadoop
PDF
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
PDF
Discover hdp 2.2 hdfs - final
PPTX
Yahoo! Hack Europe
PDF
Webinar turbo charging_data_science_hawq_on_hdp_final
PDF
Webinar turbo charging_data_science_hawq_on_hdp_final
PDF
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
PDF
IoT Crash Course Hadoop Summit SJ
PDF
Solving Big Data Problems using Hortonworks
PPTX
Mrinal devadas, Hortonworks Making Sense Of Big Data
PDF
Azure Cafe Marketplace with Hortonworks March 31 2016
PDF
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
PPTX
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
Supporting Financial Services with a More Flexible Approach to Big Data
Hortonworks & Bilot Data Driven Transformations with Hadoop
Meetup oslo hortonworks HDP
Hortonworks Hadoop @ Oslo Hadoop User Group
Introduction to Hadoop
Storm Demo Talk - Colorado Springs May 2015
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2 hdfs - final
Yahoo! Hack Europe
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
IoT Crash Course Hadoop Summit SJ
Solving Big Data Problems using Hortonworks
Mrinal devadas, Hortonworks Making Sense Of Big Data
Azure Cafe Marketplace with Hortonworks March 31 2016
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG

More from Hortonworks (20)

PDF
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
PDF
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
PDF
Getting the Most Out of Your Data in the Cloud with Cloudbreak
PDF
Johns Hopkins - Using Hadoop to Secure Access Log Events
PDF
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
PDF
HDF 3.2 - What's New
PPTX
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
PDF
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
PDF
IBM+Hortonworks = Transformation of the Big Data Landscape
PDF
Premier Inside-Out: Apache Druid
PDF
Accelerating Data Science and Real Time Analytics at Scale
PDF
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
PDF
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
PDF
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
PDF
Making Enterprise Big Data Small with Ease
PDF
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
PDF
Driving Digital Transformation Through Global Data Management
PPTX
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
PDF
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
PDF
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Johns Hopkins - Using Hadoop to Secure Access Log Events
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
HDF 3.2 - What's New
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
IBM+Hortonworks = Transformation of the Big Data Landscape
Premier Inside-Out: Apache Druid
Accelerating Data Science and Real Time Analytics at Scale
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Making Enterprise Big Data Small with Ease
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Driving Digital Transformation Through Global Data Management
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Unlock Value from Big Data with Apache NiFi and Streaming CDC

Recently uploaded (20)

PDF
Design an Analysis of Algorithms I-SECS-1021-03
PDF
Audit Checklist Design Aligning with ISO, IATF, and Industry Standards — Omne...
PPTX
Online Work Permit System for Fast Permit Processing
PPTX
Transform Your Business with a Software ERP System
PDF
Wondershare Filmora 15 Crack With Activation Key [2025
PDF
PTS Company Brochure 2025 (1).pdf.......
PPT
Introduction Database Management System for Course Database
PDF
2025 Textile ERP Trends: SAP, Odoo & Oracle
PPTX
VVF-Customer-Presentation2025-Ver1.9.pptx
PDF
Which alternative to Crystal Reports is best for small or large businesses.pdf
PPTX
Lecture 3: Operating Systems Introduction to Computer Hardware Systems
PDF
How Creative Agencies Leverage Project Management Software.pdf
PPTX
ISO 45001 Occupational Health and Safety Management System
PDF
AI in Product Development-omnex systems
PDF
System and Network Administraation Chapter 3
PPTX
CHAPTER 12 - CYBER SECURITY AND FUTURE SKILLS (1) (1).pptx
PDF
Addressing The Cult of Project Management Tools-Why Disconnected Work is Hold...
PDF
Odoo Companies in India – Driving Business Transformation.pdf
PDF
Adobe Illustrator 28.6 Crack My Vision of Vector Design
PPTX
L1 - Introduction to python Backend.pptx
Design an Analysis of Algorithms I-SECS-1021-03
Audit Checklist Design Aligning with ISO, IATF, and Industry Standards — Omne...
Online Work Permit System for Fast Permit Processing
Transform Your Business with a Software ERP System
Wondershare Filmora 15 Crack With Activation Key [2025
PTS Company Brochure 2025 (1).pdf.......
Introduction Database Management System for Course Database
2025 Textile ERP Trends: SAP, Odoo & Oracle
VVF-Customer-Presentation2025-Ver1.9.pptx
Which alternative to Crystal Reports is best for small or large businesses.pdf
Lecture 3: Operating Systems Introduction to Computer Hardware Systems
How Creative Agencies Leverage Project Management Software.pdf
ISO 45001 Occupational Health and Safety Management System
AI in Product Development-omnex systems
System and Network Administraation Chapter 3
CHAPTER 12 - CYBER SECURITY AND FUTURE SKILLS (1) (1).pptx
Addressing The Cult of Project Management Tools-Why Disconnected Work is Hold...
Odoo Companies in India – Driving Business Transformation.pdf
Adobe Illustrator 28.6 Crack My Vision of Vector Design
L1 - Introduction to python Backend.pptx

A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks and Red Hat

  • 1. Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved A Comprehensive Approach to Building Your Big Data Solution We do Hadoop.
  • 2. Page 2 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Speakers    Hortonworks ◦  Ali Bajwa, Senior Partner Solution Engineer    Red Hat ◦  Irshad Raihan, Senior Principal, Product Marketing    Cisco ◦  Ron Graham, Big Data Analytics Engineer
  • 3. Page 3 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Partnership 100%  open  source  Hadoop  Distribu5on,     Support  and  Training     Middleware,  Storage,  PaaS,  IaaS   UCS  Integrated  Infrastructure   For  Big  Data   CISCO,  HORTONWORKS  AND  RED  HAT  ARE  PARTNERING  TO  HELP  YOU   BUILD  YOUR  BIG  DATA  SOLUTION  AND  REACH  MASSIVE  SCALABILITY,   SUPERIOR  EFFICIENCY  AND  DRAMATICALLY  LOWER  TOTAL  COST  OF   OWNERSHIP  THANKS  TO  A  VALIDATED  JOINT  ARCHITECTURE.
  • 4. Page 4 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Traditional systems under pressure Challenges •  Constrains data to app •  Can’t manage new data •  Costly to Scale Business Value Clickstream Geolocation Web Data Internet of Things Docs, emails Server logs 2012 2.8 Zettabytes 2020 40 Zettabytes LAGGARDS INDUSTRY LEADERS 1 2 New Data ERP CRM SCM New Traditional
  • 5. Page 5 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Modern Data Architecture emerges to unify data & processing Modern Data Architecture •  Enable applications to have access to all your enterprise data through an efficient centralized platform •  Supported with a centralized approach governance, security and operations •  Versatile to handle any applications and datasets no matter the size or type Clickstream   Web     &  Social   Geoloca3on   Sensor     &  Machine   Server     Logs   Unstructured   SOURCES Existing Systems ERP   CRM   SCM   ANALYTICS Data Marts Business Analytics Visualization & Dashboards ANALYTICS Applications Business Analytics Visualization & Dashboards ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° ° HDFS (Hadoop Distributed File System) YARN: Data Operating System Interactive Real-TimeBatch Partner ISVBatch BatchMP P   EDW  
  • 6. Hadoop Driver: Cost optimization. HDP helps you reduce costs and optimize the value associated with your EDW. • Archive data off the EDW: move rarely used data to Hadoop as an active archive and store more data longer • Offload costly ETL processes: free your EDW to perform high-value functions like analytics & operations, not ETL • Enrich the value of your EDW: use Hadoop to refine new data sources, such as web and machine data, for new analytical context. Diagram labels: Sources (existing systems: ERP, CRM, SCM; clickstream, web & social, geolocation, sensor & machine, server logs, unstructured); Data systems: HDP 2.2 (ELT; cold data, deeper archive & new sources) and Enterprise Data Warehouse (hot; MPP; in-memory); Analytics (data marts, business analytics, visualization & dashboards).
  • 7. Hadoop Driver: Enabling the data lake. Data Lake Definition: • Centralized Architecture: multiple applications on a shared data set with consistent levels of service • Any App, Any Data: multiple applications accessing all data, affording new insights and opportunities • Unlocks 'Systems of Insight': advanced algorithms and applications used to derive new value and optimize existing value. Drivers: 1. Cost Optimization 2. Advanced Analytic Apps. Goal: • Centralized Architecture • Data-driven Business. Diagram: Journey to the Data Lake with Hadoop (axes: Scale, Scope; labels: Data Lake, Systems of Insight).
  • 8. Only HDP delivers a Centralized Architecture. HDP is uniquely built around YARN serving as a data operating system that provides multi-tenant resource management, consistent governance & security, and efficient operations services across Hadoop applications. Hortonworks Data Platform (YARN: Data Operating System): • A centralized architecture of consistent enterprise services for resource management, security, operations, and governance. • The versatility to support multiple applications and diverse workloads, from batch to interactive to real-time, open source and commercial. Key Benefits: • Multiple applications on a shared data set with consistent levels of service: a multitenant data platform. • Provides a shared platform to enable new analytic applications. • Delivers maximum cost efficiency for cluster resource management: fewer servers, fewer nodes. Diagram labels: Storage; YARN: Data Operating System (Resource Management); Governance; Security; Operations; Data Access: Batch, Interactive & Real-time; Existing Applications; New Analytics; Partner Applications.
  • 9. HDP delivers a completely open data platform. Hortonworks Data Platform 2.2 provides Hadoop for the Enterprise: a centralized architecture of core enterprise services, for any application and any data. Completely Open: • HDP incorporates every element required of an enterprise data platform: data storage, data access, governance, security, operations • All components are developed in open source and then rigorously tested, certified, and delivered as an integrated open source platform that is easy for the enterprise and ecosystem to consume and use. Diagram labels: YARN: Data Operating System (Cluster Resource Management); HDFS (Hadoop Distributed File System); Governance: Apache Falcon; Batch, interactive & real-time data access: Apache Pig, Apache Hive, Cascading, Apache HBase, Apache Accumulo, Apache Solr, Apache Spark, Apache Storm; Apache Sqoop, Apache Flume, Apache Kafka; Security: Apache Ranger, Apache Knox, Apache Falcon; Operations: Apache Ambari, Apache ZooKeeper, Apache Oozie.
  • 10. HDP: Any Data, Any Application, Anywhere. Any Application: • Deep integration with ecosystem partners to extend existing investments and skills • Broadest set of applications through the stable of YARN-Ready applications (over 70 Hortonworks Certified YARN Apps). Any Data: deploy applications fueled by clickstream, sensor, social, mobile, geo-location, server log, and other new paradigm datasets with existing legacy datasets. Anywhere: implement HDP naturally across the complete range of deployment options (hybrid, commodity, appliance, cloud). Data source labels: clickstream; web & social; geolocation; Internet of Things; server logs; files, emails; ERP, CRM, SCM.
  • 11. Open Source IS the standard for platform technology. Modern platform standards are defined by open communities: for Hadoop, the ASF provides guidelines and a governance framework, and the open community defines the standards. The roadmap matches user requirements, not vendor monetization requirements. Hortonworks Open Source Development Model yields unmatched efficiency: • Infinite number of developers, under governance of the ASF, applied to the problem • End users are motivated to contribute to Apache Hadoop because they are its consumers • IT vendors are motivated to align with Apache Hadoop to capture adjacent opportunities. Hortonworks Open Source Business Model de-risks investments: • Buying behavior has changed: enterprises want a support subscription license • The vendor needs to earn your business; every year is an election year • Equitable balance of power between vendor and consumer • IT vendors want platform technologies to be open source to avoid lock-in.
  • 12. Red Hat Big Data: Open the possibilities of your data.
  • 13. Big Data innovation cannot happen in a bubble. Strong partnerships with industry leaders and open source communities.
  • 14. The Old Data Lifecycle. Multiple silos. Multiple views. Multiple goals. Roles: Business User, Architect, Data Center Operator, App Developer. Activities: Manage, Build, Code, Query.
  • 15. The New Data Lifecycle. One language. One view. One goal. Roles: Business User, Architect, Data Center Operator, App Developer. Activities: Ingest, Integrate, Act, Discover.
  • 16. Barriers to Big Data Success: lack of agile, open, and cost-effective enterprise-grade solutions. "I want more than canned BI queries." "I am locked into a vendor stack." "I want to use my favorite dev framework." "I need to integrate data across silos." Roles: Business User, Architect, Data Center Operator, App Developer.
  • 17. Big Data Solutions from Red Hat. Roles: Business User, Architect, Data Center Operator, App Developer. Activities: Ingest, Integrate, Act, Discover.
  • 18. Integrated Big Data Platform. Diagram labels: Cisco UCS Integrated Infrastructure for Big Data; Hadoop Compatible File System: Red Hat Storage; Hadoop Distributed File System; Hadoop Data Processing: Map/Reduce, YARN, Pig, Spark, Storm, HBase, Tez, Hive; Analytics; Operating Environment: Operating System (Red Hat Enterprise Linux), Virtualization (Red Hat Enterprise Virtualization), Cloud (Red Hat Enterprise Linux OpenStack Platform; Cisco OpenStack); Data Integration & Application Development: Application Platform-as-a-Service (OpenShift by Red Hat), Data Integration and Data Services (Red Hat JBoss Data Virtualization; Composite), Data Caching (Red Hat JBoss Data Grid), Business Rules Mgmt (Red Hat JBoss BRMS), Development (Red Hat JBoss Developer Studio); Management; Ambari; Cisco Security Suite; Cisco UCS Director Express; Cisco Unified Management. Legend: Hortonworks, Cisco, Red Hat.
  • 19. Hardware Architecture: Cisco UCS with Big Data. Ron Graham, Big Data Analytics Engineer. Software and Solutions Innovation: Empowering What's Next.
  • 20. © 2014 Cisco and/or its affiliates. All rights reserved. Cisco Confidential. Why Cisco UCS for Big Data? • Manageability • Save time with UCS Manager • Enables consistent and rapid deployments using UCS service profiles • Offers operational simplification • Delivers a modular solution • Scalability • Performance. Analogy: a SIM card is the identity for a phone; a service profile is the identity for a server.
  • 21. UCS Director Express for Big Data: end-to-end solution for Hadoop. • End-to-end provisioning, installation, and monitoring tool for Hadoop clusters • Better business outcomes with faster time to value from Big Data • Provides an appliance-like experience without the inflexibility • Centralized visibility across Hadoop and physical infrastructure • Powerful interface for further integration into third-party tools and services.
  • 22. Powering Big Data and Analytics. Complete and industry-leading portfolio: UCS B200 (scale-out analytics); UCS C240 (Hadoop, NoSQL, MPP); C/B460 (in-memory analytics); UCS C3160, C3260 (Hadoop); UCS C220 (real-time, streaming); UCS Mini (all-in-one at edge); UCS M-Series (massive scale-out); Big Data with EMC Isilon and VCE, Invicta (Fast Data); FlexPod Select with NetApp E-Series. UCS Manager, Director, Express, Central, Red Hat, ACI. Actian, DataStax, Hortonworks, MongoDB, Pivotal, SAP, SAS, Splunk. Cisco, Elastic Search, IBM, Informatica, Microsoft, MicroStrategy, Oracle, SAP, SAS and others. Row labels: Ecosystem Partners; ISV Partners; Infrastructure Management; Data Management; Applications.
  • 23. Cisco Validated Designs Accelerate Deployment. Cisco Validated Designs for leading big data platforms can be found at: www.cisco.com/go/bigdata
  • 24. Cisco UCS CPA for Big Data v3: Reference Architecture and Bundles. Starter: 8x UCS C220 M4; CPU 2 x Intel Xeon E5-2620 v3 (15M cache, 2.40 GHz); memory 256 GB; storage 8 x 1.2-TB 10K SAS SFF HDD. High Performance: 8x UCS C220 M4; CPU 2 x Intel Xeon E5-2680 v3 (30M cache, 2.50 GHz); memory 384 GB; storage 2 x 1.2-TB 10K SAS SFF HDD, 6 x 400-GB SAS SSD. Performance Optimized: 16x UCS C240 M4; CPU 2 x Intel Xeon E5-2680 v3 (30M cache, 2.50 GHz); memory 256 GB; storage 2 x 120-GB SATA SSD, 24 x 1.2-TB 10K SAS SFF HDD. Capacity Optimized: 16x UCS C240 M4; CPU 2 x Intel Xeon E5-2620 v3 (15M cache, 2.40 GHz); memory 128 GB; storage 2 x 120-GB SATA SSD, 12 x 4-TB 7.2K SAS SFF HDD. Extreme Capacity: 2x UCS C3160; CPU 2 x Intel Xeon E5-2695 v2 (30M cache, 2.40 GHz); memory 256 GB; storage 2 x 120-GB SATA SSD, 60 x 4-TB 7.2K SAS SFF HDD.
  • 25. 64 Node Cluster Configuration. 2x UCS 6296 Series Fabric Interconnect with UCS Manager. • UCS domain (68 servers) • Managed by UCS Manager • 2.8 PB of storage • HDP 2.2 • Tiered storage • Tez • RHEL 6.5 • Dual 10G network • 17 servers per rack. UCS C240 M4: 2x E5-2680 v3, 256GB memory, Cisco 12Gb/s SAS RAID controller, 2x 120GB SATA SSD, 24x 1.2TB 10k SAS, 2x Cisco UCS VIC 1227. UCS C3160: 2x E5-2695 v2, 256GB memory, Cisco 12Gb/s SAS RAID controller, 2x 120GB SATA SSD, 60x 4TB 7.2k SAS SFF, 2x Cisco UCS VIC 1227. Link labels: 17 x 10Gb Ethernet (shown twice).
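The 2.8 PB figure on slide 25 can be sanity-checked from the drive counts listed for each server type. The sketch below is only a back-of-the-envelope check: it assumes the 68-server domain splits into 64 UCS C240 M4 nodes and 4 UCS C3160 nodes (the slide gives the totals but not the split), counts data drives only (the 120GB boot SSDs are excluded), and uses decimal units. The function and variable names are illustrative.

```python
# Raw (pre-replication) capacity check for the 68-server UCS domain.
# ASSUMPTION: 64 x C240 M4 + 4 x C3160; the slide does not state this split.

def raw_capacity_gb(servers: int, drives_per_server: int, drive_gb: int) -> int:
    """Raw data-drive capacity in decimal GB (boot SSDs excluded)."""
    return servers * drives_per_server * drive_gb

# C240 M4: 24 x 1.2 TB drives per server; C3160: 60 x 4 TB drives per server.
c240_gb = raw_capacity_gb(servers=64, drives_per_server=24, drive_gb=1200)
c3160_gb = raw_capacity_gb(servers=4, drives_per_server=60, drive_gb=4000)
total_gb = c240_gb + c3160_gb
total_pb = total_gb / 1_000_000

print(f"C240 M4 tier: {c240_gb / 1000:.1f} TB")
print(f"C3160 tier:   {c3160_gb / 1000:.1f} TB")
print(f"Total:        {total_pb:.2f} PB")
```

Under this assumed split the raw total comes to roughly 2.8 PB, which is consistent with the slide. Usable HDFS capacity would be lower once replication is taken into account.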
  • 26. Unified Management: Programmability, Scalability and Automation. Diagram components: UCSD Express; UCS Manager; Apache Ambari; UCS 6200 Series Fabric Interconnect; UCS C240 M4 Series Rack Server; UCS C3160 Rack Server.
  • 27. Multi-tiered Storage Architecture: Multi-temperature Policy. Hot: for both storage and compute. Data that is popular and still being used for processing stays in this policy. When a block is hot, all (n) replicas are stored on DISK. Warm: partially hot and partially cold. When a block is warm, one replica is stored on DISK and the remaining n-1 replicas are stored on ARCHIVE. Cold: only for storage, with limited compute. Data that is no longer being used, or that needs to be archived, is moved from hot storage to cold storage. When a block is cold, all n replicas are stored on ARCHIVE. Diagram components: UCS 6200 Series Fabric Interconnect; UCS C240 M4 Series Rack Server; UCS C3160 Rack Server.
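The replica rules on slide 27 can be written down as a small lookup. The following is a minimal sketch in plain Python that models only the placement rule stated on the slide; it is not the HDFS API, and the function name `replica_placement` is hypothetical.

```python
from typing import List

DISK, ARCHIVE = "DISK", "ARCHIVE"

def replica_placement(policy: str, replication: int) -> List[str]:
    """Return the storage type for each replica of a block.

    Models the slide's rules: Hot = all replicas on DISK,
    Warm = 1 on DISK and n-1 on ARCHIVE, Cold = all on ARCHIVE.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    policy = policy.upper()
    if policy == "HOT":
        return [DISK] * replication
    if policy == "WARM":
        return [DISK] + [ARCHIVE] * (replication - 1)
    if policy == "COLD":
        return [ARCHIVE] * replication
    raise ValueError(f"unknown policy: {policy}")

for name in ("HOT", "WARM", "COLD"):
    print(name, replica_placement(name, 3))
```

With a replication factor of 3, a warm block keeps one replica on the disk tier and two on the archive tier, which is how the policy trades compute locality against denser storage.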
  • 28. Multi-tiered Storage Architecture: Multi-temperature Policy. Mover: a new data migration tool. It periodically scans the files in HDFS to check whether the block placement satisfies the storage policy. For blocks that violate the storage policy, it moves replicas to a different storage type to fulfill the policy requirement. Policy summary: Hot, all (n) replicas on Disk; Warm, 1 replica on Disk and n-1 on Archive; Cold, n replicas on Archive. Diagram components: UCS 6200 Series Fabric Interconnect; UCS C240 M4 Series Rack Server; UCS C3160 Rack Server.
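The scan-and-move behaviour described for the Mover on slide 28 can be illustrated with a toy planner. This is a sketch under stated assumptions, not the actual HDFS Mover: it works on an in-memory dictionary of hypothetical block IDs, and the names `target_layout` and `plan_moves` are invented for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

def target_layout(policy: str, n: int) -> Counter:
    """Replica counts per storage type for replication factor n (from the slides)."""
    layouts = {
        "HOT": Counter({"DISK": n}),
        "WARM": Counter({"DISK": 1, "ARCHIVE": n - 1}),
        "COLD": Counter({"ARCHIVE": n}),
    }
    return layouts[policy]

def plan_moves(blocks: Dict[str, List[str]], policy: str) -> List[Tuple[str, str, str]]:
    """Return (block_id, from_type, to_type) moves needed to satisfy the policy."""
    moves = []
    for block_id, replicas in blocks.items():
        current = Counter(replicas)
        target = target_layout(policy, len(replicas))
        # Replicas sitting on a storage type the policy does not want...
        surplus = list((current - target).elements())
        # ...are paired with the storage types that are still short of replicas.
        deficit = list((target - current).elements())
        moves.extend((block_id, src, dst) for src, dst in zip(surplus, deficit))
    return moves

# Hypothetical blocks: one entirely on DISK, one already on ARCHIVE.
blocks = {"blk_A": ["DISK", "DISK", "DISK"], "blk_C": ["ARCHIVE"] * 3}
for move in plan_moves(blocks, "COLD"):
    print(move)
```

Only blocks whose placement violates the policy generate moves; blocks that already comply are left alone, which mirrors the behaviour the slide describes.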
  • 29. © Hortonworks Inc. 2011 – 2015. All Rights Reserved. Next Steps: Download the Hortonworks Sandbox. Learn Hadoop. Build Your Analytic App. Try Hadoop. Learn more with our partnerships: http://hortonworks.com/partner/cisco/ and http://hortonworks.com/partner/redhat/. Joint CVD: bit.ly/Cisco-CVD
  • 30. Meet us in person! • Cisco Live! in San Diego: June 7 - 11 • Hadoop Summit in San Jose: June 9 - 11 • Red Hat Summit in Boston: June 23 - 26. For more information about Red Hat's Big Data solutions, please visit: • redhat.com/bigdata • redhatstorage.redhat.com/category/big-data • redhat.com/en/insights/big-data. For more information about Cisco's Big Data and Analytics offers, please visit: • www.cisco.com/go/bigdata and www.cisco.com/go/bigdata_design • http://blogs.cisco.com/author/raghunathnambiar • bit.ly/Cisco-CVD