Hadoop-2 @ eBay
Mayank Bansal
eBay
Agenda
• Who we are?
• Background of Hadoop and Hadoop at eBay
• What are the challenges
• What we achieved using Hadoop-2
Who I am
• Principal Engineer @ eBay
• Apache Hadoop Committer
• Apache Oozie PMC and Committer
• Current
• Leading Hadoop core development for YARN and MapReduce @ eBay
• Past
• Worked on schedulers / resource managers
• Worked on distributed systems
• Data pipeline frameworks
Who we are
• eBay Hadoop Team
• We are around 40 people developing and supporting Hadoop
• Thousands of Hadoop users @ eBay
Hadoop Evolution @ eBay
• 2007: 1-10 nodes
• 2009: 50+ nodes
• 2010: 100+ nodes, 1,000s+ cores, 1 PB
• 2011: 1,000+ nodes, 10,000+ cores, 10+ PB
• 2012: 3,000+ nodes, 30,000+ cores, 50+ PB
• 2013/2014: 10,000 nodes, 150,000+ cores, 150 PB
Hadoop - 1 Architecture
Hadoop-1 Limitations
• Scalability
• Maximum cluster size: 4-5K nodes
• Maximum concurrent tasks: ~40K
• JobTracker scalability
• Availability
• JobTracker failure kills all running jobs
• Hard partition between map and reduce slots
• Lower cluster utilization
• Lack of support for alternate paradigms
Hadoop-2
• Hadoop-1: single-use system (batch apps)
• Hadoop-2: multi-purpose platform (batch, interactive, streaming) built on YARN
Application Master
• Runs on Normal Node Manager machines
• Out Of Memory Errors
• Slow Machines
• Flaky Network
Application Master
Node Goes Down
• MapReduce
• Can rebuild state from job history files
• Generic applications
• Application Timeline/History Server
• YARN-321
• YARN-1530
Application Master
• Slow Machines
• Automation/Monitoring
• Flaky Network
• Split Brain problem
• Fixed for Map Reduce
• All the AppMasters have to fix this
Application Master
Out Of Memory
• Physical Memory Errors
• yarn.app.mapreduce.am.resource.mb
• yarn.app.mapreduce.am.command-opts
• Virtual Memory Errors
• Default Ratio 2.1, needs to be tweaked
• yarn.nodemanager.vmem-check-enabled
• yarn.nodemanager.vmem-pmem-ratio
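The relationship between these four settings can be sketched in a few lines of Python. This is only an illustration, not part of the original talk: the function names, the sample values (a 2048 MB container, `-Xmx1638m`), and the 80% heap headroom are assumptions chosen for the example. The only facts taken from the slide are the property names and the default ratio of 2.1, which YARN multiplies by the container's physical memory to get its virtual-memory ceiling.

```python
import re


def vmem_limit_mb(container_mb: int, vmem_pmem_ratio: float = 2.1) -> float:
    """Virtual-memory ceiling for a container: physical size times
    yarn.nodemanager.vmem-pmem-ratio (2.1 is the default on the slide)."""
    return container_mb * vmem_pmem_ratio


def heap_mb_from_opts(command_opts: str) -> int:
    """Extract the -Xmx heap size in MB from a JVM options string such as
    the value of yarn.app.mapreduce.am.command-opts."""
    match = re.search(r"-Xmx(\d+)([mMgG])", command_opts)
    if match is None:
        raise ValueError("no -Xmx setting found in: " + command_opts)
    size = int(match.group(1))
    return size * 1024 if match.group(2) in "gG" else size


def heap_fits(container_mb: int, command_opts: str, headroom: float = 0.8) -> bool:
    """True if the JVM heap leaves room for non-heap memory inside the
    container. The 0.8 headroom is a hypothetical rule of thumb."""
    return heap_mb_from_opts(command_opts) <= container_mb * headroom


# Hypothetical values for yarn.app.mapreduce.am.resource.mb and
# yarn.app.mapreduce.am.command-opts.
am_resource_mb = 2048
am_command_opts = "-Xmx1638m"

print(f"vmem ceiling: {vmem_limit_mb(am_resource_mb):.1f} MB")
print(f"heap fits in container: {heap_fits(am_resource_mb, am_command_opts)}")
```

A heap set equal to the container size (for example `-Xmx1536m` in a 1536 MB container) fails this check, which is the kind of mismatch that tends to surface as a physical-memory kill of the Application Master.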
Binary Compatibility
• Works well
• mapred APIs are binary compatible
• mapreduce APIs are source compatible
• BUT …
• Only works for ~70% of applications
• Why?
• Reflection
• Uber JARs on the classpath
• MAPREDUCE-5108
Binary Compatibility
LZO Compression
• LZO is not compiled with Hadoop-2
Avro
• http://repo1.maven.org/maven2
• Version => 1.7.4-hadoop2
Log Aggregation
• Loads a lot of data into HDFS
• 5-7 TB of data per day
• Default retention is 30 days; we reduced it to 4 days
• yarn.log-aggregation.retain-seconds
• Puts a lot of load on the NameNode
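As a rough illustration of what the shorter retention buys, the sketch below converts days into the value expected by yarn.log-aggregation.retain-seconds and estimates how much aggregated log data stays in HDFS. The helper names are hypothetical, and the estimate assumes the 5-7 TB per day figure is the size before HDFS replication and that daily volume is steady.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retain_seconds(days: int) -> int:
    """Value for yarn.log-aggregation.retain-seconds for a retention in days."""
    return days * SECONDS_PER_DAY


def retained_tb(daily_tb_low: int, daily_tb_high: int, days: int) -> tuple:
    """Low and high estimate of aggregated log data kept in HDFS, in TB."""
    return daily_tb_low * days, daily_tb_high * days


print(retain_seconds(4))       # 345600
print(retained_tb(5, 7, 30))   # (150, 210)
print(retained_tb(5, 7, 4))    # (20, 28)
```

Under these assumptions, moving from 30 days to 4 days cuts retained logs from roughly 150-210 TB to 20-28 TB, with a matching drop in the number of log files the NameNode has to track.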
User Engagement
• Engage all users to verify their jobs
• Test with production-like data
• Verify all jobs, not just the sample jobs
Benchmarks
Benchmark     Hadoop-1       Hadoop-2      Improvement
Sort          500 seconds    365 seconds   ~20%
Tera Sort     182 seconds    180 seconds   About the same
Shuffle       993 seconds    530 seconds   ~2X
Scalability   1020 seconds   275 seconds   ~4X
YARN-938
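For readers who want to recompute the improvement column, here is a small sketch that derives a speedup factor and a percentage of run time saved from the raw timings in the benchmark table. The two formulas are an assumption about how such improvements are usually expressed; the slide does not say which measure it used.

```python
# Run times in seconds, copied from the benchmark table: (Hadoop-1, Hadoop-2).
benchmarks = {
    "Sort": (500, 365),
    "Tera Sort": (182, 180),
    "Shuffle": (993, 530),
    "Scalability": (1020, 275),
}


def speedup(hadoop1_s: float, hadoop2_s: float) -> float:
    """How many times faster the Hadoop-2 run is."""
    return hadoop1_s / hadoop2_s


def time_saved_pct(hadoop1_s: float, hadoop2_s: float) -> float:
    """Percentage of the Hadoop-1 run time that Hadoop-2 saves."""
    return 100 * (hadoop1_s - hadoop2_s) / hadoop1_s


for name, (h1, h2) in benchmarks.items():
    print(f"{name:12s} {speedup(h1, h2):.2f}x faster, "
          f"{time_saved_pct(h1, h2):.0f}% less time")
```

By this arithmetic the Shuffle and Scalability rows match the rounded ~2X and ~4X figures, while the Sort row works out closer to 27% less run time than the ~20% shown, so the slide may be rounding or using a different measure.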
Hadoop-2 Numbers
[Chart: Tasks Starting per Hour, Hadoop-2 vs. Hadoop-1]
[Chart: Tasks Finishing per Hour, Hadoop-2 vs. Hadoop-1]
Chart annotations: "~59% more tasks", "~52% more tasks"
Hadoop-2 Numbers
[Chart: Apps Submitted per Hour, Hadoop-2 vs. Hadoop-1]
[Chart: Apps Finishing per Hour, Hadoop-2 vs. Hadoop-1]
Chart annotations: "~51% more tasks", "~50% more tasks"
Hadoop-2 Numbers
[Chart: Hadoop-2 Cluster Utilization over 24 hours (0:00 to 23:55); utilization axis from 0 to 1.2]
Overall improvements
• Overall job throughput
• Increased ~2X
• Overall run time of jobs
• Improved ~1.5X to 2X
Apps Beyond MapReduce
• Tez
• Storm
• Shark and Spark
• …
Availability
• Namenode HA
• RM Restart
• RM HA
• Rolling upgrades (Coming soon)
Conclusion
• There are some pain points.
• Need to plan User Testing
• Worth The Effort
Questions
Mayank Bansal
mabansal@ebay.com
mayank@apache.org