Big Data Cloud Meetup
Big Data & Cloud Computing - Help, Educate & Demystify.
September 8th 2011
Fail-Proofing Hadoop Clusters with Automated Service Failover
Michael Dalton, CTO, Zettaset
Sept 8th 2011 Meetup
Problem
  • Hadoop environments have many single points of failure (SPOFs)
  • NameNode, JobTracker, Oozie, Kerberos
Ideal Solution
  • Automated failover
  • No data loss
  • Handle all failover aspects (IP failover, etc.)
  • Fail over all services
      • No JobTracker = no MapReduce (MR)
      • No Kerberos = no new Kerberos authentication
Existing Solutions
  • AvatarNode (NameNode, patch from Facebook)
      • Replicates writes to a backup service
  • BackupNameNode (NameNode, not committed)
      • 'Hot' copy of the NameNode, replicated
  • All failover is manual
Why is Failover Hard?
[Figure: diagram with nodes labeled M1, M2, C1, C2]
Data Loss
  • Split-brain issues lose data
      • Multiple masters = data corruption
      • Clients confused about who is up
  • Problem for traditional HA environments (Linux-HA, etc.)
  • Heartbeat failure != death
Theoretical Limits
  • Can we solve this reliably?
  • Fischer-Lynch-Paterson (FLP) theorem
      • Deterministic consensus is impossible in a fully asynchronous distributed system if even a single process can fail
  • No free lunch
Revisiting Our Assumptions
  • Drop the fully asynchronous requirement
  • What about leases?
      • Masters obtain and renew a lease
      • Shut down if the lease expires (not asynchronous)
  • Assumes only bounded relative clock skew
      • Everyone should agree on how fast time elapses
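The bounded-skew assumption can be written as a short inequality. This is a sketch under assumed notation that does not appear in the slides: $D$ is the lease duration as measured by the lock service, $\epsilon$ bounds the relative clock-rate difference, $t_{\text{send}}$ is the master's local time when it sent the acquire or renew request, and $t_{\text{stop}}$ is the local time by which it must stop acting as master unless it has renewed.

```latex
% Assumption: while the master's clock advances by \Delta, the lock
% service's clock advances by at most (1+\epsilon)\Delta.
% The lease cannot be granted before the request was sent, so counting
% from t_send on the master's own clock is conservative.
\[
  (1+\epsilon)\,\bigl(t_{\text{stop}} - t_{\text{send}}\bigr) \le D
  \quad\Longrightarrow\quad
  t_{\text{stop}} \le t_{\text{send}} + \frac{D}{1+\epsilon}
\]
```

If the bound on $\epsilon$ holds, the old master has stopped before the lock service considers the lease expired, so a newly elected master never overlaps with it.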
Master Failover
  • Requires a highly available lock / lease system
  • Master obtains a lease to be master
  • Replicates writes to a backup master
  • If the master loses its lease, hold a new election
  • Old master will shut down when its lease expires
  • If clock skew is bounded, no split-brain!
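The lease rule can be illustrated with a small simulation. This is a minimal sketch, not the speaker's implementation: `LeaseService`, `local_deadline`, `simulate`, and all parameter values are hypothetical, clocks are plain numbers rather than real time, and the old master is assumed to be partitioned so it can never renew.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaseService:
    """Toy lock service: grants one lease at a time, timed on its own clock."""
    duration: float
    holder: Optional[str] = None
    expires_at: float = 0.0

    def acquire(self, who: str, now: float) -> bool:
        # Grant if the lease is free, already held by the caller (renewal),
        # or expired according to the service's clock.
        if self.holder in (None, who) or now >= self.expires_at:
            self.holder, self.expires_at = who, now + self.duration
            return True
        return False


def local_deadline(sent_at_local: float, duration: float, max_skew: float) -> float:
    """Local time at which a master must stop acting unless it has renewed."""
    return sent_at_local + duration / (1.0 + max_skew)


def simulate(duration=10.0, max_skew=0.05, actual_skew=0.04, step=0.5):
    """Return (service time when m1 stops, service time when m2 takes over)."""
    service = LeaseService(duration)
    assert service.acquire("m1", now=0.0)  # m1 becomes master at t=0
    # m1's clock runs slow: service time = local time * (1 + actual_skew).
    m1_stop_service = local_deadline(0.0, duration, max_skew) * (1.0 + actual_skew)
    # m1 is partitioned and never renews; m2 polls until the lease frees up.
    t = 0.0
    while not service.acquire("m2", now=t):
        t += step
    return m1_stop_service, t


if __name__ == "__main__":
    stop, start = simulate()
    print(f"m1 stops at {stop:.2f}, m2 starts at {start:.2f}")
```

With the actual skew inside the assumed bound, the old master stops before the new one starts. Calling `simulate(actual_skew=0.10)` violates the bound and produces an overlap, which is the split-brain case the slide warns about.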
Failover: Locks/Consensus
  • Apache ZooKeeper (Hadoop subproject)
      • Highly available distributed filesystem for distributed consensus problems
  • Create election, membership, etc. using special-purpose FS semantics
      • 'Ephemeral' files disappear when the session lease expires
      • 'Sequential' files have an auto-incremented suffix
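The way ephemeral and sequential files combine into an election can be shown with an in-memory toy model. This is only a sketch of the semantics described above, not the ZooKeeper client API: `ToyCoordinator`, its methods, the `/election/n_` prefix, and the host names are all hypothetical.

```python
import itertools


class ToyCoordinator:
    """In-memory stand-in for ephemeral + sequential files (not the ZooKeeper API)."""

    def __init__(self):
        self._counter = itertools.count()
        self._nodes = {}  # file name -> (owning session id, payload)

    def create_ephemeral_sequential(self, prefix, session, payload):
        # 'Sequential': append a zero-padded, auto-incremented suffix.
        name = f"{prefix}{next(self._counter):010d}"
        self._nodes[name] = (session, payload)
        return name

    def expire_session(self, session):
        # 'Ephemeral': every file owned by the session disappears with it.
        self._nodes = {n: v for n, v in self._nodes.items() if v[0] != session}

    def leader(self, prefix):
        # The lowest surviving sequence number wins the election.
        candidates = sorted(n for n in self._nodes if n.startswith(prefix))
        return self._nodes[candidates[0]][1] if candidates else None


if __name__ == "__main__":
    zk = ToyCoordinator()
    zk.create_ephemeral_sequential("/election/n_", session="s1", payload="master-a:60000")
    zk.create_ephemeral_sequential("/election/n_", session="s2", payload="master-b:60000")
    print(zk.leader("/election/n_"))  # master-a:60000
    zk.expire_session("s1")           # master-a's session lease expires
    print(zk.leader("/election/n_"))  # master-b:60000
```

Storing the candidate's address as the payload means a client that looks up the current leader also learns where to connect.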
ZooKeeper Internals
  • ZooKeeper consists of a quorum of nodes (typically 3-9)
  • Majority vote elects a leader (via leases)
  • Leader proposes all FS modifications
  • A majority must approve a modification for it to be committed
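The majority rule determines how many node failures a quorum of a given size can survive. The arithmetic below is a small illustration; the function names are hypothetical.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Nodes that can fail while a majority is still reachable."""
    return ensemble_size - majority(ensemble_size)


if __name__ == "__main__":
    for n in (3, 4, 5, 7, 9):
        print(f"{n} nodes: majority={majority(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

A 4-node quorum tolerates only one failure, the same as 3 nodes, which is why odd sizes are the usual choice.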
Example: HBase
  • Apache HBase has fully automated multi-master failover
  • Prospective masters register in ZooKeeper
  • ZooKeeper ephemeral/sequential files are used for election
  • Clients look up the current address of the master in ZooKeeper
  • Failover is fully automated
  • All files are stored on HDFS, so no replication issues
Failover: Replication
  • HBase approach avoids replication issues by using HDFS
  • Kerberos, NameNode, Oozie, etc. can't use HDFS
      • Legacy compatibility (and for the NameNode, circular dependencies)
  • How can we add synchronous write replication?
      • Can't break compatibility or change apps
Failover: Networking
  • HBase avoids networking failover by storing the master address in ZK
  • Legacy services use IP addresses or hostnames, not ZK, to connect to the master
  • Out-of-trunk patches exist to make ZK a DNS server
      • But Java doesn't respect DNS TTLs anyway, complicating the maximum time for failover
      • DNS introduces its own issues anyway...
IP Failover
  • Instead, you can fail over IP addresses
  • Virtual IPs, if supported by the router
  • Otherwise, dynamically update routes as part of your failover
      • New leader updates routing tables
  • For local area networks, ensure ARP tables are updated
      • Gratuitous ARP, or store ARP information in ZK
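A gratuitous ARP is an ARP packet in which the sender and target IP are the same address, broadcast so that neighbors refresh their ARP tables. The sketch below only builds the frame bytes; the function name, MAC address, and IP address are hypothetical examples, and actually transmitting the frame would need a raw socket and elevated privileges, which is left out here.

```python
import socket
import struct


def gratuitous_arp_frame(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for `ip`."""
    sender_mac = bytes.fromhex(mac.replace(":", ""))
    sender_ip = socket.inet_aton(ip)
    broadcast = b"\xff" * 6
    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    ethernet = broadcast + sender_mac + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6, 4,         # hardware / protocol address lengths
        1,            # opcode: request
        sender_mac,
        sender_ip,
        b"\x00" * 6,  # target MAC is not meaningful in a gratuitous request
        sender_ip,    # target IP == sender IP marks the ARP as gratuitous
    )
    return ethernet + arp


if __name__ == "__main__":
    # Hypothetical values: a locally administered MAC and a documentation-range IP.
    frame = gratuitous_arp_frame("02:00:00:aa:bb:cc", "192.0.2.10")
    print(len(frame), frame.hex())
```

The new leader would send such a frame for the failed-over IP address right after claiming it, so that hosts on the same LAN stop sending traffic to the old master's MAC address.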
Putting It All Together
  • Consensus/Election
      • Use ZooKeeper, 3-9 node quorum
  • State Replication
      • Small data in ZK, large data in HDFS
      • If neither is possible, DRBD
  • Network Failover
      • Store the master address in ZK
      • Or, perform IP failover
      • Dynamically update routing tables, update the ARP cache
Conclusion
  • Fully automated failover is possible
      • Design for synchronous replication
      • Prevent split-brain
      • Manage legacy compatibility
  • Coming to Hadoop
      • Zettaset provides fully HA Hadoop


BigDataCloud Sept 8 2011 Meetup - Fail-Proofing Hadoop Clusters with Automatic Service Failover by Mike Dalton of Zettaset
