You’ve got HBase
How AOL Mail handles Big Data


Presented at HBaseCon
May 22, 2012
The AOL Mail System
Over 15 years old
Constantly evolving
10,000+ hosts
70+ Million mailboxes
50+ Billion emails
A technology stack that runs the gamut




                                           Presented at
                                         HBaseCon 2012
                                                Page 2
What that means…
Lots of data
Lots of moving parts
Tight SLAs
Mature system + Young software = Tough marriage
 We don’t buy “commodity” hardware
 Ingrained Dev/QA/Prod product lifecycle
 Somewhat “version locked” to tried-and-true platforms
 Expect service outages to be quickly mitigated by our NOC w/out waiting for an on-call




So where does HBase fit?
It’s a component, not the foundation
Currently used in two places
Being evaluated for more
 It will remain a tool in our diverse Big Data arsenal




An Activity Profiler
An “Activity Profiler”
Watches for particular behaviors
Designed and built in 6/2010
Originally “vanilla” Hadoop 0.20.2 + HBase 0.90.2
Currently CDH3
1.4+ Million Events/min
60x 24TB (raw) DataNodes w/ local RegionServers
15x application hosts
Is an internal-only tool
 Used by automated anti-abuse systems
 Leveraged by data analysts for ad-hoc queries/MapRed

An “Activity Profiler”




Why the “Event Catcher” layer?
Has to “speak the language” of our existing systems
 Easy to plug an HBase translator into existing data feeds
 Hard to modify the infrastructure to speak HBase

Flume was too young at the time
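A toy sketch of what such a translator layer might look like. The wire format and field names below are invented for illustration (the deck doesn't describe AOL's actual protocols): decode one line of a legacy feed into a normalized record and append it to a local log for later batch import.

```python
# Hypothetical "Event Catcher" translator: speak the legacy protocol on
# one side, write normalized records on the other. The pipe-delimited
# format and field names here are made up for the example.
import json

def decode_legacy_line(line):
    """Translate one line of a made-up legacy feed into a dict."""
    # e.g. "LOGIN|user123|1337680000|10.0.0.1"
    action, account, ts, source = line.strip().split("|")
    return {"action": action, "account": account,
            "timestamp": int(ts), "source": source}

def catch_events(lines, log):
    """Decode a stream of legacy lines and append them as JSON to `log`."""
    for line in lines:
        log.write(json.dumps(decode_legacy_line(line)) + "\n")
```

The point of the design is that only this thin layer needs to understand both worlds; the upstream systems never have to speak HBase.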




Why batch load via MapRed?
Real time is not currently a requirement
Allows filtering at different points
Allows us to “trigger” events
 Designed before coprocessors

Early data integrity issues necessitated “replaying”
 Missing append support early on
 Holes in the Meta table
 Long splits and GC pauses caused client timeouts

Can sample data into a “sandbox” for job development
Makes Pig, Hive, and other MapRed jobs easy and stable
 We keep the raw data around as well
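Two of the ideas above — filtering at load time and sampling into a development "sandbox" while the raw feed is kept for replay — can be sketched roughly like this (not AOL's actual job; all names and rates are hypothetical):

```python
# Illustrative batch-load front end: apply a load-time filter and sample
# a fraction of raw events into a sandbox set for job development. The
# raw feed itself is retained elsewhere so the import can be replayed.
import random

def batch_load(raw_events, keep, sandbox_rate=0.01, rng=random.random):
    """Split raw events into rows to load and a sampled sandbox set.

    `keep` is a predicate implementing the load-time filter.
    """
    to_load, sandbox = [], []
    for event in raw_events:
        if rng() < sandbox_rate:
            sandbox.append(event)   # copy for job development
        if keep(event):
            to_load.append(event)   # destined for HBase via MapRed
    return to_load, sandbox
```

Because the loader is a batch job over retained raw data, a bad run (or an early data-integrity bug of the kind listed above) can be fixed by simply replaying the import.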

HBase and MapRed can live in harmony
Bigger than “average” hardware
 36+GB RAM
 8+ cores

Proper system tuning is essential
 Good information on tuning Hadoop is prolific, but…
   XFS > EXT
   JBOD > RAID
 As far as HBase is concerned…
   Just go buy Lars’ book

Careful job development, optimization is key!



Contact History API
Contact History API
Services a member-facing API
Designed and built in 10/2010
Modeled after the previous application
 Built by a different Engineering team
 Used to solve a very different problem

250K+ Inserts/min
3+ Million Inserts/min during MapRed
20x 24TB (raw) DataNodes w/ local RegionServers
14x application hosts
Leverages Memcached to reduce query load on HBase
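The Memcached layer is a classic cache-aside read path: check the cache first and only fall through to HBase on a miss. A minimal sketch, with `cache` and `store` as stand-ins for real clients (e.g. a python-memcached client and an HBase Thrift/REST client):

```python
# Cache-aside read path: Memcached absorbs repeat queries so HBase only
# sees cold reads. `cache` and `store` are simple stand-in objects here.

def get_contact_history(key, cache, store, ttl=300):
    row = cache.get(key)
    if row is not None:
        return row                  # cache hit: HBase never queried
    row = store.get(key)            # cache miss: read from HBase
    if row is not None:
        cache.set(key, row, ttl)    # populate for subsequent readers
    return row
```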
Contact History API




Where we go from here
Amusing mistakes to learn from
Exploding regions
 Batch inserts via MapRed result in fast, symmetrical key space growth
 Attempting to split every region at the same time is a bad idea
 Turning off region splitting and using a custom “rolling region splitter” is a good idea
 Take time and load into consideration when selecting regions to split
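The selection logic of such a "rolling region splitter" might look roughly like the sketch below (the region records and thresholds are invented; the deck doesn't show the actual tool). The key property is that it splits at most one region per pass, weighing both size and current load, rather than letting every region split at once:

```python
# Hedged sketch of rolling-region-splitter candidate selection: with
# automatic splitting disabled, pick the single largest oversized region
# that is not currently too busy to split.

def pick_region_to_split(regions, max_size, busy_load):
    """Return the name of the best split candidate, or None.

    `regions` is an iterable of (name, size_bytes, requests_per_sec).
    Returning at most one candidate per pass avoids the "exploding
    regions" failure of splitting everything simultaneously.
    """
    candidates = [r for r in regions
                  if r[1] > max_size and r[2] < busy_load]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r[1])[0]
```

In a real deployment the caller would issue the actual split (e.g. via the HBase admin API), wait for it to complete, and repeat.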

Backups, backups, backups!
 You can never have too many

Large, non-splittable regions tell you things
 Our key space maps to accounts
 Excessively large keys equal excessively “active” accounts




Next-generation model




Thanks!






Editor's Notes

  • #2: Introduce myself: I am Chris Niemira, a Systems Administrator with AOL. I run a number of Hadoop and HBase clusters, along with numerous other components of the AOL Mail system. I spend my days doing work that ranges from system patches, code installs and troubleshooting, to capacity planning, performance and bottleneck analysis, and kernel tuning. I do a little engineering, a little design work, an on-call rotation, and every once in a while I get to play with Hadoop/HBase.
  • #3: The AOL Mail System has been around for a long time, and went through a major re-architecture between 2010 and 2011. It's not a 15-year-old code base, and we evolve it constantly. We service over 70 million mailboxes in the AOL Mail environment today. That includes supporting our paying members in addition to free accounts. Of course, member experience is our #1 priority. We have all kinds of tools in our proverbial utility belt, as we believe in using the right tool for the right job.
  • #4: It means we’re reasonably large. But we’ve also been operating “at scale” for a long time now. While we have been doing “Big Data” for a lot of years now, we got to our current size by operating a certain way: Rigid quality and change controls, lots of documentation, emphasis on uptime. As we have shifted toward being more agile, we have had to be careful with unproven technologies. HBase, for all the buzz, is still pretty young and error-prone. Some of the realities for dealing with a production Hadoop/HBase system would seemingly require a departure from our traditional mentality. Like everyone, we require stability and robustness of our production applications, but our way of getting there has had to change. Above all, however, we must still take care of our customers, so it’s a balancing act for us.
  • #5: So HBase is one of the tools we've added to our kit in the last few years that's still proving itself. We've got two applications running, and we've identified a few other places where it's a good candidate. This isn't to say that we are not using it for important things, but it's not at the core of our system. We've managed to build a relatively stable platform over time. There's a lot of scripted recovery and a lot of proactive monitoring in our environment, and for the most part, when there are problems, they are mitigated or resolved without even involving an admin.
  • #7: AOL Mail first started looking into Hadoop and HBase back in mid-2010. Other business units in our company had been working with Hadoop for a while before then, and a little intra-company consulting convinced us to give HBase a try. This system is one component of our anti-abuse strategy. I can't reveal exactly what it does, but I can tell you a bit about how the HBase stuff happens. In addition to the 60-node cluster and the application servers, there's the ancillary junk, which includes NameNodes (2x), HMasters (2x), and Zookeepers (3x). The app hosts and Zookeepers, which are currently physicals, are being switched to virtual devices in our internal cloud.
  • #8: This is what the application looks like. The “Service Layer” comprises various components within the AOL Mail system. They speak their own protocols and send messages to an “Event Catcher” which decodes the stream and writes a log to local disk. That log is imported into Hadoop (and can optionally be sampled to a development sandbox at the same time) and then further cooked via MapRed, which ultimately outputs rows into HBase and can send triggers to external applications. One thing we can do at this point (not illustrated) is populate a memcache, which may be used by client apps to reduce load on some HBase queries.
  • #10: The real answer is that when we first started, we couldn't make streaming a million and a half rows a minute work with the HBase we had two years ago. At the time, it was easier for us to build the batch loader, which has proven to have a few interesting advantages. Our next-generation model will rely on HBase itself being more stable, and will heavily leverage coprocessors to do a lot of what we're doing now with MapReduce.
  • #11: A big obstacle for us is getting MapReduce and HBase to play nicely together. From what I've seen, bigger hardware is starting to become more popular for running HBase, and we believe it's essential. We've floated between an 8–16 GB heap for the RegionServer; for this application, I believe we're currently using 16. Getting GC tuning and the IPC timeouts in HBase/Zookeeper correct is critically important. System tuning is also very important. Depending on which flavor of Linux you're running, the stock configuration may be completely inappropriate for the needs of an HBase/Hadoop complex. In particular, look at the kernel's IO scheduler and VM settings.
  • #13: This application was built a short while after we started our trial-by-fire with HBase on the previous application. It was a different development team with input from the engineers working on the previously discussed application. This application has the same “event catcher” layer for the same reasons, but it has always written directly to HBase. We import data into a “raw” table and then process that table with MapReduce writing the output into a “cooked” table. There’s a much lower number of events here, but it spikes up significantly during the MapReduce phase. It’s exactly the same class of hardware with the same ancillary junk as the previous app. Most of the query load is actually farmed out of memcache.
  • #14: Yes, this is a relatively straight-forward design.
  • #16: Exploding tables might be a better name for this, since it's an across-the-board sort of thing. Backups, of course, are obvious. We've run into three catastrophic data loss events — once each with three different clusters. The first was during a burn-in phase for the Contact History application I described earlier. At that time, the data it had accumulated over the week or so that it had been running wasn't considered essential, so we were able to truncate and move along. Another time, on a separate plain Hadoop cluster, an unintentionally malicious user actually managed to delete my backups and corrupt the NameNode's event log. Luckily that data was restorable from another source. The last time was with the Activity Profiler application. Basically, having data backups saved the day.
  • #17: This is our working model for a next-generation HBase system. It is currently being prototyped with the cooperation of our Engineering and Operations teams. The key design concept is to allow for a great deal of flexibility and re-use, and it centers around the idea of installing a fairly dynamic rules engine at both the event collection and event storage layers. Hopefully we will be presenting it soon.