YFROG! Reliable, beautiful and fun to use messaging platform.
10,000 Concurrent requests per second
Super fast!
Super huge datastore: 2 billion rows.
Backend is scalable
Does not lose data
Why? HBase is used for 99% of the backend.
HBase Best Practices, or Taming the Beast. HBase at ImageShack: started using HBase 8 months ago.
ImageShack: 25 million monthly uniques

Hug Hbase Presentation.

  • 1. YFROG! Reliable, beautiful and fun to use messaging platform.
  • 4. Super huge datastore: 2 billion rows.
  • 7. Why? HBase is used for 99% of the backend.
  • 8. HBase Best Practices, or Taming the Beast. HBase at ImageShack: started using HBase 8 months ago.
  • 9. ImageShack: 25 million monthly uniques
  • 10. Yfrog: 33 million monthly uniques
  • 11. 4 HBase clusters of various sizes (50 TB to 1 PB)
  • 12. Storing and serving 250 million photos (500 KB average per file) on 60 servers
  • 13. Yfrog is powered by the smaller 50 TB cluster, with 2 billion rows on 20 servers
  • 14. Using HBase versions 0.89.x and 0.90.x
  • 15. What about hardware? Having many smaller nodes is better than having fewer, faster, bigger nodes.
  • 16. Lots of RAM is good, but only up to a point; just avoid swapping.
  • 17. We use sub-$1k desktop-grade servers, and they work great!
  • 18. Check your network hardware for packet drops. We had outifDiscards interrupting ZooKeeper messages, and region servers would shut themselves down during packet loss. Use ping -f to test for packet loss between core nodes.
  • 19. JVM GC takes a lot of CPU when misconfigured, e.g. with a small NewSize.
  • 20. Single NameNode? No problem: build two clusters and have your app tier handle query logging, replication, and replays when needed.
  • 21. Inexpensive 2 TB Hitachi disks (~$100) work great; you get more units for your money.
  • 22. Critical configuration elements. Starting with HBase is easy, but you need to pay attention to the following: 1. Do not start without graphs; trends over time are critical.
  • 23. 2. Set up HDFS to work flawlessly (pay attention to ulimits, thread limits, hardware stats, graphs, iowait, etc.).
  • 24. 3. Adjust JVM GC NewSize to be at least 100 MB (if young-generation GC is too slow at 100 MB, you need faster CPUs).
  • 25. 4. For metadata rows (small rows), set your HBase block size to 4 or 8 KB; you will see less IO, and more blocks will fit into RAM.
  • 26. 5. Set up the write cache (memstore) and read cache (block cache) according to your load. The write cache's lower and upper limits must be close to each other; otherwise you will get very large cache flushes, which are bad for both IO and GC.
  • 27. What to monitor? (via Ganglia) The runnable threads graph should be flat; if it's not, you likely have contention somewhere (IO, HDFS, etc.).
  • 28. Memstore size graph should be fairly flat with even flushes over time.
  • 29. Iowait graphs should not go over 70-80% during major compactions, or over 20% during minor compactions. If they do, add more disks and/or nodes.
  • 30. Monitor and graph Thrift threads (via ps -eLf | grep PID); if your threads end up over 25,000, you may run out of RAM. We have dedicated Thrift boxes so that we don't accidentally kill RS nodes.
  • 31. We use Nagios to monitor and alert for DN, RS, ZK, NN, etc. on their web TCP ports, which is very helpful.
  • 32. Run hbck to check for consistency of meta structures.
  • 33. Issues we had. App tier bugs would abuse HBase and generate millions of queries, so logging all RPC calls to HBase on the app tier is critical. It took us a long time to figure out that HBase was not at fault, because we did not know what to expect.
  • 34. Various RAM brands: boxes would crash for no apparent reason.
  • 35. Glibc in FC13 had a race condition bug that would lock up nodes and crash JVM processes under high load. Solution: yum -y update glibc (invalid binfree)
  • 36. When running in a mixed hardware environment, some boxes were slow enough to affect HDFS for the whole cluster. Looking at "runnable threads" and "fsreadlatency" in Ganglia always pointed to which boxes were slow.
  • 37. Running Cloudera HDFS under user 'hadoop', which was restricted to 1024 threads by default, would crash datanodes, but only during compactions. Setting hadoop soft (and hard) nproc to 32,000 in limits.conf resolved it.
  • 38. GC sometimes auto-tuned NewSize to 20 MB, which caused 20 to 30 GC runs per second, flatlining the CPU at 100% and killing the RS. Manually setting NewSize to 128 MB resolved this issue.
  • 39. Finally, everything is tuned, and HBase runs great!
  • 42. Fast: 0.5 ms puts, 2-3 ms reads, 10 ms disk reads.
  • 43. Recovers quickly when nodes are taken down
  • 44. On-call team can finally relax
  • 45. Final setup tips. Use the same hardware for all your nodes.
  • 46. Load test HBase with YCSB: just leave it running for a week, and if nothing crashes, you are good. Best not to test with live user traffic :)
  • 47. Do not worry about NameNode redundancy; just back up the /name dir frequently. Set up a secondary HBase cluster with the money you save by not buying 'server'-grade nodes.
  • 48. Burn in your disks, even if they are new
  • 49. Put Memcached between your app tier and HBase. App bugs will hit Memcached first, keeping HBase safe from the assault, which could otherwise drive up your utilization.
  • 50. Many thanks to: Ryan Rawson
  • 53. And everyone else on the HBase user list who helped us out during the rough times.
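
The Thrift thread check from slide 30 (ps -eLf | grep PID, alerting above 25,000 threads) can be scripted. What follows is a minimal sketch under stated assumptions, not the tooling used at ImageShack: it assumes the standard procps column layout for `ps -eLf`, where each row is one thread and the second column is the owning PID. The function names `count_threads`, `over_limit`, and `sample_thrift_threads` are hypothetical and introduced only for illustration.

```python
import subprocess

# Threshold taken from slide 30; above it the deck reports a risk of
# running out of RAM on the Thrift box.
THREAD_LIMIT = 25_000


def count_threads(ps_output: str, pid: int) -> int:
    """Count the rows of `ps -eLf` output that belong to `pid`.

    Assumed column layout (procps): UID PID PPID LWP C NLWP STIME TTY TIME CMD,
    with one row printed per thread.
    """
    count = 0
    for line in ps_output.splitlines():
        fields = line.split()
        # Compare the PID column exactly, so PID 12 does not also match
        # PID 123 the way a plain grep would. The header row never matches.
        if len(fields) > 1 and fields[1] == str(pid):
            count += 1
    return count


def over_limit(thread_count: int, limit: int = THREAD_LIMIT) -> bool:
    """Return True when the thread count exceeds the alerting threshold."""
    return thread_count > limit


def sample_thrift_threads(pid: int) -> int:
    """Run `ps -eLf` on the local machine and count the threads of `pid`."""
    result = subprocess.run(
        ["ps", "-eLf"], capture_output=True, text=True, check=True
    )
    return count_threads(result.stdout, pid)
```

Calling `sample_thrift_threads(pid)` on a schedule and feeding the result to a graphing system gives the trend the slide asks for, while `over_limit` is the natural hook for an alert. Matching the PID column exactly, rather than grepping the whole line, avoids counting unrelated processes whose PID or command line happens to contain the same digits.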