Building OpenDNS Stats

          Richard Crowley
      richard@opendns.com
@400000004a381ba80c294ddc   q1   69.64.43.245 normal 558867 alt2.gmail-smtp-in.l.google.com. 1 0
Then: 8 billion DNS queries per day
@400000004a381ba80dd39e94   q1   163.192.13.30 normal 894966 dns.hitachi-koki.co.jp. 1 0
@400000004a381ba80dd3a664   q1   63.84.243.25 normal 0 photos-d.ak.fbcdn.net. 1 0
@400000004a381ba80dd3ae34   q1   24.155.125.240 normal 1045953 my-iqquiz.com. 1 0
@400000004a381ba80dd3b604   q1   64.253.103.18 normal 788290 6.164.133.166.in-addr.arpa. 12 2
@400000004a381ba80dd3bdd4   q1   70.246.80.10 normal 0 googleads.g.doubleclick.net. 1 0
@400000004a381ba80dd3c5a4   q1   98.108.66.45 normal 0 _ldap._tcp.nj-bloomfield._sites.dc._msdcs.mrii.c
@400000004a381ba80dd41b94   q1   98.144.16.195 normal 0 js.casalemedia.com. 1 0
@400000004a381ba80dd42364   q1   68.165.29.60 normal 0 img-cdn.mediaplex.com. 1 0
@400000004a381ba80dd42b34   q1   12.233.75.219 normal 0 zsmseno.clnet.cz. 1 0
@400000004a381ba80dd43304   q1   174.37.58.88 normal 0 70.96.118.85.bl.spamcop.net. 16 0
@400000004a381ba80dd43ad4   q1   208.76.86.13 normal 519070 252.76.75.208.bl.spamcop.net. 1 3
@400000004a381ba80dd442a4   q1   201.138.19.196 normal 0 isatap.domain.local. 1 3
@400000004a381ba80dd465cc   q1   24.192.98.53 normal 0 208.85.224.82.in-addr.arpa. 12 0
@400000004a381ba80dd46d9c   q1   64.91.71.57 normal 0 liveupdate.symantecliveupdate.com. 1 0
@400000004a381ba80dd4756c   q1   69.64.43.245 normal 558867 alt4.gmail-smtp-in.l.google.com. 1 0
@400000004a381ba80dd47d3c   q1   69.64.43.245 normal 558867 alt4.gmail-smtp-in.l.google.com. 1 0
@400000004a381ba80dd4850c   q1   72.10.191.11 normal 812477 iprep1.t.ctmail.com. 1 0
@400000004a381ba80dd49c7c   q1   12.233.75.219 normal 0 zsmseno.clnet.cz. 1 0
@400000004a381ba80dd4a44c   q1   69.157.60.79 normal 0 img-cdn.mediaplex.com. 1 0
@400000004a381ba80dd4ac1c   q1   208.43.52.205 nxdomain 0 haghway.com.br. 1 0
@400000004a381ba80dd4b3ec   q1   204.145.0.242 normal 488877 105.12.90.201.asetnhap5duax9a26l24rda5g3gv
@400000004a381ba80dd4bbbc   q1   206.246.157.1 normal 0 penninegas.co.uk. 15 2
@400000004a381ba80dd4c38c   q1   69.21.243.131 normal 0 svn.atomicobject.com. 28 0
@400000004a381ba80dd4dafc   q1   163.192.13.65 normal 894966 dns.hitachi-koki.co.jp. 1 0
@400000004a381ba80dd4e2cc   q1   76.65.199.42 nxdomain 0 cs16.msg.dcn.yahoo.com. 1 0
@400000004a381ba80dd4ea9c   q1   189.169.97.227 normal 0 impaktosoo.gateway.2wire.net. 1 3
@400000004a381ba80dd4f26c   q1   69.64.43.245 normal 558867 gmail.com. 15 0
@400000004a381ba80dd4f654   q1   189.168.174.182 normal 0 wpad.2wire.net. 1 3
@400000004a381ba80dd4fe24   q1   69.64.43.245 normal 558867 alt3.gmail-smtp-in.l.google.com. 1 0
@400000004a381ba80dd51594   q1   189.133.170.67 normal 0 v13.lscache5.googlevideo.com. 1 0
@400000004a381ba80dd538bc   q1   12.186.60.189 nxdomain 0 carolyn5.ktemca.com. 1 0
@400000004a381ba80dd5408c   q1   72.249.148.132 normal 384918 mailin-04.mx.aol.com. 1 0
@400000004a381ba80dd5485c   q1   76.65.199.42 nxdomain 0 csa.yahoo.com. 1 0
@400000004a381ba80dd5502c   q1   208.73.228.5 normal 119716 3.0.0.172.in-addr.arpa. 12 3
@400000004a381ba80dd55414   q1   72.249.26.8 normal 0 schnurr.de. 1 0
@400000004a381ba80dd55be4   q1   96.61.141.172 servfail 0 bc2.gamingsquared.com. 1 0
Now: 14 billion DNS queries per day
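
A minimal sketch of parsing one of the log lines above. The leading `@` plus 24 hex digits is a TAI64N timestamp, which is a documented format. The meaning of the other columns is not stated in the deck: treating the fifth column as a network ID and the seventh as the DNS record type number is an inference from the sample data, and the `q1` marker and the last column are left uninterpreted. `Query` and `parse_line` are hypothetical names.

```python
from typing import NamedTuple

TAI64_EPOCH = 1 << 62  # TAI64 labels are offset by 2**62


class Query(NamedTuple):
    tai_seconds: int  # seconds since the start of 1970 TAI
    nanoseconds: int
    client_ip: str
    result: str       # "normal", "nxdomain", "servfail"
    network_id: int   # ASSUMPTION: fifth column is a network ID, 0 = none
    domain: str
    qtype: int        # ASSUMPTION: seventh column is the record type (1 = A, 12 = PTR, 15 = MX)


def parse_line(line: str) -> Query:
    fields = line.split()
    stamp = fields[0] if fields else ""
    # Lines truncated on the slide have fewer than 8 columns and are rejected.
    if len(fields) != 8 or not stamp.startswith("@") or len(stamp) != 25:
        raise ValueError("unrecognized log line: %r" % line)
    seconds = int(stamp[1:17], 16) - TAI64_EPOCH
    nanoseconds = int(stamp[17:], 16)
    return Query(seconds, nanoseconds, fields[2], fields[3],
                 int(fields[4]), fields[5], int(fields[6]))
```

Converting `tai_seconds` to UTC needs a leap-second table, so the sketch stops at TAI.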
Logs are silly, let’s make graphs
High level design from my OpenDNS interview


    map/reduce/ish


    Stage 1 buckets data by network
    Stage 2 aggregates and stores


    Prefers to duplicate data rather than omit data


    Give each network a separate table (keeps each table
    small(er) and keeps the primary key small(er))
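
The two-stage design above can be sketched in miniature as follows. This is an illustration under assumptions, not the OpenDNS code: the function names and the `stats_<network>` table naming scheme are made up, and the input is already reduced to `(network_id, domain)` pairs.

```python
from collections import Counter, defaultdict


def stage1_bucket(records):
    """Stage 1: group (network_id, domain) pairs by network."""
    buckets = defaultdict(list)
    for network_id, domain in records:
        buckets[network_id].append(domain)
    return buckets


def stage2_aggregate(buckets):
    """Stage 2: count queries per domain within each network."""
    return {network_id: Counter(domains) for network_id, domains in buckets.items()}


def table_name(network_id):
    # Hypothetical naming scheme: one table per network, so the
    # network ID never has to appear in the table's primary key.
    return "stats_%d" % network_id
```

Because each network has its own table, a row only needs to be keyed by what varies inside that network (for example day and domain), which is what keeps the key small.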
False starts
False start #1: storing domains

    auto_increment is bad (table lock)


    Use the SHA1 of the domain as primary key


    Currently we have 2 machines storing domains
    About 48 GB in each domains.ibd
    28 GB memcached across 8 machines
    effectively makes this database write-only
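
A rough sketch of why a SHA1 key plus a cache can make the domains database write-only. The key is computed from the domain itself, so no read is needed to learn a row's ID, and the cache answers "is this domain already stored?". The dictionaries stand in for memcached and MySQL, the function names are hypothetical, and lowercasing the domain before hashing is an assumption.

```python
import hashlib


def domain_key(domain: str) -> bytes:
    """20-byte SHA-1 digest of the domain, usable as a fixed-width primary key."""
    # ASSUMPTION: domains are lowercased before hashing.
    return hashlib.sha1(domain.lower().encode("ascii")).digest()


def store_domain(domain, cache, database):
    """Insert the domain unless the cache says it is already stored.

    Returns True when a database write happened.
    """
    key = domain_key(domain)
    if key in cache:          # cache hit: no database access at all
        return False
    database[key] = domain    # stand-in for an idempotent INSERT
    cache[key] = True
    return True
```

Unlike auto_increment, nothing here has to coordinate on a counter, so any machine can compute the same key independently.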
False start #2: std::bad_alloc

  Stage 2 aggregated too much data and ran out of memory


  Bad idea: improve the heuristic used to guess
  memory usage and prevent std::bad_alloc


  Good idea: catch std::bad_alloc, clean up and restart
  Pre-allocating buffers that will be reused makes this easy


  Protip: Run two programs (memcached and Stage 2, for
  example) compiled 32-bit on a 64-bit CPU with 8 GB RAM
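
The "good idea" above, sketched in Python with `MemoryError` standing in for `std::bad_alloc`. To keep the example runnable, the failure is simulated with a hypothetical `max_rows` cap (assumed to be at least 1) instead of a real allocation failure, and the pre-allocated buffers from the slide are not modeled. `Aggregator` and `run` are made-up names.

```python
class Aggregator:
    def __init__(self, max_rows):
        self.max_rows = max_rows  # hypothetical cap that simulates running out of memory
        self.rows = {}
        self.restarts = 0

    def add(self, key):
        if key not in self.rows and len(self.rows) >= self.max_rows:
            raise MemoryError("simulated allocation failure")
        self.rows[key] = self.rows.get(key, 0) + 1

    def flush(self, sink):
        """Clean up: hand everything aggregated so far to the sink and start empty."""
        sink.extend(sorted(self.rows.items()))
        self.rows.clear()


def run(aggregator, keys, sink):
    for key in keys:
        try:
            aggregator.add(key)
        except MemoryError:
            # No attempt to predict memory use: fail, clean up, restart, retry.
            aggregator.flush(sink)
            aggregator.restarts += 1
            aggregator.add(key)
    aggregator.flush(sink)
```

The point of the design is that the recovery path is exercised routinely, so it does not depend on a memory estimate being right.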
False start #3: open tables

  80+ %iowait from opening and closing tables


  strace showed lots of calls to open() and close()
  strace crashed MySQL


  Altered mysqld_safe to set ulimit -n 600000
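
A small sketch for checking whether a process's file descriptor limit can cover a given number of open tables. It uses Python's `resource` module (Unix only). The `files_per_table` and `reserve` parameters are assumptions for illustration; the deck does not say how 600000 was chosen, although it is twice the `table_cache=300000` setting shown on the Stats Databases slide.

```python
import resource


def descriptors_needed(open_tables, files_per_table=1, reserve=1024):
    """Descriptors required to keep `open_tables` tables open at once."""
    return open_tables * files_per_table + reserve


def limit_is_sufficient(open_tables, soft_limit, files_per_table=1, reserve=1024):
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= descriptors_needed(open_tables, files_per_table, reserve)


def current_soft_limit():
    """Soft RLIMIT_NOFILE of the calling process (what `ulimit -n` reports)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft
```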
False start #4: MyISAM


    Didn’t mind table locks, so I used MyISAM


    12 MB/sec total across 4 nodes


    Migration to InnoDB is in progress
    Expect a 2x improvement from InnoDB
    innodb_flush_log_at_trx_commit=2
Architecture
Bird’s eye view

    [Diagram. Components: Resolvers (worldwide), Stage 1, Stage 2, Stats DBs,
    Domains DB, User DB, Proxy, Web servers (Palo Alto).
    Site labels: Palo Alto, San Francisco.]
Stage 1 (“map”)

   rsync log files from our DNS servers to
   3 servers in San Francisco


   Looking up a network in memcached (or $GLOBALS)
   gives the preferred Stage 2


   Write log lines back to local disk,
   one bucket for each Stage 2 machine


   Future work: automated rebalancing and failover
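
A minimal sketch of the bucketing step described above, assuming each input line has already been tagged with its network ID. A plain dictionary stands in for the memcached (or `$GLOBALS`) lookup, the host names are made up, and the `default` bucket for networks with no assignment is a hypothetical addition.

```python
import os
import tempfile


def route_lines(lines, preferred_stage2, out_dir, default="unassigned"):
    """Append each (network_id, line) to the bucket file for its Stage 2 machine."""
    handles = {}
    try:
        for network_id, line in lines:
            target = preferred_stage2.get(network_id, default)
            if target not in handles:
                path = os.path.join(out_dir, target + ".log")
                handles[target] = open(path, "a")
            handles[target].write(line + "\n")
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(handles)


# Usage with two made-up Stage 2 hosts and a scratch directory.
out_dir = tempfile.mkdtemp()
used = route_lines(
    [(558867, "line one"), (894966, "line two"), (558867, "line three")],
    {558867: "db1", 894966: "db2"},
    out_dir,
)
```

Each Stage 2 machine then only has to fetch its own bucket file.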
Stage 2 data structures
Stats aggregation (pseudocode)
{
  "db1": {
    "123456": {
      "2009-06-17": {
        "last_updated": 1234567890,
        "file_ptrs": [0xDEADBEEF, 0xDECAFBAD],
        "topdomains": {
          "xkcd.com": [12,3,5,47,0,0,6,10,1,9,2,3,0,4,2,0,5,12,19,35,32,2,4,0],
        },
        "requesttypes": { "A": [ /* 24 hours */ ], "MX": [ /* 24 hours */ ] },
        "uniqueips": { "1.2.3.4": [ /* 24 hours */ ] }
      }
    }
  }
}


File reference counting (C++)
__gnu_cxx::hash_map<
  char *, // Filename
  std::pair<
    unsigned int, // Reference count
    pthread_t // Owning thread or NULL
  >,
  hash_ptr // Hashes a pointer as if it were an integer
>
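
A Python rendering of the two structures above, as a sketch only. It keeps the nesting (database, network, date) and the 24 hourly counters per key, replaces `file_ptrs` with a set of file names, and drops the owning-thread half of the reference-count pair. Counting one reference per (network, date) record that holds rows from a file is an assumption about how the counts are kept, and all function names are hypothetical.

```python
from collections import defaultdict

HOURS = 24


def new_day():
    """Empty record for one network on one date."""
    return {
        "last_updated": 0,
        "files": set(),  # plays the role of file_ptrs
        "topdomains": defaultdict(lambda: [0] * HOURS),
        "requesttypes": defaultdict(lambda: [0] * HOURS),
        "uniqueips": defaultdict(lambda: [0] * HOURS),
    }


def new_stats():
    # db -> network -> date -> day record
    return defaultdict(lambda: defaultdict(lambda: defaultdict(new_day)))


def record(stats, refcounts, db, network, date, hour,
           domain, qtype, client_ip, filename, now):
    day = stats[db][network][date]
    day["last_updated"] = now
    if filename not in day["files"]:
        # First row from this file in this record: take one reference.
        day["files"].add(filename)
        refcounts[filename] = refcounts.get(filename, 0) + 1
    day["topdomains"][domain][hour] += 1
    day["requesttypes"][qtype][hour] += 1
    day["uniqueips"][client_ip][hour] += 1
```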
Stage 2 (“reduce”)
 rsync intermediate files from all Stage 1 servers

 8 aggregator threads read intermediate files into memory

 8 pruning threads write SQL statements to disk
 They decide what to prune based on the last_updated time
 They prefer to prune data that allows many files to be deleted


 Files are reference counted and only deleted
 when all of their rows are on disk as SQL
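
One pruning step might look like the sketch below. It picks the record with the oldest `last_updated`, breaks ties in favor of the record that frees the most files, emits its rows as SQL, and reports which intermediate files have no remaining references. How the real pruning threads weigh age against files freed is not stated in the deck, and the table and column names are hypothetical. Statements are returned as (template, parameters) pairs to sidestep escaping.

```python
def files_freed(day, refcounts):
    """Number of files whose last reference is held by this record."""
    return sum(1 for name in day["files"] if refcounts[name] == 1)


def prune_oldest(days, refcounts):
    """Remove one record from `days`; return (sql_statements, deletable_files).

    days maps (network, date) to
    {"last_updated": int, "files": set, "rows": {domain: hits}}.
    """
    key = min(days, key=lambda k: (days[k]["last_updated"],
                                   -files_freed(days[k], refcounts)))
    day = days.pop(key)
    network, date = key
    template = "INSERT INTO stats_%s (day, domain, hits) VALUES (%%s, %%s, %%s)" % network
    statements = [(template, (date, domain, hits))
                  for domain, hits in sorted(day["rows"].items())]
    deletable = []
    for name in sorted(day["files"]):
        refcounts[name] -= 1
        if refcounts[name] == 0:  # every row from this file is now SQL
            del refcounts[name]
            deletable.append(name)
    return statements, deletable
```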
Stats Databases (“satan”)
  MySQL 5.0.77-percona
  12 disks
  16 GB RAM


  table_cache=300000


  innodb_dict_size_limit=2G
  innodb_flush_log_at_trx_commit=2
Website

  opendns.com is in Palo Alto
  DNS Stats are in San Francisco


  (Private) JSON API proxies small chunks
  of stats data to the website as needed


  Queries are done with no LIMIT clause
  Results are paginated in memcached (TTL = 1 hour)
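
The last two points can be sketched as: run the query once without LIMIT, split the result into pages, and cache every page for an hour. `TTLCache` is a dictionary-backed stand-in for memcached, and the page size of 50 and the key format are assumptions; only the one-hour TTL comes from the slide.

```python
import time

PAGE_SIZE = 50       # hypothetical page size
TTL_SECONDS = 3600   # TTL = 1 hour


class TTLCache:
    """Dictionary-backed stand-in for memcached with per-key expiry."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.items = {}

    def set(self, key, value, ttl):
        self.items[key] = (self.clock() + ttl, value)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None or entry[0] <= self.clock():
            self.items.pop(key, None)
            return None
        return entry[1]


def get_page(cache, query_key, page, run_query):
    cached = cache.get("%s:page:%d" % (query_key, page))
    if cached is not None:
        return cached
    rows = run_query()  # whole result set, no LIMIT clause
    pages = [rows[i:i + PAGE_SIZE] for i in range(0, len(rows), PAGE_SIZE)] or [[]]
    for number, chunk in enumerate(pages):
        cache.set("%s:page:%d" % (query_key, number), chunk, TTL_SECONDS)
    return pages[page] if page < len(pages) else []
```

Paging through a result then costs one database query per hour instead of one per page. A request for a page past the end is not cached in this sketch and reruns the query.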
Questions?

   http://opendns.com/dashboard/stats


   http://rcrowley.org/talks/opendns_stats.pdf


   richard@opendns.com

   Photo credits: http://flic.kr/p/4Szofb, http://flic.kr/p/4aH3YK,
   http://flic.kr/p/RUfEt, http://flic.kr/p/4Zng8Y, http://flic.kr/p/2MRnuq,
   http://flic.kr/p/9T4HX, http://flic.kr/p/41eEvH, http://flic.kr/p/5Rhxbq,
   http://flic.kr/p/68RgCp, http://flic.kr/p/oEVp, http://flic.kr/p/tfpXk,
   http://flic.kr/p/4Twpd4
