Scaling with Postgres
    Robert Treat

    Highload++ 2010


Monday, October 25, 2010
Who Am I?

    ✤    Robert Treat

    ✤    OmniTI

               ✤    Design, Development,
                    DATABASE, Ops

                     ✤     Etsy, Allisports,
                           National Geographic,
                           Gilt, etc...



Who Am I?

    ✤    Robert Treat

    ✤    Postgres

               ✤    Web, Advocacy,
                    phpPgAdmin

                     ✤     Major Contributor




Who Am I?


    ✤    Postgres 6.5 -> 9.1alpha1

    ✤    Terabytes of data

    ✤    Millions of transactions per day

    ✤    OLTP, ODS, DW

    ✤    Perl, PHP, Java, Ruby, C#



Who Am I?


                           OBSERVATION == LEARNING
                                   (hopefully)




Scalability


                              It is the ability of a computer
                           application or product (hardware
                                or software) to continue to
                               function well when it (or its
                             context) is changed in size or
                            volume in order to meet a user
                                            need.



Scalability

                               Given ever increasing load



                             NEVER GO DOWN
                           ALWAYS PERFORM WELL


                              impossible goal, but we’ll try



                 NOTE! Avoiding data loss is not a goal here, but ideally we won’t lose any :-)



It starts with culture...




✤    Get over schema purity
               ✤    add column default not null




                           Good performance comes from good schema
                              design, HOWEVER, perfect relational
                                  modeling is NOT THE GOAL




✤    Devs must own schema and queries
               ✤    they design, you refine




                              Performance and scalability cannot be
                            managed solely within the database; both
                             require application level knowledge. To
                           achieve this, application developers need to
                           have visibility of the resources they work on


Gain Visibility


    ✤    Monitoring

          ✤   Alerts

          ✤   Trending

          ✤   Capacity Planning

          ✤   Performance Tuning



Gain Visibility

    ✤    Alerts

          ✤   server: out of disk space, high load, etc...

          ✤   database: connections, sequences, etc...

          ✤   business: registrations, revenue, etc...

          ✤   etc...



                                      check_postgres.pl
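
Every alert category above comes down to comparing a measured value against a warning limit and a critical limit. The sketch below shows only that idea. It is not how check_postgres.pl is implemented, and the names `Threshold`, `classify` and the limit values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """Warning and critical limits for one metric (values are illustrative)."""
    warning: float
    critical: float


def classify(value: float, threshold: Threshold) -> str:
    """Return a Nagios-style status string for a measured value."""
    if value >= threshold.critical:
        return "CRITICAL"
    if value >= threshold.warning:
        return "WARNING"
    return "OK"


# Hypothetical example: percentage of allowed connections currently in use.
connections_pct = Threshold(warning=80.0, critical=95.0)
print(classify(42.0, connections_pct))
print(classify(85.0, connections_pct))
print(classify(97.5, connections_pct))
```

The same function can serve server, database and business metrics, as long as each one gets its own limits.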
Gain Visibility

    ✤    Trending

          ✤   server: disk usage, load, etc...

          ✤   database: connections, sequences, etc...

          ✤   business: registrations, revenue, etc...

          ✤   etc...



                                     cacti, mrtg, circonus
Gain Visibility


    ✤    Capacity Planning

          ✤   disks, cpu, memory

          ✤   connections, vacuum, bloat




                           simple projections, done regularly, are good enough
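
One possible form of a "simple projection" is a straight-line fit over recent disk usage samples, extended to the point where the disk is full. The sketch below assumes roughly linear growth. The function name `days_until_full` and the sample figures are made up for illustration.

```python
def days_until_full(samples, capacity_gb):
    """Fit a line to (day, used_gb) samples and project when usage hits capacity.

    Needs at least two samples taken on different days.
    Returns days counted from the last sample, or None if usage is not growing.
    """
    n = len(samples)
    mean_x = sum(day for day, _ in samples) / n
    mean_y = sum(used for _, used in samples) / n
    # Ordinary least-squares slope and intercept.
    sxx = sum((day - mean_x) ** 2 for day, _ in samples)
    sxy = sum((day - mean_x) * (used - mean_y) for day, used in samples)
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    full_day = (capacity_gb - intercept) / slope
    return full_day - samples[-1][0]


# Made-up weekly measurements: (day number, GB used).
usage = [(0, 100.0), (7, 114.0), (14, 128.0), (21, 142.0)]
print(days_until_full(usage, capacity_gb=500.0))
```

Re-running a projection like this on a regular schedule matters more than the sophistication of the model.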
Gain Visibility


    ✤    Performance tuning

          ✤   how long do queries take?

          ✤   how often do they run?




                                          pgfouine
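
Both questions above can be answered from the server log once statement durations are logged. The sketch below is not pgfouine. It is a small stand-in that groups similar statements and sums their time, assuming log lines of the form `duration: <ms> ms  statement: <sql>`. The helper names and the sample log are hypothetical.

```python
import re
from collections import defaultdict

# Assumed log line shape when statement durations are logged:
#   LOG:  duration: 12.345 ms  statement: SELECT ...
LINE_RE = re.compile(r"duration: ([\d.]+) ms\s+statement: (.*)")


def normalize(sql: str) -> str:
    """Collapse literals so that similar queries group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # quoted strings
    sql = re.sub(r"\b\d+\b", "?", sql)   # integer literals
    return re.sub(r"\s+", " ", sql).strip().lower()


def summarize(lines):
    """Return {normalized query: (call count, total ms)}."""
    stats = defaultdict(lambda: [0, 0.0])
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a duration line
        entry = stats[normalize(match.group(2))]
        entry[0] += 1
        entry[1] += float(match.group(1))
    return {query: (count, total) for query, (count, total) in stats.items()}


sample_log = [
    "LOG:  duration: 10.0 ms  statement: SELECT * FROM users WHERE id = 1",
    "LOG:  duration: 30.0 ms  statement: SELECT * FROM users WHERE id = 22",
    "LOG:  duration: 5.5 ms  statement: UPDATE carts SET note = 'hi' WHERE id = 3",
]
report = summarize(sample_log)
# Most expensive query shapes first.
for query, (count, total_ms) in sorted(report.items(), key=lambda kv: -kv[1][1]):
    print(f"{count:>3} calls  {total_ms:8.1f} ms total  {query}")
```

Sorting by total time rather than by slowest single call surfaces the cheap queries that run very often.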
Gain Visibility
                           COMMITS/PUSHES




Gain Visibility


                           ALL alerts, graphs, query reports, etc...
                           MUST be available to EVERYONE on
                                 the team AT ALL TIMES




Hands on

                           You can’t succeed without first putting
                                 the right culture in place.

                           Once you are on the right path, make
                            sure you have the right technology




Postgres Versions

    ✤    MINIMUM: 8.3

          ✤   read-only transactions no longer consume a transaction ID (xid),
              significantly reducing vacuum activity




                                       seriously!



    ✤    BETTER: 8.4

          ✤   revised free space map management leads to more efficient
              vacuuming




    ✤    WHY NOT? 9.0

          ✤   Hot standby / streaming replication couldn’t hurt

Speaking of replication


          ✤   Common practice for scaling websites

          ✤   Good for READ based loads

          ✤   We have used many:

               ✤    Slony, rubyrep, Bucardo, 9.0 built-in, Mammoth, wrote-our-own
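
Replication helps read-heavy loads because the application can send reads to replicas and keep writes on the master. The sketch below shows that routing decision only. The class `ReplicaRouter` and the host names are hypothetical, and the "starts with SELECT" test is a deliberate simplification.

```python
import itertools


class ReplicaRouter:
    """Send writes to the master and spread reads across replicas."""

    def __init__(self, master: str, replicas: list[str]):
        self.master = master
        # Fall back to the master when no replicas are configured.
        self._read_cycle = itertools.cycle(replicas or [master])

    def target_for(self, sql: str) -> str:
        """Plain SELECTs go to the next replica, everything else to the master."""
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._read_cycle) if is_read else self.master


router = ReplicaRouter("db-master", ["db-replica-1", "db-replica-2"])
print(router.target_for("SELECT * FROM listings"))
print(router.target_for("UPDATE listings SET price = 10"))
```

A real router also has to account for replication lag and for SELECT statements that take locks or call functions that write, both of which belong on the master.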




Speaking of replication

          ✤   No favorite system for this, evaluate based on:

               ✤    avoid solutions that duplicate writes at sql level (imho)

               ✤    how comfortable are you debugging the system?

               ✤    do you need automated schema changes?

               ✤    how much redundancy / complexity do you need?

               ✤    how does the system handle node failure for N nodes?



So what would you use? (tm)

               ✤    2 Nodes, master + standby: Postgres 9.0

               ✤    Master + multiple slaves: Slony

               ✤    Master-Master: Bucardo




                                 All choices subject to change!!



A word about “Sharding”

    ✤    Distributed computing is hard(er)

          ✤   we think of things in a singular global state

          ✤   the more we can work in that model, the better

          ✤   RDBMSs offer poor solutions for multiple masters

               ✤    you must manage that complexity on your own




A word about “Sharding”


    ✤    Splitting systems by service:

          ✤     separate db for login, forums, sales, etc...

          ✤   allows for growth

          ✤   provides simple interface
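
Splitting by service keeps the application-side interface small: each service name maps to exactly one database. The sketch below assumes a static map. The DSN strings and the function name `dsn_for` are placeholders, not real hosts.

```python
# Hypothetical service-to-database map; the DSNs are placeholders.
SERVICE_DATABASES = {
    "login": "host=db-login dbname=login",
    "forums": "host=db-forums dbname=forums",
    "sales": "host=db-sales dbname=sales",
}


def dsn_for(service: str) -> str:
    """Return the connection string for the database that owns a service."""
    try:
        return SERVICE_DATABASES[service]
    except KeyError:
        raise ValueError(f"no database configured for service {service!r}") from None


print(dsn_for("forums"))
```

Moving a service to bigger hardware then means changing one entry in the map, with no change to the code that calls it.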




Pooling

          ✤   Postgres connections are expensive!

               ✤    fork new process per connection

               ✤    keep 1 process open per connection

          ✤   at 1000+ processes you will notice trouble

          ✤   POOLING

               ✤    JDBC, mod_perl

               ✤    pgbouncer ftw!
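
The idea behind pooling is to open a small, fixed number of connections once and reuse them, so the server never forks a process per request. The sketch below is an in-process pool in the spirit of application-side pools. It is not pgbouncer, which runs as a separate proxy between clients and the server. `SimplePool` and `fake_connect` are hypothetical names, and `fake_connect` stands in for a real driver's connect call.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Keep a fixed set of open connections and hand them out on demand."""

    def __init__(self, connect, size: int):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # open every connection once, up front

    @contextmanager
    def connection(self, timeout: float = 5.0):
        conn = self._idle.get(timeout=timeout)  # wait until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back instead of closing it


# Stand-in for a real driver's connect(); records every connection opened.
opened = []


def fake_connect():
    opened.append(object())
    return opened[-1]


pool = SimplePool(fake_connect, size=2)
for _ in range(100):
    with pool.connection() as conn:
        pass  # run queries with conn here
print(len(opened))
```

However many requests pass through the loop, the number of connections opened stays at the pool size. Callers that cannot get a connection within the timeout receive `queue.Empty` instead of adding another backend process.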
Summary

          ✤   Schema / queries should be shared between dev and DBA teams!

          ✤   Monitoring + Visibility!

          ✤   >= 8.3 Required!

          ✤   Replication: jump into it!

          ✤     Use connection pooling!




Thanks!

                               Oleg & Crew
                               Highload++
                                  OmniTI
                           Postgres Community!
                                   You!




                                more:
                              @robtreat2
                             www.xzilla.net