MySQL HA
with PaceMaker
   Kris Buytaert
Kris Buytaert
●   CTO and Open Source Consultant @inuits.eu
●   „Infrastructure Architect“
●   I don't remember when I started using MySQL
●   Specializing in Automated, Large Scale
    Deployments, Highly Available infrastructures,
    since 2008 also known as “the Cloud”
●   Surviving the 10th floor test
●   Cofounded devopsdays.org
In this presentation
●   High Availability ?
●   MySQL HA Solutions
●   MySQL Replication
●   Linux HA / Pacemaker
What is HA Clustering ?

●   One service goes down
=> others take over its work
●   IP address takeover, service takeover,
●   Not designed for high-performance
●   Not designed for high throughput (load
    balancing)
Does it Matter ?

●   Downtime is expensive
●   You miss out on $$$
●   Your boss complains
●   New users don't return
Lies, Damn Lies, and
Statistics
         Counting nines
            (slide by Alan R)




 Availability                    Downtime per year
 99.9999%                        30 sec
 99.999%                          5 min
 99.99%                          52 min
 99.9%                            9 hr
 99%                            3.5 days
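The figures in the table can be checked with a little arithmetic. The sketch below assumes the downtime is counted over a 365-day year; the function name is only illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Seconds of allowed downtime per year for a given availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for nines in (99.9999, 99.999, 99.99, 99.9, 99.0):
        seconds = downtime_seconds_per_year(nines)
        print(f"{nines}% -> {seconds:.0f} s ({seconds / 3600:.2f} h)")
```

Each extra nine divides the allowed downtime by ten, which is why the cost of the next nine rises so quickly.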
The Rules of HA

●   Keep it Simple
●   Keep it Simple
●   Prepare for Failure
●   Complexity is the enemy of reliability
●   Test your HA setup
You care about ?
●   Your data ?
•Consistent
•Realtime
•Eventually Consistent
●   Your Connection
•Always
•Most of the time
Eliminating the SPOF
●   Find out what Will Fail
•Disks
•Fans
•Power (Supplies)
●   Find out what Can Fail
•Network
•Going Out Of Memory
Split Brain
●   Communications failures can lead to separated
    partitions of the cluster
●   If those partitions each try and take control of
    the cluster, then it's called a split-brain
    condition
●   If this happens, then bad things will happen
•http://linux-ha.org/BadThingsWillHappen
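One common guard against split brain is a majority (quorum) rule: a partition may only run resources if it can see more than half of the cluster. The sketch below is a simplified illustration of that rule, not Pacemaker's actual implementation.

```python
def has_quorum(visible_nodes: int, total_nodes: int) -> bool:
    """A partition has quorum only with a strict majority of all nodes."""
    return visible_nodes > total_nodes / 2


if __name__ == "__main__":
    # Three-node cluster split 2/1: only the larger side may continue.
    print(has_quorum(2, 3), has_quorum(1, 3))
    # Two-node cluster split 1/1: neither side has a majority.
    print(has_quorum(1, 2))
```

In a two-node cluster a 1/1 split leaves no side with a majority, so quorum alone cannot decide who wins and fencing (STONITH) becomes the deciding mechanism.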
Historical MySQL HA
●   Replication
•1 read write node
•Multiple read only nodes
•Application needed to be modified
Solutions Today
●   BYO
●   DRBD
●   MySQL Cluster NDBD
●   Multi Master Replication
●   MySQL Proxy
●   MMM / Flipper
●   Galera
●   Percona XtraDB Cluster
Data vs Connection
●   DATA :
•Replication
•DRBD
●   Connection
•LVS
•Proxy
•Heartbeat / Pacemaker
Shared Storage
●   1 MySQL instance
●   Monitor MySQL node
●   Stonith
●   $$$              1+1 <> 2
●   Storage = SPOF
●   Split Brain :(
DRBD
●   Distributed Replicated Block Device
●   In the mainline Linux kernel (only very recently)
●   Usually only 1 mount
•Multi mount as of 8.X
•Requires GFS / OCFS2
●   Regular FS ext3 ...
●   Only 1 MySQL instance Active accessing data
●   Upon Failover MySQL needs to be started on
    other node
DRBD(2)
●   What happens when you pull the plug of a
    Physical machine ?
•Minimal Timeout
•Why did the crash happen ?
•Is my data still correct ?
•Innodb Consistency Checks ?
•Lengthy ?
•Check your BinLog size
MySQL Cluster NDBD
●   Shared-nothing architecture
●   Automatic partitioning
●   Synchronous replication
●   Fast automatic fail-over of data nodes
●   In-memory indexes
●   Not suitable for all query patterns (multi-table
    JOINs, range scans)
MySQL Cluster NDBD
●   All indexed data needs to be in memory
●   Good and bad experiences
•Better experiences when using the API
•Bad when using the MySQL Server
●   Test before you deploy
●   Does not fit for all apps
How replication works
●   Master server keeps track of all updates in the
    Binary Log
•Slave requests to read the binary update log
•Master acts in a passive role, not keeping track
of which slave has read which data
●   Upon connecting the slaves do the following:
•The slave informs the master of where it left off
•It catches up on the updates
•It waits for the master to notify it of new
updates
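The pull model described above can be sketched as a toy in Python. The class and method names are hypothetical and nothing here talks to a real MySQL server; the point is only that the slave, not the master, remembers the position.

```python
class Master:
    """Keeps every update in an append-only binary log."""

    def __init__(self):
        self.binlog = []

    def execute(self, statement: str) -> None:
        self.binlog.append(statement)

    def read_binlog(self, from_position: int) -> list:
        # The master is passive: it serves whatever position is asked for.
        return self.binlog[from_position:]


class Slave:
    """Remembers where it left off and catches up from there."""

    def __init__(self):
        self.position = 0
        self.applied = []

    def catch_up(self, master: Master) -> int:
        events = master.read_binlog(self.position)
        self.applied.extend(events)
        self.position += len(events)
        return len(events)
```

Calling `catch_up` again after a disconnect simply resumes from the stored position, which is the "informs the master of where it left off" step.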
Two Slave Threads
●   How does it work?
•The I/O thread connects to the master and asks for
the updates in the master’s binary log
•The I/O thread copies the statements to the relay
log
•The SQL thread executes the statements in the
relay log
●   Advantages
•Long running SQL statements don’t block log
downloading
•Allows the slave to keep up with the master better
•In case of master crash the slave is more likely to
have all statements
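The split between the two threads can be illustrated with a producer and a consumer sharing a queue. This is a toy model under obvious assumptions: the queue stands in for the relay log, and the function name is made up for the example.

```python
import queue
import threading


def replicate(binlog: list) -> list:
    """Copy events via an I/O thread, apply them via a separate SQL thread."""
    relay_log = queue.Queue()   # stands in for the on-disk relay log
    applied = []
    done = object()             # sentinel marking the end of the binlog

    def io_thread():
        # Downloads events without waiting for them to be applied.
        for event in binlog:
            relay_log.put(event)
        relay_log.put(done)

    def sql_thread():
        # Applies events in order, at its own pace.
        while True:
            event = relay_log.get()
            if event is done:
                break
            applied.append(event)

    threads = [threading.Thread(target=io_thread),
               threading.Thread(target=sql_thread)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return applied
```

Because downloading and applying are decoupled, a slow statement in the SQL thread does not stop the I/O thread from fetching further events.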
Replication commands
Slave commands
● START|STOP SLAVE
● RESET SLAVE
● SHOW SLAVE STATUS
● CHANGE MASTER TO…
● LOAD DATA FROM MASTER
● LOAD TABLE tblname FROM MASTER

Master commands
● SHOW MASTER STATUS
● PURGE MASTER LOGS…
SHOW SLAVE STATUS\G
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.16.0.1
                  Master_User: repli
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: XMS-1-bin.000014
          Read_Master_Log_Pos: 106
               Relay_Log_File: XMS-2-relay.000033
                Relay_Log_Pos: 251
        Relay_Master_Log_File: XMS-1-bin.000014
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: xpol
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 106
              Relay_Log_Space: 547
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
1 row in set (0.00 sec)
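Output like the above is easy to check from a script. The sketch below parses the "Name: value" lines and flags the three fields most monitoring checks look at; the function names and the 30-second lag limit are assumptions for illustration.

```python
def parse_slave_status(text: str) -> dict:
    """Turn the vertical \\G output into a {field: value} dictionary."""
    status = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip():
            status[key.strip()] = value.strip()
    return status


def replication_problems(status: dict, max_lag_seconds: int = 30) -> list:
    """Return a list of human-readable problems; empty means healthy."""
    problems = []
    if status.get("Slave_IO_Running") != "Yes":
        problems.append("I/O thread not running")
    if status.get("Slave_SQL_Running") != "Yes":
        problems.append("SQL thread not running")
    lag = status.get("Seconds_Behind_Master")
    if lag is None or not lag.isdigit():
        problems.append("replication lag unknown")
    elif int(lag) > max_lag_seconds:
        problems.append(f"slave is {lag}s behind master")
    return problems
```

An empty list corresponds to the healthy state shown in the slide: both threads report Yes and Seconds_Behind_Master is 0.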
Row vs Statement
Statement-based
●   Pro
•Proven (around since MySQL 3.23)
•Smaller log files
•Auditing of actual SQL statements
•No primary key requirement for replicated tables
●   Con
•Non-deterministic functions and UDFs
Row-based
●   Pro
•All changes can be replicated
•Similar technology used by other RDBMSes
•Fewer locks required for some INSERT, UPDATE or DELETE statements
●   Con
•More data to be logged
•Log file size increases (backup/restore implications)
•Replicated tables require explicit primary keys
•Possible different result sets on bulk INSERTs
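The non-determinism problem can be shown with a toy model. It assumes a statement such as UPDATE t SET token = UUID(), modelled here with Python's uuid module; none of this is MySQL code.

```python
import uuid


def run_statement(primary_keys: list) -> dict:
    """Model of 'UPDATE t SET token = UUID()': evaluated where it runs."""
    return {pk: str(uuid.uuid4()) for pk in primary_keys}


keys = [1, 2, 3]
master_rows = run_statement(keys)

# Statement-based: the slave re-executes the statement and gets new values.
slave_statement_based = run_statement(keys)

# Row-based: the slave receives the resulting rows and stores them as-is.
slave_row_based = dict(master_rows)
```

The row-based copy matches the master exactly, while the re-executed statement produces different tokens, at the price of shipping every changed row instead of one short statement.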
Multi Master Replication
●   Replicating the same table data both ways can
    lead to race conditions
•Auto_increment, unique keys, etc. could cause
problems if you write them twice
●   Both nodes are master
●   Both nodes are slave
●   Write on one, get updates on the other

                                    M|S       M|S
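A common mitigation for the auto_increment collisions, not spelled out on the slide, is to give each master its own auto_increment_offset with a shared auto_increment_increment. The sketch below only models the resulting number sequences; the helper name is illustrative.

```python
def auto_increment_values(offset: int, increment: int, count: int) -> list:
    """IDs a node would hand out with the given offset and increment."""
    return [offset + step * increment for step in range(count)]


# Two masters, increment 2: one takes the odd IDs, the other the even ones.
node_a = auto_increment_values(offset=1, increment=2, count=5)
node_b = auto_increment_values(offset=2, increment=2, count=5)
```

This avoids duplicate auto-generated keys, but it does nothing for other unique keys or for conflicting updates to the same row.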
MySQL Proxy
●   Man in the middle
●   Decides where to connect to
•LUA
●   Write rules to
•Redirect traffic
Master Slave & Proxy
●   Split Read and Write Actions
●   No Application change required
●   Sends specific queries to a specific node
●   Based on
•Customer
•User
•Table
•Availability
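The read/write split can be sketched as a small routing rule. MySQL Proxy itself is scripted in Lua; the Python below only illustrates the decision logic, and the class, keyword list, and host names are assumptions.

```python
import itertools

READ_KEYWORDS = {"SELECT", "SHOW"}


class Router:
    """Send reads to the slaves in turn, everything else to the master."""

    def __init__(self, master: str, slaves: list):
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def route(self, query: str) -> str:
        words = query.split()
        if words and words[0].upper() in READ_KEYWORDS:
            return next(self._slaves)
        return self.master
```

A real rule set would also have to keep reads inside a transaction on the master and take slaves out of rotation when they lag or fail, which is where the Availability criterion comes in.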
MySQL Proxy
●   Your new SPOF
●   Make your Proxy HA too !
•Heartbeat OCF Resource
Breaking Replication
●   If the master and slave get out of sync
●   Updates on slave with identical index id
•Check error log for disconnections and issues
with replication
Monitor your Setup
●   Not just connectivity
●   Also functional
•Query data
•Check resultset is correct
●   Check replication
•MaatKit
•OpenARK
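A functional check goes one step further than a connection test: it runs a query and compares the result. In this sketch `run_query` is a hypothetical callable supplied by the caller (for example a thin wrapper around a database driver), and the probe table name is made up.

```python
def functional_check(run_query, expected_rows: list) -> tuple:
    """Return (healthy, reason) after querying known data."""
    try:
        rows = run_query("SELECT id, name FROM monitoring_probe ORDER BY id")
    except Exception as error:  # any failure to answer counts as down
        return False, f"query failed: {error}"
    if rows != expected_rows:
        return False, "unexpected result set"
    return True, "ok"
```

A server that accepts connections but returns wrong or no data fails this check, which a plain port or ping test would miss.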
Pulling Traffic
●   Eg. for Cluster, MultiMaster setups
•DNS
•Advanced Routing
•LVS


•Flipper / MMM
MMM
●   Multi-Master Replication Manager
    for MySQL

•Perl scripts to perform
monitoring/failover and
management of MySQL
master-master replication configurations
●   Balance master / slave configs
    based on replication state

•Map Virtual IP to the Best Node
●   http://mysql-mmm.org/
Flipper
●   Flipper is a Perl tool for
    managing read and write
    access pairs of MySQL servers
●   master-master MySQL Servers
●   Client machines do not
    connect "directly" to either
    node; instead they use
●   One IP for read,
●   One IP for write.
●   Flipper allows you to move
    these IP addresses between
    the nodes in a safe and
    controlled manner.
●   http://provenscaling.com/software/flipper/
Linux-HA PaceMaker
●   Plays well with others
●   Manages more than MySQL
●   ...v3 .. don't even think about the rest anymore
●   http://clusterlabs.org/
Heartbeat
●   Heartbeat v1
•Max 2 nodes
•No finegrained resources
•Monitoring using “mon”
●   Heartbeat v2
•XML usage was a consulting opportunity
•Stability issues
•Forking ?
Pacemaker Architecture
●   stonithd: The Heartbeat fencing subsystem.
●   lrmd: Local Resource Management Daemon.
    Interacts directly with resource agents (scripts).
●   pengine: Policy Engine. Computes the next state of the
    cluster based on the current state and the configuration.
●   cib: Cluster Information Base. Contains definitions of all
    cluster options, nodes, resources, their relationships to
    one another and current status. Synchronizes updates to
    all cluster nodes.
●   crmd: Cluster Resource Management Daemon. Largely
    a message broker for the PEngine and LRM, it also
    elects a leader to co-ordinate the activities of the cluster.
●   openais: messaging and membership layer.
●   heartbeat: messaging layer, an alternative to OpenAIS.
●   ccm: Short for Consensus Cluster Membership. The
    Heartbeat membership layer.
Pacemaker ?
●   Not a fork
●   Only CRM Code taken out of Heartbeat
●   As of Heartbeat 2.1.3
•Support for both OpenAIS / HeartBeat
•Different release cycle than Heartbeat
Heartbeat, OpenAis ?
●   Both Messaging Layers
●   Initially only Heartbeat
●   OpenAIS
●   Heartbeat became unmaintained
●   OpenAIS has heisenbugs :(
●   Heartbeat maintenance taken over by LinBit
●   CRM Detects which layer
Pacemaker
(Stack diagram, top to bottom: Pacemaker, Heartbeat or OpenAIS, Cluster Glue)
Configuring Heartbeat
●   /etc/ha.d/ha.cf
Use crm = yes


●   /etc/ha.d/authkeys
Configuring Heartbeat
heartbeat::hacf {"clustername":
    hosts   => ["host-a","host-b"],
    hb_nic  => ["bond0"],
    hostip1 => ["10.0.128.11"],
    hostip2 => ["10.0.128.12"],
    ping    => ["10.0.128.4"],
}

heartbeat::authkeys {"ClusterName":
    password => "ClusterName",
}

http://github.com/jtimberman/puppet/tree/master/heartbeat/
Heartbeat Resources
●   LSB
●   Heartbeat resource (+status)
●   OCF (Open Cluster FrameWork) (+monitor)
●   Clones (don't use in HAv2)
●   Multi State Resources
A MySQL Resource
●   OCF
•Clone
•Where do you hook up the IP ?
•Multi State
•But we have Master Master replication
•Meta Resource
•Dummy resource that can monitor
•Connection
•Replication state
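The "dummy resource that can monitor" idea boils down to a monitor action that maps connection and replication state to OCF exit codes. The sketch below uses the standard OCF code values; the function and its inputs are hypothetical and would be fed by real checks in an actual resource agent.

```python
OCF_SUCCESS = 0       # resource is running and healthy
OCF_ERR_GENERIC = 1   # resource is running but something is wrong
OCF_NOT_RUNNING = 7   # resource is cleanly stopped or unreachable


def monitor(can_connect: bool, io_running: bool, sql_running: bool) -> int:
    """Decide the monitor result from connection and replication state."""
    if not can_connect:
        return OCF_NOT_RUNNING
    if not (io_running and sql_running):
        return OCF_ERR_GENERIC
    return OCF_SUCCESS
```

Returning a failure when replication is broken lets the cluster move the service IP away from a node whose data is going stale, even though mysqld itself still answers.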
CRM
●   Cluster Resource Manager
●   Keeps Nodes in Sync
●   XML Based
●   cibadmin
●   Cli manageable
●   Crm

configure
property $id="cib-bootstrap-options" \
    stonith-enabled="FALSE" \
    no-quorum-policy=ignore \
    start-failure-is-fatal="FALSE"
rsc_defaults $id="rsc_defaults-options" \
    migration-threshold="1" \
    failure-timeout="1"
primitive d_mysql ocf:local:mysql \
    op monitor interval="30s" \
    params test_user="sure" test_passwd="illtell" test_table="test.table"
primitive ip_db ocf:heartbeat:IPaddr2 \
    params ip="172.17.4.202" nic="bond0" \
    op monitor interval="10s"
group svc_db d_mysql ip_db
commit
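The two rsc_defaults values in the configuration control when a resource is moved after failures. The sketch below is a simplified model of that behaviour, written from the documented meaning of migration-threshold and failure-timeout rather than from Pacemaker's code; the function names are illustrative.

```python
def effective_failcount(failcount: int,
                        seconds_since_last_failure: float,
                        failure_timeout: float) -> int:
    """Failures are forgotten once failure-timeout seconds have passed."""
    if failure_timeout > 0 and seconds_since_last_failure >= failure_timeout:
        return 0
    return failcount


def must_move(failcount: int, migration_threshold: int) -> bool:
    """The resource leaves the node once failures reach the threshold."""
    return failcount >= migration_threshold
```

With migration-threshold="1" a single failed monitor moves the svc_db group to the other node, and the short failure-timeout means the failed node quickly becomes eligible again.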
Adding MySQL to the stack
(Diagram: Node A and Node B, layered from the bottom up)
●   Hardware: Node A, Node B
●   Cluster Stack: HeartBeat, then Pacemaker
●   Resource MySQL: a “MySQLd” on each node, linked by Replication
●   Service IP MySQL
Pitfalls & Solutions
●   Monitor,
•Replication state
•Replication Lag


●   MaatKit
●   OpenARK
Conclusion
●   Plenty of Alternatives
●   Think about your Data
●   Think about getting Queries to that Data
●   Complexity is the enemy of reliability
●   Keep it Simple
●   Monitor inside the DB
Contact
Kris Buytaert
Kris.Buytaert@inuits.be

Further Reading
@krisbuytaert
http://www.krisbuytaert.be/blog/
http://www.inuits.be/
•Or the upcoming slides


                             Inuits
                             't Hemeltje
                             Duboistraat 50
                             2060 Antwerpen
                             Belgium
                             891.514.231

                             +32 475 961221
