MySQL HA: Using different solutions. Robert Krzykawski, DB Team Coordinator, bwin games. Anders Karlsson, Principal Sales Engineer, MySQL.
Agenda: Who are we? HA Basics – Anders. How we did it; success or failure – Robert. Summary. Questions?
Anders Karlsson: Sales Engineer with Sun / MySQL for 5+ years. I have been in the RDBMS business for 20+ years. I have worked for many of the major vendors and with most of the vendor products. I’ve been in roles as Sales Engineer, Consultant, Porting engineer, Support engineer, etc. Outside MySQL I build websites (www.papablues.com), develop Open Source software (MyQuery, ndbtop etc.), am a keen photographer and drive sub-standard cars, among other things. Also: www.makezfsgpl.com! Right now!
Robert Krzykawski: DB Team Coordinator @ bwin Games AB. I have worked with MySQL in every way, as system admin, DBA and DBD, and am now taking a more system architectural role. I have been involved in building both small and big web based solutions using MySQL since 1998. My roles throughout my professional life have varied: System administrator, Technical Sales support, DBA, DBD, Programmer, Application architect and System architect. Off work I try to automate things with scripts and programs to offload myself when “on work”. I also try to find time to snowboard and play some paintball, and a recently introduced hobby is our Maine Coon kittens.
Why do you need HA? Something can break, and it usually will, eventually. You will need to maintain your database eventually, without shutting the whole system down. Adding HA to an existing running system is difficult, much more so than providing HA from the start. You want a good night’s sleep! You want failover to be automatic!
HA Concepts. Fault tolerant architectures: These are hardware architectures with supporting software that protect against even individual component failures. Single Point of Failure (SPOF): In any fault tolerant setup you want to avoid a SPOF, as a chain is no stronger than its weakest link. Fail over and fail back: Fail over is the process of switching from a failed component to another component, which may be dormant or already active. Fail back is the process of switching back from the backup component to the original one.
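The fail over and fail back definitions above can be pictured as a tiny state model. This is only an illustrative sketch: the `ServicePair` class and the node names `db1` and `db2` are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class ServicePair:
    """Two nodes that can host one service; only one is active at a time."""
    original: str
    backup: str
    active: str = field(init=False)

    def __post_init__(self) -> None:
        # The service starts out on the original node.
        self.active = self.original

    def fail_over(self) -> str:
        """Switch the service from the failed original node to the backup."""
        self.active = self.backup
        return self.active

    def fail_back(self) -> str:
        """Return the service to the original node once it is healthy again."""
        self.active = self.original
        return self.active


pair = ServicePair(original="db1", backup="db2")  # hypothetical node names
pair.fail_over()   # db1 has failed, db2 takes over
pair.fail_back()   # db1 is repaired, the service moves back
```

Fail back is a second planned switch, which is one reason it deserves as much testing as the fail over itself.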
Some HA Components. Heartbeat: Heartbeat is an HA component that checks that the services that can be failed over are alive. Heartbeat can check individual servers, software services, networking etc. HA Monitor: The HA Monitor has different names in different frameworks. This is the component that allows configuration of the services, ensures proper shutdown and startup, and allows manual control. Replication: Replication is a common component that ensures that the data content of managed, data rich components is in sync.
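As a rough sketch of the heartbeat idea, the monitor below treats a node as dead once it has been silent for longer than a configured dead time. The class name, the `deadtime` parameter and the explicit `now` argument are assumptions made for illustration; this is not the Linux HA implementation.

```python
class HeartbeatMonitor:
    """Declares a node dead after it has been silent for `deadtime` seconds."""

    def __init__(self, deadtime: float) -> None:
        self.deadtime = deadtime
        self.last_seen = {}  # node name -> time of the last heartbeat received

    def beat(self, node: str, now: float) -> None:
        """Record a heartbeat received from `node` at time `now`."""
        self.last_seen[node] = now

    def is_alive(self, node: str, now: float) -> bool:
        """A node is alive if it has been heard from within the dead time."""
        last = self.last_seen.get(node)
        return last is not None and (now - last) <= self.deadtime


monitor = HeartbeatMonitor(deadtime=10.0)
monitor.beat("db1", now=0.0)
print(monitor.is_alive("db1", now=5.0))    # still within the dead time
print(monitor.is_alive("db1", now=30.0))   # silent for too long
```

Passing the time in explicitly keeps the sketch testable; a real monitor would read a clock and receive beats over the network.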
What should I require? Don’t aim too high; aim for what is reasonable for your needs. Aim to ensure that no important data is lost. What is “important data”? You decide! Different data means different “needs”! Aim to ensure that the solution can be automated. You will want this eventually anyway. Aim to ensure a solution that can easily be tested and administered. Aim to ensure that the solution is performant and scalable.
HA with MySQL – In short. MySQL Replication: Easy to use and set up. Low performance impact. Asynchronous only. Failback can be difficult. Needs additional components. MySQL with DRBD / ZFS / AVS: Easy to use. Low cost, software only. Synchronous. Good HA software integration. Certain performance impact. Limited data size and transaction rates.
HA with MySQL – In short. MySQL with Shared storage: Good performance. Eases hardware management. Good integration with HA software. Costly. The SAN itself is a SPOF. MySQL Cluster: Very good performance. Self contained. Very short fail-over times. Software only solution. Needs several physical servers. Not optimized for all MySQL applications.
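The difference between "asynchronous only" replication and a synchronous approach matters most at the moment of failure. The toy model below is a sketch under simplifying assumptions (the `Master` class, its queue and the transaction numbers are all hypothetical); it only shows which acknowledged transactions a standby would be missing if the master disappeared.

```python
from collections import deque


class Master:
    """Toy master that acknowledges transactions and feeds one standby."""

    def __init__(self, synchronous: bool) -> None:
        self.synchronous = synchronous
        self.log = []            # transactions acknowledged to clients
        self.pending = deque()   # acknowledged but not yet shipped (async only)
        self.standby = []        # what the standby has stored so far

    def commit(self, txn: int) -> None:
        if self.synchronous:
            self.standby.append(txn)   # standby has the data before the ack
        else:
            self.pending.append(txn)   # ack first, ship to the standby later
        self.log.append(txn)

    def ship(self, count: int) -> None:
        """Deliver up to `count` queued transactions to the standby."""
        for _ in range(min(count, len(self.pending))):
            self.standby.append(self.pending.popleft())


def lost_on_failover(master: Master) -> list:
    """Acknowledged transactions the standby never received."""
    return [txn for txn in master.log if txn not in master.standby]


async_master = Master(synchronous=False)
sync_master = Master(synchronous=True)
for txn in range(1, 6):
    async_master.commit(txn)
    sync_master.commit(txn)
async_master.ship(3)  # the standby lags two transactions behind

print(lost_on_failover(async_master))  # transactions 4 and 5 are missing
print(lost_on_failover(sync_master))   # nothing is missing
```

The synchronous variant pays for this guarantee on every commit, which matches the "certain performance impact" noted for DRBD / ZFS / AVS.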
bwin games AB
Our goal at bwin: We were faced with a requirement: establish a highly available database platform. We had some rules to follow from management. Interruptions due to hardware failure should not require hands-on work. Downtime should be minimized during interruptions. Performance of the DB platform should not decrease when operating as usual. Performance can decrease if a failure has occurred, but should not render the service unusable. Implementation should be done by the operations department. Developers should not be involved.
What solutions did we consider? Master/Master. Linux HA. HP Service Guard. Sun Cluster. A combination of the above. MySQL Cluster. We will walk through all of the above.
Master/Master: Master/Master with two active nodes would give us a seamless switch if we have a good load balancer. It will give us the ability to do schema changes “online”. It gives not only higher availability when both nodes are up, but also better performance. It can eliminate the use of production slaves. One entry point for the application when using a load balancer (“LB”).
Linux HA/ServiceGuard/SunCluster: A service IP switch will cause a glitch in service. Since we are running 4.0 we can’t really do a master/master setup with service IP switching. Slave integrity is important and we are running 4.0; there is one master copy of the data. We can’t switch to a slave and hope that everything was replicated. We are using a SAN, so shared storage is possible. One instance, two machines: one active, one standby. InnoDB log size will be a problem. A timeout during recovery can cause problems during the switch.
MySQL Cluster: High availability is built in if implemented correctly. Requires more hardware. A more complex solution. Requires the application to support NDB. Not the full feature set.
Obstacles: We are using MySQL 4.0 in our biggest database. A Master/Master scenario on 4.0 requires a higher level of application awareness. LinuxHA/ServiceGuard/Sun Cluster will cause a small glitch when we move resources. MySQL Cluster will require even more application changes in our case.
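One example of what "application awareness" in a Master/Master setup could involve is key generation: if both masters accept inserts, they must never hand out the same primary key. The slides do not say which changes bwin would have needed, so the sketch below is an assumption. It shows the common interleaving scheme done on the application side (later MySQL versions offer the `auto_increment_increment` and `auto_increment_offset` server variables for the same purpose); the `id_allocator` helper is hypothetical.

```python
from itertools import count


def id_allocator(node_index: int, node_count: int):
    """Yield node_index, node_index + node_count, ... (node_index is 1-based).

    Each master gets its own residue class, so two masters never
    generate the same key even when both accept inserts.
    """
    if not 1 <= node_index <= node_count:
        raise ValueError("node_index must be between 1 and node_count")
    return count(start=node_index, step=node_count)


ids_a = id_allocator(node_index=1, node_count=2)  # keys for master A
ids_b = id_allocator(node_index=2, node_count=2)  # keys for master B

first_a = [next(ids_a) for _ in range(3)]
first_b = [next(ids_b) for _ in range(3)]
print(first_a, first_b)
```

Key collisions are only one of several conflicts two active masters can produce; conflicting updates to the same row need separate handling.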
Our Choice: LinuxHA, because it is GPL/LGPL: free and not owned by an organization. It was the fastest way to implement and did not require any support from the development department. All other ways required changes in the application.
Layout: Two versions
We do: Use Linux HA 2.0, needed for setup of the “cluster”. Use a SAN. Shared storage is easier and faster, but expensive. DRBD can be used, but it saves the same data twice and also comes with a performance decrease. Heartbeat on two bonds: primary on the database interconnect network, secondary on the database service network. We have LUNs presented to multiple hosts. Services have rules to be run on specific hosts only. We fence using RiLOE. We have plans to fence on port level in the FC switches.
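With LUNs presented to multiple hosts, the order of operations during takeover matters: the standby should start the database only after the failed peer is confirmed to be cut off from the shared storage. The sketch below illustrates that ordering only. `take_over` and the two callables are hypothetical stand-ins; they do not represent the RiLOE interface or the Linux HA API.

```python
from typing import Callable


def take_over(fence_peer: Callable[[], bool],
              start_service: Callable[[], None]) -> bool:
    """Start the service locally only after the peer is confirmed fenced."""
    if not fence_peer():
        # Peer state unknown: starting now could put two writers on one LUN.
        return False
    start_service()
    return True


events = []


def fake_fence_ok() -> bool:
    """Stand-in for a successful power-off through a management board."""
    events.append("peer powered off")
    return True


def fake_start() -> None:
    """Stand-in for mounting the shared storage and starting the database."""
    events.append("database started")


took_over = take_over(fake_fence_ok, fake_start)
print(took_over, events)
```

Refusing to start when fencing fails trades availability for data safety, which is usually the right trade on shared storage.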
What’s good and what’s bad: Easy and fast implementation. Our config neither increases nor decreases performance. InnoDB log size causes long recovery times. Testing to decrease it has caused performance penalties. Our solution is not foolproof because of the long recovery times: it causes an interruption of service. We can say it’s HA, but a true HA solution would give us 100% uptime. The 2nd setup is complicated. We should aim for simple, more common setups.
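How far a setup with long recovery times is from 100% uptime can be estimated with simple arithmetic. The numbers below are purely hypothetical (the slides give no failure counts or recovery durations), and the function counts only downtime caused by failovers.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def availability(failovers_per_year: int, minutes_per_failover: float) -> float:
    """Fraction of the year the service is up, counting only failover downtime."""
    downtime = failovers_per_year * minutes_per_failover
    return 1.0 - downtime / MINUTES_PER_YEAR


# Hypothetical example: 4 failovers a year, 15 minutes of recovery each.
print(f"{availability(4, 15) * 100:.3f}%")
```

Shortening the recovery time per failover improves this figure directly, which is why recovery tuning is worth the effort.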
What can we do better? Fine tune the config for faster recovery/startup. Add better fencing. Monitor failover in case recovery takes long. Master/Master or multi master: if the application can reconnect, or if we have a smart load balancer, we have no outages. Upgrades or schema changes can be made “online”. No separation between writes and reads. Less complicated for developers. One entry point.
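"If the application can reconnect" can be as simple as trying each database endpoint in turn until one accepts the connection. The following is a minimal sketch of that idea: `connect_with_failover`, the endpoint strings and `fake_connect` are hypothetical, and a real application would pass in its own driver's connect function.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def connect_with_failover(endpoints: Sequence[str],
                          connect: Callable[[str], T],
                          rounds: int = 3) -> T:
    """Try each endpoint in order, for up to `rounds` passes over the list."""
    last_error = None
    for _ in range(rounds):
        for endpoint in endpoints:
            try:
                return connect(endpoint)
            except ConnectionError as exc:
                last_error = exc  # remember the failure, try the next endpoint
    raise ConnectionError(
        f"no endpoint reachable after {rounds} rounds"
    ) from last_error


def fake_connect(endpoint: str) -> str:
    """Stand-in for a real driver call; pretends the first master is down."""
    if endpoint == "db1:3306":
        raise ConnectionError("db1 is down")
    return f"connected to {endpoint}"


session = connect_with_failover(["db1:3306", "db2:3306"], fake_connect)
print(session)
```

A smart load balancer moves the same logic out of the application, which fits the rule that developers should not be involved.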
Summary: Concepts. Components. Requirements. Technologies. Your goal. Considerations. Obstacles. How we did it @ bwin games AB. HA recommendations.
Questions: “The question is not, ‘What is the answer?’ The question is, ‘What is the question?’” (Henri Poincaré)
Thank you for your time! And thank you for listening so kindly. We can be found on: Robert Krzykawski – http://krzykawski.com. Anders Karlsson – http://papablues.com and http://karlssonondatabases.blogspot.com/
