© 2017 Pythian. Confidential 1
© The Pythian Group Inc., 2018
Deploying MariaDB for High Availability on Google Cloud Platform
Matthias Crauwels
February 26, 2019 - New York City, NY, USA
Who am I?
Matthias Crauwels
● Living in Ghent, Belgium
● Bachelor Computer Science
● ~20 years Linux user / admin
● ~10 years PHP developer
● ~8 years MySQL DBA
● 3rd year at Pythian
● Currently Lead Database Consultant
● GCP Certified Professional Architect
● AWS Solutions Architect Associate
● Father of Leander
PYTHIAN
A global IT company that helps businesses leverage disruptive technologies to better compete.
Our services and software solutions unleash the power of cloud, data and analytics to drive better
business outcomes for our clients.
Our 20 years in data, commitment to hiring the best talent, and our deep technical and business expertise
allow us to meet our promise of using technology to deliver the best outcomes faster.
AI / ML / BLOCKCHAIN
Intelligent analytics
and decision making
Software autonomy
Disruptive data technologies
CLOUD MIGRATION
& OPERATIONS
Plan, Migrate, Manage,
Optimize, Innovate
Multi-cloud, Hybrid-Cloud,
Cloud Native
ANALYTIC DATA SYSTEMS
Kick AaaS cloud-native, pre-packaged
analytics platform
Custom analytics platform design, implementation
and support services–for on-premises and cloud
Data science consulting and implementation services
OPERATIONAL DATA SYSTEMS
Database services–architecture
to ongoing management
On prem and in the cloud
Oracle, MS SQL, MySQL, Cassandra, MongoDB,
Hadoop, AWS/Azure/Google DBaaS
AGENDA
● Google Cloud Platform (GCP)
● 2 possible implementations
● More complex theoretical example
Let's get started!
Google Cloud Platform
● Public cloud by Google
● Started as "AppEngine" (GA since Nov 2011)
● New services were added quickly: Compute Engine, CloudSQL, ...
Service comparison to AWS
Google Cloud Platform (GCP)       Amazon Web Services (AWS)
Google Compute Engine (GCE)       Elastic Compute Cloud (EC2)
Google Cloud Storage (GCS)        Simple Storage Service (S3)
Google Kubernetes Engine (GKE)    Elastic Container Service for Kubernetes
Google CloudSQL                   Relational Database Service (RDS)
Google BigQuery                   Redshift
...                               ...
Google Cloud Storage options
Source: Coursera
2 relational database stores?
Google CloudSQL
● Managed service for MySQL and PostgreSQL
● Scales up to 10 TB storage
● Regional availability
● Fully managed service
● Read replicas in multiple zones
Google Spanner
● Relational database store
● Horizontally scalable
● Heavily sharded
● Global availability
● Highly Available
● Fully managed service
Why do something else?
● Stability
  CloudSQL is still a fairly young product; it is not really mature (yet).
● Flexibility
  There are still limits to the SQL you can run, so it is not a simple lift-and-shift.
● Cost
  Spanner is more expensive than Compute Engine.
Solution 1
MariaDB Cluster
● MariaDB + Galera cluster
● Minimum 3 nodes (always an odd number of nodes)
● Preferably within the same region (latency)
● Writing to any node is supported, but not recommended
● Suggested to use a proxy layer
  ● MaxScale (for MariaDB subscribers)
  ● ProxySQL
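The "always an odd number of nodes" advice follows from majority-based quorum. The sketch below is only an illustration of that arithmetic, assuming every node carries equal weight; the function names are made up for this example and are not part of Galera.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that still forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority remains."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    # Compare cluster sizes: an even-sized cluster tolerates no more
    # failures than the odd-sized cluster that is one node smaller.
    for n in range(2, 6):
        print(f"{n} nodes: quorum={quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Under this assumption a fourth node adds cost without adding failure tolerance over three nodes, which is why the node count is normally kept odd.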
MariaDB Cluster - schematics
MariaDB Cluster - migration path
On premises → GCP
1. Take a backup on premises
2. Use the backup to seed the first node in GCE
3. Bootstrap the first node
4. Start nodes 2 and 3 → they will perform an SST (State Snapshot Transfer)
5. Set up async replication from on prem to the first node
6. Set up proxies and load balancers
7. Cut over the application to the new load balancer
MariaDB Cluster - conclusion
Pro
● Lift and shift from on-prem to the cloud
● You can put nodes in different zones within the same region
● Proxy will perform read-write split
● All the benefits from using Galera
Con
● Will not keep scaling
● Will not scale to multi-region
● All the perks from using Galera
Solution 2
Regular (asynchronous) replication
● Default replication method for years in the MySQL world
● Very stable and reliable
● Minimum 2 nodes (1 master + 1 replica/slave)
● Can be multi-region
● What about master high availability?
Basic database architecture
Problems
● Master is a single point of failure (SPOF)
  ● How to detect master failure?
  ● What to do when failure is detected?
  ● How to announce changes in topology to the application?
  ● ...
● Slaves scale out pretty well but
  ● There is no global load balancer for port 3306
  ● ...
Master high availability using Orchestrator
Orchestrator is a High Availability and replication management tool.
● Works with ALL flavours of MySQL / MariaDB
● Can be used for multiple purposes
  ● discovery
  ● visualisation
  ● refactoring
  ● recovery
Orchestrator - discovery
Orchestrator can (and will) discover your entire replication topology as
soon as you connect it to a single server in the topology.
It will use SHOW SLAVE HOSTS, SHOW PROCESSLIST and SHOW
SLAVE STATUS to try and connect to the other servers in the topology.
Requirement: the orchestrator_topology_user needs to be created
on every server in the cluster so it can connect.
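The crawl described above can be pictured as a graph traversal that starts from one seed server. The following is a minimal sketch, not Orchestrator's actual code: the two dictionaries are hypothetical stand-ins for what each server would report about its replicas and its master, and the host names are invented.

```python
from collections import deque

# Hypothetical data: the replicas each server would report
# (for example via SHOW SLAVE HOSTS). Illustrative only.
REPLICAS = {
    "db1": ["db2", "db3"],
    "db2": ["db4"],
    "db3": [],
    "db4": [],
}
# Hypothetical data: the master each replica would report
# (for example via SHOW SLAVE STATUS).
MASTER_OF = {"db2": "db1", "db3": "db1", "db4": "db2"}


def discover(seed: str) -> list:
    """Breadth-first crawl from one seed, following replicas and masters."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        host = queue.popleft()
        neighbours = list(REPLICAS.get(host, []))
        if host in MASTER_OF:
            neighbours.append(MASTER_OF[host])
        for other in neighbours:
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return sorted(seen)


if __name__ == "__main__":
    # Starting from a leaf replica still reaches the whole topology.
    print(discover("db4"))
```

Because the crawl follows links in both directions, any single server works as a seed, which matches the "connect it to a single server" behaviour described on the slide.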
Orchestrator - visualisation
Orchestrator comes with a web interface that visualizes the servers in the
topology.
Orchestrator - refactoring
Orchestrator can be used to refactor the topology.
This can be done from the command line tool, via the API or even via the
web interface by dragging and dropping.
You can do things like
● Repoint a slave to a new master
● Promote a server to a (co-)master
● Start / Stop slave
● ...
Orchestrator - recovery
All of these features are nice, but they still require a human to execute
them. This doesn’t help you much when your master goes down at 3AM
and you get paged to resolve this.
Orchestrator can be configured to automatically recover your topology
from an outage.
Orchestrator - how does recovery work?
To be able to perform a recovery, Orchestrator first needs to detect a
failure.
As indicated before, Orchestrator connects to every server in the topology
and gathers information from each of the instances.
Orchestrator uses this information to make decisions on the best action to
take. They call this the holistic approach.
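As a rough illustration of the holistic idea, the sketch below only declares the master dead when the monitor cannot reach it and every reachable replica also reports broken replication. The class, field names and result labels are hypothetical simplifications chosen for this example, not Orchestrator's own data model or analysis codes.

```python
from dataclasses import dataclass


@dataclass
class ReplicaView:
    """What the monitor learned from one replica (hypothetical shape)."""
    reachable: bool      # could the monitor connect to this replica?
    io_thread_ok: bool   # does the replica still receive events from the master?


def diagnose(master_reachable: bool, replicas: list) -> str:
    """Combine the monitor's view with the replicas' views of the master."""
    if master_reachable:
        return "ok"
    reachable = [r for r in replicas if r.reachable]
    if not reachable:
        # Nobody can confirm anything; this may be the monitor's own network.
        return "inconclusive"
    if all(not r.io_thread_ok for r in reachable):
        # Monitor and every reachable replica agree the master is gone.
        return "dead_master"
    # Some replica still replicates, so the master is probably alive.
    return "unreachable_master"


if __name__ == "__main__":
    views = [ReplicaView(True, False), ReplicaView(True, False)]
    print(diagnose(False, views))
```

The point of the design is that a single failed health check from the monitor is not enough to trigger a failover; the replicas act as additional witnesses.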
Orchestrator - failure detection example
Orchestrator HA
Orchestrator was written with High Availability as a basic concept.
You can easily run multiple Orchestrator instances with a shared MySQL
backend. All instances will collect all information but they will allow only
one instance to be the “active node” and to make changes to the
topology.
To eliminate a single-point-of-failure in the database backend you can
use either master-master replication (2 nodes) or Galera synchronous
replication (3 nodes).
Orchestrator HA (2)
Since version 3.x of Orchestrator there is “Orchestrator-on-Raft”.
Orchestrator now implements the Raft consensus protocol. This will
● Ensure that a leader node is elected from the available nodes
● Ensure that the leader node has a quorum (majority) at all times
● Allow Orchestrator to run without a shared database backend
● Allow Orchestrator to run with a SQLite backend instead of a MySQL backend
Orchestrator in GCP
● Managed Instance Group requiring 3 nodes
● Orchestrator nodes using
  ● Raft for leader elections
  ● SQLite backend for local state
● Orchestrator database is auto-healing so no data replication is required
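A per-node configuration for such a three-node Raft group with a local SQLite backend could be generated roughly as follows. This is a sketch under assumptions: the IP addresses and file paths are invented, and the key names follow Orchestrator's documented Raft and SQLite settings as best understood here, so they should be checked against the documentation for the version in use.

```python
import json


def raft_config(this_node: str, all_nodes: list) -> dict:
    """Build the Raft/SQLite part of one node's Orchestrator configuration."""
    return {
        # Local state lives in SQLite, so no shared MySQL backend is needed.
        "BackendDB": "sqlite",
        "SQLite3DataFile": "/var/lib/orchestrator/orchestrator.db",  # assumed path
        # Raft settings: every node lists the same peers and binds to itself.
        "RaftEnabled": True,
        "RaftDataDir": "/var/lib/orchestrator",  # assumed path
        "RaftBind": this_node,
        "DefaultRaftPort": 10008,
        "RaftNodes": list(all_nodes),
    }


if __name__ == "__main__":
    nodes = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]  # hypothetical addresses
    print(json.dumps(raft_config(nodes[0], nodes), indent=2))
```

Only the bind address differs between nodes, which suits a Managed Instance Group where each instance can fill in its own address at startup.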
Orchestrator in GCP
Putting it together
Read write splitting
● We introduce a proxy layer
  ● to help with read-write splitting
  ● to prevent application connections from breaking on failover
  ● to have an easy-to-manage endpoint
● Proxy options
  ● ProxySQL
  ● MaxScale
● In this example I will use ProxySQL
ProxySQL
● Layer 7 proxy, understands the MySQL protocol
● Uses "hostgroups" to group hosts together
  (example: one hostgroup for the master and one for the slaves)
● Query rules to redirect traffic to hostgroups
  example:
  by default all queries go to the master hostgroup
  add a query rule to send queries matching the regex ^SELECT to the slaves
● ProxySQL clustering
  ● Proxies will share configuration
● On GCP, start with 2 instances in a Managed Instance Group with
  autoscaling, add a load balancer for traffic
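The hostgroup and query-rule idea can be sketched as a small routing function. This is an illustration of the concept only, not ProxySQL's implementation: hostgroup ids 10 (master) and 11 (slaves) mirror the ids used in the consul-template example later in the deck, and the extra rule that keeps SELECT ... FOR UPDATE on the master is an assumption added here, not something the slide states.

```python
import re

MASTER_HOSTGROUP = 10  # writer hostgroup (default destination)
SLAVE_HOSTGROUP = 11   # reader hostgroup

# Ordered rules, first match wins. The FOR UPDATE rule is an assumed
# addition: locking reads need to run on the master.
QUERY_RULES = [
    (re.compile(r"^SELECT.*FOR UPDATE", re.IGNORECASE | re.DOTALL), MASTER_HOSTGROUP),
    (re.compile(r"^SELECT", re.IGNORECASE), SLAVE_HOSTGROUP),
]


def route(query: str) -> int:
    """Return the hostgroup a query would be sent to."""
    text = query.lstrip()
    for pattern, hostgroup in QUERY_RULES:
        if pattern.match(text):
            return hostgroup
    # No rule matched: fall back to the master hostgroup.
    return MASTER_HOSTGROUP


if __name__ == "__main__":
    for q in ("SELECT * FROM t", "UPDATE t SET a = 1",
              "SELECT * FROM t WHERE id = 1 FOR UPDATE"):
        print(route(q), q)
```

Defaulting to the master is the safe choice: a query that no rule recognises is still correct there, only less scalable.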
ProxySQL
Putting it together
Why ProxySQL over MaxScale
● Up until this slide ProxySQL and MaxScale are interchangeable
  ● Hostgroups and query rules can be replaced by MaxScale's
    read-write-split service.
● We picked ProxySQL because of what comes next:
  How do we connect Orchestrator and the proxy in a safe way?
Consul and Consul-template
● Consul
  ● Distributed key-value store
  ● Orchestrator has native support for Consul
  ● Works across DCs
● Consul-template
  ● Connects to a Consul service
  ● Updates a templated file every time Consul is updated
  ● Runs an arbitrary command on every update
Consul workflow
● Orchestrator detects failure
● Orchestrator takes action, promoting a new master
● Orchestrator updates the Consul KV store
● Consul distributes values and notifies watchers
● Consul-template receives a notification
● Consul-template recreates the query file and launches the configured command
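The workflow is essentially a watch-and-react loop. The sketch below simulates it in memory to show the order of events; KVStore and TemplateRunner are hypothetical stand-ins written for this example and do not use the Consul or consul-template APIs. The key name matches the one used in the deck's template example.

```python
class KVStore:
    """Tiny in-memory stand-in for a key-value store with watchers."""

    def __init__(self):
        self._data = {}
        self._watchers = []

    def watch(self, callback):
        self._watchers.append(callback)

    def put(self, key, value):
        self._data[key] = value
        self._notify()

    def snapshot(self):
        return dict(self._data)

    def _notify(self):
        # Every watcher receives the current state after each change.
        for callback in self._watchers:
            callback(self.snapshot())


class TemplateRunner:
    """Re-renders a file and runs a command on every change (simulated)."""

    def __init__(self, render, command):
        self.render = render
        self.command = command
        self.runs = []

    def on_change(self, data):
        rendered = self.render(data)
        self.runs.append(self.command(rendered))


def render_master(data):
    return f"master={data.get('mysql/master/testcluster/hostname', '')}"


def simulate_failover():
    store = KVStore()
    runner = TemplateRunner(render_master, command=lambda text: f"applied: {text}")
    store.watch(runner.on_change)
    store.put("mysql/master/testcluster/hostname", "db1")  # initial master
    store.put("mysql/master/testcluster/hostname", "db2")  # promotion after failover
    return runner.runs


if __name__ == "__main__":
    print(simulate_failover())
```

The proxy never talks to Orchestrator directly; it only sees the result of the command that runs after the key-value store changes, which keeps the two components decoupled.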
Orchestrator hooks
● Orchestrator has built-in support to update master entries in the Consul KV store
● But if we want the new master to be taken out of the reader pool, we need to
  implement some "hooks" to update Consul
● Example
# cat /usr/local/orchestrator/postfailoverhook.sh
#!/bin/bash
# Consul binary and the key prefix under which the slaves are registered
CONSUL_EXEC="/usr/local/bin/consul"
CONSUL_SLAVE_PREFIX="mysql/slave"
echo "Failover occurred. Running custom PostMasterFailoverProcesses Hook"
# orchestrator needs to make call to consul so the following is accomplished:
# 1. remove new master from slave group
# 2. put old master in slave group
# The ORC_* variables are provided by Orchestrator in the hook's environment
echo "Removing new master from slave pool: ${CONSUL_SLAVE_PREFIX}/${ORC_FAILURE_CLUSTER_ALIAS}/${ORC_SUCCESSOR_HOST}"
"${CONSUL_EXEC}" kv delete "${CONSUL_SLAVE_PREFIX}/${ORC_FAILURE_CLUSTER_ALIAS}/${ORC_SUCCESSOR_HOST}"
echo "Adding old master to slave pool: ${CONSUL_SLAVE_PREFIX}/${ORC_FAILURE_CLUSTER_ALIAS}/${ORC_FAILED_HOST}"
"${CONSUL_EXEC}" kv put "${CONSUL_SLAVE_PREFIX}/${ORC_FAILURE_CLUSTER_ALIAS}/${ORC_FAILED_HOST}"
Consul-template examples
● Example of "query" file
{{ if keyExists "mysql/master/testcluster/hostname" }}
DELETE FROM mysql_servers WHERE hostgroup_id=10;
REPLACE INTO mysql_servers (hostgroup_id, hostname) VALUES ( 10, "{{ key "mysql/master/testcluster/hostname" }}" );
{{ end }}
DELETE FROM mysql_servers WHERE hostgroup_id=11;
{{ range tree "mysql/slave/testcluster" }}
REPLACE INTO mysql_servers (hostgroup_id, hostname) VALUES ( 11, "{{ .Key }}{{ .Value }}" );
{{ end }}
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
● Example command
command = "/bin/bash -c 'mysql --defaults-file=/etc/proxysql-admin.my.cnf < /opt/consul-template/templates/proxysql.sql'"
command_timeout = "60s"
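To see what the template would produce for a given set of keys, the rendering can be imitated in plain Python. This is a sketch of the template's logic only, assuming the key layout shown in the slide (slave hostnames stored as key names under mysql/slave/testcluster with empty values); the function name and sample hosts are made up, and hostnames are interpolated without any escaping.

```python
def render_proxysql_sql(kv: dict, cluster: str = "testcluster") -> str:
    """Imitate the template: hostgroup 10 = master, hostgroup 11 = slaves."""
    lines = []
    master = kv.get(f"mysql/master/{cluster}/hostname")
    if master is not None:
        # Only touch the master hostgroup when the master key exists.
        lines.append("DELETE FROM mysql_servers WHERE hostgroup_id=10;")
        lines.append(
            f'REPLACE INTO mysql_servers (hostgroup_id, hostname) VALUES (10, "{master}");'
        )
    lines.append("DELETE FROM mysql_servers WHERE hostgroup_id=11;")
    prefix = f"mysql/slave/{cluster}/"
    for key in sorted(kv):
        if key.startswith(prefix):
            # Key relative to the prefix plus its value, as in the template.
            host = key[len(prefix):] + kv[key]
            lines.append(
                f'REPLACE INTO mysql_servers (hostgroup_id, hostname) VALUES (11, "{host}");'
            )
    lines.append("LOAD MYSQL SERVERS TO RUNTIME;")
    lines.append("SAVE MYSQL SERVERS TO DISK;")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = {
        "mysql/master/testcluster/hostname": "db2",  # hypothetical hosts
        "mysql/slave/testcluster/db1": "",
        "mysql/slave/testcluster/db3": "",
    }
    print(render_proxysql_sql(sample))
```

Note that the slave hostgroup is always rebuilt from scratch, while the master hostgroup is left alone if the master key is missing, so a temporarily absent key does not leave ProxySQL without a writer.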
Single region solution
Publicly shared example
● WePay
  https://wecode.wepay.com/posts/highly-available-mysql-clusters-at-wepay
  ● Implementation done by our sibling team
  ● They chose HAProxy over ProxySQL because they needed
    end-to-end SSL encryption
    ● ProxySQL 1.4.x does not support client-side SSL
    ● ProxySQL 2.x does support full end-to-end SSL encryption but was not
      GA yet at the time of implementation.
Where to go from here?
● We would want to create a multi-region setup
  ● This will work for read-heavy environments that can tolerate
    some "stale" reads.
  ● The master will still be a single instance that runs in one region.
  ● 1 or more slave(s) in each region
  ● ProxySQL cluster with an internal load balancer in each region;
    applications connect to the local proxy cluster
  ● Orchestrator and Consul servers can be scaled to 1 instance per
    region (recommended max 3 instances, so not every region might
    have an instance)
Theoretical Multi-region example
Questions?
Contact
Matthias Crauwels
crauwels@pythian.com
+1 (613) 565-8696 ext. 1215
Deploying MariaDB for HA on Google Cloud Platform

  • 1. © 2017 Pythian. Confidential 1
  • 2. 2© The Pythian Group Inc., 2018 February 26, 2019 - New York City, NY, USA Matthias Crauwels Deploying MariaDB for High Availability on Google Cloud Platform
  • 3. © The Pythian Group Inc., 2018 3 Who am I?
  • 4. © The Pythian Group Inc., 2018 44© The Pythian Group Inc., 2017 Matthias Crauwels ● Living in Ghent, Belgium ● Bachelor Computer Science ● ~20 years Linux user / admin ● ~10 years PHP developer ● ~8 years MySQL DBA ● 3rd year at Pythian ● Currently Lead Database Consultant ● GCP Certified Professional Architect ● AWS Solutions Architect Associate ● Father of Leander
  • 5. © The Pythian Group Inc., 2018 5© The Pythian Group Inc., 2019 5 PYTHIAN A global IT company that helps businesses leverage disruptive technologies to better compete. Our services and software solutions unleash the power of cloud, data and analytics to drive better business outcomes for our clients. Our 20 years in data, commitment to hiring the best talent, and our deep technical and business expertise allow us to meet our promise of using technology to deliver the best outcomes faster. © The Pythian Group Inc., 2019
  • 6. 6© The Pythian Group Inc., 2019 AI / ML / BLOCKCHAIN Intelligent analytics and decision making Software autonomy Disruptive data technologies CLOUD MIGRATION & OPERATIONS Plan, Migrate, Manage, Optimize, Innovate Multi-cloud, Hybrid-Cloud, Cloud Native ANALYTIC DATA SYSTEMS Kick AaaS cloud-native, pre-packaged analytics platform Custom analytics platform design, implementation and support services–for on-premises and cloud Data science consulting and implementation services OPERATIONAL DATA SYSTEMS Database services–architecture to ongoing management On prem and in the cloud Oracle, MS SQL, MySQL, Cassandra, MongoDB, Hadoop, AWS/Azure/Google DBaaS
  • 7. 7© The Pythian Group Inc., 2018 AGENDA 7© The Pythian Group Inc., 2019 ● Google Cloud Platform (GCP) ● 2 possible implementation ● More complex theoretical example
  • 8. 8© The Pythian Group Inc., 2018 Let's get started!
  • 9. 9© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 9 ● Public cloud by Google ● Started as "AppEngine" (GA since Nov 2011) ● New services grew quickly, Compute Engine, CloudSQL, ... Google Cloud Platform
  • 10. 10© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 10 Service comparison to AWS Google Cloud Platform (GCP) Amazon Web Services (AWS) Google Compute Engine (GCE) Elastic Compute Cloud (EC2) Google Cloud Storage (GCS) Simple Storage Service (S3) Google Kubernetes Engine (GKE) Elastic Container Service for Kubernetes Google CloudSQL Relational Database Service (RDS) Google BigQuery Redshift ... ...
  • 11. 11© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 11 Google Cloud Storage options Source: Coursera
  • 12. 12© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 12 2 relational database stores? Google CloudSQL Google Spanner
  • 13. 13© The Pythian Group Inc., 2018 13© The Pythian Group Inc., 2019 Google CloudSQL Google Spanner ● Managed service for MySQL and PostgreSQL ● Scales up to 10 TB storage ● Regional availability ● Fully managed service ● Read replicas in multiple zones ● Relational database store ● Horizontally scalable ● Heavily sharded ● Global availability ● Highly Available ● Fully managed service
  • 14. 14© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 14 ● Stability CloudSQL is still a pretty young product, it's not really mature (yet). ● Flexibility There are still limits to the SQL you can run, not a simple lift-and-shift ● Cost Spanner is more expensive than Compute Engine Why do something else?
  • 15. 15© The Pythian Group Inc., 2018 Solution 1
  • 16. 16© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 16 ● MariaDB + Galera cluster ● Minimum 3 nodes (always odd number of nodes) ● Preferably within the same region (latency) ● Writing to any node is supported, but not recommended ● Suggested to use a proxy-layer ● MaxScale (for MariaDB subscribers) ● ProxySQL MariaDB Cluster
  • 17. 17© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 17 MariaDB Cluster - schematics
  • 18. 18© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 18 On premise → GCP 1. Take a backup on premise 2. Use backup to seed first node in GCE 3. Bootstrap first node 4. Start node 2 and 3 → they will perform SST 5. Set up async replication from on prem to first node 6. Setup proxies and load-balancers 7. Cut over application to new load-balancer MariaDB Cluster - migration path
  • 19. 19© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 19 MariaDB Cluster - conclusion Pro Con Lift and shift from on-prem to the cloud Will not keep scaling You can put nodes in different zones within the same region Will not scale to multi-region Proxy will perform read-write split All the perks from using Galera All the benefits from using Galera
  • 20. 20© The Pythian Group Inc., 2018 Solution 2
  • 21. 21© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 21 ● Default replication method for years in MySQL world ● Very stable and reliable ● Minimum 2 nodes (1 master + 1 replica/slave) ● Can be multi-region ● What about master high availability? Regular (asynchronous) replication
  • 22. 22© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 22 Basic database architecture
  • 23. 23© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 23 ● Master is single-point-of-failure (spof) ● How to detect master failure? ● What to do when failure is detected? ● How to announce changes in topology to the application ● ... ● Slaves scale out pretty well but ● There is no global load balancer for port 3306 ● ... Problems
  • 24. 24© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 24 Orchestrator is a High Availability and replication management tool. ● Works with ALL flavours of MySQL / MariaDB ● Can be use for multiple purposes ● discovery ● visualisation ● refactoring ● recovery Master high availability using Orchestrator
  • 25. 25© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 25 Orchestrator - discovery Orchestrator can (and will) discover your entire replication technology as soon as you connect it to a single server in the topology. It will use SHOW SLAVE HOSTS, SHOW PROCESSLIST, SHOW SLAVE STATUS to try and connect to the other servers in the topology. Requirement: the orchestrator_topology_userneeds to be created on every server in the cluster so it can connect.
  • 26. 26© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 26 Orchestrator - visualisation Orchestrator comes with a web interface that visualizes the servers in the topology.
  • 27. 27© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 27 Orchestrator - refactoring Orchestrator can be used to refactor the topology. This can be done from the command line tool, via the API or even via the web interface by dragging and dropping. You can do things like ● Repoint a slave to a new master ● Promote a server to a (co-)master ● Start / Stop slave ● ...
  • 28. 28© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 28 Orchestrator - recovery All of these features are nice, but they still require a human to execute them. This doesn’t help you much when your master goes down at 3AM and you get paged to resolve this. Orchestrator can be configured to automatically recover your topology from an outage.
  • 29. 29© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 29 Orchestrator - how recovery works? To be able to perform a recovery, Orchestrator first needs to detect a failure. As indicated before Orchestrator connects to every server in the topology and gathers information from each of the instances. Orchestrator uses this information to make decisions on the best action to take. They call this the holistic approach.
  • 30. 30© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 30 Orchestrator - failure detection example
  • 31. 31© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 31 Orchestrator HA Orchestrator was written with High Availability as a basic concept. You can easily run multiple Orchestrator instances with a shared MySQL backend. All instances will collect all information but they will allow only one instance to be the “active node” and to make changes to the topology. To eliminate a single-point-of-failure in the database backend you can use either master-master replication (2 nodes) or Galera synchronous replication (3 nodes).
  • 32. Orchestrator HA (2)
    Since version 3.x of Orchestrator there is "Orchestrator-on-Raft": Orchestrator now implements the Raft consensus protocol. This will
    ● Ensure that a leader node is elected from the available nodes
    ● Ensure that the leader node has a quorum (majority) at all times
    ● Allow Orchestrator to run without a shared database backend
    ● Allow Orchestrator to use a SQLite backend instead of a MySQL backend
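The quorum (majority) requirement is simple arithmetic, sketched below. The function names are illustrative only and are not part of Orchestrator.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def has_quorum(total_nodes: int, healthy_nodes: int) -> bool:
    """True when the healthy nodes can still elect or keep a leader."""
    return healthy_nodes >= quorum_size(total_nodes)


def tolerated_failures(nodes: int) -> int:
    """How many nodes may fail while a majority remains."""
    return nodes - quorum_size(nodes)
```

With 3 nodes the quorum is 2, so one node may fail. A 4-node group also tolerates only one failure, which is why odd group sizes are the usual choice.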
  • 33. Orchestrator in GCP
    ● Managed Instance Group requiring 3 nodes
    ● Orchestrator nodes using
        ● Raft for leader elections
        ● SQLite backend for local state
    ● Orchestrator database is auto-healing so no data replication is required
  • 34. Orchestrator in GCP
  • 35. Putting it together
  • 36. Read write splitting
    ● We introduce a proxy layer
        ● to help with read write splitting
        ● to prevent application connections from breaking on failover
        ● to have an easy-to-manage endpoint
    ● Proxy options
        ● ProxySQL
        ● MaxScale
    ● In this example I will use ProxySQL
  • 37. ProxySQL
    ● Layer 7 proxy, understands the MySQL protocol
    ● Uses "hostgroups" to group hosts together (for example, one hostgroup for the master and one for the slaves)
    ● Query rules redirect traffic to hostgroups. Example:
        ● by default all queries go to the master hostgroup
        ● add a query rule so that queries matching the regex ^SELECT go to the slaves
    ● ProxySQL clustering
        ● Proxies share their configuration
    ● On GCP, start with 2 instances in a Managed Instance Group with autoscaling, and add a load balancer for traffic
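The routing idea behind hostgroups and query rules can be sketched as a tiny simulation. This is not ProxySQL code or configuration: the `route` function and rule list are hypothetical, the hostgroup ids 10 (master) and 11 (slaves) follow the ids used in the consul-template example in this deck, and case-insensitive matching is an assumption.

```python
import re
from typing import List, Pattern, Tuple

MASTER_HOSTGROUP = 10  # default destination for every query
SLAVE_HOSTGROUP = 11   # destination for read-only queries

# Ordered rules: the first matching pattern decides the hostgroup.
QUERY_RULES: List[Tuple[Pattern[str], int]] = [
    (re.compile(r"^SELECT", re.IGNORECASE), SLAVE_HOSTGROUP),
]


def route(query: str) -> int:
    """Return the hostgroup a query would be sent to in this sketch."""
    for pattern, hostgroup in QUERY_RULES:
        if pattern.search(query):
            return hostgroup
    return MASTER_HOSTGROUP
```

For example, `route("SELECT * FROM t")` gives 11, while `route("UPDATE t SET a = 1")` falls through to 10. Reads that must see the latest data, or that take locks, would need an additional rule placed ahead of the `^SELECT` one to keep them on the master.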
  • 38. ProxySQL
  • 39. Putting it together
  • 40. Why ProxySQL over MaxScale
    ● Up until this slide ProxySQL and MaxScale are interchangeable
    ● Hostgroups and query rules can be replaced by MaxScale's read-write-split service
    ● We picked ProxySQL because of what comes next: how do we connect Orchestrator and the proxy in a safe way?
  • 41. Consul and Consul-template
    ● Consul
        ● Distributed key-value store
        ● Orchestrator has native support for Consul
        ● Works across DCs
    ● Consul-template
        ● Connects to a Consul service
        ● Updates a templated file every time Consul is updated
        ● Runs an arbitrary command on every update
  • 42. Consul workflow
    ● Orchestrator detects failure
    ● Orchestrator takes action, promoting a new master
    ● Orchestrator updates the Consul KV store
    ● Consul distributes values and notifies watchers
    ● Consul-template receives a notification
    ● Consul-template recreates the query file and launches the configured command
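The workflow above can be sketched with an in-memory stand-in for the KV store. Nothing here talks to a real Consul: `KVStore`, `watch` and `on_failover` are hypothetical names, and the key layout (`mysql/master/<cluster>/hostname`, `mysql/slave/<cluster>/<host>`) is taken from the examples in this deck.

```python
from typing import Callable, Dict, List

Snapshot = Dict[str, str]


class KVStore:
    """In-memory stand-in for a key-value store that notifies watchers."""

    def __init__(self) -> None:
        self._data: Snapshot = {}
        self._watchers: List[Callable[[Snapshot], None]] = []

    def watch(self, callback: Callable[[Snapshot], None]) -> None:
        self._watchers.append(callback)

    def put(self, key: str, value: str = "") -> None:
        self._data[key] = value
        self._notify()

    def delete(self, key: str) -> None:
        self._data.pop(key, None)
        self._notify()

    def _notify(self) -> None:
        # Every change hands each watcher a copy of the current state.
        for callback in self._watchers:
            callback(dict(self._data))


def on_failover(kv: KVStore, cluster: str, failed: str, successor: str) -> None:
    """What the failover step writes to the store in this sketch."""
    kv.put(f"mysql/master/{cluster}/hostname", successor)
    kv.delete(f"mysql/slave/{cluster}/{successor}")  # new master leaves the readers
    kv.put(f"mysql/slave/{cluster}/{failed}")        # old master joins the readers
```

A watcher registered with `kv.watch(...)` plays the role of consul-template here: it is called on every change and could regenerate the proxy configuration from the snapshot it receives.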
  • 43. Orchestrator hooks
    ● Orchestrator has built-in support to update master entries in the Consul KV store
    ● But if we want the new master to be taken out of the reader pool we need to implement some "hooks" to update Consul
    ● Example
        # cat /usr/local/orchestrator/postfailoverhook.sh
        #!/bin/bash
        CONSUL_EXEC="/usr/local/bin/consul"
        CONSUL_SLAVE_PREFIX="mysql/slave"

        echo "Failover occurred. Running custom PostMasterFailoverProcesses Hook"

        # orchestrator needs to make call to consul so the following is accomplished:
        # 1. remove new master from slave group
        # 2. put old master in slave group
        echo "Removing new master from slave pool: ${CONSUL_SLAVE_PREFIX}/${ORC_FAILURE_CLUSTER_ALIAS}/${ORC_SUCCESSOR_HOST}"
        ${CONSUL_EXEC} kv delete ${CONSUL_SLAVE_PREFIX}/${ORC_FAILURE_CLUSTER_ALIAS}/${ORC_SUCCESSOR_HOST}

        echo "Adding old master to slave pool: ${CONSUL_SLAVE_PREFIX}/${ORC_FAILURE_CLUSTER_ALIAS}/${ORC_FAILED_HOST}"
        ${CONSUL_EXEC} kv put ${CONSUL_SLAVE_PREFIX}/${ORC_FAILURE_CLUSTER_ALIAS}/${ORC_FAILED_HOST}
  • 44. Consul-template examples
    ● Example of "query" file
        {{ if keyExists "mysql/master/testcluster/hostname" }}
        DELETE FROM mysql_servers where hostgroup_id=10;
        REPLACE into mysql_servers (hostgroup_id, hostname) values ( 10, "{{ key "mysql/master/testcluster/hostname" }}" );
        {{ end }}
        DELETE FROM mysql_servers where hostgroup_id=11;
        {{ range tree "mysql/slave/testcluster" }}
        REPLACE INTO mysql_servers (hostgroup_id, hostname) values ( 11, "{{ .Key }}{{ .Value }}" );
        {{ end }}
        LOAD MYSQL SERVERS TO RUNTIME;
        SAVE MYSQL SERVERS TO DISK;
    ● Example command
        command = "/bin/bash -c 'mysql --defaults-file=/etc/proxysql-admin.my.cnf < /opt/consul-template/templates/proxysql.sql'"
        command_timeout = "60s"
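To make the effect of the "query" file easier to follow, here is a rough Python equivalent of the rendering step. It is an illustration only: `render_proxysql_sql` is a hypothetical helper, the KV contents are passed in as a plain dict, and hostnames are interpolated without escaping, which is acceptable for a sketch but not for untrusted input.

```python
from typing import Dict, List


def render_proxysql_sql(kv: Dict[str, str], cluster: str) -> str:
    """Build the statements the template would produce for one cluster."""
    lines: List[str] = []

    # Hostgroup 10: only rewritten when a master key is present.
    master_key = f"mysql/master/{cluster}/hostname"
    if master_key in kv:
        lines.append("DELETE FROM mysql_servers WHERE hostgroup_id=10;")
        lines.append(
            "REPLACE INTO mysql_servers (hostgroup_id, hostname) "
            f"VALUES (10, '{kv[master_key]}');"
        )

    # Hostgroup 11: always rebuilt from the keys under the slave prefix.
    lines.append("DELETE FROM mysql_servers WHERE hostgroup_id=11;")
    prefix = f"mysql/slave/{cluster}/"
    for key in sorted(kv):
        if key.startswith(prefix):
            # Mirrors {{ .Key }}{{ .Value }}: the host is the key suffix,
            # followed by the (usually empty) value.
            host = key[len(prefix):] + kv[key]
            lines.append(
                "REPLACE INTO mysql_servers (hostgroup_id, hostname) "
                f"VALUES (11, '{host}');"
            )

    lines.append("LOAD MYSQL SERVERS TO RUNTIME;")
    lines.append("SAVE MYSQL SERVERS TO DISK;")
    return "\n".join(lines)
```

Note the asymmetry this mirrors from the template: the master hostgroup is left untouched when no master key exists, while the slave hostgroup is emptied and rebuilt on every render.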
  • 45. Single region solution
  • 46. Publicly shared example
    ● WePay: https://wecode.wepay.com/posts/highly-available-mysql-clusters-at-wepay
    ● Implementation done by our sibling team
    ● HAProxy was chosen over ProxySQL because they needed end-to-end SSL encryption
        ● ProxySQL 1.4.x does not support client-side SSL
        ● ProxySQL 2.x does support full end-to-end SSL encryption but was not GA yet at the time of implementation
  • 47. Where to go from here?
    ● We would want to create a multi-region setup
    ● This will work for read-heavy environments that can tolerate some "stale" reads
    ● The master will still be a single instance that runs in one region
    ● 1 or more slave(s) in each region
    ● ProxySQL cluster with an internal load balancer in each region; applications connect to their local proxy cluster
    ● Orchestrator and Consul servers can be scaled to 1 instance per region (the recommended maximum is 3 instances, so not every region might have an instance)
  • 48. Theoretical multi-region example