2© The Pythian Group Inc., 2018
AllThingsOpen, Raleigh, NC, USA
October 15, 2019
Matthias Crauwels
Implementing MySQL
Database-as-a-Service
using Open Source tools
Who am I?
Matthias Crauwels
● Living in Ghent, Belgium
● Bachelor Computer Science
● ~20 years Linux user / admin
● ~10 years PHP developer
● ~8 years MySQL DBA
● 3rd year at Pythian
● Currently Lead Database Consultant
● Father of Leander
Helping businesses
use data to compete
and win
AGENDA
Introduction and history
DBaaS: frontend
DBaaS: backend
Communication
Let's get started!
You start a new application, in many cases on a LAMP stack
● Linux
● Apache
● MySQL
● PHP
Everything on a single server!
History
Your application grows… What do you do?
You buy a bigger server!
History
Your application grows even more. Yay!
You buy more servers and split your infrastructure.
History
Database
Web server File server
Your application grows even more!
● You scale up the components
● Web servers are easy, just add more and load balance
● File servers are easy, get more/bigger disks, implement RAID solutions, ...
● What about the database??
■ More servers?
● Ok but what about the data?
■ I want all my web servers to see the same data.
● Writing it on all the servers? Overhead!
History
MySQL replication
● Writing to master
● Reading from replicas (slaves)
History
Million dollar questions
● How do we know what server is the master?
● How do we know which servers are the replicas?
● How do we manage this replication topology?
● What if the master goes down?
● What about maintenance?
● …
History
Database-as-a-Service
Frontend Solution
ProxySQL
ProxySQL is a high performance layer 7 proxy application for MySQL.
● It provides ‘intelligent’ load balancing of application requests onto
multiple databases
● It understands the MySQL traffic that passes through it, and can split
reads from writes.
● It understands the underlying database topology, whether the
instances are up or down
● It shields applications from the complexity of the underlying
database topology, as well as any changes to it
● ...
ProxySQL: What?
● Hostgroup
All backend MySQL servers are grouped into hostgroups. These “hostgroups” will be used
for query routing.
● Query rules
Query rules are used for routing, mirroring, rewriting or blocking queries. They are at the
heart of ProxySQL’s functionalities
● MySQL users and servers
These are configuration items which the proxy uses to operate
ProxySQL: terminology
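As a minimal sketch of how these terms fit together, the Python below only builds the admin statements for a basic read/write split; it does not connect to ProxySQL. The hostgroup numbers (0 for writers, 1 for readers), the hostnames and the rule ids are assumptions chosen for illustration. The tables mysql_servers and mysql_query_rules and the columns used are part of the ProxySQL admin interface.

```python
WRITER_HG = 0  # assumed writer hostgroup id
READER_HG = 1  # assumed reader hostgroup id


def server_sql(hostgroup_id, hostname, port=3306):
    """Build the INSERT that places one backend server in a hostgroup."""
    return (
        "INSERT INTO mysql_servers (hostgroup_id, hostname, port) "
        f"VALUES ({hostgroup_id}, '{hostname}', {port});"
    )


def rule_sql(rule_id, match_digest, destination_hostgroup):
    """Build the INSERT for one active query rule that routes by digest regex."""
    return (
        "INSERT INTO mysql_query_rules "
        "(rule_id, active, match_digest, destination_hostgroup, apply) "
        f"VALUES ({rule_id}, 1, '{match_digest}', {destination_hostgroup}, 1);"
    )


statements = [
    server_sql(WRITER_HG, "db-master.example"),      # hypothetical hostnames
    server_sql(READER_HG, "db-replica-1.example"),
    server_sql(READER_HG, "db-replica-2.example"),
    # Locking reads must stay on the writer, so this rule gets the lower id.
    rule_sql(1, "^SELECT.*FOR UPDATE", WRITER_HG),
    # Every other SELECT goes to the readers.
    rule_sql(2, "^SELECT", READER_HG),
    "LOAD MYSQL SERVERS TO RUNTIME;",
    "LOAD MYSQL QUERY RULES TO RUNTIME;",
]

print("\n".join(statements))
```

Statements that match no rule are sent to the default hostgroup of the MySQL user, which in a setup like this would normally be the writer hostgroup.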
ProxySQL: Basic design (1)
ProxySQL: Basic design (2)
ProxySQL: Internals
ProxySQL will be configured to share configuration values with its peers.
Currently, all instances are equal and any of them can be used to
reconfigure the cluster; there is no “master” or “leader”. A leader role is a
feature on the roadmap
(https://github.com/sysown/proxysql/wiki/ProxySQL-Cluster#roadmap).
Clustering helps to:
● Avoid having a ProxySQL instance be a single point of failure
● Avoid having to reconfigure every ProxySQL instance on the
application server
● (Auto-)scale the ProxySQL infrastructure
ProxySQL: Clustering
ProxySQL exists between the application and the database.
● It hides the complexity of the database topology from the application
● It knows which server is the master and which are the slaves
● It will not make changes to the topology so topology management
is not solved with this product.
● It has support for gracefully taking a server out of service
● It is easy to configure
● It can be clustered so that it is not a single point of failure
ProxySQL: Conclusions
Questions about
ProxySQL?
Database-as-a-Service
Backend management
Orchestrator
Orchestrator is a High Availability and replication management tool.
It can be used for:
● Discovery of a topology
● Visualisation of a topology
● Refactoring of a topology
● Recovery of a topology
Orchestrator: What?
Orchestrator can (and will) discover your entire replication topology as
soon as you connect it to a single server in the topology.
It will use SHOW SLAVE HOSTS, SHOW PROCESSLIST, SHOW
SLAVE STATUS to try and connect to the other servers in the topology.
Requirement: the orchestrator_topology_user needs to be created
on every server in the cluster so it can connect.
Orchestrator: Discovery
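The crawl described above can be pictured as a breadth-first walk over the replication graph. The sketch below is an illustration only, not Orchestrator's implementation: the neighbours dictionary is a hypothetical stand-in for what the SHOW statements would report on each server (its master and its replicas), and the hostnames are made up.

```python
from collections import deque


def discover(seed, neighbours):
    """Breadth-first crawl starting from one known server.

    neighbours maps a host to the hosts it reports (its master and replicas).
    Returns every host reached, in the order it was first seen.
    """
    seen = [seed]
    queue = deque([seed])
    while queue:
        host = queue.popleft()
        for other in neighbours.get(host, []):
            if other not in seen:
                seen.append(other)
                queue.append(other)
    return seen


# Hypothetical topology: master-1 -> replica-2, replica-3 -> replica-4
topology = {
    "master-1": ["replica-2", "replica-3"],
    "replica-2": ["master-1"],
    "replica-3": ["master-1", "replica-4"],
    "replica-4": ["replica-3"],
}

print(discover("replica-2", topology))
```

Starting from any single server, including a replica, the walk reaches the whole topology, which is why one seed host is enough.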
Orchestrator comes with a web interface that visualizes the servers in the
topology.
Orchestrator: Visualization
Orchestrator can be used to refactor the topology.
This can be done from the command line tool, via the API or even via the
web interface by dragging and dropping.
You can do things like
● Repoint a slave to a new master
● Promote a server to a (co-)master
● Start / Stop slave
● ...
Orchestrator: Refactoring
All of these features are nice, but they still require a human to execute
them. This doesn’t help you much when your master goes down at 3AM
and you get paged to resolve this.
Orchestrator can be configured to automatically recover your topology
from an outage.
Orchestrator: Recovery
To be able to perform a recovery, Orchestrator first needs to detect a
failure.
As indicated before Orchestrator connects to every server in the topology
and gathers information from each of the instances.
Orchestrator uses this information to make decisions on the best action to
take. They call this the holistic approach.
Orchestrator: How does recovery work?
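The holistic approach can be sketched as combining Orchestrator's own view of the master with what each replica reports. The function below is a simplified illustration under that assumption; the state labels are made up for this sketch and are not Orchestrator's analysis codes, and the real analysis considers many more cases.

```python
def analyse_master(master_reachable, replicas_replicating):
    """Classify the master from two sources of information.

    master_reachable: whether the monitoring node can connect to the master.
    replicas_replicating: one boolean per replica, True if its replication
    from the master is still running.
    """
    if master_reachable:
        return "no_problem"
    if any(replicas_replicating):
        # The replicas still receive changes, so the master is alive and the
        # problem is more likely between the monitoring node and the master.
        return "master_unreachable_replicas_ok"
    # Nobody can talk to the master: treat it as failed.
    return "master_dead"


print(analyse_master(False, [True, True]))
print(analyse_master(False, [False, False]))
```

Only the last state would justify a failover; acting on the monitoring node's view alone would promote a new master during a simple network hiccup.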
Orchestrator: Failure detection example
Orchestrator was written with High Availability as a basic concept.
You can easily run multiple Orchestrator instances with a shared MySQL
backend. All instances will collect all information but they will allow only
one instance to be the “active node” and to make changes to the
topology.
To eliminate a single-point-of-failure in the database backend you can
use either master-master replication (2 nodes) or Galera synchronous
replication (3 nodes).
Orchestrator High Availability
Since version 3.x of Orchestrator there is “Orchestrator-on-Raft”.
Orchestrator now implements the ‘raft consensus protocol’. This will
● Ensure that a leader node is elected from the available nodes
● Ensure that the leader node has a quorum (majority) at all times
● Allow Orchestrator to run without a shared database backend
● Allow Orchestrator to run without a MySQL backend, using an SQLite backend instead
Orchestrator High Availability
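The quorum requirement comes down to simple arithmetic: a leader needs a strict majority of the configured nodes, counting itself. A small sketch (the function names are chosen for this example):

```python
def quorum(nodes):
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def has_quorum(reachable, nodes):
    """True if a node that can reach `reachable` nodes (itself included)
    still holds a majority of a cluster of `nodes` nodes."""
    return reachable >= quorum(nodes)


for size in (3, 5):
    print(size, "nodes -> quorum of", quorum(size))
```

With 3 nodes the cluster tolerates the loss of one node; a node that is cut off from both of its peers can only count itself and cannot remain leader.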
A common example of a High Availability setup
● 3 Orchestrator nodes in different DCs
● Often one primary DC, one backup DC and one “arbitrator” node in a
cloud DC.
● Orchestrator developers have made changes to the Raft protocol to allow
■ the leader to step down
■ other nodes to yield to a certain node so that it becomes the leader
● Shlomi Noach from GitHub will definitely go into more detail on how
they implemented this at GitHub.
Orchestrator High Availability
Questions about
Orchestrator?
Overview
Architecture overview
[Diagram; labels recovered from the slide: APP(S), Leader]
Communication
Default behaviour
● Use the read-only flag monitoring in ProxySQL by adding a hostgroup pair to the
mysql_replication_hostgroups table
Admin> SHOW CREATE TABLE mysql_replication_hostgroups\G
*************************** 1. row ***************************
table: mysql_replication_hostgroups
Create Table: CREATE TABLE mysql_replication_hostgroups (
writer_hostgroup INT CHECK (writer_hostgroup>=0) NOT NULL PRIMARY KEY,
reader_hostgroup INT NOT NULL CHECK (reader_hostgroup<>writer_hostgroup AND reader_hostgroup>0),
comment VARCHAR,
UNIQUE (reader_hostgroup)
)
1 row in set (0.00 sec)
● Requires monitoring user to be configured correctly
ProxySQL read only flag monitoring
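To make the effect of a hostgroup pair concrete, the sketch below builds the INSERT for one pair and mimics the placement decision that follows from the monitored read_only value. It is a simplification written for this example: hostgroup ids 0 and 1, the comment text and the hostnames are assumptions, and ProxySQL options that also keep the writer in the reader hostgroup are not modelled.

```python
def pair_sql(writer_hg, reader_hg, comment):
    """Build the INSERT that registers a writer/reader hostgroup pair."""
    return (
        "INSERT INTO mysql_replication_hostgroups "
        "(writer_hostgroup, reader_hostgroup, comment) "
        f"VALUES ({writer_hg}, {reader_hg}, '{comment}');"
    )


def place_servers(read_only_by_host, writer_hg, reader_hg):
    """Map each host to a hostgroup: read_only servers become readers,
    writable servers become writers."""
    return {
        host: (reader_hg if read_only else writer_hg)
        for host, read_only in read_only_by_host.items()
    }


print(pair_sql(0, 1, "cluster1"))
print(place_servers({"db1": False, "db2": True, "db3": True}, 0, 1))
```

The placement follows the flag and nothing else, which is why a server that wrongly reports itself as writable ends up in the writer hostgroup.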
● Orchestrator will flip the read-only flag on master failover
● Setting ApplyMySQLPromotionAfterMasterFailover
● default value was false (Orchestrator version < 3.0.12)
● since 3.0.12 default is true
● Recommendation has always been to enable this.
● Configure MySQL to be read-only by default (best practice)
Orchestrator ApplyMySQLPromotionAfterMasterFailover
● What happens on network partitions?
● Orchestrator sees the master as unavailable and promotes a new master
● The old master is still writable (Orchestrator cannot reach it to toggle the
flag)
● ProxySQL will move the new master (writable) to the writer hostgroup
● ProxySQL will mark the old master as SHUNNED.
● When the network partition is resolved, the old master will still be writable, so it will
return to ONLINE.
● This will lead to split brain
Default behaviour: Caveats
● Solutions to prevent this split brain scenario
● STONITH (shoot the other node in the head)
● Run script in ProxySQL scheduler that deletes any SHUNNED writers
from the configuration (both from the writer and reader hostgroups)
Default behaviour: Caveats / workarounds
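A scheduler script of this kind could look roughly like the sketch below. It only contains the decision logic: it takes rows shaped like (hostgroup_id, hostname, port, status), as one might select from runtime_mysql_servers, and returns the statements to run. Connecting to the admin interface and executing them is left out, and the hostgroup ids and hostnames are assumptions.

```python
def shunned_writer_cleanup(rows, writer_hg, reader_hg):
    """Return admin statements that remove every SHUNNED writer from both
    the writer and the reader hostgroup.

    rows: iterable of (hostgroup_id, hostname, port, status) tuples.
    """
    statements = []
    for hostgroup_id, hostname, port, status in rows:
        if hostgroup_id == writer_hg and status == "SHUNNED":
            statements.append(
                "DELETE FROM mysql_servers "
                f"WHERE hostname = '{hostname}' AND port = {port} "
                f"AND hostgroup_id IN ({writer_hg}, {reader_hg});"
            )
    if statements:
        # Only reload when something was actually removed.
        statements.append("LOAD MYSQL SERVERS TO RUNTIME;")
    return statements


rows = [
    (0, "db-old-master", 3306, "SHUNNED"),  # hypothetical hosts
    (0, "db-new-master", 3306, "ONLINE"),
    (1, "db-replica-1", 3306, "ONLINE"),
]
print("\n".join(shunned_writer_cleanup(rows, 0, 1)))
```

Because the old master is deleted rather than left SHUNNED, it cannot silently return to ONLINE once the partition heals; someone has to add it back deliberately.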
Communication
Orchestrator hooks
● Orchestrator implements hooks at various stages of the recovery
process
● These "hooks" are like events that will be called and you can
configure your own scripts to run
● This makes Orchestrator highly customisable and scriptable
● Default (naive) configuration will echo text to /tmp/recovery.log
● Use the hooks! If not for scripting then for alerting / notifying you
that something happened
Orchestrator hooks: What?
● Instead of relying on ProxySQL's monitoring of the read-only flag we
can now actively push changes to ProxySQL using the hooks.
● Whenever a planned or unplanned master change takes place we
will update ProxySQL.
● Pre-failover:
■ Remove {failedHost} from the writer hostgroup
● Post-failover:
■ If the recovery was successful: Insert {successorHost} in the writer
hostgroup
● WARNING: test test test test test test !!!!!
(before enabling automated failovers in production)
Orchestrator hooks: Why?
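The two hook actions above could be sketched as follows. This is an assumption-laden illustration, not a ready-made hook: it only builds the ProxySQL admin statements for each stage, the writer hostgroup id is assumed to be 0, and sending the statements to every ProxySQL instance is left to the surrounding script.

```python
WRITER_HG = 0  # assumed writer hostgroup id


def pre_failover_sql(failed_host, failed_port):
    """Statements for the pre-failover hook: take the failed master out of
    the writer hostgroup before a new master is promoted."""
    return [
        f"DELETE FROM mysql_servers WHERE hostgroup_id = {WRITER_HG} "
        f"AND hostname = '{failed_host}' AND port = {failed_port};",
        "LOAD MYSQL SERVERS TO RUNTIME;",
    ]


def post_failover_sql(successor_host, successor_port):
    """Statements for the post-failover hook: add the promoted server to the
    writer hostgroup once the recovery has succeeded."""
    return [
        "REPLACE INTO mysql_servers (hostgroup_id, hostname, port) "
        f"VALUES ({WRITER_HG}, '{successor_host}', {successor_port});",
        "LOAD MYSQL SERVERS TO RUNTIME;",
        "SAVE MYSQL SERVERS TO DISK;",
    ]


print("\n".join(pre_failover_sql("db-old-master", 3306)))
print("\n".join(post_failover_sql("db-new-master", 3306)))
```

In the Orchestrator configuration, a script wrapping these functions would be listed under the pre- and post-failover process settings, with placeholders such as {failedHost} and {successorHost} passed as arguments.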
Communication
Decouple communication (between ProxySQL and Orchestrator)
● Orchestrator hooks are great but...
● ... what happens if there is no communication possible between
Orchestrator and ProxySQL?
● Hooks are only fired once
● What if ProxySQL is not reachable? Stop failover?
● You need ProxySQL admin credentials available on Orchestrator
The problem
● Decouple Orchestrator and ProxySQL
● Use Consul as key-value store in between both
● Orchestrator has built-in support to update master coordinates in the
K/V store (both for Zookeeper and Consul)
● Configuration settings
● "KVClusterMasterPrefix": "mysql/master",
● "ConsulAddress": "127.0.0.1:8500",
● "ZkAddress": "srv-a,srv-b:12181,srv-c",
The solution
● KVClusterMasterPrefix is the prefix to use for master discovery
entries. For example, if your cluster alias is mycluster and the master
host is some.host-17.com, then you can expect an entry where:
● The Key is mysql/master/mycluster
● The Value is some.host-17.com:3306
● Additionally, the following keys/values will be available automatically
● mysql/master/mycluster/hostname , value is some.host-17.com
● mysql/master/mycluster/port , value is 3306
● mysql/master/mycluster/ipv4 , value is 192.168.0.1
● mysql/master/mycluster/ipv6 , value is <whatever>
Which keys and values?
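The naming scheme above is easy to reproduce, which helps when writing consumers of these keys. The helper below is a sketch written for this example (the function name is made up); it derives the main entry and the hostname and port sub-keys, and leaves out ipv4 and ipv6 because those depend on name resolution.

```python
def master_kv_entries(prefix, cluster_alias, hostname, port):
    """Return the key/value pairs expected for one cluster's master."""
    base = f"{prefix}/{cluster_alias}"
    return {
        base: f"{hostname}:{port}",
        f"{base}/hostname": hostname,
        f"{base}/port": str(port),
    }


entries = master_kv_entries("mysql/master", "mycluster", "some.host-17.com", 3306)
for key, value in entries.items():
    print(key, "=", value)
```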
● Recommended setup for Orchestrator is to run 3 nodes with their
own local datastore (MySQL or SQLite)
● Communication between nodes happens using the RAFT protocol.
● This is also the preferred setup for the Consul K/V store
● We install a Consul "server" on each Orchestrator node
● A Consul "server" also comes with an "agent"
● We let the Orchestrator leader send its updates to the local Consul
agent.
● Consul agent updates the Consul leader node and the leader
distributes the data to all 3 nodes using the RAFT protocol.
Avoiding single points of failure (1)
● We now have our HA for Orchestrator and Consul.
● We have avoided network partitioning
● Majority vote is required to be the leader on both applications
● If our local Consul agent is unable to reach the Consul leader node, then
Orchestrator will not be able to reach its peers and thus not be the
Leader node.
● Optional: Orchestrator extends RAFT to implement a yield option to
yield to a specific leader. We could implement a cronjob for
Orchestrator to always yield Orchestrator leadership to the Consul
leader for faster updates, but this is not a requirement.
Avoiding single points of failure (2)
● Orchestrator really doesn't care all that much for slaves
● Masters are important for HA
● the native support for the K/V store ends with updating the masters to it
("KVClusterMasterPrefix": "mysql/master" )
● API to the rescue!
● We can create a fairly simple script that runs in a cron
● pull ALL the servers from the API (get JSON response)
● compare the slave entries with values in Consul (for example keys
starting with mysql/slaves)
● update Consul if needed
What about the slaves?
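The compare-and-update step of such a script might look like the sketch below. It is a guess at a workable shape rather than the author's script: the JSON field names (Key, MasterKey, Hostname, Port) are assumed to match what the Orchestrator API returns for instances, the key layout under mysql/slaves is an assumption, and fetching the JSON and writing to Consul are left out. Co-master setups are not handled.

```python
def replica_changes(instances, consul_keys, prefix):
    """Compare replicas reported by the API with what is stored in Consul.

    instances: list of dicts with "Key" and "MasterKey" entries (assumed shape).
    consul_keys: dict of existing Consul keys under `prefix` and their values.
    Returns (puts, deletes): keys to write with their values, keys to remove.
    """
    wanted = {
        f"{prefix}/{inst['Key']['Hostname']}": str(inst["Key"]["Port"])
        for inst in instances
        if inst["MasterKey"]["Hostname"]  # a server with a master is a replica
    }
    puts = {k: v for k, v in wanted.items() if consul_keys.get(k) != v}
    deletes = sorted(k for k in consul_keys if k not in wanted)
    return puts, deletes


instances = [  # hypothetical API response
    {"Key": {"Hostname": "db1", "Port": 3306}, "MasterKey": {"Hostname": "", "Port": 0}},
    {"Key": {"Hostname": "db2", "Port": 3306}, "MasterKey": {"Hostname": "db1", "Port": 3306}},
    {"Key": {"Hostname": "db3", "Port": 3306}, "MasterKey": {"Hostname": "db1", "Port": 3306}},
]
consul = {
    "mysql/slaves/mycluster/db2": "3306",
    "mysql/slaves/mycluster/db9": "3306",
}
print(replica_changes(instances, consul, "mysql/slaves/mycluster"))
```

Returning the difference instead of rewriting every key keeps the number of Consul writes small, so listeners are only notified when the set of replicas really changes.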
● Now Orchestrator is updating Consul K/V (master via native support,
slaves via our script)
● Let's install a Consul "agent" on every ProxySQL machine.
● We can now query Consul data via this local agent
root@proxysql-1:~ $ consul members
Node Address Status Type Build Protocol DC Segment
orchestrator-1 10.0.1.2:8301 alive server 1.4.3 2 default <all>
orchestrator-2 10.0.2.2:8301 alive server 1.4.3 2 default <all>
orchestrator-3 10.0.3.2:8301 alive server 1.4.3 2 default <all>
proxysql-1 10.0.1.3:8301 alive client 1.4.3 2 default <default>
proxysql-2 10.0.2.3:8301 alive client 1.4.3 2 default <default>
How to configure ProxySQL?
● First option is to use the scripted approach
● Run a script in a cronjob or in the ProxySQL scheduler
● Crawl the Consul K/V store
● Update ProxySQL config
How to configure ProxySQL?
Pro:
● Fairly easy
● Fairly quick (ProxySQL scheduler works on a millisecond basis)
Con:
● A lot of wasted CPU cycles
● Use consul-template
● Registers as listener to the Consul values
● Every time a value is changed it will re-generate a file from a template
● Example:
{{ if keyExists "mysql/master/testcluster/hostname" }}
DELETE FROM mysql_servers where hostgroup_id = 0;
REPLACE into mysql_servers (hostgroup_id, hostname) values ( 0, "{{ key
"mysql/master/testcluster/hostname" }}" );
{{ end }}
{{ range tree "mysql/slave/testcluster" }}
REPLACE into mysql_servers (hostgroup_id, hostname) values ( 1, "{{ .Key }}{{ .Value }}" );
{{ end }}
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
How to configure ProxySQL?
Architecture
Questions?
Contact
Matthias Crauwels
crauwels@pythian.com
+1 (613) 565-8696 ext. 1215
Twitter @mcrauwel
We're hiring!!
https://pythian.com/careers

Implementing MySQL Database-as-a-Service using open source tools

  • 1. 2© The Pythian Group Inc., 2018 AllThingsOpen, Raleigh, NC, USA October 15, 2019 Matthias Crauwels Implementing MySQL Database-as-a-Service using Open Source tools
  • 2. 3© The Pythian Group Inc., 2018 Who am I?
  • 3. 4© The Pythian Group Inc., 2018 4© The Pythian Group Inc., 2019 Matthias Crauwels ● Living in Ghent, Belgium ● Bachelor Computer Science ● ~20 years Linux user / admin ● ~10 years PHP developer ● ~8 years MySQL DBA ● 3rd year at Pythian ● Currently Lead Database Consultant ● Father of Leander
  • 4. 5© The Pythian Group Inc., 2018 5© The Pythian Group Inc., 2018 Helping businesses use data to compete and win
  • 5. 6© The Pythian Group Inc., 2018 AGENDA 6© The Pythian Group Inc., 2019 Introduction and history DBaaS: frontend DBaaS: backend Communication
  • 6. 7© The Pythian Group Inc., 2018 Let's get started!
  • 7. 8© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 8 You start a new application, in many cases on a LAMP stack ● Linux ● Apache ● MySQL ● PHP Everything on a single server! History
  • 8. 9© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 9 Your application grows… What do you do? You buy a bigger server! History
  • 9. 10© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 10 You application grows even more. Yay! You buy more servers and split your infrastructure. History Database Web server File server
  • 10. 11© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 11 Your application grows even more! ● You scale up the components ● Web servers are easy, just add more and load balance ● File servers are easy, get more/bigger disks, implement RAID solutions, ... ● What about the database?? ■ More servers? ● Ok but what about the data? ■ I want all my web servers to see the same data. ● Writing it on all the servers? Overhead! History
  • 11. 12© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 12 MySQL replication ● Writing to master ● Reading from replica’s (slaves) History
  • 12. 13© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 13 Million dollar questions ● How do we know what server is the master? ● How do we know which servers are the replica’s? ● How do we manage this replication topology? ● What if the master goes down? ● What about maintenance? ● … History
  • 13. 14© The Pythian Group Inc., 2018 Database-as-a-Service Frontend Solution
  • 14. 15© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 15 ProxySQL
  • 15. 16© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 16 ProxySQL is a high performance layer 7 proxy application for MySQL. ● It provides ‘intelligent’ load balancing of application requests onto multiple databases ● It understands the MySQL traffic that passes through it, and can split reads from writes. ● It understands the underlying database topology, whether the instances are up or down ● It shields applications from the complexity of the underlying database topology, as well as any changes to it ● ... ProxySQL: What?
  • 16. 17© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 17 ● Hostgroup All backend MySQL servers are grouped into hostgroups. These “hostgroups” will be used for query routing. ● Query rules Query rules are used for routing, mirroring, rewriting or blocking queries. They are at the heart of ProxySQL’s functionalities ● MySQL users and servers These are configuration items which the proxy uses to operate ProxySQL: terminology
  • 17. 18© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 18 ProxySQL: Basic design (1)
  • 18. 19© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 19 ProxySQL: Basic design (2)
  • 19. 20© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 20 ProxySQL: Internals
  • 20. 21© The Pythian Group Inc., 2018© The Pythian Group Inc., 2019 21 ProxySQL will be configured to share configuration values with its peers. Currently, all instances are equal and can be used to reconfigure, there is no “master” or “leader”. This is a feature on the roadmap (https://github.com/sysown/proxysql/wiki/ProxySQL-Cluster#roadmap). Helps to: ● Avoid your ProxySQL instance to be the single point of failure ● Avoid having to reconfigure every ProxySQL instance on the application server ● Helps to (auto-)scale the ProxySQL infrastructure ProxySQL: Clustering
  • 21. ProxySQL: Conclusions
    ProxySQL sits between the application and the database.
    ● It hides the complexity of the database topology from the application
    ● It knows which server is the master and which are the slaves
    ● It does not make changes to the topology, so topology management is not solved by this product
    ● It supports gracefully taking a server out of service
    ● It is easy to configure
    ● It can be clustered so that it is not a single point of failure
  • 22. Questions about ProxySQL?
  • 23. Database-as-a-Service Backend management
  • 24. Orchestrator
  • 25. Orchestrator: What?
    Orchestrator is a High Availability and replication management tool. It can be used for:
    ● Discovery of a topology
    ● Visualisation of a topology
    ● Refactoring of a topology
    ● Recovery of a topology
  • 26. Orchestrator: Discovery
    Orchestrator can (and will) discover your entire replication topology as soon as you connect it to a single server in the topology. It uses SHOW SLAVE HOSTS, SHOW PROCESSLIST and SHOW SLAVE STATUS to find and connect to the other servers in the topology.
    Requirement: the orchestrator_topology_user needs to be created on every server in the cluster so Orchestrator can connect.
  • 27. Orchestrator: Visualization
    Orchestrator comes with a web interface that visualizes the servers in the topology.
  • 28. Orchestrator: Refactoring
    Orchestrator can be used to refactor the topology. This can be done from the command line tool, via the API or even via the web interface by dragging and dropping. You can do things like:
    ● Repoint a slave to a new master
    ● Promote a server to a (co-)master
    ● Start / Stop slave
    ● ...
  • 29. Orchestrator: Recovery
    All of these features are nice, but they still require a human to execute them. That doesn’t help much when your master goes down at 3 AM and you get paged to resolve it.
    Orchestrator can be configured to automatically recover your topology from an outage.
  • 30. Orchestrator: How does recovery work?
    To be able to perform a recovery, Orchestrator first needs to detect a failure. As indicated before, Orchestrator connects to every server in the topology and gathers information from each of the instances. Orchestrator uses this information to decide on the best action to take. Its developers call this the holistic approach.
  • 31. Orchestrator: Failure detection example
  • 32. Orchestrator High Availability
    Orchestrator was written with High Availability as a basic concept. You can easily run multiple Orchestrator instances with a shared MySQL backend. All instances collect all information, but only one instance is allowed to be the “active node” and to make changes to the topology.
    To eliminate a single point of failure in the database backend you can use either master-master replication (2 nodes) or Galera synchronous replication (3 nodes).
  • 33. Orchestrator High Availability
    Since version 3.x of Orchestrator there is “Orchestrator-on-Raft”: Orchestrator now implements the Raft consensus protocol. This will:
    ● Ensure that a leader node is elected from the available nodes
    ● Ensure that the leader node has a quorum (majority) at all times
    ● Allow Orchestrator to run without a shared database backend
    ● Allow Orchestrator to run on a SQLite backend instead of a MySQL backend
  • 34. Orchestrator High Availability
    A common example of a High Availability setup:
    ● 3 Orchestrator nodes in different DCs
    ● Often one primary DC, one backup DC and one “arbitrator” node in a cloud DC
    ● The Orchestrator developers have made changes to the Raft protocol to allow
      ■ the leader to step down
      ■ other nodes to yield to a certain node so that it becomes the leader
    ● Shlomi Noach from GitHub will definitely go into more detail on how they implemented this at GitHub.
  • 35. Questions about Orchestrator?
  • 36. Overview
  • 37. Architecture overview (diagram; labels: APP(S), Leader)
  • 38. Communication: Default behaviour
  • 39. ProxySQL read only flag monitoring
    ● Use the read-only flag monitoring in ProxySQL by adding a hostgroup pair to the mysql_replication_hostgroups table

      Admin> SHOW CREATE TABLE mysql_replication_hostgroups\G
      *************************** 1. row ***************************
             table: mysql_replication_hostgroups
      Create Table: CREATE TABLE mysql_replication_hostgroups (
          writer_hostgroup INT CHECK (writer_hostgroup>=0) NOT NULL PRIMARY KEY,
          reader_hostgroup INT NOT NULL CHECK (reader_hostgroup<>writer_hostgroup AND reader_hostgroup>0),
          comment VARCHAR,
          UNIQUE (reader_hostgroup)
      )
      1 row in set (0.00 sec)

    ● Requires the monitoring user to be configured correctly
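The constraints in that table definition decide which hostgroup pairs are accepted. As a rough illustration only, the sketch below recreates the same schema in an in-memory SQLite database with Python's standard library (the ProxySQL admin interface is not involved here; the helper name `try_add_pair` and the comment text `cluster1` are made up for the example). On a real ProxySQL instance the equivalent `INSERT` would be issued through the admin interface and followed by `LOAD MYSQL SERVERS TO RUNTIME`.

```python
import sqlite3

# Schema copied from the SHOW CREATE TABLE output on the slide.
SCHEMA = """
CREATE TABLE mysql_replication_hostgroups (
    writer_hostgroup INT CHECK (writer_hostgroup>=0) NOT NULL PRIMARY KEY,
    reader_hostgroup INT NOT NULL CHECK (reader_hostgroup<>writer_hostgroup AND reader_hostgroup>0),
    comment VARCHAR,
    UNIQUE (reader_hostgroup)
)
"""


def try_add_pair(conn, writer, reader, comment=""):
    """Hypothetical helper: insert a writer/reader pair, report whether it was accepted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO mysql_replication_hostgroups "
                "(writer_hostgroup, reader_hostgroup, comment) VALUES (?, ?, ?)",
                (writer, reader, comment),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised for CHECK, UNIQUE and PRIMARY KEY violations alike.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

results = {
    "writer 0 / reader 1": try_add_pair(conn, 0, 1, "cluster1"),  # valid pair
    "writer 2 / reader 2": try_add_pair(conn, 2, 2),  # reader must differ from writer
    "writer 3 / reader 1": try_add_pair(conn, 3, 1),  # reader hostgroup 1 already in use
}

for label, accepted in results.items():
    print(label, "accepted" if accepted else "rejected")
```

The takeaway is that a reader hostgroup can belong to only one pair and can never be the same hostgroup as its writer.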
  • 40. Orchestrator ApplyMySQLPromotionAfterMasterFailover
    ● Orchestrator will flip the read-only flag on master failover
    ● Setting: ApplyMySQLPromotionAfterMasterFailover
      ● The default value was false (Orchestrator version < 3.0.12)
      ● Since 3.0.12 the default is true
      ● The recommendation has always been to enable this
    ● Configure MySQL to be read-only by default (best practice)
  • 41. Default behaviour: Caveats
    ● What happens on network partitions?
      ● Orchestrator sees the master as unavailable and promotes a new one
      ● The old master is still writable (Orchestrator cannot reach it to toggle the flag)
      ● ProxySQL will move the new (writable) master to the writer hostgroup
      ● ProxySQL will mark the old master as SHUNNED
      ● When the network partition is resolved, the old master is still writable, so it returns to ONLINE
      ● This leads to split brain
  • 42. Default behaviour: Caveats / workarounds
    ● Solutions to prevent this split brain scenario
      ● STONITH (shoot the other node in the head)
      ● Run a script in the ProxySQL scheduler that deletes any SHUNNED writers from the configuration (both from the writer and reader hostgroups)
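The second workaround could look roughly like the sketch below. It is not the script used in the talk: the function name is hypothetical, and it only builds the admin statements from rows shaped like ProxySQL's `runtime_mysql_servers` table (`hostgroup_id`, `hostname`, `port`, `status`). Fetching those rows and executing the statements against the admin interface is left to whatever MySQL client the scheduler script uses.

```python
def shunned_writer_cleanup_sql(rows, writer_hostgroup, reader_hostgroup):
    """Build admin statements that remove SHUNNED writers from both hostgroups.

    `rows` is assumed to be an iterable of dicts with the keys hostgroup_id,
    hostname, port and status, as read from runtime_mysql_servers.
    """
    shunned = sorted(
        {
            (row["hostname"], int(row["port"]))
            for row in rows
            if int(row["hostgroup_id"]) == writer_hostgroup
            and row["status"] == "SHUNNED"
        }
    )
    if not shunned:
        return []  # nothing to do, so do not reload the configuration

    statements = []
    for hostname, port in shunned:
        safe_host = hostname.replace("'", "''")  # minimal quoting for the sketch
        statements.append(
            f"DELETE FROM mysql_servers WHERE hostname = '{safe_host}' "
            f"AND port = {port} "
            f"AND hostgroup_id IN ({writer_hostgroup}, {reader_hostgroup});"
        )
    statements.append("LOAD MYSQL SERVERS TO RUNTIME;")
    statements.append("SAVE MYSQL SERVERS TO DISK;")
    return statements


example_rows = [
    {"hostgroup_id": 0, "hostname": "db1", "port": 3306, "status": "SHUNNED"},
    {"hostgroup_id": 0, "hostname": "db2", "port": 3306, "status": "ONLINE"},
    {"hostgroup_id": 1, "hostname": "db3", "port": 3306, "status": "SHUNNED"},
]
for statement in shunned_writer_cleanup_sql(example_rows, 0, 1):
    print(statement)
```

Only SHUNNED servers in the writer hostgroup are removed; a SHUNNED reader such as `db3` in the example is left alone, since a reader that comes back does not cause split brain.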
  • 43. Communication: Orchestrator hooks
  • 44. Orchestrator hooks: What?
    ● Orchestrator implements hooks at various stages of the recovery process
    ● These "hooks" are like events that get called, and you can configure your own scripts to run on them
    ● This makes Orchestrator highly customisable and scriptable
    ● The default (naive) configuration will echo text to /tmp/recovery.log
    ● Use the hooks! If not for scripting, then for alerting / notifying you that something happened
  • 45. Orchestrator hooks: Why?
    ● Instead of relying on ProxySQL's monitoring of the read-only flag, we can now actively push changes to ProxySQL using the hooks.
    ● Whenever a planned or unplanned master change takes place, we update ProxySQL.
      ● Pre-failover:
        ■ Remove {failedHost} from the writer hostgroup
      ● Post-failover:
        ■ If the recovery was successful: insert {successorHost} in the writer hostgroup
    ● WARNING: test test test test test test !!!!! (before enabling automated failovers in production)
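A hook script along these lines might build its ProxySQL statements as in the following sketch. The function names are hypothetical and the talk does not show the actual scripts. The host values stand in for what Orchestrator substitutes for `{failedHost}` and `{successorHost}`. The port and the writer hostgroup number are assumed to be supplied by the caller, and running the statements against the ProxySQL admin interface is not shown.

```python
APPLY = ["LOAD MYSQL SERVERS TO RUNTIME;", "SAVE MYSQL SERVERS TO DISK;"]


def _quote(value):
    """Minimal single-quote escaping, good enough for a sketch."""
    return "'" + str(value).replace("'", "''") + "'"


def pre_failover_sql(failed_host, writer_hostgroup):
    """Pre-failover hook: remove the failed master from the writer hostgroup."""
    return [
        f"DELETE FROM mysql_servers WHERE hostgroup_id = {int(writer_hostgroup)} "
        f"AND hostname = {_quote(failed_host)};"
    ] + APPLY


def post_failover_sql(successor_host, successor_port, writer_hostgroup,
                      recovery_successful=True):
    """Post-failover hook: add the promoted server, but only after a successful recovery."""
    if not recovery_successful:
        return []
    return [
        "REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES "
        f"({int(writer_hostgroup)}, {_quote(successor_host)}, {int(successor_port)});"
    ] + APPLY


# Example: db1 failed and db2 was promoted; hostgroup 0 is the writer hostgroup.
for statement in pre_failover_sql("db1", 0) + post_failover_sql("db2", 3306, 0):
    print(statement)
```

Keeping the statement building separate from the connection handling makes it easy to dry-run the hooks, which fits the warning above about testing before enabling automated failovers.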
  • 46. Communication: Decouple communication (between ProxySQL and Orchestrator)
  • 47. The problem
    ● Orchestrator hooks are great but...
    ● ... what happens if there is no communication possible between Orchestrator and ProxySQL?
      ● Hooks are only fired once
      ● What if ProxySQL is not reachable? Stop the failover?
    ● You need ProxySQL admin credentials available on Orchestrator
  • 48. The solution
    ● Decouple Orchestrator and ProxySQL
    ● Use Consul as a key-value store in between the two
    ● Orchestrator has built-in support to update master coordinates in the K/V store (both for Zookeeper and Consul)
    ● Configuration settings
      ● "KVClusterMasterPrefix": "mysql/master",
      ● "ConsulAddress": "127.0.0.1:8500",
      ● "ZkAddress": "srv-a,srv-b:12181,srv-c",
  • 49. Which keys and values?
    ● KVClusterMasterPrefix is the prefix to use for master discovery entries. For example, if your cluster alias is mycluster and the master host is some.host-17.com, then you can expect an entry where:
      ● The key is mysql/master/mycluster
      ● The value is some.host-17.com:3306
    ● Additionally, the following keys/values will be available automatically
      ● mysql/master/mycluster/hostname , value is some.host-17.com
      ● mysql/master/mycluster/port , value is 3306
      ● mysql/master/mycluster/ipv4 , value is 192.168.0.1
      ● mysql/master/mycluster/ipv6 , value is <whatever>
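To make the key layout concrete, the small sketch below rebuilds the entries listed above from a prefix, a cluster alias and the master's coordinates. The function is purely illustrative (Orchestrator writes these entries itself); it is the sort of helper that is handy when testing a consumer of the K/V store.

```python
def master_kv_entries(prefix, cluster_alias, hostname, port, ipv4="", ipv6=""):
    """Return the key/value pairs described on the slide for one cluster master."""
    base = f"{prefix}/{cluster_alias}"
    return {
        base: f"{hostname}:{port}",
        f"{base}/hostname": hostname,
        f"{base}/port": str(port),
        f"{base}/ipv4": ipv4,
        f"{base}/ipv6": ipv6,
    }


entries = master_kv_entries(
    "mysql/master", "mycluster", "some.host-17.com", 3306, ipv4="192.168.0.1"
)
for key, value in entries.items():
    print(key, "=", value)
```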
  • 50. Avoiding single points of failure (1)
    ● The recommended setup for Orchestrator is to run 3 nodes, each with its own local datastore (MySQL or SQLite)
    ● Communication between nodes happens using the Raft protocol
    ● This is also the preferred setup for the Consul K/V store
    ● We install a Consul "server" on each Orchestrator node
    ● A Consul "server" also comes with an "agent"
    ● We let the Orchestrator leader send its updates to the local Consul agent
    ● The Consul agent updates the Consul leader node, and the leader distributes the data to all 3 nodes using the Raft protocol
  • 51. Avoiding single points of failure (2)
    ● We now have HA for both Orchestrator and Consul
    ● We are protected against network partitioning
      ● A majority vote is required to be the leader in both applications
      ● If our local Consul agent is unable to reach the Consul leader node, then Orchestrator will not be able to reach its peers either, and thus will not be the leader node
    ● Optional: Orchestrator extends Raft with a yield option to yield to a specific leader. We could implement a cronjob for Orchestrator to always yield Orchestrator leadership to the Consul leader for faster updates, but this is not a requirement.
  • 52. What about the slaves?
    ● Orchestrator really doesn't care all that much about slaves
      ● Masters are what matters for HA
      ● The native support for the K/V store ends with writing the masters to it ("KVClusterMasterPrefix": "mysql/master")
    ● API to the rescue!
      ● We can create a fairly simple script that runs from cron and will
        ● pull ALL the servers from the API (JSON response)
        ● compare the slave entries with the values in Consul (for example, keys starting with mysql/slaves)
        ● update Consul if needed
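The comparison step of such a cron script might be sketched as follows. Several things here are assumptions rather than facts from the talk: each instance in the API's JSON response is assumed to carry `Key` and `MasterKey` objects with `Hostname` and `Port` fields, the key layout `mysql/slaves/<cluster>/<hostname>` with the port as value is invented for the example, and the HTTP calls to Orchestrator and Consul are left out so that only the diff logic is shown.

```python
def plan_slave_updates(instances, consul_entries, cluster_alias,
                       prefix="mysql/slaves"):
    """Compare replicas reported by Orchestrator with what is stored in Consul.

    Returns (to_put, to_delete): the keys that must be written or updated and
    the stale keys that should be removed.
    """
    base = f"{prefix}/{cluster_alias}/"

    desired = {}
    for inst in instances:
        master = inst.get("MasterKey") or {}
        if not master.get("Hostname"):
            continue  # no upstream master, so this instance is the master itself
        key = inst["Key"]
        desired[base + key["Hostname"]] = str(key["Port"])

    # Only look at keys below this cluster's slave prefix.
    current = {k: v for k, v in consul_entries.items() if k.startswith(base)}

    to_put = {k: v for k, v in desired.items() if current.get(k) != v}
    to_delete = sorted(k for k in current if k not in desired)
    return to_put, to_delete


instances = [
    {"Key": {"Hostname": "db1", "Port": 3306}, "MasterKey": {"Hostname": "", "Port": 0}},
    {"Key": {"Hostname": "db2", "Port": 3306}, "MasterKey": {"Hostname": "db1", "Port": 3306}},
    {"Key": {"Hostname": "db3", "Port": 3306}, "MasterKey": {"Hostname": "db1", "Port": 3306}},
]
consul_entries = {
    "mysql/master/mycluster": "db1:3306",
    "mysql/slaves/mycluster/db2": "3306",
    "mysql/slaves/mycluster/db9": "3306",  # replica that no longer exists
}

to_put, to_delete = plan_slave_updates(instances, consul_entries, "mycluster")
print("put:", to_put)
print("delete:", to_delete)
```

Returning a plan instead of writing to Consul directly means that a run with no changes results in no writes, which keeps any listeners on those keys quiet.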
  • 53. How to configure ProxySQL?
    ● Orchestrator is now updating the Consul K/V store (master via native support, slaves via our script)
    ● Let's install a Consul "agent" on every ProxySQL machine
    ● We can now query Consul data via this local agent

      root@proxysql-1:~ $ consul members
      Node            Address        Status  Type    Build  Protocol  DC       Segment
      orchestrator-1  10.0.1.2:8301  alive   server  1.4.3  2         default  <all>
      orchestrator-2  10.0.2.2:8301  alive   server  1.4.3  2         default  <all>
      orchestrator-3  10.0.3.2:8301  alive   server  1.4.3  2         default  <all>
      proxysql-1      10.0.1.3:8301  alive   client  1.4.3  2         default  <default>
      proxysql-2      10.0.2.3:8301  alive   client  1.4.3  2         default  <default>
  • 54. How to configure ProxySQL?
    ● The first option is to use the scripted approach
      ● Run a script in a cronjob or in the ProxySQL scheduler
      ● Crawl the Consul K/V store
      ● Update the ProxySQL config
    Pro: Fairly easy; fairly quick (the ProxySQL scheduler works on a millisecond basis)
    Con: A lot of wasted CPU cycles
  • 55. How to configure ProxySQL?
    ● Use consul-template
      ● It registers as a listener on the Consul values
      ● Every time a value changes, it re-generates a file from a template
    ● Example:

      {{ if keyExists "mysql/master/testcluster/hostname" }}
      DELETE FROM mysql_servers where hostgroup_id = 0;
      REPLACE into mysql_servers (hostgroup_id, hostname) values ( 0, "{{ key "mysql/master/testcluster/hostname" }}" );
      {{ end }}
      {{ range tree "mysql/slave/testcluster" }}
      REPLACE into mysql_servers (hostgroup_id, hostname) values ( 1, "{{ .Key }}{{ .Value }}" );
      {{ end }}
      LOAD MYSQL SERVERS TO RUNTIME;
      SAVE MYSQL SERVERS TO DISK;
  • 56. Architecture
  • 57. Questions?
  • 58. Contact
    Matthias Crauwels
    crauwels@pythian.com
    +1 (613) 565-8696 ext. 1215
    Twitter @mcrauwel
    We're hiring!! https://pythian.com/careers