Barcelona MySQL Meetup
MySQL Failover and Orchestrator
Simon Mudd simon.mudd@booking.com
5th July 2017
Content
• Handling failover with MySQL
  • Downtime & Requirements
  • MySQL Clustering solutions
  • Non-clustering solutions and considerations
• Orchestrator
• Questions
05/07/2017 Barcelona MySQL Meetup
Is Downtime Acceptable?
• Do you have a system that needs to run 24 x 7?
  • Not everyone does
  • If you have a website then generally downtime is not acceptable
Requirements
Goal: Run 24 x 7 x 365 with no downtime
• Is this really necessary?
  • If you ask management they’ll always say yes…
• What is the cost?
  • Shorter downtime requirements mean more effort spent to achieve them
• How do you reliably detect failure? This is a hard problem to solve
If you accept downtime, how much can you really tolerate?
• 1s, 5s, 30s, 1min?
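To put numbers on that question, a yearly downtime budget can be derived from an availability target. The following is a minimal Python sketch; the function names and the example target are illustrative assumptions, not figures from the talk.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds in a non-leap year


def downtime_budget_seconds(availability_percent: float) -> float:
    """Seconds of downtime per year permitted by an availability target."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


def failovers_within_budget(availability_percent: float, failover_seconds: float) -> int:
    """How many failovers of the given length fit into the yearly budget."""
    return int(downtime_budget_seconds(availability_percent) // failover_seconds)


if __name__ == "__main__":
    # Example target of 99.99% ("four nines"), chosen only for illustration.
    for outage in (1, 5, 30, 60):
        count = failovers_within_budget(99.99, outage)
        print(f"{outage:>2}s outages allowed per year at 99.99%: {count}")
```

At 99.99% the budget is roughly 3,154 seconds a year, so a 30-second failover can happen about 105 times, while detection time and human reaction time eat into the same budget.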
What options are available?
• MySQL Cluster
  • Carrier grade
  • Very high uptime
  • Not InnoDB – specialised workloads
• Galera
  • Often with asynchronous replication between datacentres
• InnoDB Cluster
  • Very new
• All require clients to take action on failure of a node
  • If you use a proxy, that can fail too…
What options are available?
“Cluster solutions”
• Do not work well cross-DC due to latency
• If you accept writes into multiple masters there’s a chance of conflict
  • Slows things down
  • InnoDB Cluster now does not recommend this behaviour – requires care
• Only small setups work in a single data-centre so adaptation here is also needed
• Cluster setups do not scale easily to 10 or more servers
What options are available?
• Standard MySQL, MariaDB, Amazon RDS, Google Cloud SQL, …
  • Read scale-out
  • Asynchronous replication
  • Semi-sync helps ensure data is “somewhere else” when acknowledging a transaction
• If you are out of the cloud then: different setups
  • SBR or RBR?
  • No GTID, Oracle GTID or MariaDB GTID?
  • Optional semi-sync?
• If you are out of the cloud then: do it yourself
  • MHA
  • MariaDB Replication Manager
  • Orchestrator
Orchestrator
• Handles master failover, but more…
• GUI to manage and visualise topology – very handy
• CLI to do the same things – good for scripting
• API calls to run at a distance (more generic interface)
• Needs a DB backend to store state
  • Normally MySQL but can be SQLite
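As an illustration of the API bullet, here is a small Python sketch of calling the HTTP API from a script. The base URL is a placeholder, and the endpoint paths mentioned in the usage note (`/api/clusters`, `/api/discover/<host>/<port>`) are taken from memory of the orchestrator documentation, so check them against the version you run.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: where the orchestrator web service listens (hypothetical host).
BASE_URL = "http://orchestrator.example.com:3000"


def api_url(*parts: object, base_url: str = BASE_URL) -> str:
    """Build an API URL such as <base>/api/discover/<host>/<port>."""
    path = "/".join(quote(str(part), safe="") for part in parts)
    return f"{base_url}/api/{path}"


def call_api(*parts: object, base_url: str = BASE_URL, timeout: float = 5.0):
    """Issue a GET request against the API and decode the JSON response."""
    with urlopen(api_url(*parts, base_url=base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a reachable service, `call_api("clusters")` would list the known clusters and `call_api("discover", "db1.example.com", 3306)` would ask orchestrator to start probing that server.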
Orchestrator
• Written by Shlomi Noach, who works at GitHub
• He previously worked at Booking.com, where he introduced us to orchestrator, and before that at Outbrain
Orchestrator
What failures does it handle?
• Master failures – needs to talk to external systems
• Intermediate master failures – can handle on its own
• Does not care about slaves or applications
• Works with GTID: Oracle or MariaDB
• Works without using GTID: can add Pseudo-GTID (events injected on the master are used to find a match) so no need to migrate to GTID if not wanted
• Handles multi-level topologies
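The Pseudo-GTID idea (a marker injected on the master is used to find a matching position on another server) can be sketched conceptually in Python. This is a simplified model, not orchestrator's implementation: binary logs are represented as plain lists of strings, and the `pseudo-gtid:` marker format is a hypothetical stand-in.

```python
from typing import List, Optional

MARKER_PREFIX = "pseudo-gtid:"  # hypothetical marker format for this sketch


def find_matching_index(replica_log: List[str], target_log: List[str]) -> Optional[int]:
    """Return the index in target_log from which the replica should resume.

    Returns None when no common marker exists or the logs diverge after it.
    """
    # 1. Find the last marker the replica has applied.
    marker_idx = None
    for i in range(len(replica_log) - 1, -1, -1):
        if replica_log[i].startswith(MARKER_PREFIX):
            marker_idx = i
            break
    if marker_idx is None:
        return None

    # 2. Locate the same (unique) marker in the target server's log.
    try:
        target_idx = target_log.index(replica_log[marker_idx])
    except ValueError:
        return None

    # 3. The events the replica applied after the marker must also follow
    #    the marker on the target; the resume point is just past them.
    tail = replica_log[marker_idx + 1:]
    candidate = target_log[target_idx + 1:target_idx + 1 + len(tail)]
    if candidate != tail:
        return None
    return target_idx + 1 + len(tail)


if __name__ == "__main__":
    new_master = ["pseudo-gtid:001", "insert a", "insert b",
                  "pseudo-gtid:002", "insert c", "insert d"]
    replica = ["pseudo-gtid:001", "insert a", "insert b",
               "pseudo-gtid:002", "insert c"]
    print(find_matching_index(replica, new_master))  # prints 5, i.e. "insert d"
```

Because the markers are unique and appear in every server's log, the two logs can be aligned without real GTIDs, even though file names and byte offsets differ between servers.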
Orchestrator GUI
[Screenshots: orchestrator GUI]
Orchestrator CLI
Over 100 commands you can use, e.g.:
• relocate
• discover
• begin-downtime, end-downtime
• topology
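Since the CLI is pitched as good for scripting, here is a hedged Python sketch of driving it from a script. It assumes the `orchestrator` binary is on the PATH and uses the `-c` (command), `-i` (instance) and `-d` (destination) flags as I recall them from the orchestrator documentation; the host names are hypothetical.

```python
import subprocess
from typing import List, Optional, Sequence


def orchestrator_argv(command: str, instance: str,
                      destination: Optional[str] = None,
                      extra: Sequence[str] = ()) -> List[str]:
    """Build the argument list for one orchestrator CLI invocation."""
    argv = ["orchestrator", "-c", command, "-i", instance]
    if destination is not None:
        argv += ["-d", destination]
    argv += list(extra)
    return argv


def run_orchestrator(command: str, instance: str,
                     destination: Optional[str] = None,
                     extra: Sequence[str] = ()) -> str:
    """Run the command and return its standard output (raises on failure)."""
    result = subprocess.run(
        orchestrator_argv(command, instance, destination, extra),
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

For example, `run_orchestrator("topology", "db1.example.com:3306")` would print the replication tree that server belongs to, and `run_orchestrator("relocate", "db3.example.com:3306", "db2.example.com:3306")` would move `db3` below `db2`.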
Failure Notifications
• Using the hooks, orchestrator can talk to Jabber or send email to advise of the actions taken.
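A hook is just an external program, so a notification script can be quite small. The Python sketch below builds an email from the `ORC_*` environment variables that orchestrator exposes to hook processes; the variable names are as I recall them from the orchestrator documentation, and the addresses and SMTP host are hypothetical.

```python
import smtplib
from email.message import EmailMessage
from typing import Mapping


def build_notification(env: Mapping[str, str],
                       sender: str = "orchestrator@example.com",
                       recipient: str = "dba@example.com") -> EmailMessage:
    """Turn the hook's environment into a short notification email."""
    failure_type = env.get("ORC_FAILURE_TYPE", "unknown failure")
    cluster = env.get("ORC_FAILURE_CLUSTER", "unknown cluster")
    failed = f'{env.get("ORC_FAILED_HOST", "?")}:{env.get("ORC_FAILED_PORT", "?")}'
    successor_host = env.get("ORC_SUCCESSOR_HOST")

    lines = [f"Failure type: {failure_type}", f"Failed instance: {failed}"]
    if successor_host:
        lines.append(f'Promoted: {successor_host}:{env.get("ORC_SUCCESSOR_PORT", "?")}')
    else:
        lines.append("No successor reported.")

    message = EmailMessage()
    message["Subject"] = f"[orchestrator] {failure_type} on {cluster}"
    message["From"] = sender
    message["To"] = recipient
    message.set_content("\n".join(lines))
    return message


def send_notification(message: EmailMessage, smtp_host: str = "localhost") -> None:
    """Deliver the message through a local mail relay (an assumption)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(message)
```

A hook script would call `send_notification(build_notification(os.environ))` after importing `os`; a Jabber notifier would swap out only the delivery function.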
Failure	Auditing
Orchestrator Setup
• Source at github.com/github/orchestrator
• Binaries written in Go
• Daemon runs web service and discovery, client on each MySQL server
• State stored in MySQL / SQLite
• Single JSON configuration file: /etc/orchestrator.conf.json
  • How to reach backend database (stores state)
  • How to recognise delay
  • Most defaults are good to get you going
  • Which systems you want to trigger recovery on
  • Hooks to handle recovery (talking to external systems)
• If you need help please ask
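To make the configuration bullets concrete, the Python sketch below generates a minimal configuration file covering them. The key names are ones I believe orchestrator accepts; every value (hosts, users, passwords, the lag query, the cluster name and the hook path) is a hypothetical placeholder.

```python
import json


def build_config() -> dict:
    """Return a minimal orchestrator configuration as a dictionary."""
    return {
        # Credentials orchestrator uses to probe the monitored MySQL servers.
        "MySQLTopologyUser": "orchestrator",
        "MySQLTopologyPassword": "change-me",
        # How to reach the backend database that stores orchestrator's state.
        "MySQLOrchestratorHost": "orchestrator-db.example.com",
        "MySQLOrchestratorPort": 3306,
        "MySQLOrchestratorDatabase": "orchestrator",
        "MySQLOrchestratorUser": "orchestrator_srv",
        "MySQLOrchestratorPassword": "change-me-too",
        # How to recognise delay: a query against a hypothetical heartbeat table.
        "ReplicationLagQuery": "SELECT lag_seconds FROM meta.heartbeat",
        # Which clusters to trigger automated recovery on.
        "RecoverMasterClusterFilters": ["payments"],
        "RecoverIntermediateMasterClusterFilters": ["*"],
        # Hooks run after a failover, for example to notify external systems.
        "PostFailoverProcesses": ["/usr/local/bin/notify-failover.py"],
    }


def render_config() -> str:
    """Serialise the configuration as indented JSON text."""
    return json.dumps(build_config(), indent=2)


if __name__ == "__main__":
    # Redirect this output to /etc/orchestrator.conf.json.
    print(render_config())
```

Everything not listed falls back to the defaults, which matches the point that most defaults are good enough to get going.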
Orchestrator Characteristics
• Discover one server in your cluster and orchestrator will find the others
• Detects new servers in the cluster automatically
• Notifies you of problems seen
• Recovery is optional (per cluster)
• Optional selection of candidate masters or servers to blacklist
• Global ON / OFF switch – handy if several failures happen at once
• For paranoid DBAs: so far orchestrator has always done the right thing
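The first bullet, finding a whole cluster from a single server, amounts to crawling the replication graph. Below is a conceptual Python sketch of that crawl, not orchestrator's code; the `neighbours` callback is a hypothetical stand-in for asking a server who its master and replicas are (for example via `SHOW SLAVE STATUS` and `SHOW SLAVE HOSTS`).

```python
from collections import deque
from typing import Callable, Iterable, Set


def discover_topology(seed: str, neighbours: Callable[[str], Iterable[str]]) -> Set[str]:
    """Breadth-first crawl of a replication topology starting from one server."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        instance = queue.popleft()
        for other in neighbours(instance):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


if __name__ == "__main__":
    # A toy topology: each server lists its master and its direct replicas.
    topology = {
        "master": ["replica1", "replica2"],
        "replica1": ["master", "replica3"],
        "replica2": ["master"],
        "replica3": ["replica1"],
    }
    found = discover_topology("replica3", lambda name: topology.get(name, []))
    print(sorted(found))
```

Starting from the deepest replica still reaches every server, which is why seeding one instance per cluster is enough; re-running the crawl periodically is one way new servers would be picked up automatically.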
Orchestrator HA?
Orchestrator can be run in HA mode
• Multiple daemons will co-operate so if one fails another one takes over (they share the database backend)
• Use a load balancer to provide an HA GUI service
• Use nginx (or similar) for authentication and TLS if needed
• Upgrades are easier
• Replicate the orchestrator MySQL backend to not lose data
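One way daemons sharing a database can agree on who is active is a lease row that the active node keeps renewing. The Python sketch below illustrates that general technique with SQLite; it is not orchestrator's actual implementation, and the table name, column names and lease length are assumptions made for the example.

```python
import sqlite3


def try_become_active(conn: sqlite3.Connection, node: str, now: float,
                      lease_seconds: float = 10.0) -> bool:
    """Claim or renew the single 'active node' lease; True if `node` holds it."""
    with conn:  # commit the statements below as one transaction
        conn.execute(
            "CREATE TABLE IF NOT EXISTS active_node ("
            " anchor INTEGER PRIMARY KEY CHECK (anchor = 1),"
            " node TEXT NOT NULL, last_seen REAL NOT NULL)"
        )
        # The first daemon to arrive creates the row; later inserts are ignored.
        conn.execute("INSERT OR IGNORE INTO active_node VALUES (1, ?, ?)", (node, now))
        # Renew our own lease, or take over one that has expired.
        conn.execute(
            "UPDATE active_node SET node = ?, last_seen = ?"
            " WHERE anchor = 1 AND (node = ? OR last_seen < ?)",
            (node, now, node, now - lease_seconds),
        )
        row = conn.execute("SELECT node FROM active_node WHERE anchor = 1").fetchone()
    return row[0] == node


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    print(try_become_active(db, "daemon-a", now=0))   # daemon-a claims the lease
    print(try_become_active(db, "daemon-b", now=5))   # lease still valid
    print(try_become_active(db, "daemon-b", now=20))  # daemon-a stopped renewing
```

Timestamps are passed in explicitly to keep the example deterministic; a real daemon would use the database's own clock so that all nodes compare against the same time source.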
Does it Scale?
Yes
• Booking.com has a large installation with a single cluster monitoring thousands of MySQL servers
• Recommended by YouTube for managing Vitess servers
• Quite a number of other users but they are not very visible
Future work
• Simplify configuration and setup so more people can use it
• Improve scalability
• Make it work on Amazon RDS
• Spread the word…
Further help needed?
• github.com/github/orchestrator
  • for Issues (Problems / Questions) and Pull Requests (patches)
• Google Group: Orchestrator MySQL
  • https://groups.google.com/forum/#!forum/orchestrator-mysql
• Feel free to contact me and I will try to help provide pointers
Oh, and Booking.com is hiring!
• Almost any role:
  • MySQL Engineer / DBA
  • System Administrator
  • System Engineer
  • Site Reliability Engineer
  • Developer
  • Designer
  • Technical Team Lead
  • Product Owner
  • Data Scientist
  • And many more…
• https://workingatbooking.com/
Questions?
