MySQL DevOps @ Outbrain
Tools bridging the gap between MySQL engineering, ops & DBAs
Shlomi Noach
Percona Live 2014
Copyright © 2014, Outbrain
About me
● Engineer, DBA
● Working with MySQL since 2000
● Formerly a consultant and instructor
● Author of common_schema, openark-kit, propagator
● Write at http://openark.org
● Work on the infrastructure team at Outbrain
About Outbrain
● The leading content discovery platform on the web
● Embedded in over 90,000 websites
● Serves over 150 million unique US visitors, 15 billion
pages and 100 billion recommendations per month
● You may not know us by name, but you have met us frequently.
● We aim to provide our users with reliable content.
[Figure: two panels labeled "Paid discovery" and "Internal discovery"]
About Outbrain
● Managing a total of over 2,000 servers (Hadoop, Cassandra, MySQL, web services, …)
● Processing about 1 petabyte of information
● Over 70 engineers
● Doing continuous deployments
● Fans and supporters of open source
● Have an "ownership" culture: "You build it, you run it!"
○ Must be supported by technology
What's DevOps?
● Or, DevDbaOps?
What's DevOps?
● Often described as developers doing ops work, or ops
doing engineering work
● I see this more as the integration between the groups
● Avoiding the scenario where parties have no control of
parts of their domain.
○ Tools
○ Techniques
○ Culture
What's DevOps?
● With good DevOps, you get:
○ Ownership
○ Visibility
○ Action-ability (word has just been invented and will be used as axiom)
● Allowing engineers to own and be responsible for their apps.
○ No need for ops to tell them something is wrong
○ No need to sit with ops to understand what is wrong
○ No need to ask ops to deploy changes
● All the while giving ops visibility into engineers' actions
Tribute: automation
● We use Chef for automation
● Some leftover data bags, which we are moving to attributes
● Everything is under version control
○ Allows ops/DBAs to easily add/remove packages
○ Different treatment for masters
○ Different my.cnf settings based on MySQL role
○ Different my.cnf settings based on hardware
○ Setting up backup servers
○ More...
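The idea of role-based and hardware-based my.cnf settings can be sketched outside of Chef. The following is a minimal Python illustration, not Outbrain's cookbook: the role names, the option values and the 70% buffer pool ratio are all assumptions made up for the example.

```python
# Hypothetical illustration: derive my.cnf settings from a server's role and hardware.
# Role names, option values and the 70% ratio are assumptions, not Outbrain's settings.

ROLE_SETTINGS = {
    "master": {"read_only": 0, "sync_binlog": 1},
    "slave": {"read_only": 1, "sync_binlog": 0},
}


def build_my_cnf(role: str, memory_mb: int) -> dict:
    """Merge role-based settings with hardware-based ones."""
    settings = dict(ROLE_SETTINGS[role])
    # Hardware-based: size the InnoDB buffer pool relative to available memory.
    settings["innodb_buffer_pool_size"] = f"{memory_mb * 70 // 100}M"
    return settings


def render_my_cnf(settings: dict) -> str:
    """Render the settings as a [mysqld] section, one option per line."""
    lines = ["[mysqld]"]
    lines += [f"{name} = {value}" for name, value in sorted(settings.items())]
    return "\n".join(lines)


print(render_my_cnf(build_my_cnf("slave", memory_mb=65536)))
```

Keeping the role table and the hardware rule in one version-controlled place is what lets ops and DBAs change either without touching individual hosts.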
Automation: One Ring to Rule Them All
● Outbrain's onering is an orchestration solution
● It provisions servers: from the operating system, through packages (via chef integration), to application deployment (via glu integration)
○ Allows for a one-click "I want a host with MyService tomcat service", or "I want a host with MySQL server"
● It then acts as an inventory service
○ "give me all MySQL servers in the LA data center"
○ "which disks do our OLAP servers use?"
● https://github.com/outbrain/onering
Onering & pmysql: on-demand semi-automated actions
● pmysql is a parallel MySQL client (originally developed by Domas Mituzas)
● Using onering's API, we can:
curl "https://my.onering.service/api/devices/list/name/where/chef.run_list/mysql/name/olap?format=txt" | pmysql -pmypass "stop slave"
curl "https://my.onering.service/api/devices/list/name/where/chef.run_list/mysql/name/olap?format=txt" \
  | pmysql -pmypass "select @@version" \
  | grep tokudb \
  | awk '{print $1}' \
  | pmysql -pmypass "stop slave"
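The middle of the second pipeline (`grep tokudb | awk '{print $1}'`) narrows the host list to servers whose version string mentions TokuDB. A minimal Python sketch of that filtering step follows. It assumes pmysql prints one line per host with the host name as the first whitespace-separated field, which is what the `awk` call implies; the sample host names and version strings are invented.

```python
def hosts_matching(pmysql_output: str, needle: str) -> list:
    """Mimic `grep needle | awk '{print $1}'`: first field of every matching line."""
    hosts = []
    for line in pmysql_output.splitlines():
        fields = line.split()
        if needle in line and fields:
            hosts.append(fields[0])
    return hosts


# Invented sample of what `pmysql "select @@version"` might print (assumed format).
sample_output = (
    "olap01\t5.5.30-tokudb-7.0.1-log\n"
    "olap02\t5.5.34-log\n"
    "olap03\t5.5.30-tokudb-7.0.1-log\n"
)

print(hosts_matching(sample_output, "tokudb"))
```

The resulting host list is what gets piped into the final `pmysql` call, so the action runs only on the matching servers.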
Visibility
● A classic developers-ops collision: slow queries
○ Ops notice increased I/O, slave lags
○ What do they know of the domain of the problem?
○ Developers see long response times
○ What visibility do they get?
Box Anemometer
What makes Anemometer such a good DevOps tool?
● It provides visibility to everyone
● The engineer doesn't need to know what slow logs are, where they are located or how to interpret them.
● It promotes ownership by offering drill-down per query, per host and per service
● The permalink: such a small thing can make all the difference
● Ops can hand over what they think is the "guilty query"
Anemometer @ Outbrain, behind the scenes
[Diagram: each MySQL host's slow log is shipped by logstash to the Anemometer host, where the per-host slow logs are processed by pt-query-digest and served through the web interface]
Multiple services, multiple MySQL hosts: who makes it slow?
[Diagram: three services and three MySQL hosts, each host with its own slow log]
● What is our analysis granularity?
● Are slow log entries caused by a query?
● Affected by a loaded MySQL host?
● By a loaded service?
Anemometer, collecting the slow logs
input {
  tcp {
    port => 23306
    type => "mysql-slow"
    mode => "server"
  }
}
filter {
  dns {
    reverse => [ "@source_host", "source_host_name" ]
    action => "replace"
  }
}
output {
  file {
    type => "mysql-slow"
    message_format => "%{@message}"
    path => "/path/to/slow_logs/logstash/%{@source_host}-mysql-slow.log"
  }
}
Anemometer, rotating the slow logs
/outbrain/slow_logs/logstash/*.log {
  daily
  nocompress
  size 1
  missingok
  ifempty
  copytruncate
  prerotate
    /bin/bash /var/www/html/anemometer/outbrain/pre_rotate.sh $1
  endscript
  nosharedscripts
  rotate 100
}
● logstash streams logs onto the anemometer machine
● We choose not to aggregate them into one; the target file
name indicates the source host name
Anemometer, processing the slow logs
#!/bin/bash
# The slow log file to process is passed as the first argument
rotated_slow_log_file=$1
rotated_slow_log_file_path=$(dirname "$rotated_slow_log_file")
rotated_slow_log_file_name=$(basename "$rotated_slow_log_file")
# File names look like <source host>-mysql-slow.log; strip the suffix to get the host
hostname=${rotated_slow_log_file_name%%-mysql-slow.log*}
# Note: ${clustername} is not set anywhere in this excerpt
/bin/grep -v "^$" "$rotated_slow_log_file" |
/usr/bin/pt-query-digest --user=... --password=... \
  --review u=,p=,h=localhost,D=...,t=global_query_review \
  --history u=,p=,h=localhost,D=...,t=global_query_review_history \
  --filter=" \$event->{Bytes} = length(\$event->{arg}) and
    \$event->{hostname}=\"${hostname}\" and
    \$event->{clustername}=\"${clustername}\"" \
  --no-report --group-by-extra=host
● Reading files per mysql-host, adding host & cluster
● Secondary grouping by client-host
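Since the logs are not aggregated, the source host is recovered purely from the file name. Here is a small Python sketch of what the `${rotated_slow_log_file_name%%-mysql-slow.log*}` expansion does; the sample path and host name are hypothetical.

```python
import posixpath

SUFFIX = "-mysql-slow.log"


def source_host(rotated_path: str) -> str:
    """Return the part of the file name before the first '-mysql-slow.log'.

    Mirrors the shell expansion ${name%%-mysql-slow.log*}: the longest matching
    suffix is removed, and a name without the suffix is returned unchanged.
    """
    file_name = posixpath.basename(rotated_path)
    return file_name.partition(SUFFIX)[0]


# Hypothetical rotated file, named after the host that sent the log.
print(source_host("/path/to/slow_logs/logstash/db-olap-01-mysql-slow.log.1"))
```

The host name obtained this way is what the `--filter` expression stamps onto every event, so Anemometer can later drill down per host.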
Anemomaster: visibility into master DML
● One of those "How did we ever live without it?" tools.
● Provides near real time (10 minute granularity) visibility
into queries issued on master.
● Got an unexpected burst of INSERTs? Anemomaster provides quick and accurate access to the specific "guilty" query.
● And ops send a permalink to the owner.
● "Anemomaster" is a nickname. This is Anemometer on top of binary log analysis instead of the slow log, analyzing the number of executions instead of total run time.
● It also writes all DMLs to graphite.
Anemomaster
● Pinpointing the execution count of a specific UPDATE query
● This query is owned by a known team.
Anemomaster @ Outbrain, behind the scenes
[Diagram: each MySQL master's binary log is replicated to a MySQL slave; pt-query-digest processes the slave's relay log and feeds the Anemomaster host, which serves the web interface]
Anemomaster, processing the binary logs
/usr/bin/mysql -umy_user -pmy_password -e 'flush relay logs;'
sleep 1
# Pick the second-newest relay log: after the flush, the newest one is still being written
binlog_file=$(ls -tr /path/to/mysql/mysqld-relay-bin.[0-9]* | tail -n 2 | head -n 1)
# Note: ${clustername} is not set anywhere in this excerpt
mysqlbinlog "$binlog_file" | /usr/bin/pt-query-digest \
  --type binlog --order-by Query_time:cnt --group-by fingerprint \
  --limit 100 \
  --review h=myhost,D=anemomaster,t=global_query_review \
  --history h=myhost,D=anemomaster,t=global_query_review_history \
  --filter=" \$event->{Bytes} = length(\$event->{arg}) and \$event->{hostname}=\"$(hostname)\" and \$event->{clustername}=\"${clustername}\" and \$event->{host}=\"n/a\" " \
  --no-report
● Actually processing the relay logs on slaves
● Assumes statement-based replication (SBR); row-based replication (RBR) support is work in progress
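The relay log selection (`ls -tr ... | tail -n 2 | head -n 1`) is worth spelling out: after the flush, the newest file is the one still being written, so the script takes the one just before it. A minimal Python sketch of that selection follows, working on (path, modification time) pairs so it can run without a MySQL data directory; the file names and timestamps are made up.

```python
def second_newest(files_with_mtime):
    """Pick the file that `ls -tr | tail -n 2 | head -n 1` would print.

    files_with_mtime is a list of (path, mtime) pairs. With a single file that
    file is returned; with no files the result is None.
    """
    ordered = sorted(files_with_mtime, key=lambda item: item[1])  # oldest first, like ls -tr
    last_two = ordered[-2:]                                       # tail -n 2
    return last_two[0][0] if last_two else None                   # head -n 1


# Made-up relay logs with increasing modification times.
relay_logs = [
    ("mysqld-relay-bin.000041", 1000.0),
    ("mysqld-relay-bin.000042", 2000.0),
    ("mysqld-relay-bin.000043", 3000.0),  # newest, still being written
]

print(second_newest(relay_logs))
```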
Anemomaster, writing to graphite
query=" select ... "
# Note: $unixtime is not set anywhere in this excerpt
mysql anemomaster --silent --silent --raw -e "$query" | while IFS=$'\t' read -r -a result_values
do
  fingerprint_cluster=${result_values[0]} ;
  fingerprint_count=${result_values[1]} ;
  fingerprint_query=${result_values[2]} ;
  fingerprint_query=$(echo $fingerprint_query | sed -r -e "s/^(-- .*)]//g")
  fingerprint_query=$(echo $fingerprint_query | tr '\n' ' ' | tr '\r' ' ' | tr '\t' ' ')
  # Keep only the leading part of the statement (verb and table name)
  fingerprint_query=${fingerprint_query%%(*}
  fingerprint_query=${fingerprint_query%%,*}
  fingerprint_query=${fingerprint_query%% set *}
  fingerprint_query=${fingerprint_query%% SET *}
  fingerprint_query=${fingerprint_query%% where *}
  fingerprint_query=${fingerprint_query%% WHERE *}
  fingerprint_query=${fingerprint_query%% join *}
  fingerprint_query=${fingerprint_query%% JOIN *}
  fingerprint_query=${fingerprint_query%% using *}
  fingerprint_query=${fingerprint_query%% USING *}
  fingerprint_query=${fingerprint_query%% select *}
  fingerprint_query=${fingerprint_query%% SELECT *}
  # Turn what is left into a valid graphite path component
  fingerprint_query=$(echo $fingerprint_query | tr -d '`')
  fingerprint_query=$(echo $fingerprint_query | tr -d "*")
  fingerprint_query=$(echo $fingerprint_query | tr " " "_")
  fingerprint_query=$(echo $fingerprint_query | tr "." "__")
  echo "data.mysql.${fingerprint_cluster}.mysql_dml.${fingerprint_query}.count ${fingerprint_count} $unixtime" | nc -w 1 graphite 3003
done
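The normalization above is easier to follow in a compact form. The sketch below is a simplified Python rendering of the same idea, not a line-by-line port: it skips the leading comment stripping done by `sed`, and the cluster name, query and timestamp in the usage line are invented. It relies on two known behaviors: `tr` maps characters one to one (so each "." becomes a single "_"), and graphite's plaintext protocol expects "path value timestamp" per line.

```python
# Everything from the first of these markers onward is cut off, in this order.
MARKERS = ["(", ",", " set ", " SET ", " where ", " WHERE ", " join ", " JOIN ",
           " using ", " USING ", " select ", " SELECT "]


def metric_name(fingerprint: str) -> str:
    """Reduce a query fingerprint to a short, graphite-safe name."""
    text = " ".join(fingerprint.split())          # collapse newlines, tabs, repeated spaces
    for marker in MARKERS:
        text = text.partition(marker)[0]          # like ${var%%marker*}
    text = text.replace("`", "").replace("*", "")
    text = " ".join(text.split())                 # trim, as the unquoted echo does
    return text.replace(" ", "_").replace(".", "_")


def graphite_line(cluster: str, fingerprint: str, count: int, unixtime: int) -> str:
    """Format one line of graphite's plaintext protocol: path, value, timestamp."""
    return f"data.mysql.{cluster}.mysql_dml.{metric_name(fingerprint)}.count {count} {unixtime}"


# Invented example values.
print(graphite_line("olap", "update `mydb`.`users` set last_seen=? where id=?", 42, 1400000000))
```

Cutting the statement down to its verb and table keeps the number of distinct metric names small, at the cost of merging different statements against the same table.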
audit_login: a login auditing plugin
● Auditing every single login to our databases
○ Keeping track of connects per minute, finding problems
○ Detecting unused accounts
○ Detecting failed connects, taking action
○ Detecting naughty scripts executed by developers
(haha, got your IP!)
○ And, well, auditing for the record
audit_login, output
{"ts":"2013-09-11 09:11:47","type":"successful_login","myhost":"gromit03","thread":"74153868","user":"web_user","priv_user":"web_user","host":"web-87.localdomain","ip":"10.0.0.87"}
{"ts":"2013-09-11 09:11:55","type":"failed_login","myhost":"gromit03","thread":"74153869","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
{"ts":"2013-09-11 09:11:57","type":"failed_login","myhost":"gromit03","thread":"74153870","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
{"ts":"2013-09-11 09:12:48","type":"successful_login","myhost":"gromit03","thread":"74153871","user":"root","priv_user":"root","host":"localhost","ip":"10.0.0.111"}
{"ts":"2013-09-11 09:13:26","type":"successful_login","myhost":"gromit03","thread":"74153872","user":"web_user","priv_user":"web_user","host":"web-11.localdomain","ip":"10.0.0.11"}
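Because each entry is one JSON object per line, the "detecting failed connects" use case needs very little code. Here is a minimal Python sketch that counts failed logins per user and IP; it is an illustration of what the log format enables, not part of the plugin, and it uses three of the entries shown above as input.

```python
import json
from collections import Counter


def failed_logins(lines):
    """Count failed_login events per (user, ip) in audit_login's JSON-lines output."""
    counts = Counter()
    for line in lines:
        event = json.loads(line)
        if event["type"] == "failed_login":
            counts[(event["user"], event["ip"])] += 1
    return counts


# Three of the log entries shown above.
sample = [
    '{"ts":"2013-09-11 09:11:47","type":"successful_login","myhost":"gromit03","thread":"74153868","user":"web_user","priv_user":"web_user","host":"web-87.localdomain","ip":"10.0.0.87"}',
    '{"ts":"2013-09-11 09:11:55","type":"failed_login","myhost":"gromit03","thread":"74153869","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}',
    '{"ts":"2013-09-11 09:11:57","type":"failed_login","myhost":"gromit03","thread":"74153870","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}',
]

for (user, ip), count in failed_logins(sample).items():
    print(user, ip, count)
```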
audit_login @ Outbrain, behind the scenes
[Diagram: each MySQL master writes an audit log; logstash reads, transforms and writes the entries to Kibana (searchable via Lucene) and to an audit meta log ("grep-able like mama used to make")]
audit_login, logstash
input {
  file {
    type => "mysql_audit_login"
    format => "json"
    sincedb_path => "/var/cache/logstash/.since_audit_login_log"
    sincedb_write_interval => 1
    path => [ "/path/to/audit_login.log" ]
  }
}
filter {
  grep {
    type => "mysql_audit_login"
    match => [ "user", "monitoring_user" ]
    negate => true
  }
  grep {
    type => "mysql_audit_login"
    match => [ "user", "heartbeat_user" ]
    negate => true
  }
}
audit_login, logstash
output {
  rabbitmq {
    host => "my.rmq.host"
    user => "logstash_user"
    password => "logstash_password"
    exchange => "logstash.out"
    exchange_type => "fanout"
    type => "mysql_audit_login"
  }
}
output {
  tcp {
    type => "mysql_audit_login"
    mode => "client"
    host => "my.logstash.aggregator"
    port => "23307"
    message_format => "%{timestamp},%{type},%{myhost},%{thread},%{user},%{priv_user},%{host},%{ip}"
  }
}
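The logstash filter and the tcp output's `message_format` together amount to "drop the noisy accounts, then flatten each event to one comma-separated line". A minimal Python sketch of those two steps follows, for readers less familiar with logstash. It is not logstash code; the sample event's `timestamp` key is an assumption taken from the `message_format` string (the plugin output shown earlier names that field `ts`).

```python
import re

# Accounts whose logins are dropped, as in the two grep filters with negate => true.
EXCLUDED_USER_PATTERNS = ["monitoring_user", "heartbeat_user"]

# Field order taken from the tcp output's message_format.
FIELDS = ["timestamp", "type", "myhost", "thread", "user", "priv_user", "host", "ip"]


def keep(event: dict) -> bool:
    """Keep the event unless its user matches one of the excluded patterns."""
    user = event.get("user", "")
    return not any(re.search(pattern, user) for pattern in EXCLUDED_USER_PATTERNS)


def format_message(event: dict) -> str:
    """Flatten an event into the comma-separated "grep-able" line."""
    return ",".join(str(event.get(field, "")) for field in FIELDS)


# Sample event; the "timestamp" key is assumed (see the note above).
event = {
    "timestamp": "2013-09-11 09:11:47", "type": "successful_login", "myhost": "gromit03",
    "thread": "74153868", "user": "web_user", "priv_user": "web_user",
    "host": "web-87.localdomain", "ip": "10.0.0.87",
}

if keep(event):
    print(format_message(event))
```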
audit_login Kibana @ Outbrain
user:webapp AND myhost:east1 AND type:failed_login
Actionability
● Can developers actually have controlled/automated actions on the database?
● Actions that everyone, including DBAs/ops, has visibility into?
● Solving the above gives developers greater ownership over their domain, even within the database server.
Schema & data deployments
● Who controls the database schema design?
○ Ops? Is schema design within their domain?
○ DBA? An expert on schema design, but is the DBA an expert on the business domain?
○ Developers? Do they understand indexing?
● With many dozens of engineers, we can't have the DBA be
the single mutex for any schema change.
● But the DBA must know what's going on.
MySQL & Hive servers @ Outbrain
[Diagram: the server landscape, with boxes labeled MySQL, Slave, DWH, Meta, Hive, Hive Meta, MySQL build server, MySQL unit tests and MySQL dev/sim]
Database servers @ Outbrain
● Multiple servers
● Multiple roles (OLTP, OLAP, Meta, Hive, others)
● Multiple environments (dev, QA, Build, Production)
● Multiple types (MySQL, Hive)
● Multiple engineers who want to deploy to them all. How
and where does a developer issue a CREATE TABLE?
Where & how to CREATE TABLE?
● Not on all servers, since the table is irrelevant to some
(e.g. relevant to OLTP, not to DWH). Who keeps track?
● Should the developer work through them one by one? Maybe with a shell script?
○ Does the developer know all the credentials on all the
servers?
● What if some deployment goes wrong? (Table already
there; server cannot be accessed)
○ Who keeps track and retries/fixes?
○ Do you know who did what, when & where?
Are all databases equal?
● We use different schema names on our test servers than
we do on production
○ Who keeps record?
● We have services which use multiple schemas, all with the exact same structure. Changes must apply to all schemas.
○ We've just multiplied the number of deployments for
our CREATE TABLE statement.
● Different ports, different credentials, different
FEDERATED/CONNECT targets...
Existing schema deployment tools
● There are some excellent open source solutions. Notable are Liquibase & flywaydb
● However, we found them unsuitable for our needs
○ Both are linear
○ Multitenancy is not easy to achieve
○ Mathematically sound, but reality isn't mathematically sound.
○ They require a lot of management to achieve visibility and ownership
● Some Windows-Desktop apps around. Ahem.
Propagator
● Eventually we developed our own, "multi-everything"
solution
● Propagator provides ownership, action-ability and
visibility
● Developers specify what they want to execute, and for
which database role
● Propagator infers the hosts, the schema/query
transformations and awaits your approval.
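To make "infers the hosts and the schema transformations" concrete, here is a minimal Python sketch of expanding one statement, submitted for a database role, into per-host, per-schema deployments. It is a hypothetical illustration of the idea only: it is not Propagator's code or data model, and the inventory, schema mapping and names are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    host: str
    port: int
    role: str
    environment: str


# Invented inventory: which instances exist and which role each one plays.
INSTANCES = [
    Instance("prod-oltp-1", 3306, "oltp", "production"),
    Instance("prod-dwh-1", 3306, "dwh", "production"),
    Instance("qa-oltp-1", 3307, "oltp", "qa"),
]

# Invented mapping from a logical schema name to the actual names per environment.
SCHEMA_NAMES = {
    ("production", "app"): ["app"],
    ("qa", "app"): ["app_test_1", "app_test_2"],
}


def plan_deployments(role: str, logical_schema: str, statement: str) -> list:
    """Expand one statement into (host, port, schema, statement) deployments."""
    plan = []
    for instance in INSTANCES:
        if instance.role != role:
            continue  # the table is irrelevant to this instance
        for schema in SCHEMA_NAMES.get((instance.environment, logical_schema), []):
            plan.append((instance.host, instance.port, schema, statement))
    return plan


for deployment in plan_deployments("oltp", "app", "CREATE TABLE t (id INT)"):
    print(deployment)
```

The developer supplies only the role, the logical schema and the statement; the plan is what would then be shown for approval.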
Propagator: submit a script for deployment
Propagator: action-ability, visibility & ownership
● Deployments are fully audited. Any failure is accounted
for.
● Propagator tells you who did what, when and on which
host. Also encourages "why".
● Engineers do most of the work with no intervention by
DBA or ops
● DBA has control over deployments. Can retry, restart, selectively skip or issue partial queries...
Propagator: history visibility
Propagator: visibility & ownership
● The DBA may review deployment history
● Has immediate feedback on anything that went wrong
● Can, most of the time, figure out by herself why it went wrong and rerun the deployment
● Otherwise knows who to contact
● Commenting and tagging enhance visibility
● Typical scenario: developer is new, unsure what went
wrong (this can be considered as a bug, actually)
● Next typical scenario: developer is experienced.
Everything works.
Propagator: still much TODO
● Propagator has been in production at Outbrain for a few
months now, and it gets the job done.
● But still TODO:
○ More feedback automation
○ Email alerts
○ Two-phase approval
○ Online schema changes integration
○ SVN integration
○ Maven integration
○ Cassandra
Data retention
● If disk space runs out, who gets the alert?
○ Ops? Sure, they can add some disk space (volume group free space; spare disks on the shelf). But only up to a point.
● Time for data retention. Ideally, we would store data forever. Reality is not ideal.
○ Who is the owner of retention? If I want to drop a partition, whose approval do I need?
○ Can this be more visible?
● Are you doing data retention via shell/Perl scripts? Are
these tested, audited, controlled?
Data retention automation: Gardien
● An Outbrain internal service automating data retention
○ Currently works on Hive/HDFS; MySQL in the works
● Every partitioned table is owned by a person or group
● Gardien knows the business demands:
○ Rolls new partitions, knows partition scope
○ Drops old partitions, has retention policy
● Has a web interface, controlled by the business/engineers
● Provides visibility to all, actionability to owners
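The "rolls new partitions, drops old partitions" behavior can be sketched as a small planning function. This is a hypothetical Python illustration, not Gardien's implementation: it assumes daily partitions identified by date, and the parameter names (`retention_days`, `roll_ahead_days`) are invented for the example.

```python
from datetime import date, timedelta


def plan_partitions(existing, today, retention_days, roll_ahead_days):
    """Return (to_drop, to_create) for a table with one partition per day.

    Partitions older than the retention window are dropped; partitions for
    today and the next roll_ahead_days days are created if missing.
    """
    oldest_kept = today - timedelta(days=retention_days)
    to_drop = sorted(day for day in existing if day < oldest_kept)
    wanted = {today + timedelta(days=n) for n in range(roll_ahead_days + 1)}
    to_create = sorted(wanted - set(existing))
    return to_drop, to_create


# Invented example: five daily partitions, keep two days back, roll two days ahead.
today = date(2014, 4, 3)
existing = [date(2014, 3, 30), date(2014, 3, 31), date(2014, 4, 1), date(2014, 4, 2), date(2014, 4, 3)]
to_drop, to_create = plan_partitions(existing, today, retention_days=2, roll_ahead_days=2)
print("drop:", to_drop)
print("create:", to_create)
```

Separating the plan from its execution is what makes the retention policy reviewable by the table's owner before anything is dropped.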
Gardien dashboard
● Create rules (partitions), edit, remove
● Visible and audited
It's night time
Beep
Being 'nice' to the database
Work in progress
● Are you happy now that you've made your engineers all-powerful?
● Can you sleep well at night?
● No, really. What haunts your dreams?
● Darn. It's a PagerDuty alert. Beep
● Apparently all the slaves are lagging.
● Some engineer issued too many INSERTs
Being 'nice' to the database
Work in progress
● How do you protect your database against malfunctioning/abusive services?
● How do you define/detect/respond to an event where
your master is flooded with DMLs, and slaves just can't
keep up?
What's your slaves' serving capacity? Per DC? Per service?
[Diagram: a MySQL master replicating to several slaves, some of them marked "Lagging"]
Visibility: serving capacity
● We measure current serving capacity and make this value
visible
● Not only to graphite/alerts. Also visible to any of our
services.
● Our services can be nice to the database by self-throttling
access or postponing tasks.
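One simple way to express "serving capacity" is the fraction of slaves that are not lagging, which a service can check before starting heavy work. The Python sketch below illustrates that idea only; the slides do not define the exact measure, so the lag threshold, the 50% minimum and the function names are assumptions.

```python
def serving_capacity(lags, max_lag_seconds=10.0):
    """Fraction of slaves whose replication lag is known and within the threshold.

    lags maps a slave name to its lag in seconds, or None when the lag is unknown.
    """
    if not lags:
        return 0.0
    healthy = sum(1 for lag in lags.values() if lag is not None and lag <= max_lag_seconds)
    return healthy / len(lags)


def should_postpone(capacity, minimum=0.5):
    """A 'nice' service postpones non-urgent work when capacity drops below the minimum."""
    return capacity < minimum


# Invented lag readings for four slaves.
lags = {"slave1": 0.0, "slave2": 3.5, "slave3": 120.0, "slave4": None}
capacity = serving_capacity(lags)
print(capacity, should_postpone(capacity))
```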
Visibility: serving capacity, flow
[Diagram: an availability detector service reads the status of the slaves (some lagging) and writes a summary status to Zookeeper; an Outbrain service consults that status, then connects to the DB]
Forcing services to be 'nice'
Work in progress
● A connection pool proxy
● Proxy consults availability status
● Throttles connections based on availability
[Diagram: an Outbrain service attempts to get a connection through the proxy; the proxy consults the status in Zookeeper and approves or throttles the connection to the MySQL cluster]
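The proxy's decision can be sketched as a gate that scales the number of allowed connections with the current availability. This is a hypothetical Python illustration of the throttling idea, not the proxy described on the slide (which was still work in progress); the class name and the way availability is supplied (a callable returning a value between 0.0 and 1.0) are assumptions.

```python
class ConnectionGate:
    """Approve or throttle connection attempts based on current availability."""

    def __init__(self, max_connections, read_availability):
        self.max_connections = max_connections
        self.read_availability = read_availability  # callable returning 0.0..1.0
        self.in_use = 0

    def try_acquire(self):
        """Return True if a connection may be handed out right now."""
        allowed = int(self.max_connections * self.read_availability())
        if self.in_use < allowed:
            self.in_use += 1
            return True
        return False

    def release(self):
        """Give a connection back to the pool."""
        if self.in_use > 0:
            self.in_use -= 1


# With half the capacity available, only half of the pool is handed out.
gate = ConnectionGate(max_connections=4, read_availability=lambda: 0.5)
print([gate.try_acquire() for _ in range(3)])
```

Putting the check in a proxy, rather than in each service, means throttling no longer depends on every service choosing to be nice.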
Contributions
● We love open source and heavily rely on open source
solutions.
● We try to contribute back in the form of patches and bug reports, and by subscribing to commercial support for open source projects.
● Some code we have open sourced:
○ Onering: https://github.com/outbrain/onering
○ Graphitus: https://github.com/ezbz/graphitus
○ Propagator: https://github.com/outbrain/propagator
○ audit_login: https://github.com/outbrain/audit_login
Thank you!
Questions?

More Related Content

PDF
openark-kit: MySQL utilities for everyday use
PDF
Programmatic queries: things you can code with sql
PPTX
Road to sbt 1.0 paved with server
KEY
Scaling Django
PDF
High Availability Django - Djangocon 2016
PDF
Massively Scaled High Performance Web Services with PHP
PPTX
Magento 2 Workflows
PPTX
SaltConf 2014: Safety with powertools
openark-kit: MySQL utilities for everyday use
Programmatic queries: things you can code with sql
Road to sbt 1.0 paved with server
Scaling Django
High Availability Django - Djangocon 2016
Massively Scaled High Performance Web Services with PHP
Magento 2 Workflows
SaltConf 2014: Safety with powertools

What's hot (20)

PDF
Automating Complex Setups with Puppet
PPTX
Shall we play a game?
PDF
Speed up your Symfony2 application and build awesome features with Redis
PPTX
Full stack development with node and NoSQL - All Things Open - October 2017
PDF
Play Framework: async I/O with Java and Scala
PDF
OSGi ecosystems compared on Apache Karaf - Christian Schneider
KEY
Dcjq node.js presentation
KEY
Deploying Plack Web Applications: OSCON 2011
PDF
Node.js, toy or power tool?
PDF
Beyond Puppet
PDF
Integrated Cache on Netscaler
PDF
Australian OpenStack User Group August 2012: Chef for OpenStack
PPTX
SaltConf 2015: Salt stack at web scale: Better, Stronger, Faster
PPTX
Vert.x vs akka
PDF
Sensu and Sensibility - Puppetconf 2014
KEY
London devops logging
ODP
HTTP, JSON, JavaScript, Map&Reduce built-in to MySQL
PPTX
Why Play Framework is fast
PDF
Deploying And Monitoring Rails
PDF
Node4J: Running Node.js in a JavaWorld
Automating Complex Setups with Puppet
Shall we play a game?
Speed up your Symfony2 application and build awesome features with Redis
Full stack development with node and NoSQL - All Things Open - October 2017
Play Framework: async I/O with Java and Scala
OSGi ecosystems compared on Apache Karaf - Christian Schneider
Dcjq node.js presentation
Deploying Plack Web Applications: OSCON 2011
Node.js, toy or power tool?
Beyond Puppet
Integrated Cache on Netscaler
Australian OpenStack User Group August 2012: Chef for OpenStack
SaltConf 2015: Salt stack at web scale: Better, Stronger, Faster
Vert.x vs akka
Sensu and Sensibility - Puppetconf 2014
London devops logging
HTTP, JSON, JavaScript, Map&Reduce built-in to MySQL
Why Play Framework is fast
Deploying And Monitoring Rails
Node4J: Running Node.js in a JavaWorld
Ad

Viewers also liked (20)

PDF
Managing and Visualizing your Replication Topologies with Orchestrator
PDF
Pseudo GTID and Easy MySQL Replication Topology Management
PDF
Pseudo gtid & easy replication topology management
PDF
common_schema 2.2: DBA's framework for MySQL (April 2014)
ODP
Implementing Private Clouds
PDF
Continuous Availability for Private Database Clouds
PDF
RDS for MySQL, No BS Operations and Patterns
PDF
Deploying WSO2 Middleware on Containers
PDF
Pluk2011 deploy-mysql-like-a-devops-sysadmin
PDF
Wso2 con eu 2016 an introduction to the wso2 integration platform by chanak...
PDF
Deploying WSO2 Middleware on Kubernetes
PDF
OSS4B: Installing & Managing MySQL like a real devops
PDF
Wso2 esb-maintenance-guide
PDF
Wso2 integration platform deep dive eu con 2016
PDF
WSO2Con USA 2017: Implement an Effective Digital Platform Using WSO2 Integration
PDF
Discover Data That Matters- Deep dive into WSO2 Analytics
PDF
WSO2 API Manager 2.0 - Overview
PDF
PDF
WSO2Con USA 2017: WSO2 Partner Program – Engaging with WSO2
PDF
WSO2Con USA 2017: Integrating Systems for University of Exeter using Zero and...
Managing and Visualizing your Replication Topologies with Orchestrator
Pseudo GTID and Easy MySQL Replication Topology Management
Pseudo gtid & easy replication topology management
common_schema 2.2: DBA's framework for MySQL (April 2014)
Implementing Private Clouds
Continuous Availability for Private Database Clouds
RDS for MySQL, No BS Operations and Patterns
Deploying WSO2 Middleware on Containers
Pluk2011 deploy-mysql-like-a-devops-sysadmin
Wso2 con eu 2016 an introduction to the wso2 integration platform by chanak...
Deploying WSO2 Middleware on Kubernetes
OSS4B: Installing & Managing MySQL like a real devops
Wso2 esb-maintenance-guide
Wso2 integration platform deep dive eu con 2016
WSO2Con USA 2017: Implement an Effective Digital Platform Using WSO2 Integration
Discover Data That Matters- Deep dive into WSO2 Analytics
WSO2 API Manager 2.0 - Overview
WSO2Con USA 2017: WSO2 Partner Program – Engaging with WSO2
WSO2Con USA 2017: Integrating Systems for University of Exeter using Zero and...
Ad

Similar to MySQL DevOps at Outbrain (20)

PDF
Performance optimisations PHP meetup Rotterdam
PDF
Fosdem managing my sql with percona toolkit
PDF
MySQL Ecosystem in 2023 - FOSSASIA'23 - Alkin.pptx.pdf
PPTX
Database Engineering and Operations at Yahoo
PDF
MySQL Ecosystem in 2020
PDF
Using MySQL Enterprise Monitor for Continuous Performance Improvement
PDF
Percona, software libre y bases de datos
PDF
MySQL Latest News
PDF
Pi Day 2022 - from IoT to MySQL HeatWave Database Service
PDF
介绍 Percona 服务器 XtraDB 和 Xtrabackup
PDF
SDPHP - Percona Toolkit (It's Basically Magic)
PDF
State of the Dolphin - May 2022
PPTX
Mysql ecosystem in 2019
PDF
Loadays managing my sql with percona toolkit
PPTX
Mysql ecosystem in 2018
PDF
Making MySQL Administration a Breeze - A look into a MySQL DBA's toolchest
PDF
Webinar replay: MySQL Query Tuning Trilogy: Query tuning process and tools
PDF
Making MySQL Administration a Breeze - A Look Into a MySQL DBA's Toolchest
PDF
Percona Server 8.0
PDF
20190615 hkos-mysql-troubleshootingandperformancev2
Performance optimisations PHP meetup Rotterdam
Fosdem managing my sql with percona toolkit
MySQL Ecosystem in 2023 - FOSSASIA'23 - Alkin.pptx.pdf
Database Engineering and Operations at Yahoo
MySQL Ecosystem in 2020
Using MySQL Enterprise Monitor for Continuous Performance Improvement
Percona, software libre y bases de datos
MySQL Latest News
Pi Day 2022 - from IoT to MySQL HeatWave Database Service
介绍 Percona 服务器 XtraDB 和 Xtrabackup
SDPHP - Percona Toolkit (It's Basically Magic)
State of the Dolphin - May 2022
Mysql ecosystem in 2019
Loadays managing my sql with percona toolkit
Mysql ecosystem in 2018
Making MySQL Administration a Breeze - A look into a MySQL DBA's toolchest
Webinar replay: MySQL Query Tuning Trilogy: Query tuning process and tools
Making MySQL Administration a Breeze - A Look Into a MySQL DBA's Toolchest
Percona Server 8.0
20190615 hkos-mysql-troubleshootingandperformancev2

Recently uploaded (20)

PDF
Wondershare Filmora 15 Crack With Activation Key [2025
PPTX
history of c programming in notes for students .pptx
PPTX
Agentic AI : A Practical Guide. Undersating, Implementing and Scaling Autono...
PDF
Understanding Forklifts - TECH EHS Solution
PDF
top salesforce developer skills in 2025.pdf
PPTX
Lecture 3: Operating Systems Introduction to Computer Hardware Systems
PDF
Adobe Premiere Pro 2025 (v24.5.0.057) Crack free
PPTX
L1 - Introduction to python Backend.pptx
PDF
Design an Analysis of Algorithms I-SECS-1021-03
PDF
System and Network Administration Chapter 2
PPTX
Reimagine Home Health with the Power of Agentic AI​
PPTX
ai tools demonstartion for schools and inter college
PDF
Internet Downloader Manager (IDM) Crack 6.42 Build 41
PDF
System and Network Administraation Chapter 3
PDF
SAP S4 Hana Brochure 3 (PTS SYSTEMS AND SOLUTIONS)
PDF
Claude Code: Everyone is a 10x Developer - A Comprehensive AI-Powered CLI Tool
PDF
How to Migrate SBCGlobal Email to Yahoo Easily
PDF
wealthsignaloriginal-com-DS-text-... (1).pdf
PPTX
Operating system designcfffgfgggggggvggggggggg
PPTX
Agentic AI Use Case- Contract Lifecycle Management (CLM).pptx
Wondershare Filmora 15 Crack With Activation Key [2025
history of c programming in notes for students .pptx
Agentic AI : A Practical Guide. Undersating, Implementing and Scaling Autono...
Understanding Forklifts - TECH EHS Solution
top salesforce developer skills in 2025.pdf
Lecture 3: Operating Systems Introduction to Computer Hardware Systems
Adobe Premiere Pro 2025 (v24.5.0.057) Crack free
L1 - Introduction to python Backend.pptx
Design an Analysis of Algorithms I-SECS-1021-03
System and Network Administration Chapter 2
Reimagine Home Health with the Power of Agentic AI​
ai tools demonstartion for schools and inter college
Internet Downloader Manager (IDM) Crack 6.42 Build 41
System and Network Administraation Chapter 3
SAP S4 Hana Brochure 3 (PTS SYSTEMS AND SOLUTIONS)
Claude Code: Everyone is a 10x Developer - A Comprehensive AI-Powered CLI Tool
How to Migrate SBCGlobal Email to Yahoo Easily
wealthsignaloriginal-com-DS-text-... (1).pdf
Operating system designcfffgfgggggggvggggggggg
Agentic AI Use Case- Contract Lifecycle Management (CLM).pptx

MySQL DevOps at Outbrain

  • 1. Tools bridging the gap between MySQL engineering, ops & DBAs Shlomi Noach MySQLDevOps@Outbrain Shlomi Noach
  • 2. MySQL DevOps @ Outbrain Shlomi Noach Percona Live 2014 Copyright © 2014, Outbrain Aboutme ● Engineer, DBA ● Working with MySQL since 2000 ● Formerly consultant, instructor ● Author of common_schema, openark-kit, propagator ● Write at http://openark.org ● Work at the infrastructure team, Outbrain
  • 3. MySQL DevOps @ Outbrain Shlomi Noach Percona Live 2014 Copyright © 2014, Outbrain AboutOutbrain ● The leading content discovery platform on the web ● Embedded in over 90,000 websites ● Serves over 150 million unique US visitors, 15 billion pages and 100 billion recommendations per month ● You may not be familiar with us by name, but have met us frequently. ● We aim to provide with reliable content to our users.
  • 5. MySQL DevOps @ Outbrain Shlomi Noach Percona Live 2014 Copyright © 2014, Outbrain PAID DISCOVERY INTERNAL DISCOVERY
  • 6. About Outbrain
    ● Managing a total of over 2,000 servers (Hadoop, Cassandra, MySQL, web services, …)
    ● Processing about 1 petabyte of information
    ● Over 70 engineers
    ● Doing continuous deployments
    ● Fans and supporters of open source
    ● Have an "Ownership" culture: "You build it, you run it!"
      ○ Must be supported by technology
  • 7. What's DevOps?
    ● Or, DevDbaOps?
  • 8. What's DevOps?
    ● Often described as developers doing ops work, or ops doing engineering work
    ● I see this more as the integration between the groups
    ● Avoiding the scenario where parties have no control over parts of their domain.
      ○ Tools
      ○ Techniques
      ○ Culture
  • 9. What's DevOps?
    ● With good DevOps, you get:
      ○ Ownership
      ○ Visibility
      ○ Action-ability (word has just been invented and will be used as axiom)
    ● Allowing engineers to own and be responsible for their apps.
      ○ No need for ops to tell them something is wrong
      ○ No need to sit with ops to understand what is wrong
      ○ No need to ask ops to deploy changes
    ● All the while giving ops visibility into engineers' actions
  • 10. Tribute: automation
    ● We use chef for automation
    ● Some leftover data bags, being migrated to attributes
    ● Everything is under version control
      ○ Allows ops/DBAs to easily add/remove packages
      ○ Different treatment for masters
      ○ Different my.cnf settings based on MySQL role
      ○ Different my.cnf settings based on hardware
      ○ Setting up backup servers
      ○ More...
  • 11. Automation: One Ring to Rule Them All
    ● Outbrain's onering is an orchestration solution
    ● Provisioning servers: from operating system through packages (via chef integration) to application deployment (via glu integration)
      ○ Allows for a one-click "I want a host with MyService tomcat service", or "I want a host with MySQL server"
    ● Then acting as an inventory service
      ○ "give me all MySQL servers in the LA data center"
      ○ "which disks do our OLAP servers use?"
    ● https://github.com/outbrain/onering
  • 12. Onering & pmysql: on-demand semi-automated actions
    ● pmysql is a parallel MySQL client (originally developed by Domas Mituzas)
    ● Using onering's API, we can:
        curl "https://my.onering.service/api/devices/list/name/where/chef.run_list/mysql/name/olap?format=txt" \
          | pmysql -pmypass "stop slave"

        curl "https://my.onering.service/api/devices/list/name/where/chef.run_list/mysql/name/olap?format=txt" \
          | pmysql -pmypass "select @@version" | grep tokudb | awk '{print $1}' \
          | pmysql -pmypass "stop slave"
  • 13. Visibility
    ● A classic developers-ops collision: slow queries
      ○ Ops notice increased I/O and slave lag
      ○ What do they know of the domain of the problem?
      ○ Developers see long response times
      ○ What visibility do they get?
  • 14. Box Anemometer
  • 15. What makes Anemometer such a good DevOps tool?
    ● It provides visibility to everyone
    ● The engineer doesn't need to know what slow logs are, where they are located or how to interpret them.
    ● It promotes ownership by offering drill-down per query, per host and per service
    ● The permalink: such a small thing can make all the difference
    ● Ops can hand over what they think is the "guilty query"
  • 16. Anemometer @ Outbrain, behind the scenes
    [Diagram: the slow log of each MySQL host is shipped by logstash to the Anemometer host, where pt-query-digest processes the collected slow logs for the web interface]
  • 17. Multiple services, multiple MySQL hosts: who makes it slow?
    [Diagram: several services querying several MySQL hosts, each host with its own slow log]
    ● What is our analysis granularity?
    ● Are slow logs caused by a query?
    ● Affected by a loaded MySQL host?
    ● By a loaded service?
  • 18. Anemometer, collecting the slow logs
        input {
          tcp {
            port => 23306
            type => "mysql-slow"
            mode => "server"
          }
        }
        filter {
          dns {
            reverse => [ "@source_host", "source_host_name" ]
            action => "replace"
          }
        }
        output {
          file {
            type => "mysql-slow"
            message_format => "%{@message}"
            path => "/path/to/slow_logs/logstash/%{@source_host}-mysql-slow.log"
          }
        }
  • 19. Anemometer, rotating the slow logs
        /outbrain/slow_logs/logstash/*.log {
          daily
          nocompress
          size 1
          missingok
          ifempty
          copytruncate
          prerotate
            /bin/bash /var/www/html/anemometer/outbrain/pre_rotate.sh $1
          endscript
          nosharedscripts
          rotate 100
        }
    ● logstash streams logs onto the anemometer machine
    ● We choose not to aggregate them into one; the target file name indicates the source host name
  • 20. Anemometer, processing the slow logs
        #!/bin/bash
        rotated_slow_log_file=$1
        rotated_slow_log_file_path=$(dirname $rotated_slow_log_file)
        rotated_slow_log_file_name=$(basename $rotated_slow_log_file)
        # The file name is "<source host>-mysql-slow.log[.N]"; keep only the host part
        hostname=${rotated_slow_log_file_name%%-mysql-slow.log*}
        # Drop empty lines, then load the digest into the review/history tables
        /bin/grep -v "^$" $rotated_slow_log_file | /usr/bin/pt-query-digest \
          --user=... --password=... \
          --review u=,p=,h=localhost,D=...,t=global_query_review \
          --history u=,p=,h=localhost,D=...,t=global_query_review_history \
          --filter=" \$event->{Bytes} = length(\$event->{arg}) and \$event->{hostname}=\"${hostname}\" and \$event->{clustername}=\"${clustername}\"" \
          --no-report --group-by-extra=host
    ● Reading files per MySQL host, adding host & cluster
    ● Secondary grouping by client host
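The host name in the script above comes purely from the rotated file's name. The same rule can be sketched in Python. This is only an illustration: the function name and the sample path are assumptions, and unlike the shell expansion (which leaves a non-matching name unchanged) this version raises an error.

```python
from pathlib import PurePosixPath

# Suffix logstash appends after the source host name (taken from the slides).
SUFFIX = "-mysql-slow.log"


def source_host(rotated_path: str) -> str:
    """Return the MySQL host encoded in a rotated slow-log file name."""
    name = PurePosixPath(rotated_path).name
    # Cut at the first occurrence of the suffix, mirroring
    # ${name%%-mysql-slow.log*} in the shell script.
    head, sep, _rest = name.partition(SUFFIX)
    if not sep:
        raise ValueError(f"not a logstash slow log: {name}")
    return head
```

For example, `source_host("/path/to/slow_logs/logstash/db-17-mysql-slow.log.1")` yields the hypothetical host `db-17`, whatever rotation suffix logrotate has added.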
  • 21. Anemomaster: visibility into master DML
    ● One of those "How did we ever live without it?" tools.
    ● Provides near real-time (10 minute granularity) visibility into queries issued on the master.
    ● Got an unexpected burst of INSERTs? Anemomaster provides quick and accurate access to the specific "guilty" query.
    ● And ops take a permalink to the owner.
    ● "Anemomaster" is a nickname. This is Anemometer on top of binary log analysis instead of the slow log, analyzing the number of executions instead of total run time.
    ● Also writing all DMLs to graphite.
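The idea of ranking statements by number of executions rather than run time can be shown with a small Python sketch. The `fingerprint` function below is a deliberately crude stand-in for pt-query-digest's fingerprinting, which handles far more cases; the table names in the usage example are made up.

```python
import re
from collections import Counter


def fingerprint(sql: str) -> str:
    """Very rough query fingerprint: lowercase, literals replaced by '?'."""
    text = sql.strip().lower()
    text = re.sub(r"'(?:[^'\\]|\\.)*'", "?", text)   # quoted strings
    text = re.sub(r"\b\d+(?:\.\d+)?\b", "?", text)   # standalone numbers
    return re.sub(r"\s+", " ", text)                 # collapse whitespace


def count_executions(statements):
    """Count how many times each fingerprint was executed."""
    return Counter(fingerprint(s) for s in statements)
```

Feeding it `INSERT INTO clicks (id, url) VALUES (1, 'a')` and the same statement with other values counts both under one fingerprint, so `most_common()` surfaces the burst of INSERTs.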
  • 22. Anemomaster
    ● Pinpointing the execution count of a specific UPDATE query
    ● This query is owned by a known team.
  • 23. Anemomaster @ Outbrain, behind the scenes
    [Diagram: each MySQL master's binary log is replicated to a MySQL slave; pt-query-digest runs on the slave's relay log and feeds the Anemomaster host and its web interface]
  • 24. Anemomaster, processing the binary logs
        /usr/bin/mysql -umy_user -pmy_password -e 'flush relay logs\G'
        sleep 1
        # Pick the relay log that has just been closed (second newest)
        binlog_file=$(ls -tr /path/to/mysql/mysqld-relay-bin.[0-9]* | tail -n 2 | head -n 1)
        mysqlbinlog $binlog_file | /usr/bin/pt-query-digest \
          --type binlog --order-by Query_time:cnt --group-by fingerprint --limit 100 \
          --review h=myhost,D=anemomaster,t=global_query_review \
          --history h=myhost,D=anemomaster,t=global_query_review_history \
          --filter=" \$event->{Bytes} = length(\$event->{arg}) and \$event->{hostname}=\"$(hostname)\" and \$event->{clustername}=\"${clustername}\" and \$event->{host}=\"n/a\" " \
          --no-report
    ● Actually processing the relay logs on slaves
    ● Assumes statement-based replication (SBR); support for row-based replication (RBR) is a work in progress
  • 25. Anemomaster, writing to graphite
        query=" select ... "
        mysql anemomaster --silent --silent --raw -e "$query" | while IFS=$'\t' read -r -a result_values
        do
          fingerprint_cluster=${result_values[0]} ;
          fingerprint_count=${result_values[1]} ;
          fingerprint_query=${result_values[2]} ;
          fingerprint_query=$(echo $fingerprint_query | sed -r -e "s/^(-- .*)]//g")
          fingerprint_query=$(echo $fingerprint_query | tr '\n' ' ' | tr '\r' ' ' | tr '\t' ' ')
          # Truncate the fingerprint at the first delimiter or keyword
          fingerprint_query=${fingerprint_query%%(*}
          fingerprint_query=${fingerprint_query%%,*}
          fingerprint_query=${fingerprint_query%% set *}
          fingerprint_query=${fingerprint_query%% SET *}
          fingerprint_query=${fingerprint_query%% where *}
          fingerprint_query=${fingerprint_query%% WHERE *}
          fingerprint_query=${fingerprint_query%% join *}
          fingerprint_query=${fingerprint_query%% JOIN *}
          fingerprint_query=${fingerprint_query%% using *}
          fingerprint_query=${fingerprint_query%% USING *}
          fingerprint_query=${fingerprint_query%% select *}
          fingerprint_query=${fingerprint_query%% SELECT *}
          # Make what is left safe to use as a graphite metric name
          fingerprint_query=$(echo $fingerprint_query | tr -d "\`")
          fingerprint_query=$(echo $fingerprint_query | tr -d "*")
          fingerprint_query=$(echo $fingerprint_query | tr " " "_")
          fingerprint_query=$(echo $fingerprint_query | tr "." "__")
          echo "data.mysql.${fingerprint_cluster}.mysql_dml.${fingerprint_query}.count ${fingerprint_count} $unixtime" | nc -w 1 graphite 3003
        done
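The shell loop above boils each fingerprint down to a short metric name and emits one `path value timestamp` line per query. A minimal Python sketch of the same naming rule follows. It is an approximation rather than a port: keywords are matched case-insensitively and surrounding whitespace is trimmed, neither of which the shell version does, and the function names are illustrative.

```python
import re

# Delimiters and keywords at which the fingerprint is truncated.
_CUT = re.compile(r"\(|,| (?:set|where|join|using|select) ", re.IGNORECASE)


def metric_name(fingerprint: str) -> str:
    """Reduce a query fingerprint to a graphite-safe metric name."""
    text = re.sub(r"\s+", " ", fingerprint).strip()
    match = _CUT.search(text)
    if match:
        text = text[:match.start()]          # keep only the leading part
    text = text.replace("`", "").replace("*", "").strip()
    return text.replace(" ", "_").replace(".", "_")


def graphite_line(cluster: str, fingerprint: str, count: int, unixtime: int) -> str:
    """Build one 'path value timestamp' line using the path layout from the slide."""
    name = metric_name(fingerprint)
    return f"data.mysql.{cluster}.mysql_dml.{name}.count {count} {unixtime}"
```

With a hypothetical cluster `olap`, an UPDATE on `stats.clicks` becomes `data.mysql.olap.mysql_dml.update_stats_clicks.count`, so every execution of that query shape lands on one graph.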
  • 26. audit_login: a login auditing plugin
    ● Auditing every single login to our databases
      ○ Keeping track of connects per minute, finding problems
      ○ Detecting unused accounts
      ○ Detecting failed connects, taking action
      ○ Detecting naughty scripts executed by developers (haha, got your IP!)
      ○ And, well, auditing for the record
  • 27. audit_login, output
        {"ts":"2013-09-11 09:11:47","type":"successful_login","myhost":"gromit03","thread":"74153868","user":"web_user","priv_user":"web_user","host":"web-87.localdomain","ip":"10.0.0.87"}
        {"ts":"2013-09-11 09:11:55","type":"failed_login","myhost":"gromit03","thread":"74153869","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
        {"ts":"2013-09-11 09:11:57","type":"failed_login","myhost":"gromit03","thread":"74153870","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
        {"ts":"2013-09-11 09:12:48","type":"successful_login","myhost":"gromit03","thread":"74153871","user":"root","priv_user":"root","host":"localhost","ip":"10.0.0.111"}
        {"ts":"2013-09-11 09:13:26","type":"successful_login","myhost":"gromit03","thread":"74153872","user":"web_user","priv_user":"web_user","host":"web-11.localdomain","ip":"10.0.0.11"}
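Because every audit record is a single JSON object per line, questions such as "who keeps failing to connect, and from where?" need very little code. The sketch below assumes only the field names visible in the output above; the function name and the choice to key by user and IP are illustrative.

```python
import json
from collections import Counter

# Three of the records shown on the slide.
SAMPLE = """\
{"ts":"2013-09-11 09:11:47","type":"successful_login","myhost":"gromit03","thread":"74153868","user":"web_user","priv_user":"web_user","host":"web-87.localdomain","ip":"10.0.0.87"}
{"ts":"2013-09-11 09:11:55","type":"failed_login","myhost":"gromit03","thread":"74153869","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
{"ts":"2013-09-11 09:11:57","type":"failed_login","myhost":"gromit03","thread":"74153870","user":"backup_user","priv_user":"","host":"web-32","ip":"10.0.0.32"}
"""


def failed_logins(lines):
    """Count failed logins per (user, ip) from audit_login JSON lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event["type"] == "failed_login":
            counts[(event["user"], event["ip"])] += 1
    return counts
```

Running `failed_logins(SAMPLE.splitlines())` reports two failures for `backup_user` from `10.0.0.32`, the kind of repeat offender the deck jokes about.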
  • 28. audit_login @ Outbrain, behind the scenes
    [Diagram: each MySQL master writes an audit log; logstash reads, transforms and writes the records, which end up in Kibana (searchable via Lucene) and in an audit meta log (grep-able, like mama used to make)]
  • 29. audit_login, logstash
        input {
          file {
            type => "mysql_audit_login"
            format => "json"
            sincedb_path => "/var/cache/logstash/.since_audit_login_log"
            sincedb_write_interval => 1
            path => [ "/path/to/audit_login.log" ]
          }
        }
        filter {
          grep {
            type => "mysql_audit_login"
            match => [ "user", "monitoring_user" ]
            negate => true
          }
          grep {
            type => "mysql_audit_login"
            match => [ "user", "heartbeat_user" ]
            negate => true
          }
        }
  • 30. audit_login, logstash
        output {
          rabbitmq {
            host => "my.rmq.host"
            user => "logstash_user"
            password => "logstash_password"
            exchange => "logstash.out"
            exchange_type => "fanout"
            type => "mysql_audit_login"
          }
        }
        output {
          tcp {
            type => "mysql_audit_login"
            mode => "client"
            host => "my.logstash.aggregator"
            port => "23307"
            message_format => "%{timestamp},%{type},%{myhost},%{thread},%{user},%{priv_user},%{host},%{ip}"
          }
        }
  • 31. audit_login Kibana @ Outbrain
        user:webapp AND myhost:east1 AND type:failed_login
  • 32. Actionability
    ● Can developers actually have controlled/automated actions on the database?
    ● Such that everyone, including DBAs/ops, has visibility into them?
    ● Solving the above gives developers greater ownership over their domain, even within the database server.
  • 33. Schema & data deployments
    ● Who controls the database schema design?
      ○ Ops? Is schema design within their domain?
      ○ DBA? Expert on schema design, but is the DBA an expert on the business domain?
      ○ Developers? Do they understand indexing?
    ● With many dozens of engineers, we can't have the DBA be the single mutex for any schema change.
    ● But the DBA must know what's going on.
  • 34. MySQL & Hive servers @ Outbrain
    [Diagram: MySQL masters with slaves, DWH slaves and Meta slaves; Hive servers with Hive Meta; MySQL build servers, MySQL unit-test servers and MySQL dev/sim servers]
  • 35. Database servers @ Outbrain
    ● Multiple servers
    ● Multiple roles (OLTP, OLAP, Meta, Hive, others)
    ● Multiple environments (dev, QA, Build, Production)
    ● Multiple types (MySQL, Hive)
    ● Multiple engineers who want to deploy to them all.
    How and where does a developer issue a CREATE TABLE?
  • 36. Where & how to CREATE TABLE?
    ● Not on all servers, since the table is irrelevant to some (e.g. relevant to OLTP, not to DWH). Who keeps track?
    ● Shall the developer work through them one by one? Maybe with a shell script?
      ○ Does the developer know all the credentials on all the servers?
    ● What if some deployment goes wrong? (Table already there; server cannot be accessed)
      ○ Who keeps track and retries/fixes?
      ○ Do you know who did what, when & where?
  • 37. Are all databases equal?
    ● We use different schema names on our test servers than we do on production
      ○ Who keeps a record?
    ● We have services which use multiple schemas, all with the exact same structure. Changes must apply to all schemas.
      ○ We've just multiplied the number of deployments for our CREATE TABLE statement.
    ● Different ports, different credentials, different FEDERATED/CONNECT targets...
  • 38. Existing schema deployment tools
    ● Some excellent open source solutions. Notable are Liquibase & flywaydb
    ● However, we found them unsuitable for our needs
      ○ Both linear
      ○ Multitenancy not easy to achieve
      ○ Mathematically sound, but reality isn't mathematically sound.
      ○ Require a lot of management to achieve visibility and ownership
    ● Some Windows-Desktop apps around. Ahem.
  • 39. Propagator
    ● Eventually we developed our own, "multi-everything" solution
    ● Propagator provides ownership, action-ability and visibility
    ● Developers specify what they want to execute, and for which database role
    ● Propagator infers the hosts and the schema/query transformations, and awaits your approval.
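The "role in, per-host deployments out" idea can be pictured with a small Python sketch. This is not Propagator's data model or API: the inventory dictionaries, host names, schema names and the `{schema}` placeholder convention are all hypothetical, chosen only to show how one submitted statement fans out into host-specific ones.

```python
from typing import List, NamedTuple


class Target(NamedTuple):
    host: str
    schema: str
    query: str


# Hypothetical inventory: which hosts serve a role, and which hosts
# use a different name for a given logical schema.
HOSTS_BY_ROLE = {"oltp": ["prod-db-1", "test-db-1"], "dwh": ["dwh-1"]}
SCHEMA_ON_HOST = {"test-db-1": {"app": "app_test"}}


def plan(role: str, schema: str, query_template: str) -> List[Target]:
    """Expand one statement for a role into per-host statements."""
    targets = []
    for host in HOSTS_BY_ROLE.get(role, []):
        # Fall back to the logical schema name when the host has no override.
        actual = SCHEMA_ON_HOST.get(host, {}).get(schema, schema)
        targets.append(Target(host, actual, query_template.format(schema=actual)))
    return targets
```

Calling `plan("oltp", "app", "CREATE TABLE {schema}.t (id INT)")` produces one statement for `app` on the production host and one for `app_test` on the test host, and nothing for the DWH host. In a real tool this plan is what would wait for approval and be audited per host.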
  • 40. Propagator: submit a script for deployment
  • 41. Propagator: action-ability, visibility & ownership
    ● Deployments are fully audited. Any failure is accounted for.
    ● Propagator tells you who did what, when and on which host. Also encourages "why".
    ● Engineers do most of the work with no intervention by DBA or ops
    ● DBA has control over deployments. Can retry, restart, selectively skip or issue partial queries...
  • 42. Propagator: history visibility
  • 43. Propagator: visibility & ownership
    ● The DBA may review deployment history
    ● Has immediate feedback on anything that went wrong
    ● Can usually work out by herself what went wrong and rerun the deployment
    ● Otherwise knows who to contact
    ● Commenting and tagging enhance visibility
    ● Typical scenario: developer is new, unsure what went wrong (this can be considered a bug, actually)
    ● Next typical scenario: developer is experienced. Everything works.
  • 44. Propagator: still much TODO
    ● Propagator has been in production at Outbrain for a few months now, and it gets the job done.
    ● But still TODO:
      ○ More feedback automation
      ○ Email alerts
      ○ Two-phase approval
      ○ Online schema changes integration
      ○ SVN integration
      ○ Maven integration
      ○ Cassandra
  • 45. Data retention
    ● If disk space runs out, who gets the alert?
      ○ Ops? Sure, they can add some disk space (volume group free space; spare disks on shelf). But only up to a point.
    ● Time for data retention. Ideally, we would store data forever. Reality is not ideal.
      ○ Who is the owner of retention? If I want to drop a partition, who do I approve this with?
      ○ Can this be more visible?
    ● Are you doing data retention via shell/Perl scripts? Are these tested, audited, controlled?
  • 46. Data retention automation: Gardien
    ● An Outbrain internal service automating data retention
      ○ Currently works on Hive/HDFS; MySQL in the works
    ● Every partitioned table is owned by a person or group
    ● Gardien knows the business demands:
      ○ Rolls new partitions, knows partition scope
      ○ Drops old partitions, has retention policy
    ● Has a web interface, controlled by the business/engineers
    ● Provides visibility to all, actionability to owners
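The two duties listed above (roll new partitions, drop old ones under a retention policy) reduce to simple date arithmetic once a table's rule is known. The Python sketch below assumes daily partitions identified by their date and a retention period counted in days. Gardien's actual rule format is not described in the deck, so the function names and parameters are illustrative only.

```python
from datetime import date, timedelta


def partitions_to_drop(partition_days, retention_days, today):
    """Daily partitions strictly older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in partition_days if d < cutoff)


def partitions_to_add(partition_days, days_ahead, today):
    """Daily partitions missing between today and `days_ahead` days from now."""
    wanted = {today + timedelta(days=i) for i in range(days_ahead + 1)}
    return sorted(wanted - set(partition_days))
```

With partitions for 31 March through 3 April 2014, a 2-day retention and one day of look-ahead, the sketch proposes dropping 31 March and creating 4 April. A service like the one described would show that proposal to the table's owner rather than act silently.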
  • 47. Gardien dashboard
    ● Create rules (partitions), edit, remove
    ● Visible and audited
  • 48. It's night time
    Beep
  • 49. Being 'nice' to the database (work in progress)
    ● Are you happy now that you've made your engineers all-powerful?
    ● Can you sleep well at night?
    ● No, really. What haunts your dreams?
    ● Darn. It's a PagerDuty alert. Beep
    ● Apparently all the slaves are lagging.
    ● Someone (an engineer) issued too many INSERTs
  • 50. Being 'nice' to the database (work in progress)
    ● How do you protect your database against malfunctioning/abusive services?
    ● How do you define/detect/respond to an event where your master is flooded with DMLs, and slaves just can't keep up?
  • 51. What's your slaves' serving capacity? Per DC? Per service?
    [Diagram: a MySQL master replicating to several slaves, most of them lagging]
  • 52. Visibility: serving capacity
    ● We measure current serving capacity and make this value visible
    ● Not only to graphite/alerts. Also visible to any of our services.
    ● Our services can be nice to the database by self-throttling access or postponing tasks.
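One way to turn "serving capacity" into a number a service can act on is the fraction of slaves that are currently not lagging. The deck does not say how Outbrain computes the value, so the definition, the 10-second lag threshold and the 50% cut-off in this Python sketch are all assumptions.

```python
def serving_capacity(lag_seconds_by_slave, max_lag_seconds=10.0):
    """Fraction of slaves whose replication lag is within the limit.

    A lag of None stands for a slave whose replication is not running.
    """
    if not lag_seconds_by_slave:
        return 0.0
    healthy = sum(
        1
        for lag in lag_seconds_by_slave.values()
        if lag is not None and lag <= max_lag_seconds
    )
    return healthy / len(lag_seconds_by_slave)


def should_postpone(capacity, min_capacity=0.5):
    """A 'nice' service defers non-urgent work when capacity is low."""
    return capacity < min_capacity
```

A batch job could call `should_postpone(serving_capacity(lags))` before each chunk of writes and sleep when it returns true, which is the self-throttling behaviour the slide describes.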
  • 53. Visibility: serving capacity, flow
    [Diagram: an availability detector service reads the status of the slaves (some lagging) and writes a summary status to Zookeeper; an Outbrain service consults that status before connecting to the DB]
  • 54. Forcing services to be 'nice' (work in progress)
    ● A connection pool proxy
    ● Proxy consults availability status
    ● Throttles connections based on availability
    [Diagram: an Outbrain service attempts to get a connection from the proxy; the proxy consults the status in Zookeeper and approves or throttles the connection to the MySQL cluster]
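The approve-or-throttle decision can be sketched as a gate that scales the number of connections it hands out by the current availability. The proxy itself is described only as work in progress, so the class below is a guess at the general shape, not its design: the name, the linear scaling rule and the lack of thread safety are all simplifications.

```python
class ConnectionGate:
    """Approves connections only while usage is below a capacity-scaled limit."""

    def __init__(self, max_connections: int):
        self.max_connections = max_connections
        self.in_use = 0

    def try_acquire(self, capacity: float) -> bool:
        """Return True if a connection is approved, False if throttled.

        `capacity` is the availability status, between 0.0 and 1.0.
        """
        allowed = int(self.max_connections * capacity)
        if self.in_use < allowed:
            self.in_use += 1
            return True
        return False

    def release(self) -> None:
        """Hand a connection back to the gate."""
        if self.in_use > 0:
            self.in_use -= 1
```

With a limit of 4 connections and availability at 0.5, the gate approves two connections and throttles the third; at 0.0 it approves none, which protects a cluster whose slaves are all lagging.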
  • 55. Contributions
    ● We love open source and rely heavily on open source solutions.
    ● We try to contribute back in the form of patches, bug reports and subscribing to commercial support for open source projects.
    ● Some code we have open sourced:
      ○ Onering: https://github.com/outbrain/onering
      ○ Graphitus: https://github.com/ezbz/graphitus
      ○ Propagator: https://github.com/outbrain/propagator
      ○ audit_login: https://github.com/outbrain/audit_login
  • 56. Thank you! Questions?