MySQL Latency

End of a long day, I am the last stop between you and ….. ~6 hours 42 mins left Yippee! Future CTO

Software-as-a-Service Web CMS
True Multi-Tenant SaaS platform from the ground up
Integrated solution of all services required to run a sophisticated business website
HQ in San Francisco, 8+ years old, 60+ employees
Global leader in On Demand Web Content Management

250+ million pages delivered per month

Linux, Apache, MySQL, Java, Tomcat
Proven open source building blocks

Scale out horizontally
Distributed infrastructure, including multiple datacenters
Multiple layers of caching for performance
Loose coupling of applications around data

[Diagram: masters M1 and M2 and slaves S1 through S6, spread across Data Center 1 and Data Center 2, which are connected by a VPN tunnel]

Application code intelligently splits queries between masters and slaves:

    con = db.getReadWriteConnection();
    con = db.getReadOnlyConnection();
    con = db.getSafeReadConnection();

Inserts/Updates/Deletes sent to Master
Most Reads sent to Slaves
“Safe” Reads sent to Masters: zero tolerance for latency
Manual code updates to implement the split
6+ months in production to find all “Safe” Reads

[Diagram: RW Connection Manager, RO Connection Manager, Master, Slave]
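The split described above can be sketched roughly as follows. This is only an illustration, not the production connection managers: the class `ConnectionRouter`, the `Intent` enum, and the round-robin choice of slave are all assumptions made for the example, and the method returns a host name rather than a real JDBC connection.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: pick a database host from the caller's stated intent. */
final class ConnectionRouter {
    /** Mirrors the three calls on the slide: read-write, read-only, safe read. */
    enum Intent { READ_WRITE, READ_ONLY, SAFE_READ }

    private final String master;
    private final List<String> slaves;
    private final AtomicInteger next = new AtomicInteger();

    ConnectionRouter(String master, List<String> slaves) {
        this.master = master;
        this.slaves = List.copyOf(slaves);
    }

    /** Returns the host that should serve a connection opened with this intent. */
    String hostFor(Intent intent) {
        // Writes and "safe" reads cannot tolerate replication latency: use the master.
        if (intent != Intent.READ_ONLY || slaves.isEmpty()) {
            return master;
        }
        // Ordinary reads can tolerate slightly stale data: rotate across the slaves.
        // Round-robin is an assumption; the slides do not say how a slave is chosen.
        int i = Math.floorMod(next.getAndIncrement(), slaves.size());
        return slaves.get(i);
    }
}
```

With this shape, finding every “Safe” read is still manual work: each call site has to declare its intent, which matches the slide's point that it took months in production to find them all.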

The difference in time between when a transaction is committed on one database and then subsequently committed on a replicated database. Latency can either be “slowness” or “breakage”

7… Hardware maintenance / recovery
6… Schema updates / DB maintenance
5… Elevated transaction rates (e.g. bulk loads)
4… High query load on slaves
3… Network bottlenecks / loss of connectivity
2… “Slave errors” (e.g. duplicate keys, deadlocks)

    while ( 1 )
      echo "show slave status \G;" | mysql -u USER --password=PASSWORD | grep Seconds_Behind_Master >> replication.log
      sleep 1
    end

[Chart: Seconds_Behind_Master over time; axis label: Seconds]
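Each line that the loop appends to replication.log has the form `Seconds_Behind_Master: <value>`, where the value is `NULL` when the slave is not replicating at all. A minimal sketch of turning one such line into a number is shown here; the class and method names are hypothetical, and treating `NULL` as "no value" is a design choice made for the example so that breakage is kept distinct from slowness.

```java
import java.util.OptionalLong;

/** Hypothetical helper for lines grepped out of SHOW SLAVE STATUS output. */
final class SlaveStatusParser {
    /**
     * Parses a line such as "  Seconds_Behind_Master: 12".
     * Returns an empty result for NULL, which MySQL reports when replication is not running.
     */
    static OptionalLong secondsBehindMaster(String line) {
        int colon = line.indexOf(':');
        if (colon < 0 || !line.substring(0, colon).trim().equals("Seconds_Behind_Master")) {
            throw new IllegalArgumentException("not a Seconds_Behind_Master line: " + line);
        }
        String value = line.substring(colon + 1).trim();
        if (value.equalsIgnoreCase("NULL")) {
            return OptionalLong.empty(); // "breakage", not "slowness"
        }
        return OptionalLong.of(Long.parseLong(value));
    }
}
```

Note that this counter only has one-second resolution, which is one reason to measure latency more directly.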

[Diagram: the same topology of M1, M2, and S1 through S6 across Data Center 1 and Data Center 2, connected by a VPN tunnel]

[Diagram: M1, M2, S4, and S6, with the VPN tunnel and the INSERT marked]

    CREATE TABLE `replTest` (
      `timecol` bigint(20) default NULL,
      KEY `idx_timecol` (`timecol`)
    )

Loop:

    $val = current timestamp in epoch milliseconds
    M2: INSERT INTO replTest (timecol) VALUES ($val)
    M1: SELECT $val - max(timecol) FROM replTest;
    S4: SELECT $val - max(timecol) FROM replTest;
    S6: SELECT $val - max(timecol) FROM replTest;
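One way to turn the test above into a latency number is to keep re-running the SELECT on a replica until `$val - max(timecol)` reaches zero, and record how long that took. The slides do not say exactly how the samples were taken, so the polling approach below is an assumption, as are the names `ReplicationProbe` and `measureLatencyMillis`. The database read is passed in as a function so the sketch runs without a MySQL server.

```java
import java.util.function.LongSupplier;

/** Hypothetical probe: waits until a heartbeat value becomes visible on a replica. */
final class ReplicationProbe {
    /**
     * @param val           the timestamp just inserted on the transaction source
     * @param maxTimecol    reads max(timecol) from the replica being measured
     * @param timeoutMillis how long to wait before giving up
     * @return elapsed milliseconds until the row was visible, or -1 on timeout
     */
    static long measureLatencyMillis(long val, LongSupplier maxTimecol, long timeoutMillis) {
        long start = System.nanoTime();
        while (true) {
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            // Same expression as the slide: zero (or less) means the row has replicated.
            if (val - maxTimecol.getAsLong() <= 0) {
                return elapsed;
            }
            if (elapsed >= timeoutMillis) {
                return -1; // treat as "breakage" rather than "slowness"
            }
            Thread.onSpinWait(); // in practice each poll is a database round trip
        }
    }
}
```

The measured value includes the time spent on the polling queries themselves, so it is an upper bound on the true replication latency.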

All DBs are 1 replication hop away from the transaction source
All hardware is roughly equal
Remote location is ~60 miles away
Data taken from 100,000 samples over an hour of standard operations

Database   Characteristics        Average Latency   Max Latency
M2         Transaction Source     N/A               N/A
M1         Local; Moderate Load   ~6 ms             ~315 ms
S4         Local; High Load       ~190 ms           ~12 seconds
S6         Remote; Minimal Load   ~5 ms             ~400 ms

[Chart: S4 database replication latency, in milliseconds]
95% of the time, replication latency will be 1 second or less
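A figure such as "95% of the time, latency is 1 second or less" is a percentile over the collected samples. A minimal sketch of that calculation follows, using the nearest-rank definition; the slides do not say which percentile method was used, and the class name `LatencyStats` is made up for the example.

```java
import java.util.Arrays;

/** Hypothetical helper for summarizing latency samples (values in milliseconds). */
final class LatencyStats {
    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * pct percent of all samples are less than or equal to it.
     */
    static long percentile(long[] samples, double pct) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (pct <= 0 || pct > 100) {
            throw new IllegalArgumentException("pct must be in (0, 100]");
        }
        long[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[rank - 1];
    }
}
```

Comparing `percentile(samples, 95)` with `percentile(samples, 100)` makes the long tail visible: the average and the 95th percentile can look healthy while the maximum is many seconds.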

If you do, your Ops Team will love you for it. Assume that it will happen in the course of standard operations. Build the application to accommodate it.

Local ehcache on application servers
Distributed Object Cache (memcached)
Need to clear all caches effectively on object updates

[Diagram: Pub 1, Pub 2, and Pub 3, each with a local cache; Reliable Cache Clearing Messages; Distributed Object Cache]

Multicast Notification Bus for “clear cache” messages
The race is on! If the message arrives before the transaction is replicated, a stale object may be reloaded.
Frequently accessed objects are most susceptible to problems

[Diagram: CMS, Pub, DB1, DB2]

Multicast Notification Bus with tuning parameters
The race is on again! But the database transaction gets a tunable head start: 0.5 sec, 1 sec, 2 secs, 5 secs
Better: lasted for years, but in the end 99.99+% still wasn’t reliable enough... (remember the long tail on the chart?)

[Diagram: CMS, PUB, DB1, DB2]
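The "tunable head start" can be pictured as a delay between receiving a “clear cache” message and acting on it. The sketch below applies the delay on the receiving side with a `ScheduledExecutorService`; whether the real system delayed on the sender or the receiver is not stated, so that choice, the class name `DelayedCacheClearer`, and the in-memory map standing in for the local cache are all assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: evict a cached object only after a configurable delay. */
final class DelayedCacheClearer implements AutoCloseable {
    private final Map<String, Object> localCache = new ConcurrentHashMap<>();
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    private final long headStartMillis;

    DelayedCacheClearer(long headStartMillis) {
        this.headStartMillis = headStartMillis;
    }

    void put(String key, Object value) {
        localCache.put(key, value);
    }

    boolean contains(String key) {
        return localCache.containsKey(key);
    }

    /** Called when a "clear cache" message arrives on the notification bus. */
    void onClearMessage(String key) {
        // Give replication a head start so the next reload sees the new row.
        timer.schedule(() -> localCache.remove(key), headStartMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

The weakness is visible in the code: the delay is a fixed guess. Whenever replication latency exceeds `headStartMillis`, the object is evicted, reloaded from a slave that has not caught up, and cached stale again.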

Database queue table for messages
Messages are committed after the data, injecting them into the replication data stream.
All apps poll the database queue table once per second.
Guaranteed that data will arrive before the message!

[Diagram: CMS, PUB, DB1, DB2, Queue Poller]
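A rough sketch of the polling side is shown here. It assumes a queue table with an increasing numeric id and a cache key per row, and it abstracts the query (something like `SELECT ... WHERE id > ? ORDER BY id`) behind a function so the example runs without a database. The table layout and the names `QueuePoller`, `Message`, and `pollOnce` are hypothetical; the slides only state that apps poll once per second.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongFunction;

/** Hypothetical poller for "clear cache" messages stored in a replicated queue table. */
final class QueuePoller {
    /** One row of the assumed queue table: an increasing id and the cache key to evict. */
    static final class Message {
        final long id;
        final String cacheKey;

        Message(long id, String cacheKey) {
            this.id = id;
            this.cacheKey = cacheKey;
        }
    }

    private final LongFunction<List<Message>> fetchAfter; // rows with id > argument, in id order
    private final Consumer<String> evict;                 // clears one key from the caches
    private long lastSeenId;

    QueuePoller(LongFunction<List<Message>> fetchAfter, Consumer<String> evict) {
        this.fetchAfter = fetchAfter;
        this.evict = evict;
    }

    /** Runs one poll cycle and returns how many messages were processed. */
    int pollOnce() {
        List<Message> batch = fetchAfter.apply(lastSeenId);
        for (Message m : batch) {
            // The row is read from the local replica, so the data committed
            // before it has already been applied there.
            evict.accept(m.cacheKey);
            lastSeenId = Math.max(lastSeenId, m.id);
        }
        return batch.size();
    }
}
```

In an application, `pollOnce()` would be scheduled once per second against the same replica the application reads from. The ordering guarantee comes from replication itself, not from any timing parameter.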

If you don’t need to replicate it, don’t!
Split data functionally (e.g. separate large BLOB storage from relational transactions to keep the pipes clear)
Build the appropriate recovery tools: our “rewind button”

Masters in multiple data centers
Greater geographic distance between data centers
MySQL load balancing: will messaging still be reliable?

[email_address] Questions? Feedback?
