Retaining globally distributed high availability

Art van Scheppingen
Head of Database Engineering

Overview

1.  Who is Spil Games?
2.  Theory
3.  Spil Storage Platform
4.  Questions?

Who are we?
Who is Spil Games?

Facts

•  Company founded in 2001
•  350+ employees worldwide
•  180M+ unique visitors per month
•  45 portals in 19 languages
•  Casual games
•  Social games
•  Real-time multiplayer games
•  Mobile games
•  35+ MySQL clusters
•  60k queries per second (3.5 billion queries per day)

Geographic Reach

180 Million Monthly Active Users(*)

(*) Source: Google Analytics, August 2012

•  Over 45 localized portals in 19 languages
•  Multi-platform: web, mobile, tablet
•  Focus on casual and social games
•  180M MAU (30M YoY growth)
•  Over 50M registered users

Brands

Girls, Teens and Family

spielen.com
juegos.com
gamesgames.com
games.co.uk

Foundations
The exciting theory

Retaining globally distributed HA

•  What exactly does it mean?

What is high availability?

Wikipedia:

High availability is a system design approach and associated service implementation that ensures a prearranged level of operational performance will be met during a contractual measurement period.

Oracle:

•  Availability of resources in a computer system
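The "prearranged level of operational performance" in the definition above is usually expressed as an availability percentage over a measurement period. As a rough illustration, the sketch below converts such a percentage into the downtime it permits. The function name and the 30-day default period are assumptions made for this example, not figures from the talk.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: int = 30) -> float:
    """Return the downtime (in minutes) that still meets the availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - availability_percent / 100.0)


# "Three nines" over an assumed 30-day period, and "four nines" over a year.
monthly_budget = allowed_downtime_minutes(99.9)
yearly_budget = allowed_downtime_minutes(99.99, period_days=365)
```

Under these assumptions, 99.9% availability leaves roughly 43 minutes of downtime per 30 days.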

How do we reach HA with MySQL?

•  Master with (many) slave(s)

[Diagram: one master with three slaves]

•  Master with (many) slave(s)
•  Multi-master

[Diagram: two masters, each with a slave]

•  Master with (many) slave(s)
•  Multi-master
•  Clustering

[Diagram: two mysqld nodes, five ndbd data nodes and a management (mgmt) node]

•  Master with (many) slave(s)
•  Multi-master
•  Clustering
•  Geographical redundancy

[Diagram: master in the local DC, with slaves in the local DC, Asia and the US]

What if we keep growing?

•  Scale up
    •  Vertical
    •  Faster CPU/memory/disks
    •  Expensive
    •  Costs multiply at the same rate as the number of nodes
•  Scale out
    •  Horizontal
    •  More (small) machines
    •  Inexpensive
    •  Partitioning/federating (sharding)

Scale out

•  Functional
    •  Shard your database functionally
•  Reads
    •  Add more slaves (keep them coming!)
•  Writes
    •  More disks
    •  Horizontal partitioning
    •  Federated partitions

Horizontal partitioning

•  Breaking up tables in small parts on the same host
•  Partitioned on a column
•  Infinite growth (as long as you add disk space)
•  Less used data to slower (cheaper) disks
•  No stored procedures, functions, etc.
•  Uneven usage of partitions (hash partition may help)
•  Once written, data remains on the partition

Federated partitions (sharding)

•  Breaking up your table in parts on multiple hosts
•  Partitioned on a column
•  Infinite growth (as long as you add hosts)
•  Less used data on slower hosts
•  Not supported in (standard) MySQL
    •  Partitioning on application level (or proxy)
    •  Alternatively: NDB
•  Uneven usage of partitions
•  Once written, data (mostly) remains on the partition
•  Parallel queries to retrieve data from all shards
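Partitioning at the application level, as listed above, can be sketched roughly as follows. This is a minimal illustration, not Spil's implementation: the shard names, the `shard_for` helper and the in-memory dictionaries standing in for separate MySQL hosts are all assumptions made for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shard hosts; plain dicts stand in for separate MySQL servers.
SHARDS = {"shard-a": {}, "shard-b": {}, "shard-c": {}}
SHARD_NAMES = sorted(SHARDS)


def shard_for(user_id: int) -> str:
    """Pick a shard from the partitioning column using a stable hash."""
    digest = hashlib.md5(str(user_id).encode("ascii")).digest()
    return SHARD_NAMES[int.from_bytes(digest[:8], "big") % len(SHARD_NAMES)]


def write(user_id: int, row: dict) -> None:
    """Store the row on the single shard that owns this key."""
    SHARDS[shard_for(user_id)][user_id] = row


def read(user_id: int):
    """Single-key reads touch exactly one shard."""
    return SHARDS[shard_for(user_id)].get(user_id)


def query_all(predicate):
    """Scatter-gather: run the same filter on every shard in parallel."""
    def scan(name):
        return [row for row in SHARDS[name].values() if predicate(row)]

    with ThreadPoolExecutor(max_workers=len(SHARD_NAMES)) as pool:
        parts = list(pool.map(scan, SHARD_NAMES))
    return [row for part in parts for row in part]


# Populate the shards with sample rows.
for uid in range(1, 101):
    write(uid, {"user_id": uid, "gender": "f" if uid % 2 else "m"})
```

A query that cannot be answered from the partitioning column alone, such as `query_all(lambda r: r["gender"] == "f")`, has to visit every shard, which is where the parallel queries in the last bullet come in.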

Amdahl's law

•  Parallel execution of sequential jobs
•  Limited by the weakest link
    •  As fast as the slowest node
•  Fix: non-sequential (asynchronous) execution
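The bullets above summarize the consequence rather than the formula. In its textbook form, which the slides do not spell out, Amdahl's law gives the speedup on n workers as 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. The sketch below computes that, plus the "as fast as the slowest node" effect for a query fanned out over all shards. The function names and latency figures are made up for illustration.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Textbook Amdahl's law: speedup of a job on `workers` parallel workers."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def scatter_gather_latency(shard_latencies_ms):
    """A synchronous fan-out query finishes only when the slowest shard answers."""
    return max(shard_latencies_ms)


# With 90% parallel work, ten workers give well under a tenfold speedup.
speedup = amdahl_speedup(0.9, 10)
# Hypothetical per-shard latencies: one slow node dominates the whole request.
latency = scatter_gather_latency([12, 15, 240])
```

Asynchronous execution, the fix named on the slide, takes the slow step off the request path instead of waiting for it.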

Typical LAMP stack

[Diagram: client, load balancer, two web servers running PHP, MySQL and Memcache]

Atypical LAMP stack

[Diagram: client, load balancer, two web servers running PHP, MySQL and Memcache, plus an MQ and jobs]

Spil Storage Platform
Abstracting the storage layer

What was our wishlist?

•  Dependent on one storage platform
    •  No more platform-specific query language
•  Differentiate writes
    •  Optimistic (asynchronous)
    •  Pessimistic (synchronous)
•  Shard data better
    •  Partition on user and function
    •  Cluster information by users, not by function
•  Global expansion
    •  Partition on geographic location
•  Solve uneven usage of data storage
    •  Move data from shard to shard
•  Anything may/could/will fail eventually
    •  Not designed for the “happy” flow

Old architecture overview

New architecture overview

[Diagram layers: Server API, Application Model, Storage platform, Client-side API, Presentation layer, Physical storage]

Our building blocks

•  Everything written in Erlang
•  Piqi as protocol
    •  binary
    •  JSON
    •  XML
•  SSP utilizes local caching (memcache)
•  Flexible (persistent) storage layer
    •  MySQL (various flavors)
    •  Membase/Couchbase
    •  Could be any other storage product
•  MQs (DWH updates)

Why choose MySQL?

•  Predictable
•  Reliable
•  Decent performance
•  Easy to comprehend
•  Excellent ecosystem
    •  Libraries
    •  Monitoring tools
    •  Knowledge

Why Erlang?

•  Functional language
•  High availability: designed for telecom solutions
•  Excels at concurrency, distribution, fault tolerance
•  Do more with less!
•  Other companies using Erlang:

How do we shard?

•  What is the bucket model?
    •  Each record has one unique owner attribute (GID)
    •  GID (Global IDentifier) identifying different types
    •  Bucket(s) per functionality
    •  Bucket is structured data
    •  Attributes contain data of records
    •  Attributes do not have to correspond to schema
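The slides do not say how a GID encodes its type. One common way to carry such information is to pack it into the high bits of a 64-bit integer, and the sketch below illustrates that idea only. The field names and bit widths are hypothetical. The one grounded fact is that the GID used in the example bucket request, 288511851128422401, is 0x0401000000000001 in hexadecimal.

```python
EXAMPLE_GID = 288511851128422401  # the GID used in the example bucket request

# Hypothetical layout, assumed for illustration only (not documented in the talk):
#   bits 56-63: record type, bits 48-55: origin tag, bits 0-47: sequence number
TYPE_SHIFT = 56
ORIGIN_SHIFT = 48
SEQUENCE_MASK = (1 << 48) - 1


def pack_gid(record_type: int, origin: int, sequence: int) -> int:
    """Combine the three assumed fields into one 64-bit identifier."""
    if not (0 <= record_type < 256 and 0 <= origin < 256):
        raise ValueError("record_type and origin must fit in one byte")
    if not 0 <= sequence <= SEQUENCE_MASK:
        raise ValueError("sequence must fit in 48 bits")
    return (record_type << TYPE_SHIFT) | (origin << ORIGIN_SHIFT) | sequence


def unpack_gid(gid: int) -> dict:
    """Split a GID back into the assumed fields."""
    return {
        "type": (gid >> TYPE_SHIFT) & 0xFF,
        "origin": (gid >> ORIGIN_SHIFT) & 0xFF,
        "sequence": gid & SEQUENCE_MASK,
    }


fields = unpack_gid(EXAMPLE_GID)
```

Whatever the real layout is, an identifier that carries its own type lets a single lookup route a record without consulting the bucket first.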

Example bucket

$ curl -X POST -H 'Accept: application/json' -H 'Content-Type: application/json' \
    --data-binary '{"gid": 288511851128422401}' http://127.0.0.1:8777/demobucket/get
{
  "records": [
    {
      "gid": 288511851128422401,
      "given_name": "g",
      "registered_on": 1,
      "email": "mail1",
      "gender": "m",
      "birthdate": { "year": 1963, "month": 6, "day": 21 }
    }
  ],
  "meta_info": { "total_ct": 1 }
}

Example bucket MySQL 1

CREATE TABLE demobucket (
  gid bigint(20) unsigned not null,
  given_name varchar(64) not null,
  registered_on tinyint(3) unsigned default 0,
  email varchar(255) not null,
  gender enum('m', 'f', 'u') not null default 'm',
  birthdate date not null,
  PRIMARY KEY(gid)
);

Example bucket MySQL 2

CREATE TABLE demobucket (
  gid bigint(20) unsigned not null,
  user_name varchar(64) not null,
  user_register timestamp on update CURRENT_TIMESTAMP(),
  user_emailaddress varchar(255) not null,
  user_gender char(1) not null default 'm',
  user_dob varchar(10) not null,
  PRIMARY KEY(gid)
);

Example bucket Cassandra

CREATE COLUMNFAMILY demobucket (
  gid int PRIMARY KEY,
  given_name varchar,
  registered_on timestamp,
  email varchar,
  gender varchar,
  birth_date varchar
);

Example Erlang filters

demobucket:get(
  #demobucket_get_input{
    gid = 12345,
    filters = [
      #filter{attr = <<"gender">>, op = <<"=">>, parms = {string, <<"f">>}},
      #filter{attr = <<"registered_on">>, op = <<"sort">>, parms = asc},
      #filter{attr = <<"gid">>, op = <<"limit">>, parms = {int, 10}}
    ]
  }
)
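How the three filters in the call above might combine can be illustrated outside Erlang. The sketch below is in Python and assumes the filters are applied in the order given: an equality match, then a sort, then a limit. `apply_filters` and the sample records are hypothetical and are not part of the SSP API.

```python
def apply_filters(records, filters):
    """Apply (attribute, operator, parameter) filters in order to a list of dicts."""
    result = list(records)
    for attr, op, parms in filters:
        if op == "=":
            result = [r for r in result if r.get(attr) == parms]
        elif op == "sort":
            result = sorted(result, key=lambda r: r[attr], reverse=(parms == "desc"))
        elif op == "limit":
            result = result[:parms]
        else:
            raise ValueError(f"unsupported operator: {op}")
    return result


# Made-up records owned by one GID.
RECORDS = [
    {"gid": 1, "gender": "f", "registered_on": 30},
    {"gid": 2, "gender": "m", "registered_on": 10},
    {"gid": 3, "gender": "f", "registered_on": 20},
]

# Same shape as the Erlang example: match gender, sort ascending, keep at most 10.
FILTERS = [("gender", "=", "f"), ("registered_on", "sort", "asc"), ("gid", "limit", 10)]

matching = apply_filters(RECORDS, FILTERS)
```

Expressing the query as data rather than SQL is what lets the same request run against MySQL, Cassandra or another backend.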

Pipeline flow of a bucket

Global distribution

•  Nearest datacenter (DC) to the end user
•  Satellite DC
    •  Processing and caching
    •  Do not own/store data
•  Storage DC
    •  Processing, caching and persistent storage
    •  Store all data of the same user in the same DC
•  Partition on user globally
    •  Global IDentifier per user

The lookup server

•  Contains GIDs and their master DC
•  A GID's master DC is predefined
•  Migrated GIDs get updated

How does this work?

•  Globally sharded on GID
•  (local) GID Lookup

[Diagram: GID lookup in front of Shard 1 and Shard 2 in persistent storage]
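The two-step routing described above (find the master DC for a GID, then the shard inside that DC) can be sketched as follows. The DC names, the shard lists and the modulo rule for picking a shard are assumptions made for this example. The talk does not describe how the lookup is stored or how a shard is chosen.

```python
# Hypothetical lookup data: the master DC for each GID.
GID_MASTER_DC = {1234: "ams", 1235: "ams", 1241: "sgp"}

# Hypothetical shard layout inside each storage DC.
SHARDS_PER_DC = {"ams": ["shard1", "shard2"], "sgp": ["shard1", "shard2"]}


def locate(gid: int):
    """Return (master DC, shard) for a GID."""
    dc = GID_MASTER_DC[gid]
    shards = SHARDS_PER_DC[dc]
    # Assumed placement rule: spread GIDs over the local shards by modulo.
    return dc, shards[gid % len(shards)]


def migrate(gid: int, new_dc: str) -> None:
    """Record that a GID now has a different master DC."""
    if new_dc not in SHARDS_PER_DC:
        raise ValueError(f"unknown storage DC: {new_dc}")
    GID_MASTER_DC[gid] = new_dc
```

A satellite DC would only need the first step: it resolves the master DC and forwards the request there, since it stores no data itself.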

Master/Satellite DC example

Why do we need data migration?

•  Spread data evenly over shards
    •  Migration of buckets between shards
•  GID migration between DCs
    •  Creating a new storage DC needs data migration
    •  Users will automatically be migrated after visiting another DC many times
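The automatic migration in the last bullet could be driven by a simple visit counter, sketched below. The threshold of five visits, the class name and the choice to count visits without ever expiring them are assumptions. The slides only say "many times".

```python
from collections import Counter, defaultdict

MIGRATION_THRESHOLD = 5  # assumed value; the talk does not give a number


class VisitTracker:
    """Counts visits to non-master DCs and migrates a GID once a DC is visited often."""

    def __init__(self, master_dc: dict):
        self.master_dc = dict(master_dc)            # GID -> current master DC
        self.remote_visits = defaultdict(Counter)   # GID -> visits per other DC

    def record_visit(self, gid: int, visited_dc: str) -> bool:
        """Register a visit; return True if it triggered a migration."""
        if visited_dc == self.master_dc[gid]:
            return False
        self.remote_visits[gid][visited_dc] += 1
        if self.remote_visits[gid][visited_dc] >= MIGRATION_THRESHOLD:
            self.master_dc[gid] = visited_dc
            self.remote_visits[gid].clear()
            return True
        return False


tracker = VisitTracker({1234: "ams"})
outcomes = [tracker.record_visit(1234, "sgp") for _ in range(MIGRATION_THRESHOLD)]
```

A production version would presumably also copy the user's data and update the lookup server before switching the master DC.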

Seamless schema upgrades

•  Versioning on bucket definitions
•  GIDs are assigned to a bucket version
•  Data in old bucket versions remains (read only)
•  New data only gets written to the new bucket version
•  Updates migrate data to the new bucket version
•  Migrations can be triggered

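Seamless schema upgrades

Initial state: all records are in Demobucket v1; Demobucket v2 (GID, name, gender) is empty.

Demobucket v1
GID   name
1234  Roy
1235  Moss
1236  Jen
1237  Douglas
1238  Denholm
1239  Richmond

A new record (GID 1241) is written to Demobucket v2.

Demobucket v2
GID   name      gender
1241  Patricia  f

GID 1235 moves from v1 to v2.

Demobucket v1
GID   name
1234  Roy
1236  Jen
1237  Douglas
1238  Denholm
1239  Richmond

Demobucket v2
GID   name      gender
1241  Patricia  f
1235  Moss      m

GID 1236 moves from v1 to v2.

Demobucket v1
GID   name
1234  Roy
1237  Douglas
1238  Denholm
1239  Richmond

Demobucket v2
GID   name      gender
1241  Patricia  f
1235  Moss      m
1236  Jen       f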
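The progression shown above can be mimicked with two dictionaries, one per bucket version. This is a sketch of the idea under stated assumptions, not the SSP implementation: the class and method names are hypothetical, and moving a row by deleting it from v1 is how the example interprets "updates migrate data to the new bucket version".

```python
class VersionedBucket:
    """Two bucket versions: v1 is never written to, v2 receives all new writes."""

    def __init__(self, v1_rows: dict):
        self.v1 = dict(v1_rows)  # old version: GID -> {"name": ...}
        self.v2 = {}             # new version: GID -> {"name": ..., "gender": ...}

    def get(self, gid: int):
        """Read from v2 first, then fall back to v1 with the new column empty."""
        if gid in self.v2:
            return self.v2[gid]
        if gid in self.v1:
            return {"name": self.v1[gid]["name"], "gender": None}
        return None

    def insert(self, gid: int, name: str, gender: str) -> None:
        """New data is only written to the new version."""
        self.v2[gid] = {"name": name, "gender": gender}

    def update(self, gid: int, **changes) -> None:
        """An update moves the row to v2 before applying the change."""
        if gid in self.v1:
            old = self.v1.pop(gid)
            self.v2[gid] = {"name": old["name"], "gender": None}
        if gid not in self.v2:
            raise KeyError(gid)
        self.v2[gid].update(changes)


bucket = VersionedBucket({
    1234: {"name": "Roy"}, 1235: {"name": "Moss"}, 1236: {"name": "Jen"},
    1237: {"name": "Douglas"}, 1238: {"name": "Denholm"}, 1239: {"name": "Richmond"},
})
bucket.insert(1241, "Patricia", "f")  # new record goes straight to v2
bucket.update(1235, gender="m")       # update migrates Moss to v2
bucket.update(1236, gender="f")       # update migrates Jen to v2
```

Because rows move one at a time as they are touched, no table-wide ALTER is needed, and a triggered migration can move the untouched rows later.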

Multi Master writes

•  Every cluster (two masters) will contain two shards
•  Data written interleaved
•  HA for both shards
•  No warmup needed
    •  Both masters active and “warmed up”
•  Slaves added (other DC) for HA and backup

[Diagram: SSP writing to Shard 1 and Shard 2]
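One possible reading of "data written interleaved" is that each of the two masters is the preferred writer for one shard and stands in for the other when its peer fails. The sketch below illustrates that reading only. The master and shard names and the failover rule are assumptions, not details from the talk.

```python
class MasterPair:
    """Two masters in one cluster; each is the preferred writer for one shard."""

    def __init__(self):
        self.alive = {"master_a": True, "master_b": True}
        # Assumed interleaving: shard 1 writes go to A, shard 2 writes go to B.
        self.preferred = {"shard1": "master_a", "shard2": "master_b"}

    def writer_for(self, shard: str) -> str:
        """Return the master that should take writes for this shard right now."""
        primary = self.preferred[shard]
        if self.alive[primary]:
            return primary
        fallback = next(name for name in self.alive if name != primary)
        if not self.alive[fallback]:
            raise RuntimeError("no master available for " + shard)
        return fallback


pair = MasterPair()
normal = (pair.writer_for("shard1"), pair.writer_for("shard2"))
pair.alive["master_a"] = False          # simulate losing one master
after_failure = (pair.writer_for("shard1"), pair.writer_for("shard2"))
```

Since both masters already serve live traffic, the surviving one has warm caches when it takes over, which matches the "no warmup needed" bullet.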

Where do we stand now?

•  SPAPI is in place
•  SSP is (mostly) running in shadow mode
•  GID buckets running in production
•  Activity feed system first to production
•  Satellite DC in early 2013!

Thank you!

•  Presentation can be found at: http://spil.com/perconalondon2012
•  If you wish to contact me: art@spilgames.com
•  Don’t forget to rate my talk!
