Designing for Scale

Knut Nesheim @knutin
Paolo Negri @hungryblank

About this talk

2 developers and erlang
vs.
1 million daily users

Social Games
Flash client (game) -> HTTP API

Social Games
Flash client

• Game actions need to be
persisted and validated

• 1 API call every 2 secs

Social Games
HTTP API

• @ 1 000 000 daily users
• 5000 HTTP reqs/sec
• more than 90% writes
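
A back-of-the-envelope check of these numbers, as a sketch: if every active client makes one API call every 2 seconds, 5000 requests/sec corresponds to roughly 10,000 clients playing at the same moment. The function name is illustrative only.

```python
def concurrent_clients(requests_per_sec, secs_between_calls):
    """Clients that must be online at once to sustain the request rate,
    assuming each client issues one call every `secs_between_calls`."""
    return requests_per_sec * secs_between_calls


print(concurrent_clients(5000, 2))  # 10000
```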

The hard nut

http://www.flickr.com/photos/mukluk/315409445/

Users we expect

[Chart: "Monster World" daily active users (DAU), July - December 2010; y-axis from 0 to 1,000,000]

Users we have

[Chart: new game daily active users (DAU), March - June 2011; y-axis from 0 to 50]

What to do?

1 Simulate users

Simulating users

• Must not be too synthetic (like
apachebench)
• Must look like a meaningful game session
• Users must come online at a given rate and
play

Tsung

• Multi protocol (HTTP, XMPP) benchmarking tool

• Able to test non trivial call sequences

• Can actually simulate a scripted gaming session

http://tsung.erlang-projects.org/

Tsung - configuration

<request subst="true">
<http url="http://server.wooga.com/users/%%ts_user_server:get_unique_id%%/resources/column/5/row/14?%%_routing_key%%"
method="POST" contents='{"parameter1":"value1"}'>
</http>
</request>

Fixed content: the literal URL path and request body
Dynamic parameters: the %%...%% substitutions


• Not something you fancy writing
• We’re in development, calls change and we
constantly add new calls
• A session might contain hundreds of
requests
• All the calls must refer to a consistent game
state


• From our Ruby test code
user.resources(:column => 5, :row => 14)

• Same as
<request subst="true">
<http url="http://server.wooga.com/users/%%ts_user_server:get_unique_id%%/resources/column/5/row/14?%%_routing_key%%"
method="POST" contents='{"parameter1":"value1"}'>
</http>
</request>
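
The idea of generating Tsung requests from a compact call description can be sketched as follows. This is a Python illustration, not the Ruby code used for the talk; `tsung_request`, `BASE` and the host name are assumptions, and the %%...%% markers are passed through untouched for Tsung to substitute at run time.

```python
import json
import xml.etree.ElementTree as ET

# Assumed base URL; the %%...%% part is a Tsung substitution, not Python.
BASE = "http://server.wooga.com/users/%%ts_user_server:get_unique_id%%"


def tsung_request(resource, params, body):
    """Render one API call as a Tsung <request> element (hypothetical helper)."""
    path = "/".join(f"{key}/{value}" for key, value in params.items())
    url = f"{BASE}/{resource}/{path}?%%_routing_key%%"
    request = ET.Element("request", subst="true")
    ET.SubElement(request, "http", url=url, method="POST",
                  contents=json.dumps(body, separators=(",", ":")))
    return ET.tostring(request, encoding="unicode")


print(tsung_request("resources", {"column": 5, "row": 14},
                    {"parameter1": "value1"}))
```

Generating the XML keeps hundreds of requests per session in sync with calls that are still changing.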


• Session: a session is a group of requests
  • requests

• Arrival phase: sessions arrive in phases with a specific arrival rate
  • duration
  • arrival rate
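
How duration and arrival rate combine, as a minimal sketch: the class, the function and the numbers below are made up for illustration and are not Tsung's own data model.

```python
from dataclasses import dataclass


@dataclass
class ArrivalPhase:
    duration_secs: int   # how long the phase lasts
    arrival_rate: int    # new sessions started per second


def sessions_started(phases):
    """Total number of sessions started across all phases."""
    return sum(p.duration_secs * p.arrival_rate for p in phases)


# Illustrative numbers only: a gentle warm-up followed by a heavier phase.
phases = [ArrivalPhase(60, 10), ArrivalPhase(300, 50)]
print(sessions_started(phases))  # 15600
```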


Tsung - setup

[Diagram: in the benchmarking cluster, the tsung master controls the tsung workers over ssh; the workers send HTTP requests to the app servers in the application cluster]


Tsung

• Generates ~ 2500 reqs/sec on AWS
m1.large
• Flexible but hard to extend
• Code base rather obscure


What to do?

2 Collect metrics

Tsung-metrics

• Tsung collects measures and provides
reports
• But these measures include tsung's own network/
CPU congestion
• Tsung machines aren’t a good vantage point


HAproxy

[Diagram: the tsung workers send HTTP requests to haproxy, which forwards them to the app servers; the tsung master controls the workers over ssh]

HAproxy

“The Reliable, High Performance TCP/
HTTP Load Balancer”
• Placed in front of http servers
• Load balancing
• Fail over

HAproxy - syslog

• Easy to setup
• Efficient (UDP)
• Provides 5 timings for each request

HAproxy
• Time to receive request from client

HAproxy
• Time spent in HAproxy queue

HAproxy
• Time to connect to the server

HAproxy
• Time to receive response headers from server

HAproxy
• Total session duration time

HAproxy - syslog

• Application URLs directly identify the server call
• Application URLs are easy to parse
• Processing the haproxy syslog gives per-call
metrics
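
A minimal sketch of pulling the five timings and the call type out of one HAproxy HTTP log line. The sample line is invented, the regular expression is simplified, and replacing numeric path segments with `:n` is an assumption about how calls could be grouped. The timers appear in the order Tq/Tw/Tc/Tr/Tt: request, queue, connect, response headers, total.

```python
import re

LINE = re.compile(
    r'haproxy\[\d+\]: \S+ \[[^\]]+\] \S+ \S+ '
    r'(?P<tq>-?\d+)/(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tr>-?\d+)/(?P<tt>\+?\d+) '
    r'(?P<status>\d{3}) .*"(?P<method>[A-Z]+) (?P<path>\S+)'
)

# Invented sample line in HAproxy's HTTP log layout.
SAMPLE = ('Jun  1 12:00:00 lb haproxy[1234]: 10.0.0.1:40000 '
          '[01/Jun/2011:12:00:00.123] http-in app/app1 2/0/1/15/18 200 310 '
          '- - ---- 5/5/3/1/0 0/0 '
          '"POST /users/42/resources/column/5/row/14?key HTTP/1.1"')


def parse_line(line):
    """Return the timers (ms), status and normalized call, or None."""
    m = LINE.search(line)
    if m is None:
        return None
    row = {name: int(m.group(name)) for name in ("tq", "tw", "tc", "tr", "tt")}
    row["status"] = int(m.group("status"))
    # Drop the query string and replace numeric segments to get the call type.
    row["call"] = re.sub(r"/\d+", "/:n", m.group("path").split("?")[0])
    return row


print(parse_line(SAMPLE))
```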

What to do?

3 Understand metrics

Reading/aggregating
metrics

• Python to parse/normalize syslog
• R language to analyze/visualize data
• R language console to interactively explore
benchmarking results

R is a free software environment for
statistical computing and graphics.

What you get

• Aggregate performance levels (throughput,
latency)
• Detailed performance per call type
• Statistical analysis (outliers, trends,
regression, correlation, frequency, standard
deviation)
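
The per-call aggregation can be sketched as below. The analysis for the talk was done in R; this Python stand-in, its function name and its sample numbers are illustrative assumptions only.

```python
import statistics
from collections import defaultdict


def summarize(samples):
    """samples: iterable of (call_type, latency_ms) pairs.
    Returns count, mean, standard deviation and max per call type."""
    by_call = defaultdict(list)
    for call, latency_ms in samples:
        by_call[call].append(latency_ms)
    return {
        call: {
            "count": len(values),
            "mean": statistics.mean(values),
            "stdev": statistics.pstdev(values),
            "max": max(values),
        }
        for call, values in by_call.items()
    }


# Made-up latencies: one call type stands out as slow.
stats = summarize([("resources", 10), ("resources", 20),
                   ("resources", 30), ("fetch_many", 400)])
for call, row in stats.items():
    print(call, row)
```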

Digging into the data

• From HAproxy log analysis one call
emerged as exceptionally slow
• Using eprof we were able to determine
that most of the time was spent in a redis
query fetching many keys (MGET)

Tracing erldis query
• More than 60% of runtime is spent
manipulating the socket
• gen_tcp:recv/2 is the culprit
• But why is it called so many times?

Understanding the
redis protocol
C: LRANGE mylist 0 2
s: *2
s: $5
s: Hello
s: $5
s: World

On the wire:
<<"*2\r\n$5\r\nHello\r\n$5\r\nWorld\r\n">>
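
Reading such a reply from a complete buffer, as a minimal sketch: the function is illustrative, assumes the whole reply has already arrived, and ignores nil entries and nested replies.

```python
def parse_multi_bulk(data):
    """Parse one complete multi-bulk reply into a list of byte strings."""

    def read_line(pos):
        end = data.index(b"\r\n", pos)
        return data[pos:end], end + 2

    header, pos = read_line(0)
    if not header.startswith(b"*"):
        raise ValueError("not a multi-bulk reply")
    items = []
    for _ in range(int(header[1:])):
        size_line, pos = read_line(pos)
        if not size_line.startswith(b"$"):
            raise ValueError("expected a bulk string")
        size = int(size_line[1:])
        # Use the announced length: the payload itself may contain CRLF.
        items.append(data[pos:pos + size])
        pos += size + 2  # skip the payload and its trailing CRLF
    return items


print(parse_multi_bulk(b"*2\r\n$5\r\nHello\r\n$5\r\nWorld\r\n"))
# [b'Hello', b'World']
```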

Understanding erldis
• recv_value/2 is used in the protocol parser
to get the next data to parse

A different approach
• Two ways to use gen_tcp: active or passive
• In passive, use gen_tcp:recv to explicitly ask
for data, blocking
• In active, gen_tcp will send the controlling
process a message when there is data
• Hybrid: active once


• Are active sockets faster?
• A proof of concept showed active sockets
to be faster
• Change erldis or write a new driver?


• Radical change => new driver
• Keep the erldis queuing approach
• Think about error handling from the start
• Use active sockets

• Active socket, parse partial replies
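
What parsing partial replies means, sketched in Python rather than Erlang: data arrives in arbitrary chunks, so the parser keeps a buffer and only hands back replies that are complete. `ReplyBuffer` is a hypothetical name and this is not the eredis parser; it handles flat multi-bulk replies only.

```python
class ReplyBuffer:
    """Accumulates socket chunks and returns complete multi-bulk replies."""

    def __init__(self):
        self.buf = b""

    def feed(self, chunk):
        """Add a chunk; return every reply completed so far (possibly none)."""
        self.buf += chunk
        replies = []
        while True:
            parsed = self._try_parse()
            if parsed is None:
                return replies
            reply, consumed = parsed
            replies.append(reply)
            self.buf = self.buf[consumed:]

    def _try_parse(self):
        """Return (items, bytes_consumed), or None if more data is needed."""
        buf = self.buf
        end = buf.find(b"\r\n")
        if end < 0:
            return None
        if not buf.startswith(b"*"):
            raise ValueError("not a multi-bulk reply")
        count, pos, items = int(buf[1:end]), end + 2, []
        for _ in range(count):
            end = buf.find(b"\r\n", pos)
            if end < 0:
                return None
            size = int(buf[pos + 1:end])      # skip the leading "$"
            start = end + 2
            if len(buf) < start + size + 2:   # payload not fully here yet
                return None
            items.append(buf[start:start + size])
            pos = start + size + 2
        return items, pos


parser = ReplyBuffer()
print(parser.feed(b"*2\r\n$5\r\nHel"))            # []
print(parser.feed(b"lo\r\n$5\r\nWorld\r\n"))      # [[b'Hello', b'World']]
```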

Circuit breaker
• eredis has a simple circuit breaker for when
Redis is down/unreachable
• eredis returns immediately to clients if
connection is down
• Reconnecting is done outside request/
response handling
• Robust handling of errors
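
The fail-fast behaviour can be sketched like this. It is a Python illustration of the pattern, not eredis code; the class, method names, and the injected `connect` and `clock` callables are assumptions.

```python
import time


class BreakerClient:
    """Fails fast while the connection is down instead of blocking callers."""

    def __init__(self, connect, retry_after_secs=1.0, clock=time.monotonic):
        self._connect = connect            # callable returning a connection
        self._retry_after = retry_after_secs
        self._clock = clock
        self._conn = None
        self._next_attempt = 0.0

    def reconnect_if_due(self):
        """Run from a timer or background loop, never from the request path."""
        if self._conn is None and self._clock() >= self._next_attempt:
            try:
                self._conn = self._connect()
            except OSError:
                self._next_attempt = self._clock() + self._retry_after

    def query(self, command):
        if self._conn is None:
            return ("error", "no_connection")   # return immediately
        try:
            return ("ok", self._conn.send(command))
        except OSError:
            self._conn = None                   # open the breaker
            return ("error", "no_connection")
```

Callers never wait on a dead connection, and reconnect attempts are rate-limited by `retry_after_secs`.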

Benchmarking eredis

• Redis driver critical for our application
• Must perform well
• Must be stable
• How do we test this?

Basho bench

• Basho produces the Riak KV store
• Basho built a tool to test KV servers
• Basho bench
• We used Basho bench to test eredis

Basho bench
• Create callback module

Basho bench
• Configuration term-file

eredis is open source

https://github.com/wooga/eredis

What to do?

5 Measure internals

Measure internals

The HAproxy point of view is valid, but how do
we measure the internals of our application
while we are live, without the overhead of
tracing?

Think Basho bench

• Basho bench can benchmark a redis driver
• Redis is very fast, 100K ops/sec
• Basho bench overhead is acceptable
• The code is very simple

Cherry pick ideas from
Basho Bench
• Creates a histogram of timings on the fly,
reducing the number of data points
• Dumps to disk every N seconds
• Allows statistical tools to work on already
aggregated data
• Near real-time, from event to stats in N+5
seconds
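
The histogram idea, as a minimal sketch: timings are counted per bucket as they happen, so only the bucket counts need to be kept and written out. Fixed-width buckets and the class name are simplifying assumptions, not how Basho Bench itself stores its data.

```python
from collections import Counter


class LatencyHistogram:
    """Counts timings per fixed-width bucket instead of storing every sample."""

    def __init__(self, bucket_ms=10):
        self.bucket_ms = bucket_ms
        self.counts = Counter()

    def record(self, elapsed_ms):
        # Round down to the start of the bucket, e.g. 37 ms -> bucket 30.
        bucket = int(elapsed_ms // self.bucket_ms) * self.bucket_ms
        self.counts[bucket] += 1

    def flush(self):
        """Return the aggregated buckets and start a fresh window.
        A real system would call this every N seconds and write to disk."""
        snapshot = dict(sorted(self.counts.items()))
        self.counts = Counter()
        return snapshot


hist = LatencyHistogram(bucket_ms=10)
for elapsed in (3, 7, 37):
    hist.record(elapsed)
print(hist.flush())  # {0: 2, 30: 1}
```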

Homegrown stats
• Measures latency from the edges of our
system (excludes HTTP handling)
• And at interesting points inside the system
• Statistical analysis using R
• Correlate with HAproxy data
• Produces graphs and data specific to our
application

Recap

Measure:
• From an external point of view (HAproxy)
• At the edge of the system (excluding
HTTP handling)
• Internals in the single process (eprof)

Recap
Analyze:
• Aggregated measures
• Statistical properties of measures
• standard deviation
• distribution
• trends

Thanks!

http://www.wooga.com/jobs

knut.nesheim@wooga.com @knutin
paolo.negri@wooga.com @hungryblank
