게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming

게임을 위한 DynamoDB 사례 및 팁
김일호 | Solutions Architect
박성원, Nexon Korea | 파트장

Agenda
§Tip 1. DynamoDB Index(LSI, GSI)
§Tip 2. DynamoDB Scaling
§Tip 3. DynamoDB Data Modeling
§Scenario based Best Practice
§Nexon use-case

Indexes
Scaling
Data Modeling
Best Practice
Use-case

Local secondary index (LSI)
§Alternate range key attribute
§Index is local to a hash key (or partition)
A1
(hash)
A3
(range)
A2
(table key)
A1
(hash)
A2
(range)
A3 A4 A5
LSIs A1
(hash)
A4
(range)
A2
(table key)
A3
(projected)
Table
KEYS_ONLY
INCLUDE A3
A1
(hash)
A5
(range)
A2
(table key)
A3
(projected)
A4
(projected)
ALL
10 GB max per hash key,
i.e. LSIs limit the # of ran
ge keys!

Global secondary index (GSI)
§Alternate hash (+range) key
§Index is across all table hash keys (partitions)
A1
(hash)
A2 A3 A4 A5
GSIs A5
(hash)
A4
(range)
A1
(table key)
A3
(projected)
Table
INCLUDE A3
A4
(hash)
A5
(range)
A1
(table key)
A2
(projected)
A3
(projected) ALL
A2
(hash)
A1
(table key) KEYS_ONLY
RCUs/WCUs provisioned se
parately for GSIs
Online indexing

How do GSI updates work?
Table
Primary ta
ble
Primary ta
ble
Primary ta
ble
Primary ta
ble
Global Seco
ndary Index
Client
2. Asynchronous
update (in progress)
If GSIs don’t have enough write capacity, table writes will be throttled!

LSI or GSI?
§LSI can be modeled as a GSI
§If data size in an item collection > 10 GB, use GSI
§If eventual consistency is okay for your scenario, use GSI!

Scaling
§Throughput
§ Provision any amount of throughput to a table
§Size
§ Add any number of items to a table
§ Max item size is 400 KB
§ LSIs limit the number of range keys due to 10 GB limit
§Scaling is achieved through partitioning

Throughput
§Provisioned at the table level
§ Write capacity units (WCUs) are measured in 1 KB per second
§ Read capacity units (RCUs) are measured in 4 KB per second
§ RCUs measure strictly consistent reads
§ Eventually consistent reads cost 1/2 of consistent reads
§Read and write throughput limits are independent
WCURCU

Partitioning math
Number of Partitions
By Capacity (Total RCU / 3000) + (Total WCU / 1000)
By Size Total Size / 10 GB
Total Partitions CEILING(MAX (Capacity, Size))

Partitioning example
Table size = 8 GB, RCUs = 5000, WCUs = 500
RCUs per partition = 5000/3 = 1666.67
WCUs per partition = 500/3 = 166.67
Data/partition = 10/3 = 3.33 GB
RCUs and WCUs are uniformly sprea
d across partitions
Numberof Partitions
By Capacity (5000 / 3000) + (500 / 1000) = 2.17
By Size 8 / 10 = 0.8
Total Partitions CEILING(MAX (2.17, 0.8)) = 3

Example: hot keys
Partition
Time
Heat

Getting the most out of DynamoDB throughput
“To get the most out of
DynamoDB throughput, create
tables where the hash key
element has a large number of
distinct values, and values are
requested fairly uniformly, as
randomly as possible.”
—DynamoDB Developer Guide
§Space: access is evenly
spread over the key-
space
§Time: requests arrive
evenly spaced in time

What causes throttling?
§ If sustained throughput goes beyond provisioned throughput per partition
§ Non-uniform workloads
§ Hot keys/hot partitions
§ Very large bursts
§ Mixing hot data with cold data
§ Use a table per time period
§ From the example before:
§ Table created with 5000 RCUs, 500 WCUs
§ RCUs per partition = 1666.67
§ WCUs per partition = 166.67
§ If sustained throughput > (1666 RCUs or 166 WCUs) per key or partition, DynamoDB may
throttle requests
§ Solution: Increase provisioned throughput

1:1 relationships or key-values
§Use a table or GSI with a hash key
§Use GetItem or BatchGetItem API
§Example: Given an SSN or license number, get
attributes
Users Table
Hash key Attributes
SSN = 123-45-6789 Email = johndoe@nowhere.com, License = TDL25478134
SSN = 987-65-4321 Email = maryfowler@somewhere.com, License = TDL78309234
Users-Email-GSI
Hash key Attributes
License = TDL78309234 Email = maryfowler@somewhere.com, SSN = 987-65-4321
License = TDL25478134 Email = johndoe@nowhere.com, SSN = 123-45-6789

1:N relationships or parent-children
§Use a table or GSI with hash and range key
§Use Query API
Example:
§ Given a device, find all readings between epoch X, Y
Device-measurements
Hash Key Range key Attributes
DeviceId = 1 epoch = 5513A97C Temperature = 30, pressure = 90
DeviceId = 1 epoch = 5513A9DB Temperature = 30, pressure = 90

N:M relationships
§Use a table and GSI with hash and range key elements
switched
§Use Query API
Example: Given a user, find all games. Or given a game,
find all users.
User-Games-Table
Hash Key Range key
UserId = bob GameId = Game1
UserId = fred GameId = Game2
UserId = bob GameId = Game3
Game-Users-GSI
Hash Key Range key
GameId = Game1 UserId = bob
GameId = Game2 UserId = fred
GameId = Game3 UserId = bob

Documents (JSON)
§ New data types (M, L, BOOL,
NULL) introduced to support
JSON
§ Document SDKs
§ Simple programming model
§ Conversion to/from JSON
§ Java, JavaScript, Ruby, .NET
§ Cannot index (S,N) elements
of a JSON object stored in M
§ Only top-level table attributes
can be used in LSIs and
GSIs without
Streams/Lambda
JavaScript DynamoDB
string S
number N
boolean BOOL
null NULL
array L
object M

Rich expressions
§Projection expression
§ Query/Get/Scan: ProductReviews.FiveStar[0]
§Filter expression
§ Query/Scan: #VIEWS > :num
§Conditional expression
§ Put/Update/DeleteItem: attribute_not_exists (#pr.FiveStar)
§Update expression
§ UpdateItem: set Replies = Replies + :num

Game logging
Storing time series data

Time series tables
Events_table_2015_April
Event_id
(Hash key)
Timestamp
(range key)
Attribute1 …. Attribute N
Events_table_2015_March
Event_id
(Hash key)
Timestamp
(range key)
Events_table_2015_Feburary
Event_id
(Hash key)
Timestamp
(range key)
Events_table_2015_January
Event_id
(Hash key)
Timestamp
(range key)
RCUs = 1000
WCUs = 100
RCUs = 10000
WCUs = 10000
RCUs = 100
WCUs = 1
RCUs = 10
WCUs = 1
Current table
Older tables
Hot dataCold data
Don’t mix hot and cold data; archive cold data to Amazon S3

Important when:
Use a table per time period
§Pre-create daily, weekly, monthly tables
§Provision required throughput for current table
§Writes go to the current table
§Turn off (or reduce) throughput for older tables
Dealing with time series data

Item shop catalog
Popular items (read)

Partition 1
2000 RCUs
Partition K
2000 RCUs
Partition M
2000 RCUs
Partition 50
2000 RCU
Scaling bottlenecks
Product A Product B
Gamers
ItemShopCatalog Table
SELECT Id, Description, ...
FROM ItemShopCatalog

Requests Per Second
Item Primary Key
Request Distribution Per Hash Key
DynamoDB Requests

Partition 1 Partition 2
ItemShopCatalog Table
User
DynamoDB
User
Cache
popular items
SELECT Id, Description, ...
FROM ProductCatalog

Requests Per Second
Item Primary Key
Request Distribution Per Hash Key
DynamoDB Requests Cache Hits

Multiplayer online gaming
Query filters vs.
composite key indexes

GameId Date Host Opponent Status
d9bl3 2014-10-02 David Alice DONE
72f49 2014-09-30 Alice Bob PENDING
o2pnb 2014-10-08 Bob Carol IN_PROGRESS
b932s 2014-10-03 Carol Bob PENDING
ef9ca 2014-10-03 David Bob IN_PROGRESS
Games Table
Multiplayer online game data
Hash key

Query for incoming game requests
§DynamoDB indexes provide hash and range
§What about queries for two equalities and a range?
SELECT * FROM Game
WHERE Opponent='Bob‘
AND Status=‘PENDING'
ORDER BY Date DESC
(hash)
(range)
(?)

Secondary Index
Opponent Date GameId Status Host
Alice 2014-10-02 d9bl3 DONE David
Carol 2014-10-08 o2pnb IN_PROGRESS Bob
Bob 2014-09-30 72f49 PENDING Alice
Bob 2014-10-03 b932s PENDING Carol
Bob 2014-10-03 ef9ca IN_PROGRESS David
Approach 1: Query filter
BobHash key Range key

Secondary Index
Approach 1: query filter
Bob
Opponent Date GameId Status Host
Alice 2014-10-02 d9bl3 DONE David
Carol 2014-10-08 o2pnb IN_PROGRESS Bob
Bob 2014-09-30 72f49 PENDING Alice
Bob 2014-10-03 b932s PENDING Carol
Bob 2014-10-03 ef9ca IN_PROGRESS David
SELECT * FROM Game
WHERE Opponent='Bob'
ORDER BY Date DESC
FILTER ON Status='PENDING'
(filtered out)

Important when:
Use query filter
§Send back less data “on the wire”
§Simplify application code
§Simple SQL-like expressions
§ AND, OR, NOT, ()
Your index isn’t entirely selective

Approach 2: composite key
StatusDate
DONE_2014-10-02
IN_PROGRESS_2014-10-08
IN_PROGRESS_2014-10-03
PENDING_2014-09-30
PENDING_2014-10-03
Status
DONE
IN_PROGRESS
IN_PROGRESS
PENDING
PENDING
Date
2014-10-02
2014-10-08
2014-10-03
2014-10-03
2014-09-30
+ =

Secondary Index
Opponent StatusDate GameId Host
Alice DONE_2014-10-02 d9bl3 David
Carol IN_PROGRESS_2014-10-08 o2pnb Bob
Bob IN_PROGRESS_2014-10-03 ef9ca David
Bob PENDING_2014-09-30 72f49 Alice
Bob PENDING_2014-10-03 b932s Carol
Hash key Range key

Opponent StatusDate GameId Host
Alice DONE_2014-10-02 d9bl3 David
Carol IN_PROGRESS_2014-10-08 o2pnb Bob
Bob IN_PROGRESS_2014-10-03 ef9ca David
Bob PENDING_2014-09-30 72f49 Alice
Bob PENDING_2014-10-03 b932s Carol
Secondary Index
Bob
SELECT * FROM Game
WHERE Opponent='Bob'
AND StatusDate BEGINS_WITH 'PENDING'

Needle in a sorted haystack
Bob

Sparse indexes
Id
(Hash)
User Game Score Date Award
1 Bob G1 1300 2012-12-23
2 Bob G1 1450 2012-12-23
3 Jay G1 1600 2012-12-24
4 Mary G1 2000 2012-10-24 Champ
5 Ryan G2 123 2012-03-10
6 Jones G2 345 2012-03-20
Game-scores-table
Award
(Hash)
Id User Score
Champ 4 Mary 2000
Award-GSI
Scan sparse hash GSIs

Important when:
Replace filter with indexes
§Concatenate attributes to form useful
§secondary index keys
§Take advantage of sparse indexes
You want to optimize a query as much
as possible
Status + Date

Nexon Korea, 박성원 파트장
넥슨 모바일 게임 관련 데이터베이스 업무를 담당하고 있습니다.

DDB 사용현황
§HIT 메인 게임DB
• 27 Tables
• Various Alarms
• Flexible Dashboards

DDB, 이런 분에게 추천합니다.
§Capacity 산정
§Monitoring
§백업 및 복원
§Tips
https://i.ytimg.com/vi/C6U2M7ZkZI8/maxresdefault.jpg

Capacity 산정
§ 서비스 전 읽기/쓰기에 대한 패턴 테스트를 꼭 수행하세요.
§ 모든 패턴이 읽혀지진 않지만, 병목의 중심이 되는 대략적인 내용은
파악할 수 있습니다.
§ 동시 사용자 증가 속도를 꼭 염두에 두셔야 합니다. 대응하려는
순간 이미 서비스 불가일 수 있습니다.
Ø 2분 안에 800,000 증가
Ø 초당 RCUs 변환 시 13,000 증가
시간
읽
기
호
출
수

Capacity 변경
§웹콘솔, AWS앱, CLI 등으로 쉽게 Capacity 변경이
가능합니다.
§다만, 즉시 적용은 되지 않습니다. AWS에도 작업 시간을
주세요.
§대량의 RCUs, WCUs 변경이 필요한 경우는 미리
준비하는 것이 좋습니다.

Capacity 모니터링
§Cloud Watch 를 활용하세요.
§대시보드 및 알람 기능이 매우 유용합니다.
§초 단위 데이터까지 보기엔 어렵습니다. (1분 단위 Sum)

백업 및 복원
§ Export / Import 개념입니다.
§ Data Pipeline 을 사용하세요.
§ CLI 에 친숙해 지세요. 반복적인 작업을 쉽게 구현할 수
있습니다.

소소한 Tip
§Read / Write 캐시 레이어 도입을 고려하세요.
• Retry 시 지수 백오프 알고리즘을 도입하세요.
• Data 용량계획을 고려하세요.
http://cfile30.uf.tistory.com/image/221DF544538C11D62866E9

DDB 를 사용해보니…
§전통적인 RDB와 비교하여 손댈 것이 거의 없습니다.
§AWS Document 에 근거하여 개발하셔야 합니다.
§운영 시 Cloud Watch 및 CLI 가 많은 도움이 됩니다.

게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming

More Related Content

What's hot (20)

Viewers also liked (20)

Similar to 게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming (12)

More from Amazon Web Services Korea (20)

Recently uploaded (20)

게임을 위한 DynamoDB 사례 및 팁 - 김일호 솔루션즈 아키텍트:: AWS Cloud Track 3 Gaming