Countdown to Zero
Counter Use Cases in Aerospike
2 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc.
Welcome to Aerospike User Group Israel #3
• I’m Ronen Botzer, and I’m a solutions architect at Aerospike.
• I’ve worked at Aerospike since June 2014, first on the Python and PHP clients.
• I’m active on StackOverflow and the Aerospike community forum.
• This is the third in a series of tech talks about data modeling.
• Check out the slides from the previous ASUG Israel meetups.
Agenda
• A Quick Overview of Aerospike Data Model and Storage
• Counter Use Cases
    • Data Capping – Telecom
    • Frequency Capping – Ad Tech
    • Consolidating Counters
    • Mitigating Hot Counters
• Strong Consistency
    • Counters
    • Ticket Inventory – Countdown to Zero
• Summary
Aerospike is a Primary Key Database
• Objects stored in Aerospike are called records.
• Every record is uniquely identified by the 3-tuple (namespace, set, user-key).
• A record contains one or more bins.
• A bin holds the value of a supported data type: integer, double, string, bytes, list, map, geospatial.
Record layout:
(namespace, set, user-key) -> RECORD: EXP | LUT | GEN | BIN1 | BIN2
EXP – Expiration Timestamp
LUT – Last Update Time
GEN – Generation
Aerospike Concepts
• Aerospike is a row-oriented distributed database
• Rows (records) contain one or more columns (bins)
• Similar to an RDBMS with primary-key table lookups
• Single record transactions
• Namespaces can be configured for strong consistency

Aerospike    RDBMS
Namespace    Tablespace or Database
Set          Table
Record       Row
Bin          Column

Bin type
• Integer
• Double
• String
• Bytes
• List (Unordered, Ordered)
• Map (Unordered, K-Ordered, KV-Ordered)
• GeoJSON
Numeric Data Types in Aerospike
• Integer
    • 8B of storage (64-bit)
    • In places where signed values are needed (secondary indexes) the integer is signed in the range of –(2^63) to (2^63) – 1.
• Double
    • 64-bit IEEE-754
• Both Integer and Double values can be used for counters.
    • Support the atomic increment() operation.
    • Can be stored in special data-in-index namespaces.
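A minimal sketch of what those two numeric ranges mean for a counter. It is plain Python and does not touch the Aerospike client; the names `INT64_MIN`, `INT64_MAX` and `fits_int64` are made up for this illustration. It shows the signed 64-bit range quoted above, and why a double-based counter stops registering +1 increments once it reaches 2^53 (an IEEE-754 double has a 53-bit significand).

```python
# Signed 64-bit range quoted on the slide.
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def fits_int64(value: int) -> bool:
    """Return True if value is representable as a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


# A double represents every integer exactly only up to 2**53.
# Past that point, adding 1.0 rounds back to the same value.
largest_exact = float(2 ** 53)
double_saturated = (largest_exact + 1.0 == largest_exact)
```

For counters that only ever move in whole steps, an integer bin avoids this saturation entirely.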
Namespace Storage
Data Capping
namespace users {
    memory-size 80G
    replication-factor 2
    prefer-uniform-balance true
    partition-tree-sprigs 8192
    storage-engine device {
        write-block-size 128K
        data-in-memory false
        read-page-cache true
        post-write-queue 1024
        device /dev/nvme1n1p1
        device /dev/nvme1n1p2
        device /dev/nvme1n1p3
        device /dev/nvme2n1p1
        device /dev/nvme2n1p2
        device /dev/nvme2n1p3
    }
}
Data Capping
('users', 'mobile', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2')
--
name: 'Jasper Madison Jr.'
age: 27
payment-method: [
    {'type': 'visa', 'last4': 6164, 'expires': '2019-09'},
    {'type': 'mastercard', 'last4': 7147, 'expires': '2023-03'}
]
number: [ 1, 408, 5551212 ]
data: 10711339520
• Atomically increment() the integer value of data to implement a counter
• What's the problem with this approach?
Optimizing Counters – Namespace Considerations
• Think about the access patterns.
• The naive implementation has the user's data stored in a single record.
    • Minimizes the amount of memory used by this namespace.
    • Reduces the number of reads.
• Problem: the data counter updates frequently, and every update rewrites the whole record on the device.
• Solution: split the data counter into a separate in-memory namespace.
    • Consider using the data-in-index optimization.
Optimizing Counters – Namespace Considerations
namespace counters {
    memory-size 20G
    default-ttl 1d
    replication-factor 2
    partition-tree-sprigs 8192
    prefer-uniform-balance true
    single-bin true
    data-in-index true
    storage-engine device {
        data-in-memory true
        filesize 40G
        write-block-size 256K
        file /opt/aerospike/data/counters.dat
    }
}
Data Capping – Counter Using Data in Index
('counters', 'mobdata', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|2019-02')
--
: 10711339520
• Initialize this record with an explicit 0 value each month
• On all writes (each increment) set the TTL to -1, i.e. NEVER_EXPIRE
• Combine the increment with a read. After every update the app knows the latest value
• Archive the counter after rolling to the new month
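A small sketch of how the monthly compound key could be built, assuming the `userID|YYYY-MM` layout shown in the record above. The helper name `monthly_counter_key` and its default arguments are hypothetical. It is plain Python that only builds the key tuple, so rolling to a new month is just a matter of passing a later date.

```python
from datetime import date


def monthly_counter_key(user_id: str, day: date,
                        namespace: str = "counters",
                        set_name: str = "mobdata") -> tuple:
    """Build the (namespace, set, user-key) tuple for a user's monthly data counter."""
    # The user-key is the user ID and the year-month, joined by '|'.
    return (namespace, set_name, f"{user_id}|{day:%Y-%m}")


key = monthly_counter_key("cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2", date(2019, 2, 14))
```

Any date inside February 2019 maps to the same record, and the first date in March produces a fresh key that the app would initialize to 0.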
Data Capping – Increment and Read in a single Transaction
import aerospike
from aerospike_helpers.operations import operations as oh

# Assumes `client` is an already connected Aerospike client.
key = ('counters', 'mobdata', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|2019-02')
ops = [
    oh.increment('', 1500),  # add 1500 to the single, unnamed bin
    oh.read('')              # return the updated value in the same transaction
]
ttl = aerospike.TTL_NEVER_EXPIRE  # AKA ttl -1
(key, meta, bins) = client.operate(key, ops, {'ttl': ttl}, {'timeout': 50})
Frequency Capping
('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05')
--
: 3
• Keep track of ads served to a user in a given day.
• Compound key of userID | adID | day
• Increment should work as an upsert.
• Initial TTL of 24 hours (use the default TTL of 1 day).
• Do not update the TTL if the record exists, by setting it to -2.
Frequency Capping
import aerospike
from aerospike import exception as e

# Assumes `client` is a connected client and AD_LIMIT is the per-day cap.
key = ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05')
try:
    (key, meta, bins) = client.get(key, {'timeout': 20})
    if bins[''] < AD_LIMIT:
        client.increment(key, '', 1, {'ttl': aerospike.TTL_DONT_UPDATE})  # aka ttl -2
        # continue and attempt to serve the ad
except e.RecordNotFound:
    # First ad of the day: create the counter with the namespace's default TTL.
    ttl = aerospike.TTL_NAMESPACE_DEFAULT  # AKA ttl 0, inherit the default-ttl
    client.put(key, {'': 1}, {'ttl': ttl},
               policy={'exists': aerospike.POLICY_EXISTS_CREATE})
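The TTL behaviour is the subtle part of this pattern, so here is an in-memory sketch of it. It does not use the Aerospike client: `FrequencyCapModel`, `try_serve` and the explicit `now` argument are invented for illustration, and `DEFAULT_TTL` stands in for the namespace's `default-ttl 1d`. The first serve creates the counter with the default TTL; later increments leave the expiry untouched, which is what ttl -2 does.

```python
DEFAULT_TTL = 24 * 60 * 60  # seconds; mirrors default-ttl 1d in the namespace


class FrequencyCapModel:
    """In-memory stand-in for the adcap set, keyed by userID|adID|day."""

    def __init__(self, limit):
        self.limit = limit
        self.records = {}  # user-key -> [count, expires_at]

    def try_serve(self, key, now):
        """Return True if the ad may be served, counting the impression."""
        record = self.records.get(key)
        if record is not None and record[1] <= now:
            # The record expired: behave as if it was never there.
            del self.records[key]
            record = None
        if record is None:
            # Create with the namespace default TTL (ttl 0).
            self.records[key] = [1, now + DEFAULT_TTL]
            return True
        if record[0] < self.limit:
            # Increment without touching the expiry (ttl -2).
            record[0] += 1
            return True
        return False
```

Because the expiry is fixed at creation time, the cap window cannot be extended by a steady stream of impressions.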
Frequency Capping – Consolidating Counters
('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|cmp123|2019-03-05')
--
: {
    'adx123': 2,
    'adx456': 3
}
• Keep track of all campaign ads served to a user in a given day
• Compound key of userID | campaignID | day
• Map increment should work as an upsert, returning the current count
• Initial TTL of 24 hours (default TTL is 1 day)
• Do not update the TTL if the record exists, by setting it to -2.
• Only numeric data works with data-in-index. Use data-in-memory/single-bin.
Frequency Capping – Consolidating Counters
import aerospike
from aerospike import exception as e
from aerospike_helpers.operations import map_operations as mh

# Assumes `client` is a connected client and AD_LIMIT is the per-day cap.
key = ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|cmp123|2019-03-05')
try:
    val = client.map_get_by_key(key, '', 'adx123', aerospike.MAP_RETURN_VALUE)
    if val < AD_LIMIT:
        # continue and attempt to serve the ad
        client.map_increment(key, '', 'adx123', 1, {}, {'ttl': aerospike.TTL_DONT_UPDATE})
except e.RecordNotFound as err:
    # First ad of the day for this campaign: create the map and read the count back.
    ops = [
        mh.map_put('', 'adx123', 1, {}),
        mh.map_get_by_key('', 'adx123', aerospike.MAP_RETURN_VALUE)
    ]
    ttl = aerospike.TTL_NAMESPACE_DEFAULT  # AKA ttl 0, inherit the default-ttl
    (key, meta, bins) = client.operate(key, ops, {'ttl': ttl})
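To make the consolidated layout concrete, here is a plain-Python sketch with a dict standing in for the map bin. The helper names `campaign_key` and `serve_if_under_cap` are hypothetical and nothing here calls the Aerospike client. It shows the `userID|campaignID|day` compound key and the upsert-style increment of one ad's count inside the campaign's map.

```python
from datetime import date


def campaign_key(user_id: str, campaign_id: str, day: date) -> tuple:
    """Build the key of the per-user, per-campaign, per-day counter record."""
    return ("counters", "adcap", f"{user_id}|{campaign_id}|{day.isoformat()}")


def serve_if_under_cap(ad_counts: dict, ad_id: str, limit: int) -> bool:
    """Increment ad_id's count as an upsert unless it already reached the limit."""
    current = ad_counts.get(ad_id, 0)  # a missing map key counts as zero
    if current >= limit:
        return False
    ad_counts[ad_id] = current + 1
    return True
```

One record now answers "how often did this user see each ad of the campaign today", at the cost of giving up the numeric-only data-in-index optimization.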
Hot Counters
Solution:
• Identify hot counters with a key busy (error code 14) exception.
• Shard into several records <key||1> .. <key||N>. These will distribute to different nodes.
Reading the counter:
• How do you know if this counter is sharded? EAFP or check ahead?
• EAFP approach:
    • Overwrite the original counter with a string value. Assume no record is sharded and increment as usual.
    • Catch any bin incompatible (error code 12) exception. It signals that the counter is sharded.
• Check-ahead approach:
    • Check for the existence of a shard key with exists().
• If this counter is sharded, use a batch-read to fetch all the shards, and combine in the app.
Hot Counters
('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05') -- : 🚀 U+1F680
And
('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05||1') -- : 14
('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05||2') -- : 11
('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05||3') -- : 9
('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05||4') -- : 12
('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05||5') -- : 10
Either try to increment the original record and fail with an incompatible bin error (code 12), or check for the existence of a shard key. If either shows the counter is sharded, batch-read the shard keys and sum them.
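The sharding scheme can be modeled in a few lines. This is an in-memory sketch, not client code: `HotCounterModel`, `shard_keys` and the `MARKER` string are invented for illustration (the slide uses a rocket emoji as the marker), and the `isinstance` check stands in for catching the bin incompatible error (code 12). Moving the pre-shard value into shard 1 is an assumption of this sketch.

```python
import random

SHARD_SEP = "||"


def shard_keys(base_key: str, shards: int) -> list:
    """Return the user-keys <key||1> .. <key||N> of a sharded counter."""
    return [f"{base_key}{SHARD_SEP}{i}" for i in range(1, shards + 1)]


class HotCounterModel:
    MARKER = "sharded"  # any string value works as the "this is sharded" flag

    def __init__(self, shards=5, seed=None):
        self.shards = shards
        self.store = {}  # user-key -> int counter, or MARKER on the original key
        self.rng = random.Random(seed)

    def shard(self, base_key):
        """Split a hot counter: carry its value into shard 1, mark the original."""
        keys = shard_keys(base_key, self.shards)
        current = self.store.get(base_key, 0)
        for k in keys:
            self.store[k] = 0
        self.store[keys[0]] = current
        self.store[base_key] = self.MARKER

    def increment(self, base_key, by=1):
        value = self.store.get(base_key, 0)
        if isinstance(value, str):
            # EAFP path: the increment on the original would fail (code 12),
            # so pick one shard at random and increment that instead.
            key = self.rng.choice(shard_keys(base_key, self.shards))
            self.store[key] += by
        else:
            self.store[base_key] = value + by

    def read(self, base_key):
        value = self.store.get(base_key, 0)
        if isinstance(value, str):
            # Models the batch-read of all shards, combined in the app.
            return sum(self.store[k] for k in shard_keys(base_key, self.shards))
        return value
```

Spreading increments at random keeps any single shard record from becoming the new hot key, while reads pay for one batch call.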
Strong Consistency Mode
• In an AP system you have no guarantee that a write succeeded or failed on both the master and replica, even when the cluster is stable (timeouts). It is best effort.
    • Stale reads can happen from the replica before the transaction completes.
    • Dirty reads can follow if the write was applied to one or both copies and a timeout occurred.
• An AP system will lose writes during network partitions.
• Aerospike 4.0 passed Jepsen testing of Strong Consistency (as did MongoDB, CockroachDB).
• In a stable Aerospike 4 cluster with RF=2 the performance is the same as AP mode.
• If you want to be sure that your counters are accurate, you should use a namespace that is defined as strong-consistency true.
Counters in Strong Consistency Mode
(Integer counter, Unordered Unique transactionIDs [ 1, 4, 7, 3, 9 ])
• Generate a transaction ID for each call. For example, combine 4 bytes identifying the client (process ID) with a 4-byte transaction counter kept in the process.
1. In one transaction:
    a. Increment the counter value atomically.
    b. list-prepend the transaction ID while declaring the list should be unique.
    c. Trim the list to the N most recent transactions (by index).
    d. Return the value of the counter.
2. If it succeeded then we’re done.
3. If it fails on a uniqueness violation for the transaction ID (error code 24, ElementExistsError), then we’re done. This transaction already happened.
4. If the transaction is ‘inDoubt’, go to step 1 and retry with the same transaction ID.
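The retry logic above can be sketched without a cluster. Everything below is an in-memory model with invented names (`next_txn_id`, `CounterRecordModel`, `increment_once`), and `ElementExists` stands in for the client's uniqueness error (code 24). The ID layout, process ID in the high 4 bytes and a local sequence in the low 4 bytes, is one possible reading of the slide. The model checks uniqueness before changing anything so that a failed call leaves the record untouched, mirroring a single all-or-nothing transaction.

```python
import os
from itertools import count

_txn_seq = count(1)  # per-process transaction counter


def next_txn_id(pid=None):
    """Pack a 4-byte process ID and a 4-byte sequence number into one integer."""
    pid = os.getpid() if pid is None else pid
    return ((pid & 0xFFFFFFFF) << 32) | (next(_txn_seq) & 0xFFFFFFFF)


class ElementExists(Exception):
    """Stand-in for the uniqueness violation (error code 24)."""


class CounterRecordModel:
    def __init__(self, keep=5):
        self.counter = 0
        self.txn_ids = []  # most recent first, unique
        self.keep = keep   # N: how many recent transaction IDs to retain

    def apply(self, txn_id, by=1):
        """Steps 1a to 1d as one all-or-nothing operation."""
        if txn_id in self.txn_ids:
            raise ElementExists(txn_id)
        self.counter += by               # a. increment
        self.txn_ids.insert(0, txn_id)   # b. prepend the unique ID
        del self.txn_ids[self.keep:]     # c. trim to the N most recent
        return self.counter              # d. return the counter


def increment_once(record, txn_id, by=1):
    """Retry-safe increment: True if applied now, False if applied by an earlier attempt."""
    try:
        record.apply(txn_id, by)
        return True
    except ElementExists:
        return False
```

A caller that gets an in-doubt result simply calls `increment_once` again with the same ID; the counter moves at most once, as long as the ID is still within the N retained entries.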
Ticket Inventory – Countdown to Zero
Shows (Integer seats , Unordered Unique carts { c1, c2, c3, c4, … })
Cart: cartID => [ttl, status, quantity]
• This application handles ticket reservations for shows.
• A key identifies a specific show's inventory.
• Assume there is no assigned seating.
• The application should never overbook a show by selling more tickets than are available.
• The counter is the pair of seats remaining and shopping carts.
• When a show becomes available for booking, it is initialized with a number of seats.
• This example relies on Strong Consistency.
Ticket Inventory – Countdown to Zero
Shows (Integer seats , Unordered Unique carts { c1, c2, c3, c4, … })
Cart: cartID => [ttl, status, quantity]
• Assuming that the most recently known seats quantity for the show was greater than zero, you’d reserve seats with the following transaction:
1. Decrement the seats value by the cart quantity.
2. map-set carts with the cart with aerospike.MAP_WRITE_FLAGS_CREATE_ONLY
3. Return the value of seats.
• If the reservation was 'inDoubt', repeat it. Go to the next step on success or on ElementExistsError (error code 24), which means an earlier attempt was already applied.
Ticket Inventory – Countdown to Zero
Shows (Integer seats , Unordered Unique carts { c1, c2, c3, c4, … })
Cart: cartID => [ttl, status, quantity]
• If the returned value is below zero, roll back with the transaction:
1. map-remove-by-key the cart ID from carts.
2. Increment seats by the cart quantity.
• If the rollback was 'inDoubt', check if the cart exists. If it does, repeat.
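The reserve and roll-back transactions can be sketched as an in-memory model. None of this is client code: `ShowModel`, `try_reserve`, the `"reserved"` status string and the `ElementExists` exception (standing in for error code 24) are assumptions made for illustration. Each method represents one single-record transaction in a strong-consistency namespace.

```python
class ElementExists(Exception):
    """Stand-in for the create-only violation (error code 24)."""


class ShowModel:
    def __init__(self, seats):
        self.seats = seats
        self.carts = {}  # cartID -> [ttl, status, quantity]

    def reserve(self, cart_id, quantity, ttl):
        """One transaction: decrement seats, create-only map put, return seats."""
        if cart_id in self.carts:
            # Nothing is modified when the create-only put would fail.
            raise ElementExists(cart_id)
        self.seats -= quantity
        self.carts[cart_id] = [ttl, "reserved", quantity]
        return self.seats

    def rollback(self, cart_id):
        """One transaction: remove the cart, add its quantity back to seats."""
        _ttl, _status, quantity = self.carts.pop(cart_id)
        self.seats += quantity
        return self.seats


def try_reserve(show, cart_id, quantity, ttl):
    """Reserve seats for a cart; safe to call again after an in-doubt result."""
    try:
        remaining = show.reserve(cart_id, quantity, ttl)
    except ElementExists:
        # An earlier attempt was applied; read seats instead of decrementing twice.
        remaining = show.seats
    if remaining < 0:
        # Overbooked: undo this cart's reservation.
        show.rollback(cart_id)
        return False
    return True
```

The seats value may dip below zero for a moment, but no cart is ever confirmed while it is negative, which is what keeps the show from being overbooked.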
Ticket Inventory – Countdown to Zero
Shows (Integer seats , Unordered Unique carts { c1, c2, c3, c4, … })
Cart: cartID => [ttl, status, quantity]
• If an item is removed from a cart, or the entire cart is explicitly dumped, you’d read the value of cartID in one map-get from the carts bin, then create a transaction to
1. map-remove-by-key the cartID from carts.
2. Increment seats by the cart’s quantity.
• Periodically check for abandoned carts using a map-get-by-value( [ttl, WILDCARD] ) and then treat each matched cart as if it was being explicitly removed.
• When a cart has been fully checked out, remove the map entry from carts, because the inventory reservation is now permanent.
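A rough in-memory sketch of cart removal, the abandoned-cart sweep and checkout. The function names and the dict layout are invented for illustration. One loud assumption: the sweep treats the cart's `ttl` field as an expiry timestamp and matches every cart with `ttl <= now`, whereas the slide matches carts by value with a wildcard; the two are not identical queries.

```python
def release_cart(show, cart_id):
    """One transaction: remove the cart and return its seats to inventory."""
    _ttl, _status, quantity = show["carts"].pop(cart_id)
    show["seats"] += quantity


def sweep_abandoned(show, now):
    """Release every cart whose ttl has passed; return the released cart IDs."""
    expired = [cid for cid, (ttl, _status, _qty) in show["carts"].items()
               if ttl <= now]
    for cid in expired:
        release_cart(show, cid)
    return expired


def checkout(show, cart_id):
    """A checked-out cart leaves the map; its seats stay sold."""
    show["carts"].pop(cart_id)
```

Checkout deliberately leaves `seats` alone: the decrement made at reservation time becomes the permanent sale.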
Summary
• We talked about the numeric data types in Aerospike.
• We modeled counters in Aerospike using different approaches.
• We talked about Strong Consistency mode.
• We discussed using Strong Consistency for implementing accurate counters.
What next?
• Take a look at the slides from the previous two Israeli ASUG meetups.
• Go to GitHub; clone the code samples repo; run it; read the code.
• Read the Aerospike blog. Get familiar with all the database features.
• Participate in the community forum (https://discuss.aerospike.com) and StackOverflow’s aerospike tag.
More material you can explore:
Reference
• https://www.aerospike.com/docs/guide/data-types.html
• https://www.aerospike.com/docs/architecture/primary-index.html#single-bin-optimization
• https://www.aerospike.com/blog/aerospike-4-strong-consistency-and-jepsen/
• https://jepsen.io/consistency
• https://discuss.aerospike.com/t/handling-timeout-in-case-of-counter-bin/5196/4
• https://www.aerospike.com/docs/architecture/consistency.html#strong-consistency-mode
• https://www.aerospike.com/docs/guide/consistency.html#indoubt-errors
Code Samples
• https://github.com/rbotzer/aerospike-cdt-examples
Aerospike Training
• https://www.aerospike.com/training/
• https://academy.aerospike.com/
Thank You!
Any questions?
ronen@aerospike.com


Countdown to Zero - Counter Use Cases in Aerospike

  • 1. Countdown to Zero Counter Use Cases in Aerospike
  • 2. 2 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc.  I’m Ronen Botzer, and I’m a solutions architect at Aerospike.  I’ve worked at Aerospike since June 2014, first on the Python and PHP clients.  I’m active on StackOverflow and the Aerospike community forum.  This is the third in a series of tech talks about data modeling.  Check out the slides from the previous ASUG Israel meetups. Welcome to Aerospike User Group Israel #3
  • 3. 3 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc.  A Quick Overview of Aerospike Data Model and Storage  Counter Use Cases  Data Capping – Telecom  Frequency Capping – Ad Tech  Consolidating Counters  Mitigating Hot Counters  Strong Consistency  Counters  Ticket Inventory – Countdown to Zero  Summary Agenda
  • 4. 4 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. Aerospike is a Primary Key Database Objects stored in Aerospike are called records A bin holds the value of a supported data type: integer, double, string, bytes, list, map, geospatial Every record is uniquely identified by the 3-tuple (namespace, set, user-key) A record contains one or more bins (namespace, set, user-key) EXP – Expiration Timestamp LUT – Last Update Time GEN – Generation RECORD EXP LUT GEN BIN1 BIN2
  • 5. 5 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc.  Aerospike is a row-oriented distributed database  Rows (records) contain one or more columns (bins)  Similar to an RDBMS with primary-key table lookups  Single record transactions  Namespaces can be configured for strong consistency Aerospike Concepts Aerospike RDBMS Namespace Tablespace or Database Set Table Record Row Bin Column Bin type Integer Double String Bytes List (Unordered, Ordered) Map (Unordered, K-Ordered, KV-Ordered) GeoJSON
  • 6. 6 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc.  Integer  8B of storage (64-bit)  In places where signed values are needed (secondary indexes) the integer is signed in the range of –(2^63) to (2^63) – 1.  Double  64-bit IEEE-754  Both Integer and Double values can be used for counters.  Support the atomic increment() operation.  Can be stored in special data-in-index namespaces. Numeric Data Types in Aerospike
  • 7. 7 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. Namespace Storage
  • 8. 8 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. Data Capping namespace users { memory-size 80G replication-factor 2 prefer-uniform-balance true partition-tree-sprigs 8192 storage-engine device { write-block-size 128K data-in-memory false read-page-cache true post-write-queue 1024 write-block-size 128K device /dev/nvme1n1p1 device /dev/nvme1n1p2 device /dev/nvme1n1p3 device /dev/nvme2n1p1 device /dev/nvme2n1p2 device /dev/nvme2n1p3 } }
  • 9. 9 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. ('users', 'mobile', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2') -- name: 'Jasper Madison Jr.' age: 27 payment-method: [ { 'type': 'visa', 'last4’: 6164, 'expires': '2019-09'}, { 'type': 'mastercard', 'last4': 7147, 'expires': '2023-03'} ] number: [ 1, 408, 5551212 ] data: 10711339520  Atomically increment() the integer value of data to implement a counter  What's the problem with this approach? Data Capping
  • 10. 10 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc.  Think about the access patterns.  The naive implementation has the user's data stored in a single record.  Minimizes the amount of memory used by this namespace.  Reduces the number of reads.  Problem: the data counter updates frequently.  Solution: split the data counter into a separate in-memory namespace.  Consider using the data-in-index optimization. Optimizing Counters – Namespace Considerations
  • 11. 11 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. Optimizing Counters – Namespace Considerations namespace counters { memory-size 20G default-ttl 1d replication-factor 2 partition-tree-sprigs 8192 prefer-uniform-balance true single-bin true data-in-index true storage-engine device { data-in-memory true filesize 40G write-block-size 256K file /opt/aerospike/data/counters.dat } }
  • 12. 12 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. ('counters', 'mobdata', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|2019-02') -- : 10711339520 • Initialize this record with an explicit 0 value each month • On all writes (each increment) set the TTL to -1, i.e. NEVER_EXPIRE • Combine the increment with a read. After every update the app knows the latest value • Archive the counter after rolling to the new month Data Capping – Counter Using Data in Index
  • 13. 13 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. from aerospike_helpers.operations import operations as oh key = ('counters', mobdata', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|2019-02') ops = [ oh.increment('', 1500), oh.read('') ] ttl = aerospike.TTL_NEVER_EXPIRE # AKA ttl -1 (key, meta, bins) = client.operate(key, ops, {'ttl': ttl}, {'timeout': 50}) Data Capping – Increment and Read in a single Transaction
  • 14. 14 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05') -- : 3  Keep track of ads served to a user in a given day.  Compound key of userID | adID | day  Increment should work as an upsert.  Initial TTL of 24 hours (use the default TTL of 1 day).  Do not update the TTL if the record exists, by setting it to -2. Frequency Capping
  • 15. 15 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. Frequency Capping from aerospike import exception as e key = ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05') try: (key, meta, bins) = client.get(key, {'timeout': 20}) if bins[''] < AD_LIMIT: client.increment(key, '', 1, {'ttl': aerospike.TTL_DONT_UPDATE}) # aka ttl -2 # continue and attempt to serve the ad except e.RecordNotFound: ttl = aerospike.TTL_NAMESPACE_DEFAULT # AKA ttl 0, inherit the default-ttl client.put(key, '', 1, {'ttl': ttl}, policy={'exists': aerospike.POLICY_EXISTS_CREATE})
  • 16. 16 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|cmp123|2019-03-05') -- : { 'adx123': 2, 'adx456': 3 }  Keep track of all campaign ads served to a user in a given day  Compound key of userID | campaignID | day  Map increment should work as an upsert, returning the current count  Initial TTL of 24 hours (default TTL is 1 day)  Do not update the TTL if the record exists, by setting it to -2.  Only numeric data works with data-in-index. Use data-in-memory/single-bin. Frequency Capping – Consolidating Counters
  • 17. 17 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. Frequency Capping – Consolidating Counters key = ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|cmp123|2019-03-05') try: val = client.map_get_by_key(key, 'ads', 'adx123', aerospike.MAP_RETURN_VALUE) if val < AD_LIMIT: # continue and attempt to serve the ad client.map_increment(key, '', 'adx123', 1, {}, {'ttl': aerospike.TTL_DONT_UPDATE}) except e.RecordNotFound as err: try: ops = [ mh.map_put('', 'adx123', 1, {}), mh.map_get_by_key('', 'adx123', aerospike.MAP_RETURN_VALUE) ] ttl = aerospike.TTL_NAMESPACE_DEFAULT # AKA ttl 0, inherit the default-ttl (key, meta, bins) = client.operate(key, ops, {'ttl': ttl})
  • 18. 18 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. Solution:  Identify hot counters with a key busy (error code 14) exception.  Shard into several records <key||1> .. <key||N>. These will distribute to different nodes. Reading the counter:  How do you know if this counter is sharded? EAFP or Check ahead?  EAFP approach:  Overload the counter with a string. Assume all records aren't sharded.  Catch any bin incompatible (error code 12) exception.  Check-ahead approach:  Check for the existence of a shard key with exists().  If this counter is sharded, use a batch-read to fetch all the shards, and combine in the app. Hot Counters
  • 19. 19 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05') -- : 🚀 U+1F680 And ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05||1') -- : 14 ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05||2') -- : 11 ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05||3') -- : 9 ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05||4') -- : 12 ('counters', 'adcap', 'cf296d9a-0b77-4dd0-8d2b-91e59a6f02d2|adx123|2019-03-05||5') -- : 10 Either try to increment the original record and fail with an incompatible bin error (code 12), or check for the existence of a shard key. One of those being true should lead to batch read the shard keys. Hot Counters
  • 20. 20 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc.  In an AP system you have no guarantee that a write succeeded or failed on both the master and replica, even when the cluster is stable (timeouts). It is best effort.  Stale reads can happen from the slave before the transaction completes.  Subsequent dirty reads if the write happens to one or both and a timeout occurs.  An AP system will lose writes during network partitions.  Aerospike 4.0 passed Jepsen testing of Strong Consistency (as did MongoDB, CockroachDB).  In a stable Aerospike 4 cluster with RF=2 the performance is the same as AP mode.  If you want to be sure that your counters are accurate, you should use a namespace that is defined as strong-consistency true. Strong Consistency Mode
  • 21. 21 Proprietary & Confidential | All rights reserved. © 2018 Aerospike Inc. (Integer counter , Unordered Unique transactionIDs [ 1, 4, 7, 3, 9]) • Generate a transaction ID for each call. For example, use 4B for the client (process ID) masked with a 4 byte transaction counter in the process. 1. In one transaction a. Increment the counter value atomically. b. list-prepend the transaction ID while declaring the list should be unique. c. Trim the list to N recent transactions (by index). d. Return the value of the counter. 2. If it succeeded then we’re done. 3. If it fails on a uniqueness violation for the transaction ID, then we’re done. This transaction already happened (error code 24 ElementExistsError). 4. If the transaction is ‘inDoubt’ go to step 1. Counters in Strong Consistency Mode
  • 22. Shows (Integer seats, Unordered Unique carts { c1, c2, c3, c4, … }) Cart: cartID => [ttl, status, quantity] • This application handles ticket reservations for shows. • A key identifies a specific show's inventory. • Assume there is no assigned seating. • The application should never overbook a show by selling more tickets than are available. • The counter is a pair of bins: the seats remaining and the shopping carts. • When a show becomes available for booking, it is initialized with a number of seats. • This example relies on Strong Consistency. Ticket Inventory – Countdown to Zero
  • 23. Shows (Integer seats, Unordered Unique carts { c1, c2, c3, c4, … }) Cart: cartID => [ttl, status, quantity] • Assuming that the most recently known seats quantity for the show was greater than zero, you’d reserve seats with the following transaction: 1. Decrement the seats value by the cart’s quantity. 2. map-set the cart into carts with the aerospike.MAP_WRITE_FLAGS_CREATE_ONLY write flag. 3. Return the value of seats. • If the reservation was 'inDoubt', repeat it. Go to the next step on success or on ElementExistsError (error code 24). Ticket Inventory – Countdown to Zero
  • 24. Shows (Integer seats, Unordered Unique carts { c1, c2, c3, c4, … }) Cart: cartID => [ttl, status, quantity] • If the returned value is below zero, roll back with the transaction: 1. map-remove-by-key the cart ID from carts. 2. Increment seats by the cart quantity. • If the rollback was 'inDoubt', check whether the cart exists. If it does, repeat the rollback. Ticket Inventory – Countdown to Zero
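A rough sketch of the reserve-then-roll-back flow, modeled on a plain Python object rather than a live record. `Show`, `try_reserve`, and the `"reserved"` status string are assumptions made for illustration. Each method stands for one single-record transaction that would run against a strong-consistency namespace.

```python
class ElementExists(Exception):
    """Stand-in for the create-only violation (error code 24)."""


class Show:
    """In-memory model of one show record: a seats bin and a carts map bin."""

    def __init__(self, seats):
        self.seats = seats
        self.carts = {}  # cartID -> [ttl, status, quantity]

    def reserve(self, cart_id, ttl, quantity):
        """Decrement seats, create-only put of the cart, return seats."""
        if cart_id in self.carts:
            raise ElementExists(cart_id)  # nothing is applied on failure
        self.seats -= quantity
        self.carts[cart_id] = [ttl, "reserved", quantity]
        return self.seats

    def roll_back(self, cart_id):
        """Remove the cart and give its seats back, as one transaction."""
        _ttl, _status, quantity = self.carts.pop(cart_id)
        self.seats += quantity
        return self.seats


def try_reserve(show, cart_id, ttl, quantity):
    """Return True if the seats were reserved, False if the show is sold out."""
    try:
        remaining = show.reserve(cart_id, ttl, quantity)
    except ElementExists:
        # A repeated (in-doubt) reservation that had already been applied.
        # Assumption: a plain read of seats is enough to judge the outcome.
        remaining = show.seats
    if remaining < 0:
        show.roll_back(cart_id)  # overbooked: undo this cart's reservation
        return False
    return True
```

Because the decrement happens before the check, seats can dip below zero for a moment. The rollback restores it, so no ticket is sold against a negative balance.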
  • 25. Shows (Integer seats, Unordered Unique carts { c1, c2, c3, c4, … }) Cart: cartID => [ttl, status, quantity] • If an item is removed from a cart, or the entire cart is explicitly dumped, you’d read the value of cartID in one map-get from the carts bin, then create a transaction to: 1. map-remove-by-key the cartID from carts. 2. Increment seats by the cart’s quantity. • Periodically check for abandoned carts using a map-get-by-value( [ttl, WILDCARD] ) and then treat each matched cart as if it was being explicitly removed. • When a cart has been fully checked out, remove the map entry from carts, because the inventory reservation is now permanent. Ticket Inventory – Countdown to Zero
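A small sketch of cart removal, checkout, and the abandoned-cart sweep, with a plain dict standing in for the show record. It assumes that the cart's ttl element is an absolute expiry time in seconds, which the slides do not state. The function names and the `"open"` status are also hypothetical.

```python
import time


def release_cart(show, cart_id):
    """Remove one cart from the carts map and return its seats to inventory."""
    _expires_at, _status, quantity = show["carts"].pop(cart_id)
    show["seats"] += quantity
    return quantity


def checkout_cart(show, cart_id):
    """Checkout makes the reservation permanent: drop the cart, keep seats as is."""
    show["carts"].pop(cart_id)


def sweep_abandoned(show, now=None):
    """Release every cart whose expiry time has passed; return their IDs."""
    now = time.time() if now is None else now
    expired = [cart_id
               for cart_id, (expires_at, _status, _quantity) in show["carts"].items()
               if expires_at <= now]
    for cart_id in expired:
        release_cart(show, cart_id)  # same path as an explicit removal
    return expired


show = {
    "seats": 0,
    "carts": {
        "c1": [100, "open", 2],
        "c2": [200, "open", 1],
        "c3": [300, "open", 3],
    },
}
```

Calling `sweep_abandoned(show, now=150)` on the sample record releases only `c1`, whose assumed expiry time has passed, and leaves the other two carts reserved.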
  • 26. • We talked about the numeric data types in Aerospike. • We modeled counters in Aerospike using different approaches. • We talked about Strong Consistency mode. • We discussed using Strong Consistency for implementing accurate counters. What next? • Take a look at the slides from the previous two Israeli ASUG meetups. • Go to GitHub; clone the code samples repo; run it; read the code. • Read the Aerospike blog. Get familiar with all the database features. • Participate in the community forum (https://discuss.aerospike.com) and StackOverflow’s aerospike tag. Summary
  • 27. Reference • https://www.aerospike.com/docs/guide/data-types.html • https://www.aerospike.com/docs/architecture/primary-index.html#single-bin-optimization • https://www.aerospike.com/blog/aerospike-4-strong-consistency-and-jepsen/ • https://jepsen.io/consistency • https://discuss.aerospike.com/t/handling-timeout-in-case-of-counter-bin/5196/4 • https://www.aerospike.com/docs/architecture/consistency.html#strong-consistency-mode • https://www.aerospike.com/docs/guide/consistency.html#indoubt-errors Code Samples • https://github.com/rbotzer/aerospike-cdt-examples Aerospike Training • https://www.aerospike.com/training/ • https://academy.aerospike.com/ More material you can explore:

Editor's Notes

  • #5: We’ll start with a recap from the beginning of my previous talk.
  • #6: In an RDBMS we connect to a database. In that database we have tables containing rows. Each row will typically have a primary key that uniquely identifies it. Rows contain one or more columns of supported data types. Most applications trade entity-relationship purity for access speed, denormalizing into a single table that is accessed by primary-key lookup.
  • #8: Records are stored contiguously. Records are grouped together into blocks and written to disk in a single write operation. The block size is controlled by the write-block-size config parameter. This applies to both raw device and file-based storage. The records are first placed in a streaming write buffer (SWB), an in-memory block of the same size as the write block. The SWB is flushed to disk when the block is full or when flush-max-ms is hit, or on every write if commit-to-device is true. The primary index is then updated: for each record being written, Aerospike notes the device/file, the block ID, and the byte offset. Therefore any record can be reached with a single read IO.
  • #10: Assume many other bins – address, bills, payment history, etc. This object is roughly 2K. If there are 400M users, storage is over ten billion bytes, almost 10GiB.
  • #11: In hybrid-memory mode (primary index in process memory or shared memory) Aerospike uses 64B of DRAM per record. Fewer reads, so we get the result faster, but only if we always fetch all the data. Since most operations will be updates and not reads, there's a benefit to splitting them. The updates also operate on a much smaller object, reducing disk IO.
  • #12: For 400M objects we're using 8192 sprigs, a memory overhead of 848MiB evenly distributed over the number of nodes. 8B of storage for numeric (integer, double) data. Removes all bin-related overhead. How much memory does this take in Aerospike 4.2 or higher? How much storage space? (As low as 48B, typically 64B with a short set name.)
  • #13: Create the monthly data counters ahead of time
  • #16: On RecordNotFound you can also set it to any explicit TTL, such as 24h, rather than use the default-TTL.
  • #19: This is the opposite of consolidating counters into a single record. This is a more complex solution, separating into an in-memory or data-in-index namespace. In some languages exception handling is very heavy (Java), in others you are encouraged to EAFP – try first, handle the exception.
  • #21: If you search redislabs 'strong consistency' on Google the term comes up in the search results, but magically it doesn't appear in the page itself. That's 'solving' a technical gap with marketing and SEO. Some competitors use 'strong eventual consistency'. That is a form of weak consistency. As a metaphor, you know the Golden State Warriors? 1st place in the Western Conference of the NBA. Do you know the Santa Cruz Warriors? 1st place in the G-League with a .707 win/total ratio (better than Golden State). However, it's 1st place in the G-League, not the NBA. The write went to the master, the master wrote to the replica, then the master dies. The state on the replica is now different, and it doesn't know the master died. The replica is promoted but the write failed. Any read now, before the client acts (to write again or not write again), returns an unexpected value. The client might repeat the write and modify not the previous state but the current one.
  • #22: A strongly consistent counter is modeled with a pair of bins – an integer counter and an unordered list of unique transaction IDs.
  • #26: Removing a cart is exactly like a rollback when there aren't enough seats.
  • #27: More of these to come
  • #28: New and renewing EE users, and anyone signed up to download CE, get academy access.