Riak TS

Riak TS: Basho’s New Purpose-built Time Series Database
Rob Genova, Solution Architect
Basho Technologies

Data Distribution

Masterless Architecture
Riak has a masterless architecture in which every node in a cluster is capable of serving
read and write requests. The benefits of a masterless architecture include:

Data Replication & Consistency
Reads and writes use quorum level consistency by default.
put(“bucket/key”)
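As a rough illustration of what quorum-level consistency means, the sketch below computes a majority quorum for a given replication factor. Three replicas (n_val = 3) is Riak's documented default; the function names are hypothetical and not part of any Riak client.

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas: floor(n / 2) + 1."""
    if n < 1:
        raise ValueError("replication factor must be at least 1")
    return n // 2 + 1


def overlaps(n: int, r: int, w: int) -> bool:
    """True when any r-replica read set must share a replica
    with any w-replica write set (r + w > n)."""
    return r + w > n


N_VAL = 3  # replicas per object
R = W = quorum(N_VAL)
print(f"n={N_VAL} r={R} w={W} overlap={overlaps(N_VAL, R, W)}")
```

With three replicas, a quorum read or write waits for two responses, so one slow or unavailable replica does not block the request.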

High Availability
If a node goes offline, “fallback” virtual nodes will take over and automatically begin
serving requests on behalf of the downed virtual nodes. Control and data are
automatically handed back to the original node when it returns.
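The fallback behavior can be sketched with a toy ring. This is a deliberately simplified model: it works with whole nodes rather than virtual nodes, it omits the handoff of data back to the returning node, and the function name preference_list is hypothetical.

```python
from typing import List, Set


def preference_list(ring: List[str], start: int, n: int, down: Set[str]) -> List[str]:
    """Walk the ring clockwise from `start`, skipping downed nodes,
    until n live nodes are collected or the ring is exhausted."""
    chosen: List[str] = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)]
        if node not in down:
            chosen.append(node)
        if len(chosen) == n:
            break
    return chosen


ring = ["node1", "node2", "node3", "node4", "node5"]
healthy = preference_list(ring, start=0, n=3, down=set())
degraded = preference_list(ring, start=0, n=3, down={"node2"})
print(healthy)
print(degraded)
```

In the degraded case node4 stands in for node2; once node2 returns, the healthy list applies again.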

Data Guarantees
Version vectors are used to maintain an actor-based accounting of updates to
an object in Riak. This allows the system to reason about causality in the event
that multiple versions of an object exist at any given point in time.
Version 1: {v1:2, v2:3, v3:2} (dominates)
Version 2: {v1:2, v2:3, v3:1}
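A minimal sketch of the comparison, using the two vectors from the slide. The helper names (descends, dominates, concurrent) are illustrative and not a Riak API.

```python
from typing import Dict

VersionVector = Dict[str, int]


def descends(a: VersionVector, b: VersionVector) -> bool:
    """True when a has seen every update recorded in b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def dominates(a: VersionVector, b: VersionVector) -> bool:
    """True when a is strictly newer than b."""
    return descends(a, b) and not descends(b, a)


def concurrent(a: VersionVector, b: VersionVector) -> bool:
    """True when neither vector has seen all of the other's updates."""
    return not descends(a, b) and not descends(b, a)


version_1 = {"v1": 2, "v2": 3, "v3": 2}
version_2 = {"v1": 2, "v2": 3, "v3": 1}
print(dominates(version_1, version_2))
```

When two vectors are concurrent, neither can be discarded safely; both versions are kept as siblings until they are resolved.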

Write once buckets
• Riak 2.1 introduced the concept of "write once" buckets
• 107% increase in throughput vs. standard buckets
• Intended for immutable data

Pluggable Storage Backends
Pluggable storage backends enable you to choose the low-level storage engine that best
fits your use case.
• Bitcask
Basho’s open source key/value store and Riak’s default backend.
• LevelDB
Google’s open source key/value store
• In Memory
Uses Erlang’s ets tables to store data in memory
• Multi-Backend
Select the right backend for each use case on a per bucket basis

Availability Across Geographies
Multi-cluster Replication
Riak automatically replicates between clusters
• Configurable number of remote replicas
• Options for real-time sync and full sync
Geo-Data Locality allows localized data processing
• Reduced latency to end-users
• Allows sub 5ms responses
• Active-Active ensures continuous user experience

Riak KV: Use Cases
• Mutable data
• Documents, JSON, metadata
• Session state
• User/customer data
• Transaction histories
• Archives

Riak KV: Search
• Write it like Riak, read it like Solr
• Riak Search communicates with and monitors the Solr OS process
• Riak Search listens for changes in key/value (KV) data and makes the appropriate changes to Solr indexes
• Riak Search takes a user query on any node and converts it to a Solr distributed search
• Protocol Buffers interface and Solr interface via HTTP

Riak KV: Data Types
Riak Data Types are a developer-friendly way to avoid conflicting versions of objects in an eventually consistent environment.
• Map
Supports the nesting of the Riak Data Types.
• Register
A named binary field that can only be used as part of a Map.
• Counter
Keeps track of increments and decrements on an integer.
• Flag
Values limited to enable or disable.
• Set
A collection of unique binary values that supports add and remove operations on one or more values.
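To show why a counter data type can avoid conflicting versions, here is a textbook state-based PN-counter. It is a sketch of the general technique only, not Riak's implementation; the class and actor names are made up for the example.

```python
from typing import Dict


class PNCounter:
    """Per-actor increment and decrement totals; merging takes the
    per-actor maximum, so replicas converge in any merge order."""

    def __init__(self) -> None:
        self.incs: Dict[str, int] = {}
        self.decs: Dict[str, int] = {}

    def increment(self, actor: str, amount: int = 1) -> None:
        self.incs[actor] = self.incs.get(actor, 0) + amount

    def decrement(self, actor: str, amount: int = 1) -> None:
        self.decs[actor] = self.decs.get(actor, 0) + amount

    def value(self) -> int:
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other: "PNCounter") -> "PNCounter":
        merged = PNCounter()
        for actor in set(self.incs) | set(other.incs):
            merged.incs[actor] = max(self.incs.get(actor, 0), other.incs.get(actor, 0))
        for actor in set(self.decs) | set(other.decs):
            merged.decs[actor] = max(self.decs.get(actor, 0), other.decs.get(actor, 0))
        return merged


replica_a, replica_b = PNCounter(), PNCounter()
replica_a.increment("node_a", 2)   # two increments seen only by replica A
replica_b.decrement("node_b", 1)   # one decrement seen only by replica B
print(replica_a.merge(replica_b).value())
```

Because merge is commutative and idempotent, both replicas arrive at the same value without the application having to resolve siblings.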

Riak TS: Use Cases
• Immutable data
• Infrastructure monitoring / metrics
• Real-time analytics
• IoT / Sensor Data
• Financial Data
• Scientific Observations

Riak TS: Requirements & design goals
• High write throughput
• Efficient range query support
• Robust queryability
• Horizontal scale
• High availability
• Multi-region support
• Enterprise scale solution

Riak TS: Design & Implementation
• Data distribution
– Data is co-located on a per-series basis for a configurable time horizon
– A given series is partitioned into ordered ranges of a configurable size
• Data modeling
– SQL-like data definition (bucket parameterization)
• Read/write
– Efficient write path
– Query subsystem
– SQL-like query language
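The co-location idea can be sketched as follows: a timestamp is rounded down to the start of its time range (quantum), and the family, the series and that quantum start together decide where the row lives. The function names are hypothetical, and the 15-minute quantum is just an example setting.

```python
from typing import Tuple

MINUTE_MS = 60 * 1000


def quantum_start(timestamp_ms: int, quantum_ms: int) -> int:
    """Round a millisecond timestamp down to the start of its quantum."""
    return timestamp_ms - (timestamp_ms % quantum_ms)


def partition_key(family: str, series: str, timestamp_ms: int,
                  quantum_ms: int) -> Tuple[str, str, int]:
    """Rows sharing this key are stored together, ordered by time."""
    return (family, series, quantum_start(timestamp_ms, quantum_ms))


quantum = 15 * MINUTE_MS
first = partition_key("family1", "series1", 1449864277000, quantum)
second = partition_key("family1", "series1", 1449864290000, quantum)
print(first, first == second)
```

Both timestamps fall in the same 15-minute range, so a range query between them touches a single partition.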

Riak TS Implementation: Data definition
Riak TS uses a SQL-like CREATE TABLE statement to associate a schema with a
bucket.
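The statement itself is not in the transcript. The sketch below assembles one for the GeoCheckin table that the queries in this deck refer to. The column names come from those queries; the column types, the nullability and the 15-minute quantum are assumptions, and the PRIMARY KEY syntax should be checked against the Riak TS documentation for the release in use.

```python
# (name, type, nullable): types and nullability are assumed, not from the slides.
COLUMNS = [
    ("myfamily", "varchar", False),
    ("myseries", "varchar", False),
    ("time", "timestamp", False),
    ("weather", "varchar", False),
    ("temperature", "double", True),
]


def create_table(table, columns, quantum_size, quantum_unit):
    """Build a CREATE TABLE statement whose partition key groups rows
    by family, series and time quantum."""
    column_lines = [
        f"  {name} {sql_type}" + ("" if nullable else " NOT NULL")
        for name, sql_type, nullable in columns
    ]
    key = (
        "  PRIMARY KEY ("
        f"(myfamily, myseries, quantum(time, {quantum_size}, '{quantum_unit}')), "
        "myfamily, myseries, time)"
    )
    return f"CREATE TABLE {table} (\n" + ",\n".join(column_lines + [key]) + "\n)"


ddl = create_table("GeoCheckin", COLUMNS, 15, "m")
print(ddl)
```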

Riak TS Implementation: Query language
Riak TS supports a SQL-like query language using the familiar semantics of the SELECT statement.

SELECT weather, temperature FROM GeoCheckin
WHERE myfamily = 'family1' AND myseries = 'series1'
AND time > 1449864277000 AND time < 1449864290000
AND temperature > 27.0

SELECT AVG(temperature) FROM GeoCheckin
WHERE myfamily = 'family1' AND myseries = 'series1'
AND time > 1449864277000 AND time < 1449864290000

SELECT temperature * 1.5 FROM GeoCheckin
WHERE myfamily = 'family1' AND myseries = 'series1'
AND time > 1449864277000 AND time < 1449864290000
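The time bounds in these queries are milliseconds since the Unix epoch. A small helper, hypothetical and not part of any Riak client, shows how such a range query could be assembled from timezone-aware datetimes. Values are interpolated directly into the string, so this sketch is only suitable for trusted input.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    return (moment - EPOCH) // timedelta(milliseconds=1)


def range_query(family: str, series: str, start: datetime, end: datetime) -> str:
    """Build a SELECT over one series for an exclusive time range."""
    return (
        "SELECT weather, temperature FROM GeoCheckin "
        f"WHERE myfamily = '{family}' AND myseries = '{series}' "
        f"AND time > {to_epoch_ms(start)} AND time < {to_epoch_ms(end)}"
    )


start = datetime(2015, 12, 11, 20, 4, 37, tzinfo=timezone.utc)
end = datetime(2015, 12, 11, 20, 4, 50, tzinfo=timezone.utc)
query = range_query("family1", "series1", start, end)
print(query)
```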

Riak TS: Write Performance
• 130k writes per second
• 5 nodes (bare metal, Softlayer)
• 6-core + HT (12 logical cores)
• 32GB
• 800GB SSD x3 (RAID0)
• 1k objects
• 15-minute time quantization
• ring_size = 64

Riak TS: Unofficial Roadmap
• As of 1.2 (this week)
– Query language with SELECT, filtering, aggregation functions and arithmetic
– Java, Python, Erlang, Node, Ruby clients
– SQL Shell
• 1.3 (end of April-ish)
– OSS & Enterprise versions
– MDC (enterprise only)
– REST API
• Q2
– Bulk delete / expiry
– SQL GROUP BY, ORDER BY
– Visualization (Graphite/Grafana integration)

UNCORKD - Overview
UNTAPPD for wine snobs! (And a tribute to the cool social/beer app)
• Tracks checkins by wine variety and location
• Maintains per-user friend lists and activity feeds
• Maintains per-location statistics
• Supports Checkin, Activity feed & Location-based queries with aggregation & filtering (time-based and geospatial)

UNCORKD – Riak KV Data Definition
• User, Location & Wine entity data stored in standard KV buckets

UNCORKD – Riak KV Data Definition (2)
• Friend lists and location statistics maintained via Set and Counter data types

UNCORKD – Riak TS Data Definition
• Wine name/id used as the series_id for Checkin
• User name/id used as the series_id for Activity
• Include lat/long data to support basic geospatial filtering
• 14-day time quantization

UNCORKD – Generate Checkins
• Generates a month's worth of checkins
• Attempts 1 checkin per minute with time-weighted probability
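The demo's generator is not in the transcript. One possible shape for it is sketched below, assuming a made-up weighting in which evenings are busier than the rest of the day; the start date, the probabilities and the function names are all illustrative.

```python
import random
from datetime import datetime, timedelta, timezone
from typing import List


def checkin_probability(moment: datetime) -> float:
    """Hypothetical time weighting: evenings see more checkins."""
    return 0.6 if 17 <= moment.hour < 23 else 0.1


def generate_checkins(start: datetime, days: int, seed: int = 42) -> List[datetime]:
    """Attempt one checkin per minute; keep it with a time-weighted probability."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    checkins = []
    for minute in range(days * 24 * 60):
        moment = start + timedelta(minutes=minute)
        if rng.random() < checkin_probability(moment):
            checkins.append(moment)
    return checkins


start = datetime(2016, 3, 1, tzinfo=timezone.utc)
checkins = generate_checkins(start, days=30)
print(len(checkins), "checkins from", 30 * 24 * 60, "attempts")
```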

UNCORKD – Insert Checkin & Fan Out
• Insert checkin
• Fan out to friends' activity feeds
• Update per-location statistics
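The three steps can be sketched with in-memory stand-ins: plain dictionaries play the role of the Checkin and Activity tables, the friend-list Sets and the per-location Counters. All names and sample values here are hypothetical.

```python
from collections import defaultdict

# In-memory stand-ins for the Riak TS tables and Riak KV data types.
friends = {"amy": {"bob", "carol"}, "bob": {"amy"}}
checkins_by_wine = defaultdict(list)   # Checkin series, keyed by wine
activity_feeds = defaultdict(list)     # Activity series, keyed by user
location_counts = defaultdict(int)     # per-location counter


def insert_checkin(user, wine, location, timestamp_ms, rating):
    checkin = {"user": user, "wine": wine, "location": location,
               "time": timestamp_ms, "rating": rating}
    # 1. Insert the checkin into the wine's series.
    checkins_by_wine[wine].append(checkin)
    # 2. Fan out: one write per friend, so reading a feed is a single series query.
    for friend in sorted(friends.get(user, ())):
        activity_feeds[friend].append(checkin)
    # 3. Update the per-location statistics.
    location_counts[location] += 1
    return checkin


insert_checkin("amy", "2015_Talisman_PinotNoir", "Etcetera_Wine_Bar", 1449864277000, 4)
print(sorted(activity_feeds), location_counts["Etcetera_Wine_Bar"])
```

Fanning out on write trades extra writes for cheap reads, which suits a store with a high-throughput write path.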

UNCORKD – Query Checkins
• Checkin count and average rating for ‘2015_Talisman_PinotNoir’
• List checkins with times and locations

UNCORKD – Query Checkins (2)
• List checkins for a given geographic area (Mission District)
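Basic geospatial filtering of this kind can be approximated with a bounding box over the stored lat/long columns. In the sketch below the column names lat and long, the helper names and the Mission District coordinates are rough illustrative assumptions, and the comparison operators should be checked against the Riak TS release in use.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        """True when the point lies inside the box (edges included)."""
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)

    def where_clause(self, lat_col: str = "lat", lon_col: str = "long") -> str:
        """Render the box as range filters for a SQL-like WHERE clause."""
        return (f"{lat_col} >= {self.min_lat} AND {lat_col} <= {self.max_lat} "
                f"AND {lon_col} >= {self.min_lon} AND {lon_col} <= {self.max_lon}")


# Very rough box around San Francisco's Mission District (illustrative only).
MISSION_DISTRICT = BoundingBox(37.748, 37.77, -122.426, -122.405)
print(MISSION_DISTRICT.where_clause())
```

The clause would be appended to the usual series and time-range conditions; a bounding box is only an approximation of a neighborhood's actual outline.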

UNCORKD – Query Activity Feed
• List friends for user ‘AmyPhillips@gmail.com’
• Query activity feed

UNCORKD – Query Location stats
• Per-day checkin counts for ‘Etcetera_Wine_Bar’
