Scaling with Riak at Showyou

Scaling at Showyou
John Muellerleile (@jrecursive, john@remixation.com)
September 26, 2011

Tuesday, September 27, 2011

John Muellerleile

• Basho Technologies: Jan. 2008 - Dec. 2010
• Riak
• Riak Search
• Automated research, NLP, spidering
• Consulting: 2003 - 2008
• E-commerce, AdSense, AdWords, ...
• Infrastructure:
• Messaging
• Riak
• Solr


Agenda

• Who am I?

• What is Showyou?

• The Nature of “Social Data”

• Showyou’s Data: Today & Tomorrow

• Data Management: Technology Stack

• Riak: The Awesome & Sub-Awesome, Integration Patterns & Observations

• Not Bob’s Riak: The “Mecha” Backend & Query System


What is Showyou?


A minute about the nature of “Social Data”


Hi!


Showyou’s Data: Today

• Riak with Bitcask

• Solr with replication

• Often ever-growing blobs of JSON

• No useful way to “find” data based on anything other than compound primary
keys :(

• Primary keys crafted for specific access patterns :(

• This will not last forever


Showyou’s Data: Tomorrow

• Significant data growth per additional user

• Find & aggregate data about our users & their videos

• Derive useful “signal” from this data

• Better search: disambiguation, “more like this”, performance

• “Collective Intelligence”: trending, “smart collections”

• Spam & De-duplication

• & more: #hashtags, auto-complete, statistics, usage, ...


Data Management: Technology Stack

• Erlang/OTP

• Riak, Bitcask

• Java

• JInterface

• Hazelcast

• Solr


Riak: The Awesome Parts

• By far the best operational story in its class

• Shared-nothing

• No single point of failure

• Masterless multi-site replication

• Support via EnterpriseDS Startup Program

• I helped write it & turned it inside-out to design Riak Search while working at
Basho -- this helps


Riak: The Sub-Awesome Parts

• Riak Search as it exists today does not perform well for us & lacks features

• Bitcask keeps all keys in memory

• Listing keys for a bucket will take your cluster down

• Map/Reduce is virtually useless (for us) other than as “multi-get”

• Pre-1.0 cluster membership changes are “at your peril”

• Usable/useful built-in monitoring is non-existent

• If you are not intimately familiar with Riak, it’s very hard to debug!


Riak: Integration Patterns

• api_riak_node: “main” Riak cluster node

• sidecar: post-commit talks to local
Java-based Erlang node using JInterface

• fabric: distributed data structures &
utilities via Hazelcast - very similar in
spirit & implementation to Riak!

• indexer: pull from fabric queue &
index record in Solr

• Identical deployment on every node
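
The flow above (post-commit hook, sidecar, fabric queue, indexer) can be sketched in-process. This is a minimal illustration, not the Showyou code: `fabric_queue` stands in for the Hazelcast queue, `search_index` stands in for Solr, and all function, bucket, and field names are assumptions made for the example.

```python
import json
import queue

# Hypothetical in-process stand-ins: `fabric_queue` plays the Hazelcast
# queue and `search_index` plays Solr. Neither is a real client API.
fabric_queue = queue.Queue()
search_index = {}


def on_post_commit(bucket, key, value_json):
    """Sidecar role: turn a committed write into a change event."""
    fabric_queue.put({"bucket": bucket, "key": key, "value": value_json})


def run_indexer():
    """Indexer role: drain the queue, index each record, return the count."""
    indexed = 0
    while True:
        try:
            event = fabric_queue.get_nowait()
        except queue.Empty:
            return indexed
        search_index[(event["bucket"], event["key"])] = json.loads(event["value"])
        indexed += 1


on_post_commit("videos", "v1", json.dumps({"title_t": "Riak talk"}))
on_post_commit("videos", "v2", json.dumps({"title_t": "Solr talk"}))
print(run_indexer())
print(search_index[("videos", "v1")])
```

Because the write path only enqueues an event, a slow or unavailable indexer delays search freshness rather than blocking the Riak write.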


Riak: Integration Patterns: The Big Picture

• backend_node: nginx, redis; logs

• spider: finds, extracts,
stores & indexes further
information on users & videos

• log_indexer: aggregate & index
interesting parts of our access logs

• research_riak_node: a special-purpose
Riak cluster node to support Showyou
data “Tomorrow”


Riak: Integration Patterns: Observations

• Riak post-commit hooks

• Why not pre-commit hooks?

• Riak as a “virtual memory”

• Post-commit hooks as “change events”

• Wishlist: pre- & post-delete hook (I realize this is tricky - do it anyway)

• Wishlist: pre- & post-create hook (Less tricky - do it anyway)


Riak: Integration Patterns: Search

• Monolithic replicated Solr won’t last forever & sharding is a faceted
multi-value “shitshow” [1]

• We’re doing fine, for now, with lots of RAM, SSDs, etc. but...

• I don’t want to find out “the hard way” where the joyride ends and the hellride
starts [2]

• Clearly the answer is to kill every bird in nearby airspace by writing a
Riak storage backend and integrated query mechanism! [2,3,4]

[1] Cliff Moon suggested the use of this word (for emphasis)
[2] It’s okay, I have done this several dozen times
[3] Yes, really: BDB, BDBJE, Innostore, sqlite, hsql, mysql, postgres, etc.
[4] This is not a pride thing: it was a hard, lonely, unforgiving road


Not Bob’s Riak: A Moment of Weakness

“My way of joking is to tell the truth;
it’s the funniest joke in the world”
George Bernard Shaw


Not Bob’s Riak: Introducing “Mecha”

• Bob?


Mecha: Goals

• Birds to kill:

• Tight, purposed Riak integration

• Efficient & feature-complete indexing

• Fast sequential & range object access

• Flexible distributed query mechanism

• Query parallelism where appropriate


Mecha: What Stays The Same

• Works with “stock” (unmodified) Riak 1.0 (pre & release)

• Little/no difference from the “Riak side” -- everything works “as it should”

• Differences:

• All objects you “put” into Riak must be JSON objects (this will change by
release to respect content type)

• Any fields without a specific “indexed field type” (e.g., _t, _s, _dt, etc.) are
simply stored along with the rest of the fields (i.e. “stored field”)


Mecha: At a glance

• Written in Java, uses many, many third-party libraries:
LevelDB (JNI), Solr, JInterface, Jetlang, Netty, Protobufs, Commons, ...

• Riak backend module written in Erlang; beaten into submission for reliable
interaction with JInterface-driven Java node

• LevelDB instance per partition, per bucket; “ulimit -n 90000” :)

• One Solr instance per node (covers all partitions, buckets)

• Objects stored as Riak Objects in JSON form in LevelDB

• Standardized schema covering common data types using name suffixes


Mecha: Riak Integration


Mecha: Index Field Types

• Supported index field types (by suffix):

• _t, _tt - full-text (optionally w/ term vectors, ...)

• _s, _s_mv - exact string, multi-value exact string

• _i, _l, _d, _f - trie-based integer, long, double, float

• _dt - trie-based date (YYYY-MM-DDTHH:MM:SS)

• _b - boolean

• _xy, _ll, _geo - point, lat/lon & geohash
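
A small sketch of how the suffix convention above could be resolved. The suffix list is taken from this slide; the type labels and the `index_type` function are names invented for this example, not Mecha or Solr identifiers. Fields with no recognized suffix fall through to "stored", as described under "What Stays The Same".

```python
# Suffixes copied from the list above; the labels are descriptive names
# chosen for this sketch only.
SUFFIX_TYPES = {
    "_t": "full-text",
    "_tt": "full-text",
    "_s": "exact string",
    "_s_mv": "multi-value exact string",
    "_i": "integer",
    "_l": "long",
    "_d": "double",
    "_f": "float",
    "_dt": "date",
    "_b": "boolean",
    "_xy": "point",
    "_ll": "lat/lon",
    "_geo": "geohash",
}


def index_type(field_name):
    """Return the index type for a field name, or "stored" if no suffix matches."""
    # Longest suffix first: a defensive ordering so that a longer suffix
    # is never shadowed by a shorter one.
    for suffix in sorted(SUFFIX_TYPES, key=len, reverse=True):
        if field_name.endswith(suffix):
            return SUFFIX_TYPES[suffix]
    return "stored"


for name in ["content_t", "lol_count_i", "lol_dt", "tags_s_mv", "title"]:
    print(name, "->", index_type(name))
```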


Mecha: Example Object

{ "content_t": "NoSQL is a ghetto",
"lol_count_i": 47,
"lol_dt": "2009-04-13T07:01:43.000Z",
"rating_f": 4.111111164093018,
"tags_s_mv": [
"funny",
"lol",
"nosql",
"ghetto"]}


Mecha: Fast Sequential & Range Object Access

• Every bucket gets own instance of LevelDB per partition

• No “multiplexing” buckets or partitions per LevelDB instance

• Keys are literal, no encoded Erlang terms; simple ranges, smaller values

• Values stored as JSON-ized Riak objects (why? you’ll see)

• LevelDB JNI binaries shipped with Snappy compression built-in
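
Why literal keys make range access simple can be shown with a toy sorted store. This is a sketch only: `SortedStore` is a stand-in for one LevelDB instance, it does not use the LevelDB API, and the key layout (`user:1:a`) is an assumption made for the example.

```python
import bisect


class SortedStore:
    """Toy sorted key/value store standing in for one LevelDB instance."""

    def __init__(self):
        self._keys = []    # kept sorted, as an ordered store keeps its keys
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def range(self, start, end):
        """Yield (key, value) for start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


# One store per (bucket, partition), mirroring "no multiplexing".
stores = {}


def store_for(bucket, partition):
    return stores.setdefault((bucket, partition), SortedStore())


s = store_for("videos", 0)
for k in ["user:1:b", "user:1:a", "user:2:a", "user:10:a"]:
    s.put(k, "{}")

# ";" is the character after ":", so this range is exactly the "user:1:" prefix.
print([k for k, _ in s.range("user:1:", "user:1;")])
```

With plain string keys, a prefix scan is just a pair of binary searches; keys wrapped in encoded Erlang terms would not sort in this human-predictable order.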


Mecha: Flexible Distributed Querying

• Exact, prefix, suffix, & wildcard filtering on multiple index fields

• Ultra-fast list_keys, list_bucket, list_buckets replacements

• Equally fast bucket count operation

• Ridiculous range query performance (any trie-based type; sane datetime
functions)

• Faceting, group counts

• Spatial (bounding box/bowl, geofilt, Haversine, distance faceting)
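
For the spatial item, the Haversine distance is a standard formula and can be sketched directly. The radius filter below only illustrates what a "points within d km" query computes; it is not how Mecha or Solr implement it, and `within_radius` is a name invented here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def within_radius(points, center, radius_km):
    """Keep the (lat, lon) points that lie within radius_km of center."""
    return [p for p in points if haversine_km(*center, *p) <= radius_km]


print(within_radius([(0.0, 1.0), (0.0, 5.0)], (0.0, 0.0), 200.0))
```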


Mecha: Query Parallelism


Mecha: Coverage

• Coverage is the set of nodes and respectively owned partitions you must
process to cover all objects in a given bucket.

[Diagram: coverage across Node 1 and Node 2 (for n=2)]
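
A minimal sketch of the coverage idea, under simplifying assumptions that are not taken from the slides: partition i is owned by `nodes[i % len(nodes)]`, and an object whose primary partition is p is replicated on the next n-1 partitions around the ring. Under those assumptions, scanning every n-th partition touches at least one replica of every object. Real Riak coverage planning is more involved; the function names here are hypothetical.

```python
from collections import defaultdict


def coverage_plan(ring_size, nodes, n_val):
    """Map each node to the partitions it must scan to cover a bucket."""
    plan = defaultdict(list)
    # Every n_val-th partition: each run of n_val consecutive partitions
    # (one replica set) contains at least one of these.
    for partition in range(0, ring_size, n_val):
        plan[nodes[partition % len(nodes)]].append(partition)
    return dict(plan)


def covers_everything(ring_size, n_val, chosen):
    """Check that every replica set contains at least one chosen partition."""
    chosen = set(chosen)
    return all(
        any((p + i) % ring_size in chosen for i in range(n_val))
        for p in range(ring_size)
    )


plan = coverage_plan(ring_size=8, nodes=["node1", "node2", "node3"], n_val=2)
print(plan)
print(covers_everything(8, 2, [p for parts in plan.values() for p in parts]))
```

A query then fans out only to the listed partitions on each node, instead of visiting all n copies of every object.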


Mecha: Examples: Count & Faceting

• Count the number of records in the "research" bucket modified within the last
10 seconds

• Top search terms by count for the last hour
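
The query text for these two examples is not preserved in this transcript. As a rough illustration of what they compute, here is a sketch over in-memory records; the field names `last_modified_dt`, `term_s`, and `at_dt` are assumptions that follow the suffix convention, and this is not Mecha's query syntax.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def count_modified_since(records, now, seconds):
    """Count records whose modification time falls in the last `seconds`."""
    cutoff = now - timedelta(seconds=seconds)
    return sum(1 for r in records if r["last_modified_dt"] >= cutoff)


def top_terms(searches, now, hours=1, limit=10):
    """Facet-style count of search terms seen in the last `hours`."""
    cutoff = now - timedelta(hours=hours)
    counts = Counter(s["term_s"] for s in searches if s["at_dt"] >= cutoff)
    return counts.most_common(limit)


now = datetime(2011, 9, 26, 12, 0, 0, tzinfo=timezone.utc)
records = [
    {"last_modified_dt": now - timedelta(seconds=3)},
    {"last_modified_dt": now - timedelta(seconds=30)},
]
searches = [
    {"term_s": "riak", "at_dt": now - timedelta(minutes=5)},
    {"term_s": "solr", "at_dt": now - timedelta(minutes=10)},
    {"term_s": "riak", "at_dt": now - timedelta(minutes=20)},
    {"term_s": "riak", "at_dt": now - timedelta(hours=2)},
]
print(count_modified_since(records, now, 10))
print(top_terms(searches, now))
```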


Mecha: Examples: Multi-value String Fields

• See which field values occur with other values, and how often (using a
multi-value string field)
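
Co-occurrence over a multi-value string field can be sketched as pair counting. The `tags_s_mv` field name comes from the example object earlier in the deck; the `cooccurrence` function and the sample data are invented for illustration, and this is not how Mecha or Solr compute facets.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(objects, field="tags_s_mv"):
    """Count how often each pair of values appears together in one object."""
    pairs = Counter()
    for obj in objects:
        # Sort and de-duplicate so each pair has one canonical order.
        values = sorted(set(obj.get(field, [])))
        pairs.update(combinations(values, 2))
    return pairs


objects = [
    {"tags_s_mv": ["funny", "lol", "nosql"]},
    {"tags_s_mv": ["lol", "nosql"]},
    {"tags_s_mv": ["funny"]},
]
print(cooccurrence(objects).most_common(1))
```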


Mecha: Next Steps

• Code cleanup, test & polish current functionality

• Sane build & deployment (yay, Maven)

• Simplify configuration

• Embed Solr (currently running standalone for debugging)

• Extend & improve Solr “standard configuration”

• “Out of Band” Map/reduce with “direct bucket-level” LevelDB instance

• After that? Join operators, ... :)


Mecha: Availability

• Right now it is a complex system

• It will be worth waiting for tighter integration & polish

• This is me not answering your question :)

• As soon as responsible, sane, & possible :)


Thank you!

• Basho Technologies, especially Andy Gross (@argv0) & Kelly McLaughlin
(@_klm), for help with Riak 1.0 changes

• Of course, without Erlang/OTP, LevelDB, Solr, Java, JInterface, and a host of
other open source projects, this would have never even gotten started --
thank you.

• Last but not least, thank you to my Showyou teammates for encouragement
& support!

• Contact:
John Muellerleile / http://twitter.com/jrecursive
john@remixation.com, jmuellerleile@gmail.com
http://github.com/jrecursive

