Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop

Brisk: Truly peer-to-peer Hadoop
High-order bits from Cassandra & Hadoop

SriSatish Ambati
@srisatish

points
• Usecases
• Why Cassandra?
• Usecase: Hadoop, Brisk
• FUD: Consistency
– Why is Facebook not using Cassandra?
• Anti-patterns
• Community, Code, Tools
• Q&A

Users. Netflix.
Key by Customer, read-heavy
Key by Customer:Movie, write-heavy

TimeSeries: (several customers)
periodic readings: dev0, dev1…
deviceID:metric:timestamp -> value
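
The deviceID:metric:timestamp -> value layout can be tried out with a minimal Java sketch. The class and method names (TimeSeriesKey, rowKey, range) are hypothetical, and a sorted in-memory TreeMap stands in for a column family; none of this is Cassandra client API. It only shows why a composite key with a zero-padded timestamp keeps one device's readings contiguous and in time order.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: a sorted in-memory map stands in for a column family.
final class TimeSeriesKey {
    // Builds "deviceID:metric:timestamp"; zero-padding keeps string order equal to time order
    // for non-negative timestamps.
    static String rowKey(String deviceId, String metric, long epochSeconds) {
        return String.format(Locale.ROOT, "%s:%s:%013d", deviceId, metric, epochSeconds);
    }

    // Returns all readings for one device/metric with from <= timestamp < to.
    static Map<String, Double> range(TreeMap<String, Double> store, String deviceId,
                                     String metric, long from, long to) {
        return store.subMap(rowKey(deviceId, metric, from), true,
                            rowKey(deviceId, metric, to), false);
    }

    public static void main(String[] args) {
        TreeMap<String, Double> store = new TreeMap<>();
        store.put(rowKey("dev0", "temp", 100), 21.5);
        store.put(rowKey("dev0", "temp", 160), 21.9);
        store.put(rowKey("dev1", "temp", 100), 19.0);
        // Only the two dev0 readings fall inside the requested slice.
        System.out.println(range(store, "dev0", "temp", 0, 200));
    }
}
```

Because the timestamp is the last key component, a time-window query becomes a single contiguous range scan.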

Metrics are typically a far larger dataset than users.

Operational simplicity
peer-to-peer: write and read

Replication:
Multi-datacenter
Multi-region EC2 (AWS)
Multi-availability zones

reads local (dc1, dc2)

4.21.2011, Amazon Web Services outage:

“Movie marathons on Netflix awaiting AWS to come back up.” #ec2disabled

Netflix was running on AWS.

fast durable writes.
fast reads.

Writes
Sequential, append-only.
~1-5ms

On cloud: ephemeral disks rock!

Reads
Local
Key & row caches, (also, jna-based 0xffheap)
indexes, materialized

ssds: improved read performance!

amortize
Replication over writes
Repair over reads

Distribution between nodes
Gossip
Anti-entropy
Failure-detector

Lightweight

Clients: cql, thrift
pycassa, phpcassa
hector, pelops
(scala, ruby, clojure)

Usecase #3: Hadoop
HDFS → Cassandra → Hive
Logs → stats → analytics

Brisk
Truly peer-to-peer hadoop.

map(String key, String value):
  // key: document name
  // value: document contents
  for each word w in value:
    EmitIntermediate(w, "1");

reduce(String key, Iterator values):
  // key: a word
  // values: a list of counts
  int result = 0;
  for each v in values:
    result += ParseInt(v);
  Emit(AsString(result));

word count in MapReduce
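
For readers who want to run the pseudocode, here is a single-process Java sketch of the same map, shuffle and reduce steps (Java 9+ for Map.entry). It is not Hadoop's API: WordCountSketch and its methods are hypothetical names, and grouping happens in an in-memory map rather than across a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Single-process sketch of the map -> shuffle -> reduce phases; not Hadoop's API.
final class WordCountSketch {
    // Map phase: emit (word, 1) for every word in the document contents.
    static List<Map.Entry<String, Integer>> map(String contents) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : contents.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!w.isEmpty()) {
                out.add(Map.entry(w, 1));
            }
        }
        return out;
    }

    // Reduce phase: sum the counts emitted for one word.
    static int reduce(List<Integer> values) {
        int result = 0;
        for (int v : values) {
            result += v;
        }
        return result;
    }

    // Driver: group intermediate pairs by key (the shuffle), then reduce each group.
    static Map<String, Integer> run(List<String> documents) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String doc : documents) {
            for (Map.Entry<String, Integer> kv : map(doc)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, Integer> counts = new TreeMap<>();
        grouped.forEach((word, ones) -> counts.put(word, reduce(ones)));
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("to be or not", "to be")));
    }
}
```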

immutable data
write-once-read-many!
Files, once created, written & closed...

not changing!

jobtracker, tasktracker
hdfs: namenode, datanode

Cloudera
Amazon: Elastic MapReduce
Hortonworks
MapR
Brisk

Tools & Analytics
Hive, Pig, R
Karmasphere
Datameer
… dozens of stealth startups!

“However, given that there is only a single master, its failure is unlikely;”
The MapReduce paper, 2004. Jeffrey Dean and Sanjay Ghemawat, Google.

Namenode decomposition, explained.

NameNode:
Single Master node
Single Machine Address space
Single Point of failure

Use column families (tables)
inode
sblock

One kind of node
no master node, no spof
peer-to-peer

near-real time hadoop
Low latency: cassandra_dc nodes
Batch Analytics: brisk_dc nodes

BriskSimpleSnitch.java

if (TrackerInitializer.isTrackerNode)
{
    myDC = BRISK_DC;
    logger.info("Detected Hadoop trackers are enabled, setting my DC to " + myDC);
}
else
{
    myDC = CASSANDRA_DC;
    logger.info("Looks like Vanilla Cassandra nodes, setting my DC to " + myDC);
}
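
The branch above can be exercised on its own with a small Java sketch. Everything here is an assumption for illustration: DcByRoleSketch, datacenterFor and groupByDc are hypothetical names, and the two datacenter labels are placeholder values rather than constants taken from the Brisk source. It shows how tagging nodes by role splits one ring into a batch-analytics group and a low-latency group.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the snitch decision; not the Brisk source.
final class DcByRoleSketch {
    static final String BRISK_DC = "Brisk";          // assumed label for analytics nodes
    static final String CASSANDRA_DC = "Cassandra";  // assumed label for low-latency nodes

    // Same branch as the snippet above: tracker nodes join the analytics datacenter.
    static String datacenterFor(boolean isTrackerNode) {
        return isTrackerNode ? BRISK_DC : CASSANDRA_DC;
    }

    // Groups node names by the datacenter their role maps to.
    static Map<String, List<String>> groupByDc(Map<String, Boolean> trackerFlagByNode) {
        Map<String, List<String>> byDc = new TreeMap<>();
        trackerFlagByNode.forEach((node, isTracker) ->
            byDc.computeIfAbsent(datacenterFor(isTracker), dc -> new ArrayList<>()).add(node));
        return byDc;
    }

    public static void main(String[] args) {
        Map<String, Boolean> nodes = new LinkedHashMap<>();
        nodes.put("node1", true);
        nodes.put("node2", false);
        nodes.put("node3", false);
        System.out.println(groupByDc(nodes));
    }
}
```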

Hive: SQL-like access
cli, hwi, jdbc, metastore
Pushdown predicates (v beta2)

hive> CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);

hive> LOAD DATA LOCAL INPATH '$BRISK_HOME/resources/hive/examples/files/kv2.txt' OVERWRITE INTO TABLE invites PARTITION (ds='2008-08-15');

hive> SELECT count(*), ds FROM invites GROUP BY ds;

http://www.datastax.com/docs/0.8/brisk/about_hive

ETL
Real-time
Cassandra CFs
DataCenters
Scale

No me in team!
• Ben Coverston
• Ben Werther
• Brandon Williams
• Cathy Daw
• Jackson Chung
• Jake Luciani
• Joaquin Casares
• Jonathan Ellis
• Michael Allen
• Mike Bulman
• Nate McCall
• Nick M Bailey
• Patricio Echague
• Tyler Hobbs
• SriSatish Ambati
• Yewei Zhang

100-node Brisk Cluster on Opscenter

FUD,
acronym: fear, uncertainty, doubt.

Consistency: R + W > N
ORACLE, 2-node: R=1, W=2, N=2,(T=2)
DNS

* N is the replication factor, not to be confused with T = total number of nodes.

Tunable, flexible.
For High Consistency:
read:quorum, write:quorum
For High Availability:
high W, low R.
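
The R + W > N rule is simple arithmetic, sketched below in Java. The class and method names are hypothetical and not part of any Cassandra or Brisk API. With N = 3, quorum is 2, so quorum reads plus quorum writes give 2 + 2 > 3 and every read overlaps the latest write; R = 1, W = 1 gives 1 + 1 = 2, which is not greater than 3, so a read can miss it. The two-node example with R = 1, W = 2, N = 2 also satisfies the rule.

```java
// Illustrative helper; class and method names are hypothetical, not a Cassandra or Brisk API.
final class ConsistencySketch {
    // Quorum for replication factor n is a strict majority of replicas.
    static int quorum(int n) {
        return n / 2 + 1;
    }

    // R + W > N: every read replica set intersects every write replica set.
    static boolean readsSeeLatestWrite(int r, int w, int n) {
        return r + w > n;
    }

    public static void main(String[] args) {
        int n = 3;
        int q = quorum(n);
        System.out.println("N=3 quorum=" + q + " overlap=" + readsSeeLatestWrite(q, q, n));
        System.out.println("N=3 R=1 W=1 overlap=" + readsSeeLatestWrite(1, 1, n));
        System.out.println("N=2 R=1 W=2 overlap=" + readsSeeLatestWrite(1, 2, 2));
    }
}
```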

Consistency: R + W > N
"brisk.consistencylevel.read", "QUORUM";
"brisk.consistencylevel.write", "QUORUM";

Inbox Search:
600+ cores, 120+ TB (2008)
Went from 100M to 500M users.

Average NoSQL deployment size: ~6-12 nodes.

Usecase #5: search
Apache Solr + Cassandra = Solandra

Other inbox/file Searches:
xobni, c3

github.com/tjake/solandra

“Eventual consistency is harder to program.”
mostly immutable data.
complex systems at scale.

Miscellaneous,
Myth: data-loss, partial rows.
writes are durable.

Anti-Patterns
Transactions
Joins
Read before write

Anti-Patterns for cloud
ebs
jvm, virtualized
single region

A few more good reasons for Cassandra...

Tools
AMIs, OpsCenter, DataStax
AppDynamics

Getting Started with brisk ami

Netflix just builds AMIs for deployment!

Beautiful C0de

= new code(); //less is more
~90k.java.concurrent.@annotate.
bloomfilters, merkletrees.
non-blocking, staged-event-driven.
bigtable, dynamo.

Current & Future Focus:
Distributed Counters, CQL.
Simple client.
Operational smoothing.
compaction.

Community
Robust. Rapid. Brisk #
Professional support from DataStax.
git clone git@github.com:riptano/brisk.git

engineers: independent, startups, large companies,
Rackspace, Twitter, Netflix..

Come join the efforts!

Usecase #4: first NoSQL, then scale!
SimpleDB → Cassandra
MongoDB → Cassandra

Copyright: plantoys
… more than one way to do it!

Summary -
high scale peer-to-peer datastore

best friend for
multi-region, multi-zone availability.

Hadoop – HDFS engulfing the DataWorld

Brisk – best of both worlds!

Bigtable, 2006 + Dynamo, 2007
OSS, 2008
Incubator, 2009
TLP, 2010

Cassandra
+ +

Brisk
