Before we get into the heavy stuff, let's imagine hacking
around with C* for a bit...
You run a large video website
●   CREATE TABLE videos (
    videoid uuid,
    videoname varchar,
    username varchar,
    description varchar,
    tags varchar,
    upload_date timestamp,
    PRIMARY KEY (videoid, videoname) );
●   INSERT INTO videos (videoid, videoname, username,
    description, tags, upload_date)
    VALUES (99051fe9-6a9c-46c2-b949-38ef78858dd0, 'My funny cat', 'ctodd',
    'My cat likes to play the piano! So funny.', 'cats,piano,lol',
    '2012-06-01 08:00:00');
You have a bajillion users
●   CREATE TABLE users (
    username varchar,
    firstname varchar,
    lastname varchar,
    email varchar,
    password varchar,
    created_date timestamp,
    PRIMARY KEY (username));
●   INSERT INTO users (username, firstname, lastname, email,
    password, created_date) VALUES ('tcodd','Ted','Codd',
    'tcodd@relational.com','5f4dcc3b5aa765d61d8327deb882cf99'
    ,'2011-06-01 08:00:00');
You can query up a storm
●   SELECT firstname,lastname FROM users WHERE username='tcodd';
    firstname | lastname
    -----------+----------
          Ted |      Codd


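●   SELECT * FROM videos WHERE videoid = b3a76c6b-7c7f-4af6-964f-803a9283c401
    AND videoname > 'N';
    videoid                              | videoname               | description                                          | tags           | upload_date              | username
    -------------------------------------+-------------------------+------------------------------------------------------+----------------+--------------------------+----------
    b3a76c6b-7c7f-4af6-964f-803a9283c401 | Now my dog plays piano! | My dog learned to play the piano because of the cat. | dogs,piano,lol | 2012-08-30 16:50:00+0000 | ctodd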
That's great! Then you ask
         yourself...
●   Can I slice a slice (or sub query)?
●   Can I do advanced where clauses?
●   Can I union two slices server side?
●   Can I join data from two tables without two
    request/response round trips?
●   What about procedures?
●   Can I write functions or aggregation functions?
Let's look at the APIs we have

 http://www.slideshare.net/aaronmorton/apachecon-nafeb2013
But none of those APIs do what I
   want, and it seems simple
        enough to do...
Intravert joins the “party”
     at the API Layer
Why not just do it client side?
●   Move processing close to data
    –   Idea borrowed from Hadoop
●   Doing work close to the source can result in:
    –   Less network IO
    –   Less memory spent encoding/decoding 'throw-away'
        data
    –   New storage and access paradigms
Vert.x + Cassandra
●   What is Vert.x?
    –   Distributed event bus which spans the server and
        even penetrates into the client side for effortless
        'real-time' web applications
●   What are the cool features?
    –   Asynchronous
    –   Hot re-loadable modules
    –   Modules can be written in Groovy, Ruby, Java, JavaScript

                           http://vertx.io
Transport, payload, and
      batching
HTTP Transport
●   HTTP is easy to use on firewalled networks
●   Easy to secure
●   Easy to compress
●   The de facto way to do everything anyway
●   IntraVert attempts to limit round-trips
    –   Rather than provide a terse binary format
JSON Payload
●   Simple nested types like list, map, String
●   Request is composed of N operations
●   Each operation has a configurable timeout
●   Again, IntraVert attempts to limit round-trips
    –   Rather than provide a terse message format
Why not use lightning-fast transport
      and serialization library X?
●   X's language/code gen issues
●   You probably cannot tcpdump X
●   Net-admins don't like 90 jars for health checks
●   IntraVert attempts to limit round-trips:
    –   Prepared statements
    –   Server side filtering
    –   Other cool stuff
Sample request and response
Request:
{"e": [ {
     "type": "SETKEYSPACE",
     "op": { "keyspace": "myks" }
  }, {
     "type": "SETCOLUMNFAMILY",
     "op": { "columnfamily": "mycf" }
  }, {
     "type": "SLICE",
     "op": {
         "rowkey": "beers",
         "start": "Allagash",
         "end": "Sierra Nevada",
         "size": 9
} }]}
Response:
{
     "exception":null,
     "exceptionId":null,
     "opsRes": {
        "0":"OK",
        "1":"OK",
        "2":[{
             "name":"Founders",
             "value":"Breakfast Stout"
        }]
}}
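As a rough illustration of the batching idea, the sketch below builds the request from the sample out of a list of operations and then pairs each entry of `opsRes` with the operation that produced it. The helper names `buildRequest` and `pairResults` are hypothetical and not part of IntraVert; only the field names (`e`, `type`, `op`, `opsRes`) come from the sample, and the response is hard-coded rather than fetched over HTTP.

```javascript
// Hypothetical helpers, not part of IntraVert: they only shape and read JSON.

// Wrap a list of [type, op] pairs in the {"e": [...]} envelope from the sample.
function buildRequest(operations) {
  return { e: operations.map(([type, op]) => ({ type, op })) };
}

// Pair each operation with its result; opsRes is keyed by the operation's index.
function pairResults(request, response) {
  return request.e.map((operation, index) => ({
    type: operation.type,
    result: response.opsRes[String(index)],
  }));
}

// Three operations travel in one request, so only one round trip is needed.
const request = buildRequest([
  ["SETKEYSPACE", { keyspace: "myks" }],
  ["SETCOLUMNFAMILY", { columnfamily: "mycf" }],
  ["SLICE", { rowkey: "beers", start: "Allagash", end: "Sierra Nevada", size: 9 }],
]);

// The response from the sample slide, hard-coded here.
const response = {
  exception: null,
  exceptionId: null,
  opsRes: { "0": "OK", "1": "OK", "2": [{ name: "Founders", value: "Breakfast Stout" }] },
};

const paired = pairResults(request, response);
console.log(JSON.stringify(request));
console.log(paired);
```

Keying results by operation index is what lets a client send N operations at once and still tell which answer belongs to which step.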
Server side filter
Imagine your data looks like...
{ "rowkey": "beers", "name": "Allagash", "value": "Allagash Tripel" }
{ "rowkey": "beers", "name": "Founders", "value": "Breakfast Stout" }
{ "rowkey": "beers", "name": "Dogfish Head", "value": "Hellhound IPA" }
Application requirement
●   A user request needs to know which beers are
    “Breakfast Stout”s
●   Common “solutions”:
    –   Write a copy of the data sorted by type
    –   Request all the data and parse on client side
Using an IntraVert filter
●   Send a function to the server
●   Function is applied to subsequent get or slice
    operations
●   Only results of the filter are returned to the
    client
Defining a filter: JavaScript
●   Syntax to create a filter
      {
           "type": "CREATEFILTER",
           "op": {
               "name": "stouts",
               "spec": "javascript",
               "value": "function(row) { if (row['value'] == 'Breakfast Stout') return row; else return null; }"
           }
      },
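To see what such a filter does to a slice, the sketch below runs the function from the CREATEFILTER example over the three sample beer rows locally. The `applyFilter` helper is a hypothetical stand-in for the server-side step; the slides do not show how IntraVert actually evaluates filters, so treat this as a simulation of the behaviour described, not the implementation.

```javascript
// The filter body from the CREATEFILTER example, as a plain function.
const stouts = function (row) {
  if (row["value"] == "Breakfast Stout") return row; else return null;
};

// Sample rows from the "Imagine your data looks like..." slide.
const rows = [
  { rowkey: "beers", name: "Allagash", value: "Allagash Tripel" },
  { rowkey: "beers", name: "Founders", value: "Breakfast Stout" },
  { rowkey: "beers", name: "Dogfish Head", value: "Hellhound IPA" },
];

// Hypothetical stand-in for the server-side step: run the filter over each
// row of a slice and drop the rows for which it returns null.
function applyFilter(filter, slice) {
  return slice.map(filter).filter((row) => row !== null);
}

const filtered = applyFilter(stouts, rows);
console.log(filtered);
```

Only the Founders row survives, so only that row would need to cross the network.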
Defining a filter: Groovy/Java

●   We can define a Groovy closure or a Java filter
    {
        "type": "CREATEFILTER",
        "op": {
            "name": "stouts",
            "spec": "groovy",
            "value": "{ row -> if (row['value'] == 'Breakfast Stout') return row else return null }"
        }
    },
Filter flow
Common filter use cases
●   Transform data
●   Prune columns/rows like a where clause
●   Extract data from complex fields (JSON, XML,
    protobuf, etc.)
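The three use cases can be combined in one filter. The sketch below assumes, purely for illustration, that each column value holds a small JSON document; the rows, the `style` and `beer` fields, and the `stoutNames` function are all hypothetical. It is written for a current JavaScript runtime such as Node.js; whether the server's script engine accepts the same syntax is not covered by the slides.

```javascript
// Hypothetical rows: the column value holds a JSON document, not a plain string.
const beerRows = [
  { rowkey: "beers", name: "Founders", value: '{"beer":"Breakfast Stout","style":"stout"}' },
  { rowkey: "beers", name: "Allagash", value: '{"beer":"Allagash Tripel","style":"tripel"}' },
  { rowkey: "beers", name: "Broken", value: "not json" },
];

// Filter: parse the JSON value (extract), keep only stouts (prune), and
// return a slimmed-down row holding just the extracted field (transform).
function stoutNames(row) {
  let doc;
  try {
    doc = JSON.parse(row.value);
  } catch (e) {
    return null; // unparseable values are dropped rather than failing the slice
  }
  if (doc.style !== "stout") return null;
  return { name: row.name, value: doc.beer };
}

const extracted = beerRows.map(stoutNames).filter((row) => row !== null);
console.log(extracted);
```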
Some light relief
Server Side Multi-Processor
It's the cure for your “redis envy”
Imagine your data looks like...
●   { "row key":"1", name:"a", val...}
●   { "row key":"1", name:"b", val...}
●   { "row key":"4", name:"a", val...}
●   { "row key":"4", name:"z", val...}
Application Requirements
●   User wishes to intersect the column names of
    two slices/queries
●   Common “solutions”
    –   Pull all results to client and apply the intersection
        there
Server Side MultiProcessor
●   Send a class that implements MultiProcessor
    interface to server
●   public List<Map> multiProcess
    (Map<Integer,Object> input, Map params);
●   Do one or more get/slice operations as input
●   Invoke MultiProcessor on input
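The intersection requirement can be sketched as follows. The real interface is the Java method `multiProcess(Map<Integer,Object> input, Map params)` shown above; `intersectColumnNames` is a hypothetical JavaScript mirror of that shape, where `input` maps an operation index to the columns that operation returned. The column values are placeholders, since the slide elides them.

```javascript
// Hypothetical JavaScript mirror of multiProcess(input, params):
// input maps an operation index to that operation's result (a list of columns).
function intersectColumnNames(input, params) {
  const [first, ...rest] = Object.keys(input)
    .sort((a, b) => Number(a) - Number(b))
    .map((key) => input[key]);
  if (first === undefined) return [];
  // One set of column names per remaining slice, for constant-time lookups.
  const nameSets = rest.map((slice) => new Set(slice.map((column) => column.name)));
  return first
    .filter((column) => nameSets.every((names) => names.has(column.name)))
    .map((column) => ({ name: column.name }));
}

// Results of two slices, following the sample data: row "1" has columns a and b,
// row "4" has columns a and z. Values are placeholders.
const sliceResults = {
  0: [{ name: "a", value: "placeholder" }, { name: "b", value: "placeholder" }],
  1: [{ name: "a", value: "placeholder" }, { name: "z", value: "placeholder" }],
};

const common = intersectColumnNames(sliceResults, {});
console.log(common);
```

With the sample data only column `a` appears in both slices, so a single name is returned instead of four columns.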
Multi-processor flow
Multi-processor use cases
●   Union N slices
●   Intersect N slices
●   Some “Join” scenarios
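A union of N slices follows the same pattern. In this sketch `unionSlices` is a hypothetical function, and the rule that the first occurrence of a column name wins is an assumption made for the example; the slides do not say how duplicates should be resolved.

```javascript
// Hypothetical union over N slice results, keyed by operation index.
// Assumption: when a column name repeats, the first occurrence wins.
function unionSlices(input) {
  const byName = new Map();
  const keys = Object.keys(input).sort((a, b) => Number(a) - Number(b));
  for (const key of keys) {
    for (const column of input[key]) {
      if (!byName.has(column.name)) byName.set(column.name, column);
    }
  }
  // Return columns sorted by name, as a slice would be.
  return [...byName.values()].sort((x, y) =>
    x.name < y.name ? -1 : x.name > y.name ? 1 : 0
  );
}

const slices = {
  0: [{ name: "a", value: "from row 1" }, { name: "b", value: "from row 1" }],
  1: [{ name: "a", value: "from row 4" }, { name: "z", value: "from row 4" }],
};

const merged = unionSlices(slices);
console.log(merged.map((column) => column.name).join(","));
```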
Fat client becomes
  the 'Phat client'
Imagine you want to insert this data
●   User wishes to enter this event for multiple column
    families
    –   09/10/2011 11:12:13
    –   App=Yahoo
    –   Platform=iOS
    –   OS=4.3.4
    –   Device=iPad2,1
    –   Resolution=768x1024
    –   Events–videoPlayPercent=38–Taste=great

         http://www.slideshare.net/charmalloc/jsteincassandranyc2011
Inserting the data
aggregateColumnNames("AppPlatformOSVersionDeviceResolution") = "app+platform+osversion+device+resolution#"

def ccAppPlatformOSVersionDeviceResolution(c: (String) => Unit) = {
    c(aggregateColumnNames("AppPlatformOSVersionDeviceResolution") + app + p(platform) + p(osversion) + p(device) + p(resolution))
}

aggregateKeys(KEYSPACE  "ByMonth") = month //201109
aggregateKeys(KEYSPACE  "ByDay") = day //20110910
aggregateKeys(KEYSPACE  "ByHour") = hour //2011091012
aggregateKeys(KEYSPACE  "ByMinute") = minute //201109101213

def r(columnName: String): Unit = {
    aggregateKeys.foreach{tuple:(ColumnFamily, String) => {
        val (columnFamily,row) = tuple
        if (row != null && row.size > 0)
            rows add (columnFamily -> row has columnName inc) //increment the counter
    }
    }
}
ccAppPlatformOSVersionDeviceResolution(r)

                        http://www.slideshare.net/charmalloc/jsteincassandranyc2011
Solution
    ●   Send the data once and compute the N
        permutations on the server side
public void process(JsonObject request, JsonObject state, JsonObject response, EventBus eb) {
    // Parameters sent once by the client.
    JsonObject params = request.getObject("mpparams");
    String uid = params.getString("userid");
    String fname = params.getString("fname");
    String lname = params.getString("lname");
    String city = params.getString("city");

    // One mutation for the row, with one column added per field.
    RowMutation rm = new RowMutation("myks", IntraService.byteBufferForObject(uid));
    QueryPath qp = new QueryPath("users", null, IntraService.byteBufferForObject("fname"));
    rm.add(qp, IntraService.byteBufferForObject(fname), System.nanoTime());
    QueryPath qp2 = new QueryPath("users", null, IntraService.byteBufferForObject("lname"));
    rm.add(qp2, IntraService.byteBufferForObject(lname), System.nanoTime());
    ...
    try {
        StorageProxy.mutate(mutations, ConsistencyLevel.ONE);
        // Report success only when the write did not throw.
        response.putString("status", "OK");
    } catch (WriteTimeoutException | UnavailableException | OverloadedException e) {
        e.printStackTrace();
        response.putString("status", "FAILED");
    }
}
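The "send once, compute N permutations on the server" idea can be sketched independently of Cassandra. Below, one event is expanded into one counter increment per time-bucketed column family. The bucket formats follow the month and day comments in the Scala snippet; `bucketKeys`, `expandEvent`, the column-name encoding, and the choice of UTC are assumptions made for this illustration.

```javascript
// Zero-pad a number to two digits.
const pad2 = (n) => String(n).padStart(2, "0");

// Compute one row key per time-bucketed column family for a single event time.
function bucketKeys(date) {
  const month = `${date.getUTCFullYear()}${pad2(date.getUTCMonth() + 1)}`;
  const day = month + pad2(date.getUTCDate());
  const hour = day + pad2(date.getUTCHours());
  const minute = hour + pad2(date.getUTCMinutes());
  return { ByMonth: month, ByDay: day, ByHour: hour, ByMinute: minute };
}

// Expand one event into the counter increments a server-side processor
// would apply, one per column family.
function expandEvent(event) {
  const keys = bucketKeys(event.time);
  return Object.entries(keys).map(([columnFamily, rowKey]) => ({
    columnFamily,
    rowKey,
    columnName: event.columnName,
    increment: 1,
  }));
}

const event = {
  time: new Date(Date.UTC(2011, 8, 10, 11, 12, 13)), // 09/10/2011 11:12:13, assumed UTC
  columnName: "Yahoo+iOS+4.3.4+iPad2,1+768x1024", // hypothetical encoding
};

const increments = expandEvent(event);
console.log(increments);
```

The client sends the event once; the four writes are derived next to the data instead of being sent as four separate requests.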
Service Processor Flow
IntraVert status
●   Still pre 1.0
●   Good docs
    –   https://github.com/zznate/intravert-ug/wiki/_pages
●   Functionally equivalent to Thrift (most features
    ported)
●   CQL support
●   Virgil (coming soon)
●   HBase-like scanners (coming soon)
Hack at it




https://github.com/zznate/intravert-ug
Questions?


Intravert Server side processing for Cassandra
