Pavlo Baron http://www.pbit.org @pavlobaron
Agenda Blah-blah More blah-blah Color pics Standing ovations
So, come on, sell this to me
Somewhere a mosquito coughs…
… and somewhere else a data center gets flooded with data (PB)
Big Data describes datasets that grow so large that they become awkward to work with using on-hand database management tools (Wikipedia)
NoSQL is not about … <140’000 things NoSQL is not about> … NoSQL is about choice (Jan Lehnardt, CouchDB)
Look here brother, who you jivin’ with that Cosmik Debris?
(John Muellerleile)
So, you think you can tell heaven from hell ...
Where does your data actually come from?
Do you have a million well-structured records?
Or a couple of gigabytes of storage?
Does your data get modified every now and then?
Do you look at your data once a month to create a management report?
Or is your data an unstructured chaos?
Do you get flooded by tera-/petabytes of data?
Or do you simply get bombed with data?
Does your data flow on streams at a very high rate from different locations?
Or do you have to read The Matrix?
Do you need to distribute your data over the whole world?
Or does your existence depend on (the quality of) your data?
Look back and turn back. Look at yourself
Is it the storage that you need to focus on?
Or are you mostly preparing data?
Or do you have your customers spread all over the world?
Or do you have complex statistical analysis to do?
Or do you have to filter data as it comes?
Or is it necessary to visualize the data?
...every blade is sharp, the arrows fly...
Chop into smaller pieces
Chop into bite-size, manageable pieces
Separate reading from writing
Update and mark, don’t delete physically
Minimize hard relations
Separate archive from accessible data
Trash everything that only has to be analyzed in real-time
Parallelize and distribute
Avoid single bottlenecks
Decentralize with “equal” nodes
Design with Byzantine faults in mind
Build upon consensus, agreement, voting, quorum
Don’t trust time and timestamps
Strive for O(1) for data lookups #
Minimize the distance between the data and its processors
Utilize commodity hardware
Consider hardware fallibility
Relax new hardware startup procedure
Bring data to its users
Build upon asynchronous message passing
Consider network unreliability
Consider asynchronous message passing unreliability
Design with eventual actuality/consistency in mind
Implement redundancy and replication
Consider latency an adjustment screw
Consider availability an adjustment screw
Be prepared for disaster
Utilize the fog/clouds
Design for theoretically unlimited amount of data
Design for frequent structure changes
Design for the all-in-one mix
Why can we never be sure till we die. Or have killed for an answer
CAP – Consistency, Availability, Partition tolerance
CAP – the variations: CA – irrelevant; CP – eventually unavailable, offering maximum consistency; AP – eventually inconsistent, offering maximum availability
CAP – the  tradeoff A C
CP Replica 1 Replica 2 v 1 read write v 2 read v 1 v 2 v 2
CP ( partition ) Replica 1 Replica 2 v 1 read write v 2 read v 1 v 2
AP Replica 1 Replica 2 v 1 read write v 2 read v 1 v 2 v 2 replicate
AP ( partition ) Replica 1 Replica 2 v 1 read write v 2 read v 1 v 2 v 2 hint handoff
BASE
BASE – Basically Available, Soft-state, Eventually consistent. Opposite to ACID
Causal  ordering / consistency RM1 RM2 RM3
Read  your  write consistency write v 2 read v2 FE1 v 2 Data store v 3 v 1 write v 1 read v1 FE2
Session 2 Session 1 Session  consistency write v 2 read v2 FE v 2 Data store v 3 v 1 write v 1 read v1
FIFO  ordering RM1 RM2 RM3
Monotonic  read  consistency read v 2 read v2 FE1 v 2 Data store v 3 v 1 read v 3 read v4 FE2 v 4 read v3
Total  ordering RM1 RM2 RM3
Monotonic  write  consistency write v 1 write v4 FE1 Data store v 2 write v 2 write v3 FE2 v 4 v 1 v 3
Eventual  consistency read v 1 read v2 FE1 Data store v 3 write v 3 FE2 read v3 v 1 read v2 v 2
Run, rabbit, run. Dig that hole, forget the sun
Logical sharding
Node 1 Node 2 users products contracts Vertical sharding items orders addresses invoices "read contract" user=foo
Node 1 Node 2 users id(1-N) products Range  based sharding addresses zip(1234- 2345) read users id(1-M) addresses zip(2346- 9999) write write read
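Range based routing can be sketched in a few lines. This is only an illustration: the zip ranges 1234-2345 and 2346-9999 come from the slide, while the node names, the assignment of ranges to nodes and the function name `node_for_zip` are assumptions.

```python
import bisect

# Upper bound of each zip range (from the slide) and the node assumed to own it.
ZIP_UPPER_BOUNDS = [2345, 9999]
ZIP_NODES = ["node-1", "node-2"]   # hypothetical node names
ZIP_LOWEST = 1234                  # lowest zip covered on the slide


def node_for_zip(zip_code: int) -> str:
    """Return the node whose zip range contains zip_code."""
    idx = bisect.bisect_left(ZIP_UPPER_BOUNDS, zip_code)
    if zip_code < ZIP_LOWEST or idx == len(ZIP_UPPER_BOUNDS):
        raise KeyError(f"no shard covers zip {zip_code}")
    return ZIP_NODES[idx]


print(node_for_zip(2000), node_for_zip(5000))
```

Reads and writes for one address go to a single node, but a scan across the range border has to touch both.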
Hash based sharding: start with 3 nodes: N = # mod 3; add 2 nodes: N = # mod 5; kill 2 nodes: N = # mod 3
Insert  key Key = “foo” # = N N
rehash leave leave rehash Add 2 nodes
Lookup key Key = “foo” # = N N Value = “bar”
rehash leave leave rehash Remove node
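Why modulo placement forces so much rehashing can be checked with a small experiment. A minimal sketch, assuming MD5 as the hash behind "#" and made-up key names; with a uniform hash, about 80% of the keys are expected to change nodes when going from 3 to 5 nodes.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def node_for(key: str, node_count: int) -> int:
    """Modulo placement from the slide: N = # mod node_count."""
    return stable_hash(key) % node_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose node changes when the cluster is resized."""
    moved = sum(1 for k in keys if node_for(k, old_count) != node_for(k, new_count))
    return moved / len(keys)


keys = [f"key-{i}" for i in range(10_000)]   # hypothetical keys
grow = moved_fraction(keys, 3, 5)
print(f"3 -> 5 nodes: {grow:.0%} of keys must be rehashed")
```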
Consistent  hashing
The ring: X bit integer space, 0 <= N <= 2 ^ X; or: angle A, 0 <= A <= 2 x Pi, x(N) = cos(A), y(N) = sin(A)
Key = “foo” # = N N Insert  key
copy leave rehash leave leave rehash Add  node
Lookup  key Key = “foo” # = N N Value = “bar”
copy/ miss leave rehash leave leave rehash Remove  node
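The ring slides above can be approximated with a sorted list of positions. A minimal sketch, assuming MD5 reduced to a 32-bit ring and one ring position per node (no vnodes); the class name `HashRing` and the node names are made up for illustration.

```python
import bisect
import hashlib


def ring_position(value: str, bits: int = 32) -> int:
    """Map a node name or key onto the ring 0 .. 2^bits - 1."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class HashRing:
    def __init__(self, nodes=()):
        self._positions = []   # sorted ring positions of all nodes
        self._owners = {}      # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owners[pos] = node

    def remove(self, node: str) -> None:
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owners[pos]

    def lookup(self, key: str) -> str:
        """Walk clockwise from the key's position to the next node."""
        pos = ring_position(key)
        idx = bisect.bisect_left(self._positions, pos) % len(self._positions)
        return self._owners[self._positions[idx]]


ring = HashRing(["node-1", "node-2", "node-3"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("node-4")
moved = [k for k in keys if ring.lookup(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding node-4")
```

Only keys between the new node and its predecessor on the ring change owner; everything else stays where it was, which is the difference to modulo placement.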
Clustering: 12 partitions (constant); 3 nodes, 4 vnodes each; add node: 4 nodes, 3 vnodes each. Alternatives: 3 nodes, 2 x 5 + 1 x 2 vnodes; container based
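The 12-partition example can be played through in code. This is a sketch under assumptions: the round-robin start and the "take one partition from the fullest node" rule are one simple way to rebalance, not necessarily what any particular product does, and the node names are hypothetical.

```python
from collections import Counter

PARTITIONS = 12  # constant number of partitions (vnodes), as on the slide


def initial_assignment(nodes):
    """Round-robin the fixed partitions over the starting nodes."""
    return {p: nodes[p % len(nodes)] for p in range(PARTITIONS)}


def add_node(assignment, new_node):
    """Give new_node its share, taking one partition at a time from the fullest node."""
    result = dict(assignment)
    target = PARTITIONS // (len(set(result.values())) + 1)
    for _ in range(target):
        load = Counter(result.values())
        donor = max((n for n in load if n != new_node), key=lambda n: load[n])
        partition = next(p for p, n in sorted(result.items()) if n == donor)
        result[partition] = new_node
    return result


three = initial_assignment(["a", "b", "c"])
four = add_node(three, "d")
print(Counter(three.values()), Counter(four.values()))
```

The number of partitions never changes; only ownership of three of them moves to the new node.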
Quorum: V: vnodes holding a key; W: write quorum; R: read quorum; DW: durable write quorum; W > 0.5 * V; R + W > V
Key = “foo” # = N, W = 2 N Insert  key ( sloppy  quorum) replicate ok
leave Add  node copy copy leave
Key = “foo” # = N, R = 2 N Lookup  key ( sloppy quorum) Value = “bar”
leave Remove node copy copy leave
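The two quorum conditions, W > 0.5 * V and R + W > V, are easy to check mechanically. A minimal sketch: the function names are made up, and the integer version number attached to each reply is an assumption used only to pick the newest value.

```python
def quorum_ok(v: int, r: int, w: int) -> bool:
    """True if W > 0.5 * V and R + W > V, so read and write sets must overlap."""
    return 2 * w > v and r + w > v


def read_latest(replies, r: int):
    """Pick the newest (version, value) pair once at least r replicas answered."""
    if len(replies) < r:
        raise RuntimeError("read quorum not reached")
    return max(replies, key=lambda reply: reply[0])


print(quorum_ok(v=3, r=2, w=2))                      # typical setting
print(read_latest([(1, "old"), (2, "bar")], r=2))    # newest reply wins
```

With V = 3, R = 2 and W = 2 every read overlaps the last successful write in at least one replica, which is why the newest reply can be trusted.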
Inside out, outside in. Perpetual change
Clocks: V(i), V(j): competing. Conflict resolution: 1: siblings, client; 2: merge, system; 3: voting, system
Node 1 Node 2 Node 3 10:00 10:11 10:20 10:20 10:01 9:59 10:09 10:10 Timestamps 10:18 10:19
Node 1 Node 2 Node 3 1 3 5 6 2 2 4 5 4 7 7 7 Logical  clocks 6 6 ? ?
Node 1 Node 2 Node 3 1,0,0 1,2,0 3,2,0 1,3,3 1,1,0 1,0,1 1,2,2 1,2,3 2,2,0 4,3,3 4,4,3 4,3,4 Vector  clocks
Node 2 Node 3 Node 4 1,1,0,0 1,0,1,0 1,0,0,1 1,3,0,3 1,2,0, 2 1,2,0,3 Vector  clocks Node 1 1,0,0,0 1,2,0,0 1,0,2,0
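How two vector clocks are compared and merged can be sketched as follows. The clocks are plain dicts from node name to counter; the names `n1`, `n2`, `n3` and the function names are assumptions, while the counter values in the usage lines are taken from the three-node slide.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of clock with node's own counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: dict, b: dict) -> dict:
    """Element-wise maximum, used when a node receives another node's clock."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a: dict, b: dict) -> str:
    """Return 'before', 'after', 'equal' or 'concurrent' (competing versions)."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


v_i = {"n1": 1, "n2": 2, "n3": 0}   # 1,2,0 on the slide
v_j = {"n1": 1, "n2": 0, "n3": 1}   # 1,0,1 on the slide
print(compare(v_i, v_j), merge(v_i, v_j))
```

A "concurrent" result is the case where one of the listed conflict resolutions (siblings, merge, voting) has to decide.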
Merkle Trees: N, M: nodes; HT(N), HT(M): hash trees; M needs update: obtain HT(N), calc delta(HT(M), HT(N)), pull keys(delta)
Node a.1 Node a.2 a ab ac abc abd acb acc Merkle  Trees a ab ad abe abd ada adb
Node a.1 Node a.2 a ab abc abd Merkle  Trees a ab ad abd ada adb
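The delta step can be illustrated with a tree keyed by prefixes, as in the diagrams. A minimal sketch under assumptions: all keys have the same length and share the first character, SHA-256 is the hash, and the stores and function names are made up. It only finds keys that M has to pull from N.

```python
import hashlib


def _h(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def build_tree(store: dict) -> dict:
    """Return {prefix: hash}; a leaf hashes its value, an inner node its children."""
    children = {}
    for key in store:
        for i in range(1, len(key)):
            children.setdefault(key[:i], set()).add(key[:i + 1])
    tree = {}

    def node_hash(prefix: str) -> str:
        parts = [node_hash(c) for c in sorted(children.get(prefix, ()))]
        if prefix in store:
            parts.append(_h(store[prefix]))
        tree[prefix] = _h("".join(parts))
        return tree[prefix]

    for root in sorted({key[:1] for key in store}):
        node_hash(root)
    return tree


def delta(local: dict, remote: dict, prefix: str) -> set:
    """Keys below prefix that differ, skipping every subtree with equal hashes."""
    if local.get(prefix) == remote.get(prefix):
        return set()
    kids = {p for p in remote if len(p) == len(prefix) + 1 and p.startswith(prefix)}
    if not kids:
        return {prefix}   # a differing or missing leaf
    return set().union(*(delta(local, remote, k) for k in kids))


store_n = {"abc": "1", "abd": "2", "acb": "3", "acc": "4"}
store_m = {"abc": "1", "abd": "stale", "acb": "3", "acc": "4"}
to_pull = delta(build_tree(store_m), build_tree(store_n), "a")
print(to_pull)
```

Because the hashes under "ac" match, that whole subtree is never compared key by key.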
Sudden call shouldn't take away the startled memory
Replication – state  transfer Target node users products addresses Source node take
Replication – operational transfer Target node updates inserts deletes Source node take run
Eager  replication -  3PC Coordinator Cohort 1 Cohort 2 yes can commit? pre commit ACK commit ok
Eager  replication – 3PC  ( failure ) Coordinator Cohort 1 Cohort 2 yes can commit? pre commit ACK abort ok
Eager replication – Paxos Commit: 2F + 1 acceptors overall, F + 1 correct ones to achieve consensus. Stability, Consistency, Non-Triviality, Non-Blocking
prepare 2b prepared initial leader other RMs RM1 2a prepared Eager  replication – Paxos  Commit Acceptors begin commit commit
Eager  replication –  Paxos Commit ( failure ) prepare timeout, no decision initial leader other RMs RM 1 2a prepared Acceptors begin commit abort prepare 2a prepared timeout, no decision
Master node Slave node(s) users products Lazy  replication – Master/slave addresses read write read
Master node(s) Master node(s) Lazy  replication – Master/master read write read users id(1-N) users id(1-M) items id(1-K) items id(1-L) write
stable updates Gossip  – RM RM1 Clock table Replica clock Update log Value clock Value Executed operation table write RM2 gossip
Node 1 Node 2 Node 3 update Gossip  – node  down/up Node 4 update update, 4 down read read, 4 up update
Hinted handoff: N: node, G: group including N. node(N) is unavailable: replicate to G or store data(N) locally, hint handoff for later. node(N) is alive: handoff data to node(N)
Key = “foo” N replicate Key = “foo”, # = N -> handoff hint = true Direct replica fails
Replica recovers handoff
N Key = “foo”, # = N -> handoff hint = true All replicas fail
All replicas recover replicate handoff
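The hinted handoff steps above can be sketched with in-memory nodes. This is an illustration only: the `Node` class, the `alive` flag and the function names are hypothetical, and failure detection is reduced to checking that flag.

```python
class Node:
    def __init__(self, name: str):
        self.name = name
        self.alive = True
        self.data = {}    # key -> value stored for this node
        self.hints = []   # (target node, key, value) kept for unavailable nodes


def replicate(key, value, target: Node, fallback: Node) -> None:
    """Write to target, or park the write on fallback with a handoff hint."""
    if target.alive:
        target.data[key] = value
    else:
        fallback.hints.append((target, key, value))


def handoff(fallback: Node) -> None:
    """Deliver parked writes to every target that is alive again."""
    remaining = []
    for target, key, value in fallback.hints:
        if target.alive:
            target.data[key] = value
        else:
            remaining.append((target, key, value))
    fallback.hints = remaining


n, g = Node("N"), Node("G")
n.alive = False
replicate("foo", "bar", target=n, fallback=g)   # direct replica fails
n.alive = True                                  # replica recovers
handoff(g)
print(n.data, g.hints)
```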
I’m a speed king, see me fly
MapReduce
MapReduce model: functional map/fold; out-database MR: irrelevant; in-database MR: data locality, no splitting needed, distributed querying, distributed processing
In-database MapReduce map reduce Node X Node C N = "Alice" map query = "Alice" Node A N = "Alice" Node B N = "Alice" map hit list
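The "Alice" query can be written as a functional map/fold. A sketch, assuming each node is simply a list of records in one process; the records, node names and function names are invented for illustration. The map step runs against each node's own data (data locality), and the reduce step folds the per-node hit lists into one.

```python
from functools import reduce

# Hypothetical records held by three nodes.
nodes = {
    "A": [{"name": "Alice", "city": "London"}, {"name": "Bob", "city": "Paris"}],
    "B": [{"name": "Alice", "city": "Paris"}],
    "C": [{"name": "Carol", "city": "London"}],
}


def map_phase(records, query: str):
    """Runs on the node that holds the records: emit the local hit list."""
    return [r for r in records if r["name"] == query]


def reduce_phase(hit_lists):
    """Fold the per-node hit lists into a single result."""
    return reduce(lambda acc, hits: acc + hits, hit_lists, [])


def run_query(cluster: dict, query: str):
    return reduce_phase(map_phase(records, query) for records in cluster.values())


hits = run_query(nodes, "Alice")
print(len(hits), "hits")
```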
Caching
Caching variations: eager write, append only; lazy write, eventual consistency
Write  through read write data store products write through users cache read read miss
Write  back  / snapshotting read write data store products write back users cache read miss
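The difference between the two write strategies fits in a few lines. A minimal sketch, assuming a plain dict stands in for the data store; the class names and the explicit `flush()` call (in place of a timer or snapshot job) are assumptions.

```python
class WriteThroughCache:
    """Eager write: every write goes to the cache and the data store together."""

    def __init__(self, store: dict):
        self.store = store
        self.cache = {}

    def write(self, key, value) -> None:
        self.cache[key] = value
        self.store[key] = value

    def read(self, key):
        if key not in self.cache:           # read miss: load from the data store
            self.cache[key] = self.store[key]
        return self.cache[key]


class WriteBackCache(WriteThroughCache):
    """Lazy write: the data store is only updated when flush() runs."""

    def __init__(self, store: dict):
        super().__init__(store)
        self.dirty = set()

    def write(self, key, value) -> None:
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self) -> None:
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()


store = {"users": "v1"}
lazy = WriteBackCache(store)
lazy.write("users", "v2")
stale = store["users"]    # the store still holds the old value here
lazy.flush()
print(stale, store["users"])
```

Until `flush()` runs, anyone reading the data store directly sees the old value, which is the eventual consistency mentioned for lazy writes.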
Physical  storage
Physical storage: row based: irrelevant; column based: many rows, few columns; value based: ad-hoc querying
Column based storage
ID | Name | City
1 | Peter | London
2 | Anna | Paris
data store: 1, 2; Peter, Anna; London, Paris
Value based storage
ID | Name | City
1 | Peter | London
2 | Anna | Paris
data store: 1:1, 3:Peter, 5:London, 2:2, 4:Anna, 6:Paris, 7:[1, 3, 5], 8:[2, 4, 6]
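Both layouts can be derived from the same two rows. A sketch under assumptions: the table content is from the slides, while the function names and the way value ids are numbered (column by column, then one entry per row) are inferred from the ids shown on the value based slide.

```python
rows = [
    {"ID": 1, "Name": "Peter", "City": "London"},
    {"ID": 2, "Name": "Anna", "City": "Paris"},
]
columns = ["ID", "Name", "City"]


def to_columns(rows, columns):
    """Column based: keep all values of one column together."""
    return {c: [r[c] for r in rows] for c in columns}


def to_values(rows, columns):
    """Value based: every value gets an id, a row is a list of value ids."""
    store, cell_ids, next_id = {}, {}, 1
    for c in columns:
        for i, r in enumerate(rows):
            store[next_id] = r[c]
            cell_ids[(i, c)] = next_id
            next_id += 1
    for i in range(len(rows)):
        store[next_id] = [cell_ids[(i, c)] for c in columns]
        next_id += 1
    return store


print(to_columns(rows, columns))
print(to_values(rows, columns))
```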
Thank you
I created many of the graphics myself, though I should have asked @mononcqc for help, because his drawings are awesome. Some images originate from istockphoto.com, except for a few taken from Wikipedia and product pages

Big Data & NoSQL - EFS'11 (Pavlo Baron)