CRDTs and Redis
From sequential to concurrent executions
Carlos Baquero
Universidade do Minho & INESC TEC
2/31
The speed of communication in the 19th century
W. H. Harrison’s death
“At 12:30 am on April 4th, 1841 President William Henry Harrison died of pneumonia
just a month after taking office. The Richmond Enquirer published the news of his
death two days later on April 6th. The North-Carolina standard newspaper published
it on April 14th. His death wasn’t known of in Los Angeles until July 23rd, 110 days
after it had occurred.”
Text by Zack Bloom, A Quick History of Digital Communication Before the
Internet. https://eager.io/blog/communication-pre-internet/
Picture by Albert Sands Southworth and Josiah Johnson Hawes
3/31
The speed of communication in the 19th century
Francis Galton Isochronic Map
4/31
The speed of communication in the 21st century
RTT data gathered via http://www.azurespeed.com
5/31
The speed of communication in the 21st century
If you really like high latencies . . .
Time delay between Mars and Earth
blogs.esa.int/mex/2012/08/05/time-delay-between-mars-and-earth/
Delay/Disruption Tolerant Networking
www.nasa.gov/content/dtn
6/31
Latency magnitudes
Geo-replication
$\lambda$, up to 50ms (local region DC)
$\Lambda$, between 100ms and 300ms (inter-continental)
No inter-DC replication
Client writes observe $\lambda$ latency
Planet-wide geo-replication
Replication techniques versus client side write latency ranges
Consensus/Paxos $[\Lambda, 2\Lambda]$ (with no divergence)
Primary-Backup $[\lambda, \Lambda]$ (asynchronous/lazy)
Multi-Master $\lambda$ (allowing divergence)
7/31
EC and CAP for Geo-Replication
Eventually Consistent. CACM 2009, Werner Vogels
In an ideal world there would be only one consistency model:
when an update is made all observers would see that update.
Building reliable distributed systems at a worldwide scale
demands trade-offs between consistency and availability.
CAP theorem. PODC 2000, Eric Brewer
Of three properties of shared-data systems – data consistency,
system availability, and tolerance to network partition – only two
can be achieved at any given time.
CRDTs provide support for partition-tolerant high availability
8/31
From sequential to concurrent executions
Consensus provides illusion of a single replica
This also preserves (slow) sequential behaviour
Sequential execution
Ops O: o → p → q
Time →
We have an ordered set $(O, <)$. $O = \{o, p, q\}$ and $o < p < q$
9/31
From sequential to concurrent executions
EC Multi-master (or active-active) can expose concurrency
Concurrent execution
Ops O: o → p → q → r
       o → s → r
Time →
Partially ordered set $(O, \prec)$. $o \prec p \prec q \prec r$ and $o \prec s \prec r$
Some ops in O are concurrent: $p \parallel s$ and $q \parallel s$
10/31
Design of Conflict-Free Replicated Data Types
A partially ordered log (polog) of operations implements any CRDT
Replicas keep increasing local views of an evolving distributed polog
Any query, at replica i, can be expressed from local polog $O_i$
Example: Counter at i is $|\{inc \mid inc \in O_i\}| - |\{dec \mid dec \in O_i\}|$
CRDTs are efficient representations that follow some general rules
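
The counter query over a polog can be sketched as below. This is a minimal illustration, assuming each operation carries a unique id so that the local polog can be held as a plain set; the names `Op`, `uid` and `counter_value` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str  # "inc" or "dec"
    uid: str   # unique id (assumption): keeps identical operations distinct


def counter_value(polog):
    """Counter at a replica: number of incs minus number of decs in its polog."""
    incs = sum(1 for op in polog if op.kind == "inc")
    decs = sum(1 for op in polog if op.kind == "dec")
    return incs - decs


# Local view at one replica.
local_polog = {Op("inc", "a1"), Op("inc", "a2"), Op("dec", "b1")}

# The view only grows: merging in remote operations is a set union,
# so an operation received twice (a2 here) is counted once.
merged_polog = local_polog | {Op("inc", "a2"), Op("inc", "c1")}
```

Because the query only counts operations, the order among them does not matter here; ordering information becomes relevant for registers and sets.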
11/31
Principle of permutation equivalence
If operations in sequence can commute, preserving a given result,
then under concurrency they should preserve the same result
Sequential
inc(10) → inc(35) → dec(5) → inc(2)
dec(5) → inc(2) → inc(10) → inc(35)
Concurrent
inc(10) → inc(35) → inc(2)
inc(10) → dec(5) → inc(2)
(inc(35) and dec(5) are concurrent)
You guessed: Result is 42
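
A quick check of permutation equivalence for this example, as a sketch: the four operations are modelled as signed deltas (an encoding chosen here for illustration) and applied in every possible order.

```python
from itertools import permutations

# inc(10), inc(35), dec(5), inc(2) encoded as signed deltas (illustrative encoding).
ops = [10, 35, -5, 2]


def apply_in_order(order, start=0):
    """Apply the deltas one after another, as a sequential execution would."""
    value = start
    for delta in order:
        value += delta
    return value


# Every sequential ordering of the four operations.
results = {apply_in_order(order) for order in permutations(ops)}
```

Since all 24 orderings agree, any concurrent execution of these operations has a single acceptable outcome.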
12/31
Implementing Counters
Example: CRDT PNCounters
B: inc(10) → A: inc(35) → B: inc(2)
B: inc(10) → C: dec(5) → B: inc(2)
Let's track the total number of incs and decs done at each replica
{A(incs, decs), . . . , C(. . . , . . .)}
13/31
Implementing Counters
Example: CRDT PNCounters
Separate positive and negative counts are kept per replica
B {B(10, 0)} → A {A(35, 0), B(10, 0)} → B {A(35, 0), B(12, 0), C(0, 5)}
B {B(10, 0)} → C {B(10, 0), C(0, 5)} → B {A(35, 0), B(12, 0), C(0, 5)}
Joining does point-wise maximums among entries (semilattice)
At any time, counter value is sum of incs minus sum of decs
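
A minimal sketch of such a PNCounter, assuming a state-based design where each replica holds a map from replica id to an (incs, decs) pair. The class and method names are illustrative; this is not the Redis implementation.

```python
class PNCounter:
    """State-based PN-counter sketch; names are illustrative, not from Redis."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.entries = {}  # replica id -> (incs, decs)

    def inc(self, n=1):
        incs, decs = self.entries.get(self.replica_id, (0, 0))
        self.entries[self.replica_id] = (incs + n, decs)

    def dec(self, n=1):
        incs, decs = self.entries.get(self.replica_id, (0, 0))
        self.entries[self.replica_id] = (incs, decs + n)

    def join(self, other):
        """Point-wise maximum of both maps (the semilattice join)."""
        for rid, (incs, decs) in other.entries.items():
            my_incs, my_decs = self.entries.get(rid, (0, 0))
            self.entries[rid] = (max(my_incs, incs), max(my_decs, decs))

    def value(self):
        """Sum of incs minus sum of decs over all replicas."""
        total_incs = sum(incs for incs, _ in self.entries.values())
        total_decs = sum(decs for _, decs in self.entries.values())
        return total_incs - total_decs


# Replay of the execution on the slide.
a, b, c = PNCounter("A"), PNCounter("B"), PNCounter("C")
b.inc(10)
a.join(b); a.inc(35)    # A sees B's state, then increments
c.join(b); c.dec(5)     # C sees B's state, then decrements
b.join(a); b.join(c)    # B joins both concurrent states
b.inc(2)
```

Joining is idempotent, so receiving the same state twice leaves the counter unchanged, which is what makes the point-wise maximum safe under message duplication.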
14/31
Implementing Counters
Redis CRDT Counters
There are multiple ways to implement CRDT counters
Redis has a distinct implementation that favours garbage collection
Redis CRDT counters are 59 bits (not 64) to avoid overflows
15/31
Registers
Registers are an ordered set of write operations
Sequential execution
A: wr(x) → wr(j) → wr(k) → wr(x)
Sequential execution under distribution
A: wr(x) → B: wr(j) → B: wr(k) → A: wr(x)
Register value is x, the last written value
16/31
Implementing Registers
Naive Last-Writer-Wins
CRDT register implemented by attaching local wall-clock times
Sequential execution under distribution
A: (11:00)x → B: (12:02)j → B: (12:05)k → A: (11:30)? → B: ?
Problem: Wall-clock on B is one hour ahead of A
Value x might not be writeable again at A since 12:05 > 11:30
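
The problem can be reproduced with a small sketch of a naive Last-Writer-Wins register. The class below is hypothetical: it keeps one value and one wall-clock stamp (minutes since midnight, an encoding chosen for the example), and a write is treated like any other update that must carry a larger stamp to take effect.

```python
class NaiveLWWRegister:
    """Naive LWW register sketch: the largest wall-clock stamp wins."""

    def __init__(self):
        self.value = None
        self.stamp = None  # wall-clock time in minutes since midnight

    def write(self, value, local_clock):
        # A local write is stamped with the local wall-clock.
        self.merge(value, local_clock)

    def merge(self, value, stamp):
        # Keep the update only if its stamp is larger than the current one.
        if self.stamp is None or stamp > self.stamp:
            self.value, self.stamp = value, stamp


a, b = NaiveLWWRegister(), NaiveLWWRegister()
a.write("x", 11 * 60)              # A writes x at 11:00
b.merge(a.value, a.stamp)          # B receives x
b.write("j", 12 * 60 + 2)          # B's clock is one hour ahead
b.write("k", 12 * 60 + 5)
a.merge(b.value, b.stamp)          # A receives k, stamped 12:05
a.write("x", 11 * 60 + 30)         # A writes x again at 11:30: silently lost
```

The last write at A happens after k in causal order, yet it loses because its stamp is smaller than the one attached to k.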
17/31
Registers
Sequential Semantics
Register shows value v at replica i iff
$wr(v) \in O_i$
and
$\nexists\, wr(v') \in O_i \cdot wr(v) < wr(v')$
18/31
Preservation of sequential semantics
Concurrent semantics should preserve the sequential semantics
This also ensures correct sequential execution under distribution
19/31
Multi-value Registers
Concurrency semantics shows all concurrent values
$\{v \mid wr(v) \in O_i \land \nexists\, wr(v') \in O_i \cdot wr(v) \prec wr(v')\}$
Concurrent execution
A: wr(x) → wr(y) → {y, k} → wr(m) → {m}
B: (after receiving wr(x)) wr(j) → wr(k), then k is delivered to A
Dynamo shopping carts are multi-value registers with payload sets
The m value could be an application level merge of values y and k
20/31
Implementing Multi-value Registers
Concurrency can be precisely tracked with version vectors
Concurrent execution (version vectors)
A: [1, 0]x → [2, 0]y → [2, 0]y, [1, 2]k → [3, 2]m
B: (after receiving [1, 0]x) [1, 1]j → [1, 2]k, then k is delivered to A
Metadata can be compressed with a common causal context and a
single scalar per value (dotted version vectors)
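
A sketch of a multi-value register that tags each value with a full version vector, replaying the execution above. It assumes a fixed number of replicas identified by index; `MVRegister` and `dominates` are illustrative names, and the dotted version vector compression mentioned on the slide is not attempted here.

```python
def dominates(a, b):
    """True if version vector a is causally after b (pointwise >= and not equal)."""
    return a != b and all(x >= y for x, y in zip(a, b))


class MVRegister:
    """Multi-value register sketch: keeps every causally maximal write."""

    def __init__(self, index, n_replicas):
        self.index = index
        self.n = n_replicas
        self.values = set()  # set of (version vector tuple, value)

    def write(self, value):
        # The new write is after everything visible locally:
        # take the pointwise maximum, then bump the local entry.
        vv = [0] * self.n
        for seen, _ in self.values:
            vv = [max(x, y) for x, y in zip(vv, seen)]
        vv[self.index] += 1
        self.values = {(tuple(vv), value)}

    def merge(self, other):
        # Keep only values that no other known value causally overwrites.
        both = self.values | other.values
        self.values = {
            (vv, val) for vv, val in both
            if not any(dominates(other_vv, vv) for other_vv, _ in both)
        }

    def read(self):
        return {val for _, val in self.values}


a, b = MVRegister(0, 2), MVRegister(1, 2)
a.write("x")                 # [1, 0]x
b.merge(a)
b.write("j")                 # [1, 1]j
b.write("k")                 # [1, 2]k
a.write("y")                 # [2, 0]y, concurrent with k
a.merge(b)
concurrent = a.read()        # both y and k are visible
a.write("m")                 # [3, 2]m overwrites both
```

The write of m could carry an application-level merge of y and k, as the shopping-cart remark suggests.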
21/31
Registers in Redis
LWW arbitration
Multi-value registers allow executions leading to concurrent values
Presenting concurrent values is at odds with the sequential API
Redis both tracks causality and registers wall-clock times
Querying uses Last-Writer-Wins selection among concurrent values
This preserves correctness of sequential semantics
A value with clock 12:05 can still be causally overwritten at 11:30
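
One way to combine the two ideas is sketched below: causality decides which values survive, and the wall-clock is consulted only to pick one value among concurrent survivors. This is an illustration under stated assumptions (version vectors for causality, clocks as minutes since midnight, hypothetical names) and is not a description of how Redis implements it.

```python
def dominated(a, b):
    """True if version vector a is causally before b."""
    return a != b and all(x <= y for x, y in zip(a, b))


class CausalLWWRegister:
    """Sketch: causal overwrites first, wall-clock arbitration second."""

    def __init__(self, index, n_replicas):
        self.index = index
        self.n = n_replicas
        self.entries = set()  # set of (version vector, wall clock, value)

    def write(self, value, clock):
        vv = [0] * self.n
        for seen, _, _ in self.entries:
            vv = [max(x, y) for x, y in zip(vv, seen)]
        vv[self.index] += 1
        # A write replaces everything visible locally, whatever its clock.
        self.entries = {(tuple(vv), clock, value)}

    def merge(self, other):
        both = self.entries | other.entries
        self.entries = {
            e for e in both
            if not any(dominated(e[0], f[0]) for f in both)
        }

    def read(self):
        if not self.entries:
            return None
        # Among concurrent values the largest clock wins (ties are arbitrary here).
        return max(self.entries, key=lambda e: e[1])[2]


# Causal overwrite: a value stamped 12:05 is replaced by a write at 11:30.
a, b = CausalLWWRegister(0, 2), CausalLWWRegister(1, 2)
a.write("x", 11 * 60)
b.merge(a)
b.write("j", 12 * 60 + 2)
b.write("k", 12 * 60 + 5)
a.merge(b)
a.write("x", 11 * 60 + 30)
b.merge(a)

# Concurrent writes: the clock arbitrates.
p, q = CausalLWWRegister(0, 2), CausalLWWRegister(1, 2)
p.write("y", 11 * 60 + 40)
q.write("k", 12 * 60 + 5)
p.merge(q)
```

A production design would also need a deterministic tie-break when two concurrent values carry the same clock.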
22/31
Sets
Sequential Semantics
Consider add and rmv operations
X = {. . .}, add(a) → add(c) we observe that $a, c \in X$
X = {. . .}, add(c) → rmv(c) we observe that $c \notin X$
In general, given $O_i$, the set has elements
$\{e \mid add(e) \in O_i \land \nexists\, rmv(e) \in O_i \cdot add(e) < rmv(e)\}$
23/31
Sets
Concurrency Semantics
Problem: Concurrently adding and removing the same element
Concurrent execution
A: add(x) → rmv(x) → {?} → add(x) → {x}
B: (after receiving add(x)) rmv(x) → add(x), then B's operations are delivered to A
24/31
Concurrency Semantics
Add-Wins Sets
Let’s choose Add-Wins
Consider a set of known operations $O_i$, at node i, that is ordered
by a happens-before partial order $\prec$. Set has elements
$\{e \mid add(e) \in O_i \land \nexists\, rmv(e) \in O_i \cdot add(e) \prec rmv(e)\}$
Is this familiar?
The sequential semantics applies identical rules on a total order
Redis CRDT sets are Add-Wins Sets
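
The Add-Wins rule can be evaluated directly over a polog, as in this sketch. It assumes each operation records the ids of all operations known at the replica when it was issued, which gives a simple happens-before test; `Op`, `Replica`, `before` and `add_wins` are hypothetical names, and practical implementations use far more compact metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    uid: str
    kind: str          # "add" or "rmv"
    elem: str
    seen: frozenset    # uids of all operations known when this one was issued


def before(a, b):
    """a happens-before b if b was issued with a already known."""
    return a.uid in b.seen


class Replica:
    def __init__(self, name):
        self.name = name
        self.polog = set()
        self.count = 0

    def issue(self, kind, elem):
        self.count += 1
        known = frozenset(op.uid for op in self.polog)
        self.polog.add(Op(f"{self.name}{self.count}", kind, elem, known))

    def merge(self, other):
        self.polog |= other.polog


def add_wins(polog):
    """Elements with an add that no known remove causally follows."""
    return {
        a.elem for a in polog
        if a.kind == "add" and not any(
            r.kind == "rmv" and r.elem == a.elem and before(a, r)
            for r in polog
        )
    }


# The concurrent add/remove execution: A and B both start from add(x).
a, b = Replica("A"), Replica("B")
a.issue("add", "x")
b.merge(a)
a.issue("rmv", "x")
b.issue("rmv", "x")
b.issue("add", "x")       # concurrent with A's rmv(x)
before_merge = add_wins(a.polog)
a.merge(b)
after_merge = add_wins(a.polog)
```

B's second add(x) is not followed by any remove in happens-before order, so x is present after the merge.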
25/31
Equivalence to a sequential execution?
Add-Wins Sets
Can we always explain a concurrent execution by a sequential one?
Concurrent execution
A: {x, y} → add(y) → rmv(x) → {y} → {x, y}
B: {x, y} → add(x) → rmv(y) → {x} → {x, y}
(A and B exchange their operations before the final {x, y})
Two (failed) sequential explanations
H1: {x, y} → . . . → rmv(x) → {y} (x is removed)
H2: {x, y} → . . . → rmv(y) → {x} (y is removed)
Concurrent executions can have richer outcomes
26/31
Concurrency Semantics
Remove-Wins Sets
Alternative: Let’s choose Remove-Wins
$X_i \doteq \{e \mid add(e) \in O_i \land \forall\, rmv(e) \in O_i \cdot rmv(e) \prec add(e)\}$
Remove-Wins requires more metadata than Add-Wins
Both Add and Remove-Wins have same semantics in a total order
They are different but both preserve sequential semantics
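
Both rules can be compared on a hand-built polog, as sketched here. The first polog encodes the two-replica execution used earlier for Add-Wins (each side adds one element and removes the other); the second is a totally ordered add, rmv, add sequence. The `Op` record with a `seen` set of predecessor ids is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    uid: str
    kind: str          # "add" or "rmv"
    elem: str
    seen: frozenset    # uids of operations that happened before this one


def before(a, b):
    return a.uid in b.seen


def add_wins(polog):
    """Some add of e is not followed by any rmv of e."""
    return {
        a.elem for a in polog
        if a.kind == "add" and not any(
            r.kind == "rmv" and r.elem == a.elem and before(a, r) for r in polog
        )
    }


def remove_wins(polog):
    """Some add of e follows every known rmv of e."""
    return {
        a.elem for a in polog
        if a.kind == "add" and all(
            before(r, a) for r in polog if r.kind == "rmv" and r.elem == a.elem
        )
    }


init = frozenset({"i1", "i2"})  # both replicas start from {x, y}
concurrent = {
    Op("i1", "add", "x", frozenset()),
    Op("i2", "add", "y", frozenset({"i1"})),
    Op("a1", "add", "y", init),            # replica A
    Op("a2", "rmv", "x", init | {"a1"}),
    Op("b1", "add", "x", init),            # replica B
    Op("b2", "rmv", "y", init | {"b1"}),
}

total = {
    Op("t1", "add", "z", frozenset()),
    Op("t2", "rmv", "z", frozenset({"t1"})),
    Op("t3", "add", "z", frozenset({"t1", "t2"})),
}
```

On the concurrent polog the two policies disagree (Add-Wins keeps both elements, Remove-Wins keeps none), while on the totally ordered polog they give the same set.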
27/31
Take home message
Concurrent executions are needed to deal with latency
Behaviour changes when moving from sequential to concurrent
Road to accommodate transition:
Permutation equivalence
Preserving sequential semantics
Concurrent executions lead to richer outcomes
CRDTs provide sound guidelines and encode policies
Thank you!
Carlos Baquero
Email: cbm@di.uminho.pt, Twitter: @xmal
29/31
Sequence/List
Weak/Strong Specification [Attiya et al, PODC 16]
Element x is kept
⟨⟩ → lpush(x) → ⟨x⟩, then two concurrent branches:
  rpush(b) → ⟨xb⟩
  lpush(a) → ⟨ax⟩
Merging the two branches gives ⟨axb⟩
Element x is removed (Redis enforces Strong Specification)
⟨⟩ → lpush(x) → ⟨x⟩, then three concurrent branches:
  rpush(b) → ⟨xb⟩
  rem(x) → ⟨⟩
  lpush(a) → ⟨ax⟩
Merging the three branches gives ⟨ab⟩, ¬⟨ba⟩
30/31
Causal Consistency
Redis CRDTs provide per-key causal consistency
Source FIFO (TCP)
A: apple (sent to B and C)
B: receives apple, then sends pie, then hot
C: sees pie, hot, peach?, apple (apple arrives last)
Causal consistency
A: apple
B: receives apple, then sends pie, then hot
C: sees apple, pie, hot, tasty
Strongest highly available consistency model
31/31
Causal Consistency
Redis CRDTs provide per-key causal consistency
Source FIFO (TCP)
A: add(a) (sent to B and C)
B: receives add(a), then issues rmv(a), then add(b)
C: sees rmv(a), add(b), add(a) (add(a) arrives last)
Causal consistency
A: add(a)
B: receives add(a), then issues rmv(a), then add(b)
C: sees add(a), rmv(a), add(b)
Strongest highly available consistency model
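
Causal delivery can be sketched with a receiver that buffers messages until their causal dependencies have been delivered. The version below uses vector clocks attached to each message, which is one common technique and an assumption here; the slides do not say how Redis enforces it. `CausalReceiver` and its fields are illustrative names.

```python
class CausalReceiver:
    """Buffers incoming messages until they can be delivered in causal order."""

    def __init__(self, n_replicas):
        self.delivered = [0] * n_replicas  # messages delivered so far, per sender
        self.pending = []                  # messages waiting for dependencies
        self.log = []                      # payloads in delivery order

    def _ready(self, msg):
        sender, clock, _ = msg
        # Next message from this sender, and nothing it depends on is missing.
        return clock[sender] == self.delivered[sender] + 1 and all(
            clock[k] <= self.delivered[k]
            for k in range(len(clock)) if k != sender
        )

    def receive(self, sender, clock, payload):
        self.pending.append((sender, clock, payload))
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                if self._ready(msg):
                    self.pending.remove(msg)
                    self.delivered[msg[0]] = msg[1][msg[0]]
                    self.log.append(msg[2])
                    progress = True


A, B = 0, 1
c = CausalReceiver(3)
# Messages reach C in the order shown for plain source FIFO.
c.receive(B, (1, 1, 0), "rmv(a)")   # depends on A's add(a): buffered
c.receive(B, (1, 2, 0), "add(b)")   # follows rmv(a): buffered
held_back = len(c.pending)
c.receive(A, (1, 0, 0), "add(a)")   # unblocks everything
```

Once add(a) arrives, the buffered operations are released in an order consistent with happens-before.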

RedisDay London 2018 - CRDTs and Redis From sequential to concurrent executions
