Stream Based Data Synchronization
Klemen Verdnik
1. Introduction
1.1 Who Am I, What Do I Do?
• low-level programming enthusiast (audio and video DSP routines, tight loop optimizations)
• embedded systems (graphic EQ with DSP, fleet management, mobile payment)
• familiar with iOS SDK since 2008
• vox.io (web / mobile SIP, XMPP)
• layer.com (messaging)
• obsession with synchronization protocols
2. Data Synchronization
2.1 What is Data Synchronization?
• Having data consistency across two or more networked entities
2.1.1 Example
Toggle Switch App
2.1.2 How to Design the System?
• Simple server
• Simple client
• Simple data structure
{
lightsOn: true
}
2.2 Other Use Cases
• E-mail (IMAP, POP)
• Messaging (iMessage, Hangouts)
• Photo sharing (Photo Stream, Google Photos)
• File sharing (Dropbox, iCloud Drive)
• Online text editors / spreadsheet editors (Google Docs)
• Multiplayer Games (Minecraft)
2.3 Types of Data Synchronization
• File synchronization
• Text / document synchronization
• Data model synchronization
2.4 Approaches to Data Synchronization
2.4.1 Absolute Synchronization (copying)
• Copying (wholesale transfer) is OK when dealing with small data sets (e.g. refreshing a weather forecast, an RSVP list ...)
2.4.1 Absolute Synchronization (copying)
• Figuring out differences between previously fetched data sets costs CPU and memory: O(n ⋁ m)
[ Dan, Alex, Blake, Emily, George, Caroline, You ]
≠
[ Dan, Alex, Emily, George, Caroline, You ]
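A minimal sketch of what "figuring out differences" between two fetched snapshots looks like on a client, using the guest lists above. It relies on the Swift standard library's `difference(from:)` (Swift 5.1 or later); the variable names are illustrative and not from the talk.

```swift
// Two snapshots of the same guest list, fetched at different times.
let previousGuests = ["Dan", "Alex", "Blake", "Emily", "George", "Caroline", "You"]
let currentGuests = ["Dan", "Alex", "Emily", "George", "Caroline", "You"]

// The client has to keep both snapshots in memory and compare them
// element by element to learn what changed.
let difference = currentGuests.difference(from: previousGuests)

var removedGuests: [String] = []
var insertedGuests: [String] = []
for change in difference {
    switch change {
    case .remove(offset: _, element: let name, associatedWith: _):
        removedGuests.append(name)
    case .insert(offset: _, element: let name, associatedWith: _):
        insertedGuests.append(name)
    }
}

print("removed:", removedGuests)   // removed: ["Blake"]
print("inserted:", insertedGuests) // inserted: []
```

The whole list is transferred and compared just to learn that one guest left, which is the cost the relative (delta) approach tries to avoid.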
2.4.2 Relative Synchronization (changes)
• Getting data up to date with changes (a.k.a. deltas) instead of full data sets.
2.5 What are Deltas?
Delta encoding is a way to describe differences
between two datasets.
2.5.1 How to Encode Deltas?
• Three primitive operations
insert (+): adds new values to the dataset
update (✔): updates existing values in the dataset
delete (-): deletes existing values from the dataset
• Example on how to encode binary data changes
Original data:
0000000: 4749 4654 2b31 0d00 0d00 9100 00b6 6257 GIFT+1........bW
0000010: 0804 0456 2c27 e5aa 7f21 f904 0000 0000 ...V,'...!......
0000020: 002c 0000 0000 0d00 0d00 0002 318c 8f29 .,..........1..)
0000030: 3000 7986 944f 8823 260d 0feb b620 0b03 0.y..O.#&.... ..
0000040: 2e97 e1a4 0f79 920c 60a5 28e5 c452 abc6 .....y..`.(..R..
Deltas:
[ {
type: "update",
offset: 0x03,
values: [ 0x38, 0x39, 0x61 ]
}, {
type: "insert",
offset: 0x50,
values: [ 0xCE, 0xE1, 0x50, 0x96, 0x89, 0x48, 0x9D, 0x02, 0x43, 0x62,
0x8D, 0x98, 0x28, 0x00, 0x00, 0x3B ]
} ]
Data after applying the update and then the insert:
0000000: 4749 4638 3961 0d00 0d00 9100 00b6 6257 GIF89a........bW
0000010: 0804 0456 2c27 e5aa 7f21 f904 0000 0000 ...V,'...!......
0000020: 002c 0000 0000 0d00 0d00 0002 318c 8f29 .,..........1..)
0000030: 3000 7986 944f 8823 260d 0feb b620 0b03 0.y..O.#&.... ..
0000040: 2e97 e1a4 0f79 920c 60a5 28e5 c452 abc6 .....y..`.(..R..
0000050: cee1 5096 8948 9d02 4362 8d98 2800 003b ..P..H..Cb..(..;
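The binary example can be sketched as a small patching routine. This is one possible shape under stated assumptions: the `Delta` enum, the `apply(_:to:)` function, and the `count` payload of the delete case are hypothetical, since the slides only show update and insert payloads.

```swift
// A delta is one of the three primitive operations.
// `Delta` and `apply(_:to:)` are illustrative names, not from the talk.
enum Delta {
    case insert(offset: Int, values: [UInt8])
    case update(offset: Int, values: [UInt8])
    case delete(offset: Int, count: Int)   // payload shape is an assumption
}

/// Applies the deltas in order and returns the patched bytes.
/// Offsets are assumed to be valid for the data at the time each delta is applied.
func apply(_ deltas: [Delta], to data: [UInt8]) -> [UInt8] {
    var result = data
    for delta in deltas {
        switch delta {
        case let .insert(offset, values):
            result.insert(contentsOf: values, at: offset)
        case let .update(offset, values):
            result.replaceSubrange(offset..<(offset + values.count), with: values)
        case let .delete(offset, count):
            result.removeSubrange(offset..<(offset + count))
        }
    }
    return result
}

// First six bytes of the example file: "GIFT+1".
let original: [UInt8] = [0x47, 0x49, 0x46, 0x54, 0x2B, 0x31]
let patched = apply([.update(offset: 0x03, values: [0x38, 0x39, 0x61])], to: original)
print(String(decoding: patched, as: UTF8.self)) // GIF89a
```

Only three bytes travel over the network for the header fix, instead of the whole file.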
• Example on how to encode text changes
83: //
84: // Toggles the private ivar `_lightSwitchState` boolean, updates the
85: // background image, plays a sound and transmits the change over network.
86: //
87: func toggleAndSendLightSwitchState() {
88: self.lightSwitchState = !self.lightSwitchState
> 89: self.lightSwitchClient.sendLightSwitchState(self.lightSwitchState)
90: }
• The change encoded as a diff patch
--- 89: self.lightSwitchClient.sendLightSwitchState(self.lightSwitchState)
+++ 89: self.lightSwitchClient?.sendLightSwitchState(self.lightSwitchState)
• The same change encoded as an insert operation
{
type: "insert",
offset: 2781,
values: [ "?" ]
}
• Example on how to encode custom data model changes
{
guests: [ "Alex", "Blake", "Caroline", "Dan", "Emily", "George" ]
}
{
type: "delete",
guest: [ "Blake" ]
}
{
guests: [ "Alex", "Caroline", "Dan", "Emily", "George" ]
}
3. Stream Based Synchronization
3.1 The Motivation
• Minimum data redundancy
• Speed / minimum bandwidth
• Fast writes = good concurrency characteristics
• Distributability and scalability
• Offline support
3.2 Stream of Mutations
• Clients with an open connection receive a live stream of events from the server
• Example "To Do" app
3.2.1 Example (To-do List App Data-model)
• Live synchronized list of to-do tasks
• Task element consists of: checkbox, label and color
• Tasks can be added, edited and removed
public struct Todo {
public class List: NSObject {
private var tasks: Array<Task> = []
public func create(title: String, label: Task.ColorL
public func update(identifier: NSUUID, completed: Bo
public func remove(identifier: NSUUID)
}
}
public struct Todo {
public class Task: NSObject {
public private(set) var identifier: NSUUID
public private(set) var completed: Bool
public private(set) var title: String
public private(set) var label: ColorLabel
public enum ColorLabel: UInt8 {
case None = 0, Red, Orange, Yellow, Green,
Turquoise, Blue, Purple, Pink
}
}
}
3.2.2 Example (To-do List Sync Data-model)
• Todo.List user actions turn into events (𝚫)
• Simple concrete objects describing changes
• Serializable
public struct Sync {
public class Event: NSObject, Serializable {
public enum Type: UInt8 {
case Insert = 0, Update, Delete
}
public private(set) var type: Type
public private(set) var identifier: NSUUID
public private(set) var completed: Bool?
public private(set) var title: String?
public private(set) var label: Int?
}
}
public protocol Serializable: class {
init(fromDictionary dictionary: Dictionary<String, AnyObject>)
func toDictionary() -> Dictionary<String, AnyObject>
}
• Creating new task
{ // serialized event structure
type: 0, // 0 = Insert
identifier: "cb55ceec-b9ae-4bd9-8783-7dbf3e9cb2cd", // client generated id
completed: false, // an incomplete task
title: "Buy Milk", // task description
label: 0 // color tag
}
• Editing an existing task
{ // event structure
type: 1, // 1 = Update
identifier: "cb55ceec-b9ae-4bd9-8783-7dbf3e9cb2cd", // reference to task
completed: true // new state
}
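The serialized event structures above can be approximated with `Codable`. This is a sketch under assumptions: `EventPayload` is a hypothetical stand-in for the talk's `Sync.Event`, and JSON is an assumed wire format; the talk itself serializes through a dictionary-based `Serializable` protocol.

```swift
import Foundation

// Illustrative value type mirroring the serialized event structure.
// `EventPayload` is a hypothetical name; the talk's own type is Sync.Event.
struct EventPayload: Codable, Equatable {
    var type: UInt8          // 0 = Insert, 1 = Update, 2 = Delete
    var identifier: UUID     // client-generated id of the task
    var completed: Bool?     // nil means "unchanged" in an update
    var title: String?
    var label: Int?
}

let update = EventPayload(
    type: 1,
    identifier: UUID(uuidString: "cb55ceec-b9ae-4bd9-8783-7dbf3e9cb2cd")!,
    completed: true,
    title: nil,
    label: nil
)

// The synthesized encoder omits optional properties that are nil,
// so an update carries only the fields that changed.
let data = try! JSONEncoder().encode(update)
let decoded = try! JSONDecoder().decode(EventPayload.self, from: data)
print(String(decoding: data, as: UTF8.self))
```

Keeping the changed fields optional is what lets an Update event stay small: the receiver treats a missing field as "leave as is".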
3.2.3 Example (To-do List Sync and Transport)
• Receive live serialized Events from the server.
• Send serialized Events to the server.
public protocol TransportDelegate: class {
func transport(transport: Transport, didReceiveObject object: Serializable)
func transportDidConnect(transport: Transport)
func transportDidDisconnect(transport: Transport)
}
public struct Sync {
public class Client: NSObject, TransportDelegate {
public private(set) var stream: Stream = Stream()
public private(set) var transport: Transport
public private(set) var todoList: Todo.List
public private(set) var publishedEvents: Array<Event> = []
private func publish(event: Event) -> Bool
public func transport(transport: Transport, didReceiveObject object: Serializab
}
}
3.3 Let the Streaming Begin
• Data is consistent as long as clients remain connected
• Missing out on events puts the client out of sync
{ // event structure
type: 1,
identifier: "cb55ceec-b9ae-4bd9-8783-7dbf3e9cb2cd",
completed: true
}
• Clients can recover from an out-of-sync state
• Besides broadcasting, the server should also be responsible for preserving the events
3.4 Persistent Stream
• Think of it as a linear magnetic tape, or as storage with WORM (write once, read many) behavior
• Append only
• Immutable events
• Journal of all the events that have happened
How does a client know if it has got all the events?
• Always copy all the events? (too expensive)
• Integrity check by hashing events? (only detects mismatches)
3.5 Event Discovery
• Sequencing Events on server
public struct Sync {
public class Event: NSObject {
public private(set) var seq: Int?
public private(set) var type: Type
public private(set) var identifier: NSUUID?
public private(set) var completed: Bool?
public private(set) var title: String?
public private(set) var label: Int?
}
}
• Sequence is a linear function f(x) = x, reproducible on the client
• Client only needs to know the seq value of the last event, f(x ≤ 12) = x
• Figuring out missing events by subtracting the sets of seqs
// Seq values pulled from all events the client has.
// [ 0, 1, 2, 10, 11, 12 ]
let seqsOfEvents: Set<Int> = Set(events.compactMap({ $0.seq }))
// Calculated sequence ranging from 0 to 12.
// [ 0, 1, 2, 3 ... 12 ]
let seqsOfAllEvents: Set<Int> = Set(0...12)
// Diffed set of seq values.
// [ 3, 4, 5, 6, 7, 8, 9 ]
let seqsOfMissingEvents: Set<Int> = seqsOfAllEvents.subtracting(seqsOfEvents)
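To tie the persistent stream and event discovery together, here is a minimal in-memory sketch of an append-only journal that assigns seq values and lets a client fetch the events it is missing. `PersistentStream`, `append`, `lastSeq`, and `events(withSeqs:)` are hypothetical names; a real server would persist the journal and expose it over the network.

```swift
// A minimal in-memory stand-in for the server's persistent stream.
struct PersistentStream<Payload> {
    // Append-only journal; an event's seq is its index in this array.
    private(set) var journal: [Payload] = []

    init() {}

    /// Appends a payload and returns the seq assigned to it.
    mutating func append(_ payload: Payload) -> Int {
        journal.append(payload)
        return journal.count - 1
    }

    /// Seq of the most recent event, or nil when the stream is empty.
    var lastSeq: Int? { journal.isEmpty ? nil : journal.count - 1 }

    /// Returns the (seq, payload) pairs for the requested seqs, in seq order.
    func events(withSeqs seqs: Set<Int>) -> [(seq: Int, payload: Payload)] {
        return seqs.sorted()
            .filter { journal.indices.contains($0) }
            .map { (seq: $0, payload: journal[$0]) }
    }
}

var stream = PersistentStream<String>()
for title in ["Buy Milk", "Walk the dog", "Pay rent", "Call mom"] {
    _ = stream.append(title)
}

// The client holds seqs 0 and 3 and knows the last seq is 3.
let held: Set<Int> = [0, 3]
let missing = Set(0...stream.lastSeq!).subtracting(held)
let recovered = stream.events(withSeqs: missing)
print(recovered.map { $0.seq }) // [1, 2]
```

Because seq values are gapless, the client never needs a full copy or a hash comparison: the last seq alone tells it which events to ask for.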
4. Event and Model Reconciliation
Outbound and inbound reconciliation
4.1 Outbound Reconciliation
• Turning user actions (model changes) into Events
public class List: NSObject {
public func create(title: String, label: Task.ColorLabel) -> Sync.Event
public func update(identifier: NSUUID, completed: Bool?, title: String?, label: Task.ColorLabel?) -> Sync.Event?
public func remove(identifier: NSUUID) -> Sync.Event?
}
• Turning user actions (model changes) into Events
let todoList = List()
let event = todoList.create("Buy Milk", label: Task.ColorLabel.None)
print("event: \(event)")
// event: {
// type: 0, // 0 = Insert
// identifier: "cb55ceec-b9ae-4bd9-8783-7dbf3e9cb2cd", // client generated id
// completed: false, // an incomplete task
// title: "Buy Milk", // task description
// label: 0 // task without a label
// }
• Publishing events
// Sends the event to the stream over the network.
self.syncClient.publish(event)
4.2 Inbound Reconciliation
• Apply Events onto the model
public class List: NSObject {
private func apply(event: Sync.Event) -> Bool {
switch event.type {
case .Insert: // Task creation
let task = Task(identifier: event.identifier, completed: event.completed!, title: event.title!, label: Task.ColorL
self.tasks.append(task)
case .Update: // Task updates
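let task = self.task(event.identifier)
if task == nil {
return false
}
// Update events carry only the changed fields, so nothing is force-unwrapped here.
// Assumes Task.update accepts optional values and leaves nil fields untouched.
task!.update(event.completed, title: event.title, label: event.label.flatMap({ Task.ColorLabel(rawValue: UInt8($0)) }))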
case .Delete: // Task removal
if !self.removeTask(event.identifier) {
return false
}
}
return true
}
}
4.3 Offline Support
• Events generated offline have to be published eventually
• Queue generated events; drain queue for publication
• Generating redundant events while offline
4.4 Reducing the Edit Distance
• Events describing the same mutation
• Causes stream pollution
• Increases the edit distance
Simple set of rules when queueing:
1. Insert Event merges with Update Events → single Insert Event
2. Update Event merges with the rest of Update Events → single Update Event
3. Last Update Event defines final state
4. Delete Event clobbers other Event types
4.4 Reducing the Edit Distance
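public struct Sync {
    public class Event: NSObject, Serializable {
        // Merges this (newest) event with the already queued events and
        // returns the new queue. The method name and signature are assumed;
        // the slide shows only the body.
        public func merge(events: Array<Event>) -> Array<Event> {
            var mergedEvents = Array<Event>()
            for oldEvent in events.reverse() {
                if oldEvent.identifier != self.identifier {
                    // Event not mergeable, due to the identifier mismatch.
                    mergedEvents.append(oldEvent)
                    continue
                } else if self.type == Type.Delete {
                    // Rule #4
                    self.reset()
                    self.type = Type.Delete
                } else if self.type == Type.Update && (oldEvent.type == Type.Insert || oldEvent.type == Type.Update) {
                    // Rule #1, #2, #3
                    self.completed = self.completed ?? oldEvent.completed
                    self.title = self.title ?? oldEvent.title
                    self.label = self.label ?? oldEvent.label
                    if oldEvent.type == Type.Insert {
                        // Rule #1: merging into an Insert yields a single Insert.
                        self.type = Type.Insert
                    }
                }
            }
            mergedEvents.append(self)
            return mergedEvents
        }
    }
}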
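The four queueing rules can also be sketched as a self-contained function over value types. This is an illustration under assumptions, not the talk's implementation: `EventType`, `QueuedEvent`, and `enqueue(_:into:)` are hypothetical names, and the merged event keeps the position of the older event in the queue.

```swift
import Foundation

// Illustrative value types; the talk's own class is Sync.Event.
enum EventType { case insert, update, delete }

struct QueuedEvent: Equatable {
    var type: EventType
    var identifier: UUID
    var completed: Bool? = nil
    var title: String? = nil
    var label: Int? = nil
}

/// Adds `new` to the offline queue, folding it into an earlier queued event
/// for the same task when one of the four rules applies.
func enqueue(_ new: QueuedEvent, into queue: [QueuedEvent]) -> [QueuedEvent] {
    var queue = queue
    guard let index = queue.lastIndex(where: { $0.identifier == new.identifier }) else {
        queue.append(new)   // nothing queued for this task yet
        return queue
    }
    let old = queue[index]
    switch (old.type, new.type) {
    case (.insert, .update), (.update, .update):
        // Rules 1-3: keep the older event's type, and let the newer
        // event's fields win wherever it sets them.
        var merged = old
        merged.completed = new.completed ?? old.completed
        merged.title = new.title ?? old.title
        merged.label = new.label ?? old.label
        queue[index] = merged
    case (_, .delete):
        // Rule 4: the delete replaces whatever was queued for this task.
        queue[index] = QueuedEvent(type: .delete, identifier: new.identifier)
    default:
        queue.append(new)
    }
    return queue
}

let id = UUID()
var queue: [QueuedEvent] = []
queue = enqueue(QueuedEvent(type: .insert, identifier: id, completed: false, title: "Buy Milk", label: 0), into: queue)
queue = enqueue(QueuedEvent(type: .update, identifier: id, completed: true), into: queue)
print(queue.count) // 1
```

After the two calls the queue holds a single Insert event whose `completed` is already `true`, so only one event is published once the client is back online.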
4.5 Conflict Resolution
• Concurrent systems experience conflicts when two
or more nodes (clients) work on the same resource
at the same time.
• Example: a client deletes a Todo Task before
another client tries to mutate it.
Possible conflict resolutions:
• Bring the deleted task back (last writer wins)
• Deleted task stays deleted (first writer wins)
• Ask the User what to do? (requires user interaction)
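The first two resolutions can be sketched as a policy switch applied while replaying mutations in stream order. Everything here is hypothetical (`ConflictPolicy`, `Mutation`, `reconcile`), and tasks are reduced to a dictionary of titles to keep the example short.

```swift
// Illustrative policy names for the first two options listed above.
enum ConflictPolicy { case lastWriterWins, firstWriterWins }

enum Mutation {
    case delete(id: String)
    case update(id: String, title: String)
}

/// Applies mutations in stream order to a dictionary of task titles keyed by id.
/// The policy only matters when an update arrives for a task that was deleted.
func reconcile(_ mutations: [Mutation],
               tasks: [String: String],
               policy: ConflictPolicy) -> [String: String] {
    var tasks = tasks
    var deleted: Set<String> = []
    for mutation in mutations {
        switch mutation {
        case let .delete(id):
            tasks[id] = nil
            deleted.insert(id)
        case let .update(id, title):
            if deleted.contains(id) {
                // Conflict: the update refers to a task another client removed.
                if policy == .lastWriterWins {
                    tasks[id] = title   // bring the deleted task back
                    deleted.remove(id)
                }
                // firstWriterWins: the task stays deleted.
            } else if tasks[id] != nil {
                tasks[id] = title
            }
        }
    }
    return tasks
}

let conflicting: [Mutation] = [.delete(id: "milk"), .update(id: "milk", title: "Buy oat milk")]
let revived = reconcile(conflicting, tasks: ["milk": "Buy Milk"], policy: .lastWriterWins)
let gone = reconcile(conflicting, tasks: ["milk": "Buy Milk"], policy: .firstWriterWins)
print(revived, gone)
```

Whichever policy is chosen, every client has to apply the same one, otherwise replaying the same stream would produce different models.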
5. Order of Events
Event sequence dictates the order in which events were written to the stream; this puts the Events in total order.
• Task objects will be in the exact same order, defined by the Event.seq
• Task mutations will be applied in the same manner on all clients
5. Order of Events (total order)
• Queued events must be published in batches
5.1 Total Order (sequential writes)
• Synchronized sequential writes block other clients from writing, which violates our fast concurrent writes requirement
(Figure: serial writes vs. concurrent writes)
5.1 Total Order (offline support)
• Both clients online
• Left client loses connection
• Offline client adds more To-do tasks to the list
• Online client also adds a To-do task to the list
• Left client comes back online: events generated offline get published and fall at the end (higher seq values)
5.2 Causal Order
Causes must precede their effects: effects come after causes, and never before.
5.2 Causal Order
• A generated Event is an effect caused by the user taking action / responding to the UI.
• Events should be reconciled in the same order as they were generated by clients.
• Events should be applied onto the app model under the same conditions that held when the author generated them.
• Total order cannot guarantee that Events will be written to the stream in the same order they were generated.
5.2.1 Order Based on Timestamps
• Client B's events are written before Client A's, even though Client A generated them first.
• Encoding local time with events.
• Sorting events based on the embedded timestamp.
• No guarantee time will be the same on all devices
• Clock skew
• Manual override
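A small sketch of why embedded timestamps are unreliable. The numbers are made up for illustration: Client A acts first but its clock runs 30 seconds fast, so sorting by local timestamp puts its event after Client B's. `StampedEvent` is a hypothetical type.

```swift
// Illustrative event carrying the author's local clock reading.
struct StampedEvent {
    var title: String
    var localTimestamp: Double   // seconds, as reported by the device clock
}

// Client A acts first (true time 100 s), but its clock runs 30 s fast.
// Client B acts 10 s later with an accurate clock.
let skew = 30.0
let fromA = StampedEvent(title: "A: rename task", localTimestamp: 100 + skew)
let fromB = StampedEvent(title: "B: complete task", localTimestamp: 110)

let byTimestamp = [fromA, fromB].sorted { $0.localTimestamp < $1.localTimestamp }
print(byTimestamp.map { $0.title }) // ["B: complete task", "A: rename task"]
```

The sorted order contradicts the order in which the actions really happened, which is the motivation for tracking causality without relying on wall-clock time.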
5.2.2 Version Vectors
• Reconstructing the Events' order as it was perceived by the author, based on happened-before information.
• Provides causality tracking, a basic principle in some optimistic (lazy) replication algorithms.
• Allows the client to operate independently from the server.
• When all clients eventually publish their events, other online clients are brought into a consistent state (eventual consistency).
How to encode happened-before information?
1. Information about the last seen event (its event.seq), stored in event.precedingSeq
2. Keep unpublished events in order: event.clientSeq
3. Order events with a sorting closure
public struct Sync {
public class Event: NSObject {
/// Event sorting closure
static public let causalOrder = { (e1: Event, e2: Event) -> Bool in
if e1.precedingSeq == e2.precedingSeq {
return e1.clientSeq < e2.clientSeq
}
return e1.precedingSeq < e2.precedingSeq
}
public private(set) var seq: Int?
public private(set) var precedingSeq: Int
public private(set) var clientSeq: Int
public private(set) var type: Type
public private(set) var identifier: NSUUID?
public private(set) var completed: Bool?
public private(set) var title: String?
public private(set) var label: Int?
}
}
Minor adjustment in outbound / inbound reconciliation:
public struct Sync {
public class Event: NSObject, Serializable {
var mergedEvents = Array<Event>()
for oldEvent in events.sort(Event.causalOrder) {
if oldEvent.identifier != self.identifier {
// etc...
public struct Todo {
public class List: NSObject, ModelReconciler {
public func apply(events: Array<Sync.Event>) -> Bool {
for event in events.sort(Sync.Event.causalOrder) {
let success = self.apply(event)
// etc...
• Newly published events generated offline are ordered by their causality.
• Concurrent writes: no need for batched writes anymore, due to clientSeq.
• Concurrent writes: events can be written in an undetermined order; the order can be reconstructed on clients.
(Figure: serial writes vs. concurrent writes)
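Reconstructing the order on a client can be sketched with the same comparison as the `causalOrder` closure. `OrderedEvent` is a hypothetical value type carrying only the ordering fields, and the seq values below are invented for the offline scenario from section 5.1.

```swift
// Illustrative value type carrying only the ordering fields.
struct OrderedEvent: Equatable {
    var seq: Int?            // assigned by the server once written to the stream
    var precedingSeq: Int    // seq of the last event the author had seen
    var clientSeq: Int       // author's own counter for unpublished events
    var title: String
}

/// Same comparison as the causalOrder closure: first by what the author had
/// seen, then by the order in which the author generated the events.
func causalOrder(_ e1: OrderedEvent, _ e2: OrderedEvent) -> Bool {
    if e1.precedingSeq == e2.precedingSeq {
        return e1.clientSeq < e2.clientSeq
    }
    return e1.precedingSeq < e2.precedingSeq
}

// Stream order: the online client's event was written first (seq 3), then the
// two events the other client generated offline, after it had last seen seq 1.
let written = [
    OrderedEvent(seq: 3, precedingSeq: 2, clientSeq: 0, title: "Online task"),
    OrderedEvent(seq: 4, precedingSeq: 1, clientSeq: 0, title: "Offline task 1"),
    OrderedEvent(seq: 5, precedingSeq: 1, clientSeq: 1, title: "Offline task 2"),
]
let reconstructed = written.sorted(by: causalOrder)
print(reconstructed.map { $0.title })
// ["Offline task 1", "Offline task 2", "Online task"]
```

The comparison does not say how to order two events from different clients that share both `precedingSeq` and `clientSeq`; a real implementation would need an additional tie-breaker, which this sketch leaves out.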
6. Advantages
• Shared source - minimal redundancy
• Lightweight data structure - fast delivery
• Minimal server logic (low CPU)
• Short writes - high concurrency
• Scalable / distributable
• Offline support
7. Disadvantages
• Server simplicity = client complexity
• Rogue clients = stream pollution
• Clients must read full stream
• Partial sync difficult to implement
END_OF_STREAM
questions?
klemen.verdnik@gmail.com
github.com/chipxsd
@chipxsd

More Related Content

PDF
First few months with Kotlin - Introduction through android examples
PDF
Clojure for Data Science
PPTX
Procedural Content Generation with Clojure
PPTX
Enter The Matrix
PDF
Introduction to CNN with Application to Object Recognition
PPTX
Clojure for Data Science
PDF
Machine Learning Live
PDF
Geoff Rothman Presentation on Parallel Processing
First few months with Kotlin - Introduction through android examples
Clojure for Data Science
Procedural Content Generation with Clojure
Enter The Matrix
Introduction to CNN with Application to Object Recognition
Clojure for Data Science
Machine Learning Live
Geoff Rothman Presentation on Parallel Processing

What's hot (13)

PDF
The Ring programming language version 1.5.1 book - Part 28 of 180
KEY
R for Pirates. ESCCONF October 27, 2011
PDF
From Lisp to Clojure/Incanter and RAn Introduction
PDF
Data Love Conference - Window Functions for Database Analytics
PDF
Apache Cassandra & Data Modeling
PDF
Big Data LDN 2017: From Zero to AI in 30 Minutes
PDF
[1062BPY12001] Data analysis with R / week 2
PDF
6. Vectors – Data Frames
 
PDF
Particle Filter Tracking in Python
PDF
CS253: Minimum spanning Trees (2019)
PDF
Time Series Meetup: Virtual Edition | July 2020
PDF
Multiclassification with Decision Tree in Spark MLlib 1.3
PDF
The Ring programming language version 1.4.1 book - Part 7 of 31
The Ring programming language version 1.5.1 book - Part 28 of 180
R for Pirates. ESCCONF October 27, 2011
From Lisp to Clojure/Incanter and RAn Introduction
Data Love Conference - Window Functions for Database Analytics
Apache Cassandra & Data Modeling
Big Data LDN 2017: From Zero to AI in 30 Minutes
[1062BPY12001] Data analysis with R / week 2
6. Vectors – Data Frames
 
Particle Filter Tracking in Python
CS253: Minimum spanning Trees (2019)
Time Series Meetup: Virtual Edition | July 2020
Multiclassification with Decision Tree in Spark MLlib 1.3
The Ring programming language version 1.4.1 book - Part 7 of 31
Ad

Viewers also liked (7)

PDF
High Temperature Cabinet Oven by ACMAS Technologies Pvt Ltd.
PDF
匆匆数年 - 在豆瓣
PDF
美团点评技术沙龙09 - 外卖O2O的用户画像实践
PDF
Track A-1: Cloudera 大數據產品和技術最前沿資訊報告
PDF
唯品会大数据实践 Sacc pub
PDF
豆瓣数据架构实践
PDF
Visual Design with Data
High Temperature Cabinet Oven by ACMAS Technologies Pvt Ltd.
匆匆数年 - 在豆瓣
美团点评技术沙龙09 - 外卖O2O的用户画像实践
Track A-1: Cloudera 大數據產品和技術最前沿資訊報告
唯品会大数据实践 Sacc pub
豆瓣数据架构实践
Visual Design with Data
Ad

Similar to Stream-based Data Synchronization (20)

PDF
MongoDB Solution for Internet of Things and Big Data
PDF
Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...
PDF
OrientDB - The 2nd generation of (multi-model) NoSQL
PPTX
DEVNET-1163 Data in Motion APIs
PDF
MongoDB: Optimising for Performance, Scale & Analytics
PDF
NoSQL Infrastructure
PDF
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
PPTX
Demonstration/explanation of Bitonic Sort Algorithm
PPTX
Se2017 query-optimizer
PPTX
PPTX
IncQuery-D: Incremental Queries in the Cloud
PDF
React London April- Fully functional: Central state is a great fit for React ...
PDF
iBeacons - the new low-powered way of location awareness
PPTX
Getting started cpp full
PPTX
Accelerating analytics on the Sensor and IoT Data.
PDF
Deep Style: Using Variational Auto-encoders for Image Generation
PPTX
SQL Server Deep Dive, Denis Reznik
PDF
MongoDB World 2018: Overnight to 60 Seconds: An IOT ETL Performance Case Study
PPTX
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
PDF
SparkSQL: A Compiler from Queries to RDDs
MongoDB Solution for Internet of Things and Big Data
Lab pratico per la progettazione di soluzioni MongoDB in ambito Internet of T...
OrientDB - The 2nd generation of (multi-model) NoSQL
DEVNET-1163 Data in Motion APIs
MongoDB: Optimising for Performance, Scale & Analytics
NoSQL Infrastructure
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Demonstration/explanation of Bitonic Sort Algorithm
Se2017 query-optimizer
IncQuery-D: Incremental Queries in the Cloud
React London April- Fully functional: Central state is a great fit for React ...

Stream-based Data Synchronization

  • 3. 1.1 Who Am I, What Do I Do? • low-level programming enthusiast (audio and video DSP routines, tight loop optimizations) • embedded systems (graphic EQ with DSP, fleet management, mobile payment) • familiar with iOS SDK since 2008 • vox.io (web / mobile SIP, XMPP) • layer.com (messaging) • obsession with synchronization protocols
  • 5. 2.1 What is Data Synchronization? • Having data consistency across two or more networked entities
  • 9. 2.1.2 How to Design the System? • Simple server • Simple client • Simple data structure { lightsOn: true } Toggle Switch App
  • 10. 2.1.2 How to Design the System?
  • 11. 2.2 Other Use Cases • E-mail (IMAP, POP) • Messaging (iMessages, Hangouts) • Photo sharing (Photo Stream, Google Photos) • File sharing (Dropbox, iCloud Drive) • Online text editors / spreadsheet editors (Google Docs) • Multiplayer Games (Minecraft)
  • 14. 2.3 Types of Data Synchronization • File synchronization • Text / document synchronization • Data model synchronization
  • 15. 2.4 Approaches to Data Synchronization
  • 16. 2.4.1 Absolute Synchronization (copying) • Copying (wholesale transfer) is ok when dealing with small data-sets (e.g. refreshing weather forecast, RSVP list ...)
  • 19. 2.4.1 Absolute Synchronization (copying) • Figuring out differences between previously fetched data-sets costs CPU and memory O(n ⋁ m) Dan Alex Blake Emily George Caroline You Dan Alex Emily George Caroline You ≠
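The comparison pictured on this slide can be sketched as a set difference. This is a minimal illustration in current Swift (the deck's own snippets use older Swift syntax); the variable names are made up for the example, and guest names are assumed to be unique.

```swift
// Two snapshots of the same guest list, as on the slide (Blake is missing from the second).
let previousGuests = ["Dan", "Alex", "Blake", "Emily", "George", "Caroline", "You"]
let fetchedGuests = ["Dan", "Alex", "Emily", "George", "Caroline", "You"]

// Both snapshots have to be held and hashed in full before they can be compared,
// which is the CPU and memory cost the slide refers to.
let previousSet = Set(previousGuests)
let fetchedSet = Set(fetchedGuests)

let removedGuests = previousSet.subtracting(fetchedSet)
let addedGuests = fetchedSet.subtracting(previousSet)

print("removed:", removedGuests.sorted(), "added:", addedGuests.sorted())
```

Note that the whole list was transferred and compared just to learn that one entry went away.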
  • 20. 2.4.2 Relative Synchronization (changes) • Getting data up to date by transferring only the changes (a.k.a. deltas) instead of full data sets.
  • 21. 2.5 What are Deltas? Delta encoding is a way to describe differences between two datasets.
  • 22. 2.5.1 How to Encode Deltas? • Three primitive operations: insert (+) ― adds new values to dataset; update (✔) ― updates existing values in dataset; delete (-) ― deletes existing values from dataset
  • 23. 2.5.1 How to Encode Deltas? • Example on how to encode binary data changes (original data and the delta operations)
    0000000: 4749 4654 2b31 0d00 0d00 9100 00b6 6257 GIFT+1........bW
    0000010: 0804 0456 2c27 e5aa 7f21 f904 0000 0000 ...V,'...!......
    0000020: 002c 0000 0000 0d00 0d00 0002 318c 8f29 .,..........1..)
    0000030: 3000 7986 944f 8823 260d 0feb b620 0b03 0.y..O.#&.... ..
    0000040: 2e97 e1a4 0f79 920c 60a5 28e5 c452 abc6 .....y..`.(..R..
    [ { type: "update", offset: 0x03, values: [ 0x38, 0x39, 0x61 ] },
      { type: "insert", offset: 0x50, values: [ 0xCE, 0xE1, 0x50, 0x96, 0x89, 0x48, 0x9D, 0x02, 0x43, 0x62, 0x8D, 0x98, 0x28, 0x00, 0x00, 0x3B ] } ]
  • 24. 2.5.1 How to Encode Deltas? • Example on how to encode binary data changes (data after the update operation)
    0000000: 4749 4638 3961 0d00 0d00 9100 00b6 6257 GIF89a........bW
    0000010: 0804 0456 2c27 e5aa 7f21 f904 0000 0000 ...V,'...!......
    0000020: 002c 0000 0000 0d00 0d00 0002 318c 8f29 .,..........1..)
    0000030: 3000 7986 944f 8823 260d 0feb b620 0b03 0.y..O.#&.... ..
    0000040: 2e97 e1a4 0f79 920c 60a5 28e5 c452 abc6 .....y..`.(..R..
    [ { type: "update", offset: 0x03, values: [ 0x38, 0x39, 0x61 ] },
      { type: "insert", offset: 0x50, values: [ 0xCE, 0xE1, 0x50, 0x96, 0x89, 0x48, 0x9D, 0x02, 0x43, 0x62, 0x8D, 0x98, 0x28, 0x00, 0x00, 0x3B ] } ]
  • 25. 2.5.1 How to Encode Deltas? • Example on how to encode binary data changes (data after the insert operation)
    0000000: 4749 4638 3961 0d00 0d00 9100 00b6 6257 GIF89a........bW
    0000010: 0804 0456 2c27 e5aa 7f21 f904 0000 0000 ...V,'...!......
    0000020: 002c 0000 0000 0d00 0d00 0002 318c 8f29 .,..........1..)
    0000030: 3000 7986 944f 8823 260d 0feb b620 0b03 0.y..O.#&.... ..
    0000040: 2e97 e1a4 0f79 920c 60a5 28e5 c452 abc6 .....y..`.(..R..
    0000050: cee1 5096 8948 9d02 4362 8d98 2800 003b ..P..H..Cb..(..;
    [ { type: "update", offset: 0x03, values: [ 0x38, 0x39, 0x61 ] },
      { type: "insert", offset: 0x50, values: [ 0xCE, 0xE1, 0x50, 0x96, 0x89, 0x48, 0x9D, 0x02, 0x43, 0x62, 0x8D, 0x98, 0x28, 0x00, 0x00, 0x3B ] } ]
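As a rough sketch of how such a delta list could be applied to a byte buffer, the following uses a hypothetical `ByteDelta` enum and `applyDeltas` function (neither appears in the deck) and shortened sample data. It is written in current Swift and assumes every offset is valid at the moment its operation runs.

```swift
// One delta operation on a byte buffer. The three cases mirror the
// insert / update / delete primitives; the type itself is hypothetical.
enum ByteDelta {
    case update(offset: Int, values: [UInt8])
    case insert(offset: Int, values: [UInt8])
    case delete(offset: Int, count: Int)
}

// Applies the operations in order and returns the patched copy.
func applyDeltas(_ deltas: [ByteDelta], to bytes: [UInt8]) -> [UInt8] {
    var result = bytes
    for delta in deltas {
        switch delta {
        case let .update(offset, values):
            // Overwrite existing bytes in place.
            result.replaceSubrange(offset..<(offset + values.count), with: values)
        case let .insert(offset, values):
            // Add new bytes, shifting anything after the offset.
            result.insert(contentsOf: values, at: offset)
        case let .delete(offset, count):
            result.removeSubrange(offset..<(offset + count))
        }
    }
    return result
}

// "GIFT+1" becomes "GIF89a" after updating three bytes at offset 0x03;
// the insert then appends two bytes at the end of the buffer.
let original: [UInt8] = [0x47, 0x49, 0x46, 0x54, 0x2B, 0x31]
let patched = applyDeltas([
    .update(offset: 0x03, values: [0x38, 0x39, 0x61]),
    .insert(offset: 0x06, values: [0x00, 0x3B]),
], to: original)

print(String(decoding: patched.prefix(6), as: UTF8.self))
```

Only the changed bytes travel over the wire; the receiver rebuilds the new buffer locally.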
  • 26. 2.5.1 How to Encode Deltas? • Example on how to encode text changes 83: // 84: // Toggles the private ivar `_lightSwitchState` boolean, updates the 85: // background image, plays a sound and transmits the change over network. 86: // 87: func toggleAndSendLightSwitchState() { 88: self.lightSwitchState = !self.lightSwitchState > 89: self.lightSwitchClient.sendLightSwitchState(self.lightSwitchState) 90: }
  • 27. 2.5.1 How to Encode Deltas? • Example on how to encode text changes (diff patch) 83: // 84: // Toggles the private ivar `_lightSwitchState` boolean, updates the 85: // background image, plays a sound and transmits the change over network. 86: // 87: func toggleAndSendLightSwitchState() { 88: self.lightSwitchState = !self.lightSwitchState > 89: self.lightSwitchClient?.sendLightSwitchState(self.lightSwitchState) 90: } --- 89: self.lightSwitchClient.sendLightSwitchState(self.lightSwitchState) +++ 89: self.lightSwitchClient?.sendLightSwitchState(self.lightSwitchState)
  • 28. 2.5.1 How to Encode Deltas? • Example on how to encode text changes (insert operation) 83: // 84: // Toggles the private ivar `_lightSwitchState` boolean, updates the 85: // background image, plays a sound and transmits the change over network. 86: // 87: func toggleAndSendLightSwitchState() { 88: self.lightSwitchState = !self.lightSwitchState > 89: self.lightSwitchClient?.sendLightSwitchState(self.lightSwitchState) 90: } { type: "insert", offset: 2781, values: [ "?" ] }
  • 29. 2.5.1 How to Encode Deltas? • Example on how to encode custom data model changes { guests: [ "Alex", "Blake", "Caroline", "Dan", "Emily", "George" ] }
  • 30. 2.5.1 How to Encode Deltas? • Example on how to encode custom data model changes { guests: [ "Alex", "Blake", "Caroline", "Dan", "Emily", "George" ] } { type: "delete", guest: [ "Blake" ] } { guests: [ "Alex", "Caroline", "Dan", "Emily", "George" ] }
  • 31. 3. Stream Based Synchronization
  • 32. 3.1 The Motivation • Minimum data redundancy
  • 33. 3.1 The Motivation • Speed / minimum bandwidth
  • 34. 3.1 The Motivation • Fast writes = good concurrency characteristics
  • 35. 3.1 The Motivation • Distributability and scalability
  • 36. 3.1 The Motivation • Offline support
  • 37. 3.2 Stream of Mutations • Clients with an open connection receive a live stream of events from the server
  • 39. 3.2 Stream of Mutations • Example "To Do" app
  • 42. 3.2.1 Example (To-do List App Data-model) • Live synchronized list of to-do tasks • Task element consists of: checkbox, label and color • Tasks can be added, edited and removed public struct Todo { public class List: NSObject { private var tasks: Array<Task> = [] public func create(title: String, label: Task.ColorL public func update(identifier: NSUUID, completed: Bo public func remove(identifier: NSUUID) } } public struct Todo { public class Task: NSObject { public private(set) var identifier: NSUUID public private(set) var completed: Bool public private(set) var title: String public private(set) var label: ColorLabel public enum ColorLabel: UInt8 { case None = 0, Red, Orange, Yellow, Green, Turquoise, Blue, Purple, Pink } } }
  • 46. 3.2.2 Example (To-do List Sync Data-model) • Todo.List user actions turn into events (𝚫) • Simple concrete objects describing changes • Serializable public struct Sync { public class Event: NSObject, Serializable { public enum Type: UInt8 { case Insert = 0, Update, Delete } public private(set) var type: Type public private(set) var identifier: NSUUID public private(set) var completed: Bool? public private(set) var title: String? public private(set) var label: Int? } } public protocol Serializable: class { init(fromDictionary dictionary: Dictionary<String, AnyObject>) func toDictionary() -> Dictionary<String, AnyObject> }
  • 48. 3.2.2 Example (To-do List Sync Data-model) • Creating new task { // serialized event structure type: 0, // 0 = Insert identifier: "cb55ceec-b9ae-4bd9-8783-7dbf3e9cb2cd", // client generated id completed: false, // an incomplete task title: "Buy Milk", // task description label: 0 // color tag }
  • 49. 3.2.2 Example (To-do List Sync Data-model) • Editing an existing task { // event structure type: 1, // 1 = Update identifier: "cb55ceec-b9ae-4bd9-8783-7dbf3e9cb2cd", // reference to task completed: true // new state }
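The deck serializes events through its own dictionary-based `Serializable` protocol. As an alternative illustration only, the sketch below swaps in Swift's `Codable` with `JSONEncoder` / `JSONDecoder`; the `Event` struct here is a hypothetical value-type mirror of `Sync.Event`, with field names taken from the slides.

```swift
import Foundation

// Hypothetical Codable mirror of the deck's Sync.Event.
struct Event: Codable, Equatable {
    enum Kind: Int, Codable { case insert = 0, update, delete }
    var type: Kind
    var identifier: UUID
    var completed: Bool?
    var title: String?
    var label: Int?
}

// An update event that only carries the changed field, as on the slide.
let id = UUID(uuidString: "cb55ceec-b9ae-4bd9-8783-7dbf3e9cb2cd")!
let update = Event(type: .update, identifier: id, completed: true, title: nil, label: nil)

// Round trip: encode for the wire, decode on the receiving side.
let payload = try JSONEncoder().encode(update)
let decoded = try JSONDecoder().decode(Event.self, from: payload)

print(String(decoding: payload, as: UTF8.self))
```

With the synthesized `Codable` conformance, optional fields that are `nil` are left out of the encoded payload, which keeps update events small.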
  • 51. 3.2.3 Example (To-do List Sync and Transport) • Receive live serialized Events from the server. • Send serialized Events to server. public protocol TransportDelegate: class { func transport(transport: Transport, didReceiveObject object: Serializable) func transportDidConnect(transport: Transport) func transportDidDisconnect(transport: Transport) } public struct Sync { public class Client: NSObject, TransportDelegate { public private(set) var stream: Stream = Stream() public private(set) var transport: Transport public private(set) var todoList: Todo.List public private(set) var publishedEvents: Array<Event> = [] private func publish(event: Event) -> Bool public func transport(transport: Transport, didReceiveObject object: Serializab } }
  • 53. 3.3 Let the Streaming Begin • Data consistent, as long as clients remain connected
  • 54. 3.3 Let the Streaming Begin • Missing out on events puts the client out-of-sync
  • 55. 3.3 Let the Streaming Begin • Missing out on events puts the client out-of-sync { // event structure type: 1, identifier: "cb55ceec-b9ae-4bd9-8783-7dbf3e9cb2cd", completed: true }
  • 56. 3.3 Let the Streaming Begin • Data stays consistent as long as clients remain connected • Missing out on events puts the client out of sync • Clients can recover from an out-of-sync state • Besides broadcasting, the server should also be responsible for preserving the events
  • 58. 3.4 Persistent Stream • Think of it as a linear magnetic tape, or as a storage with a WORM behavior • Append only • Immutable events • Journal of all the events that have happened
  • 59. 3.4 Persistent Stream How does a client know if it's got all the events? • Always copy all the events? (too expensive) • Integrity check by hashing events? (only detects mismatches)
  • 62. 3.5 Event Discovery • Sequencing Events on server public struct Sync { public class Event: NSObject { public private(set) var seq: Int? public private(set) var type: Type public private(set) var identifier: NSUUID? public private(set) var completed: Bool? public private(set) var title: String? public private(set) var label: Int? } }
  • 63. 3.5 Event Discovery • Sequencing Events on server • Sequence is a linear function f(x)=x reproducible on client
  • 67. 3.5 Event Discovery • Client only needs to know the seq value of the last event f(x<12)=x • Figuring out missing events by subtracting the set of seqs
    // Seq values pulled from all events the client has.
    // [ 0, 1, 2, 10, 11, 12 ]
    let seqsOfEvents = Set(events.flatMap({ $0.seq }))
    // Calculated sequence ranging from 0 to 12.
    // [ 0, 1, 2, 3 ... 12 ]
    let seqsOfAllEvents = Set(0...12)
    // Diffed set of seq values.
    // [ 3, 4, 5, 6, 7, 8, 9 ]
    let seqsOfMissingEvents = seqsOfAllEvents.subtract(seqsOfEvents)
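Once the gaps are known, a client would presumably ask the server for them. One possible refinement, sketched here in current Swift, collapses the missing seq values into contiguous ranges so that a single request can cover a whole gap. The function name `missingRanges` is hypothetical, and seq values are assumed to start at 0.

```swift
// Returns the contiguous gaps in 0...lastSeq that are not covered by `received`.
func missingRanges(received: [Int], lastSeq: Int) -> [ClosedRange<Int>] {
    var ranges: [ClosedRange<Int>] = []
    var expected = 0
    // Walk the known seq values in ascending order, ignoring duplicates
    // and anything outside 0...lastSeq.
    for seq in Set(received).sorted() where seq >= 0 && seq <= lastSeq {
        if seq > expected {
            ranges.append(expected...(seq - 1))
        }
        expected = seq + 1
    }
    // Anything after the highest received seq is missing as well.
    if expected <= lastSeq {
        ranges.append(expected...lastSeq)
    }
    return ranges
}

let gaps = missingRanges(received: [0, 1, 2, 10, 11, 12], lastSeq: 12)
print(gaps)
```

For the values on the slide this yields a single gap covering seq 3 through 9.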
  • 68. 4. Event and Model Reconciliation Outbound and inbound reconciliation
  • 73. 4.1 Outbound Reconciliation • Turning user actions (model changes) into Events public class List: NSObject { public func create(title: String, label: Task.ColorLabel) -> Sync.Event public func update(identifier: NSUUID, completed: Bool?, title: String?, label: Task.ColorLabel?) -> Sync.Event? public func remove(identifier: NSUUID) -> Sync.Event? }
  • 74. 4.1 Outbound Reconciliation • Turning user actions (model changes) into Events
    let todoList = List()
    let event = todoList.create("Buy Milk", label: Task.ColorLabel.None)
    print("event: \(event)")
    // event: {
    //   type: 0, // 0 = Insert
    //   identifier: "cb55ceec-b9ae-4bd9-8783-7dbf3e9cb2cd", // client generated id
    //   completed: false, // an incomplete task
    //   title: "Buy Milk", // task description
    //   label: 0 // task without a label
    // }
  • 75. 4.1 Outbound Reconciliation • Publishing events
    let todoList = List()
    let event = todoList.create("Buy Milk", label: Task.ColorLabel.None)
    print("event: \(event)")
    // event: {
    //   type: 0, // 0 = Insert
    //   identifier: "cb55ceec-b9ae-4bd9-8783-7dbf3e9cb2cd", // client generated id
    //   completed: false, // an incomplete task
    //   title: "Buy Milk", // task description
    //   label: 0 // task without a label
    // }
    // Sends the event to the stream over the network.
    self.syncClient.publish(event)
  • 77. 4.2 Inbound Reconciliation • Apply Events onto the model public class List: NSObject { private func apply(event: Sync.Event) -> Bool { switch event.type { case .Insert: // Task creation let task = Task(identifier: event.identifier, completed: event.completed!, title: event.title!, label: Task.ColorL self.tasks.append(task) case .Update: // Task updates let task = self.task(event.identifier) if task == nil { return false } task!.update(event.completed!, title: event.title!, label: Task.ColorLabel(rawValue: event.label!)!) case .Delete: // Task removal if !self.removeTask(event.identifier) { return false } } return true } }
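The slide's `apply` force-unwraps every field, while update events in section 3.2.2 carry only the fields that changed. The sketch below shows one way inbound reconciliation could tolerate such partial updates. It is written in current Swift with simplified, hypothetical `Task` and `Event` value types (string identifiers, integer labels) rather than the deck's classes.

```swift
struct Task: Equatable {
    let identifier: String
    var completed: Bool
    var title: String
    var label: Int
}

struct Event {
    enum Kind { case insert, update, delete }
    var kind: Kind
    var identifier: String
    var completed: Bool? = nil
    var title: String? = nil
    var label: Int? = nil
}

// Applies one event to the task list; returns false when it cannot be applied.
func apply(_ event: Event, to tasks: inout [Task]) -> Bool {
    let index = tasks.firstIndex { $0.identifier == event.identifier }
    switch event.kind {
    case .insert:
        // An insert needs every field and must not collide with an existing task.
        guard index == nil,
              let completed = event.completed,
              let title = event.title,
              let label = event.label else { return false }
        tasks.append(Task(identifier: event.identifier, completed: completed,
                          title: title, label: label))
    case .update:
        // Only the fields present in the event overwrite the stored values.
        guard let index = index else { return false }
        if let completed = event.completed { tasks[index].completed = completed }
        if let title = event.title { tasks[index].title = title }
        if let label = event.label { tasks[index].label = label }
    case .delete:
        guard let index = index else { return false }
        tasks.remove(at: index)
    }
    return true
}

var tasks: [Task] = []
let inserted = apply(Event(kind: .insert, identifier: "a", completed: false,
                           title: "Buy Milk", label: 0), to: &tasks)
let updated = apply(Event(kind: .update, identifier: "a", completed: true), to: &tasks)
let orphaned = apply(Event(kind: .update, identifier: "missing", completed: true), to: &tasks)
print(inserted, updated, orphaned, tasks)
```

Returning `false` for an update or delete that targets an unknown task gives the caller a hook for the conflict handling discussed in section 4.5.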
  • 78. 4.3 Offline Support • Events generated offline have to be published eventually
  • 79. 4.3 Offline Support • Queue generated events; drain queue for publication
  • 80. 4.3 Offline Support • Generating redundant events while offline
  • 82. 4.4 Reducing the Edit Distance • Events describing the same mutation
  • 83. 4.4 Reducing the Edit Distance • Causes stream pollution • Increases the edit distance
  • 84. 4.4 Reducing the Edit Distance Simple set of rules when queueing: 1. An Insert Event merges with Update Events → single Insert Event 2. An Update Event merges with the rest of the Update Events → single Update Event 3. The last Update Event defines the final state. 4. A Delete Event clobbers other Event types
  • 85. 4.4 Reducing the Edit Distance public struct Sync { public class Event: NSObject, Serializable { var mergedEvents = Array<Event>() for oldEvent in events.reverse() { if oldEvent.identifier != self.identifier { // Event not mergeable, due to the identifier mismatch. mergedEvents.append(oldEvent) continue } else if self.type == Type.Delete { // Rule #4 self.reset() self.type = Type.Delete } else if self.type == Type.Update && (oldEvent.type == Type.Insert || oldEvent.type == Type.Update) { // Rule #1, #2, #3 self.completed = self.completed ?? oldEvent.completed self.title = self.title ?? oldEvent.title self.label = self.label ?? oldEvent.label } } mergedEvents.append(self) return mergedEvents } }
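The slide's snippet appears to have lost its method signature in extraction, so here is a separate, self-contained sketch of the four queueing rules in current Swift. The `enqueue` function and the simplified `Event` struct are hypothetical. As in the slide's code, a Delete replaces earlier queued events for the same identifier rather than cancelling out a queued Insert.

```swift
struct Event: Equatable {
    enum Kind { case insert, update, delete }
    var kind: Kind
    var identifier: String
    var completed: Bool? = nil
    var title: String? = nil
    var label: Int? = nil
}

// Adds `newEvent` to the offline queue, merging it with queued events
// that target the same identifier.
func enqueue(_ newEvent: Event, into queue: [Event]) -> [Event] {
    var merged = newEvent
    var result: [Event] = []
    for old in queue {
        guard old.identifier == newEvent.identifier else {
            // Different task: keep the queued event untouched.
            result.append(old)
            continue
        }
        switch (old.kind, merged.kind) {
        case (_, .delete):
            // Rule 4: the delete clobbers whatever was queued for this task.
            break
        case (.insert, .update), (.update, .update):
            // Rules 1-3: fold the update into the queued event; newer fields win,
            // and an absorbed insert stays an insert.
            merged.kind = old.kind
            merged.completed = merged.completed ?? old.completed
            merged.title = merged.title ?? old.title
            merged.label = merged.label ?? old.label
        default:
            result.append(old)
        }
    }
    result.append(merged)
    return result
}

var queue: [Event] = []
queue = enqueue(Event(kind: .insert, identifier: "a", completed: false,
                      title: "Buy Milk", label: 0), into: queue)
queue = enqueue(Event(kind: .update, identifier: "a", completed: true), into: queue)
queue = enqueue(Event(kind: .update, identifier: "b", title: "Call Mom"), into: queue)
let beforeDelete = queue
queue = enqueue(Event(kind: .delete, identifier: "a"), into: queue)
print(beforeDelete.count, queue.count)
```

Before the delete, the queue holds one Insert for task "a" (already marked completed) and one Update for task "b"; afterwards the Insert is replaced by the Delete.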
  • 87. 4.5 Conflict Resolution • Concurrent systems experience conflicts when two or more nodes (clients) work on the same resource at the same time. • Example: a client deletes a Todo Task before another client tries to mutate it.
  • 88. 4.5 Conflict Resolution Possible conflict resolutions: • Bring the deleted task back (last writer wins) • Deleted task stays deleted (first writer wins) • Ask the User what to do? (requires user interaction)
  • 89. 5. Order of Events
  • 90. 5. Order of Events • Event sequence dictates the order in which Events were written to the stream; this puts the Events in total order • Task objects will be in the exact same order, defined by Event.seq • Task mutations will be applied in the same manner on all clients
  • 91. 5. Order of Events (total order) • Queued events must be published in batches
  • 96. 5.1 Total Order (sequential writes) • Synchronized sequential writes block other clients from writing, which violates our requirement for fast concurrent writes (figure: serial writes vs. concurrent writes)
  • 97. 5.1 Total Order (offline support) • Both clients online
  • 99. 5.1 Total Order (offline support) • Left client loses connection
  • 100. 5.1 Total Order (offline support) • Offline client adds more To-do tasks to the list
  • 101. 5.1 Total Order (offline support) • Online client also adds a Todo task to the list
  • 102. 5.1 Total Order (offline support) • Left client comes back online ― events generated offline get published and fall at the end (higher seq values)
  • 104. 5.2 Causal Order Causes must precede their effects - effects come after causes, and never before (figure: cause → effect)
  • 105. 5.2 Causal Order • A generated Event is an effect caused by the user taking action / responding to the UI. • Events should be reconciled in the same order in which clients generated them. • Events should be applied onto the app model under the same conditions that held when the author generated them. • Total order cannot guarantee that Events are written to the stream in the same order they were generated.
  • 106. 5.2.1 Order Based on Timestamps Client B's events are written before Client A's, even though Client A generated them first.
  • 107. 5.2.1 Order Based on Timestamps Encoding local time with events.
  • 108. 5.2.1 Order Based on Timestamps Sorting events based on the embedded timestamp.
  • 109. 5.2.1 Order Based on Timestamps • No guarantee time will be the same on all devices • Clock skew • Manual override
  • 110. 5.2.2 Version Vectors • Reconstruct the order of Events as the author perceived it, based on happened-before information. • Provide the basic causality-tracking principle used in some optimistic (lazy) replication algorithms. • Allow the client to operate independently from the server. • When all clients eventually publish their events, the other online clients are brought into a consistent state (eventual consistency).
  • 114. 5.2.2 Version Vectors How to encode happened-before information? 1. Information about the last seen event - event.seq (stored as precedingSeq) 2. Keep unpublished events in order - event.clientSeq 3. Order
    public struct Sync {
      public class Event: NSObject {
        /// Event sorting closure
        static public let causalOrder = { (e1: Event, e2: Event) -> Bool in
          if e1.precedingSeq == e2.precedingSeq {
            return e1.clientSeq < e2.clientSeq
          }
          return e1.precedingSeq < e2.precedingSeq
        }
        public private(set) var seq: Int?
        public private(set) var precedingSeq: Int
        public private(set) var clientSeq: Int
        public private(set) var type: Type
        public private(set) var identifier: NSUUID?
        public private(set) var completed: Bool?
        public private(set) var title: String?
        public private(set) var label: Int?
      }
    }
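To see the sorting rule at work, the sketch below re-creates it in current Swift on a stripped-down, hypothetical `Event` struct and feeds it events in stream (write) order. It only illustrates the comparison itself; how ties between different clients with equal `precedingSeq` and `clientSeq` should be broken is not covered by the slides and is left open here.

```swift
struct Event: Equatable {
    var precedingSeq: Int  // seq of the last event the author had seen
    var clientSeq: Int     // author's local counter for unpublished events
    var title: String
}

// Same rule as the slide's closure: compare precedingSeq first, then clientSeq.
func causalOrder(_ a: Event, _ b: Event) -> Bool {
    (a.precedingSeq, a.clientSeq) < (b.precedingSeq, b.clientSeq)
}

// Stream order: the online client's event was written first, even though the
// offline client generated its two events earlier (having seen only up to seq 2).
let streamOrder = [
    Event(precedingSeq: 4, clientSeq: 0, title: "B1"),
    Event(precedingSeq: 2, clientSeq: 0, title: "A1"),
    Event(precedingSeq: 2, clientSeq: 1, title: "A2"),
]

let causal = streamOrder.sorted(by: causalOrder)
print(causal.map { $0.title })
```

Every client that sorts the same set of events with this rule ends up with the same order, regardless of the order in which the stream delivered them.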
  • 121. 5.2.2 Version Vectors Minor adjustment in outbound / inbound reconciliation:
    public struct Sync {
      public class Event: NSObject, Serializable {
        var mergedEvents = Array<Event>()
        for oldEvent in events.sort(Event.causalOrder) {
          if oldEvent.identifier != self.identifier {
            // etc...
    public struct Todo {
      public class List: NSObject, ModelReconciler {
        public func apply(events: Array<Sync.Event>) -> Bool {
          for event in events.sort(Sync.Event.causalOrder) {
            let success = self.apply(event)
            // etc...
  • 122. 5.2.2 Version Vectors • Newly published events generated offline are ordered by their causality.
  • 123. 5.2.2 Version Vectors • Concurrent writes - no need for batched writes anymore, due to clientSeq.
  • 124. 5.2.2 Version Vectors • Concurrent writes - events can be written in an undetermined order; the order can be reconstructed on clients (figure: serial writes vs. concurrent writes)
  • 131. 6. Advantages • Shared source - minimal redundancy • Lightweight data structure - fast delivery • Minimal server logic (low CPU) • Short writes - high concurrency • Scalable / distributable • Offline support
  • 136. 7. Disadvantages • Server simplicity = client complexity • Rogue clients = stream pollution • Clients must read full stream • Partial sync difficult to implement