UNIVERSITY OF CATANIA
MASTER’S THESIS
A Distributed Algorithm for Stateless
Load Balancing
Author:
Andrea TINO
Supervisor:
Prof. Eng. Orazio
TOMARCHIO
Assistant Supervisor:
Eng. Antonino BLANCATO
A thesis submitted in fulfillment of the requirements
for the degree of Master of Engineering
in the
Faculty of Computer Science Engineering
Department of Electrical, Electronic and Computer Science Engineering
July 21, 2017
Master Thesis - A Distributed Algorithm for Stateless Load Balancing
Declaration of Authorship
I, Andrea TINO, declare that this thesis titled, “A Distributed Algorithm for Stateless
Load Balancing” and the work presented in it are my own. I confirm that:
• This work was done wholly or mainly while in candidature for a research de-
gree at this University.
• Where any part of this thesis has previously been submitted for a degree or
any other qualification at this University or any other institution, this has been
clearly stated.
• Where I have consulted the published work of others, this is always clearly
attributed.
• Where I have quoted from the work of others, the source is always given. With
the exception of such quotations, this thesis is entirely my own work.
• I have acknowledged all main sources of help.
• Where the thesis is based on work done by myself jointly with others, I have
made clear exactly what was done by others and what I have contributed my-
self.
Signed:
Date:
“I like thinking that this work of mine somehow reflects my international personality. The
idea of this algorithm crossed my mind while I was working in Japan (summer 2012)
on non-deterministic mathematical models describing fast similarity-search algorithms. It’s
incredible how some ideas come to life so spontaneously! I then started developing the
foundations of the algorithm in Italy. After joining Microsoft as an employee, I kept
working on this project in Denmark. Even on vacation, I found time to work on this
thesis while roaming several areas of South Korea. It is also worth mentioning that I
worked on some chapters while I was in the Netherlands.
When I think about this, I feel happy! ”
Andrea Tino
University of Catania
Abstract
Faculty of Computer Science Engineering
Department of Electrical, Electronic and Computer Science Engineering
Master of Engineering
A Distributed Algorithm for Stateless Load Balancing
by Andrea TINO
The algorithm that is the object of this thesis addresses the problem of balancing data
units across different stations in the context of storing large amounts of information in
data stores or data centres. The approaches in use today are mainly based on a central
balancing node, which often requires information from the different stations about
their load state.
The algorithm proposed here follows the opposite strategy: data is balanced without
the use of any centralized balancing unit, thus fulfilling the distributed property, and
without gathering any information from stations about their current load state, thus
fulfilling the stateless property.
This document goes through the details of the algorithm by describing the idea and
the mathematical principles behind it. By means of an analytical proof, the balancing
equation is devised and introduced. Later on, tests and simulations, carried out with
different environments and technologies, illustrate the effectiveness of the approach.
Results are introduced and discussed in the second part of this document, together
with final notes about the current state of the art, open challenges and deployment
considerations in real scenarios.
(IT) L’algoritmo oggetto della tesi tratta il problema del bilanciamento di unità
dati all’interno di un pool di diverse stazioni, contestualmente alla necessità di man-
tenere in persistenza grandi quantità di informazione all’interno di server-farm o
data-centre. Le strategie tuttora in utilizzo sono principalmente basate sull’impiego
di un componente centrale per il bilanciamento il quale, spesso, necessita di alcune
informazioni da parte dei nodi della rete circa il loro stato attuale di carico.
L’algoritmo proposto in questa sede procede verso un approccio diametralmente
opposto per cui il bilanciamento dati viene effettuato senza l’utilizzo di alcun com-
ponente centralizzato, da cui la proprietà distributed, e senza la necessità di ottenere
alcun dato dalle stazioni relativamente al loro stato di carico, da cui la proprietà
stateless.
In questo documento, procederemo nell’esaminare i dettagli dell’algoritmo tramite
una descrizione dell’idea di fondo e dei principi matematici alla sua base. Attraverso
l’impiego di una dimostrazione analitica, verrà dedotta e analizzata l’equazione di bi-
lanciamento. Successivamente, procederemo ad esaminare i test e le simulazioni, en-
trambi condotti tramite diverse tecnologie, a supporto dell’efficacia dell’algoritmo.
I risultati verranno esaminati e discussi nella seconda parte di questo documento,
assieme alle note finali riguardo lo stato corrente della tecnologia nel campo del
bilanciamento dati. Verranno esaminate, inoltre, le problematiche e gli scenari di
utilizzo dell’algoritmo.
Contents
Declaration of Authorship
Abstract
1 Introduction
  1.1 About Balancing
      Objective
  1.2 Describing the scenario
  1.3 Characterization of balancing algorithms
      1.3.1 Randomness
      1.3.2 State
      1.3.3 Static vs. dynamic
      1.3.4 Centralization
      1.3.5 DU retrieval
  1.4 Well known balancing algorithms
      1.4.1 Round Robin
      1.4.2 Weighted Round Robin
      1.4.3 Random
      1.4.4 Source Address Hash
      1.4.5 Least Load
      1.4.6 Graph based algorithms
            Nearest Neighbour
            RAND
            Never Queue
            THRESHOLD
  1.5 System overview
2 The algorithm
  2.1 Network organization
      Station ordering
      Ring access
      Routing
  2.2 Unbalanced ring
  2.3 Balancing the ring
      2.3.1 Extending the ring
            Adapting concepts in extended ring
            Defining sizing equations
      2.3.2 Designing hash function φ
            Designing r.v. sφ’s PDF
            Designing r.v. sφ
      2.3.3 Understanding how φ works
  2.4 Ring balancing example
      2.4.1 Defining the ring
      2.4.2 Defining the formatting impulse
      2.4.3 Binding impulses to stations
      2.4.4 Calculating amplitudes
      2.4.5 Computing functions
3 Simulation results
  3.1 Small-size simulations
      3.1.1 Verifying load balance
      3.1.2 Evaluating load levels per station
  3.2 Large-size simulations
      3.2.1 Overview
      3.2.2 Evaluating the variance of hash segment amplitudes
      3.2.3 Evaluating load levels per station
            Migration flows
4 System API
  4.1 Storing data
      4.1.1 Packet fragmentation
      4.1.2 Routing
  4.2 Retrieving data
5 Dynamic conditions
  5.1 Scalability
      5.1.1 Updating φ
            Broadcasting in DHT
      5.1.2 Load re-arrangement
      5.1.3 Scaling overall impact
      5.1.4 Ring scale-down
  5.2 Fault conditions
      5.2.1 Collisions threshold
6 Conclusions and final notes
  6.1 Open issues
  6.2 What’s next
A C/C++ simulation engine’s architecture
Acknowledgements
Bibliography
List of Abbreviations
BS Balancing System
DSLB Distributed Stateless Load Balancing
PA Proposed Algorithm
SS Storage System
BA Balancing Algorithm
BP Balancing Pool
DLB Data Load Balancing
DLBA Data Load Balancing Algorithm
DU Data Unit
LD Load Distribution
SL Station Load
CDF Cumulative Distribution Function
PDF Probability Density Function
DHT Distributed Hash Table
HS Hash Segment
ID IDentifier
LS Leaf Set
LLS Lower Leaf Set
P2P Peer To Peer
ULS Upper Leaf Set
API Application Program Interface
r.v. random variable
List of Symbols
N Number of stations
Ω Balancing pool (set)
P Data Units (packets) (set)
si Station
p Data Unit (packet)
ψ Packet/station assignment application
Σ Load distribution
l Hash length (number of bits)
ξ Hash function
h Hash string (number)
hξ Regular hash string (number)
hφ φ-hash string (number)
η Station packet load (number)
η(ξ) Station packet load (via ξ) (number)
η(φ) Station packet load (via φ) (number)
πi Packet in station probability (probability)
πi(ξ) Packet in station probability (via ξ) (probability)
πi(φ) Packet in station probability (via φ) (probability)
f PDF (function)
F CDF (function)
F−1 CDF Inverse (function)
g Formatting impulse (function)
G Formatting impulse antiderivative (function)
Λ Leaf set
ΛU Upper leaf set
ΛL Lower leaf set
dedicated to my Mother and my Father
Chapter 1
Introduction
1.1 About Balancing
Under the umbrella term balancing it is possible to refer to different problems and
solutions: balancing of connections, of workloads, of tasks or of data. What tells each
single type of balancing apart from the others is what is being balanced.
Definition 1 (Balancing). In Computing and Computer Science, balancing indicates the
problem of distributing an indefinitely high number of entities across multiple subjects (stations).
The assignment is performed so as to guarantee that, at any given time, all stations hold
(roughly) the same number of entities.
From which follows:
Definition 2 (Balancing Algorithm). An algorithm, or a system based on a certain algo-
rithm, designed to solve the balancing problem.
In the context of this research work, we are going to focus on a specific type of
balancing:
Definition 3 (Data Load Balancing). A type of balancing focusing on units of data often
referred to as packets or, simply, data units.
The solutions to the latter are referred to as:
Definition 4 (Data Load Balancing Algorithm). A BA targeting DLB in order to solve
it by minimizing a certain objective function.
This research effort focuses on DLB and DLBAs in order to introduce a new al-
gorithm targeting multiple performance metrics.
Objective
The objective of this thesis is to employ this algorithm in a cloud storage
system designed to serve different applications. The system must be capable of:
1. Accepting data as input which will be stored in a pool of servers (stations).
2. Retrieving stored data on demand.
3. Removing stored data on demand.
These essential functionalities must be enabled by the BA and its design.
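As a minimal sketch of these three capabilities, the storage system can be seen as exposing a store/retrieve/remove interface. The single in-memory map below is a hypothetical stand-in for the balanced pool of servers described later; it is not the thesis's design, only an illustration of the API surface.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy storage-system interface: one in-memory map stands in for the
// server pool, keyed by packet ID.
struct ToyStore {
    std::map<unsigned, std::string> pool;  // packet ID -> stored data

    void store(unsigned id, const std::string& du) { pool[id] = du; }  // (1)
    std::string retrieve(unsigned id) const { return pool.at(id); }    // (2)
    void remove(unsigned id) { pool.erase(id); }                       // (3)
};
```

In the real system, the balancing algorithm decides which server of the pool backs each key.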
1.2 Describing the scenario
Let us describe DLB and its most important aspects in formal terms.
The network A certain number N of stations is always considered; together
they form a balancing pool, a set which we will indicate as Ω. Each station si ∈ Ω
(with i = 1 . . . N) is connected to the others by a generic protocol; we do not assume
any specific communication technology. The only required assumption is that
the protocol supports direct addressing of each station (one unique address per sta-
tion).
Station assignment At any given time, a DU (or packet) p ∈ P (P being the set
of all DUs) must be stored in the BP. The system/algorithm responsible for carrying
out this activity is expressed by the application ψ : P → Ω, which assigns a DU
to a station.
The way ψ works is essentially the core of the balancing system. The application
will choose a station with the objective of guaranteeing that, at any given time, all
stations roughly have the same number of packets stored in them.
Remark. Application ψ typically receives a DU as input: ψ(p); however, it may accept
more arguments, ψ(p, ·), depending on the strategy it uses to perform the balancing.
Loads At any given time, each station si in the BP will have a certain number of
DUs assigned to it; we indicate this quantity, the station load, with the operator |si|, thus:
|si| = |{p ∈ P : ψ(p) = si}| ∈ N (1.1)
This quantity can sometimes be expressed as the number of bytes (or any of its
multiples) of the total packets stored in a station:
|si| = Σ_{p ∈ P : ψ(p)=si} |p| ∈ R (1.2)
where |p| indicates the length of a packet (typically in bytes). Unless otherwise speci-
fied, we will refer to the former definition.
To have an overview of the balancing state, another quantity is introduced: the load
distribution, indicated with the symbol Σ = (|s1|, |s2|, . . . , |sN |) and representing the
ordered vector of station loads at any given time.
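The bookkeeping above can be sketched in a few lines. Here psi is a hypothetical stand-in assignment application (a simple modulo over packet IDs, not the thesis's ψ), and loadDistribution computes Σ = (|s1|, . . . , |sN|) by counting the DUs that psi maps to each station.

```cpp
#include <cassert>
#include <vector>

// Stand-in assignment application: maps a packet ID to a station index.
int psi(int packetId, int N) { return packetId % N; }

// Computes the load distribution Sigma over N stations.
std::vector<int> loadDistribution(const std::vector<int>& packetIds, int N) {
    std::vector<int> sigma(N, 0);
    for (int p : packetIds)
        sigma[psi(p, N)]++;   // |s_i| = #{p in P : psi(p) = s_i}
    return sigma;
}
```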
Time The algorithm runtime does not require a continuous description. Therefore,
time will be considered discrete, t ∈ N, and characterized by events, an event being,
for instance, the arrival of a new DU to be routed in the network.
1.3 Characterization of balancing algorithms
DLBAs can be differentiated from several points of view. Considering this clas-
sification is important in order to locate the proposed algorithm within the taxonomy
of today’s most used systems.
1.3.1 Randomness
Algorithms can employ non-deterministic components like pseudo-random number
generators in order to pick the station to associate with a DU. This approach is
reasonable because, provided the generator has a uniform PDF, it
guarantees a fairly good level of balancing at low cost.
Proposition 1 (Determinism). The algorithm being proposed is fully deterministic.
1.3.2 State
In order to perform balancing, algorithms may require stations to keep information
regarding their load state (e.g. disk usage or residual available space). Statefulness
implies that stations communicate this state information to other stations or to special
nodes in the network; by employing this knowledge, the BA can perform a more
precise job. The downside is mostly the communication overhead, as state
information must be regularly exchanged.
Proposition 2 (Statefulness). The algorithm being proposed is stateless: it does not require
stations to send any information in order to perform balancing.
1.3.3 Static vs. dynamic
The balancing can happen at two possible points in time:
• At runtime The association to a station is performed while the DU is being
transmitted to stations. In these conditions, the same DU might be routed to
different stations depending on the contingent situation. It is also possible to
have DUs being re-routed. This behaviour makes the algorithm dynamic. As
a rule of thumb, a BA is dynamic when it is not possible to always know in
advance where a DU will be routed until the algorithm is actually run.
• Before runtime The destination station is known before the algorithm is run.
This makes the algorithm static. As a rule of thumb, a BA is static when the
same DU is always routed to the same station.
Proposition 3 (Staticity). The algorithm being proposed is static.
1.3.4 Centralization
This property determines whether the algorithm requires a central node in the net-
work used to perform the balancing. A centralized BA requires a unit, called bal-
ancer, which takes care of routing the DU to its destination station. Conversely, a
distributed BA will not need this extra component. Centralized BAs are easier to
implement, but they have 2 major downsides:
• All traffic must pass through the balancer which acts like a hub node.
• The balancer represents a single point of failure in the network: if it goes down,
the whole network is compromised. Safety mechanisms can be employed in
order to avoid network downtime by limiting the outage to the balancing fea-
ture only: if the balancer fails, traffic will still be routed to stations but no
balancing will occur.
This property also impacts the topology of the network. Typically, centralized algo-
rithms employ a star topology where the balancer is the central node.
Proposition 4 (Non-centralization). The algorithm being proposed is distributed.
4 Chapter 1. Introduction
1.3.5 DU retrieval
A very important characteristic of a BA is how a DU can be retrieved once
it has been stored in a station. Strictly speaking, this does not relate to the BA itself, as
DU retrieval is an aspect of the storage algorithm (which employs the BA);
however, the two systems are connected and will be treated as one.
An essential part of the data-retrieval story is DU and station identification. Since
a DU is assigned to a station, the association performed by si = ψ(p) must be iden-
tifiable. We have already introduced station identifiers, so we need to do the same
with DUs and introduce, for a packet p, its identifier p̂ ∈ N (the pair
(p, p̂) is unique).
A key concept to understand is that station association application ψ works both
with packets ψ : P → Ω and with packet IDs: ψ : N → Ω.
• If the algorithm is dynamic, then ψ is not a well-defined function of the packet
ID and will not always return the same station when invoked. In such a case, the
association (p̂, si) must be saved somewhere. This condition requires the algorithm
to employ a database functioning as a lookup table, which of course takes resources
and impacts memory.
• If the BA is static, then ψ is deterministic and it will always be possible to retrieve
the station where a packet p was routed by simply calculating ψ(p̂). This means
it is not necessary to store the coordinates of a packet in order to retrieve it.
When a DU is sent for storage, its ID is used by the owner as a key to retrieve it at a
later time.
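The static case can be made concrete with a toy example. Here psiStatic is a hypothetical, purely deterministic assignment application (a modulo over the packet ID, not the thesis's ψ): because it is a pure function of p̂, the destination station can be recomputed at retrieval time from the packet ID alone, with no lookup table.

```cpp
#include <cassert>

// Purely deterministic stand-in for a static assignment application:
// the same packet ID always yields the same station.
int psiStatic(unsigned packetId, int N) {
    return (int)(packetId % (unsigned)N);
}
```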
1.4 Well known balancing algorithms
The proposed algorithm competes with other algorithms available on the market
today and commonly used in many different application domains. We are going to
describe some of them in order to compare, later, how the proposed approach ranks
among them.
1.4.1 Round Robin
This class of algorithms keeps a counter c = 1 . . . N which points to the destination
station sc where the current DU will be routed to. The counter is incremented at
every new incoming packet: ct+1 = (ct + 1)%N. These BAs guarantee a very precise
balancing as fairness in packet association is their strongest point.
Such algorithms are deterministic, stateless, static and require a packet lookup
table. They are typically centralized and the balancer keeps track of the counter.
However this condition is not a limitation as it is still possible to use Round Robin
in a distributed way, though such an approach is not common in the market today.
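The counter rule above can be sketched as a tiny balancer; station indices here run from 0 to N−1 rather than 1 to N, which is an implementation convenience, not a change to the scheme.

```cpp
#include <cassert>

// Minimal Round Robin balancer: a counter c cycles over the stations;
// each incoming DU goes to station c, then c advances.
struct RoundRobin {
    int N, c = 0;
    explicit RoundRobin(int n) : N(n) {}
    int route() {               // destination station for the next DU
        int dest = c;
        c = (c + 1) % N;        // c_{t+1} = (c_t + 1) % N
        return dest;
    }
};
```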
1.4.2 Weighted Round Robin
Like Round Robin, these algorithms guarantee fairness by looping through stations.
However, the counter is used in a different way: every station si is assigned not
one number but a range of contiguous numbers ci . . . c̄i; the counter ranges
from 1 to C = maxi c̄i and is incremented according to the rule ct+1 = (ct + 1)%C.
The wider the interval c̄i − ci for a station si, the more packets that station will
receive. Although this may seem to break the balancing principle, this approach
makes it possible to take into account stations that do not have the same storage
capabilities. A possible application is assigning more DUs to stations with a larger
storage capacity.
These algorithms are typically centralized and deterministic, and they still require
a packet lookup table. Given their nature, they can be either static/stateless or dy-
namic/stateful; the latter implementation is valid when the characteristics of stations
can change over time, the state typically being related to storage capabilities.
1.4.3 Random
Random algorithms use a random number generator employing a discrete random
variable distributed in range 1 . . . N to choose the destination station si. Such algo-
rithms are non-deterministic, dynamic, stateless and always require a packet lookup
table. They can be either centralized or distributed, though the former are the most
common in the market.
One important aspect of these BAs is the requirement on the probability distri-
bution of the random variable employed by the number generator. The distribution
must be uniform in order to have proper balancing.
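A sketch of this strategy follows; the uniform distribution over the station indices is exactly the requirement stated above for proper balancing.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Random balancing: a uniform discrete random variable over the station
// indices picks the destination for each DU.
int randomRoute(std::mt19937& gen, int N) {
    std::uniform_int_distribution<int> dist(0, N - 1);  // uniform PDF
    return dist(gen);
}
```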
1.4.4 Source Address Hash
Commonly used in TCP/IP applications, this class of centralized algorithms em-
ploys a balancing unit which assigns a hash range hi . . . h̄i to each station si. Hashes
are evaluated numerically, so ranges are contiguous sequences of integer
numbers, hi, h̄i ∈ N. When a packet arrives, the balancer computes a hash h ∈ N
of the source address (the hashing function can be one of the well-known
cryptographic ones or some ad-hoc implementation) and routes the DU to the
station whose hash range includes the calculated hash.
Source Address Hash algorithms guarantee that packets coming from the same source
are routed to the same station. These algorithms are deterministic, static, stateless
and require a lookup table. The balancing relies on the hash function: a crypto-
graphic hash is necessary to guarantee that packets from sources whose addresses
differ by only a few bits do not all end up in the same station. Given their implementation,
such algorithms can offer a fairly good balancing.
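The scheme can be sketched as follows. std::hash stands in for the cryptographic hash the text recommends, and the N hash ranges are taken as an even split of the hash space, so the owning station reduces to h mod N; both choices are simplifications for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

// Source Address Hash: hash the source address and route to the station
// whose (even-split) hash range contains the value.
int hashRoute(const std::string& sourceAddr, int N) {
    uint64_t h = (uint64_t)std::hash<std::string>{}(sourceAddr);
    return (int)(h % (uint64_t)N);   // owner of the range containing h
}
```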
1.4.5 Least Load
These algorithms usually rely on a centralized balancer; however, it is still possible
to perform the balancing in a distributed way (though this implementation is not
common in the market). When a DU arrives, it is routed to the station with the
lowest current load. This means the balancer needs to know the load of each
station, which is why these algorithms are stateful and typically introduce
considerable overhead in the network. Least Load algorithms are deterministic,
dynamic and require a lookup table.
6 Chapter 1. Introduction
1.4.6 Graph based algorithms
There is a class of (typically) stateful, distributed and dynamic algorithms which
calculate the destination station for a DU based on a graph-search algorithm run on the
network. These algorithms are usually highly scalable, and no assumption is made
on the topology of the network (free topology).
Nearest Neighbour
A random node si is picked in the network in order to initiate the transmission of
a DU. The balancing is performed only within the subnet formed by
the chosen node and its direct neighbours sj (with j ≠ i). The algorithm picks the
station that minimizes a specific metric, usually the load state of each node.
RAND
This algorithm is non-deterministic and randomly selects a node (station) where to
route a packet p. A threshold L ∈ R is considered for a certain metric (usually the
load state). If the packet exceeds the threshold (|p| > L), the DU is re-routed to
another randomly selected station; otherwise it is stored in the current one.
The most prominent characteristic of this algorithm is its statelessness. Since
RAND does not require any information from stations, the implementation is very
easy, minimal overhead is generated (only threshold-exceeding packets need
re-transmission) and the balancing is fairly good.
CYCLIC This algorithm is a variant of RAND in which minimal state informa-
tion is kept: the last station a packet was re-transmitted to is always remembered by
the system; this guarantees that the same station is not picked twice in a row in
case of consecutive threshold-exceeded events.
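RAND's routing step can be sketched in a few lines; the station count, threshold and packet size below are illustrative values, not taken from the thesis.

```cpp
#include <cassert>
#include <random>

// RAND: the packet lands on a random station; if its size exceeds the
// threshold L there, it is re-routed once to another random station.
int randRoute(std::mt19937& gen, int N, double packetSize, double L) {
    std::uniform_int_distribution<int> dist(0, N - 1);
    int station = dist(gen);       // initial random selection
    if (packetSize > L)            // |p| > L: one stateless re-route
        station = dist(gen);
    return station;
}
```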
Never Queue
Some algorithms use state information exchanged among stations in order
to evaluate data not strictly related to nodes. Typically the state is represented by
station-specific quantities, like the current load or the residual amount of storage
memory left. Never Queue employs a different kind of state across the network, in order
to be able to evaluate, the moment a DU arrives and needs to be stored, the station
to which transmitting the packet implies the least cost. Thus, the algorithm always
transmits the packet to the fastest available node.
This balancing strategy (which requires a lookup table) is poor, as it does not guar-
antee a good quality of the overall balancing across stations. What it does guarantee,
and this is the reason we mention this approach, is good performance
in processing packets in terms of throughput and latency. However, this comes at a cost
in overhead due to the amount of information exchanged by nodes to refresh the
network state.
THRESHOLD
This class of highly dynamic algorithms uses network state information, derived
from message exchange among stations, to decide where a packet will be stored. An
incoming packet p is initially routed to a random node (this makes the algorithm
non-deterministic); that station then compares the packet size with a load threshold
L ∈ R and decides what to do. If the threshold is not exceeded (|p| < L), the
DU is stored in the current station; otherwise another station is picked via a polling
mechanism. A maximum number of attempts M ∈ N is considered, after which a
packet stays in the current station even though the threshold is exceeded.

FIGURE 1.1: Overall system architecture. The end user interacts only
with the storage system, while the balancing system is hidden from the
user and transparent to the storage system with regard to accessing
the server pool.

Even though this approach looks a lot like RAND, it differs in the way a station
is selected. Only the initial node selection is random; in case the threshold is exceeded,
the next station is picked through a process based on analysis of the network state.
LEAST This algorithm works like THRESHOLD but is limited to one single iter-
ation. When a packet arrives at a randomly selected node and the threshold is
exceeded, the algorithm polls a certain pool of stations to pick the next one in its
first and only attempt to route the packet. The station is usually picked based on its
current load (least-loaded node). After that, no further attempts are performed. Think
of LEAST as THRESHOLD with M = 1.
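The LEAST variant (THRESHOLD with M = 1) can be sketched as follows; the polled pool here is simply all stations, and the load values and threshold are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// LEAST: a random station is selected; if the packet size reaches the
// threshold L there, a single poll picks the least-loaded station.
int leastRoute(std::mt19937& gen, const std::vector<double>& loads,
               double packetSize, double L) {
    std::uniform_int_distribution<int> dist(0, (int)loads.size() - 1);
    int station = dist(gen);          // initial random selection
    if (packetSize >= L)              // threshold exceeded: one poll only
        station = (int)(std::min_element(loads.begin(), loads.end())
                        - loads.begin());
    return station;
}
```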
1.5 System overview
Before detailing how the PA works, it is important to understand the architecture
of the storage system being designed as part of the research in this thesis. From the
point of view of the API that the overall system exposes, we recognize two major
components:
• Storage system The component interacting with the end user and providing
the API for storing and retrieving data.
• Balancing system The component responsible for arranging the storage pool
and balancing data across its servers.
As shown in Figure 1.1, the architecture separates concerns by defining two
sets of APIs: one exposed to the user, for submitting and retrieving data, and an-
other one, hidden from the user, which is responsible for balancing DUs in the storage
pool.
Chapter 2
The algorithm
In chapter 1, we anticipated the most important properties of the PA.
What makes this algorithm innovative is that it is at the same time dis-
tributed, static and stateless, and it does not require a lookup table. In this chapter we
describe the mathematics behind the algorithm which makes all this
possible.
2.1 Network organization
The PA does not allow stations to be arranged in a free connection scheme. A strong
assumption is made on the topology of the network, which follows the DHT¹ scheme.
A few considerations are due about the protocol used by stations
to communicate with each other. No assumption is made on the
networking technology: any protocol can be employed in the network (several
protocols can actually coexist, as long as they are compatible with each other), with
one requirement:
Proposition 5 (Networking protocol). Stations are free to employ any arbitrary commu-
nication protocol, as long as it guarantees direct addressing and that a message can be
delivered between every pair of stations.
The real-world scenario considered here is the Internet with the TCP/IP protocol stack.
Given proposition 5, one station is capable of communicating with every other
one in the network; however, this rarely happens in practice because a node’s
address must be known (this is the reason direct addressing is essential). Networks
can be organized according to DHT specifications thanks to a limited knowledge of
other nodes’ addresses.
The direct consequence of proposition 5 is that a separation is made between sta-
tions’ physical and logical connection schemes. Physically, stations can be arranged
without any constraint; however, the limited knowledge of other nodes in the graph
allows the system to generate an overlay network, which is the one considered
here.
A station is supposed to have very limited knowledge of the other sta-
tions, keeping only a few of them in memory (its neighbours). According to
DHT specifications, a ring topology is employed: it derives from each node holding
a neighbour set of only two nodes, a predecessor and a successor, as shown in figure 2.1.
¹ Distributed Hash Tables are employed in distributed networks. This scheme was adopted by 4 major P2P systems: Chord, Pastry, CAN and Tapestry.
FIGURE 2.1: An N = 8 network example showing the logical ring topology. Each station is assigned an ID (typically the IP address hash) and packets are routed by content.
Station ordering
A natural order occurs among stations. When the ring is formed, every node si computes an identifier Id(si) ∈ ℕ, represented by the hash of the station's address. This identifier is used to build the ring, as every station needs to locate its predecessor and successor in the network:
Lemma 1 (Ring construction). As long as every station has a minimal initial set of connections which guarantees that all nodes form a connected graph, the ring can be built by having every station reshape its neighbourhood with one successor and one predecessor.

After this initialization phase, the ring is online and ready to accept packets.
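The initialization step described in lemma 1 can be sketched in a few lines: hash each station's address to obtain its ID, sort, and take adjacent entries as predecessor and successor. This is a minimal illustration, not the thesis' actual join protocol; the SHA-256 truncation, the address format and all names are assumptions.

```python
import hashlib

def station_id(address: str, bits: int = 16) -> int:
    """ID of a station: hash of its address truncated to `bits` bits."""
    digest = hashlib.sha256(address.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)

def build_ring(addresses):
    """Sort stations by ID; each one's neighbours are the adjacent entries,
    wrapping around so that the resulting topology is a ring."""
    ring = sorted(addresses, key=station_id)
    neighbours = {}
    for i, addr in enumerate(ring):
        predecessor = ring[i - 1]               # wraps at i = 0
        successor = ring[(i + 1) % len(ring)]   # wraps at the end
        neighbours[addr] = (predecessor, successor)
    return ring, neighbours
```

Calling build_ring on a handful of addresses yields the circular ordering of figure 2.1.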
This aspect is very important as it allows us to define the set of stations Ω as ordered
and we can define:
Definition 5 (Station preceding operator). Given a couple of stations (si, sj) ∈ Ω² (i ≠ j and i, j ≤ N), the operator ≺ : Ω × Ω → {true, false} defines the preceding relation between them. The operator works as follows:

si ≺ sj ⟺ Id(si) < Id(sj)   (2.1)
The first important result is the following:

Theorem 2 (Ring complete ordering). The set of stations Ω with precedence operator ≺ : Ω × Ω → {true, false} is a completely ordered set.

Proof. Immediate by considering that operator ≺ on Ω, because of its definition, directly maps onto operator < on ℕ, which is a fully ordered set.
Ring access
The ring is the place where data is stored, and the purpose of the PA is to help the storage system balance all DUs across stations. The first detail we focus on is how the ring is accessed when the SS needs to send a packet to be stored or retrieved. The ring has to be kept safe both from external nodes and from the very nodes that are part of the network. The way to guarantee the latter is through the following:
FIGURE 2.2: Access to the ring is guarded by proxies.
Proposition 6 (Limited knowledge principle). To guarantee safety and scalability, every
node in the system has a partial knowledge of the overall network.
In order to protect the ring from external activity, direct access to the network must be forbidden:

Proposition 7 (Zero knowledge principle). To guarantee safety from external intrusions, no node, except those in the ring, knows the address of any station in the system.

This principle, though valid, cannot be adopted as-is, because we would otherwise end up with an isolated network. However, in order to fulfil the security features promoted by proposition 7, it is possible to build a guarding system around the ring which hides it from the external world. A collection of proxy stations is employed for this purpose. As shown in figure 2.2, those stations are exposed to the external world and act as intermediaries to the ring, whose addresses are kept private (in order to fulfil proposition 6, proxy stations know the addresses of only a few stations in the ring).
Routing
The SS we are designing is distributed. The basic idea, according to the DHT specifications, is that a packet enters the ring from an arbitrary station, called the entry point, in the context of a transmission. From there, every station, which has the PA deployed, knows whether that packet should be stored there or should otherwise be routed to a different station.
Every station keeps a limited knowledge of the network. This knowledge is represented by the set of neighbour nodes one station keeps. Given the topology, we define a parameter called leaf radius, r ∈ ℕ (r < N), which represents the number of successor (or predecessor) nodes every station holds as its neighbourhood.
Definition 6 (Leaf Set). Every station si ∈ Ω keeps track of its neighbour nodes (plus itself) in an ordered set called leaf set: Λ(si) ⊂ Ω. The leaf set's cardinality is always |Λ(si)| = 2r + 1, where r is the leaf radius of the ring. The following equation holds:

Λ(si) = {sj ∈ Ω : ai,j = 1} ∪ {si}   (2.2)

Where A = [ai,j] ∈ ℕ^(N×N) is the adjacency matrix of the network.
Note how one station’s leaf set contains the station itself. Also, the leaf set is
always an ordered set as it is, by definition, a subset of Ω which we proved being
ordered in theorem 2. Unless otherwise specified, we will always consider r = 1.
Lastly, it is sometimes convenient to picture one station’s leaf set extensively as
the ordered vector of neighbour stations (including si):
Λ(si) = si, . . . , si−1, si, si+1, . . . , si
The notation above helps us detecting nodes si and si as the extreme nodes in
every stations’s leaf set. Those nodes will play an important role when defining
routing function ψ later on.
Definition 7 (Upper Leaf Set). Let si ∈ Ω be a station and Λ(si) ⊂ Ω its leaf set. The upper leaf set ΛU(si) ⊂ Λ(si) is defined as the set of all neighbours which the station precedes:

ΛU(si) = {sj ∈ Λ(si) : si ≺ sj}   (2.3)

This set always has cardinality |ΛU(si)| = r.
Definition 8 (Lower Leaf Set). Let si ∈ Ω be a station and Λ(si) ⊂ Ω its leaf set. The lower leaf set ΛL(si) ⊂ Λ(si) is defined as the set of all neighbours which the station is preceded by:

ΛL(si) = {sj ∈ Λ(si) : sj ≺ si}   (2.4)

This set always has cardinality |ΛL(si)| = r.
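Definitions 6 through 8 can be sketched as index arithmetic on the sorted ring. This is a hypothetical helper, not part of the thesis: it treats neighbourhood circularly, so at the wrap-around point ring adjacency rather than the ID-based precedence of definition 5 is used.

```python
def leaf_sets(ids, i, r=1):
    """Leaf set of station i, treating the ring circularly.

    Returns (full, lower, upper) as lists of station indices, matching
    |Λ| = 2r + 1 and |Λ_L| = |Λ_U| = r from definitions 6-8.
    """
    n = len(ids)
    lower = [(i - k) % n for k in range(r, 0, -1)]   # Λ_L(s_i): predecessors
    upper = [(i + k) % n for k in range(1, r + 1)]   # Λ_U(s_i): successors
    return lower + [i] + upper, lower, upper
```

With r = 1 on an 8-station ring, station 0's leaf set is (7, 0, 1), the extremes being its predecessor and successor.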
When a station receives a packet, it performs certain operations in order to understand whether that packet is to be stored there or elsewhere. In the latter case, the station will pick one of the stations in its leaf set and route the packet there. The next station will repeat the same sequence of operations until the packet is stored in a node. This algorithm is represented by the function ψ : P → Ω; to perform it, every station in the ring uses the same hash function:
Definition 9 (Hash function). Let P be the set of packets and H ⊆ ℕ; we define ξ : P → H as the hash function used to calculate the station where a packet is to be routed in the ring.
Not all hash functions can be used to route packets in the ring:

Proposition 8 (Cryptographic hash function). Hash function ξ : P → H is a cryptographic hash function. By definition, ξ behaves in such a way that a one-bit change in the input packet causes, on average, about 50% of the bits of the output hash to change.
It is possible to consider many cryptographic hash functions out of those currently employed in modern systems. Among the most common today we have the following families: SHA² and MD³.
As anticipated earlier, stations self-organize into a logical overlay ring by assigning IDs. One station's ID Id(si) is computed by using the same hash function ξ on the station's address. For formal consistency, we intend hash function ξ to also work on stations, ξ : Ω → H, which is perfectly valid since hash functions do not really care about the type of data fed as input, as long as it is a bitstream. The expression hi = Id(si) = ξ(si) is to be intended as hash function ξ calculated on station si's address.

² Secure Hash Algorithm. SHA-1 (160 bits), SHA-256 (256 bits), SHA-512 (512 bits).
³ Message Digest. MD2 and MD4 (128 bits), today considered unsafe; MD5 (128 bits) and MD6 (up to 512 bits).
As soon as the ring is initialized and ready to work, the topology will define the ordering of stations, s1 ≺ s2 ≺ · · · ≺ si ≺ · · · ≺ sN−1 ≺ sN, following the order of IDs (hashes): h1 < h2 < · · · < hi < · · · < hN−1 < hN. From here, the DHT assigns to each station a hash segment:
Definition 10 (Hash Segment). Let si ∈ Ω be a station in the ring and hi = Id(si) = ξ(si) its ID. Station si's hash segment Ξ(si) is defined as the set of contiguous hashes ranging from hi up to hi+1 (excluded):

Ξ(si) = {h ∈ H : hi ≤ h < hi+1}                          if i ≠ N
Ξ(si) = {h ∈ H : hN ≤ h ≤ hM} ∪ {h ∈ H : 0 ≤ h < h1}     if i = N     (2.5)

Where hM ∈ H is the highest value that hash function ξ can produce: ξ(·) ∈ [0, hM].
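Equation 2.5, including the wrap-around segment of the last station, can be sketched as follows (a hypothetical helper; half-open integer ranges are an implementation assumption):

```python
def hash_segments(ids, h_max):
    """Hash segment of every station, as in equation 2.5.

    ids   : sorted station IDs h_1 < ... < h_N
    h_max : highest hash value hM that the hash function can produce
    Returns, per station, a list of half-open (start, end) ranges; the last
    station owns two ranges because its segment wraps around zero.
    """
    segments = []
    for i, h in enumerate(ids):
        if i < len(ids) - 1:
            segments.append([(h, ids[i + 1])])               # [h_i, h_{i+1})
        else:
            segments.append([(h, h_max + 1), (0, ids[0])])   # wrap-around
    return segments

def owner(h, ids, h_max):
    """Index of the station whose hash segment contains hash h."""
    for i, ranges in enumerate(hash_segments(ids, h_max)):
        if any(lo <= h < hi for lo, hi in ranges):
            return i
    raise ValueError("hash out of range")
```

On an 8-bit ring with IDs (10, 40, 200), hash 5 falls in the wrap-around part of the last station's segment.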
Routing function ψ employs hash function ξ in order to compute the destination station for a packet. The algorithm is deployed on every station and always behaves the same:

Algorithm 1 Routing a packet in the ring
Require: Ring initialized
Require: Station si has ID hi = ξ(si)
Require: Station si has associated hash segment Ξ(si)
Require: Station si has associated leaf set Λ(si)
1: function ψ(p ∈ P)
2:     hp ← ξ(p)
3:     Λ ← ∅
4:     if hp ≥ hi then
5:         Λ ← (ΛU(si) \ {s_i^+}) ∪ {si}
6:     else
7:         Λ ← ΛL(si)
8:     end if
9:     for s ∈ Λ do
10:        if h_s ≤ hp ≤ h_s^M then          ▷ Since Ξ(s) = [h_s, h_s^M] ⊂ H
11:            return s
12:        end if
13:    end for
14:    if hp ≥ hi then                       ▷ Having that Λ(si) = (s_i^-, …, si, …, s_i^+)
15:        return s_i^+                      ▷ The packet either belongs to s_i^+ or further
16:    else
17:        return s_i^-                      ▷ The packet belongs to a station preceding si
18:    end if
19: end function
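A simplified rendition of algorithm 1 for leaf radius r = 1 may look as follows. It is a sketch, not the thesis' implementation: each station first checks its own segment and those of its two neighbours, then forwards the packet towards its hash, mirroring the fallback returns of lines 14-18; all names are invented.

```python
def seg_contains(ids, h_max, j, h):
    """True when hash h falls in station j's hash segment (equation 2.5)."""
    n = len(ids)
    if j < n - 1:
        return ids[j] <= h < ids[j + 1]
    return ids[j] <= h <= h_max or 0 <= h < ids[0]   # last segment wraps

def psi_step(ids, h_max, i, hp):
    """One application of ψ at station i (r = 1): the station the packet
    moves to next, or i itself when the packet stays here."""
    n = len(ids)
    for j in (i, (i + 1) % n, (i - 1) % n):   # self, successor, predecessor
        if seg_contains(ids, h_max, j, hp):
            return j
    # hp is in no segment known to station i: hop towards it and retry.
    return (i + 1) % n if hp >= ids[i] else (i - 1) % n

def route(ids, h_max, entry, hp):
    """Iterate ψ, as in equation 2.6, until the destination is reached."""
    i = entry
    while not seg_contains(ids, h_max, i, hp):
        i = psi_step(ids, h_max, i, hp)
    return i
```

Each hop uses only local knowledge (the station's own segment plus its neighbours' IDs), yet the iteration always terminates, as theorem 4 will show.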
We can now provide a better formal description of a ring by introducing its definition:

Definition 11 (Ring). Let Ω be the set of stations, r ∈ ℕ the leaf radius, ξ : · → H ⊆ ℕ the hash function used by each station, and ψ : P → Ω the routing function, based on ξ, used to assign packets to stations. Then we define R = (Ω, r, ξ, ψ) as a fully qualified ring overlay across N = |Ω| stations si ∈ Ω, where packets p ∈ P are routed and delivered to each station via routing function ψ employing hash function ξ.
Given algorithm 1, we have the following result:

Lemma 3. Let R = (Ω, r, ξ, ψ) be a ring; then the following holds:

ψ(p) = si ⟺ ξ(p) ∈ Ξ(si), ∀p ∈ P

Proof. Immediate by considering the first exit point of algorithm 1.
Regarding ψ, we want to describe a few more important aspects:
Definition 12 (Routing). Function ψ will be repeatedly executed for a certain number of
iterations from the moment packet p enters the ring until it finds its destination station. The
routing is over when ψ returns the same station where it is evaluated.
It is now evident that one single application of function ψ does not effectively route the packet to the correct station. It is necessary to perform a certain number of iterations and apply ψ to the same packet at different stations. This scheme generates a recursive condition which we want to make more evident. Let us denote with ψk ∈ Ω the station returned by the k-th application of ψ; the recursive definition is completed by setting the initial condition:

ψ_{k+1} = ψ(ψ_k),   ψ_1 = si   (2.6)
As for every recursive function, we ask ourselves whether the recursive definition in equation 2.6 converges to a value. As per definition 12, we expect function ψ to assume a value at a certain iteration b ∈ ℕ and keep it at every future iteration b + k, k ∈ ℕ. For this reason, it is imperative that the cyclic application of ψ does not lead to an infinite sequence of iterations, which would make the recursive definition generate an alternating sequence.
Theorem 4 (Routing is always successful). Let R = (Ω, r, ξ, ψ) be a ring and p ∈ P a packet entering it from station si ∈ Ω. Let b ∈ ℕ be the number of different applications of routing function ψ, across the different stations of the ring, before p finally reaches its destination. Then b is always bounded: b ≤ B ∈ ℕ.
Proof. We prove this by contradiction, thus assuming ∃p ∈ P : b → ∞.
By analyzing algorithm 1, in order to have an infinite number of iterations, function ψ, when evaluated on station si ∈ Ω, must never return the current station: ψ(p) ≠ si. For such a condition to hold, the following must occur:

∃p ∈ P : hp = ξ(p) ∉ Ξ(si), ∀si ∈ Ω

which translates into:

∃p ∈ P : hp = ξ(p) ∉ [hi, h_i^M], ∀si ∈ Ω

Since we do not know whether si is the last station in the ring, we use h_i^M to indicate the final hash in station si's hash segment.
However, since Ξ(si) = [hi, h_i^M] in the equations above depends only on si, and since those equations hold for all stations, we can consider the totality of the hash segments:

⋃_{s∈Ω} Ξ(s) = ⋃_{k=1}^{N} [hk, h_k^M] = [0, hM]
So we can rewrite the previous equations as:

∃p ∈ P : hp = ξ(p) ∉ [0, hM]

Which is contradictory, as hash function ξ is, by definition, limited in range, ξ(·) ∈ [0, hM]; and since hp is calculated via hash function ξ, it must fall in that range.
Theorem 4 proves that the recursive definition introduced before converges:

Corollary 4.1 (Recursive application of ψ converges). Let R = (Ω, r, ξ, ψ) be a ring and p ∈ P a packet entering it from station si ∈ Ω. Then the recursive term ψk, during the routing of the packet, converges to station sj ∈ Ω after b ∈ ℕ iterations:

lim_{k→∞} ψk = ψb = sj

In order to avoid confusion between the final computed destination station and the intermediate hopping stations calculated by the several iterations of the routing function, from now on we will indicate with the expression si = ψ(p) the station where packet p is stored at the end of the routing process. That is, we consider ψ as returning ψb = si (last iteration) unless otherwise specified.
At this point, we have completed describing and formalizing the storage system.
2.2 Unbalanced ring
We now move forward by analyzing what problems this structure presents in terms of balancing. Even though only the storage system has been covered so far, it is important to point out that, as it is now, the architecture already enables a primitive form of balancing. Packets are, in fact, distributed across different stations, and modern P2P networks are entirely based on this scheme. What kind of balancing do we end up with?
Since routing algorithm ψ employs hash function ξ, the balancing state of the ring depends entirely on ξ. Under static conditions (the ring does not change), the routing of packet p is determined the moment hp = ξ(p) is computed. Routing function ψ is executed many times because the knowledge each station has of the ring is limited, but the number of iterations does not affect the destination station being computed.
The question we want to answer is: “Given hash h ∈ H, what is the probability that, for a random input packet p ∈ P, hash function ξ computed on p returns h: Pr{ξ(p) = h}?”. Since ξ is a cryptographic hash function, it has an interesting property: the probability distribution of the generated hashes is approximately uniform, meaning that:

∀h ∈ H, ∀p ∈ P, Pr{ξ(p) = h} = 1 / (hM + 1)   (2.7)

Since ξ : · → [0, hM]. If we indicate with l ∈ ℕ the number of bits of the hash (hash length), l = ⌈log2 hM⌉, then ξ : · → [0, 2^l − 1] and we can write:

∀h ∈ H, ∀p ∈ P, Pr{ξ(p) = h} = 2^(−l)   (2.8)
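Equation 2.8 can be checked empirically: truncating a cryptographic hash such as SHA-256 to l bits and bucketing many hashes yields counts close to m · 2^(−l) per hash value. A small sketch (the packet naming scheme is arbitrary):

```python
import hashlib
from collections import Counter

def xi(packet: bytes, l: int = 8) -> int:
    """Hash function ξ truncated to l bits, so ξ(·) ∈ [0, 2^l − 1]."""
    return int.from_bytes(hashlib.sha256(packet).digest(), "big") % (1 << l)

# Hash many distinct packets and count how often each hash value shows up.
m, l = 100_000, 8
counts = Counter(xi(f"packet-{k}".encode(), l) for k in range(m))
expected = m / (1 << l)   # m * 2^-l, as per equation 2.8
# Every one of the 2^l hash values appears roughly `expected` times,
# within sampling noise: the distribution is approximately uniform.
```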
What we want to focus on is knowing, in the long run, how many packets each station gets with this configuration. From this knowledge, we will then analytically calculate the information we need about the balancing.
The problem of Packets in Stations is very close (though not entirely equivalent) to another well-known one which we will take into consideration: Balls in Bins⁴. We will see that the scenario of throwing a ball into an area full of bins and assessing in which bin the ball falls is equivalent to producing a random packet, calculating its hash and checking which station it is going to be routed to.
Our analysis starts by identifying the probability for a packet to be routed to one station of the ring:

Theorem 5 (Packet-in-station probability). Let R = (Ω, r, ξ, ψ) be a ring. Then the probability that a packet p ∈ P is routed to station si ∈ Ω is:

π_i^(ξ) = Pr{ψ_ξ(p) = si} = |Ξ(si)| / (1 + hM)

Where |Ξ(si)| is station si's hash segment's length (hash coverage).

Proof. Since each station si owns a specific hash segment Ξ(si) = [hi, h_i^M], the proof is immediate by considering that every hash h ∈ H ≡ [0, hM] has the same probability of being selected, as per equation 2.8. So we just need to multiply that probability by the length of the segment.
We use the expressions π_i^(ξ) and ψ_ξ(·) to indicate, respectively, the probability for a packet to fall into a certain station and the routing function ψ, both when hashing function ξ is employed in the ring. The reason why we want to make the hash function explicit is that, later, we are going to evaluate the same hash-based quantities with a different hashing function and compare results.
Given one station si, the length of its hash segment |Ξ(si)| is an important quantity. We can efficiently formalize its value by using the expression |Ξ(si)| = δ_{i,i+1}, and by defining the following quantity:

Definition 13 (Segment length calculator). Let Ω be the set of stations and let every station si ∈ Ω have an assigned hash segment [hi, h_i^M] ⊂ ℕ such that the hash partitioning is circular, thus last station sN has hash segment [hN, hM] ∪ [0, h1 − 1]. We define the segment length calculator function as the application returning the number of hashes in the segment assigned to one station:

δ_{i,j} = 1_{i,j} · 2^l − (h_{i mod (N+1)} − h_{j mod (N+1)})

Having ∀i, j ∈ ℕ ∧ i, j > 0 and:

1_{i,j} = 1 if i > j, 0 otherwise

Then we can express the packet-in-station probability in theorem 5 as follows:

π_i^(ξ) = Pr{ψ_ξ(p) = si} = 2^(−l) · δ_{i,i+1}   (2.9)
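Definition 13 and equation 2.9 reduce to a few lines of arithmetic. The sketch below uses 1-based station indices and a tiny 8-bit ring as a worked example (the ring values are invented):

```python
def delta(ids, i, l):
    """δ_{i,i+1}: length of station s_i's hash segment (1-based index i).

    For i < N this is h_{i+1} - h_i; the last station's segment wraps
    around zero, so its length is 2^l - h_N + h_1.
    """
    n = len(ids)
    if i < n:
        return ids[i] - ids[i - 1]           # h_{i+1} - h_i
    return (1 << l) - ids[n - 1] + ids[0]    # 2^l - h_N + h_1

def pi_xi(ids, i, l):
    """Packet-in-station probability of equation 2.9: 2^-l * delta_{i,i+1}."""
    return delta(ids, i, l) / (1 << l)
```

On an 8-bit ring with IDs (10, 40, 200), the segment lengths are 30, 160 and 66; they cover the whole hash space, so the probabilities sum to 1.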
Theorem 6 (Packets-in-station probability). Let R = (Ω, r, ξ, ψ) be a ring. Let μ_i = |si| ∈ {0, …, m} be a r.v. counting the number of packets in station si, where m ∈ ℕ represents the number of total packets sent so far to the ring. Then the probability that station si ∈ Ω holds k ∈ ℕ packets is:

Pr{μ_i^(ξ) = k} = C(m, k) · (π_i^(ξ))^k · (1 − π_i^(ξ))^(m−k)

Where C(m, k) = m!/(k!(m−k)!) is the binomial coefficient.

Proof. Immediate. We consider m packets routed into the ring and we want to calculate the probability that k among them were routed to station si. This calls for Bernoulli trials. The probability for a packet to end up in one station is given by theorem 5.

⁴ The problem describes a non-deterministic scenario where balls are thrown onto an area full of bins in a random direction, as described in (Kolchin, 1998).
Thanks to theorem 6, we know the PDF of r.v. μ_i^(ξ) and we are able to calculate how many packets, on average, one station gets:

η_i^(ξ)(m) = E[μ_i^(ξ)] = Σ_{k=0}^{m} k · Pr{μ_i^(ξ) = k} = Σ_{k=0}^{m} k · C(m, k) · (π_i^(ξ))^k · (1 − π_i^(ξ))^(m−k)   (2.10)
As described in (Kolchin, 1998), in the Balls in Bins problem the following holds:

Σ_{k=0}^{m} k · C(m, k) · p^k · (1 − p)^(m−k) = mp, ∀m ∈ ℕ, m > 0, p ∈ [0, 1] ⊆ ℝ

Which allows us to calculate the average load per station in a simpler form:

η_i^(ξ) = m · 2^(−l) · δ_{i,i+1}   (2.11)
Proposition 9 (Unbalanced ring). Let R = (Ω, r, ξ, ψ) be a ring. Given equation 2.11, the network is not balanced. The wider one station's hash segment is, the more packets that station gets:

|Ξ(si)| > |Ξ(sj)| ⟺ η_i^(ξ) > η_j^(ξ), ∀si, sj ∈ Ω

Proposition 9 is very important as it states that the system architecture, so far, achieves a very high level of decentralization, yet fails at balancing stations.
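Proposition 9 can be observed numerically: hashing many packets into a ring with deliberately uneven segments yields loads proportional to the segment widths, matching equation 2.11. A simulation sketch (ring IDs, packet names and counts are invented):

```python
import hashlib

def xi(data: bytes, l: int) -> int:
    """Hash function ξ truncated to l bits."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (1 << l)

# A small ring with deliberately uneven hash segments (l = 8 bits).
ids, l, m = [10, 40, 200], 8, 30_000

def owner(h):
    """Index of the station owning hash h; the last segment wraps past 0."""
    for i in range(len(ids) - 1):
        if ids[i] <= h < ids[i + 1]:
            return i
    return len(ids) - 1

loads = [0] * len(ids)
for k in range(m):
    loads[owner(xi(f"packet-{k}".encode(), l))] += 1

# Expected loads per equation 2.11: m * 2^-l * delta_{i,i+1}
deltas = [30, 160, 66]
expected = [m * d / (1 << l) for d in deltas]
# The widest segment (delta = 160) hoards most of the packets.
```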
2.3 Balancing the ring
Equation 2.11 is our starting point for taking the current architecture and modifying it in order to reach load balancing. Our ideal model is one in which each station gets the same number of packets:

Definition 14 (Ideal station load). Given a network of N ∈ ℕ different stations, we define the following as the ideal load per station:

ηi = m / N, ∀si ∈ Ω

Where m ∈ ℕ is the total number of packets sent to the network.
Definition 14 points out that our final goal is having our architecture move towards the Balls in Bins model. Our goal can also be expressed by considering the probability that each single bin has of getting a ball:

πi = N^(−1), ∀si ∈ Ω   (2.12)

If we can have equation 2.9 converge to equation 2.12, our goal is reached, since both definition 14 and equation 2.12 describe a uniformly distributed r.v.
2.3.1 Extending the ring
Moving forward towards our goal, as per definition 14, we need to understand why the ring is not balanced. We can identify 2 possible causes:
1. Hash function ξ does not take into account the fact that stations have hash segments of different lengths. It actually assumes that all stations have hash segments of the same size.
2. Stations should have hash segments of the same size.
The 2 problems described above are actually 2 possible explanations of the same issue: with regards to balancing, the network structure and the hash function are not well coupled. For our solution, we choose to accept the standpoint offered by point 1, which blames the hash function rather than the stations.
Our approach is to replace hash function ξ with another one:

Definition 15 (Hash function φ). Let φ : P → [0, φM] ⊂ ℝ be a hash function. By employing function φ, a ring can achieve balancing and each station approximately receives the same number of packets:

η_i^(φ) ≈ ηi = m / N

Where m ∈ ℕ is the total number of packets sent to the network. Also, we still define l as the length (in bits) of the hashes generated by φ: l = ⌈log2 φM⌉.
Definition 15 represents a goal for us. In the next section we are going to design φ so that load balancing in the ring is achieved.
The first thing we notice about φ is that we have designed it to return real numbers, thus the hash space is no longer discrete, but continuous. We will see that the continuous characterization of φ will not be a problem when employed in the ring and in routing algorithm ψ. Furthermore, in our model, we will consider function φ to be used on packets only; we will not use this new hash function to compute the IDs of stations: for them, we will keep using hash function ξ.
Definition 16 (Extended ring). Let Ω be the set of stations and r ∈ N be the leaf radius.
Let ξ : · → H ⊆ N be the underlying hash function and φ : P → [0, φM] ⊂ R be the
balancing hash function used by each station and based on ξ. Let ψ : P → Ω be the routing
function, based on φ, used to assign packets to stations. Then we define R = (Ω, r, ξ, φ, ψ)
as the extended ring overlay where packets are balanced via hash function φ.
Given definition 16, we see that φ does not act as a replacement of ξ, so we will
actually consider the former as an extension of the latter.
Adapting concepts in the extended ring
With hash function φ in place, routing algorithm ψ needs to be slightly changed. Actually, since the whole ring structure is based on hashes, we need to adjust a few definitions so that the extended ring (Ω, r, ξ, φ, ψ) can be properly described.
The most important concept to introduce in the extended architecture is that φ acts transparently with regards to the hash space. Hash function ξ returns hashes in the discrete space H ≡ [0, hM] ⊂ ℕ, while φ generates hashes in the continuous space Φ ≡ [0, φM] ⊂ ℝ. We design φ such that φM = hM; thanks to this, one space contains the other but they are bound by the same extremes: H ⊂ Φ.
This also means that hash segments can be expressed both as enumerable sets and as real intervals. Whether the former or the latter is intended will be specified via set identities or inferred from context.
The next concept to adapt is stations and their IDs. Nothing changes in this regard: every station will keep using hash function ξ to calculate the hash of its address hi ∈ H; however, the important point here is understanding that the hash identifying one station is also contained in the space of φ-hashes: hi ∈ Φ.
The last aspect to cover is routing function ψ. Given the assumptions above, algorithm 1 remains essentially unchanged. What changes is line 2 where, instead of using hash function ξ to hash the packet, hash function φ is used: hp ← φ(p). All other operations remain unchanged because Φ extends H. The direct consequence of this last point is the following:
Lemma 7. Let R = (Ω, r, ξ, φ, ψ) be an extended ring; then the following holds:

ψ(p) = si ⟺ φ(p) ∈ Ξ(si), ∀p ∈ P

Proof. Immediate by considering lemma 3 and the fact that function φ replaces ξ in algorithm 1 at line 2.
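Lemma 7 highlights that only line 2 of algorithm 1 changes. One way to picture this is a routing function parameterized by the hash function, so the normal and extended rings share all the remaining logic (a sketch with invented names; the real-valued φ-hashes compare against integer segment bounds exactly because Φ extends H):

```python
def make_router(hash_fn, ids, h_max):
    """Build a routing function around a pluggable hash function:
    algorithm 1 is unchanged except for line 2, which becomes hp <- phi(p)."""
    n = len(ids)

    def seg_contains(j, h):
        # Station j's hash segment; the last one wraps around zero.
        if j < n - 1:
            return ids[j] <= h < ids[j + 1]
        return ids[j] <= h <= h_max or 0 <= h < ids[0]

    def route(packet):
        hp = hash_fn(packet)   # the only line that differs between rings
        return next(j for j in range(n) if seg_contains(j, hp))
    return route

# route_xi  = make_router(xi, ids, h_max)    (normal ring, integer hashes)
# route_phi = make_router(phi, ids, h_max)   (extended ring, real hashes)
```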
Defining sizing equations
Equation 2.12 describes the PDF of r.v. s ∈ Ω, which represents the bin (station, in an ideally balanced ring) where a ball (packet) falls. As that equation prescribes, if we want the ring balanced, we need to make sure that all stations get the same probability of receiving a packet, which is not the case for a normal ring (Ω, r, ξ, ψ), as per equation 2.9.
So, by employing hash function φ, r.v. sφ ∈ Ω can be defined as the station where a packet falls, assuming the extended ring (Ω, r, ξ, φ, ψ) is in place.
R.v. sφ's PDF is the starting point from which we can commence our sizing effort. Since sφ is continuous, and given lemma 7, the probability that a packet is routed to station si in the extended ring is:

π_i^(φ) = Pr{hi ≤ φ(p) ≤ h_i^M} = ∫_{Ξ(si)} fφ(r) dr   (2.13)
Where fφ : [0, hM] ⊂ ℝ → ℝ is r.v. hφ's PDF, representing the generated φ-hashes, and Ξ(si) ⊆ Φ is station si's continuous hash segment. It is important to notice how this equation relates r.v. sφ ∈ Ω (since its PDF has expression Σ_{k=1}^{N} π_k^(φ) · δ(r − k)⁵) to r.v. hφ ∈ Φ.
Recalling equation 2.12, we basically want π_i^(φ) = πi:

∫_{hi}^{h_i^M} fφ(r) dr = N^(−1), ∀si ∈ Ω   (2.14)
We start from equation 2.14. Our purpose is to design hash function φ's implementation so that this equation holds. This approach will guarantee that r.v. sφ behaves, in distribution, like r.v. s in the Balls in Bins scenario.
2.3.2 Designing hash function φ
Equation 2.14 represents a constraint on fφ. This expression points out an important relationship:

Proposition 10 (Relationship between r.v. sφ and hφ). Given equation 2.14, the effort of designing hash function φ is transferred to r.v. hφ, as its PDF fφ is the subject of such design.
Designing r.v. sφ’s PDF
Of course, we cannot extract function fφ from the integral sign in equation 2.14, so we need to make some assumptions about it.

Definition 17 (Formatting impulse). Let g : ℝ → ℝ be a continuous, domain- and value-bounded function with the following constraints:
1. g(r) ≥ 0, ∀r ∈ ℝ.
2. g is a compact-support⁶ function: ∃r1, r2 ∈ ℝ, r1 < r2 : g(r) = 0, ∀r ∉ [r1, r2].
3. ∃A ∈ ℝ, A > 0 : g(r) ≤ A, ∀r ∈ ℝ.
4. ∫_{−∞}^{+∞} g(r) dr ≤ 1.
5. It is possible to calculate g's antiderivative: ∃G : G′(r) = g(r).

We define g as fφ's formatting impulse: a function used to shape r.v. hφ's PDF and solve equation 2.14. Because of its definition, we use the expression g_{r1,r2,A}(r) to refer to an impulse with amplitude A and definition interval [r1, r2] ⊂ ℝ.
We now want to show an important result concerning definition 17, which will be useful later in this chapter:

Lemma 8 (Impulse antiderivative is invertible). Let g : ℝ → ℝ be a formatting impulse. Then its antiderivative G : ℝ → ℝ is invertible: ∃G^(−1).
Proof. The inverse function theorem⁷ states that a continuously differentiable univariate function with nonzero derivative in a certain interval is therein invertible. In our case, G is the antiderivative of a continuous function, thus it is continuous itself

⁵ Function δ is intended as a generalized function, or distribution: ⟨δ, ϕ⟩ = ϕ(0).
⁶ Compact-support functions are used in distributional calculus.
⁷ As described in (Nijenhuis, 1974), the theorem provides a sufficient condition for a function to be invertible.
FIGURE 2.3: Hash-partitioning of a ring into different segments, one per each station. For each segment, a different impulse is used; its coverage matches the segment's length.
and differentiable by definition of primitive function. Thus we meet the conditions
of invertibility.
The nonzero derivative condition is not met by g’s definition. However this does
not undermine its invertibility: rather, it does not guarantee that the inverse function
is also continuously differentiable.
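Lemma 8 matters operationally: even without a closed form for G^(−1), a monotone antiderivative can be inverted numerically, e.g. by bisection. The sketch below is not part of the thesis' formal development; the rectangular impulse used as the worked example is an assumption.

```python
def invert(G, y, lo, hi, tol=1e-9):
    """Solve G(r) = y by bisection on [lo, hi]; any antiderivative of a
    non-negative impulse g is non-decreasing, so bisection applies."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if G(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Worked example: a rectangular impulse g on [2, 6] with amplitude 0.25;
# its antiderivative is G(r) = 0.25 * clamp(r - 2, 0, 4).
G = lambda r: 0.25 * min(max(r - 2.0, 0.0), 4.0)
r_star = invert(G, 0.5, 2.0, 6.0)   # G(4) = 0.5, so r_star is close to 4
```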
Function fφ’s domain [0, hM] ⊂ R can be partitioned into N different segments:
one hash segment Ξ(si) ≡ [hi, hM
i ] per each station si.
The basic idea is having function fφ employ impulse g to cover the different seg-
ments in the whole hash space [0, hM] ⊂ R as shown in figure 2.3. So, for each hash
segment [hi, hM
i ] ⊂ R, impulse ghi,hM
i ,Ai
is considered and is employed to calculate
fφ’s values falling into that specific segment. For the last segment relative to sN ,
since it crosses the max hash value hM, we need to actually use 2 different impulses:
ghN ,hM,AN,1
and g0,h1,AN,2
. Amplitudes A1, A2, . . . , AN,1, AN,2 are sized quantities and
their values will be calculated later in this chapter.
Definition 18 (Function fφ’s structure). Let R = (Ω, r, ξ, φ, ψ) be an extended ring, let
hφ ∈ [0, hM] ⊂ R be the r.v. representing a φ-hash, then its PDF is formally defined as:
fφ(r) =
N−1
k=1
ghk,hM
k ,Ak
(r) + ghN ,hM,AN,1
(r) + g0,h1,AN,2
(r)
It is important to notice that fφ is not a regular unconstrained function; it is a PDF, thus it must meet certain requirements. Later on, we will verify that those requirements are actually in place. We can now take equation 2.14 and replace fφ with its definition:

∫_{hi}^{h_i^M} fφ(r) dr = ∫_{hi}^{h_i^M} g_{hi,h_i^M,Ai}(r) dr = N^(−1), ∀i = 1 … N − 1   (2.15)
In case the last station sN is considered, then 2.14 becomes:

∫_{Ξ(sN)} fφ(r) dr = ∫_{Ξ(sN)} [g_{hN,hM,A_{N,1}}(r) + g_{0,h1,A_{N,2}}(r)] dr
= ∫_{hN}^{hM} g_{hN,hM,A_{N,1}}(r) dr + ∫_{0}^{h1} g_{0,h1,A_{N,2}}(r) dr = N^(−1)   (2.16)
Since g has an antiderivative as per definition 17, we can proceed further in both equations:

[G_{hi,h_i^M,Ai}(r)]_{hi}^{h_i^M} = N^(−1), ∀i = 1 … N − 1   (2.17)
And:

[G_{hN,hM,A_{N,1}}(r)]_{hN}^{hM} + [G_{0,h1,A_{N,2}}(r)]_{0}^{h1} = N^(−1)   (2.18)
Equations 2.17 and 2.18 are the closed-form constraints we have just derived from equation 2.14.
Remark (Solutions of equations 2.16 and 2.18). Regarding the last station's hash segment, we have to use 2 different impulses whose amplitudes A_{N,1} and A_{N,2} can be sized via equations 2.16 and 2.18. Those equations admit a possibly infinite set of solutions where both amplitudes are interdependent. Among the possible ones, we choose to have each impulse cover half of the target value:

∫_{hN}^{hM} g_{hN,hM,A_{N,1}}(r) dr = [G_{hN,hM,A_{N,1}}(r)]_{hN}^{hM} = 1/(2N)   (2.19)

And:

∫_{0}^{h1} g_{0,h1,A_{N,2}}(r) dr = [G_{0,h1,A_{N,2}}(r)]_{0}^{h1} = 1/(2N)   (2.20)
Remark (Requirements on impulse). In definition 17, we have required formatting impulse g to have an antiderivative G whose definition is known. This assumption is pretty strong but not essential. As the calculations so far show, equations 2.15 and 2.16 can actually be used to size the value of the impulse's amplitude by using an alternative method to exact integration. Throughout the rest of our analysis, equations 2.17 and 2.18 will always be referred to as the preferred amplitude sizing method; however, it will always be implicitly intended that equations 2.15 and 2.16 can replace them.
The process of designing fφ is completed, as the equations above can be used to calculate all impulse amplitudes A1, A2, …, A_{N,1}, A_{N,2}:

Proposition 11 (Defining function fφ). In order to reach load balancing in the ring, the following operations are considered:
1. Formatting impulse g_{hi,h_i^M,Ai} : [hi, h_i^M] ⊂ ℝ → [0, Ai] ⊂ ℝ is defined for each station si ∈ Ω.
2. Function fφ is designed as per definition 18 by summing all impulses together.
3. Function fφ is parametric on the set of impulse amplitudes, hence its input space is ℝ^(N+2): fφ(A1, …, A_{N,1}, A_{N,2}, r) with r, Ak ∈ ℝ, Ak > 0, ∀k = 1 … N.
4. For each impulse, the corresponding antiderivative:

G_{hi,h_i^M,Ai}(r) = ∫_{0}^{r} g_{hi,h_i^M,Ai}(x) dx   (2.21)

is calculated.
5. For each impulse, by means of equations 2.17 and 2.18, the corresponding amplitude Ai is calculated.
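Proposition 11's recipe can be carried out end to end for the simplest admissible impulse shape, the rectangle, where equation 2.17 gives A_i = (1/N)/(h_i^M − hi) and equations 2.19 and 2.20 split the last station's mass in half. A numerical sketch with invented ring values (rectangles are an assumption; any impulse satisfying definition 17 would do):

```python
def build_f_phi(ids, h_max):
    """Assemble f_phi as per definition 18 from rectangular impulses whose
    amplitudes are sized via equations 2.17, 2.19 and 2.20."""
    N = len(ids)
    pieces = []                                        # (lo, hi, amplitude)
    for i in range(N - 1):
        lo, hi = ids[i], ids[i + 1]
        pieces.append((lo, hi, (1 / N) / (hi - lo)))   # equation 2.17
    pieces.append((ids[-1], h_max + 1,                 # equation 2.19
                   (1 / (2 * N)) / (h_max + 1 - ids[-1])))
    pieces.append((0.0, ids[0], (1 / (2 * N)) / ids[0]))  # equation 2.20

    def f_phi(r):
        # Impulse domains do not overlap, so at most one term is nonzero.
        return sum(A for lo, hi, A in pieces if lo <= r < hi)
    return f_phi, pieces

f_phi, pieces = build_f_phi([10.0, 40.0, 200.0], 255.0)
total_area = sum(A * (hi - lo) for lo, hi, A in pieces)
# Each inner segment carries probability 1/N, the last station's two
# pieces carry 1/(2N) each, and the total area is 1: f_phi is a PDF.
```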
Now that we know fφ’s formal definition, we need to verify that such expression
meets the constraints of a PDF:
Theorem 9 (Function fφ is a regular PDF). Let R = (Ω, r, ξ, φ, ψ) be an extended ring
with N = Ω stations and let ghi,hM
i ,Ai
: [hi, hM
i ] ⊂ R → [0, Ai] ⊂ R be the formatting
impulse for each station si ∈ Ω such that equations 2.17 and 2.18 hold. Then function fφ,
as per definition 18, is a regular PDF.
Proof. The 3 basic properties of a PDF must be met:

1. $f_\phi$ is positive and bounded, given the definition of impulse $g$:
\[
0 \le g_{h_i,h_i^M,A_i}(r) \le A_i, \quad \forall i = 1 \dots N, \; r \in \mathbb{R}
\]
2. Given its definition, $f_\phi$'s domain is the union of all the non-overlapping domains of the impulses:
\[
\bigcup_{k=1}^{N-1} [h_k, h_k^M] \cup [h_N, h_M] \cup [0, h_1] \equiv [0, h_M] \subset \mathbb{R}
\]
Thus the function is 0 outside of its definition range:
\[
f_\phi(r) = 0, \; \forall r < 0 \lor r > h_M \implies \lim_{r \to \pm\infty} f_\phi(r) = 0
\]
3. $f_\phi$'s area is unitary because of equations 2.17 and 2.18:
\[
\int_{-\infty}^{+\infty} f_\phi(r)\,dr = \int_0^{h_M} f_\phi(r)\,dr = N \cdot N^{-1} = 1
\]
The following comes as a direct consequence of theorem 9:

Corollary 9.1 (Function $F_\phi$ is a regular CDF). Function $F_\phi : \mathbb{R} \to \mathbb{R}$ has the following form:
\[
F_\phi(r) = \sum_{k=1}^{N-1} G_{h_k,h_k^M,A_k}(r) + G_{h_N,h_M,A_{N,1}}(r) + G_{0,h_1,A_{N,2}}(r) \tag{2.22}
\]
It is a regular CDF, and it is r.v. $h_\phi$'s CDF.
Proof. Immediate by considering r.v. $h_\phi$'s CDF's definition:
\[
\begin{aligned}
F_\phi(r) &= \int_{-\infty}^{r} f_\phi(x)\,dx = \int_{-\infty}^{r} \left[ \sum_{k=1}^{N-1} g_{h_k,h_k^M,A_k}(x) + g_{h_N,h_M,A_{N,1}}(x) + g_{0,h_1,A_{N,2}}(x) \right] dx \\
&= \int_{-\infty}^{r} \sum_{k=1}^{N-1} g_{h_k,h_k^M,A_k}(x)\,dx + \int_{-\infty}^{r} g_{h_N,h_M,A_{N,1}}(x)\,dx + \int_{-\infty}^{r} g_{0,h_1,A_{N,2}}(x)\,dx \\
&= \sum_{k=1}^{N-1} \int_{-\infty}^{r} g_{h_k,h_k^M,A_k}(x)\,dx + \int_{-\infty}^{r} g_{h_N,h_M,A_{N,1}}(x)\,dx + \int_{-\infty}^{r} g_{0,h_1,A_{N,2}}(x)\,dx \\
&= \sum_{k=1}^{N-1} \Big[ G_{h_k,h_k^M,A_k}(x) \Big]_{-\infty}^{r} + \Big[ G_{h_N,h_M,A_{N,1}}(x) \Big]_{-\infty}^{r} + \Big[ G_{0,h_1,A_{N,2}}(x) \Big]_{-\infty}^{r}
\end{aligned}
\]
Where $G_{h_k,h_k^M,A_k}$ is defined according to equation 2.21. Given the impulse's definition, its antiderivative and the binding to hash segments as per definition 18, we know that the following holds:
\[
\lim_{r \to \infty} g_{r_1,r_2,A}(r) = 0 \;\land\; r_1, r_2 \ge 0 \implies \lim_{r \to -\infty} G_{r_1,r_2,A}(r) = 0
\]
Hence, leading to the following result:
\[
\Big[ G_{r_1,r_2,A}(x) \Big]_{-\infty}^{r} = G_{r_1,r_2,A}(r) - \lim_{r \to -\infty} G_{r_1,r_2,A}(r) = G_{r_1,r_2,A}(r), \quad \forall r \in \mathbb{R}
\]
Which leads us to equation 2.22. Finally, theorem 9 proved that $f_\phi(x)$ is a regular PDF, which covers all aspects of the thesis.
Designing r.v. $h_\phi$

Although proposition 11 describes the procedure for calculating $f_\phi$, it does not provide a way to build r.v. $h_\phi$ such that it generates hashes according to that function. This problem is known in the literature as random variable generation⁸. By employing this technique, we are able to get the algorithm for calculating φ-hashes, which allows us to balance load in the ring. For the sake of completeness, the proof of this process is described below:
Theorem 10 (R.v. generation via inverse transform). Let $F : \mathbb{R} \to [0, 1] \subset \mathbb{R}$ be a continuous invertible function meeting the characteristics of a CDF:

1. Bounded in $[0, 1]$: $0 \le F(r) \le 1, \; \forall r \in \mathbb{R}$.
2. $\lim_{r \to -\infty} F(r) = 0$.
3. $\lim_{r \to +\infty} F(r) = 1$.
4. Monotone increasing: $r_1 < r_2 \implies F(r_1) \le F(r_2), \; \forall r_1, r_2 \in \mathbb{R}$.

Let $U \in \mathbb{R}$ be a continuous uniformly distributed r.v. over $[0, 1] \subset \mathbb{R}$. Define $X \in \mathbb{R}$ as a r.v. such that the following transformation holds:
\[
X = F^{-1}(U) \tag{2.23}
\]
Where $F^{-1} : [0, 1] \subset \mathbb{R} \to \mathbb{R}$ denotes $F$'s inverse function. Then $X$ is distributed as $F$:
\[
\Pr\{X \le x\} = F(x), \quad \forall x \in \mathbb{R} \tag{2.24}
\]
Proof. We need to prove that equation 2.23 causes r.v. $X$ to be distributed according to CDF $F$. Starting from equation 2.24, we need to prove the following:
\[
\Pr\left\{ F^{-1}(U) \le x \right\} = F(x), \quad \forall x \in \mathbb{R}
\]
Since $F$ is invertible, the function is both injective and surjective. Also, $F$ is, by hypothesis, a continuous function. Thanks to those 2 conditions, we can apply $F$ to both sides of the inequality under the probability sign:
\[
\Pr\left\{ F\left(F^{-1}(U)\right) \le F(x) \right\} = F(x) \implies \Pr\{U \le F(x)\} = F(x), \quad \forall x \in \mathbb{R}
\]
The sign of the inequality is left unchanged because $F$ is monotone increasing. Let now $a = F(x)$; as $x$ ranges in $\mathbb{R}$, $a$ ranges in $[0, 1]$ because $F$ is a CDF and emits values in that interval. So the previous equation becomes:
\[
\Pr\{U \le a\} = a, \quad \forall a \in [0, 1] \subset \mathbb{R}
\]
⁸ As described in (Haugh, 2004), r.v. transformation via inverse transform is a known technique which makes it possible to define a random variable by using the inverse of its CDF.
But we have that $\Pr\{U \le a\} = F_U(a)$, since $U$ is a continuous uniformly distributed r.v. over $[0, 1]$. The previous equation is actually r.v. $U$'s CDF's formal definition, which proves the thesis.
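Theorem 10 translates directly into code: draw a uniform value and push it through the inverse CDF. The sketch below is purely illustrative; the exponential CDF $F(x) = 1 - e^{-x}$ stands in for $F_\phi$ only because it has a simple closed-form inverse (all names are our own, not from the thesis):

```python
import math
import random

def inverse_transform_sample(f_inv, rng):
    """Draw one sample distributed as F by applying theorem 10: X = F^-1(U)."""
    u = rng.random()  # U ~ Uniform[0, 1]
    return f_inv(u)

# Stand-in CDF: F(x) = 1 - e^(-x), whose inverse is F^-1(u) = -ln(1 - u).
exp_inv_cdf = lambda u: -math.log(1.0 - u)

rng = random.Random(42)
samples = [inverse_transform_sample(exp_inv_cdf, rng) for _ in range(100_000)]

# For an Exponential(1) r.v. the mean is 1; the empirical mean should be close.
mean = sum(samples) / len(samples)
print(round(mean, 2))
```

The same recipe applies verbatim once $F_\phi^{-1}$ is available, as established by lemma 11 below.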
Theorem 10 answers our question about the implementation of hash function φ. When considering a ring $(\Omega, r, \xi, \phi, \psi)$, we can use hash function ξ to build hash function φ in order to reach balancing. Before detailing this process, we need to make sure we meet all the conditions required by the theorem above:

Lemma 11 (Function $F_\phi$ is invertible). Let $f_\phi$ be r.v. $h_\phi$'s PDF, defined as per proposition 11. Let $F_\phi$ be its CDF. Then $F_\phi$ is invertible, and we will indicate its inverse with $F_\phi^{-1}$.
Proof. The proof is almost immediate. We consider $F_\phi$'s definition, as per corollary 9.1, and note that it is basically built up from many different impulse antiderivatives $G_i = G_{h_i,h_i^M,A_i}$, with $i = 1 \dots N$. It is possible to invert the function piecewise; given $F_\phi$'s structure, we can prove its invertibility by proving that each impulse antiderivative $G_i$ is itself invertible. Thanks to lemma 8, every impulse antiderivative is actually invertible, and this proves the thesis.
The process for achieving this result is as follows:

Proposition 12 (Hash function φ's implementation). Let $R = (\Omega, r, \xi, \phi, \psi)$ be an extended ring. In order to build hash function φ, the following operations must be performed at ring initialization time:

1. For each station $s_i \in \Omega$, compute its hash identifier $h_i = \xi(s_i)$ and sort all ID hashes by increasing value: $h_1, h_2, \dots, h_N$.
2. Define a formatting impulse $g(r, r_1, r_2, A)$ to use, parametric in $(r_1, r_2, A) \in \mathbb{R}^3$. It is possible to use one impulse definition for all stations, or a different impulse definition for each.
3. Bind each station's associated impulse $g_{r_1,r_2,A}$ to the station's hash segment $\Xi(s_i) = [h_i, h_i^M]$, thus obtaining an impulse $g_i = g_{h_i,h_i^M,A_i}$ parametric in amplitude $A_i$.
4. Use sizing equations 2.17 and 2.18 to compute, for each impulse $g_i$, the value of amplitude $A_i$ which allows balancing to be achieved, thus obtaining an array of fully qualified impulses (no longer parametric): $g_1, g_2, \dots, g_{N-1}, g_{N,1}, g_{N,2}$.
5. Build PDF function $f_\phi$ as per definition 18.
6. Compute CDF function $F_\phi$ as per corollary 9.1.
7. As per theorem 10, compute φ as:
\[
\phi(\cdot) = F_\phi^{-1}\left( \left( 2^l - 1 \right)^{-1} \cdot \xi(\cdot) \right) \tag{2.25}
\]
Equation 2.25 represents hash function φ's formal definition. We will refer to it as the Balancing Equation.
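When $F_\phi$ has no convenient closed-form inverse, step 7 can still be implemented numerically: since $F_\phi$ is monotone increasing, $F_\phi^{-1}$ can be evaluated by bisection. The sketch below is our own illustration (function names and tolerances are assumptions, not from the thesis); `F` stands for any callable monotone CDF over $[0, h_M]$:

```python
def inverse_cdf(F, target, lo, hi, tol=1e-9):
    """Evaluate F^-1(target) by bisection; F must be monotone increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if F(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def phi(xi_hash, F, l, h_max):
    """Balancing equation 2.25: phi(.) = F^-1( (2^l - 1)^-1 * xi(.) )."""
    u = xi_hash / (2**l - 1)          # normalize the regular hash into [0, 1]
    return inverse_cdf(F, u, 0.0, h_max)

# Toy check with a uniform CDF over [0, h_max]: phi reduces to the identity rescaling.
h_max = 1023.0
F_uniform = lambda r: min(max(r / h_max, 0.0), 1.0)
print(round(phi(511, F_uniform, 10, h_max), 3))  # ≈ 511.0
```

Bisection costs $O(\log(h_M/\text{tol}))$ CDF evaluations per hash, which is why a closed-form piecewise inverse (as in the example of section 2.4) is preferable when available.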
FIGURE 2.4: Hash segments mapped onto φ-segments, illustrating how hash function φ works. The top part of the diagram shows the φ hash-space (the interval [0, 1]): each station is assigned a segment of width 1/N, except station $s_N$, whose coverage is split into two half segments of width 1/(2N) at the two extremes. The bottom part shows the ξ hash-space (the interval $[0, h_M]$ with unevenly sized segments $h_1, h_2 - h_1, \dots, h_N - h_{N-1}, h_M - h_N$).
2.3.3 Understanding how φ works

Equation 2.25 allows us to balance the ring. Before moving on, we would like to point out a few important facts regarding the balancing equation to better understand how it works.

Figure 2.4 clearly demonstrates the basic principle behind φ: the balancing equation essentially maps an unevenly distributed space (regular hashes, that is, ξ hashes) onto an evenly distributed space (φ hashes). The even partitioning of the φ space is based on the number of stations N.

We remark that, given its definition (equation 2.25), hash function φ operates over an interval spanning $[0, 1] \subset \mathbb{R}$. This interval is subdivided into equal segments, each one assigned to a station. We will call them φ-segments, and we will later use the term φ-coverage to indicate the same concept.
2.4 Ring balancing example

For the sake of completeness, we are going to provide an example of how to build hash function φ for a small ring consisting of a few stations. Since our purpose here is to provide a realistic, easy-to-understand scenario, we are going to consider the following conditions:

1. The ring is made of N = 6 stations.
2. The leaf set radius is the minimum: r = 1.
3. We consider extremely short hashes with l = 10 bits. This means that $h_M = 2^{10} - 1 = 1023$.

Remark. Hash function ξ will be considered but not defined, as we are not going to use it explicitly in our calculations.
We will now follow proposition 12’s prescriptions.
Station Hash identifier HS (Ξ(si) ⊂ N) HS (Ξ(si) ⊂ R)
s1 = St. 1 h1 = 101 {101 . . . 209} [101, 210)
s2 = St. 2 h2 = 210 {210 . . . 339} [210, 340)
s3 = St. 3 h3 = 340 {340 . . . 552} [340, 553)
s4 = St. 4 h4 = 553 {553 . . . 700} [553, 701)
s5 = St. 5 h5 = 701 {701 . . . 997} [701, 998)
s6 = St. 6 h6 = 998 {998 . . . 1023} ∪ {0 . . . 100} [998, 1023] ∪ [0, 101)
TABLE 2.1: Showing, in the example, values of hash identifiers and
hash segments for each station.
2.4.1 Defining the ring
We must first define ring R = (Ω, r, ξ, φ, ψ). Remember that hash function φ is the
last quantity we will define.
Each station si ∈ Ω must first define an identifier and compute hash function ξ
on that in order to calculate its hash identifier hi.
As table 2.1 shows, once each station receives its hash identifier, hash segments are defined so that routing is possible in the ring.
2.4.2 Defining the formatting impulse

According to the second point of proposition 12, we must now define the formatting impulse to use in order to achieve balancing. We can choose to either define one impulse type for all stations or a different impulse type for each one of them. We are going to choose the first option for 2 reasons:

1. Choosing one single impulse type is easier from a computational point of view, as it requires formulating the parametric antiderivative only once.
2. Choosing more impulse types has not proved, so far, to be any more beneficial than using a single one. The quality of the balancing is not impacted by this choice⁹.
For the sake of simplicity, we are going to consider a very simple impulse type: the rectangular impulse:
\[
g_{r_1,r_2,A}(r) = A \cdot \Pi\left( \frac{r - r_1}{r_2 - r_1} \right) \tag{2.26}
\]
Having:
\[
\Pi(r) =
\begin{cases}
1 & 0 \le r \le 1 \\
0 & \text{otherwise}
\end{cases}
\]
The impulse we have chosen in equation 2.26 complies with definition 17. Since the proof is immediate, we will not cover it.
⁹ This is based on observations from the simulations run so far. However, no proof either supports or denies this theory.
2.4.3 Binding impulses to stations

Now we need to bind every station's HS to an impulse in order to get a collection of impulses, all parametric with respect to their amplitudes:
\[
\begin{aligned}
s_1 &\implies g_1 = g_{101,210,A_1} \\
s_2 &\implies g_2 = g_{210,340,A_2} \\
s_3 &\implies g_3 = g_{340,553,A_3} \\
s_4 &\implies g_4 = g_{553,701,A_4} \\
s_5 &\implies g_5 = g_{701,998,A_5} \\
s_6 &\implies g_{6,1} = g_{998,1023,A_{6,1}}, \; g_{6,2} = g_{0,101,A_{6,2}}
\end{aligned}
\]
Remark (Impulse for the last station). The last station in the ring receives special treatment because it may span two hash intervals, since it includes both the highest hash $h_M$ and the lowest one (the null hash). Thus 2 impulses are actually used.
2.4.4 Calculating amplitudes

At the moment, we have a collection of N + 1 = 7 impulses, all parametric with respect to their amplitudes. We need to find the values of those amplitudes in order to have these impulses define $f_\phi$ in a way that hash function φ can balance the load in the ring.

Equations 2.17 and 2.18 will be used to size those impulses. Since the sizing equations require the computation of the impulse's antiderivative, we need to calculate it first. Given its extremely simple definition, the calculation is almost immediate:
\[
G_{r_1,r_2,A}(r) = \int_0^r g_{r_1,r_2,A}(x)\,dx = \int_0^r A \cdot \Pi\left( \frac{x - r_1}{r_2 - r_1} \right) dx
= A \cdot \left[ \Pi\left( \frac{r - r_1}{r_2 - r_1} \right) \cdot (r - r_1) + (r_2 - r_1) \cdot H(r - r_2) \right]
\]
Where $H(r)$ is Heaviside's step function¹⁰:
\[
H(r) =
\begin{cases}
1 & r > 0 \\
0 & \text{otherwise}
\end{cases}
\]
In order to apply equations 2.17 and 2.18, we first need to calculate the following quantity; the binding to hash segments will be done later:
\[
\Big[ G_{r_1,r_2,A} \Big]_{r_1}^{r_2}
= A \cdot \left[ \Pi\left( \frac{r - r_1}{r_2 - r_1} \right) \cdot (r - r_1) + (r_2 - r_1) \cdot H(r - r_2) \right]_{r_1}^{r_2}
= A \cdot (r_2 - r_1)
\]
We can now apply equation 2.17 to $g_1 \dots g_5$:
¹⁰ Heaviside's function exists in the literature in different forms; here we consider the variation where the function assumes only 2 values: 0 and 1.
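The rectangular impulse and its antiderivative are simple enough to code up directly. The sketch below is our own illustrative translation of equation 2.26 and the antiderivative above (function names are assumptions); it checks the sizing identity $[G]_{r_1}^{r_2} = A \cdot (r_2 - r_1)$:

```python
def rect(r):
    """Unit rectangle Pi(r): 1 on [0, 1], 0 elsewhere."""
    return 1.0 if 0.0 <= r <= 1.0 else 0.0

def g(r, r1, r2, A):
    """Rectangular formatting impulse: g_{r1,r2,A}(r) = A * Pi((r - r1)/(r2 - r1))."""
    return A * rect((r - r1) / (r2 - r1))

def heaviside(r):
    """Two-valued Heaviside step: 1 for r > 0, else 0."""
    return 1.0 if r > 0 else 0.0

def G(r, r1, r2, A):
    """Antiderivative of g: ramps linearly from 0 up to A * (r2 - r1) across [r1, r2]."""
    return A * (rect((r - r1) / (r2 - r1)) * (r - r1) + (r2 - r1) * heaviside(r - r2))

# Sizing identity used by equations 2.17/2.18, on station s_1's segment [101, 210):
r1, r2, A = 101.0, 210.0, 1.0 / 654.0
print(G(r2, r1, r2, A) - G(r1, r1, r2, A))  # equals A * (r2 - r1) = 1/6
```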
\[
\Big[ G_{h_k,h_k^M,A_k} \Big]_{h_k}^{h_k^M} = N^{-1} \implies A_k = \left( h_k^M - h_k \right)^{-1} \cdot N^{-1}, \quad \forall k = 1 \dots 5
\]
The same goes for $g_{6,1}$ and $g_{6,2}$, where we apply equations 2.19 and 2.20:
\[
\Big[ G_{h_N,h_M,A_{N,1}} \Big]_{h_N}^{h_M} = \frac{1}{2N} \implies A_{N,1} \cdot (h_M - h_N) = \frac{1}{2N} \implies A_{N,1} = \frac{1}{2(h_M - h_N) \cdot N}
\]
\[
\Big[ G_{0,h_1,A_{N,2}} \Big]_{0}^{h_1} = \frac{1}{2N} \implies A_{N,2} \cdot h_1 = \frac{1}{2N} \implies A_{N,2} = \frac{1}{2 h_1 \cdot N}
\]
We now have the values of all the impulse amplitudes:
\[
\begin{aligned}
A_1 &= (h_2 - h_1)^{-1} \cdot N^{-1} = (210 - 101)^{-1} \cdot 6^{-1} = 654^{-1} \\
A_2 &= (h_3 - h_2)^{-1} \cdot N^{-1} = (340 - 210)^{-1} \cdot 6^{-1} = 780^{-1} \\
A_3 &= (h_4 - h_3)^{-1} \cdot N^{-1} = (553 - 340)^{-1} \cdot 6^{-1} = 1278^{-1} \\
A_4 &= (h_5 - h_4)^{-1} \cdot N^{-1} = (701 - 553)^{-1} \cdot 6^{-1} = 888^{-1} \\
A_5 &= (h_6 - h_5)^{-1} \cdot N^{-1} = (998 - 701)^{-1} \cdot 6^{-1} = 1782^{-1} \\
A_{6,1} &= (h_M - h_6)^{-1} \cdot (2N)^{-1} = (1023 - 998)^{-1} \cdot 12^{-1} = 300^{-1} \\
A_{6,2} &= h_1^{-1} \cdot (2N)^{-1} = 101^{-1} \cdot 12^{-1} = 1212^{-1}
\end{aligned}
\]
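These amplitudes can be checked mechanically: each full impulse must contribute an area of 1/N and each of the two half impulses 1/(2N), so the total area of $f_\phi$ must be 1. A minimal sketch of this verification using exact rational arithmetic (variable names are ours):

```python
from fractions import Fraction

N = 6
ids = [101, 210, 340, 553, 701, 998]  # sorted hash identifiers h_1..h_6
h_max = 1023

# Full impulses for s_1..s_5: A_k = 1 / ((h_{k+1} - h_k) * N)
amps = [(ids[k], ids[k + 1], Fraction(1, (ids[k + 1] - ids[k]) * N)) for k in range(5)]
# Two half impulses for s_6, over [h_6, h_M] and [0, h_1), each sized for 1/(2N)
amps.append((ids[5], h_max, Fraction(1, (h_max - ids[5]) * 2 * N)))
amps.append((0, ids[0], Fraction(1, ids[0] * 2 * N)))

print([a for _, _, a in amps])  # 1/654, 1/780, 1/1278, 1/888, 1/1782, 1/300, 1/1212

# Total area of f_phi: sum of amplitude * width over all rectangular impulses
area = sum(a * (r2 - r1) for r1, r2, a in amps)
print(area)  # 1
```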
2.4.5 Computing functions

Now that all impulses have been properly sized and we have their amplitude values, function $f_\phi$ is fully defined. As a direct result, function $F_\phi$ is also fully defined. Given its simplicity, $F_\phi$ can easily be inverted piecewise.
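For the rectangular impulse, $F_\phi$ is piecewise linear, so its piecewise inverse is straightforward. The sketch below is our own illustrative code built on the example's numbers (names and the routing helper are assumptions): it inverts $F_\phi$ segment by segment and applies the balancing equation to uniformly distributed ξ-hashes, after which each station should receive roughly 1/N of the load:

```python
import random

N, h_max = 6, 1023
ids = [101, 210, 340, 553, 701, 998]   # sorted station hash identifiers

# (r1, r2, probability mass) per rectangular impulse, in increasing-r order:
# wrap-around half segment [0, h_1) first, the five full segments, then [h_6, h_M].
pieces = [(0, ids[0], 1 / (2 * N))]
pieces += [(ids[k], ids[k + 1], 1 / N) for k in range(5)]
pieces += [(ids[5], h_max, 1 / (2 * N))]

def F_inv(u):
    """Piecewise-linear inverse of F_phi: locate the piece holding mass u."""
    acc = 0.0
    for r1, r2, mass in pieces:
        if u <= acc + mass:
            return r1 + (u - acc) / mass * (r2 - r1)  # linear interpolation
        acc += mass
    return float(h_max)

def station_of(h):
    """s_i owns [h_i, h_{i+1}); s_6 owns [h_6, h_M] plus the wrap-around [0, h_1)."""
    for i in reversed(range(6)):
        if h >= ids[i]:
            return i + 1 if i < 5 else 6
    return 6

rng = random.Random(7)
loads = {s: 0 for s in range(1, 7)}
for _ in range(60_000):
    xi_hash = rng.randrange(2**10)      # a uniformly distributed xi-hash
    h = F_inv(xi_hash / (2**10 - 1))    # balancing equation 2.25
    loads[station_of(h)] += 1
print(loads)  # each station receives close to 60_000 / 6 = 10_000 packets
```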
Chapter 3

Simulation results

In this chapter we describe the simulations which were performed in order to validate, from a practical standpoint, all the results analytically achieved in chapter 2.

As a generic overview, two different simulation systems were designed and developed:

• Regular simulations. A high-level engine developed in Matlab¹ and Mathematica², targeting small-size simulations in order to produce real data to validate the whole system.

• High performance simulations. A low-level engine developed in C/C++, targeting large-size simulations in order to produce high-fidelity data to validate the system in real-life conditions.

Both solutions were used to prove that all the analytical results in chapter 2 do provide a valid description of the system's behaviour.
3.1 Small-size simulations

This simulation engine was developed to generate results in the context of a controlled environment where conditions are similar to those in real life. The main features of this simulation set are:

• Functional definition of impulses and functions.
• Real hashes are calculated using the standard Crypto³ library.
• All big integers are normalized into a smaller interval.

Of course, given its nature, the engine comes with some limitations and downsides too:

• Even though real hashes are used, their values are normalized to fit a smaller interval. Thus, these values cannot be considered high fidelity.
• Simulations are slow. Given the application of functional calculus, impulse functions and their antiderivatives are defined in open form, thus requiring numerical integration to be performed every time.
¹ Mathworks Matlab: https://www.mathworks.com/products/matlab.html.
² Wolfram Mathematica: https://www.wolfram.com/mathematica/.
³ The OpenSSL library was used to compute regular hashes. More information is available in appendix A.
FIGURE 3.1: Showing the Polar Hash Coverage Plot (PHCP) of a simulation on an N = 10 station ring after sending m = 10³ packets (left panel: "PHCP non-balanced case"; right panel: "PHCP in balanced case"). Both plots show the configuration of the station hash segments together with the final load levels at the end of the simulation. The plot on the left refers to a normal ring (hash function ξ applied); the one on the right refers to an extended ring where hash function φ, based on the same ξ, is considered. The same packets were sent in both rings.
• Given the different subsystems being used, numerical accuracy is not guaranteed.

On the one hand, this set of simulations is characterized by a relatively easy implementation; on the other, it comes with certain intrinsic limitations (mainly related to the subsystems being used). The other set of simulations is meant to target those issues and provide better numerical fidelity.
3.1.1 Verifying load balance

One of the most basic sets of simulations is used to verify that the algorithm effectively helps the ring achieve load balance given different hash segment distributions among stations in the hash range interval $[0, h_M]$. These simulations perform the following operations:

1. The hash space is divided into N random parts, each assigned to one station.
2. A total of m packets (random numeric vectors) are generated and fed to the hash function, which can be either ξ or φ (both are considered in order to compare per-station loads at the end of one simulation).
3. Packets are assigned to stations according to routing function ψ, based on the selected hash function.
4. Final results are collected: the number of packets per station is tracked.
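The four steps above can be sketched as follows. This is an illustrative stand-in, not the thesis engine: SHA-256 truncated to l bits plays the role of ξ, and routing simply assigns each hash to the station owning the segment it falls in (all names are our own assumptions):

```python
import hashlib
import random

l = 16                      # hash length in bits (illustrative choice)
h_max = 2**l - 1
N, m = 6, 10_000

rng = random.Random(0)
# Step 1: split the hash space into N random parts (segment start points).
starts = sorted(rng.randrange(h_max) for _ in range(N))

def xi(packet: bytes) -> int:
    """Stand-in for hash function xi: SHA-256 truncated to l bits."""
    return int.from_bytes(hashlib.sha256(packet).digest(), "big") % (h_max + 1)

def route(h: int) -> int:
    """Step 3: station k owns [starts[k], starts[k+1]); hashes below starts[0] wrap."""
    for k in reversed(range(N)):
        if h >= starts[k]:
            return k
    return N - 1  # wrap-around: lowest hashes belong to the last station

# Steps 2 and 4: generate packets, hash, route, and track the load per station.
loads = [0] * N
for _ in range(m):
    packet = rng.randbytes(32)
    loads[route(xi(packet))] += 1
print(sum(loads))  # all m packets were routed
```

Running the same loop with φ in place of ξ (same seed, hence the same packets) yields the side-by-side comparison described in the text.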
FIGURE 3.2: Showing the load state $|s_k|$ (in blue) of each station $s_k$ (St. 1 through St. 9) as time grows. In this simulation, hash function ξ is used (normal ring). The green line shows the expected (uniform) load state for each point in time.
Definition 19 (Polar Hash Coverage Plot (PHCP)). Let $R = (\Omega, r, \xi, \phi, \psi)$ be a ring with $|\Omega| = N$ stations. The Polar Hash Coverage Plot is a set of N vectors in the 2D space:
\[
E = \left\{ A_k \cdot e^{i \omega_k}, \; k = 1 \dots N \right\}
\]
Each vector's amplitude indicates the packet load of the station it refers to, while its phase indicates the station's hash-segment amplitude and position in the ring:
\[
A_k = \frac{\eta_k}{m} \qquad \omega_k = \frac{|\Xi(s_k)|}{h_M} + \omega_{k,0}
\]
Where $\omega_{k,0} = \sum_{j=0}^{k-1} \omega_j$ indicates the phase shift due to all stations preceding $s_k$.

Figure 3.1 shows the PHCP of the same simulation in which the same m packets have been sent to the network, with and without load-balancing hash function φ in place. As it is possible to see, the vectors in the second plot (on the right) have roughly equal amplitudes, in contrast with the first diagram (on the left), indicating that hash function φ is effectively able to provide balancing on the same set of packets across the stations in the ring.
3.1.2 Evaluating load levels per station

Another set of simulations is used to measure the difference between the final load state of each station and the expected (uniform) one after the network has been fed a certain number of packets.
FIGURE 3.3: Showing the load state $|s_k|$ (in blue) of each station $s_k$ (St. 1 through St. 9) as time grows. In this simulation set (same as in figure 3.1), hash function φ is used (extended ring). The green line shows the expected (uniform) load state for each point in time.
These simulations also have the objective of showing how the network behaves with and without balancing hash function φ in place. These normal vs. extended ring scenarios are important as they allow us to visually assess the work done by φ in reshaping the load distribution in the network. For this analysis to be effective, it is crucial that both scenarios are evaluated on the exact same set of generated packets. To guarantee this condition, when randomly generating packets, the same seed is used when evaluating the normal and the extended ring during one simulation session.

Figures 3.2 and 3.3 show, respectively, the same simulation session first conducted on the ring and then again on the same ring but extended (φ in place). As it is possible to see, each station reports its load level as time grows. In the normal ring, station load levels do not all meet the expected load level $\eta = \frac{m}{N}$. On the other hand, when φ is in place (figure 3.3), all station loads tend to match the expected levels.

Remark (Discrete time). This set of simulations is very important as the load state in each station is evaluated over the whole duration. In this context, time is considered discrete and time instants are associated to events. The only event being considered here is the generation of a random packet.
3.2 Large-size simulations

This simulation engine was developed for two reasons: getting high-fidelity simulation data, and providing an initial implementation of the algorithm. As a direct result, we could deliver the first implementation of the algorithm described in the previous chapter. The main features of this system are the following:

• Being developed in C/C++, the application is very fast at computing regular hashes and performing φ-hash processing.
• Simulations can be run sequentially or in parallel (packet generation).
• The standard Crypto library is used, therefore all generated hashes are real hashes and not simulated quantities.
• Big integers are employed, so no scaling is performed to adapt real data to simulation artifacts, hence providing more fidelity to real scenarios.
Simulation flow. In the context of this simulation effort, several computation- and memory-intensive runs were scheduled on a dedicated pool of servers. A detailed description of the infrastructure being used is available in appendix A; here we provide a brief synopsis of how these simulations work:

1. When the engine starts, an initialization phase ensures memory and other preconditions.
2. Random packets are generated, as random bitstreams of specific size. Different sizes can be specified, and during one simulation the size can range in a certain interval.
3. Hashes (using ξ and φ) are computed.
4. Routing of packets is performed for each hash by using application ψ.
5. All results are persisted in memory. Data manipulation is then performed in order to extract the information of interest.
6. Simulation output files are generated.
7. Post-processing is performed by generating diagrams and aggregated quantities from the output files.
3.2.1 Overview

Many simulations have been run, all targeting different network structures and conditions. Before showing results, we need to provide a synopsis of which configurations were considered, in order to understand what was actually simulated. Every simulation run is characterized by the following properties:

• Number of stations in the ring: N. This parameter directly impacts the size of the network.
• Number of generated packets: m.
• Leaf radius r. For all simulations, the radius is unitary: r = 1.
• Packet size S ∈ {100Kb, 1Mb, 3Mb, 10Mb}.

Since the same configuration can be run multiple times with different seed values, aggregate properties describing one simulation group/batch include:

• Number of simulations in the batch: $C \in \mathbb{N}$.
• Overall simulation time of the batch: $T \in \mathbb{R}$.
Grouping simulations. Simulations conducted in the context of this research can be classified using the parameters described above. The following batches were run:
[Diagram: a grid of simulation-batch tiles, one per configuration. The batches include: runs with S = 100Kb, m = 13.1M, N = 10; runs with S = 1Mb, m = 13.1M, N = 10; parallel runs with S = 1Mb, m = 89M, N = 30; parallel runs with S = 3Mb, m = 89M, N = 30; a parallel run with S = 10Mb, m = 89M, N = 30; parallel runs with S = 1Mb, m = 10M, N = 50; and parallel runs with S = 1Mb, m = 10M, N = 100.]
The diagram above illustrates the different configurations used to run the simulations. Each tile reports: the hash functions computed (ξ, φ, top-left corner), a seed marker (#, bottom-left corner), the packet size S, the number of generated packets m, and the number of stations N; tiles carrying the PAR badge denote parallel runs, and tiles are classified as regular, parallel, intensive, or parallel intensive. For each simulation, a different seed was used (thus the #-symbol in the bottom-left corner) and both regular and φ-hashes were computed (top-left corner).
FIGURE 3.4: Plotting the standard deviation σ vs the dispersion factor σ²/η of generated ξ-hashes (top row) and φ-hashes (bottom row) during simulation batches (from left to right): N = 10 (40 simulations), N = 30 (60 simulations) and N = 50 (10 simulations).
3.2.2 Evaluating the variance of hash segment amplitudes

Two pieces of information were of interest and, accordingly, two different types of data were extracted from every simulation:

1. The statistical variation of regular hash values and φ-hash values, in order to see whether patterns exist.
2. The statistical relation between the distribution of hash segment amplitudes and the distribution of φ-hash values. Since more φ-hashes are routed into a specific segment if that segment has a small amplitude, we want to assess whether special patterns arise in φ-hash values in case of high variance in segment amplitudes.

Figure 3.4 reports possible patterns between the variations of regular and φ-hashes. In general, we can conclude that φ-hashes have a more localized behaviour, as their variations are more contained than those of regular hashes computed via hash function ξ.

This is expected: if we consider the whole hash space $[0, h_M] \subset \mathbb{R}$, hash function ξ has a uniform distribution over that range; on the other hand, φ is characterized by a distribution which allocates hashes with different probabilities in different sub-intervals of the overall hash range. This last observation is the main reason why we want to investigate the relation between the variance of segment lengths and the variance of φ-hashes.

Hashes and segment amplitudes. As anticipated, the following questions were of interest with regards to the behaviour of φ-hashes and the distribution of hash segment lengths $|\Xi(s_i)|, \forall s_i \in \Omega$:

1. If all stations in the ring are arranged in a way such that the distribution of hash segment lengths is approximately uniform, what behaviour should we expect from φ-hashes?
FIGURE 3.5: Plotting the standard deviation of hash segment lengths (HS amplitudes) against the standard deviation of φ-hashes during each simulation in the batches (from top to bottom): N = 10 (40 simulations), N = 30 (60 simulations) and N = 50 (10 simulations).
2. If all stations in the ring define very different hash segments (some very wide and some very short), what behaviour should we expect from φ-hashes?

The diagrams in figure 3.5 try to capture such behaviour and describe it from a statistical point of view. Both questions raised above can be mathematically mapped onto one statistical descriptor which, therefore, becomes of high interest in this context: the standard deviation of hash segment amplitudes and of φ-hashes.

By looking at those diagrams, we can assess a very weak trend: the variance of φ-hashes tends to be higher the higher the variance of hash segment lengths is. As pointed out, this is classifiable as a pattern, but only in a very prudent way, as the trend is not immediately evident and there are some cases where such a trend
FIGURE 3.6: Plotting station loads $\eta_k^{(\xi)}$ (no balancing) and $\eta_k^{(\phi)}$ (balanced ring) at the end of four N = 30 simulations with different seeds.
does not show up. Our conclusion is that a correlation between segment lengths $|\Xi(s_i)|$ and hashes $h_\phi$ is probably present; however, more variables are involved and more investigation in this regard is necessary.

3.2.3 Evaluating load levels per station

This set of high-performance simulations has been used, of course, to verify the quality of the balancing performed by hash function φ.

Figure 3.6 shows station loads in the context of four different simulations with 30 stations. As it is possible to see, packets are balanced across stations, and the balancing is evident when comparing loads to simulations where no balancing is performed.
Migration flows

An extremely important concept in the context of these simulations and, more generally, of this research effort, is the following:

Definition 20 (Migration flow ξ-φ). Let $R = (\Omega, r, \xi, \phi, \psi)$ be a ring and $p \in P$ a packet. Let $s_i = \psi^{(\xi)}(p)$ be the station the packet is routed to when using hash function ξ, and
FIGURE 3.7: Migration flows in an N = 30 ring (stations S0 through S29).
let sj = ψ(φ)(p) be the station where the packet is routed to by using hash function φ. The
virtual transition that packet p experiences from si to sj is called transition flow.
Definition 20 is the foundation of a differential analysis conducted during all
simulations. By collecting all hashes and mapping them to stations, it is possible, at
the end of a simulation, to extract all migration flows. This identifies as many flows
as generated packets; the final step is aggregating this information and counting
duplicates.
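The aggregation step just described can be sketched as follows; the station indices and the per-packet (ξ-station, φ-station) pairs are placeholders for what a simulation run would record:

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// A migration flow is the pair (xi-selected station, phi-selected station)
// that a packet virtually experiences when routing switches from xi to phi.
using Flow = std::pair<std::size_t, std::size_t>;

// Aggregate per-packet flows into a count per (si, sj) pair: the data
// behind a diagram like figure 3.7.
std::map<Flow, std::size_t> aggregate_flows(const std::vector<Flow>& per_packet) {
    std::map<Flow, std::size_t> counts;
    for (const Flow& f : per_packet) ++counts[f];  // count duplicates
    return counts;
}
```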
Visualizing migration flows is difficult using tables, thus circo-diagrams4 are
employed instead. Figure 3.7 shows migration flows for a simulation with 30 stations
and 1M generated packets. The diagram clearly describes how packets virtually
move from one station to another when hash function φ is used for routing.
Thanks to these diagrams, it is possible to state the following:
Proposition 13 (Packet migrations). Let R = (Ω, r, ξ, φ, ψ) be a ring, then stations be-
have in two different ways:
• Wide-coverage stations are more likely to donate packets to other stations.
• Narrow-coverage stations are more likely to accept packets from other stations.
4 Circo-diagrams have been generated using the software Circos: http://circos.ca/. To read
these diagrams, see the on-line documentation.
It is also worth noticing that all migration flows are localized to adjacent stations
when considering one node in the ring. This pattern is interesting because, contrary
to expectations, the re-arrangement performed by φ does not move packets far from
their ξ-selected station.
Chapter 4
System API
In this chapter we want to provide a description of the different interactions the
system exposes to the end user for storing and retrieving data, and what protocols
are used in the ring to ensure those services.
In chapter 1 we covered the system architecture. As we recall, the end user
is able to interact with the storage system in order to take advantage of its services;
what happens on the other side, in the ring, is not known to them. The questions
we want to answer are: "What happens when the user sends data to be
stored?" and "How can the user retrieve data they previously stored?".
The overall system exposes a minimal API consisting of 4 primitives:
1. Store By means of this functionality, the user can transmit a DU and have it
persisted in the system. Typically, upon invoking this API, the user receives
some data in return, a token, which will be used later for retrieving that same
data.
2. Retrieve By invoking this API, the user can retrieve data previously stored in
the system. If the operation is successful, the user receives his data in return.
3. Remove The user can remove data they previously stored by invoking this
primitive. No data is returned except a status code indicating whether the
removal was successful, plus optional additional information (e.g. the total
amount of data deleted).
4. Update This functionality is used to update existing data. It effectively
results in the sequential application of a remove and a store invocation.
We are going to examine the first 2 primitives, store and retrieve, in detail, as
the others are largely based on them.
4.1 Storing data
The API for transmitting a DU and having it persisted in the system requires the user
to provide the byte stream as input. An identification token is returned if the call is
successful:

t_token store(stream_t& input)
The moment the user invokes store on DU p ∈ P, 2 things happen in sequence:
1. The token is computed by calculating the hash of the input packet h = ξ(p).
2. DU p’s size is compared with fragmentation threshold c ∈ N. If p exceeds
the threshold, |p| > c, the packet is fragmented into smaller units.
The token is returned to the user in case the storage process is successful. Given
the DHT and content addressing, it is possible to retrieve the DU later by using that
specific hash.
4.1.1 Packet fragmentation
The fragmentation process is necessary for several important reasons:
• The ring has a high level of control traffic. Given the DHT and routing al-
gorithm ψ, many transmissions occur between contiguous stations in the net-
work. In order to reduce the latency of communications, the network tends to
favour quantity over size, thus allowing many packets to be exchanged as long
as their size is small enough.
• When a packet reaches a station which is not its final destination, another
routing iteration is necessary. This means another communication must be
performed with one of the contiguous stations in the ring; however, if that
link is in use, because another packet is being transmitted, the incoming one
must be queued. To reduce the time packets spend traversing stations while
hopping (because of routing), packet size is kept reasonably low.
• By dealing with small DUs, it is possible to ensure better balancing over time.
If data were stored without being broken down into smaller pieces, units of
differing sizes would end up stored across stations; this goes against one of
the assumptions of our balancing algorithm: all data units have the same size.
As a packet is submitted for storage, the fragmentation process breaks it down
into n smaller units:
n = ⌈|p| / c⌉
In case the original DU is fragmented into smaller units, the final returned token
is still the hash of the original packet. Later on we will see that, by using the same
token, all fragments can be retrieved back. For this to happen, it is necessary to
create fragments in a specific way:
Definition 21 (Packet fragmentation). Given packet p ∈ P and fragmentation thresh-
old c ∈ N (number of bytes), then application ζ : P → 2P returns the set of fragments
{p1, p2, . . . , pn} given input p. Every returned fragment has the following format:
1. The hash of the original packet.
2. The sequence number of the fragment (needed when re-constructing the packet).
3. The hash of this fragment.
4. The data stream (up to c bytes).
As shown in figure 4.1.
FIGURE 4.1: Data unit format: control info [ξ(p) | Seq. k | ξ(pk) | …] followed by up to c bytes of data.
Remark. The frame format for non-fragmented packets, called whole packets, is the same;
however, the first field (parent hash) is null (all zeros) and the sequence number is −1, the
value used to distinguish whole packets from fragments.
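Definition 21 and the remark above can be sketched as follows; `std::hash` stands in for the cryptographic hash ξ, and all type names are illustrative assumptions, not the thesis implementation:

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Sketch of fragmentation zeta : P -> 2^P from definition 21.
struct Fragment {
    std::uint64_t parent_hash; // xi(p); 0 (null) for whole packets
    int           seq;         // fragment sequence number; -1 for whole packets
    std::uint64_t hash;        // xi(p_k), this fragment's own hash
    std::string   data;        // up to c bytes of payload
};

// Placeholder for xi; a real system would use a cryptographic hash.
std::uint64_t xi(const std::string& s) { return std::hash<std::string>{}(s); }

std::vector<Fragment> fragment(const std::string& p, std::size_t c) {
    std::vector<Fragment> out;
    if (p.size() <= c) {
        // Whole packet: null parent hash and sequence number -1.
        out.push_back({0, -1, xi(p), p});
        return out;
    }
    std::uint64_t hp = xi(p);  // parent hash shared by all fragments
    int k = 0;
    for (std::size_t off = 0; off < p.size(); off += c) {
        std::string chunk = p.substr(off, c);  // at most c bytes
        out.push_back({hp, k++, xi(chunk), chunk});
    }
    return out;
}
```

The number of produced fragments matches n = ⌈|p|/c⌉.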
4.1.2 Routing
After the fragmentation phase, which produces no fragments if the original packet's
size does not exceed threshold c, every fragment pk ∈ P is sent to the ring to be
routed by routing function ψ based on balancing hash function φ.
As anticipated in chapter 1, the ring is never directly accessed by users; proxies
are employed instead. A proxy station serves as an intermediary, balancing access
to the ring and hiding all the ring's stations from the outside world. When the
store primitive is invoked, the system sends, from the user's computer, a store
request SReq to one of the known available proxies. On receiving a request, the
proxy decides which station of the ring to pick for letting SReq enter the network.
The decision is made by a balancing algorithm based on each known station's link
usage: the proxy's goal is to avoid overloading any one station with incoming
traffic.
Once the request reaches one of the stations, algorithm 1 will do the job and
guarantee that a station is found for the packet. The system client ensures that all
packets undergo the same process. If every fragment is successfully stored, the orig-
inal packet’s hash is returned to the user as a token for retrieving all fragments.
Asynchronous communications For performance reasons, the best approach is
to make every communication asynchronous. This means that when a node (one of
the stations, a proxy or the user client node) sends an SReq, it does not keep the
connection open, waiting for the final response, until the packet is successfully
routed; it is much better to send the request as a datagram transmission. The sender
will receive a store response SRes when its request has been processed. Every
intermediate node that forwards the request waits for the corresponding response
and, after receiving it, constructs its own response for the node that sent it the
request in the first place. This improves bandwidth utilization, as links are not
held for long times.
Employing asynchronous transmissions complicates the communication protocol
but allows better performance. One complication is the timers every station has to
implement to raise an error when a response is not delivered within a reasonable
time (request transmission failure). In a synchronous scheme, timers are handled
by the transmission protocol (e.g. TCP/IP) transparently to the caller; in
asynchronous scenarios, the station has to implement a timer on its own for each
sent request. Figure 4.2 shows both
FIGURE 4.2: Synchronous vs. asynchronous communication model when storing a single packet (User, Proxy, Entry station and Dst station exchanging SReq/SRes messages).
communication schemes.
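The per-request timers just described might be tracked as in this sketch; the class and method names are illustrative assumptions, not part of the actual WCF implementation:

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>

using Clock = std::chrono::steady_clock;

// Each outgoing SReq is remembered with a deadline; requests whose response
// has not arrived by the deadline are reported as transmission failures.
class PendingRequests {
public:
    void on_sent(std::uint64_t req_id, Clock::time_point now,
                 std::chrono::milliseconds timeout) {
        deadlines_[req_id] = now + timeout;
    }
    // A matching SRes arrived in time: stop tracking the request.
    void on_response(std::uint64_t req_id) { deadlines_.erase(req_id); }

    // Returns how many tracked requests timed out (and forgets them).
    std::size_t expire(Clock::time_point now) {
        std::size_t failed = 0;
        for (auto it = deadlines_.begin(); it != deadlines_.end();) {
            if (it->second <= now) { it = deadlines_.erase(it); ++failed; }
            else ++it;
        }
        return failed;
    }
private:
    std::map<std::uint64_t, Clock::time_point> deadlines_;
};
```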
As part of the effort in writing tests and simulations of the algorithm, an actual
implementation of the ring has been developed in Microsoft .NET using the
communication library WCF1. Today it is possible to implement asynchronous
transmissions quite easily, as the IT industry has moved in that direction,
providing developers with the APIs required to implement such protocols.
4.2 Retrieving data
The other side of the story, a little more complicated, is about getting data back. We
are going to cover this topic by considering the 2 possible scenarios here:
1. Retrieving a whole packet.
2. Retrieving a fragmented packet.
In both cases, the process always starts with the same set of operations: the user
has the token received when storing the data in the past and uses it to retrieve
that stream back via the retrieve primitive:
1 Microsoft's Windows Communication Foundation: a library providing a collection of highly
customizable and flexible network protocols.
FIGURE 4.3: Packet info format: [ξ(p) | Total n | ξ(p1) | … | ξ(pn)].

stream_t& retrieve(t_token t)
Retrieving a whole packet As soon as the user invokes the retrieve primitive,
through the proxy, a retrieve request RReq message is built and routed in the ring.
The token is the hash of the original DU, so, by following DHT retrieval, the
request is routed to the destination station. Once there, the station searches its
database to find the stored stream.
In order to have great performance in the packet search process, a dictionary can
be used inside every station. Since DUs are saved according to the format shown in
figure 4.1, the hash of the stream is always available and can be used for looking up
that specific packet when a RReq is routed to a station.
As soon as the stream is retrieved, it can be sent back to the request originator:
the end user, who will receive the DU in return from its retrieve call.
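The per-station dictionary suggested above could look like this minimal sketch; `StationStore` and its types are hypothetical names, not the thesis implementation:

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

// Per-station index keyed by the stream's hash, as suggested for the
// whole-packet retrieval path.
class StationStore {
public:
    void store(std::uint64_t hash, std::string stream) {
        by_hash_[hash] = std::move(stream);
    }
    // Lookup used when an RReq is routed to this station; empty if the
    // hash is not covered here.
    std::optional<std::string> retrieve(std::uint64_t hash) const {
        auto it = by_hash_.find(hash);
        if (it == by_hash_.end()) return std::nullopt;
        return it->second;
    }
private:
    std::unordered_map<std::uint64_t, std::string> by_hash_;
};
```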
Retrieving a fragmented packet If the packet was fragmented when it was
stored in the ring, a problem occurs. When the request is sent and reaches
the destination station based on the token (the original packet's hash), nothing is
found. The original packet has been fragmented and each fragment has a different
hash, completely unrelated to the token (since we use cryptographic hashing, there
is no way to get the original stream from the hash).
One way to solve this issue would be to store all the fragments' hashes in the token,
which would then become an array of hashes and grow in size. Although this
solution might work, it is not desirable: the user should still be able to locate all
fragments with just the original packet's hash. To do so, we need to modify the
store protocol.
After a packet p ∈ P has been fragmented into n units pk ∈ P, before
transmitting them, a packet info unit is constructed:
Definition 22 (Packet info DU). Given packet p ∈ P such that its size exceeds the frag-
menting threshold: |p| > c, a special data unit is built to track information about it and all
its fragments. The stream contains the following fields:
1. The original packet’s hash h = ξ(p).
2. The number of fragments n.
3. The hash of each single fragment pk (k = 1 . . . n) in order (from first to last).
As shown in figure 4.3.
In the revised store protocol, before each fragment is sent to be stored (i.e. before
store is called on each fragment), the same primitive is called on the packet info
DU, built right after computing all hashes (original packet and fragments). This
initial call routes the packet info to a station keyed by the original packet's hash.
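The ordering of the revised store protocol (packet info first, then the fragments) can be sketched as follows; `Du`, `store_fragmented` and the textual packet-info encoding are illustrative assumptions:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// A data unit as it is handed to the routed store primitive: a routing key
// (the hash) plus a payload.
struct Du { std::uint64_t key; std::string payload; };

// store_du stands in for the routed store primitive of section 4.1.
template <typename StoreFn>
std::uint64_t store_fragmented(std::uint64_t parent_hash,
                               const std::vector<Du>& fragments,
                               StoreFn store_du) {
    // 1. Build the packet info (definition 22) and store it first, keyed by
    //    the original packet's hash xi(p).
    std::string info = "n=" + std::to_string(fragments.size());
    for (const Du& f : fragments) info += ";" + std::to_string(f.key);
    store_du(Du{parent_hash, info});
    // 2. Store every fragment under its own hash xi(p_k).
    for (const Du& f : fragments) store_du(f);
    return parent_hash;  // the token returned to the user
}
```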
Thanks to this approach, when retrieving a DU, the first RReq will reach the
station where the system will find the packet info. Using that, the system will then
FIGURE 4.4: Sequence diagram showing the retrieval protocol in case of a fragmented packet: the first RReq/RRes round-trip returns the packet info; then, for each fragment pk, a retrieve(token pk) call fetches the fragment, and aggregate(pk) rebuilds packet p.
issue n retrieve calls in order to fetch each fragment. After getting all streams,
the original DU can be rebuilt; the order in which the fragments are combined is
given by the sequence number in each retrieved fragment packet.
As shown in figure 4.4, the process of retrieving and rebuilding a stored packet
might take some time: not only does the ring size influence this latency, but the
number of fragments also plays a significant role. It goes without saying that a
larger packet requires more time to be fully retrieved.
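The reassembly step can be sketched as follows, assuming each retrieved fragment carries its sequence number as in figure 4.1 (types are illustrative):

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Fragments may arrive in any order, so they are sorted by their sequence
// number before concatenation rebuilds the original stream.
std::string reassemble(std::vector<std::pair<int, std::string>> fragments) {
    // Pairs compare by first element (the sequence number) first.
    std::sort(fragments.begin(), fragments.end());
    std::string packet;
    for (const auto& f : fragments) packet += f.second;
    return packet;
}
```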
Chapter 5
Dynamic conditions
In the previous chapters we have described and analyzed the behaviour of the ring
under static conditions.
Definition 23 (Dynamic conditions). Let R = (Ω, r, ξ, φ, ψ) be a ring. We say the net-
work is under dynamic conditions when any of its characterizing elements changes:
1. Stations si ∈ Ω. Stations might disconnect or new stations might extend the ring.
This possibility also covers the event of stations faulting and becoming off-line.
2. Leaf set radius r changes.
3. Any of the connections in the overlay ring changes.
4. Hash function ξ or φ changes.
5. Routing strategy ψ changes.
Static conditions are the opposite of dynamic: the ring does not change and re-
mains the same. So, why do we need to talk about dynamic conditions? Why should
the ring change?
Ideally, if well designed, the system can be configured with a certain number
of stations, a certain radius and work optimally under static conditions. However,
today every system is exposed to dynamic conditions as many different planned or
unplanned events may occur:
1. One station enters a faulty state. This can happen for any reason, like a hardware
issue (e.g. hard disk failure, data corruption, etc.) or a software problem (e.g.
system failure, emergency system reboot, etc.).
2. Stations can experience network issues. This can cause either a permanent
offline state or a temporary one, if machines have a way to automatically recover
from these types of failures.
3. More stations are required because the system needs to serve a higher volume
of data (planned scale-up).
4. One or more stations need to undergo planned or unplanned maintenance.
5. Security related issues force some stations to be pulled away from the ring.
Those enumerated above are only a few possibilities. The point is that a storage
system must take into account such circumstances, which are part of the real world
of connected systems.
When dynamic conditions are in place, the ring structure and the balancing
algorithm described so far need to be revised in order to avoid performance
degradation and, in more critical cases, service outage. We are going to examine
the following dynamic cases:
• Scalability The ability of the ring to grow or shrink in a flexible way causing
the least possible performance degradation.
– Station join A station joins the ring causing it to expand.
– Station removal A station is pulled off the ring, causing it to shrink.
• Fault conditions One station experiences internal problems which cause it to
be unresponsive.
5.1 Scalability
What happens when a station joins the ring? When such an event occurs, there are a
few operations that need to be considered to re-initialize the ring:
1. The new station needs to build its leaf-set in order to identify its successors
and predecessors.
2. All nodes in the neighbourhood of the new station must re-arrange their leaf-
sets in order to update their successors or predecessors depending on the leaf-
set radius r.
3. Balancing hash function φ must be re-designed as now the ring has changed.
Since we have more stations, we have different hash segments and this impacts
function φ’s implementation.
The first 2 operations are infrastructural and can be addressed through well-known
protocols currently employed in DHT-based networks; since the problem is nothing
new, we are not going to spend more time on it. The 3rd point, though, is a
different story, as it poses a new situation inside our network architecture:
stations must be synchronized to use a new balancing hash function φ′.
Lemma 12 (Balancing hash function φ's outdatedness upon ring scaling). Let R =
(Ω, r, ξ, φ, ψ) be a ring with N = |Ω| stations. At any point in time, consider one
station joining R or being pulled out of it, causing the number of stations to become
N′ = N ± 1. Then hash function φ is no longer suited for balancing the ring.
Proof. Immediate by considering proposition 12. According to that, hash function φ
depends on the number of stations in the ring; if that number changes, the hash
segments affecting Fφ's codomain change too, hence the original hash function φ no
longer reflects the state of the network.
The main problem we want to face here is the process of synchronizing stations
in the ring, and it consists of:
1. Computing new balancing hash function φ′.
2. Updating all stations to use new hash function φ′.
3. Rearranging packets across stations to restore the balanced state of the ring.
The last point is actually crucial. We are going to assume that a station joining the
ring comes with no packets stored in it, since any other scenario does not make
sense. When the new station s∗ comes on-line in the ring, the load distribution
changes from:

Σ = (|s1|, |s2|, …, |sN|),  |si| ≈ m · n⁻¹, ∀i = 1 … N

to this form:

Σ′ = (|s1|, |s2|, …, |s∗| = 0, …, |sN|),  |si| ≈ m · n⁻¹, ∀si ∈ Ω \ {s∗}

This implies that the ring is no longer balanced, hence the last point of the
synchronization process introduced earlier, which looks more and more expensive
as we investigate the challenges introduced by these dynamic conditions.
If the operations required to synchronize the ring get too expensive (time-wise),
then the proposed algorithm has a serious scalability issue, as it makes the network
adapt poorly under dynamic conditions. Our purpose is, therefore, to understand
how expensive it actually is to scale the ring.
Given our analysis so far, we have been able to break the scalability issue down
into 2 sub-problems:
1. Updating hash function φ and aligning all stations in the ring to use it.
2. Re-arranging existing stored packets across stations in order to bring the load
distribution in the ring back to its balanced state.
We are going to look at these two problems separately and evaluate the final
performance impact later.
Conjecture 1 (Scaling overall impact). Let R = (Ω, r, ξ, φ, ψ) be a ring experiencing a
scaling process due to one station joining or leaving the network.
• Let τφ ∈ R measure the performance impact (latency) of the process of updating hash
function φ to φ′ on all stations in the ring.
• Let τψ ∈ R measure the performance impact of the process of rearranging packets
among stations in order to take the ring back to its balanced condition.
• Let τS ∈ R measure the overall latency experienced by the system while carrying out
the two operations above in order to scale the ring.
We expect the following relation to hold:

τS ≤ τφ + τψ (5.1)
Conjecture 1 expresses our expectation that the overall performance impact caused
by the two scaling operations need not equal the sum of the latencies introduced
by each of them, as the two operations can be carried out in parallel rather than
sequentially. We try to prove this throughout the rest of this chapter.
FIGURE 5.1: Message format: header [MT | hsrc | ξ(Data) | …] followed by up to c bytes of data.
5.1.1 Updating φ
Hash function φ is a global contract in the network.
Definition 24 (Global contract). A variable or, more generally, a piece of information
shared by all stations in the ring. The main assumption is that all stations keep an exact
copy of the same value.
The protocol we need to design for updating hash function φ to φ′ on all stations
is, more generally speaking, a protocol to update a global contract in the network.
Since a DHT is designed for distributed scenarios, any condition implying a certain
level of centrality lowers the system's performance, and this is the case here. The
PA is based on a distributed approach; however, the balancing process is carried
out through a global contract, hash function φ, which explains why we should
expect this process to be relatively expensive.
Broadcasting in DHT
In order to have a global contract updated, we basically need to broadcast a
message containing the updated contract on the network, because we need to reach
every single node. The message to be sent, in terms of the balancing system's API,
is the PUM (φ Update Message). The cost of updating φ is therefore equal to the
cost of broadcasting a message in the ring.
Since the broadcast occurs in the context of a network overlay, we need to create
a protocol specific for message broadcasting. Generally speaking, we can create a
message format which all transmissions between stations in the ring must comply
to. The message must contain, at least, the following information:
1. Message Type An enumeration indicating the type of communication (e.g.
RReq, RRes, etc.)
2. Source hash The hash of the source station (not strictly needed but nice to have
for performance reasons, as a station receiving a message knows its neighbours
and it is able to generate the hash of their IP addresses).
3. Body hash The hash of field Body. This is used for routing the message (des-
tination hash).
4. Body The content to transmit.
A possible implementation for the broadcasting protocol has to occur at station
level. Since the architecture is distributed, we cannot employ any centralized entity.
Algorithm 2 Message broadcasting in the ring
Require: Ring initialized
Require: Station si has ID hi = ξ(si)
Require: Station si has an associated leaf set Λ(si)
Require: Station si receives packet p ∈ P from station ssrc ∈ Λ(si)
Require: Global variable hp ∈ N is available
Require: Global variable d ∈ {−1, 0, 1} ⊂ N is available and initially set to 0
1: function BROADCAST(p ∈ P)
2:   hsrc ← ξ(ssrc)                    ▷ Actually computed or taken from message
3:   if hsrc < hi then
4:     d∗ ← −1                         ▷ Message from LLS
5:   else
6:     d∗ ← 1                          ▷ Message from ULS
7:   end if
8:   if ξ(p) = hp ∧ d + d∗ = 0 then    ▷ Same message from opposite side of ring
9:     return                          ▷ Abort. End condition reached
10:  else if ξ(p) = hp then            ▷ Duplicate message from same side of ring
11:    return                          ▷ Don't send again
12:  end if
13:  hp ← ξ(p)
14:  d ← d∗
15:  Λ ← ∅
16:  if hsrc < hi then
17:    Λ ← ΛU(si)                      ▷ Message from LLS ⟹ Send to ULS
18:  else
19:    Λ ← ΛL(si)                      ▷ Message from ULS ⟹ Send to LLS
20:  end if
21:  for s ∈ Λ do
22:    Send p to s
23:  end for
24: end function
Stations can recognize this type of communication by inspecting the content of
the message; a possible solution is using a flag field in the message or, better, a
special value in the destination address field1.
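The duplicate and end-condition checks of algorithm 2 can be sketched as per-station state; leaf-set transmission is omitted and all names are illustrative:

```cpp
#include <cstdint>

// Which side of the leaf set, if any, the message must be forwarded to.
enum class Forward { None, ToUpper, ToLower };

// Per-station state: the hash of the last broadcast seen (0 acts as "none
// seen yet") and the side d it arrived from, as in algorithm 2.
struct BroadcastState {
    std::uint64_t last_hash = 0;
    int d = 0;  // -1: last copy came from LLS, +1: from ULS, 0: none yet

    // d_star is -1 if this copy arrived from the LLS, +1 from the ULS.
    Forward decide(std::uint64_t msg_hash, int d_star) {
        if (msg_hash == last_hash && d + d_star == 0)
            return Forward::None;  // same message from opposite side: end condition
        if (msg_hash == last_hash)
            return Forward::None;  // duplicate from the same side: don't resend
        last_hash = msg_hash;
        d = d_star;
        // A message from the lower leaf set is pushed to the upper one and
        // vice versa, so the broadcast travels around the ring.
        return d_star == -1 ? Forward::ToUpper : Forward::ToLower;
    }
};
```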
Lemma 13 (Message broadcasting complexity). Let R = (Ω, r, ξ, φ, ψ) be a ring with
N = |Ω| stations. The cost of transmitting a message in broadcast, in the best-case
scenario, is:

Θ∗_B = N / (2r)

where Θ_B is expressed in number of message hops2.
Proof. Without loss of generality, we indicate with sA ∈ Ω the station initiating the
broadcast transmission in the ring. As soon as a station receives a broadcast message,
it consumes the content and then forwards it to the opposite side of its own leaf-set
in relation to which node it received the message from, as per algorithm 2. If sA
starts the protocol by sending the message to only one side of its own leaf-set, then
1 Usually protocols use the all-1 string to indicate a broadcast address.
2 A hop, in the context of message routing, is a single direct transmission from one node to another.
the maximum number of hops required to cover the whole ring is:

Θ_B = N / r

because one station forwards the message in one go to its r neighbours. However,
initiator sA can be smarter and send the message to all nodes in its own leaf-set
(both sides). This triggers a symmetric chain on both sides of the ring, leading to
the thesis: the best-case scenario is when all messages travel at the same speed
and the last transmission occurs at the very opposite side of the ring (under the
hypothesis that no delayed transmission occurs).
Finger tables A well-known routing-enhancing technique, often used in DHTs,
is the employment of finger tables. Briefly, it consists in arranging leaf-sets in the
ring so that the LLS is empty and the ULS contains the successors at relative
positions 2^0, 2^1, 2^2, … up to 2^l. This way of linking stations implies a higher
cost from a control point of view, because it takes more time to re-arrange those
links at initialization time and when dynamic conditions are in place (e.g. one
station joining or leaving the ring).
That being said, this pattern also ensures better performance from a routing
standpoint, hence guaranteeing an even better complexity than the one considered
in lemma 13 in message-broadcasting scenarios.
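As a sketch of the arrangement just described, assuming stations indexed 0 … N−1 around the ring (an illustrative simplification of hash-based positioning):

```cpp
#include <cstddef>
#include <vector>

// Finger table of station k in a ring of n_stations: links to the successors
// at ring distances 2^0, 2^1, 2^2, ... (the LLS stays empty). Indices wrap
// modulo the ring size.
std::vector<std::size_t> finger_table(std::size_t k, std::size_t n_stations) {
    std::vector<std::size_t> fingers;
    for (std::size_t step = 1; step < n_stations; step *= 2)
        fingers.push_back((k + step) % n_stations);
    return fingers;
}
```

With power-of-two strides, any station is reachable in O(log N) hops instead of O(N/r).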
As we can see, the cost of updating global contract φ is acceptable, and many
well-known approaches in the literature can be considered. Therefore, we will not
detail this issue any further.
5.1.2 Load re-arrangement
The part of the scaling cost we are most worried about is the re-arrangement of
packets. This operation is not required just from a balancing point of view; there
is a more critical aspect which needs to be addressed as soon as one station joins
the ring: packet retrieval.
Let us consider a scenario where station s∗ has joined the ring and hash function
φ has been updated. Say s∗'s predecessor is now station si. If no load
re-arrangement is performed, then RReq messages targeting a packet p whose hash
hp = ξ(p) is now covered by s∗ (hp ∈ Ξ(s∗)) will find nothing, as such packets are
actually still stored in si, since that station was covering hp before the ring scaled
up.
The question we want to answer is: "How badly is the retrieve primitive impacted
by the ring scaling up?". The example we just considered suggests that only a
portion of the ring is affected, so the packet re-arrangement should only occur
between 2 stations; however, this is something that needs to be proved.
Theorem 14 (Load re-arrangement upon scale-up by 1 station). Let R = (Ω, r, ξ, φ, ψ)
be a ring. Let s∗ be a station joining the ring, causing hash function φ to be updated on all
stations to φ′. Let us also consider station si now becoming s∗'s predecessor, so that the new
station's hash segment is Ξ(s∗) = [h∗, h_i^M], assuming h∗ = ξ(s∗). Then the packet
re-arrangement effort required to make all packets in the new network retrievable, and to
re-balance the ring, impacts all stations in the network.
Proof. Recalling how hash function φ works, as described in section 2.3.3, we need
to understand whether the joining of a station causes the φ segments in its domain
to change boundaries (see figure 2.4). We can visualize the impact on the domain
by considering the domain mapping diagram under dynamic conditions. For
simplicity and without loss of generality, we consider si = s1:
[Domain mapping diagram. In the ξ space [0, hM] (φ's codomain), new hash h∗ is inserted between h1 and h2, while the other hashes h1, h2, …, hN keep their positions. In φ's domain [0, 1], every segment boundary 1/(2N) + k/N (k = 0 … N−1) moves to 1/(2(N+1)) + k/(N+1) (k = 0 … N): the inner segments of amplitude 1/N shrink to amplitude 1/(N+1), and a new φ segment of amplitude 1/(N+1) appears for s∗.]
As it is possible to see, additional station s∗ causes only one change in the ξ space
(φ's codomain), but it causes all φ segments to resize in order to make room for an
additional interval of amplitude 1/(N+1).
As we can see, our initial assumption was unfortunately not quite right. The
whole domain of hash function φ is impacted and, possibly, all packets need to be
re-routed according to new hash function φ′. Nonetheless, we still lack basic
information telling us how bad the re-arrangement effort is, such as:
1. Are packets re-routed to new stations completely unrelated to the original one,
or is there a pattern?
2. Do all packets require re-routing, or does a percentage of them remain in their
current station when hash function φ transitions to φ′?
These two questions are crucial to evaluate the cost of the re-arrangement effort,
so we need more investigation.
Lemma 15 (Station transition direction upon packet re-arrangement). Under the
same hypothesis and conditions of theorem 14, any packet p ∈ P stored in any station
sj ∈ Ω of the ring, if moved because of the re-arrangement, is moved either:
• to one of sj's successors, if sj ≺ s∗;
• to one of sj's predecessors, if s∗ ≺ sj.
Proof. To show this, we consider the ring in its 2 different configurations (before
the scale-up and after). The diagram below shows hash function φ's domain in both
conditions (N stations at the bottom and N + 1 at the top), and also plots the
location of 2 hashes h and h′ therein.
56 Chapter 5. Dynamic conditions
[Diagram: hash function φ's domain with |Ω| = N stations (bottom) and |Ω| = N + 1 stations (top). Bottom boundaries 1/(2N) + (k−1)/N, 1/(2N) + k/N and 1/(2N) + (k+1)/N delimit segments of amplitude 1/N covered by . . . , sk−1, sk, sk+1, . . . Top boundaries 1/(2(N+1)) + (k−1)/(N+1) through 1/(2(N+1)) + (k+2)/(N+1) delimit segments of amplitude 1/(N+1) covered by . . . , sk−1, s∗, sk, sk+1, . . . Two hashes h′ and h″ are plotted in both spaces.]
As we can see, hash h′ falls initially in station sk−1's coverage, but after the transition it ends up in station s∗'s coverage. In the same way, hash h″ falls initially in station sk+1's coverage, but after the transition it ends up in station sk's coverage. The formulation of the lemma implies that packets can also remain in the same station: this is indeed possible, as the diagram shows regions of the top and bottom hash spaces which have values in common.
We now know that a minimal pattern is present when re-routing packets. However, the information provided by lemma 15 is limited. A more interesting result can be derived, but first we need to introduce a quantity:
Definition 25 (Packet's station transition delta). Under the hypothesis of dynamic conditions originating from the ring scaling up by one station, let si and sj be the original station and the new station (after re-arrangement) for any packet p; then quantity ∆(p) ∈ N represents the number of stations packet p had to be moved across:

∆(p) = i − j if |i − j| ≤ N/2, and ∆(p) = sign(i − j) · N − i + j otherwise.
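A literal transcription of definition 25 as code may help; this is only an illustrative sketch, assuming 1-based station indices, with `transition_delta` and `sign` as hypothetical names:

```cpp
#include <cstdlib>

// sign(x): -1, 0 or +1.
static int sign(int x) { return (x > 0) - (x < 0); }

// Transition delta of a packet moved from station i to station j on an
// N-station ring (definition 25): the plain index difference when the move
// spans at most half the ring, the wrap-around expression otherwise.
int transition_delta(int i, int j, int N) {
    if (2 * std::abs(i - j) <= N)
        return i - j;
    return sign(i - j) * N - i + j;
}
```

For instance, on a 10-station ring a move from s10 to s1 across the wrap yields a delta of 1, consistent with theorem 16 below.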
∆(p) provides information about whether a packet was moved or not from its
original station (∆(p) = 0), and also about the direction of the move (∆(p) < 0 or
∆(p) > 0). The value of ∆(p) for each packet is the main subject of the next important
result:
Theorem 16 (Packets station transition delta upon scale-up by 1 station). Under the
hypothesis and conditions of theorem 14, the transition delta ∆(p) of any packet p ∈ P stored
in any station si ∈ Ω of the ring is, at most, unitary in absolute value: |∆(p)| ≤ 1.
Proof. The theorem basically states that if a packet is moved, it is moved to one of the 2 directly contiguous stations. To prove this statement, we want to re-formulate the thesis using an equivalent definition. In conjunction with lemma 15, we need to prove that:
1. A packet hosted in a station preceding s∗ is re-routed, at most, to its immediate successor.
2. A packet hosted in a station preceded by s∗ is re-routed, at most, to its immediate predecessor.
We will initially prove the first point, and later the second, by considering it as a mirrored condition of the former.
Let us consider a linear bounded real space divided into N ∈ N equal parts. Every part is marked with an identifying number k = 1 . . . N. Then we let N grow to N + 1: every existing segment shrinks down in order to make space for segment N + 1, which is added as the last one. This scenario abstracts the condition where each station's φ coverage is shrunk to lower φ values due to s∗ joining the ring, in the specific case where all stations being considered are predecessors (down to s1) of s∗.
[Diagram: a point a ∈ [0, 1] plotted against the two partitions. With N parts the boundaries k/N, (k+1)/N, (k+2)/N delimit segments of amplitude 1/N; with N + 1 parts the boundaries k/(N+1) through (k+3)/(N+1) delimit segments of amplitude 1/(N+1).]
Without loss of generality, we consider a point a ∈ [0, 1] ⊂ R and re-express the thesis as follows: "Is it possible to find any combination of a, k and N such that, after the shift from N to N + 1, a falls into a segment further than its original segment's successor?". Formally, this question is stated as follows:

∃a ∈ [0, 1] ⊂ R, N ∈ N, N > 0, k ∈ N, k = 1 . . . N :  { a < (k+1)/N ;  a ≥ (k+2)/(N+1) }
If that system of inequalities has no solution, then the thesis is confirmed. By developing both inequalities we get the following:

{ a − (k+1)/N < 0 ;  a − (k+2)/(N+1) ≥ 0 }  =⇒  { aN − k − 1 < 0 ;  a(N + 1) − k − 2 ≥ 0 }  =⇒  { aN − k − 1 < 0 ;  aN + a − k − 2 ≥ 0 }
By isolating N, we get:

{ aN < k + 1 ;  aN ≥ k + 2 − a }  =⇒  { N < (k+1)/a ;  N ≥ (k+2−a)/a }  ∨  { −k − 1 < 0 ;  −k − 2 ≥ 0 }
The first system arises from dividing both members, in both inequalities, by a; the second accounts for the solutions the system might present in case a = 0. This last system is easily proved to be impossible:

{ k + 1 > 0 ;  k + 2 ≤ 0 }  =⇒  { k > −1 ;  k ≤ −2 }
Returning to the former system and considering, from now on, a ∈ (0, 1] ⊂ R, we can develop further and get:

(k + 2 − a)/a ≤ N < (k + 1)/a  =⇒  (k + 2 − a)/a < (k + 1)/a  =⇒  k + 2 − a < k + 1  =⇒  a > 1

The system has solutions only for a > 1; however, this contradicts our hypothesis that a ∈ (0, 1] ⊂ R, thus the system has no solutions within the definition boundaries of a, N and k.
We still need to prove the symmetric case of stations that are successors of s∗.
However it is possible to skip this by considering that such a scenario is the mirror
of the one just proved.
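The impossibility shown above can also be checked numerically: a small brute-force sweep over a grid of values of a ∈ [0, 1], N and k finds no combination satisfying both inequalities. This is only a sanity check of the proof, not part of it; the function name is illustrative:

```cpp
// Returns true if some a in [0, 1], N <= maxN, k in 1..N satisfies
// a < (k+1)/N together with a >= (k+2)/(N+1); the proof says it never does.
bool system_has_solution(int maxN, int steps) {
    for (int N = 1; N <= maxN; ++N)
        for (int k = 1; k <= N; ++k)
            for (int s = 0; s <= steps; ++s) {
                double a = static_cast<double>(s) / steps;  // samples [0, 1]
                if (a < (k + 1.0) / N && a >= (k + 2.0) / (N + 1))
                    return true;
            }
    return false;
}
```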
As a direct result, we have the following:
Corollary 16.1 (Packet lookup failure at re-arrangement time). Under the hypothesis and conditions of theorem 14, if packet p ∈ P is not found in station si ∈ Ω while the system is in the process of re-arranging packets, then it will be found in the previous or next station, depending on whether s∗ ≺ si or si ≺ s∗.
Lemma 15, theorem 16 and corollary 16.1 provide the answers to our initial questions. To draw our conclusions: the ring is not perfectly scalable, as all stations need to rearrange their packets under dynamic conditions; however, the effort is extremely localized within each station.
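The lookup behaviour implied by corollary 16.1 can be sketched as follows: during a re-arrangement, a packet missing from its expected station can only be in the immediate predecessor or successor, so a lookup only ever probes three stations. Names and data structures here are illustrative, not the thesis' actual API:

```cpp
#include <initializer_list>
#include <unordered_set>
#include <vector>

using Station = std::unordered_set<long>;  // hashes of the packets held

// Index of the station holding hash h: the expected station first, then its
// successor and predecessor (ring indices wrap modulo N). Returns -1 if the
// hash is nowhere among the three candidates.
int locate(const std::vector<Station>& ring, int expected, long h) {
    const int N = static_cast<int>(ring.size());
    for (int d : {0, 1, -1}) {
        int idx = ((expected + d) % N + N) % N;
        if (ring[idx].count(h) > 0)
            return idx;
    }
    return -1;
}
```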
5.1.3 Scaling overall impact
We now have more information with which to evaluate conjecture 1. Considering the characteristics of the operations of updating hash function φ across stations and redistributing packets, we now understand that they can be executed in parallel. As soon as the joining station computes φ′, it commences the protocol for broadcasting this knowledge in the ring. At the same time, the same station can start going through all its packets and evaluating the new hash function on them in order to re-route its DUs. This process can be started in every station the moment φ′ is available and traversing the ring.
That being said, the packet moving operations are more expensive than the operation of computing the new hash function or receiving it from other stations, so the time needed for re-routing DU loads in the network is far higher: τψ ≫ τφ. The overall scaling time is therefore essentially defined by τψ.
5.1.4 Ring scale-down
All the considerations made so far regarding the ring scaling up can be transferred to the opposite case, where a station leaves the network. A few considerations must be made, though, in relation to this dynamic condition:
• When a station leaves the network, the physical detachment from the other nodes is not performed until all packets are re-routed. This is crucial, and different from the scenario of a station joining the ring: here we cannot afford to lose a whole bucket of packets.
• A station leaving the network is not the same scenario as a station abandoning the ring. The former is a controlled process happening through a specific protocol and requires time; the latter is a sudden event and cannot be controlled. Its nature is described later in this chapter.
5.2 Fault conditions
As anything can happen, stations in the ring might enter faulty states. The reasons for such a scenario can be many, hardware or software related, and adequate countermeasures can be devised. Nonetheless, when it comes to disaster recovery, what matters is not so much all the possible cases we know, but rather everything we don't know. So, we will now consider the possibility of a station becoming unavailable without asking ourselves why. What we ask instead is: "How do we guarantee data retrieval services and the balancing in such conditions?".
FIGURE 5.2: Multiple hashing mechanism for achieving safe redundancy. The data stream S is hashed into hS, then recursively into h(1)S, h(2)S, . . . ; each hash is concatenated to the data stream, hence generating packets p1, p2, p3, . . . ready to be sent.
When a station goes down, the first issue is infrastructural. If the ring is set to have leaf-set radius r = 1, then we have a problem, as the ring basically breaks apart and messages cannot be routed across stations. If the radius is higher, r > 1, then no immediate consequences are experienced in terms of message routing. In both cases, DHT networks have existing protocols in the literature to fix dangling links and isolate the unavailable station; the only difference is that a unitary-radius ring will experience some downtime until links are fixed. This is one of the reasons why non-unitary-radius rings are more robust to disasters.
The second issue concerns data retrieval. A station went down unexpectedly, so there was no time to apply any scale-down protocol (in fact, the scenario here is not a station leaving the ring, but a station disappearing from it). The direct consequence is virtual data loss: all packets stored in that station are now unavailable, and when any RReq is sent to the ring targeting one of those DUs, the destination station will not find the packet hash in its database.
It is clear that, to solve this issue, something has to be done before the station goes down. However, we cannot make any assumption about this condition and its timing. So we need to change the data storage protocol to target situations where emergency packet retrieval is needed, as we cannot afford, for any reason, the possibility of data becoming unavailable to users.
In chapter 4 we described the API for storing a packet in the ring. Our intention is to modify the storage protocol (primitive store) in order to save one packet in multiple locations in the ring without losing balancing. The procedure applies to either packets or fragments; in general, we consider a certain stream of data to be sent for storage:
1. The data stream S to send is processed and its hash computed: hS = φ(S).
2. Another hash is computed, using previously computed hash hS as input: h(1)S = φ(hS).
3. The same recursive operation is repeated ℓ ∈ N times, computing a chain of hashes: h(k)S = φ(h(k−1)S).
4. ℓ different packets are generated by constructing a frame with the same body (the data stream) but a different associated hash, as per figure 4.1, and then sent to the ring.
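The key-derivation part of the steps above can be sketched as follows, with std::hash standing in for the ring's cryptographic hash φ and ℓ (`ell`) as the redundancy factor. This is a sketch of the chaining only, under those assumptions, not of the actual store primitive:

```cpp
#include <functional>
#include <string>
#include <vector>

// Storage keys for a data stream: h_S = phi(S), then h^(k) = phi(h^(k-1))
// for k = 1..ell, yielding ell + 1 keys (original packet plus ell clones).
std::vector<std::size_t> storage_keys(const std::string& stream, int ell) {
    std::hash<std::string> phi;            // stand-in for the ring's phi
    std::vector<std::size_t> keys;
    std::size_t h = phi(stream);           // hash of the (long) stream
    keys.push_back(h);
    for (int k = 1; k <= ell; ++k) {
        h = phi(std::to_string(h));        // hash of a hash: very cheap
        keys.push_back(h);
    }
    return keys;
}
```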
[Sequence diagram: User → Proxy → Entry station → Dst station 1 / Dst station 2. The first RReq, retrieve(φ(p)), reaches Dst station 1, whose lookup returns null; an RRes carrying an error travels back. The second RReq, retrieve(φ(φ(p))), reaches Dst station 2, whose lookup finds packet p; a successful RRes returns it.]
FIGURE 5.3: Packet retrieval session under the hypothesis of one station down. The diagram illustrates how a failed RReq triggers the emergency retrieval process.
The procedure just described will generate ℓ different copies of the same DU, and they will all be sent to different locations in the ring. Thanks to the Lamport scheme³, we can compute multiple hashes of the same initial stream and use them as storage keys.
Remark. Generating the first hash hS is potentially expensive because the input stream can be long (although bounded to a certain level, considering fragmentation threshold c). The same cannot be said for the other hashes h(k)S, because they are computed on another hash (a very short string). So the process of computing the redundant hashes is very cheap.
³ The process of generating the hash of a hash is used today in security-related scenarios to generate ephemeral keys. The scheme has been proved to be safe and, when a secure cryptographic hash function is used, irreversible.
How can this procedure help us when attempting to retrieve a DU stored in an unavailable station? We consider again the broken scenario from before, where station si suddenly becomes unavailable:
1. The system tries to retrieve packet p via its hash h = φ(p).
2. The RReq message reaches station si−1, as it now covers the hash segment si had when it was on-line. However, station si−1 cannot find hash h in its database, thus returns an error in the RRes.
3. The system acknowledges the first RReq was not successful, so it tries again to retrieve the packet by computing φ(h).
4. The second RReq now reaches another station sj, where the packet is found and returned.
Figure 5.3 illustrates the protocol just described.
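The retrieval side can be sketched symmetrically: on a failed RReq the key is re-hashed and the request retried, walking the same chain used at store time. `lookup` below is an illustrative stand-in for issuing an RReq for a given key, and std::hash again stands in for φ:

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>

// Emergency retrieval: try the primary key, then up to `ell` re-hashed
// fallback keys (h^(k+1) = phi(h^(k))), mirroring the redundant store.
std::optional<std::string> retrieve(
        const std::function<std::optional<std::string>(std::size_t)>& lookup,
        std::size_t h, int ell) {
    std::hash<std::string> phi;
    for (int k = 0; k <= ell; ++k) {
        if (auto pkt = lookup(h))      // RRes carrying the packet
            return pkt;
        h = phi(std::to_string(h));    // failed RReq: derive the next key
    }
    return std::nullopt;               // every copy is unavailable
}
```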
5.2.1 Collisions threshold
As we promote the idea of introducing packet redundancy in the network as a means to achieve good levels of disaster recovery, we should be careful to make this effort as efficient as possible, therefore avoiding unnecessary cost. Since we are routing the same packet in the ring with different hashes, we want to make sure the copies do not end up being routed to the same station. If we generated one copy of a packet and both were routed to the same station, our effort would be pointless: the moment that station goes down, our emergency retrieval procedure would fail. On the other hand, we don't want to generate too many copies of the same packet, as we would waste precious memory in our stations. How do we find a good balance? Let's start by considering collisions in the ring:
Lemma 17 (Packets collision probability). Let R = (Ω, r, ξ, φ, ψ) be a ring with |Ω| = N stations, and let p1 ∈ P and p2 ∈ P be two packets. Then the probability that they collide (are routed onto the same station) is:

γ = 1/N   (5.2)
Proof. A collision occurs when p1 and p2 are routed to the same station si ∈ Ω: ψ(p1) = ψ(p2) = si. The probability of this event can be defined as follows:

γ = Pr{ψ(p1) = ψ(p2) = si}, ∀p1, p2 ∈ P, ∀si ∈ Ω

We first consider packet p1 routed into the ring to some station sk ∈ Ω, and then consider packet p2 being processed: the probability of a collision with p1 is the probability of p2 being routed to sk, where sk can be any station of the ring. This calls for the Law of Total Probability:

γ = Σ_{k=1}^{N} Pr{ψ(p2) = sk | ψ(p1) = sk} · Pr{ψ(p1) = sk}   (5.3)

Since ψ is based on hash function φ, which is based on ξ, a cryptographic hash function, consecutive applications of ψ are stochastically independent, which means that:

Pr{ψ(p2) = sk | ψ(p1) = sk} = Pr{ψ(p2) = sk}, ∀p1, p2 ∈ P, ∀sk ∈ Ω
We can rewrite equation 5.3 as follows:

γ = Σ_{k=1}^{N} Pr{ψ(p2) = sk} · Pr{ψ(p1) = sk}

Recalling theorem 5 and the definition of πk ∈ [0, 1] ⊂ R as the packet-in-station-k probability, we can write our equation as follows:

γ = Σ_{k=1}^{N} πk · πk = Σ_{k=1}^{N} πk²
We are under the hypothesis of a balanced ring, since hash function φ is applied; so, according to equation 2.12, we have:

γ = Σ_{k=1}^{N} 1/N² = (1/N²) · Σ_{k=1}^{N} 1 = (1/N²) · N = 1/N

Which proves the thesis.
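Lemma 17 is easy to check numerically. The sketch below replaces the ψ pipeline with a uniform random station choice, which is what the balanced-ring hypothesis amounts to; the function name is illustrative:

```cpp
#include <random>

// Empirical collision probability of two packets over N stations: route both
// uniformly at random and count how often they land on the same station.
double estimate_gamma(int N, int trials, unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> station(0, N - 1);
    int collisions = 0;
    for (int t = 0; t < trials; ++t)
        if (station(gen) == station(gen))   // psi(p1) == psi(p2)
            ++collisions;
    return static_cast<double>(collisions) / trials;
}
```

With N = 10 the estimate settles around γ = 1/N = 0.1.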
The follow-up to lemma 17 is calculating the average number of collisions experienced in the ring when sending packets. Remember that sending copies p1 . . . pℓ of packet p does not create a correlation between the different instances being sent. This is due to the fact that we are sending different hashes hS, h(1)S, . . . , h(ℓ)S, related to each other by the Lamport chain, which actually guarantees that all the hashes are (stochastically) independent.
Lemma 18 (Average number of collisions). Let R = (Ω, r, ξ, φ, ψ) be a ring with |Ω| = N stations. Then, when generating m ∈ N packets, the average number of collisions experienced between different couples of units is:

ηγ = (m(m − 1)/2) · 1/N   (5.4)

where m(m − 1)/2 is the binomial coefficient counting the pairs of packets.
Proof. We introduce r.v. y ∈ N counting the number of collisions between couples of the m packets. This variable can range from 0 up to the number of possible combinations of two different packets: |C(m, 2)| = m(m − 1)/2. We also introduce the indicator r.v. χ ∈ {0, 1} ⊂ N defined as follows:

χ(p1, p2) = 1 if a collision occurs between the packets, 0 otherwise

Remembering that C(m, 2) enumerates all possible combinations of packets (order does not matter), we can define y as follows:

y = Σ_{(p1,p2)∈C(m,2)} χ(p1, p2)
R.v. y's mean value can then be calculated as:

ηγ = E[y] = E[ Σ_{(p1,p2)∈C(m,2)} χ(p1, p2) ]

Since operator E[·] is linear, we have that:

E[ Σ_{(p1,p2)∈C(m,2)} χ(p1, p2) ] = Σ_{(p1,p2)∈C(m,2)} E[χ(p1, p2)]
R.v. χ is discrete and distributed on two values only, so its mean can be easily calculated:

E[χ(p1, p2)] = 1 · Pr{χ = 1} + 0 · Pr{χ = 0} = Pr{χ = 1} = γ, ∀p1, p2 ∈ P

So, back to r.v. y's mean value:

ηγ = Σ_{(p1,p2)∈C(m,2)} E[χ(p1, p2)] = |C(m, 2)| · γ = (m(m − 1)/2) · 1/N

Proving the thesis.
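Equation 5.4 can be checked the same way: route m packets uniformly at random over N stations, count colliding pairs, and average over many runs; the result approaches m(m − 1)/2 · 1/N. A sketch under the same uniformity assumption as above:

```cpp
#include <random>
#include <vector>

// Average number of colliding pairs when m packets are routed uniformly at
// random to N stations, estimated over `runs` independent simulations.
double mean_collisions(int N, int m, int runs, unsigned seed = 7) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> station(0, N - 1);
    long long total = 0;
    for (int r = 0; r < runs; ++r) {
        std::vector<int> s(m);
        for (int& v : s) v = station(gen);
        for (int i = 0; i < m; ++i)          // enumerate C(m, 2) pairs
            for (int j = i + 1; j < m; ++j)
                if (s[i] == s[j]) ++total;
    }
    return static_cast<double>(total) / runs;
}
```

For N = 10 and m = 5 the expected value is C(5, 2)/10 = 1.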
Thanks to lemma 18, we can now try to calculate a reasonable value for ℓ and decide how many clones of a packet we should send into the network to ensure an effective level of redundancy.
Theorem 19 (Optimal ℓ). Let R = (Ω, r, ξ, φ, ψ) be a ring with |Ω| = N stations and let p ∈ P be a packet sent with redundancy factor ℓ ∈ N. Then, in order to guarantee that at least 50 % of sent packets do not collide, the optimal redundancy factor is:

ℓ < ℓopt = N
Proof. Let β ∈ [0, 1] ⊂ R be the fraction of collisions that we allow on the total number of packets ℓ + 1 (the original packet and its clones) sent to the network. So, the following must hold:

((ℓ + 1)ℓ/2) · 1/N < β(ℓ + 1)  =⇒  (ℓ + 1)ℓ/2 < βN(ℓ + 1)
=⇒  ((ℓ + 1)ℓ/2) · 1/(ℓ + 1) < βN
=⇒  ((ℓ + 1)!/(2!(ℓ − 1)!)) · 1/(ℓ + 1) < βN
=⇒  ((ℓ + 1) ℓ (ℓ − 1)!/(2(ℓ − 1)!)) · 1/(ℓ + 1) < βN
=⇒  ℓ/2 < βN  =⇒  ℓ < 2βN

Which proves the thesis by considering β = 1/2.
Chapter 6
Conclusions and final notes
Simulations have shown the effectiveness of the balancing performed by the algorithm; together with the use of known distributed architectures (DHT networks), the proposed balancing approach is feasible and potentially employable in real-case scenarios.
6.1 Open issues
The algorithm currently presents some challenges which must be addressed in order to make the architecture more flexible and less costly from a network performance standpoint (traffic and control overhead).
Scalability is the first priority. The analysis performed so far has provided good upper bounds on the cost of scaling up the ring by one station; however, more is to be investigated. More simulations should be run on scaling rings, and a differential analysis must be carried out to identify possible patterns which can be taken advantage of.
6.2 What’s next
As a continuation of the effort described in this document, the next action items to
focus on are:
1. Improving C/C++ simulations to target more advanced scenarios.
2. Performing more simulations on very large networks (up to 1000 stations and
more) and higher traffic volumes.
3. Developing simulations targeting traffic handling in the ring, in order to get
more information about the impact on network performance introduced by
the PA.
4. Enriching simulations with more features addressing differential analysis on
scaling rings.
The next iteration should focus on collecting more information regarding the performance of the algorithm, with special focus on high-variance conditions in the amplitudes of hash segments. Furthermore, it can be beneficial to evaluate migration flows in scaling scenarios.
Appendix A
C/C++ simulation engine’s
architecture
The C/C++ simulation engine has been developed with the following technologies:
• Intel’s Threading Building Blocks (TBB1) for parallel packet generation and
hash computation.
• GNU C/C++ compiler.
• Boost2 C++ libraries for big integers and other utilities.
• Tina’s Random Number Generator (TRNG3) library for randomizers.
• OpenSSL4 cryptographic library for hash computation.
• Circos5 library for circo-diagrams generation (migration flows).
Simulation steps Simulations can run sequentially or in parallel. When running in parallel, a Monte Carlo approach is used so that packet generation and hash computation can be performed much faster. When running a simulation, the following steps are performed:
1. Pre-compilation configuration Compilation variables are assigned. The engine is based on the STL6, and parameters such as the number of stations N and the number of generated packets m are defined as compile-time constants; thus they need to be set.
2. Compilation The simulation engine undergoes compilation in order to produce simulation executables.
3. Post-compilation configuration Simulation input files are prepared in order to
specify hash segments and other network descriptive variables.
4. Execution Simulations run.
5. Data extraction Output data is generated in order to get aggregated information and markup files to be used for generating circo-diagrams.
1 Intel's library for multi-threaded processing. https://www.threadingbuildingblocks.org/.
2 Boost libraries. http://www.boost.org/.
3 Random number generator library. https://www.numbercrunch.de/trng/.
4 Standard SSL implementation. https://www.openssl.org/.
5 Circos. http://circos.ca/.
6 The C++ Standard Template Library allows the use of generic types and compile-time constants.
Every simulation generates 3 files:
• A data file tracking hash segments per each station and all generated packets,
hashes and φ-hashes.
• A table file containing a matrix used by Circos to generate migration flows.
• A Karyotype file used by Circos to generate other diagrams (for the future).
Infrastructure All simulations mentioned in this document have been run against
a pool of Intel 4-core machines: HP ProLiant DL180 G6 (64 bit) on CentOS 6 (RHEL).
List of Figures
1.1 Overall system architecture. The end user interacts only with the storage system, while the balancing system is hidden from the user and transparent to the storage system with regards to accessing the server pool. (p. 7)
2.1 An N = 8 network example showing the logical ring topology. Each station is assigned an ID (typically the IP address hash) and packets are routed by content. (p. 10)
2.2 Access to the ring is guarded by proxies. (p. 11)
2.3 Hash-partitioning of a ring into different segments, one per station. For each segment, a different impulse is used; its coverage matches the segment's length. (p. 21)
2.4 Hash segments mapped onto φ segments, illustrating how hash function φ works. The top part of the diagram shows the φ hash-space, the bottom part the ξ hash-space. (p. 26)
3.1 Polar Hash Coverage Plot (PHCP) of a simulation on an N = 10 station ring after sending m = 10³ packets. Both plots show the configuration of the station hash segments together with the final load levels at the end of the simulation. The plot on the left refers to a normal ring (hash function ξ applied), the one on the right to an extended ring where hash function φ based on the same ξ is considered. The same packets were sent in both rings. (p. 32)
3.2 Load state |sk| (in blue) in each station sk as time grows. In this simulation, hash function ξ is used (normal ring). The green line shows the expected (uniform) load state for each point in time. (p. 33)
3.3 Load state |sk| (in blue) in each station sk as time grows. In this simulation set (same as in figure 3.1), hash function φ is used (extended ring). The green line shows the expected (uniform) load state for each point in time. (p. 34)
3.4 Standard deviation vs dispersion factor of generated ξ-hashes and φ-hashes during simulation batches (from left to right): N = 10 (40 simulations), N = 30 (60 simulations) and N = 50 (10 simulations). (p. 37)
3.5 Standard deviation of hash segment lengths and standard deviation of φ-hashes during each simulation in batches (from top to bottom): N = 10 (40 simulations), N = 30 (60 simulations) and N = 50 (10 simulations). (p. 38)
3.6 Station loads ηk(ξ) (no balancing) and ηk(φ) (balanced ring) at the end of four N30 simulations with different seeds. (p. 39)
3.7 Migration flows in an N30 ring. (p. 40)
4.1 Data unit format. (p. 45)
4.2 Synchronous vs. asynchronous communication model when storing a single packet. (p. 46)
4.3 Packet info format. (p. 47)
4.4 Sequence diagram showing the retrieval protocol in case of a fragmented packet. (p. 48)
5.1 Message format. (p. 52)
5.2 Multiple hashing mechanism for achieving safe redundancy. Hashes are computed and then concatenated to the data stream, hence generating packets ready to be sent. (p. 59)
5.3 Packet retrieval session under the hypothesis of one station down. The diagram illustrates how a failed RReq triggers the emergency retrieval process. (p. 61)
List of Tables
2.1 Showing, in the example, values of hash identifiers and hash segments for each station. (p. 27)
Acknowledgements
Thanks to my supervisor: Prof. Eng. O. Tomarchio, for having enough patience and
waiting a few more years for me to finish this research while working in Denmark.
Thanks to Medilink srl, my host company during my master's traineeship, in which this research effort was started and its first step completed. They provided everything I needed (resources, infrastructure) to complete my work.
Thanks to my Team Lead at Microsoft: Horina, for her flexibility and availability,
allowing me to submit this work in time.
Graphics and artwork Icons and graphics in figures created by Katemangostar -
Freepik.com.
Last but not least, thanks to all the amazing public libraries in Copenhagen which
have hosted me and my work during many weekends spent on this thesis.
Why TechBuilder is the Future of Pickup and Delivery App Development (1).pdf
How Creative Agencies Leverage Project Management Software.pdf
Upgrade and Innovation Strategies for SAP ERP Customers
Agentic AI : A Practical Guide. Undersating, Implementing and Scaling Autono...

Master Thesis - A Distributed Algorithm for Stateless Load Balancing

  • 1. UNIVERSITY OF CATANIA MASTER’S THESIS A Distributed Algorithm for Stateless Load Balancing Author: Andrea TINO Supervisor: Prof. Eng. Orazio TOMARCHIO Assistant Supervisor: Eng. Antonino BLANCATO A thesis submitted in fulfillment of the requirements for the degree of Master of Engineering in the Faculty of Computer Science Engineering Department of Electrical, Electronic and Computer Science Engineering July 21, 2017
  • 3. Declaration of Authorship I, Andrea TINO, declare that this thesis titled, “A Distributed Algorithm for Stateless Load Balancing”, and the work presented in it are my own. I confirm that: • This work was done wholly or mainly while in candidature for a research degree at this University. • Where any part of this thesis has previously been submitted for a degree or any other qualification at this University or any other institution, this has been clearly stated. • Where I have consulted the published work of others, this is always clearly attributed. • Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this thesis is entirely my own work. • I have acknowledged all main sources of help. • Where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself. Signed: Date:
  • 5. “I like thinking that this work of mine kind of reflects my international personality. The idea of this algorithm has crossed my mind while I was working in Japan (summer 2012) on non-deterministic mathematical models to describe fast similarity search algorithms. It’s incredible how some ideas come to life so spontaneously! Then, I started working and developing the foundations of the algorithm in Italy. After I started as an employee in Microsoft, I kept on working on this project in Denmark. Even on vacation, I found time to work on this thesis while roaming in several areas of South Korea. It is also worth mentioning that I worked on some chapters while I was in The Netherlands. When I think about this, I feel happy!” Andrea Tino
  • 7. University of Catania Abstract Faculty of Computer Science Engineering Department of Electrical, Electronic and Computer Science Engineering Master of Engineering A Distributed Algorithm for Stateless Load Balancing by Andrea TINO The algorithm that is the object of this thesis deals with the problem of balancing data units across different stations, in the context of storing large amounts of information in data stores or data centres. The approaches in use today are mainly based on employing a central balancing node, which often requires information from the different stations about their load state. The algorithm proposed here follows the opposite strategy, whereby data is balanced without the use of any centralized balancing unit, thus fulfilling the distributed property, and without gathering any information from stations about their current load state, hence the stateless property. This document goes through the details of the algorithm by describing the idea and the mathematical principles behind it. By means of an analytical proof, the balancing equation will be devised and introduced. Later on, tests and simulations, carried out in different environments and with different technologies, will illustrate the effectiveness of the approach. Results will be introduced and discussed in the second part of this document, together with final notes about the current state of the art, challenges and deployment considerations in real scenarios. (IT, translated) The algorithm that is the object of this thesis addresses the problem of balancing data units within a pool of different stations, in the context of the need to persist large amounts of information in server farms or data centres. The strategies currently in use are mainly based on the employment of a central balancing component which often needs some information from the network nodes about their current load state. The algorithm proposed here takes a diametrically opposite approach, in which data balancing is performed without the use of any centralized component, hence the distributed property, and without the need to obtain any data from the stations about their load state, hence the stateless property. In this document, we examine the details of the algorithm through a description of the underlying idea and of the mathematical principles at its foundation. By means of an analytical proof, the balancing equation will be derived and analysed. We will then examine the tests and simulations, both conducted with different technologies, supporting the effectiveness of the algorithm. The results will be examined and discussed in the second part of this document, together with final notes on the current state of the technology in the field of data balancing. The open problems and the usage scenarios of the algorithm will also be examined.
  • 9. Contents
  Declaration of Authorship
  Abstract
  1 Introduction
    1.1 About Balancing
      Objective
    1.2 Describing the scenario
    1.3 Characterization of balancing algorithms
      1.3.1 Randomness
      1.3.2 State
      1.3.3 Static vs. dynamic
      1.3.4 Centralization
      1.3.5 DU retrieval
    1.4 Well known balancing algorithms
      1.4.1 Round Robin
      1.4.2 Weighted Round Robin
      1.4.3 Random
      1.4.4 Source Address Hash
      1.4.5 Least Load
      1.4.6 Graph based algorithms
        Nearest Neighbour
        RAND
        Never Queue
        THRESHOLD
    1.5 System overview
  2 The algorithm
    2.1 Network organization
      Station ordering
      Ring access
      Routing
    2.2 Unbalanced ring
    2.3 Balancing the ring
      2.3.1 Extending the ring
        Adapting concepts in extended ring
        Defining sizing equations
      2.3.2 Designing hash function φ
        Designing r.v. s_φ’s PDF
        Designing r.v. s_φ
      2.3.3 Understanding how φ works
    2.4 Ring balancing example
      2.4.1 Defining the ring
  • 10.
      2.4.2 Defining the formatting impulse
      2.4.3 Binding impulses to stations
      2.4.4 Calculating amplitudes
      2.4.5 Computing functions
  3 Simulation results
    3.1 Small-size simulations
      3.1.1 Verifying load balance
      3.1.2 Evaluating load levels per station
    3.2 Large-size simulations
      3.2.1 Overview
      3.2.2 Evaluating the variance of hash segment amplitudes
      3.2.3 Evaluating load levels per station
        Migration flows
  4 System API
    4.1 Storing data
      4.1.1 Packet fragmentation
      4.1.2 Routing
    4.2 Retrieving data
  5 Dynamic conditions
    5.1 Scalability
      5.1.1 Updating φ
        Broadcasting in DHT
      5.1.2 Load re-arrangement
      5.1.3 Scaling overall impact
      5.1.4 Ring scale-down
    5.2 Fault conditions
      5.2.1 Collisions threshold
  6 Conclusions and final notes
    6.1 Open issues
    6.2 What’s next
  A C/C++ simulation engine’s architecture
  Acknowledgements
  Bibliography
  • 11. List of Abbreviations
  BS: Balancing System
  DSLB: Distributed Stateless Load Balancing
  PA: Proposed Algorithm
  SS: Storage System
  BA: Balancing Algorithm
  BP: Balancing Pool
  DLB: Data Load Balancing
  DLBA: Data Load Balancing Algorithm
  DU: Data Unit
  LD: Load Distribution
  SL: Station Load
  CDF: Cumulative Distribution Function
  PDF: Probability Density Function
  DHT: Distributed Hash Table
  HS: Hash Segment
  ID: IDentifier
  LS: Leaf Set
  LLS: Lower Leaf Set
  P2P: Peer To Peer
  ULS: Upper Leaf Set
  API: Application Program Interface
  r.v.: random variable
  • 13. List of Symbols
  N: Number of stations
  Ω: Balancing pool (set)
  P: Data Units (packets) (set)
  s_i: Station
  p: Data Unit (packet)
  ψ: Packet/station assignment application
  Σ: Load distribution
  l: Hash length (number of bits)
  ξ: Hash function
  h: Hash string (number)
  h_ξ: Regular hash string (number)
  h_φ: φ-hash string (number)
  η: Station packet load (number)
  η^(ξ): Station packet load (via ξ) (number)
  η^(φ): Station packet load (via φ) (number)
  π_i: Packet-in-station probability
  π_i^(ξ): Packet-in-station probability (via ξ)
  π_i^(φ): Packet-in-station probability (via φ)
  f: PDF (function)
  F: CDF (function)
  F^(-1): Inverse CDF (function)
  g: Formatting impulse (function)
  G: Formatting impulse antiderivative (function)
  Λ: Leaf set
  Λ_U: Upper leaf set
  Λ_L: Lower leaf set
  • 15. dedicated to my Mother and my Father
  • 17. Chapter 1 Introduction 1.1 About Balancing Under the umbrella term balancing it is possible to refer to different problems and solutions: balancing of connections, of workloads, of tasks or of data. What distinguishes one type of balancing from another is the entity being balanced. Definition 1 (Balancing). In Computing and Computer Science, balancing indicates the problem of distributing an indefinitely high number of entities across multiple subjects (stations). The assignment is performed so as to guarantee that, at any given time, all stations hold (roughly) the same amount of entities. From which follows: Definition 2 (Balancing Algorithm). An algorithm, or a system based on a certain algorithm, designed to solve the balancing problem. In the context of this research work, we focus on a specific type of balancing: Definition 3 (Data Load Balancing). A type of balancing focusing on units of data, often referred to as packets or, simply, data units. The solutions to the latter are referred to as: Definition 4 (Data Load Balancing Algorithm). A BA targeting DLB and solving it by minimizing a certain objective function. This research effort focuses on DLB and DLBAs in order to introduce a new algorithm targeting multiple performance metrics. Objective The objective of this thesis is to employ this algorithm in a cloud storage system designed to serve different applications. The system must be capable of: 1. Accepting data as input, to be stored in a pool of servers (stations). 2. Retrieving stored data on demand. 3. Removing stored data on demand. These essential functionalities must be enabled by the BA and its design.
  • 18. 1.2 Describing the scenario Let us describe DLB and its most important aspects in formal terms. The network A certain number N of stations is always considered; together they form a balancing pool, a set which we will indicate as Ω. Each station si ∈ Ω (with i = 1 . . . N) is connected to the others by a generic protocol; we do not consider any specific communication technology, the only required assumption being that the protocol employs direct addressing of each station (one unique address per station). Station assignment At any given time, a DU (or packet) p ∈ P (P being the set of all DUs) must be stored in the BP. The system/algorithm responsible for carrying out this activity is expressed by the application ψ : P → Ω, which assigns a DU to a station. The way ψ works is essentially the core of the balancing system. The application chooses a station with the objective of guaranteeing that, at any given time, all stations hold roughly the same number of packets. Remark. Application ψ typically receives a DU as input: ψ(p); however, it may accept more arguments, ψ(p, ·), depending on the strategy it uses to perform the balancing. Loads At any given time, each station si in the BP has a certain number of DUs assigned to it; we indicate this quantity, the station load, with the operator |si|, thus: |si| = |{p ∈ P : ψ(p) = si}| ∈ N (1.1) This quantity can also be expressed as the number of bytes (or any of its multiples) of the total packets stored in a station: |si| = Σ_{p ∈ P : ψ(p) = si} |p| ∈ R (1.2) where |p| indicates the length of a packet (typically in bytes). Unless otherwise specified, we will refer to the former definition. To have an overview of the balancing state, another quantity is introduced: the load distribution, indicated with the symbol Σ = (|s1|, |s2|, . . . , |sN|), representing the ordered vector of station loads at any given time.
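The load quantities just defined can be sketched in a few lines of code. This is an illustrative sketch only: the names `station_load`, `station_load_bytes` and `load_distribution`, and the dictionary-based representation of ψ, are assumptions made for the example, not part of the thesis.

```python
# Sketch of the load quantities |s_i| and Sigma (illustrative names).
# psi maps each packet ID to a station index; sizes maps packet ID to |p| in bytes.

def station_load(assignments, i):
    """|s_i| as a packet count: number of DUs assigned to station i (eq. 1.1)."""
    return sum(1 for s in assignments.values() if s == i)

def station_load_bytes(assignments, sizes, i):
    """|s_i| as total bytes: sum of |p| over packets assigned to station i (eq. 1.2)."""
    return sum(sizes[p] for p, s in assignments.items() if s == i)

def load_distribution(assignments, n):
    """Sigma = (|s_1|, ..., |s_N|), the ordered vector of station loads."""
    return tuple(station_load(assignments, i) for i in range(n))

# Example: 3 stations, 5 packets already assigned by some psi.
psi = {0: 0, 1: 1, 2: 2, 3: 0, 4: 1}
sizes = {0: 100, 1: 250, 2: 80, 3: 40, 4: 10}
print(load_distribution(psi, 3))          # (2, 2, 1)
print(station_load_bytes(psi, sizes, 0))  # 140
```

A perfectly balanced state is one where the components of the returned tuple are (roughly) equal.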
Time The algorithm's runtime does not require a continuous description. Time will therefore be considered discrete, t ∈ N, and characterized by events: an event being, for instance, the arrival of a new DU to route in the network. 1.3 Characterization of balancing algorithms DLBAs can be differentiated from several points of view. Considering this classification is important in order to locate the proposed algorithm within the taxonomy of today's most used systems. 1.3.1 Randomness Algorithms can employ non-deterministic components, like pseudo-random number generators, in order to pick the station to associate with a DU. This approach is
  • 19. not a bad one because, provided the generator is characterized by a uniform PDF, it guarantees a fairly good level of balancing at low cost. Proposition 1 (Determinism). The algorithm being proposed is fully deterministic. 1.3.2 State In order to perform balancing, algorithms may require stations to keep information regarding their load state (e.g. disk usage or residual available space). Statefulness implies that stations communicate the state information to other stations or to special nodes in the network; by employing this knowledge, the BA can perform a more precise job. The downside is mostly related to communication overhead, as state information must be regularly exchanged. Proposition 2 (Statefulness). The algorithm being proposed is stateless: it does not require any information to be sent by stations in order to perform balancing. 1.3.3 Static vs. dynamic The balancing can happen at two possible points in time: • At runtime The association to a station is performed while the DU is being transmitted to stations. In these conditions, the same DU might be routed to different stations depending on the contingent situation. It is also possible to have DUs re-routed. This behaviour makes the algorithm dynamic. As a rule of thumb, a BA is dynamic when it is not always possible to know in advance where a DU will be routed until the algorithm is actually run. • Before runtime The destination station is known before the algorithm is run. This makes the algorithm static. As a rule of thumb, a BA is static when the same DU is always routed to the same station. Proposition 3 (Staticity). The algorithm being proposed is static. 1.3.4 Centralization This property determines whether the algorithm requires a central node in the network to perform the balancing. A centralized BA requires a unit, called the balancer, which takes care of routing the DU to its destination station.
Conversely, a distributed BA does not need this extra component. Centralized BAs are easier to implement, but they have two major downsides: • All traffic must pass through the balancer, which acts like a hub node. • The balancer represents a single point of failure in the network. If it goes down, the whole network is compromised. Safety mechanisms can be employed to avoid network downtime by limiting the outage to the balancing feature only: if the balancer fails, traffic is still routed to stations but no balancing occurs. This property also impacts the topology of the network. Typically, centralized algorithms employ a star topology where the balancer is the central node. Proposition 4 (Non-centralization). The algorithm being proposed is distributed.
  • 20. 1.3.5 DU retrieval A very important characteristic of a BA is the way a DU can be retrieved once it has been stored in a station. This does not strictly relate to the BA itself, as DU retrieval is more an aspect of the storage algorithm (which employs the BA); however, the two systems are connected and will be treated as one. An essential part of the data retrieval story is DU and station identification. Since a DU is assigned to a station, the association performed by si = ψ(p) must be identifiable. We have already introduced station identifiers, so we need to do the same for DUs and introduce, for a packet p, its identifier, indicated as p̂ ∈ N (the couple (p, p̂) is unique). A key concept to understand is that the station association application ψ works both with packets, ψ : P → Ω, and with packet IDs, ψ : N → Ω. • If the algorithm is dynamic, then ψ does not always return the same station when invoked. In such a case, the association (p̂, si) must be saved somewhere. This condition requires the algorithm to employ a database functioning as a lookup table, which of course takes resources and impacts memory. • If the BA is static, then ψ always returns the same station for the same ID, and it will always be possible to retrieve the station where a packet p was routed by simply calculating ψ(p̂). It means that it is not necessary to store the coordinates of a packet in order to retrieve it. When a DU is sent for storage, its ID is used by the owner as a key to retrieve it at a later time. 1.4 Well known balancing algorithms The proposed algorithm competes with other algorithms available in the market today and commonly used in many different application domains. We are going to describe some of these in order to compare, later, how the proposed approach ranks among them. 1.4.1 Round Robin This class of algorithms keeps a counter c = 1 . . .
N, which points to the destination station sc where the current DU will be routed. The counter is incremented at every new incoming packet: c_{t+1} = (c_t + 1) % N. These BAs guarantee very precise balancing, as fairness in packet association is their strongest point. Such algorithms are deterministic, stateless, static and require a packet lookup table. They are typically centralized, with the balancer keeping track of the counter. However, this condition is not a limitation, as it is still possible to use Round Robin in a distributed way, though such an approach is not common in the market today.
  • 21. 1.4.2 Weighted Round Robin Like Round Robin, these algorithms guarantee fairness by looping through stations. However, the counter is used in a different way: every station si is assigned not a single number but a range of contiguous numbers c_i . . . C_i; the counter ranges from 1 to C = max_i {C_i} and is incremented according to the rule c_{t+1} = (c_t + 1) % C. The wider the interval C_i − c_i for a station si, the more packets that station will receive. Although this seems to break the balancing principle, this approach makes it possible to account for stations that do not have the same storage capabilities. A possible application is assigning more DUs to stations with a larger storage capacity. These algorithms are typically centralized, they are deterministic and still require a packet lookup table. Given their nature, they can be either static/stateless or dynamic/stateful; the latter implementation applies when the characteristics of stations can change over time, the state typically relating to storage capabilities. 1.4.3 Random Random algorithms use a random number generator, employing a discrete random variable distributed over the range 1 . . . N, to choose the destination station si. Such algorithms are non-deterministic, dynamic, stateless and always require a packet lookup table. They can be either centralized or distributed, though the former are the most common in the market. One important aspect of these BAs is the requirement on the probability distribution of the random variable employed by the number generator: the distribution must be uniform in order to achieve proper balancing. 1.4.4 Source Address Hash Commonly used in TCP/IP applications, this class of centralized algorithms employs a balancing unit which assigns a hash range h_i . . . H_i to each station si. Hashes are evaluated numerically, so ranges are basically contiguous sequences of integer numbers, h_i, H_i ∈ N.
When a packet arrives, the balancer computes a hash h ∈ N on the source address (the hashing function can either be one of the well-known cryptographic ones or some ad-hoc implementation) and routes the DU to the station whose hash range includes the calculated hash. Source Address Hash algorithms guarantee that packets coming from the same sources are routed to the same stations. These algorithms are deterministic, static, stateless and require a lookup table. The balancing relies on the hash function: a cryptographic hash is necessary to guarantee that packets from sources whose addresses differ by only a few bits do not all end up in the same station. Given their implementation, such algorithms can offer fairly good balancing. 1.4.5 Least Load These algorithms usually rely on a centralized balancer; however, it is still possible to perform the balancing in a distributed way (though not common in the market for this implementation). When a DU arrives, it is routed to the station with the lowest current load. This means that the balancer needs to know the load of each station, which is the reason why these algorithms are stateful and typically introduce considerable overhead in the network. Least Load algorithms are deterministic, dynamic and require a lookup table.
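The classic strategies described in this section (Round Robin, Weighted Round Robin, Random, Source Address Hash) can be sketched as follows. All function names are illustrative, and the modulo mapping used for the hash ranges is a simplification of the contiguous-range scheme described above.

```python
import hashlib
import random

N = 4  # number of stations in the pool (example value)

def round_robin():
    """Round Robin: c_{t+1} = (c_t + 1) % N, one station per incoming packet."""
    c = 0
    while True:
        yield c
        c = (c + 1) % N

def weighted_round_robin(weights):
    """Weighted Round Robin: station i owns a contiguous counter range whose
    width is proportional to weights[i] (e.g. its storage capacity)."""
    while True:
        for i, w in enumerate(weights):
            for _ in range(w):
                yield i

def random_station(rng):
    """Random: pick a station uniformly; balance relies on a uniform PDF."""
    return rng.randrange(N)

def source_address_hash(addr):
    """Source Address Hash: hash the source address and map it onto the N
    contiguous hash ranges (here simplified to a modulo over stations)."""
    h = int.from_bytes(hashlib.sha256(addr.encode()).digest()[:8], "big")
    return h % N

rr = round_robin()
print([next(rr) for _ in range(6)])   # [0, 1, 2, 3, 0, 1]
wrr = weighted_round_robin([2, 1, 1, 1])
print([next(wrr) for _ in range(5)])  # [0, 0, 1, 2, 3]
# Same source always maps to the same station (static, stateless):
assert source_address_hash("10.0.0.1") == source_address_hash("10.0.0.1")
```

Note how Round Robin and Source Address Hash are deterministic, while `random_station` requires an external generator, matching the classification given above.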
  • 22. 1.4.6 Graph based algorithms There is a class of (typically) stateful, distributed and dynamic algorithms which calculate the destination station for a DU based on a graph search performed on the network. These algorithms are usually highly scalable; also, no assumption is made on the topology of the network (free topology). Nearest Neighbour A random node si is picked in the network in order to initiate the transmission of a DU. The balancing is performed only in the context of the subnet represented by the chosen node and its direct neighbours sj (with j ≠ i). The algorithm picks the station which minimizes a specific metric, usually the load state of each node. RAND This algorithm is non-deterministic and randomly selects the node (station) to which a packet p is routed. A threshold L ∈ R is considered for a certain metric (usually the load state); if the packet exceeds the threshold, |p| > L, then the DU is re-routed to another randomly selected station; otherwise it is stored in the current one. The most prominent characteristic of this algorithm is its statelessness. Since RAND does not require any information from stations, the implementation is very easy, minimal overhead is generated (only due to threshold-exceeding packets which need re-transmission) and the balancing is fairly good. CYCLIC This algorithm is a variant of RAND in which minimal state information is kept: the last station a packet was re-transmitted to is always remembered by the system; this guarantees that the same station is not picked twice in a row in case of consecutive threshold-exceeding occurrences. Never Queue There are algorithms which use state information exchanged among stations in order to evaluate data not strictly relating to nodes. Typically the state is represented by station-specific quantities, like the current load or the residual amount of storage memory left.
Never Queue employs a different kind of state across the network, in order to be able to evaluate, the moment a DU arrives and needs to be stored, the station to which transmitting the packet implies the least cost. Thus, the algorithm always transmits the packet to the fastest available node. This balancing strategy (which requires a lookup table) is poor, as it does not guarantee a good quality of the overall balancing across stations. What it does guarantee, and this is the reason why we mention this approach, is good performance in processing packets in terms of throughput and latency. However, this has a cost in overhead due to the amount of information exchanged by nodes to refresh the network state. THRESHOLD This class of highly dynamic algorithms uses network state information, derived from message exchange among stations, to decide where a packet will be stored. An incoming packet p is initially routed to a random node (this makes the algorithm non-deterministic); that station then compares the packet size with a load threshold
L ∈ R and decides what to do. If the threshold is not exceeded, |p| < L, then the DU is stored in the current station; otherwise another station is picked via a polling mechanism. A maximum number of attempts M ∈ N is considered, after which a packet stays in the current station even though the threshold is exceeded. Even though this approach looks a lot like RAND, it differs in the way a station is selected: only the initial node selection is random; in case the threshold is exceeded, the next station is picked through a process based on analysis of the network state. LEAST This algorithm works like THRESHOLD but is limited to one single iteration. When a packet arrives at a randomly selected node and the threshold is exceeded, the algorithm polls a certain pool of stations to pick the next one in its first attempt to route the packet. The station is usually picked based on its current load (least loaded node). After that, no further attempt is performed. Think of LEAST as THRESHOLD with M = 1.
  • 23. 1.5 System overview Before detailing how the PA works, it is important to understand the architecture of the storage system being designed as part of the research in this thesis. FIGURE 1.1: Overall system architecture. The end user interacts only with the storage system, while the balancing system is hidden from the user and transparent to the storage system with regard to accessing the server pool. From a point of view based on the API that the overall system exposes, we recognize two major components: • Storage system The component interacting with the end user and providing the API for storing and retrieving data. • Balancing system The component responsible for arranging the storage pool and balancing data across its servers.
As pointed out by Figure 1.1, the architecture separates concerns by defining two sets of APIs: one exposed to the user, for submitting and retrieving data, and another one, hidden from the user, which is responsible for balancing DUs in the storage pool.
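The two-component split of Figure 1.1 can be sketched as a minimal facade. Class and method names here are illustrative, and the modulo rule inside `BalancingSystem` is only a placeholder for a static, stateless assignment; it is not the thesis' algorithm.

```python
# Minimal sketch of the two-component architecture of Figure 1.1
# (class and method names are illustrative, not the thesis API).

class BalancingSystem:
    """Hidden component: maps a packet ID to a station in the pool."""
    def __init__(self, n_stations):
        self.n = n_stations

    def station_for(self, packet_id):
        # Placeholder static, stateless rule; the thesis replaces this
        # with the phi-based hash assignment developed in chapter 2.
        return packet_id % self.n

class StorageSystem:
    """User-facing component: store / retrieve / remove, as per the objective."""
    def __init__(self, balancer):
        self.balancer = balancer
        self.pool = [dict() for _ in range(balancer.n)]  # one dict per station

    def store(self, packet_id, data):
        self.pool[self.balancer.station_for(packet_id)][packet_id] = data

    def retrieve(self, packet_id):
        return self.pool[self.balancer.station_for(packet_id)].get(packet_id)

    def remove(self, packet_id):
        self.pool[self.balancer.station_for(packet_id)].pop(packet_id, None)

storage = StorageSystem(BalancingSystem(3))
storage.store(7, b"payload")
print(storage.retrieve(7))  # b'payload'
```

Because the assignment is static and keyed only on the packet ID, retrieval needs no lookup table: recomputing `station_for(packet_id)` always locates the right station.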
Chapter 2

The algorithm

In chapter 1, we have anticipated the most important properties that the PA shows. What makes this algorithm innovative is the fact that it is at the same time distributed, static, stateless and does not require a lookup table. In this chapter we are going to describe the mathematics behind the algorithm which makes all this possible.

2.1 Network organization

The PA does not allow stations to be arranged in a free connection scheme. A strong assumption is made on the topology of the network according to the DHT¹ scheme. It is important to make a few considerations about the protocol used by stations to communicate with each other: in this specific case, no assumption is made on the networking technology, as any protocol can be employed in the network (several protocols can actually be used as long as they are compatible with each other) except for one requirement:

Proposition 5 (Networking protocol). Stations are free to employ any arbitrary communication protocol, as long as it guarantees direct addressing and that a message can be delivered from / to every pair of stations.

The real-world scenario, the one considered here, is the Internet and the TCP/IP protocol. Given proposition 5, one station is actually capable of communicating with every other one in the network; however, this rarely happens because, and this is the reason why direct addressing is essential, a node's address must be known. Networks can be organized according to DHT specifications thanks to a limited knowledge of other nodes' addresses.

The direct consequence of proposition 5 is that a separation is made between stations' physical and logical connection schemes. Physically, stations can be arranged without any constraint; however, the limited knowledge of other nodes in the graph allows the system to generate an overlay network, which is the one being considered here.
A station is supposed to have a very limited knowledge of other stations, thus keeping in memory only a few of them (considered as neighbours). According to DHT specifications, a ring topology is employed; it derives from nodes holding a neighbour set of only 2 nodes: a predecessor and a successor, as shown in figure 2.1.

¹ Distributed Hash Tables are employed in distributed networks. This network protocol was adopted by 4 major P2P systems: Chord, Pastry, CAN and Tapestry.
FIGURE 2.1: A N = 8 network example showing the logical ring topology. Each station is assigned an ID (typically the IP address hash) and packets are routed by content.

Station ordering

A natural order occurs among stations. When the ring is formed, every node s_i computes an identifier Id(s_i) ∈ ℕ represented by the hash of the station's address. This identifier is used to build the ring, as every station needs to locate its predecessor and successor in the network:

Lemma 1 (Ring construction). As long as every station has a minimal initial set of connections which guarantees that all nodes form a connected graph, the ring can be built by having every station reshape its neighbourhood with one successor and one predecessor.

After this initialization phase, the ring is on-line and ready to accept packets. This aspect is very important as it allows us to define the set of stations Ω as ordered, and we can define:

Definition 5 (Station preceding operator). Given a couple of stations (s_i, s_j) ∈ Ω² (i ≠ j and i, j ≤ N), operator ≺ : Ω × Ω → {true, false} defines the preceding relation among them. The operator works as follows:

s_i ≺ s_j ⟺ Id(s_i) < Id(s_j)    (2.1)

The first important result is the following:

Theorem 2 (Ring complete ordering). The set of stations Ω with precedence operator ≺ : Ω × Ω → {true, false} is a completely ordered set.

Proof. Immediate by considering that operator ≺ on Ω, because of its definition, directly maps onto operator < on ℕ, which is a fully ordered set.

Ring access

The ring is the place where data is stored, and the purpose of the PA is to help the storage system balance all DUs across stations. The first detail we focus on is how the ring is accessed when the SS needs to send a packet to be stored or retrieved. The ring has to be kept safe both from external nodes and from the same nodes that are part of the network.
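Lemma 1 and definition 5 can be illustrated with a short sketch: station IDs are obtained by hashing addresses, and sorting the IDs yields each node's predecessor and successor. SHA-256 truncated to 32 bits stands in for the generic hash function ξ; this choice is an assumption for illustration, not the thesis's prescribed one.

```python
import hashlib

def station_id(address: str) -> int:
    """Id(s) = hash of the station's address, truncated to 32 bits."""
    return int(hashlib.sha256(address.encode()).hexdigest(), 16) % 2**32

def build_ring(addresses):
    """Sort stations by ID; predecessor and successor of every node follow
    from the sorted order, wrapping circularly (the ring of figure 2.1)."""
    ring = sorted(addresses, key=station_id)
    n = len(ring)
    neighbours = {addr: (ring[(i - 1) % n], ring[(i + 1) % n])  # (pred, succ)
                  for i, addr in enumerate(ring)}
    return ring, neighbours
```

Sorting by ID is exactly the complete ordering of theorem 2: the precedence operator ≺ reduces to `<` on the integer IDs.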
The way to guarantee the latter is through the following:
FIGURE 2.2: Access to the ring is guarded by proxies.

Proposition 6 (Limited knowledge principle). To guarantee safety and scalability, every node in the system has a partial knowledge of the overall network.

In order to protect the ring from external activity, direct access to the network must be forbidden:

Proposition 7 (Zero knowledge principle). To guarantee safety from external intrusions, no node, except those in the ring, knows the address of any station in the system.

This principle, though valid, cannot be adopted as-is because we would otherwise end up with an isolated network. However, in order to fulfil the security features promoted by proposition 7, it is possible to build a guarding system around the ring which hides it from the external world. A collection of proxy stations is employed for this purpose. As shown in figure 2.2, those stations are exposed to the external world and act as intermediaries to the ring, whose addresses are kept private (in order to fulfil proposition 6, proxy stations will know the addresses of only a few stations in the ring).

Routing

The SS we are designing is distributed. The basic idea, according to the DHT specifications, is that a packet enters the ring from an arbitrary station, called the entry point, in the context of a transmission. From there, every station, which has the BA deployed, knows whether that packet should be stored there or otherwise routed to a different station.

Every station keeps a limited knowledge of the network. This knowledge is represented by the set of neighbour nodes one station keeps. Given the topology, we define a parameter called leaf radius: r ∈ ℕ (r < N), which represents the number of successor (or predecessor) nodes every station holds as its neighbourhood.

Definition 6 (Leaf Set).
Every station s_i ∈ Ω keeps track of its neighbour nodes (plus itself) in an ordered set called leaf set: Λ(s_i) ⊂ Ω. The leaf set's cardinality is always |Λ(s_i)| = 2r + 1, where r is the leaf radius of the ring. The following equation holds:

Λ(s_i) = {s_j ∈ Ω : a_{i,j} = 1} ∪ {s_i}    (2.2)

Where A = [a_{i,j}] ∈ ℕ^{N×N} is the adjacency matrix of the network.
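Given the ordered ring, the leaf set of definition 6 and its upper/lower halves (definitions 7 and 8 below) can be computed directly from a station's index; a small sketch, assuming the ring is available as a sorted list of stations:

```python
def leaf_set(ring, i, r=1):
    """Λ(s_i): the r predecessors, s_i itself and the r successors,
    taken circularly, so that |Λ(s_i)| = 2r + 1."""
    n = len(ring)
    return [ring[(i + k) % n] for k in range(-r, r + 1)]

def upper_leaf_set(ring, i, r=1):
    """Λ_U(s_i): the r neighbours that s_i precedes."""
    n = len(ring)
    return [ring[(i + k) % n] for k in range(1, r + 1)]

def lower_leaf_set(ring, i, r=1):
    """Λ_L(s_i): the r neighbours preceding s_i."""
    n = len(ring)
    return [ring[(i - k) % n] for k in range(1, r + 1)]
```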
Note how one station's leaf set contains the station itself. Also, the leaf set is always an ordered set as it is, by definition, a subset of Ω, which we proved to be ordered in theorem 2. Unless otherwise specified, we will always consider r = 1. Lastly, it is sometimes convenient to picture one station's leaf set extensively as the ordered vector of neighbour stations (including s_i):

Λ(s_i) = (s_{i−r}, …, s_{i−1}, s_i, s_{i+1}, …, s_{i+r})

The notation above helps us identify nodes s_{i−r} and s_{i+r} as the extreme nodes in every station's leaf set. Those nodes will play an important role when defining routing function ψ later on.

Definition 7 (Upper Leaf Set). Let s_i ∈ Ω be a station and Λ(s_i) ⊂ Ω its leaf set. The upper leaf set Λ_U(s_i) ⊂ Λ(s_i) is defined as the set of all neighbours which the station precedes:

Λ_U(s_i) = {s_j ∈ Λ(s_i) : s_i ≺ s_j}    (2.3)

This set always has cardinality |Λ_U(s_i)| = r.

Definition 8 (Lower Leaf Set). Let s_i ∈ Ω be a station and Λ(s_i) ⊂ Ω its leaf set. The lower leaf set Λ_L(s_i) ⊂ Λ(s_i) is defined as the set of all neighbours by which the station is preceded:

Λ_L(s_i) = {s_j ∈ Λ(s_i) : s_j ≺ s_i}    (2.4)

This set always has cardinality |Λ_L(s_i)| = r.

When a station receives a packet, it performs certain operations in order to understand whether that packet is to be stored there or elsewhere. In the latter case, the station will pick one of the stations in its leaf set and route the packet there. The next station repeats the same sequence of operations until the packet is stored in a node. This algorithm is represented by function ψ : P → Ω; to perform it, every station in the ring uses the same hash function:

Definition 9 (Hash function). Let P be the set of packets and H ⊆ ℕ; we define ξ : P → H as a hash function used to calculate the station where to route a packet in the ring.

Not all hash functions can be used to route packets in the ring:

Proposition 8 (Cryptographic hash function).
Hash function ξ : P → H is a cryptographic hash function. By definition, ξ behaves in such a way that a one-bit change in the input packet causes, on average, about 50% of the output hash's bits to change.

It is possible to consider many cryptographic hash functions out of those currently employed in modern systems. Among the most common today we have the following families: SHA² and MD³. As anticipated earlier, stations self-organize in a logical overlay ring by assigning IDs. One station's ID Id(s_i) is computed by using the same hash function ξ on the station's address. For formal consistency, we intend hash function ξ to also work on stations: ξ : Ω → H, which is perfectly valid as hash functions do not really care about the type of data fed as input, as long as it is a bitstream. The following expression: h_i = Id(s_i) = ξ(s_i) is to be intended as hash function ξ calculated on station

² Secure Hash Algorithm. SHA-1 (160 bits), SHA-256 (256 bits), SHA-512 (512 bits).
³ Message Digest Hash. MD2 and MD4 (128 bits), today considered unsafe. MD5 (128 bits) and MD6 (512 bits).
s_i's address. As soon as the ring is initialized and ready to work, the topology defines the ordering of stations: s_1 ≺ s_2 ≺ · · · ≺ s_i ≺ · · · ≺ s_{N−1} ≺ s_N, following the order of IDs (hashes): h_1 < h_2 < · · · < h_i < · · · < h_{N−1} < h_N. From here the DHT assigns to each station a hash segment:

Definition 10 (Hash Segment). Let s_i ∈ Ω be a station in the ring and h_i = Id(s_i) = ξ(s_i) its ID. Station s_i's hash segment Ξ(s_i) is defined as the set of contiguous hashes ranging from h_i up to h_{i+1} (excluded):

Ξ(s_i) = {h ∈ H : h_i ≤ h < h_{i+1}}  if i ≠ N
Ξ(s_i) = {h ∈ H : h_N ≤ h ≤ h^M} ∪ {h ∈ H : 0 ≤ h < h_1}  if i = N    (2.5)

Where h^M ∈ H is the highest value that hash function ξ can produce: ξ(·) ∈ [0, h^M].

Routing function ψ employs hash function ξ in order to compute the destination station for a packet. The algorithm is deployed on every station and always behaves the same:

Algorithm 1 Routing a packet in the ring
Require: Ring initialized
Require: Station s_i has ID h_i = ξ(s_i)
Require: Station s_i has associated hash segment Ξ(s_i)
Require: Station s_i has associated leaf set Λ(s_i)
1: function ψ(p ∈ P)
2:   h_p ← ξ(p)
3:   Λ ← ∅
4:   if h_p ≥ h_i then
5:     Λ ← Λ_U(s_i) ∖ {s_{i+r}} ∪ {s_i}
6:   else
7:     Λ ← Λ_L(s_i)
8:   end if
9:   for s ∈ Λ do
10:     if h′ ≤ h_p ≤ h″ then    ▷ Since Ξ(s) = [h′, h″] ⊂ H
11:       return s
12:     end if
13:   end for
14:   if h_p ≥ h_i then    ▷ Having that Λ(s_i) = (s_{i−r}, …, s_i, …, s_{i+r})
15:     return s_{i+r}    ▷ The packet either belongs to s_{i+r} or further
16:   else
17:     return s_{i−r}    ▷ The packet belongs to a station preceding s_{i−r}
18:   end if
19: end function

We can now provide a better formal description of a ring by introducing its definition:

Definition 11 (Ring). Let Ω be the set of stations, r ∈ ℕ the leaf radius, ξ : · → H ⊆ ℕ the hash function used by each station and ψ : P → Ω the routing function, based on ξ, used to assign packets to stations. Then we define R = (Ω, r, ξ, ψ) as a fully qualified ring
overlay across N = |Ω| stations s_i ∈ Ω where packets p ∈ P are routed and delivered to each station via routing function ψ employing hash function ξ.

Given algorithm 1, we have the following result:

Lemma 3. Let R = (Ω, r, ξ, ψ) be a ring; then the following holds:

ψ(p) = s_i ⟺ ξ(p) ∈ Ξ(s_i), ∀p ∈ P

Proof. Immediate by considering the first exit point of algorithm 1.

Regarding ψ, we want to describe a few more important aspects:

Definition 12 (Routing). Function ψ will be repeatedly executed for a certain number of iterations from the moment packet p enters the ring until it finds its destination station. The routing is over when ψ returns the same station where it is evaluated.

It is now evident that one single application of function ψ does not effectively route the packet to the correct station. It is necessary to perform a certain number of iterations and apply ψ to the same packet in different stations. This scheme generates a recursive condition which we want to make more evident. Let us denote with ψ_k ∈ Ω the station returned by the k-th application of ψ; the recursive definition is completed by setting the initial condition:

ψ_{k+1} = ψ(ψ_k), ψ_1 = s_i    (2.6)

As for every recursive function, we ask ourselves whether the recursive definition in equation 2.6 converges to a value. As per definition 12, we expect function ψ to assume a value at a certain iteration b ∈ ℕ and keep it in every future iteration b + k, k ∈ ℕ. For this reason, it is imperative that the cyclic application of ψ does not lead to an infinite sequence of iterations, which would make the recursive definition generate an alternating sequence.

Theorem 4 (Routing is always successful). Let R = (Ω, r, ξ, ψ) be a ring and p ∈ P a packet entering it from station s_i ∈ Ω. Let b ∈ ℕ be the number of different applications of routing function ψ, across the different stations of the ring, before p finally reaches its destination.
Then b is always limited: b ≤ B ∈ ℕ.

Proof. We prove this by contradiction, thus assuming ∃p ∈ P : b → ∞. By analyzing algorithm 1, in order to have an infinite number of iterations, we need function ψ, when evaluated on station s_i ∈ Ω, to never return the current station: ψ(p) ≠ s_i. For such a condition to hold, the following must occur:

∃p ∈ P : h_p = ξ(p) ∉ Ξ(s_i), ∀s_i ∈ Ω

which translates into:

∃p ∈ P : h_p = ξ(p) ∉ [h_i, h_i^M], ∀s_i ∈ Ω

Since we do not know whether s_i is the last station in the ring, we use h_i^M to indicate the final hash in station s_i's hash segment. However, since Ξ(s_i) = [h_i, h_i^M] in the equations above depends only on s_i, and since those equations hold for all stations, we can consider the totality of the hash segments:

⋃_{s ∈ Ω} Ξ(s) = ⋃_{k=1}^{N} [h_k, h_k^M] = [0, h^M]
So we can re-write the previous equations as:

∃p ∈ P : h_p = ξ(p) ∉ [0, h^M]

Which is contradictory, as hash function ξ is, by definition, limited in range: ξ(·) ∈ [0, h^M], and since h_p is calculated via hash function ξ, it must fall in that range.

Theorem 4 proves that the recursive definition introduced before converges:

Corollary 4.1 (Recursive application of ψ converges). Let R = (Ω, r, ξ, ψ) be a ring and p ∈ P a packet entering it from station s_i ∈ Ω. Then the recursive term ψ_k during the routing of the packet converges to station s_j ∈ Ω after b ∈ ℕ iterations:

lim_{k→∞} ψ_k = ψ_b = s_j

In order to avoid confusion between the final computed destination station and the intermediate hopping stations calculated by the several iterations of the routing function, from now on we will indicate with expression s_i = ψ(p) the station where packet p is stored at the end of the routing process. That is, we consider ψ as returning ψ_b = s_i (last iteration) unless otherwise specified. At this point, we have completed describing and formalizing the storage system.

2.2 Unbalanced ring

We now move forward by analyzing what problems this structure presents in terms of balancing. Even though only the storage system has been covered so far, it is important to point out that, as it is now, the architecture already enables a primitive form of balancing. Packets are, in fact, distributed across different stations, and modern P2P networks are entirely based on this scheme. What kind of balancing do we end up with? Since routing algorithm ψ employs hash function ξ, the balancing state of the ring depends entirely on ξ! Under static conditions (the ring does not change), the routing of packet p is decided the moment h_p = ξ(p) is computed. Routing function ψ is executed many times because the knowledge that each station has of the ring is limited, but that number of iterations does not affect the destination station being computed.
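The routing procedure of algorithm 1 and its recursive application (equation 2.6) can be sketched as below. The sketch simplifies the thesis pseudocode in one stated respect: each hop first checks its own segment, which also covers the wrap-around segment of the last station; leaf-set members are derived on the fly from the sorted list of IDs, which stands in for the per-station limited knowledge.

```python
def in_segment(ids, i, hp, h_max):
    """True when hp falls in Ξ(s_i); the last station's segment wraps."""
    n = len(ids)
    if i < n - 1:
        return ids[i] <= hp < ids[i + 1]
    return ids[i] <= hp <= h_max or 0 <= hp < ids[0]

def psi(ids, i, hp, r=1, h_max=2**32 - 1):
    """One application of ψ at station index i: scan the relevant side of
    the leaf set for the owner of hp, else forward to the extreme leaf."""
    n = len(ids)
    if in_segment(ids, i, hp, h_max):
        return i
    step = 1 if hp >= ids[i] else -1
    for k in range(1, r + 1):
        j = (i + step * k) % n
        if in_segment(ids, j, hp, h_max):
            return j
    return (i + step * r) % n  # extreme leaf: keep moving in that direction

def deliver(ids, start, hp, r=1, h_max=2**32 - 1):
    """Iterate ψ until it returns the station it is evaluated at
    (definition 12); theorem 4 guarantees this terminates."""
    i = start
    while True:
        j = psi(ids, i, hp, r=r, h_max=h_max)
        if j == i:
            return i
        i = j
```

With r = 1 the packet moves one hop per iteration toward its owner, which is exactly why the number of applications b in theorem 4 is bounded by the ring size.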
The question we want to answer is: "Given hash h ∈ H, what is the probability that, given a random input packet p ∈ P, hash function ξ computed on p returns h: Pr{ξ(p) = h}?". Since ξ is a cryptographic hash function, it has an interesting property: the probability distribution of the hashes being generated is approximately uniform, which means that:

∀h ∈ H, ∀p ∈ P, Pr{ξ(p) = h} = 1 / (h^M + 1)    (2.7)

Since ξ : · → [0, h^M]. If we indicate with l ∈ ℕ the number of bits of the hash (hash length): l = ⌈log₂ h^M⌉, then ξ : · → [0, 2^l − 1] and we can write:

∀h ∈ H, ∀p ∈ P, Pr{ξ(p) = h} = 2^{−l}    (2.8)

What we want to focus on is knowing, in the long run, how many packets each station gets with this configuration. From this knowledge, we will then analytically
calculate the information we need about the balancing. The problem of Packets in Stations is very close (though not entirely equivalent) to another well-known one which we will take into consideration: Balls in Bins⁴. We will see that the scenario of throwing a ball into an area full of bins and assessing into which bin the ball falls is equivalent to producing a random packet, calculating its hash and checking into which station it is going to be routed. Our analysis starts with identifying the probability for a packet to be routed into one station of the ring:

Theorem 5 (Packet-in-station probability). Let R = (Ω, r, ξ, ψ) be a ring. Then the probability that a packet p ∈ P is routed to station s_i ∈ Ω is:

π_i^{(ξ)} = Pr{ψ_ξ(p) = s_i} = |Ξ(s_i)| / (1 + h^M)

Where |Ξ(s_i)| is station s_i's hash segment's length (hash coverage).

Proof. Since each station s_i owns a specific hash segment Ξ(s_i) = [h_i, h_i^M], the proof is immediate by considering that every hash h ∈ H ≡ [0, h^M] has the same probability of being selected, as per equation 2.8. So we just need to multiply that probability by the length of the segment.

We use expressions π_i^{(ξ)} and ψ_ξ(·) to indicate the probability for a packet to fall into a certain station and the routing function ψ, both when hashing function ξ is employed in the ring. The reason why we want to make the hash function explicit is that, later, we are going to evaluate the same hash-based quantities with a different hashing function and compare results.

Given one station s_i, the length of its hash segment |Ξ(s_i)| is an important quantity. We can efficiently formalize its value by using expression |Ξ(s_i)| = δ_{i,i+1} and by defining the following quantity:

Definition 13 (Segment length calculator). Let Ω be the set of stations and let every station s_i ∈ Ω have an assigned hash segment [h_k, h_k^M] ⊂ ℕ such that the hash partitioning is circular; thus the last station s_N has hash segment [h_N, h^M] ∪ [0, h_1 − 1].
We define the segment length calculator function as the application returning the number of hashes in the segment assigned to one station:

δ_{i,j} = 1_{i,j} · 2^l − h_{i mod (N+1)} + h_{j mod (N+1)}

Having ∀i, j ∈ ℕ ∧ i, j > 0, and:

1_{i,j} = 1 if i > j (indices taken circularly), 0 otherwise

so that, for consecutive stations, δ_{i,i+1} = h_{i+1} − h_i when i < N, while the circular segment of the last station yields δ_{N,N+1} = 2^l − h_N + h_1.

Then we can express the packet-in-station probability in theorem 5 as follows:

π_i^{(ξ)} = Pr{ψ_ξ(p) = s_i} = 2^{−l} · δ_{i,i+1}    (2.9)

Theorem 6 (Packets-in-station probability). Let R = (Ω, r, ξ, ψ) be a ring. Let µ_i = |s_i| = 0 … m be a r.v. counting the number of packets in station s_i, where m ∈ ℕ represents

⁴ The problem describes a non-deterministic scenario where balls are thrown into an area full of bins in a random direction, as described in: (Kolchin, 1998).
the number of total packets sent so far to the ring. Then the probability that station s_i ∈ Ω has k ∈ ℕ packets is:

Pr{µ_i^{(ξ)} = k} = (m choose k) · (π_i^{(ξ)})^k · (1 − π_i^{(ξ)})^{m−k}

Proof. Immediate. We consider m packets routed into the ring and we want to calculate the probability that k among them were routed to station s_i. This calls for Bernoulli trials. The probability for a packet to end up in one station is given by theorem 5.

Thanks to theorem 6, we know the PDF of r.v. µ_i^{(ξ)} and we are able to calculate how many packets on average one station gets:

η_i^{(ξ)}(m) = E[µ_i^{(ξ)}] = Σ_{k=0}^{m} k · Pr{µ_i^{(ξ)} = k} = Σ_{k=0}^{m} k · (m choose k) · (π_i^{(ξ)})^k · (1 − π_i^{(ξ)})^{m−k}    (2.10)

As described in (Kolchin, 1998), in the Balls in Bins problem the following holds:

Σ_{k=0}^{m} k · (m choose k) · p^k (1 − p)^{m−k} = mp, ∀m ∈ ℕ, m > 0, p ∈ [0, 1] ⊆ ℝ

Which allows us to calculate the average load per station in a simpler form:

η_i^{(ξ)} = m · 2^{−l} · δ_{i,i+1}    (2.11)

Proposition 9 (Unbalanced ring). Let R = (Ω, r, ξ, ψ) be a ring. Given equation 2.11, the network is not balanced. The wider one station's hash segment, the more packets that station gets:

|Ξ(s_i)| > |Ξ(s_j)| ⟺ η_i^{(ξ)} > η_j^{(ξ)}, ∀s_i, s_j ∈ Ω

Proposition 9 is very important as it states that the system architecture, so far, is able to achieve a very high level of decentralization; however, it fails in balancing stations.

2.3 Balancing the ring

Equation 2.11 is our starting point to take the current architecture and modify it in order to reach load balancing. Our ideal model is one in which each station gets the same number of packets:

Definition 14 (Ideal station load). Given a network of N ∈ ℕ different stations, we define the following as the ideal load per station:

η_i = m / N, ∀s_i ∈ Ω

Where m ∈ ℕ is the total number of packets sent to the network.
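Equations 2.9 through 2.11 can be checked numerically. The sketch below computes δ_{i,i+1} from consecutive IDs (wrapping for the last station), the binomial law of theorem 6, and the mean load η_i^{(ξ)} = m · 2^{−l} · δ_{i,i+1}; function names are illustrative.

```python
from math import comb

def segment_lengths(ids, l):
    """delta_{i,i+1} for each station: difference of consecutive sorted IDs;
    the last segment wraps past 2^l back to h_1."""
    n = len(ids)
    return [ids[i + 1] - ids[i] if i < n - 1 else 2**l - ids[-1] + ids[0]
            for i in range(n)]

def pmf(pi, m, k):
    """Theorem 6: Pr{mu_i = k} over m packets, per-packet probability pi."""
    return comb(m, k) * pi**k * (1 - pi) ** (m - k)

def mean_load(ids, l, m):
    """Equation 2.11: eta_i = m * 2^{-l} * delta_{i,i+1} for every station."""
    return [m * d / 2**l for d in segment_lengths(ids, l)]
```

Running it on an uneven partition makes proposition 9 concrete: the station with the widest segment accumulates proportionally more packets than the ideal m/N.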
Definition 14 points out that our final goal is having our architecture move towards the Balls in Bins model. Our goal can also be expressed by considering the probability that each single bin has of getting a ball:

π_i = N^{−1}, ∀s_i ∈ Ω    (2.12)

If we can have equation 2.9 converge to equation 2.12, our goal is reached, since both definition 14 and equation 2.12 describe a uniformly distributed r.v.

2.3.1 Extending the ring

Moving forward to our goal, as per definition 14, we need to understand why the ring is not balanced. We can identify 2 possible causes:
1. Hash function ξ does not take into account the fact that stations have hash segments of different lengths. It actually assumes that all stations have hash segments of the same size.
2. Stations should have hash segments of the same size.

The 2 problems described above are actually 2 possible explanations of the same issue: with regard to balancing, the network structure and the hash function are not well coupled together. For our solution, we choose to accept the standpoint offered by point 1, which blames the hash function rather than the stations. Our approach is replacing hash function ξ with another one:

Definition 15 (Hash function φ). Let φ : P → [0, φ^M] ⊂ ℝ be a hash function. By employing function φ, a ring can achieve balancing and each station approximately receives the same number of packets:

η_i^{(φ)} ≈ η_i = m / N

Where m ∈ ℕ is the total number of packets sent to the network. Also, we still define l as the length (in bits) of hashes generated by φ: l = ⌈log₂ φ^M⌉.

Definition 15 represents a goal for us. In the next section we are going to design φ so that load balancing in the ring is achieved. The first thing we notice about φ is that we have designed it to return real numbers; thus the hash space is no longer discrete, but continuous. We will see that the continuous characterization of φ will not be a problem when employed in the ring and in routing algorithm ψ.
Furthermore, in our model, we will consider function φ to be used on packets only; we will not be using this new hash function to compute the IDs of stations: for them, we will keep using hash function ξ.

Definition 16 (Extended ring). Let Ω be the set of stations and r ∈ ℕ the leaf radius. Let ξ : · → H ⊆ ℕ be the underlying hash function and φ : P → [0, φ^M] ⊂ ℝ the balancing hash function used by each station and based on ξ. Let ψ : P → Ω be the routing function, based on φ, used to assign packets to stations. Then we define R = (Ω, r, ξ, φ, ψ) as the extended ring overlay where packets are balanced via hash function φ.

Given definition 16, we see that φ does not act as a replacement of ξ, so we will actually consider the former as an extension of the latter.
Adapting concepts in the extended ring

With hash function φ in place, routing algorithm ψ needs to be slightly changed. Actually, since the whole ring structure is based on hashes, we need to adjust a few definitions so that the extended ring (Ω, r, ξ, φ, ψ) can be properly described. The most important concept to introduce in the extended architecture is that φ acts transparently with regard to the hash space. Hash function ξ returns hashes in the discrete space H ≡ [0, h^M] ⊂ ℕ, while φ generates hashes in the continuous space Φ ≡ [0, φ^M] ⊂ ℝ. We design φ such that φ^M = h^M; thanks to this, one space contains the other but they are bound by the same extremes: H ⊂ Φ. This also means that hash segments can be expressed both as enumerable sets and as real intervals. Whether the former or the latter is meant will be specified via set identities or inferred from context.

The next concept to adapt is stations and their IDs. Nothing changes in this regard: every station will keep using hash function ξ to calculate the hash of its address h_i ∈ H; however, the important point here is understanding that the hash identifying one station is also contained in the space of φ-hashes: h_i ∈ Φ.

The last aspect to cover is routing function ψ. Given the assumptions above, algorithm 1 remains essentially unchanged. What changes is line 2 where, instead of using hash function ξ to hash the packet, hash function φ is used: h_p ← φ(p). All other operations remain unchanged because Φ extends H. The direct consequence of this last point is the following:

Lemma 7. Let R = (Ω, r, ξ, φ, ψ) be an extended ring; then the following holds:

ψ(p) = s_i ⟺ φ(p) ∈ Ξ(s_i), ∀p ∈ P

Proof. Immediate by considering lemma 3 and the fact that function φ replaces ξ in algorithm 1 at line 2.

Defining sizing equations

Equation 2.12 describes the PDF of r.v. s ∈ Ω, which represents the bin (station in an ideally balanced ring) where a ball (packet) falls.
As that equation prescribes, if we want to have the ring balanced, we need to make sure that all stations get the same probability of receiving a packet, which is not the case for a normal ring (Ω, r, ξ, ψ), as per equation 2.9. So, by employing hash function φ, r.v. s_φ ∈ Ω can be defined as the station where a packet falls, assuming the extended ring (Ω, r, ξ, φ, ψ) is in place. R.v. s_φ's PDF is the starting point from where we can commence our sizing effort. Since s_φ is continuous, and given lemma 7, the probability that a packet is routed to station s_i in the extended ring is:

π_i^{(φ)} = Pr{h_i ≤ φ(p) ≤ h_i^M} = ∫_{Ξ(s_i)} f_φ(r) dr    (2.13)

Where f_φ : [0, h^M] ⊂ ℝ → ℝ is r.v. h_φ's PDF (it represents the generated φ-hashes) and Ξ(s_i) ⊆ Φ is station s_i's continuous hash segment. It is important to notice how
this equation relates r.v. s_φ ∈ Ω (since its PDF has expression Σ_{k=1}^{N} π_k^{(φ)} · δ(r − k)⁵) together with r.v. h_φ ∈ Φ. Recalling equation 2.12, we basically want:

π_i^{(φ)} = π_i : ∫_{h_i}^{h_i^M} f_φ(r) dr = N^{−1}, ∀s_i ∈ Ω    (2.14)

We start from equation 2.14. Our purpose is designing hash function φ's implementation so that this equation holds. This approach will guarantee that r.v. s_φ behaves like the continuously distributed r.v. s in the Balls in Bins scenario.

2.3.2 Designing hash function φ

Equation 2.14 represents a constraint on f_φ. This expression points out an important relationship:

Proposition 10 (Relationship between r.v. s_φ and h_φ). Given equation 2.14, the effort of designing hash function φ is transferred onto r.v. h_φ, as its PDF f_φ is the subject of such design.

Designing r.v. s_φ's PDF

Of course, we cannot extract function f_φ from the integral sign in equation 2.14, so we need to make some assumptions on it.

Definition 17 (Formatting impulse). Let g : ℝ → ℝ be a continuous, domain- and value-bounded function with the following constraints:
1. g(r) ≥ 0, ∀r ∈ ℝ.
2. g is a compact-support⁶ function: ∃r_1, r_2 ∈ ℝ, r_1 < r_2 : g(r) = 0, ∀r ∉ [r_1, r_2].
3. ∃A ∈ ℝ, A > 0 : g(r) ≤ A, ∀r ∈ ℝ.
4. ∫_{−∞}^{+∞} g(r) dr ≤ 1.
5. It is possible to calculate g's antiderivative: ∃G(r) : G′(r) = g(r).

We define g as f_φ's formatting impulse: a function used to shape r.v. h_φ's PDF and solve equation 2.14. Because of its definition, we use expression g_{r_1,r_2,A}(r) to refer to an impulse with amplitude A and definition interval [r_1, r_2] ⊂ ℝ.

We want to show right now an important result concerning definition 17 which will be useful later on in this chapter:

Lemma 8 (Impulse antiderivative is invertible). Let g : ℝ → ℝ be a formatting impulse. Then its antiderivative G : ℝ → ℝ is invertible: ∃G^{−1}.

Proof. The inverse function theorem⁷ states that a continuously differentiable univariate function with nonzero derivative in a certain interval is therein invertible.
In our case, G is the antiderivative of a continuous function, thus it is continuous itself

⁵ Function δ is intended as a generalized function, or distribution: ⟨δ, ϕ⟩ = ϕ(0).
⁶ Compact-support functions are used in distributional calculus.
⁷ As described in (Nijenhuis, 1974), the theorem provides a sufficient condition for a function to be invertible.
FIGURE 2.3: Hash-partitioning of a ring into different segments, one per station. For each segment, a different impulse is used, whose coverage matches the segment's length.

and differentiable by definition of primitive function. Thus we meet the conditions of invertibility. The nonzero-derivative condition is not met by g's definition. However, this does not undermine invertibility: rather, it does not guarantee that the inverse function is also continuously differentiable.

Function f_φ's domain [0, h^M] ⊂ ℝ can be partitioned into N different segments: one hash segment Ξ(s_i) ≡ [h_i, h_i^M] per station s_i. The basic idea is having function f_φ employ impulse g to cover the different segments in the whole hash space [0, h^M] ⊂ ℝ, as shown in figure 2.3. So, for each hash segment [h_i, h_i^M] ⊂ ℝ, impulse g_{h_i,h_i^M,A_i} is considered and employed to calculate f_φ's values falling into that specific segment. For the last segment, relative to s_N, since it crosses the max hash value h^M, we actually need to use 2 different impulses: g_{h_N,h^M,A_{N,1}} and g_{0,h_1,A_{N,2}}. Amplitudes A_1, A_2, …, A_{N,1}, A_{N,2} are sized quantities and their values will be calculated later in this chapter.

Definition 18 (Function f_φ's structure). Let R = (Ω, r, ξ, φ, ψ) be an extended ring and let h_φ ∈ [0, h^M] ⊂ ℝ be the r.v. representing a φ-hash; then its PDF is formally defined as:

f_φ(r) = Σ_{k=1}^{N−1} g_{h_k,h_k^M,A_k}(r) + g_{h_N,h^M,A_{N,1}}(r) + g_{0,h_1,A_{N,2}}(r)

It is important to notice that f_φ is not a regular unconstrained function; it is a PDF, thus it must meet certain requirements. Later on, we will verify that those requirements are actually in place. We can now take equation 2.14 and replace f_φ with its definition:

∫_{h_i}^{h_i^M} f_φ(r) dr = ∫_{h_i}^{h_i^M} g_{h_i,h_i^M,A_i}(r) dr = N^{−1}, ∀i = 1 …
N − 1    (2.15)

In case the last station s_N is considered, then 2.14 becomes:

∫_{Ξ(s_N)} [g_{h_N,h^M,A_{N,1}}(r) + g_{0,h_1,A_{N,2}}(r)] dr = ∫_{h_N}^{h^M} g_{h_N,h^M,A_{N,1}}(r) dr + ∫_{0}^{h_1} g_{0,h_1,A_{N,2}}(r) dr = N^{−1}    (2.16)

Since g has an antiderivative as per definition 17, we can proceed further in both equations:

[G_{h_i,h_i^M,A_i}(r)]_{h_i}^{h_i^M} = N^{−1}, ∀i = 1 … N − 1    (2.17)

And:
[G_{h_N,h^M,A_{N,1}}(r)]_{h_N}^{h^M} + [G_{0,h_1,A_{N,2}}(r)]_{0}^{h_1} = N^{−1}    (2.18)

Equations 2.17 and 2.18 are the closed-form constraints we have just derived from equation 2.14.

Remark (Solutions of equations 2.16 and 2.18). Regarding the last station's hash segment, we have to use 2 different impulses whose amplitudes A_{N,1} and A_{N,2} can be sized via equations 2.16 and 2.18. Those equations provide a possibly infinite set of solutions where both amplitudes are interdependent. Among the possible ones, we choose to have each impulse cover half of the target value:

∫_{h_N}^{h^M} g_{h_N,h^M,A_{N,1}}(r) dr = [G_{h_N,h^M,A_{N,1}}(r)]_{h_N}^{h^M} = 1/(2N)    (2.19)

And:

∫_{0}^{h_1} g_{0,h_1,A_{N,2}}(r) dr = [G_{0,h_1,A_{N,2}}(r)]_{0}^{h_1} = 1/(2N)    (2.20)

Remark (Requirements on impulse). In definition 17, we required formatting impulse g to have an antiderivative G whose definition is known. This assumption is pretty strong but not essential. As we could see from the calculations so far, equations 2.15 and 2.16 can actually be used to size the value of the impulse's amplitude by using an alternative method to exact integration. Throughout the rest of our analysis, equations 2.17 and 2.18 will always be referred to as the preferred amplitude sizing method; however, it will always be implicitly intended that equations 2.15 and 2.16 can replace them.

The process of designing f_φ is complete, as the equations above can be used to calculate all impulse amplitudes A_1, A_2, …, A_{N,1}, A_{N,2}:

Proposition 11 (Defining function f_φ). In order to reach load balancing in the ring, the following operations are considered:
1. Formatting impulse g_{h_i,h_i^M,A_i} : [h_i, h_i^M] ⊂ ℝ → [0, A_i] ⊂ ℝ is defined for each station s_i ∈ Ω.
2. Function f_φ is designed as per definition 18 by summing all impulses together.
3. Function f_φ will be parametric on the set of impulse amplitudes, hence its input space will be ℝ^{N+2}: f_φ(A_1, …, A_{N,1}, A_{N,2}, r) with r, A_k ∈ ℝ, A_k > 0, ∀k = 1 … N.
4.
For each impulse, the corresponding antiderivative:

G_{h_i,h_i^M,A_i}(r) = ∫_{0}^{r} g_{h_i,h_i^M,A_i}(x) dx  (2.21)

is calculated.
5. For each impulse, by means of equations 2.17 and 2.18, the corresponding amplitude A_i is calculated.

Now that we know f_φ's formal definition, we need to verify that such an expression meets the constraints of a PDF:

Theorem 9 (Function f_φ is a regular PDF). Let R = (Ω, r, ξ, φ, ψ) be an extended ring with N = |Ω| stations and let g_{h_i,h_i^M,A_i}: [h_i, h_i^M] ⊂ ℝ → [0, A_i] ⊂ ℝ be the formatting impulse for each station s_i ∈ Ω such that equations 2.17 and 2.18 hold. Then function f_φ, as per definition 18, is a regular PDF.
Proof. The 3 basic properties of PDF functions must be met:

1. f_φ is positive and bounded given the definition of impulse g:

0 ≤ g_{h_i,h_i^M,A_i}(r) ≤ A_i, ∀i = 1 … N, r ∈ ℝ

2. Given its definition, f_φ's domain is the union of all non-overlapping domains of the impulses:

⋃_{k=1}^{N−1} [h_k, h_k^M] ∪ [h_N, h^M] ∪ [0, h_1] ≡ [0, h^M] ⊂ ℝ

Thus the function is 0 outside its definition range:

f_φ(r) = 0, ∀r < 0 ∨ r > h^M ⟹ lim_{r→±∞} f_φ(r) = 0

3. f_φ's area is unitary because of equations 2.17 and 2.18:

∫_{−∞}^{+∞} f_φ(r) dr = ∫_{0}^{h^M} f_φ(r) dr = N · N^{-1} = 1

The following comes as a direct consequence of theorem 9:

Corollary 9.1 (Function F_φ is a regular CDF). Function F_φ: ℝ → ℝ has the following form:

F_φ(r) = ∑_{k=1}^{N−1} G_{h_k,h_k^M,A_k}(r) + G_{h_N,h^M,A_{N,1}}(r) + G_{0,h_1,A_{N,2}}(r)  (2.22)

And it is a regular CDF, namely r.v. h_φ's CDF.

Proof. Immediate by considering r.v. h_φ's CDF's definition:

F_φ(r) = ∫_{−∞}^{r} f_φ(x) dx
= ∫_{−∞}^{r} [∑_{k=1}^{N−1} g_{h_k,h_k^M,A_k}(x) + g_{h_N,h^M,A_{N,1}}(x) + g_{0,h_1,A_{N,2}}(x)] dx
= ∑_{k=1}^{N−1} ∫_{−∞}^{r} g_{h_k,h_k^M,A_k}(x) dx + ∫_{−∞}^{r} g_{h_N,h^M,A_{N,1}}(x) dx + ∫_{−∞}^{r} g_{0,h_1,A_{N,2}}(x) dx
= ∑_{k=1}^{N−1} [G_{h_k,h_k^M,A_k}(x)]_{−∞}^{r} + [G_{h_N,h^M,A_{N,1}}(x)]_{−∞}^{r} + [G_{0,h_1,A_{N,2}}(x)]_{−∞}^{r}

Where G_{h_k,h_k^M,A_k} is defined according to equation 2.21. Given the impulse's definition, its antiderivative and its binding to hash segments as per definition 18, we know that the following holds:

lim_{r→−∞} g_{r_1,r_2,A}(r) = 0 ∧ r_1, r_2 ≥ 0 ⟹ lim_{r→−∞} G_{r_1,r_2,A}(r) = 0
Hence, leading to the following result:

[G_{r_1,r_2,A}(x)]_{−∞}^{r} = G_{r_1,r_2,A}(r) − lim_{x→−∞} G_{r_1,r_2,A}(x) = G_{r_1,r_2,A}(r), ∀r ∈ ℝ

Which leads us to equation 2.22. Finally, theorem 9 has proved that f_φ is a regular PDF, which completes the proof.

Designing r.v. h_φ

Although proposition 11 describes the procedure for calculating f_φ, it does not provide a way to build r.v. h_φ so that it generates hashes according to that function. This problem is known in the literature as random variable generation⁸. By employing this technique, we are able to get the algorithm for calculating φ-hashes which allows us to balance load in the ring. For the sake of completeness, the proof of this process is described below:

Theorem 10 (R.v. generation via inverse transform). Let F: ℝ → [0, 1] ⊂ ℝ be a continuous invertible function meeting the characteristics of a CDF:

1. Bounded in [0, 1]: 0 ≤ F(r) ≤ 1, ∀r ∈ ℝ.
2. lim_{r→−∞} F(r) = 0.
3. lim_{r→+∞} F(r) = 1.
4. Monotone increasing: r_1 < r_2 ⟹ F(r_1) ≤ F(r_2), ∀r_1, r_2 ∈ ℝ.

Let U ∈ ℝ be a continuous uniformly distributed r.v. over [0, 1] ⊂ ℝ. Define X ∈ ℝ as a r.v. such that the following transformation holds:

X = F^{-1}(U)  (2.23)

Where F^{-1}: [0, 1] ⊂ ℝ → ℝ denotes F's inverse function. Then X is distributed as F:

Pr{X ≤ x} = F(x), ∀x ∈ ℝ  (2.24)

Proof. We need to prove that equation 2.23 causes r.v. X to be distributed according to CDF F. Starting from equation 2.24, we need to prove the following:

Pr{F^{-1}(U) ≤ x} = F(x), ∀x ∈ ℝ

Since F is invertible, the function is both injective and surjective. Also, F is, by hypothesis, a continuous function. Thanks to those 2 conditions, we can apply F to both sides of the inequality under the probability sign:

Pr{F(F^{-1}(U)) ≤ F(x)} = F(x) ⟹ Pr{U ≤ F(x)} = F(x), ∀x ∈ ℝ

The sign of the inequality is left unchanged because F is monotone increasing.
Let now a = F(x); as x ranges over ℝ, a ranges over [0, 1], because F is a CDF and emits values in that interval. So the previous equation becomes:

Pr{U ≤ a} = a, ∀a ∈ [0, 1] ⊂ ℝ

⁸ As described in (Haugh, 2004), r.v. transformation via inverse transform is a known technique which makes it possible to define a random variable by using the inverse of its CDF.
But we have that Pr{U ≤ a} = F_U(a). U is, by hypothesis, a continuous uniformly distributed r.v. over [0, 1], and the previous equation is exactly r.v. U's CDF's formal definition, which proves the thesis.

Theorem 10 answers our question about the implementation of hash function φ. When considering a ring (Ω, r, ξ, φ, ψ), we can use hash function ξ to build hash function φ in order to reach balancing. Before detailing this process, we need to make sure we meet all conditions defined by the theorem above:

Lemma 11 (Function F_φ is invertible). Let f_φ be r.v. h_φ's PDF defined as per proposition 11. Let F_φ be its CDF. Then F_φ is invertible and we will indicate with F_φ^{-1} its inverse.

Proof. The proof is almost immediate. We consider F_φ's definition, as per corollary 9.1, and note that it is basically built up from the different impulse antiderivatives G_i = G_{h_i,h_i^M,A_i} with i = 1 … N. It is possible to invert a function piecewise; given F_φ's structure, we can prove its invertibility by proving that each impulse antiderivative G_i is itself invertible. Thanks to lemma 8, every impulse antiderivative is actually invertible, and this proves the thesis.

The process for achieving this result is as follows:

Proposition 12 (Hash function φ's implementation). Let R = (Ω, r, ξ, φ, ψ) be an extended ring. In order to build hash function φ, the following operations must be performed at ring initialization time:

1. For each station s_i ∈ Ω, compute its hash identifier h_i = ξ(s_i) and sort all ID hashes by increasing value: h_1, h_2, …, h_N.
2. Define a formatting impulse g(r, r_1, r_2, A) to use, parametric in (r_1, r_2, A) ∈ ℝ³. It is possible to use one impulse definition for all stations or a different definition for each.
3. Bind each station's associated impulse g_{r_1,r_2,A} to the station's hash segment Ξ(s_i) = [h_i, h_i^M], thus obtaining an impulse g_i = g_{h_i,h_i^M,A_i} parametric in amplitude A_i.
4.
Use sizing equations 2.17 and 2.18 to compute, for each impulse g_i, the value of amplitude A_i which allows balancing to be achieved, thus computing an array of fully qualified impulses: g_1, g_2, …, g_{N−1}, g_{N,1}, g_{N,2} (no longer parametric).
5. Build PDF function f_φ as per definition 18.
6. Compute CDF function F_φ as per corollary 9.1.
7. As per theorem 10, compute φ as:

φ(·) = F_φ^{-1}((2^l − 1)^{-1} · ξ(·))  (2.25)

Equation 2.25 represents hash function φ's formal definition. We will refer to this equation as the: Balancing Equation.
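The steps of proposition 12 can be sketched end to end for the special case of rectangular formatting impulses, for which F_φ is piecewise linear and therefore invertible by simple per-segment interpolation. This is an illustrative sketch, not the thesis' implementation; all function names are ours, and the station identifiers are those of the worked example in section 2.4.

```python
import bisect

def build_phi(station_ids, h_max, l_bits):
    """Sketch of proposition 12 for rectangular formatting impulses:
    F_phi is then piecewise linear, so its inverse is a per-segment
    linear interpolation. `station_ids` are the sorted hash IDs h_1..h_N."""
    h = sorted(station_ids)
    N = len(h)
    # F_phi breakpoints: F(0) = 0, F(h_1) = 1/2N, F(h_k) = 1/2N + (k-1)/N,
    # F(h_max) = 1 (the phi-segment boundaries visible in figure 2.4).
    bp = [0.0] + [float(x) for x in h] + [float(h_max)]
    cv = [0.0] + [1 / (2 * N) + k / N for k in range(N)] + [1.0]

    def F_inv(u):
        j = min(bisect.bisect_right(cv, u), len(bp) - 1) - 1
        t = (u - cv[j]) / (cv[j + 1] - cv[j])   # linear inside segment j
        return bp[j] + t * (bp[j + 1] - bp[j])

    # Balancing equation 2.25: phi = F_inv((2^l - 1)^-1 * xi).
    return lambda xi_hash: F_inv(xi_hash / (2 ** l_bits - 1))

# Demo with the example ring of section 2.4 (N = 6, l = 10, h_M = 1023):
h = [101, 210, 340, 553, 701, 998]
phi = build_phi(h, 1023, 10)

def station(r):
    """Route a hash to the station owning the segment it falls into."""
    if r < h[0] or r >= h[-1]:
        return 5                                # last station wraps around
    return bisect.bisect_right(h, r) - 1

counts = [0] * 6
for x in range(1024):                           # every possible 10-bit hash
    counts[station(phi(x))] += 1
print(max(counts) - min(counts) <= 2)  # True: near-perfect balance
```

Feeding every possible ξ-hash through φ assigns each station an almost identical number of hashes, whereas routing the raw ξ-hashes would load each station proportionally to its segment length.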
FIGURE 2.4: Hash segments mapped onto φ segments illustrating how hash function φ works. The top part of the diagram shows the φ hash-space (the interval [0, 1] split into equal φ-segments of width 1/N, with half-width segments of 1/(2N) at the two ends), while the bottom part shows the ξ hash-space (segments of widths h_1, h_2 − h_1, …, h^M − h_N).

2.3.3 Understanding how φ works

Equation 2.25 allows us to balance the ring. Before moving on, we would like to point out a few important facts regarding the balancing equation to better understand how it works.

Figure 2.4 clearly demonstrates the basic principle behind φ: the balancing application basically maps an unevenly distributed space (regular hashes, that is ξ hashes) onto an evenly distributed space (φ hashes). The even partitioning of the φ space is based on the number of stations N. We want to remark that, given its definition (equation 2.25), hash function φ works on a normalized domain spanning the interval [0, 1] ⊂ ℝ. This interval is subdivided into equal segments, each one assigned to a station. We will call them φ-segments, and we will use the term φ-coverage, later on, to indicate the same concept.

2.4 Ring balancing example

For the sake of completeness, we are going to provide an example of how to build hash function φ for a simple small ring consisting of a few stations. Since our purpose here is to provide a realistic, easy-to-understand scenario, we are going to consider the following conditions:

1. The ring will be made of N = 6 stations.
2. Leaf set radius is the minimum: r = 1.
3. We are going to consider extremely short hashes with l = 10 bits. This means that h^M = 2^{10} − 1 = 1023.

Remark. Hash function ξ will be considered but not defined, as we are not going to physically use it in our calculations.

We will now follow proposition 12's prescriptions.
Station      Hash identifier   HS (Ξ(s_i) ⊂ ℕ)              HS (Ξ(s_i) ⊂ ℝ)
s_1 = St. 1  h_1 = 101         {101 … 209}                   [101, 210)
s_2 = St. 2  h_2 = 210         {210 … 339}                   [210, 340)
s_3 = St. 3  h_3 = 340         {340 … 552}                   [340, 553)
s_4 = St. 4  h_4 = 553         {553 … 700}                   [553, 701)
s_5 = St. 5  h_5 = 701         {701 … 997}                   [701, 998)
s_6 = St. 6  h_6 = 998         {998 … 1023} ∪ {0 … 100}      [998, 1023] ∪ [0, 101)

TABLE 2.1: Values of hash identifiers and hash segments for each station in the example.

2.4.1 Defining the ring

We must first define ring R = (Ω, r, ξ, φ, ψ). Remember that hash function φ is the last quantity we will define. Each station s_i ∈ Ω must first define an identifier and compute hash function ξ on it in order to calculate its hash identifier h_i. As shown in table 2.1, once each station receives its hash identifier, hash segments are defined so that routing is possible in the ring.

2.4.2 Defining the formatting impulse

According to the second point of proposition 12, we must move on to defining the formatting impulse to use in order to achieve balancing. We can choose to either define one impulse type for all stations or a different impulse type for each one of them. We are going to choose the first option for 2 reasons:

1. Choosing one single impulse type is easier from a computational point of view, as it requires formulating its parametric antiderivative only once.
2. Choosing more impulse types has not proved, so far, to be any more beneficial than using a single one. The quality of the balancing is not impacted by this choice⁹.

For the sake of simplicity, we are going to consider a very simple impulse type, the rectangular impulse:

g_{r_1,r_2,A}(r) = A · Π((r − r_1)/(r_2 − r_1))  (2.26)

Having:

Π(r) = { 1  if 0 ≤ r ≤ 1
         0  otherwise

The impulse we have chosen in equation 2.26 is compliant with definition 17. Since the proof is immediate, we will not cover it.

⁹ This is based on observations from simulations run so far.
No formal proof supports or denies this claim, though.
2.4.3 Binding impulses to stations

Now we need to bind every station's HS to an impulse in order to get a collection of impulses, all parametric with respect to their amplitudes:

s_1 ⟹ g_1 = g_{101,210,A_1}
s_2 ⟹ g_2 = g_{210,340,A_2}
s_3 ⟹ g_3 = g_{340,553,A_3}
s_4 ⟹ g_4 = g_{553,701,A_4}
s_5 ⟹ g_5 = g_{701,998,A_5}
s_6 ⟹ g_{6,1} = g_{998,1023,A_{6,1}}, g_{6,2} = g_{0,101,A_{6,2}}

Remark (Impulse for last station). The last station in the ring gets special treatment because it may span two hash intervals, since it includes both the highest hash h^M and the lowest one (the null hash). Thus 2 impulses are actually used.

2.4.4 Calculating amplitudes

At the moment, we have a collection of N + 1 = 7 impulses, all parametric with respect to amplitudes. We need to find the values of those amplitudes so that the impulses define f_φ in a way that lets hash function φ balance the load in the ring. Equations 2.17 and 2.18 will be used to size those impulses. Since the sizing equations require the computation of the impulse's antiderivative, we need to calculate it first. Given its extremely simple definition, the calculation is almost immediate:

G_{r_1,r_2,A}(r) = ∫_{0}^{r} g_{r_1,r_2,A}(x) dx = A · [Π((r − r_1)/(r_2 − r_1)) · (r − r_1) + (r_2 − r_1) · H(r − r_2)]

Where H(r) is Heaviside's step function¹⁰:

H(r) = { 1  if r > 0
         0  otherwise

In order to apply equations 2.17 and 2.18, we need to calculate the following quantity (we will do the binding to hash segments later):

[G_{r_1,r_2,A}]_{r_1}^{r_2} = [A · [Π((r − r_1)/(r_2 − r_1)) · (r − r_1) + (r_2 − r_1) · H(r − r_2)]]_{r_1}^{r_2} = A · (r_2 − r_1)

We can now apply equation 2.17 to g_1 … g_5:

¹⁰ Heaviside's function exists in the literature in different forms; here we consider the variation where the function assumes only 2 values: 0 and 1.
[G_{h_k,h_k^M,A_k}]_{h_k}^{h_k^M} = N^{-1} ⟹ A_k = (h_k^M − h_k)^{-1} · N^{-1}, ∀k = 1 … 5

The same goes for g_{6,1} and g_{6,2}, where we apply equations 2.19 and 2.20:

[G_{h_N,h^M,A_{N,1}}]_{h_N}^{h^M} = 1/(2N) ⟹ A_{N,1} · (h^M − h_N) = 1/(2N) ⟹ A_{N,1} = 1/(2(h^M − h_N) · N)
[G_{0,h_1,A_{N,2}}]_{0}^{h_1} = 1/(2N) ⟹ A_{N,2} · h_1 = 1/(2N) ⟹ A_{N,2} = 1/(2h_1 · N)

We now have the values of all impulses:

A_1 = (h_2 − h_1)^{-1} · N^{-1} = (210 − 101)^{-1} · 6^{-1} = 654^{-1}
A_2 = (h_3 − h_2)^{-1} · N^{-1} = (340 − 210)^{-1} · 6^{-1} = 780^{-1}
A_3 = (h_4 − h_3)^{-1} · N^{-1} = (553 − 340)^{-1} · 6^{-1} = 1278^{-1}
A_4 = (h_5 − h_4)^{-1} · N^{-1} = (701 − 553)^{-1} · 6^{-1} = 888^{-1}
A_5 = (h_6 − h_5)^{-1} · N^{-1} = (998 − 701)^{-1} · 6^{-1} = 1782^{-1}
A_{6,1} = (h^M − h_6)^{-1} · (2N)^{-1} = (1023 − 998)^{-1} · 12^{-1} = 300^{-1}
A_{6,2} = h_1^{-1} · (2N)^{-1} = 101^{-1} · 12^{-1} = 1212^{-1}

2.4.5 Computing functions

Now that all impulses have been properly sized and we have their values, function f_φ is fully defined. As a direct result, function F_φ is also fully defined. Given its simplicity, F_φ can be easily inverted piecewise.
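The amplitude values above can be cross-checked numerically. The sketch below assumes the rectangular impulse of equation 2.26 and its antiderivative; the helper names are ours.

```python
def G(r, r1, r2, A):
    """Antiderivative of the rectangular impulse g_{r1,r2,A} (eq. 2.26)."""
    if r < r1:
        return 0.0
    if r > r2:
        return A * (r2 - r1)
    return A * (r - r1)

h = [101, 210, 340, 553, 701, 998]   # hash identifiers of the example
h_max, N = 1023, 6

# Sizing equation 2.17 for stations 1..5: A_k = ((h_{k+1} - h_k) N)^-1.
A = [1.0 / ((h[k + 1] - h[k]) * N) for k in range(5)]
# Equations 2.19/2.20 for the last station's two impulses:
A61 = 1.0 / (2 * (h_max - h[5]) * N)
A62 = 1.0 / (2 * h[0] * N)

assert [round(1 / a) for a in A] == [654, 780, 1278, 888, 1782]
assert round(1 / A61) == 300 and round(1 / A62) == 1212

# Total area of f_phi must be 1 (theorem 9):
area = sum(G(h[k + 1], h[k], h[k + 1], A[k]) for k in range(5)) \
     + G(h_max, h[5], h_max, A61) + G(h[0], 0, h[0], A62)
print(round(area, 12))  # 1.0
```

The five full-width impulses each contribute an area of 1/6 and the two half-impulses contribute 1/12 each, so f_φ integrates to 1 as theorem 9 requires.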
31

Chapter 3

Simulation results

In this chapter we are going to describe the simulations which were performed in order to validate, from a practical standpoint, the results analytically achieved in chapter 2.

As a generic overview, two different simulation systems were designed and developed:

• Regular simulations A high-level engine developed in Matlab¹ and Mathematica² targeting small-size simulations in order to produce real data to validate the whole system.
• High performance simulations A low-level engine developed in C/C++ and targeting large-size simulations in order to produce high-fidelity data to validate the system in real-life conditions.

Both solutions were used to prove that the analytical results of chapter 2 do provide a valid description of the system's behaviour.

3.1 Small-size simulations

This simulation engine was developed to generate results in the context of a controlled environment where conditions are similar to those in real life. The main features of this simulation set are:

• Functional definition of impulses and functions.
• Real hashes are calculated using the standard Crypto library³.
• All big integers are normalized into a smaller interval.

Of course, given its nature, the engine comes with some limitations and some downsides too:

• Even though real hashes are used, their values are normalized to fit a smaller interval. Thus, these values cannot be considered high fidelity.
• Simulations are slow. Given the application of functional calculus, impulse functions and their antiderivatives are defined in open form, thus requiring numerical integration to be performed every time.

¹ Mathworks Matlab https://www.mathworks.com/products/matlab.html.
² Wolfram Mathematica https://www.wolfram.com/mathematica/.
³ The OpenSSL library was used to compute regular hashes. More information available in appendix A.
• Given the different subsystems being used, numerical accuracy is not guaranteed.

On one side, this set of simulations is characterized by a relatively easy implementation; thus it comes with certain intrinsic limitations (mainly related to the subsystems being used). The other set of simulations is meant to target those issues and provide better numerical fidelity.

FIGURE 3.1: The Polar Hash Coverage Plot (PHCP) of a simulation on an N = 10 station ring after sending m = 10³ packets ("PHCP non-balanced case" on the left, "PHCP in balanced case" on the right). Both plots show the configuration of the station hash segments together with the final load levels at the end of the simulation. The plot on the left refers to a normal ring (hash function ξ applied), the one on the right refers to an extended ring where hash function φ based on the same ξ is considered. The same packets were sent in both rings.

3.1.1 Verifying load balance

One of the most basic sets of simulations is used to verify that the algorithm effectively helps the ring achieve load balance given different hash segment distributions among stations in the hash range interval [0, h^M]. These simulations perform the following operations:

1. The hash space is divided into N random parts, each assigned to one station.
2. A total number of m packets (random numeric vectors) are generated and fed to the hash function, which can be either ξ or φ (both are considered in order to compare loads per station at the end of one simulation).
3. Packets are assigned to stations according to routing function ψ based on the selected hash function.
4. Final results are collected: the number of packets per station is tracked.
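The four steps above can be sketched as a toy simulation loop. SHA-256 (truncated) stands in for the unspecified hash function ξ, and all names, sizes and seeds below are illustrative assumptions of ours, not the engine's actual parameters.

```python
import hashlib
import random

def simulate(N, m, l_bits=32, seed=7):
    """Toy version of the small-size simulation loop: random hash
    segments, m random packets, routing by xi-hash only (the
    unbalanced baseline)."""
    rng = random.Random(seed)
    h_max = 2 ** l_bits - 1
    # Step 1: split the hash space at N random points (station identifiers).
    ids = sorted(rng.randrange(h_max) for _ in range(N))
    loads = [0] * N
    # Steps 2-3: generate random packets, hash them, route each one
    # to the station owning the segment its hash falls into.
    for _ in range(m):
        pkt = rng.randbytes(64)
        xi = int.from_bytes(hashlib.sha256(pkt).digest()[: l_bits // 8], "big")
        if xi < ids[0] or xi >= ids[-1]:
            k = N - 1                  # last station wraps around
        else:
            k = next(i for i in range(N - 1) if ids[i] <= xi < ids[i + 1])
        loads[k] += 1
    # Step 4: final result, packets per station.
    return loads

loads = simulate(N=6, m=3000)
print(sum(loads))  # 3000
```

Because the baseline routes by ξ only, each station's load is roughly proportional to its segment length, which is exactly the imbalance the φ runs are compared against.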
FIGURE 3.2: Load state |s_k| (in blue) in each station s_k as time grows, for stations St. 1 through St. 9. In this simulation, hash function ξ is used (normal ring). The green line shows the expected load state (uniform) for each point in time.

Definition 19 (Polar Hash Coverage Plot (PHCP)). Let R = (Ω, r, ξ, φ, ψ) be a ring with |Ω| = N stations. The Polar Hash Coverage Plot is a set of N vectors in the 2D space:

E = { A_k · e^{i·ω_k}, k = 1 … N }

Every vector's amplitude indicates the packet load relative to the station it refers to, while the phase indicates the station's hash segment amplitude and position in the ring:

A_k = η_k / m
ω_k = |Ξ(s_k)| / h^M + ω_{k,0}

Where ω_{k,0} = ∑_{j=0}^{k−1} ω_j indicates the phase shift due to all stations preceding s_k.

Figure 3.1 shows the PHCP of the same simulation in which the same m packets have been sent to the network, with and without load balancing hash function φ in place. As it is possible to see, the vectors in the second plot (on the right) have roughly the same amplitude in comparison with the first diagram (on the left), indicating that hash function φ is effectively able to provide balancing on the same set of packets across the stations in the ring.

3.1.2 Evaluating load levels per station

Another set of simulations is used to measure the difference between the final load state in each station and the expected one (uniform) after the network has been fed with a certain number of packets.
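Definition 19 can be turned into a small computation. The phase formula below follows our reading of the definition (segment length normalized by h^M, plus the accumulated shift of all preceding stations), which is an assumption; the function name and the toy numbers are ours.

```python
import cmath

def phcp_vectors(loads, seg_lengths, m, h_max):
    """Compute the PHCP vectors of definition 19: amplitude = relative
    load eta_k / m; phase = normalized segment length plus the shift
    accumulated over all preceding stations."""
    vectors, omegas = [], []
    for eta, seg in zip(loads, seg_lengths):
        omega_k = seg / h_max + sum(omegas)   # omega_{k,0} = sum of previous omegas
        vectors.append((eta / m) * cmath.exp(1j * omega_k))
        omegas.append(omega_k)
    return vectors

# Toy usage: 3 stations, 100 packets, h_max = 1023.
E = phcp_vectors([30, 50, 20], [300, 500, 223], 100, 1023)
print(round(abs(E[1]), 9))  # 0.5: the vector amplitude is the relative load
```

In a perfectly balanced ring all vectors have amplitude 1/N, which is why the balanced PHCP in figure 3.1 looks like a set of equal-length spokes.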
FIGURE 3.3: Load state |s_k| (in blue) in each station s_k as time grows, for stations St. 1 through St. 9. In this simulation set (same as in figure 3.1), hash function φ is used (extended ring). The green line shows the expected load state (uniform) for each point in time.

These simulations also have the objective of showing how the network behaves with and without balancing hash function φ in place. These normal vs. extended ring scenarios are important as they allow us to visually assess the work done by φ in reshaping the load distribution in the network. For this analysis to be effective, it is crucial that both scenarios are evaluated on the exact same set of generated packets. To guarantee this condition, when randomly generating packets, the same seed is used when evaluating the normal and the extended ring during one simulation session.

Figures 3.2 and 3.3 show, respectively, the same simulation session first conducted on the normal ring and then again on the same ring but extended (φ in place). As time grows, each station reports its load level. In the normal ring, station load levels do not all meet the expected load level η = m/N. On the other hand, when φ is in place (figure 3.3), all station loads tend to match the expected levels.

Remark (Discrete time). This set of simulations is very important as the load state in each station is evaluated over time. In this context, time is considered discrete and time instants are associated with events. The only event being considered here is the generation of a random packet.
3.2 Large-size simulations This simulation engine was developed for two reasons: getting high fidelity simu- lation data, and providing an initial implementation of the algorithm. As a direct
result, we could deliver the first implementation of the algorithm described in the previous chapter. The main features of this system are the following:

• Being developed in C/C++, the application is very fast at computing regular hashes and performing φ-hash processing.
• Simulations can be run sequentially or in parallel (packet generation).
• The standard Crypto library is used, therefore all generated hashes are real hashes and not simulated quantities.
• Big integers are employed, so no scaling is performed to adapt real data to simulation artifacts, hence providing more fidelity to real scenarios.

Simulation flow In the context of this simulation effort, several computation- and memory-intensive runs have been scheduled on a dedicated pool of servers. A detailed description of the infrastructure being used is available in appendix A; here we provide a brief synopsis of how these simulations work:

1. When the engine starts, an initialization phase sets up memory and other preconditions.
2. Random packets are generated as random bitstreams of specific size. Different sizes can be specified, and during one simulation the size can range in a certain interval.
3. Hashes (using ξ and φ) are computed.
4. Routing of packets is performed for each hash by using application ψ.
5. All results are persisted in memory. Data manipulation is then performed in order to extract the information of interest.
6. Simulation output files are generated.
7. Post-processing is performed by generating diagrams and aggregated quantities from the output files.

3.2.1 Overview

Many simulations have been run, all targeting different network structures and conditions. Before showing results, we need to provide a synopsis of which configurations have been considered in order to understand what was actually simulated. Every simulation run is characterized by the following properties:

• Number of stations in the ring: N.
This parameter directly impacts the size of the network.
• Number of generated packets: m.
• Leaf radius r. For all simulations, the radius is unitary: r = 1.
• Packet size S ∈ {100 Kb, 1 Mb, 3 Mb, 10 Mb}.

Since the same configuration can be run multiple times with different seed values, aggregate properties describing one simulation group/batch include:
• Number of simulations in batch: C ∈ ℕ.
• Overall simulation time of the batch: T ∈ ℝ.

Grouping simulations The simulations conducted in the context of this research can be classified using the parameters described above. Each tile of the original configuration diagram encodes the hash functions computed (ξ, φ, top-left corner), the seed (#-symbol, bottom-left corner), the packet size S, the number of generated packets m and the station count N, plus a PAR marker for the parallel intensive runs. The batches that were run are summarized below:

Type                  Packet size S   Gen. packets m   Stations N   Runs
regular               100 Kb          13.1M            10           30
regular               1 Mb            13.1M            10           10
parallel intensive    1 Mb            89M              30           49
parallel intensive    3 Mb            89M              30           10
parallel intensive    10 Mb           89M              30           1
parallel intensive    1 Mb            10M              50           10
parallel intensive    1 Mb            10M              100          10

For each simulation, a different seed was used, and both regular and φ-hashes were computed.
FIGURE 3.4: Standard deviation σ vs dispersion factor σ²/η of generated ξ-hashes (top row) and φ-hashes (bottom row) during simulation batches (from left to right): N = 10 (40 simulations), N = 30 (60 simulations) and N = 50 (10 simulations).

3.2.2 Evaluating the variance of hash segment amplitudes

Two pieces of information were of interest and, accordingly, two different types of data were extracted from every simulation:

1. The statistical variation of regular hash values and φ-hash values, in order to see whether patterns exist.
2. The statistical relation between the distribution of hash segment amplitudes and the distribution of φ-hash values. Since more φ-hashes are routed into a specific segment if that segment has a small amplitude, we want to assess whether special patterns arise, in case of high variance in segment amplitudes, when observing φ-hash values.

Figure 3.4 reports possible patterns between variations of regular and φ-hashes. In general we can conclude that φ-hashes have a more localized behaviour, as their variations are more contained than those of regular hashes obtained via hash function ξ. This is expected: if we consider the whole hash space [0, h^M] ⊂ ℝ, hash function ξ has a uniform distribution over that range; on the other side, φ is characterized by a distribution which allocates hashes with different probabilities in different sub-intervals of the overall hash range. This last observation is the main reason why we want to investigate the relation between the variance of segment lengths and the variance of φ-hashes.
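The two quantities plotted in figure 3.4 can be computed as follows. Here the dispersion factor σ²/η is interpreted as variance over mean (an assumption on our part), and the two hash samples are synthetic stand-ins rather than simulation output.

```python
import random
import statistics

def dispersion_stats(hashes):
    """Standard deviation and dispersion factor (variance over mean,
    our reading of sigma^2 / eta) of a sample of hash values."""
    var = statistics.pvariance(hashes)
    return var ** 0.5, var / statistics.fmean(hashes)

# Synthetic stand-ins: xi-hashes uniform over the whole space, phi-hashes
# concentrated in a narrower sub-interval by the balancing map.
rng = random.Random(1)
xi_sample = [rng.uniform(0, 1023) for _ in range(10_000)]
phi_sample = [rng.uniform(400, 600) for _ in range(10_000)]

sd_xi, _ = dispersion_stats(xi_sample)
sd_phi, _ = dispersion_stats(phi_sample)
print(sd_phi < sd_xi)  # True: phi-hashes are more localized
```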
Hashes and segment amplitudes As anticipated, the following questions were of interest with regard to the behaviour of φ-hashes and the distribution of hash segment lengths |Ξ(s_i)|, ∀s_i ∈ Ω:

1. If all stations in the ring are arranged in a way such that the distribution of hash segment lengths is approximately uniform, what behaviour should we expect from φ-hashes?
FIGURE 3.5: Standard deviation of hash segment lengths and standard deviation of φ-hashes during each simulation in batches (from top to bottom): N = 10 (40 simulations), N = 30 (60 simulations) and N = 50 (10 simulations).

2. If all stations in the ring define very different hash segments (some very wide and some very short), what behaviour should we expect from φ-hashes?

The diagrams in figure 3.5 try to capture such behaviour and describe it from a statistical point of view. Both questions raised above can be mathematically mapped onto one statistical descriptor which, therefore, becomes of high interest in this context: the standard deviation of hash segment amplitudes and of φ-hashes.

By looking at those diagrams, we can observe a very weak trend: the variance of φ-hashes tends to be higher as the variance of hash segment lengths grows. As pointed out, this is classifiable as a pattern only in a very prudent way, as the trend is not immediately evident and there are some cases where such a trend
does not show up. Our conclusion is that a correlation between segment lengths |Ξ(s_i)| and hashes h_φ is probably present; however, more variables are involved and more investigation in this regard is necessary.

FIGURE 3.6: Station loads η_k^{(ξ)} (no balancing) and η_k^{(φ)} (balanced ring) at the end of four N = 30 simulations with different seeds.

3.2.3 Evaluating load levels per station

This set of high performance simulations has been used, of course, to verify the quality of the balancing performed by hash function φ. Figure 3.6 shows station loads in the context of four different simulations with 30 stations. As it is possible to see, packets are balanced across stations and the balancing is evident when comparing loads to simulations where no balancing is performed.

Migration flows

A concept extremely important in the context of these simulations and, more generally, in the context of this research effort, is the following:

Definition 20 (Migration flow ξ-φ). Let R = (Ω, r, ξ, φ, ψ) be a ring and p ∈ P a packet. Let s_i = ψ^{(ξ)}(p) be the station where the packet is routed to by using hash function ξ, and
let s_j = ψ^{(φ)}(p) be the station where the packet is routed to by using hash function φ. The virtual transition that packet p experiences from s_i to s_j is called a migration flow.

FIGURE 3.7: Migration flows in an N = 30 ring.

Definition 20 is the foundation of a differential analysis conducted during all simulations. By collecting all hashes and mapping them to stations, it is possible, at the end of the simulation, to extract all migration flows. There are as many flows as generated packets; the final step is aggregating this information and counting duplicates.
Visualizing migration flows is difficult using tables, thus circo-diagrams4 are employed instead. Figure 3.7 shows migration flows for a simulation with 30 stations and 1M packets generated. The diagram provides a clear description of how packets virtually move from one station to another when hash function φ is used for routing. Thanks to these diagrams it is possible to state the following:
Proposition 13 (Packet migrations). Let R = (Ω, r, ξ, φ, ψ) be a ring; then stations behave in two different ways:
• Wide-coverage stations are more likely to donate packets to other stations.
• Narrow-coverage stations are more likely to accept packets from other stations.
4 Circo-diagrams have been generated using the software Circos: http://circos.ca/. To read these diagrams, see the on-line documentation.
  • 57. 3.2. Large-size simulations 41
It is also worth noticing that all migration flows are localized in adjacent stations when considering one node in the ring. This pattern is interesting because, differently from expectations, the re-arrangement performed by φ does not move packets far from their ξ-selected station.
  • 59. 43 Chapter 4 System API
In this chapter we describe the different interactions the system exposes to the end user for storing and retrieving data, and the protocols used in the ring to provide those services. In chapter 1 we covered the system architecture. As we recall, the end user is able to interact with the storage system in order to take advantage of its services; what happens on the other side, in the ring, is not known to the user. The questions we want to answer are: "What happens when the user sends data to be stored?", "How can the user retrieve data he previously stored?".
The overall system exposes a minimal API consisting of 4 primitives:
1. Store By means of this functionality, the user can transmit a DU and have it persisted in the system. Typically, upon invoking this API, the user receives some data in return, a token, which will be used later for retrieving that same data.
2. Retrieve By invoking this API, the user can retrieve data previously stored in the system. If the operation is successful, the user receives his data in return.
3. Remove The user can decide to remove data he previously stored by invoking this primitive. No data is returned after invoking the API except a status code indicating whether the removal was successful, plus optional additional information (like the amount of total data that was deleted).
4. Update This functionality is used to update existing data. It effectively results in the sequential application of a remove and a store invocation.
We are going to examine in detail the first 2 primitives, store and retrieve, as the others are largely based on this pair.
4.1 Storing data
The API for transmitting a DU and having it persisted in the system requires the user to provide, as input, the byte stream.
An identification token is returned if the call is successful:
t_token store(stream_t& input)
The moment the user invokes store on DU p ∈ P, 2 things happen in sequence:
1. The token is computed by calculating the hash of the input packet: h = ξ(p).
  • 60. 44 Chapter 4. System API
2. DU p's size is considered together with fragmentation threshold c ∈ N. If p exceeds the threshold, |p| > c, then the packet is fragmented into smaller units.
The token is returned to the user in case the storage process is successful. Given the DHT and content addressing, it is possible to retrieve the DU later by using that specific hash.
4.1.1 Packet fragmentation
The fragmentation process is necessary for some important reasons:
• The ring has a high level of control traffic. Given the DHT and routing algorithm ψ, many transmissions occur between contiguous stations in the network. In order to reduce the latency of communications, the network tends to favour quantity over size, thus allowing many packets to be exchanged as long as their size is small enough.
• When a packet reaches a station which is not the final destination, another routing iteration is necessary. This means that another communication must be performed with one of the contiguous stations in the ring. However, if that link is in use, because another packet is being transmitted, then the incoming one must be queued. In order to reduce station traversal times for packets (while hopping because of routing), packet size is kept to a reasonably low level.
• By dealing with small DUs, it is possible to ensure better balancing over time. If data were stored without being broken down into smaller pieces, we could not ensure that units of the same size are stored across stations; this goes against one of the assumptions of our balancing algorithm: all data units have the same size.
As a packet is submitted for storage, the fragmentation process breaks it down into n smaller units:
n = ⌈|p| / c⌉
In case the original DU is fragmented into smaller units, the final returned token is still the hash of the original packet. Later on we will see that, by using the same token, all fragments can be retrieved back.
For this to happen, it is necessary to create fragments in a specific way:
Definition 21 (Packet fragmentation). Given packet p ∈ P and fragmentation threshold c ∈ N (number of bytes), the application ζ : P → 2^P returns the set of fragments {p_1, p_2, . . . , p_n} for input p. Every returned fragment has the following format:
1. The hash of the original packet.
2. The sequence number of the fragment (needed when re-constructing the packet).
3. The hash of this fragment.
4. The data stream (up to c bytes).
As shown in figure 4.1.
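The application ζ of Definition 21 can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 stands in for the document's hash function ξ, and the `Fragment`/`zeta` names are ours, not the thesis' API.

```python
import hashlib
from dataclasses import dataclass

def xi(data: bytes) -> bytes:
    """Stand-in for hash function xi (here: SHA-256)."""
    return hashlib.sha256(data).digest()

@dataclass
class Fragment:
    parent_hash: bytes  # 1. hash of the original packet, xi(p)
    seq: int            # 2. sequence number, for re-construction
    frag_hash: bytes    # 3. hash of this fragment, xi(p_k)
    data: bytes         # 4. up to c bytes of payload

def zeta(p: bytes, c: int) -> list[Fragment]:
    """Application zeta: split packet p into ceil(|p| / c) fragments
    of at most c bytes each, per Definition 21."""
    parent = xi(p)
    chunks = [p[i:i + c] for i in range(0, len(p), c)]
    return [Fragment(parent, k, xi(chunk), chunk)
            for k, chunk in enumerate(chunks, start=1)]

frags = zeta(b"x" * 10, c=4)  # 10 bytes, threshold 4 -> 3 fragments
```

Note how every fragment carries the parent hash, so that all fragments of one packet can later be related back to the single token ξ(p).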
  • 61. 4.1. Storing data 45
[Figure 4.1: frame layout — control info fields ξ(p), seq. k, ξ(p_k), followed by up to c bytes of data.]
FIGURE 4.1: Data unit format.
Remark. The frame format for non-fragmented packets, called whole packets, is the same; however, the first field (parent hash) is null (all zeros) and the sequence number is −1, which is the value that can be inspected to distinguish fragments from whole packets.
4.1.2 Routing
After the fragmentation phase, which may produce no fragments if the original packet's size does not exceed threshold c, every fragment p_k ∈ P is sent to the ring to be routed according to routing function ψ, based on balancing hash function φ.
As anticipated in chapter 1, the ring is never directly accessed by users. Proxies are employed instead. A proxy station serves as an intermediary entity to balance access to the ring and to hide all of the ring's stations from the outside world. When the store primitive is invoked, the system, from the user's computer, sends a store request SReq to one of the known available proxies. When receiving a request, the proxy decides which station of the ring to pick for letting SReq enter the network. The decision is made by a balancing algorithm based on each known station's link usage: the proxy's goal is to avoid overloading one station with incoming traffic.
Once the request reaches one of the stations, algorithm 1 will do the job and guarantee that a station is found for the packet. The system client ensures that all packets undergo the same process. If every fragment is successfully stored, the original packet's hash is returned to the user as a token for retrieving all fragments.
Asynchronous communications
For performance reasons, the best approach is making every communication asynchronous.
It means that when a node (one of the stations, a proxy or the user client node) sends a SReq, it does not keep the connection open, waiting for the final response, until the packet is successfully routed. It is much better to send the request as a datagram transmission: the sender will receive a store response SRes when its request has been processed. Every intermediate node that passes the request forward will wait for its response and, after receiving it, will construct its own response for the node that sent the request to it in the first place. This increases the available bandwidth, as links will not be held for long times.
Employing asynchronous transmissions complicates the communication protocol but allows better performance. One of the complications is represented by the timers which every station has to implement in order to raise an error when a response does not get delivered within a reasonable time (request transmission failure). In a synchronous scheme, timers are handled by the transmission protocol (e.g. TCP/IP) transparently to the caller; in asynchronous scenarios, however, the station has to implement timers on its own for each sent request. Figure 4.2 shows both
  • 62. 46 Chapter 4. System API
[Figure 4.2: sequence diagrams (User, Proxy, Entry station, Dst station) of the SReq/SRes exchanges for store(stream) returning a token, in the synchronous and asynchronous cases.]
FIGURE 4.2: Synchronous vs. asynchronous communication model when storing a single packet.
communication schemes.
As part of the effort of writing tests and simulations for the algorithm, an actual implementation of the ring has been developed in Microsoft .NET using the communication library WCF1. Today it is possible to implement asynchronous transmissions quite easily, as the IT industry has moved in that direction, providing developers with the set of APIs required to implement such protocols.
4.2 Retrieving data
The other side of the story, a little more complicated, is about getting data back. We are going to cover this topic by considering the 2 possible scenarios:
1. Retrieving a whole packet.
2. Retrieving a fragmented packet.
In both cases, the process always starts with the same set of operations: the user has a token received when storing data in the past and utilizes it to retrieve that stream back via the retrieve primitive:
1 Microsoft's Windows Communication Foundation: a library consisting of a collection of highly customizable and flexible network protocols.
  • 63. 4.2. Retrieving data 47
[Figure 4.3: packet info layout — ξ(p), total n, followed by the fragment hashes ξ(p_1) . . . ξ(p_n).]
FIGURE 4.3: Packet info format.
stream_t& retrieve(t_token t)
Retrieving a whole packet
As soon as the user invokes the retrieve primitive, a retrieve request RReq message is built, through the proxy, and routed in the ring. The token is the hash of the original DU, so, by following the DHT retrieval, the request is routed to the destination station. Once there, the station will search its database to find the stored stream. In order to achieve good performance in the packet search process, a dictionary can be used inside every station. Since DUs are saved according to the format shown in figure 4.1, the hash of the stream is always available and can be used for looking up that specific packet when a RReq is routed to a station.
As soon as the stream is retrieved, it can be sent back to the request originator: the end user, who will receive the DU in return from the retrieve call.
Retrieving a fragmented packet
When the packet was fragmented at the time it was stored in the ring, a problem occurs. When the request is sent and reaches the destination station based on the token (the original packet's hash), nothing is found. The original packet has been fragmented and each fragment has a different hash, completely unrelated to the token (since we use cryptographic hashing, there is no way to get the original stream back from the hash).
In order to solve this issue we could store all of the packet's fragments' hashes into the token, which would then become an array of hashes and grow in size. Although this solution might work, it is not desirable: the user should be able to locate all fragments just by holding the original packet's hash. In order to do so, we need to modify the store protocol. After a packet p ∈ P has been fragmented into n units p_k ∈ P, before transmitting them, a packet info unit is constructed:
Definition 22 (Packet info DU).
Given packet p ∈ P such that its size exceeds the fragmentation threshold, |p| > c, a special data unit is built to track information about the packet and all its fragments. The stream contains the following fields:
1. The original packet's hash h = ξ(p).
2. The number of fragments n.
3. The hash of each single fragment p_k (k = 1 . . . n), in order (from first to last).
As shown in figure 4.3.
In the revised store protocol, before sending each single fragment to be stored, thus before calling store on each single fragment, the same primitive is called on the packet info DU, which has been built right after computing all hashes (original packet and its fragments). This initial call routes the packet info to a station by using the original packet's hash.
Thanks to this approach, when retrieving a DU, the first RReq will reach the station where the system will find the packet info. Using that, the system will then
  • 64. 48 Chapter 4. System API
[Figure 4.4: sequence diagram (User, Proxy, Entry station, Dst station, Station) — retrieve(token) first looks up the pkt-info, then loops over all fragments p_k with retrieve(token p_k), finally aggregating them into packet p.]
FIGURE 4.4: Sequence diagram showing the retrieval protocol in case of a fragmented packet.
issue n retrieve calls in order to fetch each single fragment. Later, after getting all streams, the original DU can be built; the order in which fragments are combined is given by the sequence number in each retrieved fragment packet.
As shown in figure 4.4, the process of retrieving and building a stored packet might take some time: not only does the ring size influence this latency, the number of fragments plays a significant role too. It goes without saying that a larger packet requires more time to be fully retrieved.
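The revised store/retrieve protocol can be sketched end-to-end. This is a simplification for illustration, not the thesis' implementation: an in-memory dictionary stands in for the whole DHT (so routing, proxies and SReq/RReq messages are elided), and the names `store_fragmented`/`retrieve_fragmented` are ours.

```python
import hashlib

ring = {}  # stand-in for the DHT: hash -> stored value

def xi(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # stand-in for hash function xi

def store_fragmented(p: bytes, c: int) -> bytes:
    """Store packet p: fragments keyed by their own hash, plus a
    packet-info entry (Definition 22) keyed by the original hash."""
    token = xi(p)
    chunks = [p[i:i + c] for i in range(0, len(p), c)]
    info = [xi(ch) for ch in chunks]   # fragment hashes, in order
    ring[token] = ("pkt-info", info)   # routed via the original hash
    for h, ch in zip(info, chunks):
        ring[h] = ("fragment", ch)
    return token

def retrieve_fragmented(token: bytes) -> bytes:
    """First RReq fetches the packet info; then one retrieve per
    fragment; finally the fragments are aggregated in order."""
    _, info = ring[token]
    return b"".join(ring[h][1] for h in info)

token = store_fragmented(b"hello world", c=4)
assert retrieve_fragmented(token) == b"hello world"
```

The sketch makes the latency argument visible: retrieving a fragmented packet costs one pkt-info lookup plus n fragment lookups, each of which, in the real ring, is a routed RReq/RRes round trip.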
  • 65. 49 Chapter 5 Dynamic conditions
In the previous chapters we have described and analyzed the behaviour of the ring under static conditions.
Definition 23 (Dynamic conditions). Let R = (Ω, r, ξ, φ, ψ) be a ring. We say the network is under dynamic conditions when any of its characterizing elements changes:
1. Stations s_i ∈ Ω. Stations might disconnect or new stations might extend the ring. This possibility also covers the event of stations faulting and going off-line.
2. Leaf-set radius r changes.
3. Any of the connections in the overlay ring changes.
4. Hash function ξ or φ changes.
5. Routing strategy ψ changes.
Static conditions are the opposite of dynamic: the ring does not change and remains the same. So, why do we need to talk about dynamic conditions? Why should the ring change? Ideally, if well designed, the system can be configured with a certain number of stations and a certain radius, and work optimally under static conditions. However, today every system is exposed to dynamic conditions, as many different planned or unplanned events may occur:
1. One station enters a faulty state. This can happen for any reason, like a hardware issue (e.g. hard disk failure, data corruption, etc.) or a software problem (e.g. system failure, emergency system reboot, etc.).
2. Stations can experience network issues. This can cause either a permanent offline state or a temporary one, if machines have a way to automatically recover from these types of failures.
3. More stations are required because the system needs to serve a higher volume of data (planned scale-up).
4. One or more stations need to undergo planned or unplanned maintenance.
5. Security-related issues force some stations to be pulled away from the ring.
Those enumerated above are only a few possibilities. The point here is that a storage system must take into account such circumstances, which are part of the real world of connected systems.
  • 66. 50 Chapter 5. Dynamic conditions
When dynamic conditions are in place, the ring structure and the balancing algorithm described so far need to be revised and modified in order to avoid performance degradation and, in more critical cases, service outage. We are going to examine the following dynamic cases:
• Scalability The ability of the ring to grow or shrink in a flexible way, causing the least possible performance degradation.
– Station join A station joins the ring, causing it to expand.
– Station removal A station is pulled off the ring, causing it to shrink.
• Fault conditions One station experiences internal problems which cause it to be unresponsive.
5.1 Scalability
What happens when a station joins the ring? When such an event occurs, there are a few operations that need to be considered to re-initialize the ring:
1. The new station needs to build its leaf-set in order to identify its successors and predecessors.
2. All nodes in the neighbourhood of the new station must re-arrange their leaf-sets in order to update their successors or predecessors, depending on the leaf-set radius r.
3. Balancing hash function φ must be re-designed, as the ring has now changed. Since we have more stations, we have different hash segments, and this impacts function φ's implementation.
The first 2 operations are infrastructural and can be addressed through well-known protocols currently employed in DHT-style networks; since the problem is nothing new, we are not going to spend more time on it. The 3rd point, though, is a different story, as it poses a new situation inside our network architecture: stations must be synchronized to use a new balancing hash function φ.
Lemma 12 (Balancing hash function φ's outdatedness upon ring scaling). Let R = (Ω, r, ξ, φ, ψ) be a ring with N = |Ω| stations. Consider, at any point in time, one station joining R or being pulled out of it, causing the number of stations to become N′ = N ± 1.
Then hash function φ is no longer suited for balancing the ring.
Proof. Immediate by considering proposition 12. According to it, hash function φ depends on the number of stations in the ring; if that changes, the hash segments affecting F_φ's codomain change too, hence the original hash function φ no longer reflects the state of the network.
The main problem we want to face here is the process of synchronizing stations in the ring, and it consists of:
1. Computing the new balancing hash function φ′.
2. Updating all stations to use the new hash function φ′.
3. Re-arranging packets across stations to restore the balancing state of the ring.
  • 67. 5.1. Scalability 51
The last point is actually crucial. We are going to assume that a station joining the ring comes with no packets stored in it, because any other scenario does not make sense. When the new station s∗ is on-line in the ring, the load distribution changes from:
Σ = (|s_1|, |s_2|, . . . , |s_N|), |s_i| ≈ m · N⁻¹, ∀i = 1 . . . N
to this form:
Σ′ = (|s_1|, |s_2|, . . . , |s∗| = 0, . . . , |s_N|), |s_i| ≈ m · N⁻¹, ∀s_i ∈ Ω \ {s∗}
This implies that the ring is no longer balanced, hence the last point mentioned in the synchronization process introduced earlier, a process which looks more and more expensive as we investigate the challenges introduced by the dynamic conditions just taken into consideration. If the operations required to synchronize the ring get too expensive (time-wise), then the proposed algorithm has a serious scalability issue, as it makes the network adapt poorly under dynamic conditions. Our purpose is, therefore, to understand how expensive it actually is to scale the ring.
Given our analysis so far, we have been able to break the scalability issue down into 2 sub-problems:
1. Updating hash function φ and aligning all stations in the ring to use it.
2. Re-arranging existing stored packets across stations in order to bring the load distribution of the ring back to its balanced state.
We are going to look at these two problems separately and evaluate the final performance impact later.
Conjecture 1 (Scaling overall impact). Let R = (Ω, r, ξ, φ, ψ) be a ring experiencing a scaling process due to one station joining or leaving the network.
• Let τ_φ ∈ R measure the performance impact (latency) of the process of updating hash function φ to φ′ on all stations in the ring.
• Let τ_ψ ∈ R measure the performance impact of the process of re-arranging packets among stations in order to take the ring back to its balanced condition.
• Let τ_S ∈ R measure the overall latency experienced by the system while carrying out the two operations above in order to scale the ring.
We expect the following inequality to hold:
τ_S ≤ τ_φ + τ_ψ (5.1)
Conjecture 1 expresses our expectation that the overall performance impact caused by the two scaling operations is not simply the sum of the latencies introduced by each one of them, as the two operations can be carried out in parallel rather than sequentially. We try to prove this throughout the rest of this chapter.
  • 68. 52 Chapter 5. Dynamic conditions
[Figure 5.1: message layout — header fields MT, h_src, ξ(Data), followed by up to c bytes of data.]
FIGURE 5.1: Message format.
5.1.1 Updating φ
Hash function φ is a global contract in the network.
Definition 24 (Global contract). A variable or, more generally, a piece of information shared by all stations in the ring. The main assumption is that all stations keep an exact copy of the same value.
The protocol we need to design for updating hash function φ to φ′ on all stations is, more generally, a protocol to update a global contract in the network. Since a DHT is designed for distributed scenarios, every condition implying a certain level of centrality makes the system perform worse, and this is the case here. The PA is based on a distributed approach; however, the balancing process is carried out through a global contract, hash function φ, which explains why we should expect this process to be relatively expensive.
Broadcasting in DHT
In order to have a global contract updated, we basically need to transmit a message containing the updated contract in broadcast on the network, because we need to reach every single node. The message that needs to be sent, in terms of the API of the balancing system, is PUM (φ Update Message). The cost of updating φ is equal to the cost of sending a broadcast message in the ring.
Since the broadcast occurs in the context of a network overlay, we need to create a protocol specific to message broadcasting. Generally speaking, we can create a message format which all transmissions between stations in the ring must comply with. The message must contain, at least, the following information:
1. Message Type An enumeration indicating the type of communication (e.g. RReq, RRes, etc.).
2. Source hash The hash of the source station (not strictly needed, but nice to have for performance reasons, as a station receiving a message knows its neighbours and is able to generate the hash of their IP addresses).
3.
Body hash The hash of field Body. This is used for routing the message (destination hash).
4. Body The content to transmit.
A possible implementation of the broadcasting protocol has to work at station level. Since the architecture is distributed, we cannot employ any centralized entity.
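The four-field message just listed can be sketched as a plain record. The field and type names below are ours, chosen for illustration; only the four fields themselves come from the text.

```python
from dataclasses import dataclass
from enum import Enum, auto

class MessageType(Enum):
    """Kinds of communication exchanged between stations."""
    SREQ = auto()   # store request
    SRES = auto()   # store response
    RREQ = auto()   # retrieve request
    RRES = auto()   # retrieve response
    PUM = auto()    # phi Update Message (global contract update)

@dataclass
class Message:
    mtype: MessageType  # 1. Message Type
    source_hash: bytes  # 2. hash of the source station
    body_hash: bytes    # 3. hash of field Body, used as destination hash
    body: bytes         # 4. the content to transmit

# A PUM carrying the serialized new contract (placeholder payload):
msg = Message(MessageType.PUM, b"\x01" * 4, b"\x02" * 4, b"new-phi")
```

A station inspecting `mtype` (or a special destination value, as suggested above) can tell a broadcast PUM apart from ordinary routed traffic.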
  • 69. 5.1. Scalability 53
Algorithm 2 Message broadcasting in the ring
Require: Ring initialized
Require: Station s_i has ID h_i = ξ(s_i)
Require: Station s_i has an associated leaf set Λ(s_i)
Require: Station s_i receives packet p ∈ P from station s_src ∈ Λ(s_i)
Require: Global variable h_p ∈ N is available
Require: Global variable d ∈ {−1, 0, 1} ⊂ Z is available and initially set to 0
1: function BROADCAST(p ∈ P)
2:   h_src ← ξ(s_src)  ▷ Actually computed or taken from message
3:   if h_src < h_i then
4:     d∗ ← −1  ▷ Message from LLS
5:   else
6:     d∗ ← 1  ▷ Message from ULS
7:   end if
8:   if ξ(p) = h_p ∧ d + d∗ = 0 then  ▷ Same message from opposite side of ring
9:     return  ▷ Abort: end condition reached
10:  else if ξ(p) = h_p then  ▷ Duplicate message from same side of ring
11:    return  ▷ Don't send again
12:  end if
13:  h_p ← ξ(p)
14:  d ← d∗
15:  Λ ← ∅
16:  if h_src < h_i then
17:    Λ ← Λ_U(s_i)  ▷ Message from LLS =⇒ send to ULS
18:  else
19:    Λ ← Λ_L(s_i)  ▷ Message from ULS =⇒ send to LLS
20:  end if
21:  for s ∈ Λ do
22:    Send p to s
23:  end for
24: end function
Stations can recognize this type of communication by inspecting the content of the message; a possible solution is using a flag field in the message or, better, a special value in the destination address field1.
Lemma 13 (Message broadcasting complexity). Let R = (Ω, r, ξ, φ, ψ) be a ring with N = |Ω| stations. The cost of transmitting a broadcast message, in the best-case scenario, is:
Θ∗_B = N / (2r)
where Θ∗_B is expressed in number of message hops2.
Proof. Without loss of generality, we indicate with s_A ∈ Ω the station initiating the broadcast transmission in the ring. As soon as a station receives a broadcast message, it consumes the content and then forwards it to the opposite side of its own leaf-set relative to the node it received the message from, as per algorithm 2.
If s_A starts the protocol by sending the message to only one side of its own leaf-set, then
1 Protocols usually use the all-1 string to indicate a broadcast address.
2 A hop, in the scenario of message routing, is a single direct transmission from one node to another.
  • 70. 54 Chapter 5. Dynamic conditions
the maximum number of hops required to cover the whole ring is:
Θ_B = N / r
because one station forwards the message in one go to r neighbours. However, initiator s_A can be smarter and send the message to all nodes in its own leaf-set (both sides). This triggers a symmetric chain on both sides of the ring, thus leading to the thesis, as the best-case scenario is when all messages travel at the same speed and the last transmission occurs at the very opposite side of the ring (the hypothesis is that no delayed transmission occurs).
Finger tables
A well-known routing-enhancing technique, often used in DHTs, is the employment of finger tables. Briefly, it consists in arranging leaf-sets in the ring in such a way that the LLS is empty and the ULS contains all successors in the ring according to the relative position sequence 2^0, 2^1, 2^2, until reaching 2^l. This way of linking stations implies a higher cost from a control point of view, because it takes more time to re-arrange those links at initialization time and when dynamic conditions are in place (e.g. one station joining or pulling off the ring).
That being said, on the other hand, this pattern also ensures better performance from a routing standpoint, hence guaranteeing an even better complexity than the one considered in lemma 13 in message-broadcasting scenarios. So, as we can see, the cost of updating global contract φ is acceptable, and many well-known approaches in the literature can be considered. Therefore, we have no interest in detailing this issue any further.
5.1.2 Load re-arrangement
The part of the scaling cost we are most worried about is the re-arrangement of packets. This operation is not required just from a balancing point of view; there is a more critical aspect which needs to be addressed as soon as one station joins the ring: packet retrieval.
Let us consider a scenario where station s∗ has joined the ring and hash function φ has been updated. Consider that s∗'s predecessor is now station s_i. If no load re-arrangement is performed, then RReq messages targeting a packet p whose hash h_p = ξ(p) is now covered by s∗, h_p ∈ Ξ(s∗), will find nothing, as the packet is actually still stored in s_i, since that station was covering h_p before the ring scaled up.
The question we want to answer is: "How badly is the retrieve primitive impacted by the ring scaling up?". The example we just considered suggests that only a portion of the ring is impacted by the scale-up, so the packet re-arrangement should only occur between 2 stations; however, this is something that needs to be proved.
Theorem 14 (Load re-arrangement upon scale-up by 1 station). Let R = (Ω, r, ξ, φ, ψ) be a ring. Let s∗ ∈ Ω be a station joining the ring, causing hash function φ to be updated on all stations to φ′. Let us also consider station s_i now becoming s∗'s predecessor, so that its hash segment is Ξ(s∗) = [h∗, h_i^M], assuming that h∗ = ξ(s∗). Then the packet re-arrangement effort required to make all packets in the new network retrievable and to re-balance the ring impacts all stations in the network.
  • 71. 5.1. Scalability 55
Proof. Recalling how hash function φ works, as described in section 2.3.3, we need to understand whether the joining of a station causes the φ segments in its domain to change boundaries (see figure 2.4). We can try to visualize the impact on the domain by considering the domain mapping diagram while taking dynamic conditions into consideration. We consider, for simplicity and without loss of generality, s_i = s_1:
[Diagram: domain mapping before and after the join — station hashes h_1, h_2, . . . , h_N (plus h∗ for s∗) in the ξ space [0, h_M], mapped to φ segments of amplitude 1/N at boundaries 1/(2N) + k/N, which after the join shrink to amplitude 1/(N+1) at boundaries 1/(2(N+1)) + k/(N+1), making room for the new φ segment.]
As the diagram shows, additional station s∗ causes only one change in the ξ space (φ's codomain), but it causes all φ segments to resize in order to make room for an additional interval of amplitude 1/(N+1). As we can see, our initial assumption was, unfortunately, not quite right. The whole domain of hash function φ is impacted and, possibly, all packets need to be re-routed according to the new hash function φ′.
Nonetheless, we still lack basic information which could tell us how bad the re-arrangement effort really is:
1. Are packets re-routed to new stations completely unrelated to the original one? Or is there a pattern?
2. Do all packets require re-routing? Is there a percentage of them that remains in their current station when hash function φ transitions to φ′?
These two questions are crucial to evaluate the cost of the re-arrangement effort, so we need more investigation.
Lemma 15 (Station transition direction upon packet re-arrangement). Under the same hypothesis and conditions of theorem 14, any packet p ∈ P stored in any station s_j ∈ Ω of the ring, if moved because of the re-arrangement, is moved either:
• To any of s_j's successors, if s_j precedes s∗.
• To any of s_j's predecessors, if s∗ precedes s_j.
Proof. To show this, we consider the ring in the 2 different configurations (before the scale-up and after). The diagram below shows hash function φ's domain in both conditions (N stations at the bottom and N + 1 at the top), and also plots the location of 2 hashes therein.
  • 72. 56 Chapter 5. Dynamic conditions
[Diagram: φ's domain partitioned for |Ω| = N (bottom: segments of amplitude 1/N at boundaries 1/(2N) + k/N, owned by . . . , s_{k−1}, s_k, s_{k+1}, . . .) and for |Ω| = N + 1 (top: segments of amplitude 1/(N+1) at boundaries 1/(2(N+1)) + k/(N+1), owned by . . . , s_{k−1}, s∗, s_k, s_{k+1}, . . .), with two sample hashes h′ and h″ plotted in both.]
As we can see, hash h′ falls initially in station s_{k−1}, but after the transition it ends up in station s∗'s coverage. In the same way, hash h″ falls initially in station s_{k+1}, but after the transition it ends up in station s_k's coverage. The formulation of the lemma implies that packets can also remain in the same station; this is indeed possible, as the diagram shows regions of the top and bottom hash spaces which have values in common.
We now know that a minimal pattern is present in packet re-routing. However, the information provided by lemma 15 is limited. A more interesting result can be considered but, before that, a new quantity must be introduced:
Definition 25 (Packet's station transition delta). Under the hypothesis of dynamic conditions originating from the ring scaling up by one station, let s_i and s_j be the original station and the new station (after re-arrangement) for any packet p; then quantity ∆(p) ∈ Z represents the number of stations packet p had to be moved across:
∆(p) = i − j, if |i − j| ≤ N/2; sign(i − j) · N − i + j, otherwise.
∆(p) provides information about whether a packet was moved from its original station (∆(p) = 0 when not moved), and also about the direction of the move (∆(p) < 0 or ∆(p) > 0). The value of ∆(p) for each packet is the main subject of the next important result:
Theorem 16 (Packet station transition delta upon scale-up by 1 station). Under the hypothesis and conditions of theorem 14, the transition delta ∆(p) of any packet p ∈ P stored in any station s_i ∈ Ω of the ring is, at most, unitary in absolute value: |∆(p)| ≤ 1.
Proof. The theorem basically states that if a packet is moved, it is moved to one of the 2 directly contiguous stations.
To prove this statement, we re-formulate the thesis using an equivalent definition. In conjunction with lemma 15, we need to prove that:

1. A packet hosted in a station preceding s∗ is re-routed, at most, to its immediate successor.
2. A packet hosted in a station preceded by s∗ is re-routed, at most, to its immediate predecessor.

We will prove the first point; the second follows by considering it as a mirrored condition of the former. Let us consider a linear bounded real space divided into N ∈ N equal parts, each marked with an identifying number k = 1 . . . N. We then let N rise to N + 1: every existing segment shrinks in order to make
space for segment N + 1 which is, therefore, supposed to be added as the last one. This scenario abstracts the condition in which each station's φ coverage is shrunk towards lower φ values due to s∗ joining the ring, in the specific case where all stations being considered are predecessors (down to s1) of s∗.

[Diagram: the unit interval divided into N segments (boundaries at k/N, (k+1)/N, (k+2)/N, each of length 1/N) and into N + 1 segments (boundaries at k/(N+1), . . . , (k+3)/(N+1), each of length 1/(N+1)), with a point a marked.]

Without loss of generality, we consider a point a ∈ [0, 1] ⊂ R and re-express the thesis as follows: "Is it possible to find any combination of a, k and N such that, after the shift from N to N + 1, a falls into a segment further than its original segment's successor?". Formally, this question is stated as follows:

    ∃ a ∈ [0, 1] ⊂ R, N ∈ N, N > 0, k ∈ N, k = 1 . . . N :
        a < (k + 1)/N   ∧   a ≥ (k + 2)/(N + 1)

If this system of inequalities has no solution, then the thesis is confirmed. By developing both inequalities we get the following:

    a − (k + 1)/N < 0
    a − (k + 2)/(N + 1) ≥ 0
  ⟹
    aN − k − 1 < 0
    a(N + 1) − k − 2 ≥ 0
  ⟹
    aN − k − 1 < 0
    aN + a − k − 2 ≥ 0

By isolating N, we get:

    aN < k + 1
    aN ≥ k + 2 − a
  ⟹
    N < (k + 1)/a
    N ≥ (k + 2 − a)/a
  ∨
    −k − 1 < 0
    −k − 2 ≥ 0

The first system arises from dividing both members, in both inequalities, by a; the second covers the case a = 0, for which that division is not allowed. This last system is easily proved to be impossible:

    k + 1 > 0
    k + 2 ≤ 0
  ⟹
    k > −1
    k ≤ −2

Resuming the former system and considering from now on a ∈ (0, 1] ⊂ R, we can develop further and get:

    (k + 2 − a)/a ≤ N < (k + 1)/a  ⟹  (k + 2 − a)/a < (k + 1)/a  ⟹  k + 2 − a < k + 1  ⟹  a > 1

The system has solutions only for a > 1; however, this contradicts our hypothesis a ∈ (0, 1] ⊂ R, thus the system has no solutions within the definition boundaries of a, N and k. We still need to prove the symmetric case of stations that are successors of s∗; however, this can be skipped by considering that such a scenario is the mirror of the one just proved. □
As a direct result, we have the following:
Corollary 16.1 (Packet lookup failure at re-arrangement time). Under the hypothesis and conditions of theorem 14, if packet p ∈ P is not found in station si ∈ Ω while the system is in the process of re-arranging packets, then it will be found in the previous or next node, depending on whether s∗ ≺ si or si ≺ s∗.

Lemma 15, theorem 16 and corollary 16.1 provide the answers to our initial questions. To draw our conclusions: the ring is not perfectly scalable, as all stations need to re-arrange their packets under dynamic conditions; however, the effort is extremely localized in the context of each station.

5.1.3 Scaling overall impact

We now have more information with which to evaluate conjecture 1. Considering the characteristics of the operations of updating hash function φ across stations and redistributing packets, we now understand that they can be executed in parallel. As soon as the joining station computes φ′, it commences the protocol for broadcasting this knowledge in the ring. At the same time, that station can start going through all its packets and evaluating the new hash function on them in order to re-route its DUs. This process can be started in every station the moment φ′ is available and traversing the ring.

That being said, moving packets is more expensive than computing the new hash function or receiving it from other stations, thus the time needed for re-routing DU loads in the network is far higher: τψ ≫ τφ, so the overall scaling time is essentially defined by τψ.

5.1.4 Ring scale-down

All the considerations made so far regarding the ring scaling up can be transferred to the opposite case, where a station leaves the network. A few considerations must be made, though, in relation to this dynamic condition:

• When a station leaves the network, the physical detachment from the other nodes is not performed until all packets are re-routed.
This is crucial, and different in comparison to the scenario of a station joining the ring: here we cannot afford to lose a whole bucket of packets.

• A station leaving the network is not the same scenario as a station abandoning the ring. The former is a controlled process which happens through a specific protocol and requires time; the latter is a sudden event which cannot be controlled, and its nature is described later in this chapter.

5.2 Fault conditions

As anything can happen, stations in the ring might enter anomalous states. The reasons for such a scenario can be many, hardware or software related, and adequate countermeasures can be considered. Nonetheless, when it comes to disaster recovery, it is not so much about all the possible cases we know, but rather about everything we don't know. So, we will now consider the possibility of a station becoming unavailable, and we are not going to ask ourselves why. What we ask instead is: "How do we guarantee data retrieval services and balancing in such conditions?".
[FIGURE 5.2: Multiple hashing mechanism for achieving safe redundancy. Hashes hS, h_S^(1), h_S^(2), . . . are computed and then concatenated to the data stream, hence generating packets p1, p2, p3, . . . ready to be sent.]

When a station goes down, the first issue is infrastructural. If the ring is set to have leaf-set radius r = 1, then we have a problem, as the ring basically breaks apart and messages cannot be routed across stations. Of course, if the radius is higher (r > 1), then no immediate consequences are experienced in terms of message routing. In both cases, the literature on DHT networks provides existing protocols to fix dangling links and isolate the unavailable station; the only difference is that a unitary-radius ring will experience some downtime until links are fixed. This is one of the reasons for which non-unitary-radius rings are more robust to disasters.

The second issue to solve is from a data retrieval perspective. A station went down unexpectedly, thus there was no time to apply any scale-down protocol (in fact, the scenario here is not a station leaving the ring, but a station disappearing from it). The direct consequence is virtual data loss: all packets stored in that station are now unavailable and, when any RReq is sent to the ring targeting one of those DUs, the destination station will not find the packet hash in its database.

It is clear that, to solve this issue, something has to be done before the station goes down. However, we cannot make any assumption on this condition and its timing. So we need to change the data storage protocol to target situations where emergency packet retrieval is needed, as we cannot afford, for any reason, the possibility of data becoming unavailable to users.

In chapter 4 we described the API for storing a packet in the ring. Our intention is to modify the storage protocol (primitive store) in order to save one packet in multiple locations in the ring without losing balancing.
The procedure applies to either packets or fragments; in general, we consider a certain stream of data to be sent for storage:

1. The data stream S to send is processed and its hash computed: hS = φ(S).
2. Another hash is computed, using previously computed hash hS as input: h_S^(1) = φ(hS).
3. The same recursive operation is repeated ℓ ∈ N times, computing several hashes in a chain: h_S^(k) = φ(h_S^(k−1)).
4. ℓ different packets are generated by constructing a frame with the same body (the data stream) but a different associated hash, as per figure 4.1, and then sent to the ring.
[FIGURE 5.3: Packet retrieval session under the hypothesis of one station down. The diagram illustrates how a failed RReq (a lookup of φ(p) returning null and an error RRes) triggers the emergency retrieval process, in which a second retrieve(φ(φ(p))) reaches a different destination station and returns packet p.]

The procedure just described will generate ℓ different copies of the same DU, and they will all be sent to different locations in the ring. Thanks to the Lamport scheme³, we can compute more hashes of the same initial stream and use them as storage keys.

Remark. Generating the first hash hS is potentially expensive, because the input stream can be long (however bounded to a certain level considering fragmentation threshold c). The same cannot be said for the other hashes h_S^(k), because they are computed on another hash (a very short string). So the process of computing the redundant hashes is very cheap.

³ The process of generating the hash of a hash is used today in security-related scenarios in order to generate ephemeral keys. The scheme has been proved to be safe and, when using a secure cryptographic hash function, irreversible.
How can this procedure help us when attempting to retrieve a DU stored in an unavailable station? We consider again the broken scenario from before, where station si suddenly became unavailable:

1. The system tries to retrieve packet p via its hash h = φ(p).
2. The RReq message reaches station si−1, as it now covers the hash segment si held when it was on-line. However, station si−1 cannot find hash h in its database, thus it returns an error in the RRes.
3. The system acknowledges that the first RReq was not successful, so it tries again to retrieve the packet by computing φ(h).
4. The second RReq now reaches another station sj, where the packet is found and returned.

Figure 5.3 illustrates the protocol just described.

5.2.1 Collisions threshold

As we promote the idea of introducing packet redundancy in the network as a means to achieve good levels of disaster recovery, we should be careful to make this effort as efficient as possible, therefore avoiding unnecessary cost. Since we are routing the same packet in the ring with different hashes, we want to make sure the copies do not all end up being routed to the same station. If we generated one copy of a packet and both units were routed to the same station, our effort would be pointless: the moment that station goes down, our emergency retrieval procedure would fail. On the other hand, we don't want to generate too many copies of the same packet, as we would waste precious memory in our stations. How do we find a good balance? Let's start by considering collisions in the ring:

Lemma 17 (Packets collision probability). Let R = (Ω, r, ξ, φ, ψ) be a ring with |Ω| = N stations, and let p1 ∈ P and p2 ∈ P be two packets. Then the probability that they collide onto (are routed to) the same station is:

    γ = 1/N    (5.2)

Proof. A collision occurs when p1 and p2 are routed to the same station si ∈ Ω: ψ(p1) = ψ(p2) = si.
The probability of this event can be defined as follows:

    γ = Pr{ψ(p1) = ψ(p2) = si},   ∀p1, p2 ∈ P, ∀si ∈ Ω

We first consider packet p1 routed into the ring to station sk ∈ Ω, and then consider packet p2 being processed: the probability of a collision with p1 is the probability of p2 being routed to sk, considering that sk can be any station of the ring. This calls for the Law of Total Probability:

    γ = Σ_{k=1}^{N} Pr{ψ(p2) = sk | ψ(p1) = sk} · Pr{ψ(p1) = sk}    (5.3)

Since ψ is based on hash function φ, which is based on ξ, which is a cryptographic hash function, consecutive applications of ψ are not interdependent; it means that:

    Pr{ψ(p2) = sk | ψ(p1) = sk} = Pr{ψ(p2) = sk},   ∀p1, p2 ∈ P, ∀sk ∈ Ω
We can rewrite equation 5.3 as follows:

    γ = Σ_{k=1}^{N} Pr{ψ(p2) = sk} · Pr{ψ(p1) = sk}

Recalling theorem 5 and the definition of πk ∈ [0, 1] ⊂ R as the packet-in-station-k probability, we can write our equation as follows:

    γ = Σ_{k=1}^{N} πk · πk = Σ_{k=1}^{N} πk²

We are under the hypothesis of a balanced ring, since hash function φ is applied; so, according to equation 2.12, we have:

    γ = Σ_{k=1}^{N} 1/N² = (1/N²) · Σ_{k=1}^{N} 1 = (1/N²) · N = 1/N

which proves the thesis. □

The follow-up to lemma 17 is calculating the average number of collisions experienced in the ring when sending packets. Remember that sending copies p1 . . . pℓ of packet p does not create a correlation between the different instances being sent. This is due to the fact that we are sending different hashes hS, h_S^(1) . . . h_S^(ℓ), related to each other by the Lamport chain, which actually guarantees that the hashes are not (stochastically) interdependent.

Lemma 18 (Average number of collisions). Let R = (Ω, r, ξ, φ, ψ) be a ring with |Ω| = N stations. Then, when generating m ∈ N packets, the average number of collisions experienced between different couples of units is:

    ηγ = (m(m − 1)/2) · (1/N)    (5.4)

Proof. We introduce r.v. y ∈ N counting the number of collisions between couples out of m packets. This variable can range from 0 up to the number of possible combinations of two different packets: |C(m, 2)| = m(m − 1)/2. We also introduce r.v. χ ∈ {0, 1} ⊂ N defined as follows:

    χ(p1, p2) = 1 if a collision occurs between the two packets, 0 otherwise

Remembering that C(m, 2) enumerates all possible combinations of packets (order does not matter), we can define y as follows:

    y = Σ_{(p1,p2)∈C(m,2)} χ(p1, p2)

R.v. y's mean value can then be calculated as:

    ηγ = E[y] = E[ Σ_{(p1,p2)∈C(m,2)} χ(p1, p2) ]
Since operator E[·] is linear, we have that:

    E[ Σ_{(p1,p2)∈C(m,2)} χ(p1, p2) ] = Σ_{(p1,p2)∈C(m,2)} E[χ(p1, p2)]

R.v. χ is discrete and distributed over two values only, so its mean is easily calculated:

    E[χ(p1, p2)] = 1 · Pr{χ = 1} + 0 · Pr{χ = 0} = Pr{χ = 1} = γ,   ∀p1, p2 ∈ P

So, back to r.v. y's mean value:

    ηγ = Σ_{(p1,p2)∈C(m,2)} E[χ(p1, p2)] = Σ_{k=1}^{|C(m,2)|} γ = |C(m, 2)| · (1/N) = (m(m − 1)/2) · (1/N)

proving the thesis. □

Thanks to lemma 18, we can now try to calculate a reasonable value for ℓ and decide how many clones of a packet we should send into the network to ensure an effective level of redundancy.

Theorem 19 (Optimal ℓ). Let R = (Ω, r, ξ, φ, ψ) be a ring with |Ω| = N stations and let p ∈ P be a packet sent with redundancy factor ℓ ∈ N. Then, in order to guarantee that at least 50% of sent packets do not collide, the optimal redundancy factor is: ℓ < ℓopt = N.

Proof. Let β ∈ [0, 1] ⊂ R be the fraction of collisions that we allow on the total number ℓ + 1 of packets (the original packet and its ℓ clones) sent to the network. So, the following must hold:

    |C(ℓ + 1, 2)| · (1/N) < β(ℓ + 1)
  ⟹ |C(ℓ + 1, 2)| < βN(ℓ + 1)
  ⟹ |C(ℓ + 1, 2)| · 1/(ℓ + 1) < βN
  ⟹ (ℓ + 1)!/(2!(ℓ − 1)!) · 1/(ℓ + 1) < βN
  ⟹ (ℓ + 1)ℓ(ℓ − 1)!/(2(ℓ − 1)!) · 1/(ℓ + 1) < βN
  ⟹ ℓ/2 < βN
  ⟹ ℓ < 2βN

which proves the thesis by considering β = 1/2. □
Chapter 6

Conclusions and final notes

Simulations have shown the effectiveness of the balancing performed by the algorithm; together with the use of known distributed architectures (DHT networks), the proposed balancing approach is feasible and potentially employable in real-case scenarios.

6.1 Open issues

The algorithm currently presents some challenges which must be addressed in order to make the architecture more flexible and less costly from a network performance standpoint (traffic and control overhead).

Scalability is the first priority. The analysis performed so far has provided good upper bounds on the cost of scaling up the ring by one station; however, more is to be investigated. More simulations should be run on scaling rings, and a differential analysis must be carried out to identify possible patterns which can be taken advantage of.

6.2 What's next

As a continuation of the effort described in this document, the next action items to focus on are:

1. Improving the C/C++ simulations to target more advanced scenarios.
2. Performing more simulations on very large networks (up to 1000 stations and more) and higher traffic volumes.
3. Developing simulations targeting traffic handling in the ring, in order to get more information about the impact on network performance introduced by the PA.
4. Enriching simulations with more features addressing differential analysis on scaling rings.

The next iteration should focus on collecting more information regarding the performance of the algorithm, with special focus on high-variance conditions in the amplitudes of hash segments. Furthermore, it can be beneficial to evaluate migration flows in scaling scenarios.
Appendix A

C/C++ simulation engine's architecture

The C/C++ simulation engine has been developed with the following technologies:

• Intel's Threading Building Blocks (TBB¹) for parallel packet generation and hash computation.
• The GNU C/C++ compiler.
• Boost² C++ libraries for big integers and other utilities.
• Tina's Random Number Generator (TRNG³) library for randomizers.
• The OpenSSL⁴ cryptographic library for hash computation.
• The Circos⁵ library for circo-diagram generation (migration flows).

Simulation steps. Simulations can run sequentially or in parallel. When running in parallel, a Monte Carlo approach is used so that packet generation and hash computation can be performed much faster. When running a simulation, the following steps are performed:

1. Pre-compilation configuration. Compilation variables are assigned. The engine is based on the STL⁶, and parameters such as the number of stations N and the number of generated packets m are all defined as compile-time constants; thus they need to be set.
2. Compilation. The simulation engine undergoes compilation in order to produce the simulation executables.
3. Post-compilation configuration. Simulation input files are prepared in order to specify hash segments and other network descriptive variables.
4. Execution. Simulations run.
5. Data extraction. Output data is generated in order to get aggregated information and markup files to be used for generating circo-diagrams.

¹ Intel's library for multi-threaded processing. https://www.threadingbuildingblocks.org/.
² Boost libraries. http://www.boost.org/.
³ Random number generator library. https://www.numbercrunch.de/trng/.
⁴ Standard SSL implementation. https://www.openssl.org/.
⁵ Circos. http://circos.ca/.
⁶ The C++ Standard Template Library allows the use of generic types and compile-time constants.
Every simulation generates 3 files:

• A data file tracking hash segments per station and all generated packets, hashes and φ-hashes.
• A table file containing a matrix used by Circos to generate migration flows.
• A karyotype file used by Circos to generate other diagrams (for the future).

Infrastructure. All simulations mentioned in this document have been run against a pool of Intel 4-core machines: HP ProLiant DL180 G6 (64 bit) on CentOS 6 (RHEL).
List of Figures

1.1 Overall system architecture. The end user interacts only with the storage system, while the balancing system is hidden from the user and transparent to the storage system with regards to accessing the server pool. — 7
2.1 An N = 8 network example showing the logical ring topology. Each station is assigned an ID (typically the IP address hash) and packets are routed by content. — 10
2.2 Access to the ring is guarded by proxies. — 11
2.3 Hash-partitioning of a ring into different segments, one per station. For each segment, a different impulse is used; its coverage matches the segment's length. — 21
2.4 Hash segments mapped onto φ segments, illustrating how hash function φ works. The top part of the diagram shows the φ hash-space, the bottom part the ξ hash-space. — 26
3.1 The Polar Hash Coverage Plot (PHCP) of a simulation on an N = 10 station ring after sending m = 10³ packets. Both plots show the configuration of the station hash segments together with the final load levels at the end of the simulation. The plot on the left refers to a normal ring (hash function ξ applied), the one on the right to an extended ring where hash function φ, based on the same ξ, is considered. The same packets were sent in both rings. — 32
3.2 Load state (in blue) |sk| in each station sk as time grows. In this simulation, hash function ξ is used (normal ring). The green line shows the expected load state (uniform) for each point in time. — 33
3.3 Load state (in blue) |sk| in each station sk as time grows. In this simulation set (same as in figure 3.1), hash function φ is used (extended ring). The green line shows the expected load state (uniform) for each point in time. — 34
3.4 Standard deviation vs. dispersion factor of generated ξ-hashes and φ-hashes during simulation batches (from left to right): N = 10 (40 simulations), N = 30 (60 simulations) and N = 50 (10 simulations). — 37
3.5 Standard deviation of hash segment lengths and standard deviation of φ-hashes during each simulation in batches (from top to bottom): N = 10 (40 simulations), N = 30 (60 simulations) and N = 50 (10 simulations). — 38
3.6 Station loads η_k^(ξ) (no balancing) and η_k^(φ) (balanced ring) at the end of four N30 simulations with different seeds. — 39
3.7 Migration flows in an N30 ring. — 40
4.1 Data unit format. — 45
4.2 Synchronous vs. asynchronous communication model when storing a single packet. — 46
4.3 Packet info format. — 47
4.4 Sequence diagram showing the retrieval protocol in case of a fragmented packet. — 48
5.1 Message format. — 52
5.2 Multiple hashing mechanism for achieving safe redundancy. Hashes are computed and then concatenated to the data stream, hence generating packets ready to be sent. — 59
5.3 Packet retrieval session under the hypothesis of one station down. The diagram illustrates how a failed RReq triggers the emergency retrieval process. — 61
List of Tables

2.1 Values of hash identifiers and hash segments for each station in the example. — 27
Acknowledgements

Thanks to my supervisor, Prof. Eng. O. Tomarchio, for having enough patience to wait a few more years for me to finish this research while I was working in Denmark.

Thanks to Medilink srl, my host company for my master traineeship, during which this research effort was started and completed in its first step. They provided everything I needed (resources, infrastructure) to complete my work.

Thanks to my Team Lead at Microsoft, Horina, for her flexibility and availability, allowing me to submit this work in time.

Graphics and artwork. Icons and graphics in figures created by Katemangostar - Freepik.com.

Last but not least, thanks to all the amazing public libraries in Copenhagen which have hosted me and my work during the many weekends spent on this thesis.