Pós-Graduação em Ciência da Computação
“Nubilum: Resource Management System for
Distributed Clouds”
Por
Glauco Estácio Gonçalves
Tese de Doutorado
Universidade Federal de Pernambuco
posgraduacao@cin.ufpe.br
www.cin.ufpe.br/~posgraduacao
RECIFE, 03/2012
UNIVERSIDADE FEDERAL DE PERNAMBUCO
CENTRO DE INFORMÁTICA
PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO
GLAUCO ESTÁCIO GONÇALVES
“Nubilum: Resource Management System for Distributed Clouds”
ESTE TRABALHO FOI APRESENTADO À PÓS-GRADUAÇÃO EM
CIÊNCIA DA COMPUTAÇÃO DO CENTRO DE INFORMÁTICA DA
UNIVERSIDADE FEDERAL DE PERNAMBUCO COMO REQUISITO
PARCIAL PARA OBTENÇÃO DO GRAU DE DOUTOR EM CIÊNCIA
DA COMPUTAÇÃO.
ORIENTADORA: Dra. JUDITH KELNER
CO-ORIENTADOR: Dr. DJAMEL SADOK
RECIFE, MARÇO/2012
Nubilum: Resource Management System for Distributed Clouds
Tese de Doutorado apresentada por Glauco Estácio Gonçalves à Pós- Graduação em
Ciência da Computação do Centro de Informática da Universidade Federal de
Pernambuco, sob o título “Nubilum: Resource Management System for Distributed
Clouds” orientada pela Profa. Judith Kelner e aprovada pela Banca Examinadora
formada pelos professores:
___________________________________________________________
Prof. Paulo Romero Martins Maciel
Centro de Informática / UFPE
__________________________________________________________
Prof. Stênio Flávio de Lacerda Fernandes
Centro de Informática / UFPE
____________________________________________________________
Prof. Kelvin Lopes Dias
Centro de Informática / UFPE
_________________________________________________________
Prof. José Neuman de Souza
Departamento de Computação / UFC
___________________________________________________________
Profa. Rossana Maria de Castro Andrade
Departamento de Computação / UFC
Visto e permitida a impressão.
Recife, 12 de março de 2012.
___________________________________________________
Prof. Nelson Souto Rosa
Coordenador da Pós-Graduação em Ciência da Computação do
Centro de Informática da Universidade Federal de Pernambuco.
To my family Danielle, João
Lucas, and Catarina.
Acknowledgments
I would like to express my gratitude to God, the cause of all things and of my own
existence, and to the Blessed Virgin Mary, to whom I appealed many times in prayer and was
always heard.
I would like to thank my advisor Dr. Judith Kelner and my co-advisor Dr. Djamel
Sadok, whose expertise and patience added considerably to my doctoral experience. Thanks
for the trust in my capacity to conduct my doctorate at GPRT (Networks and
Telecommunications Research Group).
I am indebted to all the people from GPRT for their invaluable help for this work. A
very special thanks goes out to Patrícia, Marcelo, and André Vítor, who have given
valuable comments over the course of my PhD.
I must also acknowledge my committee members, Dr. Jose Neuman, Dr. Otto Duarte,
Dr. Rossana Andrade, Dr. Stênio Fernandes, Dr. Kelvin Lopes, and Dr. Paulo Maciel for
reviewing my proposal and dissertation, offering helpful comments to improve my work.
I would like to thank my wife Danielle for her prayer, patience, and love which gave
me the necessary strength to finish this work. A special thanks to my children, João Lucas
and Catarina. They are gifts of God that make life delightful.
Finally, I would like to thank my parents, João and Fátima, and my sisters, Cynara and
Karine, for their love. Their blessings have always been with me as I pursued my doctoral
research.
Abstract
The current infrastructure of Cloud Computing providers is composed of networking and
computational resources located in large datacenters hosting as many as hundreds of
thousands of pieces of diverse IT equipment. In such a scenario, there are several
management challenges related to energy, failure and operations management, and
temperature control. Moreover, the geographical distance between resources and final users
is a source of delay when accessing the services. An alternative to such challenges is the
creation of Distributed Clouds (D-Clouds), with resources geographically distributed along
a network infrastructure with broad coverage.
Providing resources in such a distributed scenario is not a trivial task, since, beyond the
processing and storage resources, network resources must be taken into consideration,
offering users a connectivity service for data transportation (also called Network as a Service – NaaS).
Thus, the allocation of resources must consider the virtualization of both servers and
network devices. Furthermore, resource management must cover all steps, from the
initial discovery of adequate resources to meet developers’ demands to their final
delivery to the users.
Considering those challenges in resource management in D-Clouds, this Thesis
proposes Nubilum, a system for resource management on D-Clouds that considers geo-
locality of resources and NaaS aspects. Through its processes and algorithms, Nubilum
offers solutions for discovery, monitoring, control, and allocation of resources in D-Clouds
in order to ensure the adequate functioning of the D-Cloud while meeting developers’
requirements. Nubilum and its underlying technologies and building blocks are described,
and its allocation algorithms are evaluated to verify their efficacy and efficiency.
Keywords: cloud computing, resource management mechanisms, network virtualization.
Resumo
Atualmente, a infraestrutura dos provedores de computação em Nuvem é composta por
recursos de rede e de computação abrigados em datacenters com centenas de milhares de
equipamentos. Neste cenário, encontram-se diversos desafios quanto à gerência de energia e
ao controle de temperatura; além disso, a distância geográfica entre os recursos e os usuários
é fonte de atraso no acesso aos serviços. Uma alternativa a tais desafios é o
uso de Nuvens Distribuídas (Distributed Clouds – D-Clouds) com recursos distribuídos
geograficamente ao longo de uma infraestrutura de rede com cobertura abrangente.
Prover recursos em tal cenário distribuído não é uma tarefa trivial, pois, além de
recursos computacionais e de armazenamento, deve-se considerar recursos de rede os quais
são oferecidos aos usuários da nuvem como um serviço de conectividade para transporte de
dados (também chamado Network as a Service – NaaS). Desse modo, o processo de alocação
deve considerar a virtualização de ambos, servidores e elementos de rede. Além disso, a
gerência de recursos deve considerar desde a descoberta dos recursos adequados para
atender as demandas dos usuários até a manutenção da qualidade de serviço na sua entrega
final.
Considerando estes desafios em gerência de recursos em D-Clouds, este trabalho
propõe Nubilum: um sistema para gerência de recursos em D-Cloud que considera aspectos
de geo-localidade e NaaS. Por meio de seus processos e algoritmos, Nubilum oferece
soluções para descoberta, monitoramento, controle e alocação de recursos em D-Clouds de
forma a garantir o bom funcionamento da D-Cloud, além de atender os requisitos dos
desenvolvedores. As diversas partes e tecnologias de Nubilum são descritas em detalhes e
suas funções delineadas. Ao final, os algoritmos de alocação do sistema são também
avaliados de modo a verificar sua eficácia e eficiência.
Palavras-chave: computação em nuvem, mecanismos de alocação de recursos, virtualização
de redes.
Contents
Abstract v
Resumo vi
Abbreviations and Acronyms xii
1 Introduction 1
1.1 Motivation............................................................................................................................................. 2
1.2 Objectives ............................................................................................................................................. 4
1.3 Organization of the Thesis................................................................................................................. 4
2 Cloud Computing 6
2.1 What is Cloud Computing?................................................................................................................ 6
2.2 Agents involved in Cloud Computing.............................................................................................. 7
2.3 Classification of Cloud Providers...................................................................................................... 8
2.3.1 Classification according to the intended audience..................................................................................8
2.3.2 Classification according to the service type.............................................................................................8
2.3.3 Classification according to programmability.........................................................................................10
2.4 Mediation System............................................................................................................................... 11
2.5 Groundwork Technologies.............................................................................................................. 12
2.5.1 Service-Oriented Computing...................................................................................................................12
2.5.2 Server Virtualization..................................................................................................................................12
2.5.3 MapReduce Framework............................................................................................................................13
2.5.4 Datacenters.................................................................................................................................................14
3 Distributed Cloud Computing 15
3.1 Definitions.......................................................................................................................................... 15
3.2 Research Challenges inherent to Resource Management ............................................................ 18
3.2.1 Resource Modeling....................................................................................................................................18
3.2.2 Resource Offering and Treatment..........................................................................................................20
3.2.3 Resource Discovery and Monitoring......................................................................................................22
3.2.4 Resource Selection and Optimization....................................................................................................23
3.2.5 Summary......................................................................................................................................................27
4 The Nubilum System 28
4.1 Design Rationale................................................................................................................................ 28
4.1.1 Programmability.........................................................................................................................................28
4.1.2 Self-optimization........................................................................................................................................29
4.1.3 Existing standards adoption.....................................................................................................................29
4.2 Nubilum’s conceptual view.............................................................................................................. 29
4.2.1 Decision plane............................................................................................................................................30
4.2.2 Management plane.....................................................................................................................................31
4.2.3 Infrastructure plane...................................................................................................................................32
4.3 Nubilum’s functional components.................................................................................................. 32
4.3.1 Allocator......................................................................................................................................................33
4.3.2 Manager.......................................................................................................................................................34
4.3.3 Worker.........................................................................................................................................................35
4.3.4 Network Devices.......................................................................................................................................36
4.3.5 Storage System ...........................................................................................................................................37
4.4 Processes............................................................................................................................................. 37
4.4.1 Initialization processes..............................................................................................................................37
4.4.2 Discovery and monitoring processes......................................................................................................38
4.4.3 Resource allocation processes..................................................................................................................39
4.5 Related projects.................................................................................................................................. 40
5 Control Plane 43
5.1 The Cloud Modeling Language ....................................................................................................... 43
5.1.1 CloudML Schemas.....................................................................................................................................45
5.1.2 A CloudML usage example......................................................................................................................52
5.1.3 Comparison and discussion .....................................................................................................................56
5.2 Communication interfaces and protocols...................................................................................... 57
5.2.1 REST Interfaces.........................................................................................................................................57
5.2.2 Network Virtualization with Openflow.................................................................................................63
5.3 Control Plane Evaluation ................................................................................................................. 65
6 Resource Allocation Strategies 68
6.1 Manager Positioning Problem ......................................................................................................... 68
6.2 Virtual Network Allocation.............................................................................................................. 70
6.2.1 Problem definition and modeling ...........................................................................................................72
6.2.2 Allocating virtual nodes............................................................................................................................74
6.2.3 Allocating virtual links...............................................................................................................................75
6.2.4 Evaluation...................................................................................................................................................76
6.3 Virtual Network Creation................................................................................................................. 81
6.3.1 Minimum length Steiner tree algorithms ...............................................................................................82
6.3.2 Evaluation...................................................................................................................................................86
6.4 Discussion........................................................................................................................................... 89
7 Conclusion 91
7.1 Contributions ..................................................................................................................................... 92
7.2 Publications ........................................................................................................................................ 93
7.3 Future Work ....................................................................................................................................... 94
References 96
List of Figures
Figure 1 Agents in a typical Cloud Computing scenario (from [24]) ..................................................7
Figure 2 Classification of Cloud types (from [71]).................................................................................9
Figure 3 Components of an Archetypal Cloud Mediation System (adapted from [24]) ................11
Figure 4 Comparison between (a) a current Cloud and (b) a D-Cloud............................................16
Figure 5 ISP-based D-Cloud example ...................................................................................................17
Figure 6 Nubilum’s planes and modules...............................................................................................30
Figure 7 Functional components of Nubilum......................................................................................33
Figure 8 Schematic diagram of Allocator’s modules and relationships with other components..33
Figure 9 Schematic diagram of Manager’s modules and relationships with other components...34
Figure 10 Schematic diagram of Worker modules and relationships with the server system........35
Figure 11 Link discovery process using LLDP and Openflow ..........................................................38
Figure 12 Sequence diagram of the Resource Request process for a developer..............................39
Figure 13 Integration of different descriptions using CloudML........................................................44
Figure 14 Basic status type used in the composition of other types..................................................45
Figure 15 Type for reporting status of the virtual nodes ....................................................................46
Figure 16 XML Schema used to report the status of the physical node...........................................46
Figure 17 Type for reporting complete description of the physical nodes.......................................46
Figure 18 Type for reporting the specific parameters of any node ...................................................47
Figure 19 Type for reporting information about the physical interface ...........................................48
Figure 20 Type for reporting information about a virtual machine..................................................48
Figure 21 Type for reporting information about the whole infrastructure ......................................49
Figure 22 Type for reporting information about the physical infrastructure...................................49
Figure 23 Type for reporting information about a physical link .......................................................50
Figure 24 Type for reporting information about the virtual infrastructure .....................................50
Figure 25 Type describing the service offered by the provider .........................................................51
Figure 26 Type describing the requirements that can be requested by a developer .......................52
Figure 27 Example of a typical Service description XML ..................................................................53
Figure 28 Example of a Request XML..................................................................................................53
Figure 29 Physical infrastructure description........................................................................................54
Figure 30 Virtual infrastructure description..........................................................................................55
Figure 31 Communication protocols employed in Nubilum..............................................................57
Figure 32 REST operation for the retrieval of service information..................................................59
Figure 33 REST operation for updating information of a service ....................................................59
Figure 34 REST operation for requesting resources for a new application.....................................59
Figure 35 REST operation for changing resources of a previous request .......................................60
Figure 36 REST operation for releasing resources of an application ...............................................60
Figure 37 REST operation for registering a new Worker...................................................................60
Figure 38 REST operation to unregister a Worker..............................................................................61
Figure 39 REST operation for update information of a Worker ......................................................61
Figure 40 REST operation for retrieving a description of the D-Cloud infrastructure .................61
Figure 41 REST operation for updating the description of a D-Cloud infrastructure...................61
Figure 42 REST operation for the creation of a virtual node............................................................62
Figure 43 REST operation for updating a virtual node ......................................................................62
Figure 44 REST operation for removal of a virtual node...................................................................62
Figure 45 REST operation for requesting the discovered physical topology ..................................63
Figure 46 REST operation for the creation of a virtual link ..............................................................63
Figure 47 REST operation for updating a virtual link.........................................................................64
Figure 48 REST operation for removal of a virtual link.....................................................................64
Figure 49 Example of a typical rule for ARP forwarding...................................................................65
Figure 50 Example of the typical rules created for virtual links: (a) direct, (b) reverse..................65
Figure 51 Example of a D-Cloud with ten workers and one Manager.............................................69
Figure 52 Algorithm for allocation of virtual nodes............................................................................74
Figure 53 Example illustrating the minimax path................................................................................75
Figure 54 Algorithm for allocation of virtual links..............................................................................76
Figure 55 The (a) old and (b) current network topologies of RNP used in simulations................77
Figure 56 Results for the maximum node stress in the (a) old and (b) current RNP topology....78
Figure 57 Results for the maximum link stress in the (a) old and (b) current RNP topology ......79
Figure 58 Results for the mean link stress in the (a) old and (b) current RNP topology...............80
Figure 59 Mean path length (a) old and (b) current RNP topology..................................................80
Figure 60 Example creating a virtual network: (a) before the creation; (b) after the creation ......81
Figure 61 Search procedure used by the GHS algorithm....................................................................83
Figure 62 Placement procedure used by the GHS algorithm.............................................................84
Figure 63 Example of the placement procedure: (a) before and (b) after placement.....................85
Figure 64 Percentage of optimal samples for GHS and STA in the old RNP topology................87
Figure 65 Percentage of samples reaching relative error ≤ 5% in the old RNP topology.............88
Figure 66 Percentage of optimal samples for GHS and STA in the current RNP topology ........88
Figure 67 Percentage of samples reaching relative error ≤ 5% in the current RNP topology......89
List of Tables
Table I Summary of the main aspects discussed..................................................................................27
Table II MIMEtypes used in the overall communications.................................................................58
Table III Models for the length of messages exchanged in the system in bytes.............................67
Table IV Characteristics present in Nubilum’s resource model ........................................................71
Table V Reduced set of characteristics considered by the proposed allocation algorithms ..........72
Table VI Factors and levels used in the MPA’s evaluation ................................................................78
Table VII Factors and levels used in the GHS’s evaluation...............................................................86
Table VIII Scientific papers produced ..................................................................................................94
Abbreviations and Acronyms
CDN Content Delivery Network
CloudML Cloud Modeling Language
D-Cloud Distributed Cloud
DHCP Dynamic Host Configuration Protocol
GHS Greedy Hub Selection
HTTP Hypertext Transfer Protocol
IaaS Infrastructure as a Service
ISP Internet Service Provider
LLDP Link Layer Discovery Protocol
MPA Minimax Path Algorithm
MPP Manager Positioning Problem
NaaS Network as a Service
NV Network Virtualization
OA Optimal Algorithm
OCCI Open Cloud Computing Interface
PoP Point of Presence
REST Representational State Transfer
RP Replica Placement
RPA Replica Placement Algorithm
STA Steiner Tree Approximation
VM Virtual Machine
VN Virtual Network
XML Extensible Markup Language
ZAA Zhu and Ammar Algorithm
1 Introduction
“A linea incipere.”
Erasmus
Nowadays, it is common to access content across the Internet with little reference to the underlying
datacenter hosting infrastructure maintained by content providers. The technology used to
provide such a level of locality transparency also offers a new model for the provisioning of
computing services, known as Cloud Computing. This model is attractive as it allows resources to be
provisioned according to users’ requirements leading to overall cost reduction. Cloud users can rent
resources as they become necessary, in a much more scalable and elastic way. Moreover, such users
can transfer operational risks to cloud providers. From the viewpoint of those providers, the model
offers a way to better utilize their own infrastructure. Armbrust et al. [1] point out that this
model benefits from a form of statistical multiplexing, since it allocates resources for several users
concurrently on a demand basis. This statistical multiplexing of datacenters is subsequent to several
decades of research in many areas such as distributed computing, Grid computing, web
technologies, service computing, and virtualization.
Current Cloud Computing providers mainly use large and consolidated datacenters in order to
offer their services. However, the ever-increasing need for over-provisioning to meet peak
demands and provide redundancy against failures, combined with expensive cooling requirements, drives
up the energy costs of centralized datacenters [62]. In current datacenters, the
cooling technologies used for heat dissipation control account for as much as 50% of the total
power consumption [38]. In addition to these aspects, it must be observed that the network between
users and the Cloud is often an unreliable best-effort IP service, which can harm delay-constrained
services and interactive applications.
To deal with these problems, there have been indications that small cooperative
datacenters can be more attractive, since they offer a cheaper, lower-power alternative that
reduces the infrastructure costs of centralized Clouds [12]. These small datacenters can be built at
different geographical regions and connected by dedicated or public (provided by Internet Service
Providers) networks, configuring a new type of Cloud, referred to as a Distributed Cloud. Such
Distributed Clouds [20], or just D-Clouds, can exploit the possibility of creating (virtual) links and
the potential of sharing resources across geographic boundaries to provide latency-based allocation
of resources and to fully utilize this emerging distributed computing power. D-Clouds can reduce
communication costs by simply provisioning storage, servers, and network resources close to end-
users.
D-Clouds can be considered an additional step in the ongoing deployment of Cloud
Computing: one that supports different requirements and leverages new opportunities for service
providers. Users in a Distributed Cloud will be free to choose where to allocate their resources in
order to serve specific market niches, satisfy constraints on the jurisdiction of software and data, or
address the quality of service needs of their clients.
1.1 Motivation
Similarly to Cloud Computing, one of the most important design aspects of D-Clouds is the
availability of “infinite” computing resources which may be used on demand. Cloud users see this
“infinite” resource pool because the Cloud offers the continuous monitoring and management of its
resources and the allocation of resources in an elastic way. Nevertheless, providing on-demand
computing instances and network resources in a distributed scenario is not a trivial task. Dynamic
allocation of resources and their possible reallocation are essential characteristics for accommodating
unpredictable demands and, ultimately, contributing to investment return.
In the context of Clouds, the essential feature of any resource management system is to
guarantee that both user and provider requirements are met satisfactorily. Particularly in D-Clouds,
users may have network requirements, such as bandwidth and delay constraints, in addition to the
common computational requirements, such as CPU, memory, and storage. Furthermore, other user
requirements are relevant including node locality, topology of nodes, jurisdiction, and application
interaction.
The development of solutions to cope with resource management problems remains a very
important topic in the field of Cloud Computing. With regard to resource management, there are solutions
focused on grid computing ([49], [70]) and on datacenters in current Cloud Computing scenarios
([4]). However, such strategies do not fit D-Clouds well, as they are heavily based on assumptions
that do not hold in Distributed Cloud scenarios. For example, such solutions are designed for over-
provisioned networks and commonly do not take into consideration the cost of communication
between resources, an important aspect of D-Clouds that must be carefully monitored
and/or reserved in order to meet users’ requirements.
The design of a resource management system involves challenges other than the specific
design of optimization algorithms for resource management. Since D-Clouds are composed of
computational and network devices with different architectures, software, and hardware capabilities,
the first challenge is the development of a suitable resource model covering all this heterogeneity
[20]. In addition, the next challenge is to describe how resources are offered, which is important
since the requirements supported by the D-Cloud provider are defined in this step. The other
challenges are related to the overall operation of the resource management system. When requests
arrive, the system should be aware of the current status of resources, in order to determine if there
are sufficient available resources in the D-Cloud that could satisfy the present request. In this way,
the right mechanisms for resource discovery and monitoring should also be designed, allowing the
system to be aware of the updated status of all its resources. Then, based on the current status and
the requirements of the request, the system may select and allocate resources to serve the present
request.
Please note that the solution to those challenges involves the fine-grained coordination of
several distributed components and the orchestrated execution of the several subsystems composing
the resource management system. At a first glance, these subsystems can be organized into three
parts: one responsible for the direct negotiation of requirements with users; another responsible for
deciding what resources to allocate for given applications; and one last part responsible for the
effective enforcement of these decisions on the resources.
Designing such system is a very interesting and challenging task, and it raises the following
research questions that will be investigated in this thesis:
1. How do Cloud users describe their requirements? In order to enable automatic
negotiation between users and the D-Cloud, the Cloud must recognize a language or
formalism for requirements description. Thus, the investigation of this topic must determine
the proper characteristics of such a language. In addition, it must verify the existent
approaches around this topic in the many relative computing areas.
2. How to represent the resources available in the Cloud? Correlated to the first question,
the resource management system must also maintain an information model to represent all
the resources in the Cloud, including their relationships (topology) and their current status.
3. How are users’ applications mapped onto Cloud resources? This question concerns
the core aspect of resource allocation, i.e., the algorithms, heuristics, and strategies that are
used to decide the set of resources meeting the applications’ requirements and optimizing a
utility function.
4. How to enforce the decisions made? The effective enforcement of the decisions involves
the extension of communication protocols or the development of new ones in order to
set up the state of the overall resources in the D-Cloud.
1.2 Objectives
The main objective of this Thesis is to propose an integrated solution to problems related to the
management of resources in D-Clouds. Such a solution is presented as Nubilum, a self-managed resource
management system that addresses the challenges of discovery, control,
monitoring, and allocation of resources in D-Clouds. Nubilum provides fine-grained orchestration
of its components in order to allocate applications on a D-Cloud.
The specific goals of this Thesis are strictly related to the research questions presented in
Section 1.1; they are:
• Elaborate an information model to describe D-Cloud resources and application
requirements, such as computational restrictions, topology, geographic location, and other
correlated aspects that can be employed to request resources directly to the D-Cloud;
• Explore and extend communication protocols for the provisioning and allocation of
computational and communication resources;
• Develop algorithms, heuristics, and strategies to find suitable D-Cloud resources based on
several different application requirements;
• Integrate the information model, the algorithms, and the communication protocols into a
single solution.
1.3 Organization of the Thesis
This Thesis identifies the challenges involved in resource management on Distributed Cloud
Computing and presents solutions for some of these challenges. The remainder of this document is
organized as follows.
The general concepts that make up the basis for all the other chapters are introduced in the
second chapter. Its main objective is to discuss Cloud Computing, exploring its
definition and classifying the main approaches in this area.
The Distributed Cloud Computing concept and several important aspects of resource
management in such scenarios are introduced in the third chapter. Moreover, this chapter will
make a comparative analysis of related research areas and problems.
The fourth chapter introduces the first contribution of this Thesis: the Nubilum resource
management system, which aggregates the several solutions proposed on this Thesis. Moreover, the
chapter highlights the rationale behind Nubilum as well as its main modules and components.
The fifth chapter examines and evaluates the control plane of Nubilum. It describes the
proposed Cloud Modeling Language and details the communication interfaces and protocols used
for communicating between Nubilum components.
The sixth chapter gives an overview of the resource allocation problems in Distributed
Clouds, and makes a thorough examination of the specific problems related to Nubilum. Some
particular problems are analyzed and a set of algorithms is presented and evaluated.
The seventh chapter of this Thesis reviews the obtained evaluation results, summarizes the
contributions, and sets the path for future work and open issues on D-Clouds.
2 Cloud Computing
“Definitio est declaratio essentiae rei.”
Legal Proverb
In this chapter the main concepts of Cloud Computing will be presented. It begins with a discussion
on the definition of Cloud Computing (Section 2.1) and the main agents involved in Cloud
Computing (Section 2.2). Next, classifications of Cloud initiatives are offered in Section 2.3. An
exemplary and simple architecture of a Cloud Mediation System is presented in Section 2.4 followed
by a presentation in Section 2.5 of the main technologies acting behind the scenes of Cloud
Computing initiatives.
2.1 What is Cloud Computing?
A definition of Cloud Computing is given by the National Institute of Standards and Technology
(NIST) of the United States: “Cloud computing is a model for enabling convenient, on-demand
network access to a shared pool of configurable computing resources (e.g., networks, servers,
storage, applications, and services) that can be rapidly provisioned and released with minimal
management effort or service provider interaction” [45]. The definition says that on-demand
dynamic reconfiguration (elasticity) is a key characteristic. Additionally, the definition highlights
another Cloud Computing characteristic: it assumes that minimal management efforts are required
to reconfigure resources. In other words, the Cloud must offer self-service solutions that must
attend to requests on-demand, excluding from the scope of Cloud Computing those initiatives that
operate through the rental of computing resources on a weekly or monthly basis. Hence, it restricts
Cloud Computing to systems that provide automatic mechanisms for resource rental in real-time
with minimal human intervention.
The NIST definition gives a satisfactory concept of Cloud Computing as a computing model.
However, the NIST definition does not cover the main object of Cloud Computing: the Cloud. Thus, in this Thesis,
Cloud Computing is defined as the computing model that operates based on Clouds. In turn, the
Cloud is defined as a conceptual layer that operates above an infrastructure to provide elastic
services in a timely manner.
This definition encompasses three main characteristics of Clouds. Firstly, it notes that a Cloud
is primarily a concept, i.e., a Cloud is an abstraction over an infrastructure. Thus, it is independent of
the employed technologies and therefore one can accept different setups, like Amazon EC2 or
Google App Engine, to be named Clouds. Moreover, the infrastructure is defined in a broad sense
since it can be composed of software, physical devices, and/or other Clouds. Secondly, all Clouds
have the same purpose: to provide services. This means that a Cloud hides the complexity of the
underlying infrastructure while exploring the potential of overlying services and acting as a
middleware. In addition, providing a service involves, implicitly, the use of some type of agreement
that should be guaranteed by the Cloud. Such agreements can vary from pre-defined contracts to
malleable agreements defining functional and non-functional requirements. Note that these services
are qualified as elastic, which carries the same meaning as the dynamic reconfiguration that appeared
in the NIST definition. Last but not least, the Cloud must provide services as quickly as possible
such that the infrastructure resources are allocated and reallocated to meet the users’ needs.
2.2 Agents involved in Cloud Computing
Unlike previous approaches ([64], [8], [72], and [68]), this Thesis focuses on only three distinct
agents in Cloud Computing as shown in Figure 1: clients, developers, and the provider. The first
notable point is that the provider deals with two types of users that are called developers and clients.
Thus, clients are the customers of a service produced by a developer. Clients use services from
developers, but such use generates demand to the provider that actually hosts the service, and
therefore the client can also be considered a user of the Cloud. It is important to highlight that in
some scenarios (like scientific computing or batch processing) a developer may behave as a client to
the Cloud because it is the end-user of the applications. The text will use “users” when referring to
both classes without distinction.
Figure 1 Agents in a typical Cloud Computing scenario (from [24])
Developers can be service providers, independent programmers, scientific institutions, and so
on, i.e., all who build applications on the Cloud. They create and run their applications while
leaving decisions related to maintenance and management of the infrastructure to the provider.
Please note that, a priori, developers do not need to know about the technologies that make up the
Cloud infrastructure, nor about the specific location of each item in the infrastructure.
Lastly, the term application is used to mean all types of services that can be developed on the
Cloud. In addition, it is important to note that the type of applications supported by a Cloud
depends exclusively on the goals of the Cloud as determined by the provider. Such a wide range of
possible targets generates many different types of Cloud Providers that are discussed in the next
section.
2.3 Classification of Cloud Providers
Currently, there are several operational initiatives of Cloud Computing; however, despite all being
called Clouds, they provide different types of services. For that reason, the academic community
([64], [8], [45], [72], and [71]) has classified these solutions in order to understand their
relationships. The three complementary proposals for classification are as follows.
2.3.1 Classification according to the intended audience
This first simple taxonomy is suggested by NIST [45], which organizes providers according to the
audience at which the Cloud is aimed. There are four classes in this classification: Private Clouds,
Community Clouds, Public Clouds, and Hybrid Clouds.
The first three classes accommodate providers in a gradual opening of the intended audience
coverage. The Private Cloud class encompasses Clouds intended to be used solely by
a single organization, operating over its own datacenter or one leased from a third party for exclusive
use. When the Cloud infrastructure is shared by a group of organizations with similar interests it is
classified as a Community Cloud. Furthermore, the Public Cloud class encompasses all initiatives
intended to be used by the general public. Finally, Hybrid Clouds are simply the composition of two
or more Clouds pertaining to different classes (Private, Community, or Public).
2.3.2 Classification according to the service type
In [71], the authors offer a classification as represented in Figure 2. Such a taxonomy divides Clouds into
five categories: Cloud Application, Cloud Software Environment, Cloud Software Infrastructure,
Software Kernel, and Firmware/Hardware. The authors arranged the different types of Clouds in a
stack, showing that Clouds from higher levels are created using services in the lower levels. This idea
pertains to the definitions of Cloud Computing discussed previously in Sections 2.1 and 2.2.
Essentially, the Cloud provider does not need to be the owner of the infrastructure.
Figure 2 Classification of Cloud types (from [71])
The class at the top of the stack, also called Software-as-a-Service (SaaS), involves applications
accessed through the Internet, including social networks, Webmail, and Office tools. Such services
provide software to be used by the general public, whose main interest is to avoid tasks related to
software management like installation and updating. From the point of view of the Cloud provider,
SaaS can decrease costs with software implementation when compared with traditional processes.
Similarly, the Cloud Software Environment class, also called Platform-as-a-Service (PaaS), encompasses
Clouds that offer programming environments for developers. Through well-defined APIs,
developers can use software modules for access control, authentication, distributed processing, and
so on, in order to produce their own applications in the Cloud. Moreover, developers can contract
services for automatic scalability of their software, databases, and storage services.
In the middle of the stack there is the Cloud Software Infrastructure class of initiatives. This
class encompasses solutions that provide virtual versions of infrastructure devices found in
datacenters like servers, databases, and links. Clouds in this class can be divided into three subclasses
according to the type of resource that is offered by them. Computational resources are grouped in
the Infrastructure-as-a-Service (IaaS) subclass, which provides generic virtual machines that can be used
in many different ways by the contracting developer. Services for massive data storage are grouped
in the Data-as-a-Service (DaaS) subclass, whose main mission is to store users’ data remotely,
allowing those users to access their data from anywhere and at any time. Finally, the third
subclass, called Communications-as-a-Service (CaaS), is composed of solutions that offer virtual
private links and routers through telecommunication infrastructures.
The last two classes do not offer Cloud services specifically, but they are included in the
classification to show that providers offering Clouds in higher layers can have their own software
and hardware infrastructure. The Software Kernel class includes all of the software necessary to
provide services to the other categories like operating systems, hypervisors, cloud management
middleware, programming APIs, and libraries. Finally, the class of Firmware/Hardware covers all
sale and rental services of physical servers and communication hardware.
2.3.3 Classification according to programmability
The five-class scheme presented above can classify and organize the current spectrum of Cloud
Computing solutions, but such a model is limited because the number of classes and their
relationships will need to be rearranged as new Cloud services emerge. Therefore, in this Thesis, a
different classification model will be used based on the programmability concept, which was
previously introduced by Endo et al. [19].
Borrowed from the realm of network virtualization [11], programmability is a concept related
to the programming features a network element offers to developers, measuring how much freedom
the developer has to manipulate resources and/or devices. This concept can be easily applied to the
comparison of Cloud Computing solutions. More programmable Clouds offer environments where
developers are free to choose programming paradigms, languages, and platforms. Less
programmable Clouds restrict developers in some way: perhaps by forcing a set of programming
languages or by providing support for only one application paradigm. On the other hand,
programmability directly affects the way developers manage their leased resources. From this point-
of-view, providers of less programmable Clouds are responsible for managing their infrastructure while
being transparent to developers. In turn, a more programmable Cloud leaves more of these tasks to
developers, thus introducing management difficulties due to the more heterogeneous programming
environment.
Thus, Cloud Programmability can be defined as the degree of freedom that
developers have to manipulate services leased from a provider. Programmability is a relative
concept, i.e., it is used to compare one Cloud with others. Also, programmability is directly
proportional to the heterogeneity in the infrastructure of the provider and to the
amount of effort that developers must spend to manage leased resources.
To illustrate how this concept can be used, one can classify two current Clouds: Amazon EC2
and Google App Engine. Clearly the Amazon EC2 is more programmable, since in this Cloud
developers can choose between different virtual machine classes, operating systems, and so on. After
they lease one of these virtual machines, developers can configure it to work as they see fit: as a web
server, as a content server, as a unit for batch processing, and so on. On the other hand, Google
App Engine can be classified as a less programmable solution, because it allows developers to create
Web applications that will be hosted by Google. This restricts developers to the Web paradigm and
to some programming languages.
2.4 Mediation System
Figure 3 introduces an Archetypal Cloud Mediation System. This is a conceptual model that will be
used as a reference to the discussion on Resource Management in this Thesis. The Archetypal Cloud
Mediation System focuses on one principle: resource management as the main service of any Cloud
Computing provider. Thus, other important services like authentication, accounting, and security are
out of the scope of this conceptual system and are therefore kept separate from the
Mediation System. Clients also do not factor into this
view of the system, since resource management is mainly related to the allocation of developers’
applications and meeting their requirements.
Figure 3 Components of an Archetypal Cloud Mediation System (adapted from [24])
The mediation system is responsible for the entire process of resource management in the
Cloud. Such a process covers tasks that range from the automatic negotiation of developers’
requirements to the execution of their applications. It has three main layers: negotiation, resource
management, and resource control.
The negotiation layer deals with the interface between the Cloud and developers. In the case
of Clouds selling infrastructure services, the interface can be a set of operations based on Web
Services for control of the leased virtual machines. Alternatively, in the case of PaaS services, this
interface can be an API for software development in the Cloud. Moreover, the negotiation layer
handles the process of contract establishment between the enterprises and the Cloud. Currently, this
process is simple and the contracts tend to be restrictive. One can expect that in the future, Clouds
will offer more sophisticated avenues for user interaction through high level abstractions and service
level policies.
The resource management layer is responsible for the optimal allocation of applications in order to
obtain the maximum usage of resources. This function requires advanced strategies and heuristics
to allocate resources that meet the contractual requirements as established with the application
developer. These may include service quality restrictions, jurisdiction restrictions, elastic adaptation,
among others.
Metaphorically, one can say that while the resource management layer acts as the “brain” of
the Cloud, the resource control layer plays the role of its “limbs”. The resource control encompasses
all functions needed to enforce decisions generated by the upper layer. Beyond the tools used to
configure the Cloud resources effectively, all communication protocols used by the Cloud are
included in this layer.
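As a purely illustrative aid, and not part of the archetypal model itself, the following Python sketch shows how a developer request might traverse the three layers; every class name, method name, and data field below is hypothetical and merely stands in for the negotiation, decision, and enforcement roles described above.

class NegotiationLayer:
    """Receives a raw developer request and turns it into a requirement set."""
    def negotiate(self, request):
        return {"cpu": request["cpu"], "memory": request["memory"]}

class ResourceManagementLayer:
    """Decides which resource satisfies the requirements (the 'brain')."""
    def __init__(self, inventory):
        self.inventory = inventory  # candidate servers with free capacity
    def decide(self, requirements):
        # Naive first-fit decision; a real system would apply optimization heuristics.
        for server in self.inventory:
            if server["cpu"] >= requirements["cpu"] and server["memory"] >= requirements["memory"]:
                return server
        raise RuntimeError("no suitable resource found")

class ResourceControlLayer:
    """Enforces the decision on the infrastructure (the 'limbs')."""
    def enforce(self, server, requirements):
        print("provisioning", requirements, "on", server["name"])

# A request traverses the layers from negotiation down to control.
inventory = [{"name": "host-1", "cpu": 8, "memory": 16}]
negotiation = NegotiationLayer()
management = ResourceManagementLayer(inventory)
control = ResourceControlLayer()

requirements = negotiation.negotiate({"cpu": 2, "memory": 4})
chosen = management.decide(requirements)
control.enforce(chosen, requirements)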
2.5 Groundwork Technologies
Some of the main technologies used by current Cloud mediation systems (namely Service-
oriented Computing, Virtualization, MapReduce, and Datacenters) will be discussed.
2.5.1 Service-Oriented Computing
Service-Oriented Computing defines a set of principles, architectural models, and technologies for
the design and development of distributed applications. The recent shift toward service-focused
software development gave rise to SOA (Service-Oriented Architecture), which can be defined as an
architectural model “that supports the discovery, message exchange, and integration between loosely
coupled services using industry standards” [37]. The common technology for the implementation of
SOA principles is the Web Service, which defines a set of standards to implement services over the
World Wide Web.
In Cloud Computing, SOA is the main paradigm for the development of functions on the
several layers of the Cloud. Cloud providers publish APIs for their services on the web, allowing
developers to use the Cloud and to automate several tasks related to the management of their
applications. Such APIs can assume the form of WSDL documents or REST-based interfaces.
Furthermore, providers can make available Software Development Kits (SDKs) and other toolkits
for the manipulation of applications running on the Cloud.
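As a purely hypothetical illustration of such REST-based interfaces, the Python fragment below shows how a developer could lease a virtual machine through a provider’s API using only the standard library; the endpoint, path, and JSON fields are invented for the example and do not correspond to any particular provider.

import json
import urllib.request

# Hypothetical request body describing the desired virtual machine.
body = json.dumps({"image": "ubuntu-server", "cpu": 2, "memory_gb": 4}).encode("utf-8")

request = urllib.request.Request(
    url="https://cloud.example.org/api/v1/instances",  # fictitious provider endpoint
    data=body,
    headers={"Content-Type": "application/json", "Authorization": "Bearer <token>"},
    method="POST",
)

with urllib.request.urlopen(request) as response:
    instance = json.loads(response.read())
    print("leased instance:", instance.get("id"))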
2.5.2 Server Virtualization
Server virtualization is a technique that allows a computer system to be partitioned into multiple
isolated execution environments, each offering a service similar to that of a single physical computer;
these environments are called Virtual Machines (VMs). Each VM can be configured in an independent way while having its
own operating system, applications, and network parameters. Commonly, such VMs are hosted on a
physical server running a hypervisor, the software that effectively virtualizes the server and manages
the VMs [54].
There are several hypervisor options that can be used for server virtualization. From the open-
source community, one can cite Citrix's Xen (http://www.xen.org/products/cloudxen.html) and the
Kernel-based Virtual Machine (KVM, http://www.linux-kvm.org/page/Main_Page). From the realm of
proprietary solutions, some examples are VMware ESX (http://www.vmware.com/) and Microsoft's Hyper-V
(http://www.microsoft.com/hyper-v-server/en/us/default.aspx).
The main factor that boosted the adoption of server virtualization within Cloud
Computing is that such technology offers good flexibility regarding the dynamic reallocation of
workloads across servers. Such flexibility allows, for example, providers to execute maintenance on
servers without stopping developers’ applications (that are running on VMs) or to implement
strategies for better resource usage through the migration of VMs. Furthermore, server virtualization
is well suited to the fast provisioning of new VMs through the use of templates, which enables
providers to offer elastic services for application developers [43].
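As a small, optional illustration of how such VMs can be inspected programmatically, the sketch below uses the libvirt Python bindings; it assumes the libvirt-python package is installed and a local QEMU/KVM hypervisor is reachable, and connection URIs vary between deployments.

import libvirt

conn = libvirt.open("qemu:///system")  # connect to the local hypervisor
try:
    for dom in conn.listAllDomains():
        # info() returns state, maximum memory, current memory (KiB), vCPUs, and CPU time.
        state, max_mem, mem, vcpus, cpu_time = dom.info()
        print(f"{dom.name()}: active={bool(dom.isActive())}, vCPUs={vcpus}, memory={mem // 1024} MiB")
finally:
    conn.close()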
2.5.3 MapReduce Framework
MapReduce [15] is a programming framework developed by Google for distributed processing of
large data sets across computing infrastructures. Inspired by the map and reduce primitives present
in functional languages, its authors developed an entire framework for the automatic distribution of
computations. In this framework, developers are responsible for writing map and reduce operations
and for using them according to their needs, which is similar to the functional paradigm. These map
and reduce operations will be executed by the MapReduce system that transparently distributes
computations across the computing infrastructure and treats all issues related to node
communication, load balancing, and fault tolerance. For the distribution and synchronization of the
data required by the application, the MapReduce system also requires the use of a specially tailored
distributed file system called Google File System (GFS) [23].
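As a minimal illustration of the programming style only, the following Python sketch expresses the classic word-count computation as map and reduce functions; the sequential driver that groups intermediate pairs by key merely stands in for the distribution, shuffling, and fault tolerance that the MapReduce system performs transparently across the cluster.

from collections import defaultdict

def map_fn(document):
    # Emit a (word, 1) pair for every word in the input document.
    for word in document.split():
        yield word.lower(), 1

def reduce_fn(word, counts):
    # Sum all partial counts emitted for the same word.
    return word, sum(counts)

def run_mapreduce(documents):
    intermediate = defaultdict(list)
    for doc in documents:                      # "map" phase
        for key, value in map_fn(doc):
            intermediate[key].append(value)
    # Grouping by key (the "shuffle") happened while filling the dictionary.
    return dict(reduce_fn(k, v) for k, v in intermediate.items())   # "reduce" phase

print(run_mapreduce(["the cloud serves the user", "the user trusts the cloud"]))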
Despite being introduced by Google, there are some open source implementations of the
MapReduce system, like Hadoop [6] and TPlatform [55]. The former is popular open-source
software used for running applications on large clusters built of commodity hardware. This software
is used by large companies like Amazon, AOL, and IBM, as well as in different Web applications
such as Facebook, Twitter, Last.fm, among others. Basically, Hadoop is composed of two modules:
a MapReduce environment for distributed computing, and a distributed file system called the
Hadoop Distributed File System (HDFS). The latter is an academic initiative that provides a
1 http://www.xen.org/products/cloudxen.html
2 http://www.linux-kvm.org/page/Main_Page
3 http://www.vmware.com/
4 http://www.microsoft.com/hyper-v-server/en/us/default.aspx
14
development platform for Web mining applications. Similarly to Hadoop and Google’s MapReduce,
the TPlatform has a MapReduce module and a distributed file system known as the Tianwang File
System (TFS) [55].
MapReduce solutions are a common groundwork technology in PaaS Clouds because they offer a versatile sandbox for developers. Differently from IaaS Clouds, PaaS developers using a general-purpose language with MapReduce support do not need to be concerned with software configuration, software updates, and network configuration. All these tasks are the responsibility of the Cloud provider, which, in turn, benefits from the fact that such configurations are standardized across the overall infrastructure.
2.5.4 Datacenters
Developers who are hosting their applications on a Cloud wish to scale their leased resources,
effectively increasing and decreasing their virtual infrastructure according to the demand of their
clients. This is also the case for developers making use of their own private Clouds. Thus,
independently of the class of Cloud under consideration, a robust and safe infrastructure is needed.
Whereas virtualization and MapReduce provide the software solutions required to meet this demand, the physical infrastructure of Clouds is based on datacenters, which are facilities composed of IT components providing processing capacity, storage, and network services for one or more organizations [66]. Currently, the size of a datacenter (in number of components) can vary from tens of components to tens of thousands of components, depending on the datacenter's mission. In addition, there are several different IT components in datacenters, including switches and routers, load balancers, storage devices, dedicated storage networks, and the main component of any datacenter: servers [27].
Cloud Computing datacenters provide the power required to meet developers' demands in terms of processing, storage, and networking capacity. A large datacenter running a virtualization solution allows a finer-grained division of the hardware's power through the statistical multiplexing of developers' applications.
3 Distributed Cloud Computing
“Quae non prosunt singula, multa iuvant.”
Ovid
This chapter discusses the main concepts of Distributed Cloud (D-Cloud) Computing. It begins with a discussion of its definition (Section 3.1), in an attempt to distinguish the D-Cloud from current Clouds and highlight its main characteristics. Next, the main research challenges regarding resource management in D-Clouds are described in Section 3.2.
3.1 Definitions
Current Cloud Computing setups involve huge investments in datacenters, the common underlying infrastructure of Clouds, as previously detailed in Section 2.5.4. This centralized infrastructure brings many well-known challenges, such as the need for resource over-provisioning and the high cost of heat dissipation and temperature control. In addition to concerns with infrastructure costs, one must observe that those datacenters are not necessarily close to their clients, i.e., the network between end-users and the Cloud is often a long best-effort IP connection, which means longer round-trip delays.
Considering such limitations, industry and academic researchers have presented evidence that small datacenters can sometimes be more attractive, since they offer a cheaper, lower-power alternative while also reducing the infrastructure costs of centralized Clouds [12]. Moreover, Distributed Clouds, or simply D-Clouds, as pointed out by Endo et al. in [20], can exploit the possibility of creating links and the potential of sharing resources across geographic boundaries to provide latency-based allocation of resources and, ultimately, fully utilize this distributed computing power. Thus, D-Clouds can reduce communication costs by simply provisioning data, servers, and links close to end-users.
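A toy sketch of such latency-based allocation is shown below: for a given client region, the application is placed at the D-Cloud site with the smallest measured round-trip time. Site names and latency measurements are illustrative assumptions.

    # Illustrative latency-based placement: choose, for each client region, the
    # D-Cloud site with the smallest measured round-trip time (values are made up).
    measured_rtt_ms = {
        ("region-south", "site-porto-alegre"): 12,
        ("region-south", "site-sao-paulo"): 28,
        ("region-north", "site-manaus"): 9,
        ("region-north", "site-sao-paulo"): 65,
    }

    def closest_site(region, sites):
        return min(sites, key=lambda s: measured_rtt_ms.get((region, s), float("inf")))

    print(closest_site("region-north", ["site-manaus", "site-sao-paulo"]))  # site-manaus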
Figure 4 illustrates how D-Clouds can reduce the cost of communication through the spreading of computational power and the use of latency-based allocation of applications. In Figure 4(a), the client uses an application (App) running on the Cloud through the Internet and is subject to the latency imposed by the best-effort network. In Figure 4(b), the client accesses the same App, but in this case the latency imposed by the network is reduced because the App is allocated on a server in a small datacenter that is closer to the client than in the previous scenario.
Figure 4 Comparison between (a) a current Cloud and (b) a D-Cloud
Please note that Figure 4(b) intentionally does not specify the network connecting the infrastructure of the D-Cloud Provider. This network can be rented from different local ISPs (using the Internet for interconnection) or from an ISP with wide-area coverage. In addition, such an ISP could itself be the D-Cloud Provider. This may be the case as the D-Cloud paradigm introduces an organic change in the current Internet, where ISPs can start to act as D-Cloud providers. Thus, ISPs could offer their communication and computational resources to developers interested in deploying their applications in the specific markets covered by those ISPs.
This idea is illustrated by Figure 5, which shows a D-Cloud offered by a hypothetical Brazilian ISP. In this example, a developer deployed its application (App) on two servers in order to serve requests from northern and southern clients. If the number of northeastern clients increases, the developer can deploy its App (represented by the dotted box) on one server close to the northeast region in order to improve its service quality. It is important to note that the contribution of this Thesis falls in this last scenario, i.e., a scenario where the network and computational resources are all controlled by the same provider.
Figure 5 ISP-based D-Cloud example
D-Clouds share similar characteristics with current Cloud Computing, including essential offerings such as scalability, on-demand usage, and pay-as-you-go business plans. Furthermore, the agents already stated for current Clouds (please see Figure 1) are exactly the same in the context of D-Clouds. Finally, the many different classifications discussed in Section 2.3 can also be applied. Despite the similarity, one may highlight two peculiarities of D-Clouds: support for geo-locality and Network as a Service (NaaS) provisioning ([2], [63], [17]).
The geographical diversity of resources potentially improves cost and performance and gives an advantage to several different applications, particularly those that do not require massive internal communication among large server pools. In this category, as pointed out by [12], one can emphasize, firstly, applications currently deployed in a distributed manner, like VoIP (Voice over IP) and online games; secondly, applications that are good candidates for distributed implementation, like traffic filtering and e-mail distribution. In addition, there are other applications that use software or data with specific legal restrictions on jurisdiction, and applications whose public is restricted to one or more geographical areas, like the tracking of bus or subway routes, information about entertainment events, local news, etc.
Support for geo-locality can be considered a step further in the deployment of Cloud Computing that leverages new opportunities for service providers. Thus, they will be free to choose where to allocate their resources in order to attend to specific niches, constraints on the jurisdiction of software and data, or quality-of-service aspects of end-users.
The NaaS (or Communication as a Service – CaaS, as cited in Section 2.3.2) allows service providers to manage network resources, instead of just computational ones. Authors in [2] define NaaS as a service offering transport network connectivity with a level of virtualization suitable to be
invoked by service providers. In this way, D-Clouds are able to manage their network resources according to their convenience, offering better response times for hosted applications. NaaS is close to the Network Virtualization (NV) research area [31], where the main problem consists in choosing how to allocate a virtual network over a physical one, meeting requirements and minimizing the usage of physical resources. Although NV and D-Clouds are subject to similar problems and scenarios, there is an essential difference between the two. While NV commonly models its resources at the infrastructure level (requests are always virtual networks mapped on graphs), a D-Cloud can be engineered to work with applications at a different abstraction level, exactly as occurs with the actual Cloud service types described in Section 2.3.2. In this way, one may see Network Virtualization simply as a particular instance of the D-Cloud. Other insights about NV are given in Section 3.2.4.
Finally, it must be highlighted that the D-Cloud does not compete with the current Cloud Computing paradigm, since the D-Cloud merely fits a certain type of application that has hard restrictions on geographical location, while existing Clouds continue to be attractive for applications demanding massive computational resources or simple applications with minor or no restrictions on geographical location. Thus, current Cloud Computing providers are the first potential candidates to take advantage of the D-Cloud paradigm, since current Clouds could hire D-Cloud resources on demand and move applications to certain geographical locations in order to meet specific developers' requirements. In addition to current Clouds, D-Clouds can also serve developers directly.
3.2 Research Challenges inherent to Resource Management
D-Clouds face challenges similar to the ones presented in the context of current Cloud Computing.
However, as stated in Chapter 1, the object of the present study is the resource management in D-
Clouds. Thus, this Section gives special emphasis to the challenges for resource management in D-
Clouds, while focusing on four categories as presented in [20]: a) resource modeling; b) resource
offering and treatment; c) resource discovery and monitoring; and d) resource selection.
3.2.1 Resource Modeling
The first challenge is the development of a suitable resource model, which is essential to all operations in the D-Cloud, including management and control. Optimization algorithms are also strongly dependent on the resource modeling scheme used.
In a D-Cloud environment, it is very important that resource modeling takes into account physical resources as well as virtual ones. On the one hand, the amount of detail in each resource description should be treated carefully: if resources are described in great detail, there is a risk that resource optimization becomes hard and complex, since an optimization problem considering the several modeled aspects can easily become NP-hard. On the other hand, more detail gives more flexibility and leverages the usage of resources.
There are some alternatives for resource modeling in Clouds that could be applied to D-Clouds. One can cite, for example, the OpenStack software project [53], which is focused on producing an open-standard Cloud operating system. It defines a RESTful HTTP service that supports JSON and XML data formats and is used to request or exchange information about Cloud resources and action commands. OpenStack also offers ways to describe how to scale servers up or down (using pre-configured thresholds); it is extensible, allowing the seamless addition of new features; and it returns additional error messages in case of faults.
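To give a concrete feel for the level of detail involved, the fragment below shows an illustrative resource description written as a Python dictionary. The field names are assumptions made for illustration only and do not follow the actual OpenStack (or VXDL) schemas.

    # Illustrative resource description (field names are assumptions, not an
    # actual OpenStack/VXDL schema). More attributes give more allocation
    # flexibility, but also a harder optimization problem.
    server_description = {
        "id": "server-042",
        "type": "physical-server",
        "location": {"country": "BR", "site": "recife-1"},
        "capacity": {"cpu_ghz": 16.0, "memory_gb": 64, "storage_gb": 2000},
        "virtual_resources": [
            {"id": "vm-007", "cpu_ghz": 2.0, "memory_gb": 4, "image": "app-web.qcow2"}
        ],
    }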
Another resource modeling alternative is the Virtual Resources and Interconnection Networks
Description Language (VXDL) [39], whose main goal is to describe resources that compose a virtual
infrastructure while focusing on virtual grid applications. The VXDL is able to describe the
components of an infrastructure, their topology, and an execution chronogram. These three aspects
compose the main parts of a VXDL document. The computational resource specification part
describes resource parameters. Furthermore, some peculiarities of virtual Grids are also present,
such as the allocation of virtual machines in the same hardware and location dependence. The
specification of the virtual infrastructure can consider specific developers’ requirements such as
network topology and delay, bandwidth, and the direction of links. The execution chronogram
specifies the period of resource utilization, allowing efficient scheduling, which is a clear concern for
Grids rather than Cloud computing. Another interesting point of VXDL is the possibility of
describing resources individually or in groups, according to application needs. VXDL lacks support
for distinct service descriptions, since it is focused on grid applications only.
The proposal presented in [32], called VRD hereafter, describes resources in a network
virtualization scenario where infrastructure providers describe their virtual resources and services
prior to offering them. It takes into consideration the integration between the properties of virtual
resources and their relationships. An interesting point in the proposal is its use of functional and
non-functional attributes. Functional attributes are related to characteristics, properties, and
functions of components. Non-functional attributes specify criteria and constraints, such as
performance, capacity, and QoS. Among the functional properties that must be highlighted is the set
of component types: PhysicalNode, VirtualNode, Link, and Interface. Such properties suggest a
flexibility that can be used to represent routers or servers, in the case of nodes, and wired or wireless
links, in the case of communication links and interfaces.
Another proposal, known as the Manifest language, was developed by Chapman et al. [9]. They proposed new meta-models to represent service requirements, constraints, and elasticity rules for software deployment in a Cloud. The building block of this framework is the OVF (Open Virtualization Format) standard, which was extended by Chapman et al. to realize the vision of D-Clouds by considering locality constraints. These two points are very interesting to our scenario. With regard to elasticity, the language assumes a rule-based specification formed by three fields: a monitored condition related to the state of the service (such as workload), an operator (relational and logical ones are accepted), and an associated action to follow when the condition is met. The location constraints identify sites that should be favored or avoided when selecting a location for a service. Nevertheless, the Manifest language is focused on the software architecture; hence, it is not concerned with other aspects such as resources' status or network resources.
Cloud# is a language for modeling Clouds proposed by [16] to be used as a basis for Cloud
providers and clients to establish trust. The model is used by developers to understand the behavior
of Cloud services. The main goal of Cloud# is to describe how services are delivered, while taking
into consideration the interaction among physical and virtual resources. The main syntactic
construct within Cloud# is the computation unit CUnit, which can model Cloud systems, virtual
machines, or operating systems. A CUnit is represented as a tuple of six components modeling
characteristics and behaviors. This language gives developers a better understanding of the Cloud
organization and how their applications are dealt with.
3.2.2 Resource Offering and Treatment
Once the D-Cloud resources are modeled, the next challenge is to describe how resources are offered to developers, which is important since the requirements supported by the provider are defined in this step. This challenge will also define the interfaces of the D-Cloud. It differs from resource modeling, since the modeling is independent of the way resources are offered to developers. For example, the provider could model each resource individually, as independent items on a fine-grained scale such as GHz of CPU or GB of memory, but could offer them as a coupled collection of those items or a bundle, such as the VM templates cited in Section 2.5.2.
Recall that, in addition to computational requirements (CPU and memory) and traditional network requirements, such as bandwidth and delay, new requirements are present in D-Cloud scenarios. The topology of the nodes is a first interesting requirement to be described. Developers should be able to set inter-node relationships and communication restrictions (e.g., downlink and uplink rates). This is illustrated in the scenario where servers – configured and managed by developers – are distributed at different geographical localities while needing to communicate with each other in a specific way.
Jurisdiction is related to where (geographically) applications and their data must be stored and handled. Due to restrictions such as copyright laws, D-Cloud users may want to limit the locations where their information will be stored (such as countries or continents). Another geographical constraint can be imposed by a maximum (or minimum) physical distance (or delay value) between nodes. Here, although developers do not know the actual topology of the nodes, they may merely establish some delay threshold value, for example.
Developers should also be able to describe scalability rules, which specify how and when the application should grow and consume more resources from the D-Cloud. Authors in [21] and [9] define a way of doing this, allowing the Cloud user to specify actions that should be taken, like deploying new VMs, based on thresholds of metrics monitored by the D-Cloud itself.
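A minimal sketch of such a rule, following the condition/operator/action structure, is shown below; the metric name, threshold, and action are illustrative assumptions.

    # Minimal threshold-based scalability rule in the spirit of [21] and [9]:
    # a monitored condition, a relational operator, and an action (names are illustrative).
    rule = {"metric": "cpu_utilization", "operator": ">", "threshold": 0.80,
            "action": "deploy_new_vm"}

    def evaluate(rule, observed_value, actions):
        ops = {">": lambda a, b: a > b, "<": lambda a, b: a < b}
        if ops[rule["operator"]](observed_value, rule["threshold"]):
            actions[rule["action"]]()          # e.g., ask the Cloud for one more VM

    evaluate(rule, 0.93, {"deploy_new_vm": lambda: print("scaling out")})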
Additionally, resource offering is associated with interoperability. Current Cloud providers offer proprietary interfaces to access their services, which can lock users into their infrastructure, as the migration of applications between providers cannot be easily made [8]. It is hoped that Cloud providers recognize this problem and work together to offer a standardized API.
According to [61], Cloud interoperability faces two types of heterogeneity: vertical heterogeneity and horizontal heterogeneity. The first type is concerned with interoperability within a single Cloud and may be addressed by a common middleware throughout the entire infrastructure. The second challenge, horizontal heterogeneity, is related to Clouds from different providers. Therefore, the key challenge is dealing with these differences. In this case, a high level of granularity in the modeling may help to address the problem.
An important effort in the search for horizontal standardization comes from the Open Cloud Manifesto5, which is an initiative supported by hundreds of companies that aims to discuss a way to produce open standards for Cloud Computing. Their major doctrines are collaboration and coordination of efforts on the standardization, adoption of open standards wherever appropriate, and the development of standards based on customer requirements. Participants of the Open Cloud Manifesto, through the Cloud Computing Use Case group, produced an interesting white paper [51] highlighting the requirements that need to be standardized in a Cloud environment to ensure interoperability in the most typical scenarios of interaction in Cloud Computing.
5 http://www.opencloudmanifesto.org/
Another group involved with Cloud standards is the Open Grid Forum6, which intends to develop the specification of the Open Cloud Computing Interface (OCCI)7. The goal of OCCI is to provide an easily extensible RESTful interface for Cloud management. Originally, OCCI was designed for IaaS setups, but its current specification [46] was extended to offer a generic scheme for the management of different Cloud services.
3.2.3 Resource Discovery and Monitoring
When requests reach a D-Cloud, the system should be aware of the current status of its resources, in order to determine whether there are available resources in the D-Cloud that could satisfy the requests. Thus, the right mechanisms for resource discovery and monitoring should be designed, allowing the system to be aware of the updated status of all its resources. Then, based on the current status and the requests' requirements, the system may select and allocate resources to serve these new requests.
Resource monitoring should be continuous and should help in taking allocation and reallocation decisions as part of the overall resource usage optimization. A careful analysis should be done to find a good and acceptable trade-off between the amount of control overhead and the frequency of resource information updates.
The monitoring may be passive or active. It is considered passive when one or more entities collect information; such an entity may continuously send polling messages to nodes asking for information or may do this on demand when necessary. On the other hand, the monitoring is active when nodes are autonomous and may decide when to asynchronously send state information to some central entity. Naturally, D-Clouds may use both alternatives simultaneously to improve the monitoring solution. In this case, it is necessary to synchronize updates in repositories to maintain the consistency and validity of state information.
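The sketch below contrasts the two alternatives: a passive poller that periodically queries nodes, and an active report issued by the node itself when a threshold is crossed. The node objects (assumed to expose node_id and read_state()) and the repository are illustrative stand-ins, not part of any particular system.

    import threading
    import time

    # Conceptual sketch contrasting passive (poll-based) and active (push-based) monitoring.
    class Repository:
        def __init__(self):
            self._lock = threading.Lock()
            self.status = {}

        def update(self, node_id, info):
            with self._lock:              # keep concurrent updates consistent
                self.status[node_id] = info

    def passive_poller(repo, nodes, interval_s=30):
        # A central entity periodically asks every node for its state.
        while True:
            for node in nodes:
                repo.update(node.node_id, node.read_state())
            time.sleep(interval_s)

    def active_report(repo, node, threshold=0.9):
        # The node itself decides to push its state when a threshold is crossed.
        state = node.read_state()
        if state["load"] > threshold:
            repo.update(node.node_id, state)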
The discovery and monitoring in a D-Cloud can be accompanied by the development of specific communication protocols. Such protocols act as a standard control plane for the Cloud, allowing interoperability between devices. It is expected that such protocols can control the different elements present in the D-Cloud, including servers, switches, routers, load balancers, and storage components. One possible method of coping with this challenge is to use smart communication nodes with an open programming interface for creating new services within the node. One example of this type of open node can be seen in the emerging Openflow-enabled switches [44].
6 http://www.gridforum.org/
7 http://occi-wg.org/about/specification/
3.2.4 Resource Selection and Optimization
With information regarding Cloud resource availability at hand, a set of appropriate candidates may then be highlighted. Next, the resource selection process finds the configuration that fulfills all requirements and optimizes the usage of the infrastructure. Selecting a solution from the set of available ones is not a trivial task due to the dynamicity, the high algorithmic complexity, and all the different requirements that must be contemplated by the provider.
The problem of resource allocation is recurrent in computer science, and several computing areas have faced this type of problem since early operating systems. Particularly in the Cloud Computing field, due to the heterogeneous and time-variant environment of Clouds, resource allocation becomes a complex task, forcing the mediation system to respond with minimal turnaround time in order to maintain the developer's quality requirements. Also, balancing resources' load and designing energy-efficient Clouds are major challenges in Cloud Computing. This last aspect is especially relevant as a result of the high demand for electricity to power and cool the servers hosted in datacenters [7].
In a Cloud, energy savings may be achieved through many different strategies. Server consolidation, for example, is a useful strategy for minimizing energy consumption while maintaining high usage of servers' resources. This strategy saves energy by migrating VMs onto a subset of servers and putting idle servers into a standby state. Developing automated solutions for server consolidation can be a very complex task, since these solutions can be mapped to bin-packing problems known to be NP-hard [72].
VM migration and cloning provide a technology to balance load over servers within a Cloud, provide fault tolerance to unpredictable errors, or reallocate applications before a programmed service interruption. However, although this technology is present in major industry hypervisors (like VMware or Xen), some open problems remain to be investigated. These include cloning a VM into multiple replicas on different hosts [40] and developing VM migration across wide-area networks [14]. Also, VM migration introduces a network problem, since, after migration, VMs require adaptation of the link-layer forwarding. Some of the strategies for new datacenter architectures explained in [67] offer solutions to this problem.
The remodeling of datacenter architectures is another research field that tries to overcome limitations on scalability, stiffness of address spaces, and node congestion in Clouds. Authors in [67] surveyed this theme, highlighted the problems in the network topologies of state-of-the-art datacenters, and discussed literature solutions for these problems. One of these solutions is the D-Cloud, as also pointed out by [72], which offers an energy-efficient alternative for constructing a Cloud and an adapted solution for time-critical services and interactive applications.
Considering specifically the challenges of resource allocation in D-Clouds, one can highlight correlated studies based on Replica Placement and Network Virtualization. The former is applied in Content Distribution Networks (CDNs) and tries to decide where and when content servers should be positioned in order to improve the system's performance; such a problem is associated with the placement of applications in D-Clouds. The latter research field can be applied to D-Clouds considering that a virtual network is an application composed of servers, databases, and the network between them. Both research fields are described in the following sections.
Replica Placement
Replica Placement (RP) consists of a very broad class of problems whose main objective is to decide where, when, and by whom servers or their content should be positioned in order to improve CDN performance. The corresponding solutions to these problems are generally known as Replica Placement Algorithms (RPAs) [35].
The general RP problem is modeled as a physical topology (represented by a graph), a set of
clients requesting services, and some servers to place on the graph (costs per server can be
considered instead). Generally, there is a pre-established cost function to be optimized that reflects
service-related aspects, such as the load of user’s requests, the distance from the server, etc. As
pointed out by [35], an RPA groups these aspects into two different components: the problem
definition, which consists of a cost function to be minimized under some constraints, and a
heuristic, which is used to search for near-optimal solutions in a feasible time frame, since the
defined problems are usually NP-complete.
Several different variants of this general problem have already been studied. According to [57],
they fall into two classes: facility location and minimum K-median. In the facility location problem,
the main goal is to minimize the total cost of the graph through the placement of a number of
servers, which have an associated cost. The minimum K-median problem, in turn, is similar but
assumes the existence of a pre-defined number K of servers. More details on the modeling and
comparison between different variants of the RP problem are provided by [35].
Different versions of this problem can be mapped onto resource allocation problems in D-
Clouds. A very simple mapping can be defined considering an IaaS service where virtual machines
can be allocated in a geo-distributed infrastructure. In such mapping, the topology corresponds to
the physical infrastructure elements of the D-Cloud, the VMs requested by developers can be
treated as servers, and the number of clients accessing each server would be their load.
Qiu et al. [57] proposed three different algorithms to solve the K-median problem in a CDN scenario: a Tree-based algorithm, a Greedy algorithm, and a Hot Spot algorithm. The Tree-based solution assumes that the underlying graph is a tree, which is divided into several small trees, placing one server in each small tree. The Greedy algorithm places servers one at a time, choosing at each step the placement that yields the best solution, until all servers are allocated. Finally, the Hot Spot solution attempts to place servers in the vicinity of the clients with the greatest demand. The results showed that the Greedy algorithm for replica placement could provide CDNs with performance that is close to optimal.
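A compact sketch of the greedy strategy is given below: at each of the K steps, the candidate site whose addition most reduces the total demand-weighted distance from clients to their nearest replica is selected. The toy distances and demands are illustrative assumptions and the sketch does not reproduce the exact algorithm of [57].

    # Greedy placement in the spirit of the K-median heuristic: add one replica at a
    # time, always picking the site that most reduces the total (demand-weighted)
    # distance from clients to their nearest replica.
    def total_cost(placed, clients, demand, dist):
        return sum(demand[c] * min(dist[c][s] for s in placed) for c in clients)

    def greedy_placement(k, sites, clients, demand, dist):
        placed = []
        for _ in range(k):
            best = min((s for s in sites if s not in placed),
                       key=lambda s: total_cost(placed + [s], clients, demand, dist))
            placed.append(best)
        return placed

    # Toy instance: 2 candidate sites, 3 client groups (illustrative values).
    dist = {"c1": {"s1": 1, "s2": 9}, "c2": {"s1": 8, "s2": 2}, "c3": {"s1": 7, "s2": 3}}
    demand = {"c1": 10, "c2": 5, "c3": 5}
    print(greedy_placement(1, ["s1", "s2"], ["c1", "c2", "c3"], demand, dist))  # ['s1']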
These solutions can be mapped onto D-Clouds considering the simple scenario of VM allocation on a geo-distributed infrastructure with the restriction that each developer has a fixed number of servers to serve their clients. In this case, the problem can be straightforwardly reduced to the K-median problem and the three proposed solutions can be applied. Basically, one could treat each developer as a different CDN and optimize each one independently, while still considering the limited capacity of the physical resources caused by the allocation of other developers.
Presti et al. [56] treat an RP variant considering a trade-off between the load of requests per content and the number of replica additions and removals. Their solution considers that each server in the physical topology decides autonomously, based on thresholds, when to clone overloaded contents or remove underutilized ones. Such decisions also encompass the minimization of the distance between clients and the respective accessed replica. A similar problem is investigated in [50], but considering constraints on the QoS perceived by the client. The authors propose an offline mathematical formulation and an online version that uses a greedy heuristic. The results show that the heuristic presents good results with small computational time.
The main focus of these solutions is to provide scalability to the CDN according to the load
caused by client requests. Thus, despite working only with the placement of content replicas, such
solutions can also be applied to D-Clouds with some simple modifications. Considering replicas as
allocated VMs, one can apply the threshold-based solution proposed in [56] to the simple scenario
of VM scalability on a geo-distributed infrastructure.
Network Virtualization
The main problem of NV is the allocation of virtual networks over a physical network ([10], [3]). Analogously, the D-Cloud's main goal is to allocate application requests on physical resources according to some constraints, attempting to obtain a clever mapping between the virtual and physical resources. Therefore, problems in D-Clouds can be formulated as NV problems, especially in scenarios considering IaaS-level services.
Several instances of the NV-based resource allocation problem can be reduced to NP-hard problems [48]. Even the versions where one knows beforehand all the virtual network requests that will arrive in the system are NP-hard. The basic solution strategy thus is to restrict the problem space, making it easier to deal with, and also to consider the use of simple heuristic-based algorithms to achieve fast results.
Given a model based on graphs to represent both physical and virtual servers, switches, and links [10], an algorithm that allocates virtual networks should consider the constraints of the problem (CPU, memory, location, or bandwidth limits) and an objective function based on the algorithm's goals. In [31], the authors describe some possible objective functions to be optimized, like the ones related to maximizing the revenue of the service provider, minimizing link and node stress, etc. They also survey heuristic techniques used when allocating the virtual networks, dividing them into two types: static and dynamic. The dynamic type permits reallocation over time by adding more resources to already allocated virtual networks in order to obtain better performance. The static type means that once a virtual network is allocated it will hardly ever change its setup.
To exemplify the type of problems studied in NV, one can consider the problem studied by Chowdhury et al. [10]. Its authors propose an objective function related to the cost and revenue of the provider, constrained by capacity and geo-location restrictions. They reduce the problem to a mixed integer program and then relax the integer constraints, deriving two different algorithms that approximate the solution. Furthermore, the paper also describes a Load Balancing algorithm, in which the original objective function is customized in order to avoid using nodes and links with low residual capacity. This approach results in allocation on less loaded components and an increase in the revenue and acceptance ratio of the substrate network.
This type of problem and solution can be applied to D-Clouds. One example could be the allocation of interactive servers with jurisdiction restrictions. In this scenario, the provider must allocate applications (which can be mapped onto virtual networks) whose nodes are linked and must be close to a certain geographical place according to a maximum tolerated delay. Thus, a provider could apply the proposed algorithms with minor adjustments.
In the paper of Razzaq and Rathore [58], the virtual network embedding algorithm is divided into two steps: node mapping and link mapping. In the node mapping step, nodes with the highest resource demand are allocated first. The link mapping step is based on an edge-disjoint k-shortest path algorithm, selecting the shortest path that can fulfill the virtual link bandwidth requirement. In [42], a backtracking algorithm for the allocation of virtual networks onto substrate networks based on the graph isomorphism problem is proposed. The modeling considers multiple capacity constraints.
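A much simplified sketch of the two-step idea follows: virtual nodes are mapped in decreasing order of CPU demand onto the feasible substrate node with the most residual capacity, and each virtual link is then mapped onto a shortest substrate path with enough residual bandwidth (found here by breadth-first search). All names and data structures are illustrative assumptions and the sketch does not reproduce the exact algorithms of [58].

    from collections import deque

    # Simplified two-step embedding sketch (node mapping, then link mapping).
    # `graph` is an adjacency dict, `bw` maps sorted node pairs to residual
    # bandwidth, and `cpu` maps substrate nodes to residual CPU capacity.
    def shortest_feasible_path(graph, bw, src, dst, need):
        # Breadth-first search restricted to links with enough residual bandwidth.
        queue, seen = deque([[src]]), {src}
        while queue:
            path = queue.popleft()
            if path[-1] == dst:
                return path
            for nxt in graph[path[-1]]:
                if nxt not in seen and bw[tuple(sorted((path[-1], nxt)))] >= need:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return None

    def embed(virtual_nodes, virtual_links, cpu, graph, bw):
        mapping = {}
        # Node mapping: highest CPU demand first, onto the node with most residual CPU.
        for vnode, demand in sorted(virtual_nodes.items(), key=lambda x: -x[1]):
            host = max((n for n in cpu if cpu[n] >= demand),
                       key=lambda n: cpu[n], default=None)
            if host is None:
                return None                          # request rejected
            cpu[host] -= demand
            mapping[vnode] = host
        # Link mapping: shortest feasible substrate path for each virtual link.
        paths = {}
        for (a, b), need in virtual_links.items():
            path = shortest_feasible_path(graph, bw, mapping[a], mapping[b], need)
            if path is None:
                return None                          # request rejected
            for u, v in zip(path, path[1:]):
                bw[tuple(sorted((u, v)))] -= need
            paths[(a, b)] = path
        return mapping, paths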
Zhu and Ammar [74] proposed a set of four algorithms with the goal of balancing the load on
the physical links and nodes, but their algorithms do not consider capacity aspects. Their algorithms
perform the initial allocation and make adaptive optimizations to obtain better allocations. The key
idea of the algorithms is to allocate virtual nodes considering the load of the node and the load of
the neighbor links of that node. Thus one can say that they perform the allocation in a coordinated
way. For virtual link allocation, the algorithm tries to select paths with few stressed links in the
network. For more details about the algorithm see [74].
Considering the objectives of NV and RP problems, one may note that NV problems are a general form of the RP problem: RP problems try to allocate virtual servers, whereas NV considers the allocation of virtual servers and virtual links. Both categories of problems can be applied to D-Clouds. Particularly, RP and NV problems may be mapped onto two different classes of D-Clouds: less controllable D-Clouds and more controllable ones, respectively. The RP problems are suitable for scenarios where the allocation of servers is more critical than that of links. In turn, the NV problems are especially adapted to situations where the provider is an ISP that has full control over the whole infrastructure, including the communication infrastructure.
3.2.5 Summary
The D-Cloud domain brings several engineering and research challenges that were discussed in this
section and whose main aspects are summarized in Table I. Such challenges are only starting to
receive attention from the research community. Particularly, the system, models, languages, and
algorithms presented in the next chapters will cope with some of these challenges.
Table I Summary of the main aspects discussed

Categories                            Aspects
Resource Modeling                     Heterogeneity of resources
                                      Physical and virtual resources must be considered
                                      Complexity vs. Flexibility
Resource Offering and Treatment       Describe the resources offered to developers
                                      Describe the supported requirements
                                      New requirements: topology, jurisdiction, scalability
Resource Discovery and Monitoring     Monitoring must be continuous
                                      Control overhead vs. Updated information
Resource Selection and Optimization   Find resources to fulfill developer's requirements
                                      Optimize usage of the D-Cloud infrastructure
                                      Complex problems solved by approximation algorithms
4 The Nubilum System
“Expulsa nube, serenus fit saepe dies.”
Popular Proverb
Section 2.4 introduced an Archetypal Cloud Mediation System focusing specifically on the resource management process, which ranges from the automatic negotiation of developers' requirements to the execution of their applications. Furthermore, this system was divided into three layers: negotiation, resource management, and resource control. Keeping in mind this simple archetypal mediation system, this chapter presents Nubilum, a resource management system that offers a self-managed solution for the challenges resulting from the discovery, monitoring, control, and allocation of resources in D-Clouds. This system appeared previously in [25] under the name D-CRAS (Distributed Cloud Resource Allocation System).
Section 4.1 presents some decisions taken to guide the overall design and implementation of Nubilum. Section 4.2 presents a conceptual view of Nubilum's architecture, highlighting its main modules. The functional components of Nubilum are detailed in Section 4.3. Section 4.4 presents the main processes performed by Nubilum. Section 4.5 closes this chapter by summarizing the contributions of the system and comparing it with correlated resource management systems.
4.1 Design Rationale
As stated previously in Section 1.2, the objective of this Thesis is to develop a self-manageable system for resource management on D-Clouds. Before the development of the system and its corresponding architecture, some design decisions that guide the development of the system must be delineated and justified.
4.1.1 Programmability
The first aspect to be defined is the abstraction level at which Nubilum will act. Given that D-Cloud concerns can be mapped onto previous approaches from the Replica Placement and Network Virtualization research areas (see Section 3.2.4), a straightforward approach is to consider a D-Cloud working at the same abstraction level. Therefore, knowing that proposals in both areas commonly work at the IaaS level, i.e., providing virtualized infrastructures, Nubilum will naturally also operate at the IaaS level.
Nubilum offers a Network Virtualization service: applications can be treated as virtual networks and the provider's infrastructure is the physical network. In this way, the allocation problem is a virtual network assignment problem and previous solutions from the NV area can be applied. Note that such an approach does not exclude previous Replica Placement solutions, because that area can be viewed as a particular case of Network Virtualization.
4.1.2 Self-optimization
As defined in Section 2.1, the Cloud must provide services in a timely manner, i.e., the resources required by users must be configured as quickly as possible. To meet such a restriction, Nubilum must operate as much as possible without human intervention, which is the very definition of self-management from Autonomic Computing [69].
This operation involves the maintenance and adjustment of the D-Cloud resources in the face of changing application demands and of innocent or malicious failures. Thus, Nubilum must provide solutions to cope with the four aspects leveraged by Autonomic Computing: self-configuration, self-healing, self-optimization, and self-protection. Particularly, this Thesis focuses on investigating self-optimization – and, to some extent, self-configuration – in D-Clouds. The other two aspects are considered out of the scope of this proposal.
According to [69], self-optimization of a system involves letting its elements “continually seek
ways to improve their operation, identifying and seizing opportunities to make themselves more
efficient in performance or cost”. Such definition fits very well the aim of Nubilum, which must
ensure an automatic monitoring and control of resources to guarantee the optimal functioning of
the Cloud while meeting developers’ requirements.
4.1.3 Existing standards adoption
The Open Cloud Manifesto, an industry initiative that aims to discuss a way to produce open
standards for Cloud Computing, states that Cloud providers “must use and adopt existing standards
wherever appropriate” [51]. The Manifesto argues that several efforts and investments have been
made by the IT industry in standardization, so it seems more productive and economical to use such
standards when appropriate. Following this same line, Nubilum will adopt some industry standards
when possible. Such adoption is also extended to open processes and software tools.
4.2 Nubilum’s conceptual view
As shown in Figure 6, the conceptual view of Nubilum's architecture is composed of three planes: a Decision plane, a Management plane, and an Infrastructure plane. Starting from the bottom, the lower plane hosts all the modules responsible for the appropriate virtualization of each resource in the D-Cloud: servers, dedicated storage, and the network, which includes links, routers, and switches. The Management plane is responsible for the monitoring and control of the D-Cloud, as well as for the enforcement of the allocation decisions taken by the upper layer. Finally, at the top of the architecture, the Decision plane is where the advanced strategies and heuristics to allocate resources are implemented. In the following sections these modules are detailed.
Figure 6 Nubilum’s planes and modules
4.2.1 Decision plane
The Decision plane is composed of four modules: Negotiator, Mapper, Application Monitors, and
the D-Cloud Monitor. The Negotiator module is the front-end of Nubilum and is similar to the
Negotiation Layer presented in the Archetypal Mediation System in Section 2.4. Thus, the
Negotiator offers an API that can be used by developers to contract and control their virtual
resources. In this system, these operations will be implemented as Web Services using a descriptive
language whose definitions are illustrated in Chapter 5.
The Mapper is responsible for deciding the mapping of the required virtual resources onto the corresponding physical resources. It is important to note that, in Nubilum, the mapping algorithms work with a model of the entire infrastructure, as the effective allocation is the responsibility of the lower planes. Thus, the Mapper's inputs are a representation of a developer's request and a representation of the current status of the D-Cloud, and its output is a new representation of the D-Cloud status to be enforced over the real system. It is important to stress that the Mapper module contains the main intelligence of the system: it is responsible for finding a configuration that fulfills all computational and network resource requirements and optimizes the usage of the D-Cloud infrastructure. The allocation algorithms used as part of Nubilum are discussed in Chapter 6.
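The Mapper's role can be summarized by the interface sketch below; the class and method names are illustrative assumptions and do not correspond to Nubilum's actual code.

    # Illustrative sketch of the Mapper's role: it receives a developer request and
    # a model of the current D-Cloud state, and returns a new target state.
    class Mapper:
        def __init__(self, allocation_algorithm):
            self.allocate = allocation_algorithm    # pluggable heuristic (see Chapter 6)

        def map_request(self, request, dcloud_state):
            """Return a new D-Cloud state with the request embedded, or None if rejected."""
            new_state = self.allocate(request, dcloud_state)
            return new_state                        # enforced later by the lower planes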
The Decision plane also has a set of modules managing the applications of each developer: the Application Monitors. The Application Monitor is responsible for the monitoring part of the self-management loop of the system, since it periodically checks an application to guarantee the fulfillment of its requirements. Each application submitted to Nubilum has an Application Monitor associated with it. This module continuously checks the current status of the application against its requirements and selects the appropriate actions when attempting to improve the application's performance, if necessary. Please note that the Application Monitor does not make allocation decisions; it is merely responsible for detecting performance degradation and requesting new resources from the Allocator.
The provider's requirements – with respect to the usage of physical resources – are constantly monitored and verified through the D-Cloud Monitor module. Similar to the Application Monitor modules, the D-Cloud Monitor is another module acting in the self-management loop, through the monitoring of the physical resources. It checks the current state of the physical resources against the provider's requirements. Upon detecting that a threshold has been surpassed (e.g., an increase of server load beyond an established threshold), the D-Cloud Monitor notifies the Mapper, soliciting the re-optimization of the infrastructure.
4.2.2 Management plane
Nubilum divides D-Cloud resources into three categories: server, network, and storage. In the server
category there are all the devices that can host virtual machines. The network resources represent
links, network devices (routers and switches), and protocols that compose the underlying topology
of the D-Cloud. Finally, the storage resources are the nodes dedicated to storing virtual machine
images, files, or databases.
Considering such division, the Management plane has different modules for controlling and
monitoring each category of resources. Thus, the Server Controller and Server Collector are
respectively responsible for the control of the servers and hosted VMs, and for the acquisition of
updated status of the server and VMs. The Network Collector is responsible for obtaining updated
information about the state of the network and the load of links, whereas the Network Controller is
responsible for communicating decisions about the creation, modification, and removal of virtual
links to network devices. Further, this controller manages the assignments of IP and MAC addresses
to virtual resources. Similarly, the Storage Controller and the Storage Collector are responsible for
controlling and collecting storage resources. All such information about resource monitoring is sent
to the D-Cloud Monitor and Application Monitors.
Associated with the individual controllers and collectors of each resource is the Resource Discoverer, which is responsible for the detection of resources in the D-Cloud and for maintaining information about their respective status, which is collected by the respective monitors of computing, storage, and network resources.
Finally, the Resource Maestro orchestrates the creation and removal of the virtual resources on the respective physical resources according to the decisions made by the modules at the Decision plane. When a new request arrives at Nubilum and the Mapper module decides where the new virtual resources will be allocated, the Maestro enforces this decision by communicating it to the components involved. One important task of this module is to translate the high-level orders taken by the Mapper into low-level requests that can be handled by the other modules. Such translations must be made carefully, considering the application's requirements, since the order in which changes are enforced may affect the application's execution. For example, consider a virtual network of two virtual nodes A and B and one direct virtual link between them. Positioning a new virtual node C in the middle (with one virtual link from A to C and another one between C and B) involves the coordinated creation of these two virtual links in order to minimize interruptions in the intercommunication of the virtual nodes.
4.2.3 Infrastructure plane
The infrastructure plane offers tools used for the appropriate virtualization of servers, network
elements and storage devices in the D-Cloud. As a result, this plane requires three modules to
accomplish its tasks.
The Server Virtualization is the module responsible for the effective management of resources
in the physical servers. It also manages the creation and maintenance of virtual machines, i.e., it
corresponds to the hypervisor installed in the physical servers. Thus, it is important to note that,
differently from the Server Controller and Server Monitor modules already described, the Server
Virtualization module is intended for the local control of a server and its resources.
The Network Virtualization module corresponds to the platform used for the effective
virtualization of the network and the associated protocols to accomplish this task. Similarly, the
Storage Virtualization module comprises all the technologies used for the effective control of storage
components in the D-Cloud.
4.3 Nubilum’s functional components
This section describes how the modules presented in the conceptual view in Section 4.2 are
rearranged to create the functional components of Nubilum. Moreover, this section provides
information about the technological solutions taken to overcome some D-Cloud-related design
issues.
The functional components in Nubilum are: the Allocator (located at the Decision plane in
the conceptual view), the Manager (that has modules from the Decision and Management plane), the
Storage System (situated at the Infrastructure plane), the Network Devices (situated at the
Infrastructure plane), and the Workers (situated in a mid-term between the Management plane and
the Infrastructure plane). Figure 7 illustrates these five components and their respective modules. At
the top are the Allocator and the Manager responsible, respectively, for taking decisions in the
allocation of incoming requests and for the overall control of the D-Cloud resources. The other
three components are responsible for operational tasks in each type of physical resource.
Figure 7 Functional components of Nubilum
4.3.1 Allocator
The Allocator (shown in Figure 8) is responsible for treating the resource requests made by developers and for mapping the requested virtual resources onto the physical resources of the D-Cloud.
Figure 8 Schematic diagram of Allocator’s modules and relationships with other components
The Allocator needs to be as simple as possible in order to preserve system performance, since the optimization problems for resource allocation tend to be computationally intensive. Therefore, it has just the Mapper module (part of the Decision plane), which communicates with the Manager through a REST-based API to obtain information about the D-Cloud state and to send commands enforcing new decisions. The Negotiator module of the Decision plane is the external interface of Nubilum, which offers a REST API for communication with developers and the provider.
4.3.2 Manager
The Manager is the central component in Nubilum that coordinates the overall control of the D-
Cloud and maintains updated information about the status of all its resources. It has four modules
from the Management plane (Network Controller, Network Collector, Resource Discoverer, and
Resource Maestro) and two modules from the Decision plane (Application Monitor and D-Cloud
Monitor), which are represented in Figure 9. This figure also shows the communication with other
components [44].
Figure 9 Schematic diagram of Manager’s modules and relationships with other components
The Resource Discoverer module uses a simple process for discovering servers and network devices, based on the prior registration of each individual device at the Manager. Thus, the known address of the Manager must be manually configured on the Workers and the Network Devices. The status information of the Network Devices is acquired through periodic queries to the Openflow Collector module, while the status of each Worker is sent by the Worker itself when a configured threshold is crossed. All information is passed to and maintained by the Resource Discoverer module.
The Network Controller and Network Collector modules implement the functions of control and status collection of network devices individually. These modules assume the usage of Openflow as the platform for network virtualization. This platform was chosen because of its versatility: it allows configuring Network Devices in order to set up a path in the physical network corresponding to the virtual link of an application. These modules can be implemented using current APIs which support the Openflow protocol or, alternatively, by invoking Openflow controllers such as NOX [29].
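To make the idea concrete, the fragment below represents a virtual link as the list of flow entries that would have to be installed along a substrate path. The dictionary format is a conceptual illustration only; it does not correspond to the NOX API or to the OpenFlow wire protocol.

    # Conceptual representation of a virtual link as a chain of flow entries
    # installed along a substrate path (illustrative structure, not a real API).
    def virtual_link_flows(path, vm_src_mac, vm_dst_mac):
        """path: list of (switch_id, in_port, out_port) hops along the route."""
        flows = []
        for switch_id, in_port, out_port in path:
            flows.append({
                "switch": switch_id,
                "match": {"in_port": in_port, "dl_src": vm_src_mac, "dl_dst": vm_dst_mac},
                "actions": [{"output": out_port}],
            })
        return flows

    flows = virtual_link_flows([("s1", 1, 3), ("s2", 2, 4)],
                               "52:54:00:00:aa:01", "52:54:00:00:aa:02")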
4.3.3 Worker
The basic component of Nubilum is the Worker. This component is installed on each physical
server and its main function is to manage the entire life-cycle of the virtual machines hosted on the
hypervisor. A schematic view of the relationship between the modules of this component is
presented in Figure 10.
Figure 10 Schematic diagram of Worker modules and relationships with the server system
From the Infrastructure plane, the Workers execute the Server Virtualization module. The Workers are required to support third-party hypervisors such as Xen, VMware, or KVM. To avoid implementing in the Worker the different drivers supporting each one of these hypervisors, the open-source Libvirt API [5] is used. Libvirt provides a common, generic, and stable layer for VM management and offers an abstraction to Cloud providers, since it supports execution on different hypervisors such as Xen, KVM, VirtualBox, and VMware ESX. Through Libvirt, the Workers can effectively control and monitor several aspects of the virtual machines, which can simply be consumed by the other modules. The advantage obtained when using Libvirt is notable, since this API offers a hypervisor-agnostic alternative for virtual machine control. In addition, Libvirt provides an abstraction of other aspects associated with the virtual machine, such as storage and network manipulation.
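As an illustration of the kind of information a Server Collector can obtain through Libvirt, the sketch below uses the libvirt Python bindings to list the domains of a local hypervisor and their basic status; the connection URI and the reporting format are illustrative choices, not Nubilum's actual code.

    import libvirt

    # Sketch of status collection through the libvirt Python bindings.
    def collect_server_status(uri="qemu:///system"):
        conn = libvirt.openReadOnly(uri)
        host_cpus = conn.getInfo()[2]                 # number of physical CPUs
        vms = []
        for dom in conn.listAllDomains():
            state, max_mem, mem, vcpus, cpu_time = dom.info()
            vms.append({"name": dom.name(), "state": state,
                        "memory_kb": mem, "vcpus": vcpus})
        conn.close()
        return {"host_cpus": host_cpus, "vms": vms}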
Workers have several modules. For the control and monitoring of the physical server and the virtual machines hosted on it, there are the Server Controller and the Server Collector modules. The Server Controller is responsible for the overall tasks involved in the creation, maintenance, and destruction of virtual machines, while the Server Collector maintains records of the status of the
server and the virtual machines hosted on it. To accomplish these tasks, these modules coordinate
the execution of the other modules hosted at the Worker, call the hypervisor through the Libvirt
API and invoke the operating system through some host monitoring API available in the
programming language (in the case of Java, one could use the JavaSysMon system call library [33]).
The Storage Controller and Storage Collector modules manage the devices used for storing virtual machine images. They maintain the storage pool while creating and deleting virtual machine images. For this, the modules use Libvirt, which offers a common interface to manage many different types of storage space, ranging from iSCSI devices to a simple directory on the same machine. Thus, these modules can access any type of storage, making Nubilum independent of the actual underlying storage scheme used in the D-Cloud; the only assumption made is that all files are potentially accessible from every machine in the D-Cloud. More about storage in the D-Cloud can be found in Section 4.3.5.
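A minimal sketch of how such a module could create a volume for a new virtual machine image through Libvirt's storage API is shown below; the pool name, volume name, and size are hypothetical examples, and the actual pool type (directory, iSCSI, etc.) is transparent to this code.

```python
import libvirt

conn = libvirt.open("qemu:///system")

# Look up a previously configured storage pool (the name "default" is an assumption)
pool = conn.storagePoolLookupByName("default")

# Volume description for a new 10 GB image, following Libvirt's volume XML format
vol_xml = """
<volume>
  <name>vm-001-disk.img</name>
  <capacity unit='G'>10</capacity>
</volume>
"""

# Create the volume; the same call works regardless of the underlying pool type
volume = pool.createXML(vol_xml, 0)
print("Created volume at", volume.path())

conn.close()
```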
The Workers are responsible for the effective assignment of IPs and MACs to the virtual machines. This task is performed by the Network Controller module, which is composed of a built-in DHCP server and an LLDP (Link Layer Discovery Protocol) agent. The DHCP module is dynamically configured with a direct mapping between MAC and IP addresses, which is used to give reserved IPs to each virtual machine hosted on the server. Please note that the presence of several DHCP servers (one per Worker) does not cause interference between them, since each built-in DHCP server will only respond to requests from the virtual machines hosted on its Worker, whereas requests with unknown MAC addresses will be dropped. The LLDP agent is employed for the advertisement of LLDP messages used in the discovery of the Worker's links. The use of LLDP for discovery is detailed in Section 4.4.2.
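The filtering behaviour described above can be sketched as follows: each Worker keeps the MAC-to-IP reservations produced when its virtual machines are created and answers only requests whose MAC it knows. The function below is a logic sketch only, not the actual DHCP implementation, and the addresses are hypothetical.

```python
# MAC -> reserved IP mapping, filled in when virtual machines are created on this Worker
reservations = {
    "52:54:00:aa:bb:01": "10.0.1.11",
    "52:54:00:aa:bb:02": "10.0.1.12",
}

def handle_dhcp_discover(client_mac):
    """Return the reserved IP for a known MAC, or None to drop the request."""
    ip = reservations.get(client_mac.lower())
    if ip is None:
        # Unknown MAC: the request belongs to a VM hosted on another Worker, so drop it
        return None
    return ip

print(handle_dhcp_discover("52:54:00:AA:BB:01"))  # -> 10.0.1.11
print(handle_dhcp_discover("52:54:00:00:00:99"))  # -> None (dropped)
```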
In addition to the modules used for the effective control and monitoring of the server and virtual machines, the Worker has an Interface for communicating with the Manager component. This interface is a REST-based Web Service.
4.3.4 Network Devices
Nubilum relies on a network virtualization platform for the setup and maintenance of the virtual links across the system. In the current version of Nubilum, all network devices in the system are required to implement Openflow for network virtualization purposes. The advantage of adopting such a solution is that not only is there native support for the creation of virtual links, but also no further adaptations must be made to the network devices. This component has only one module from the Conceptual View: the Network Virtualization module.
4.3.5 Storage System
The Storage System covers the Storage Virtualization module, responsible for virtualizing all the storage resources spread across the D-Cloud. As stated earlier, Nubilum does not enforce the usage of any particular storage technology, but entrusts this to Libvirt's coverage of storage technologies. The only requirement is that all the virtual machine images are available to all servers in the D-Cloud. This occurs because Nubilum depends on current virtual machine migration techniques, which have this particular requirement.
Please note that the D-Cloud should not employ centralized storage solutions, but it can employ distributed storage solutions, such as the Sector project [28], a file system designed for distributed storage over WANs.
4.4 Processes
Several processes in Nubilum are performed as a result of the coordination between the
components. The present section will discuss these processes, which can be organized into three
classes: initialization processes, discovery processes, and resource request processes.
4.4.1 Initialization processes
The initialization process of the components that make up Nubilum is simple, since each component is configured before its initialization. The first component to be started is the Storage System, as it is a third-party system. Similarly, the network devices must be configured with routes for
accessing the overall physical resources in the system and with the indication of the Openflow
controller address (i.e., the known IP address of the Manager component). The connection follows
the standardized process specified in the Openflow documentation [52].
The last infrastructural component, the Worker, is installed in each server of the D-Cloud and
its configuration includes the Manager’s address for registration. This configuration also includes
information about storage pools, geographical location, and base images to be used for virtual
machine creation. When started, the Worker connects with the local Libvirt API and obtains
information about the server and current virtual machines hosted on the server. Using such
information, the Worker prepares a description of the local server and virtual machines it hosts and
then initiates a simple registration process in the Manager, through the REST interface. If the
Manager is not initialized yet, the Worker will try again after a random sleep period.
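A minimal sketch of this registration step is shown below, assuming the Python requests library and a hypothetical Manager address; the CloudML document would be the cloudml/nodesdescription+xml payload described in Chapter 5.

```python
import random
import time
import requests

MANAGER_URL = "http://manager.example:8080/worker"  # hypothetical Manager address

def register(description_xml):
    """POST the Worker's CloudML description; retry after a random sleep if needed."""
    while True:
        try:
            reply = requests.post(
                MANAGER_URL,
                data=description_xml,
                headers={"Content-Type": "cloudml/nodesdescription+xml"},
            )
            if reply.status_code == 201:
                return reply.headers["Location"]  # URL assigned to this Worker
        except requests.ConnectionError:
            pass  # Manager not initialized yet
        time.sleep(random.uniform(1, 10))  # random back-off before retrying
```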
The Manager can be initialized either on a reserved server or on a virtual machine hosted by a
server. Its initialization process opens an HTTP server for listening to REST messages and waits for
connections from the Workers and the Allocator component. Also, an Openflow controller is started to listen for incoming connections from Network Devices. As with the Manager, the Allocator can be initialized on any server of the D-Cloud or, alternatively, on a virtual machine. The address of the Manager is configured before the Allocator's initialization.
4.4.2 Discovery and monitoring processes
Resource discovery comprises the processes for finding new resources and acquiring updated status about them. The initialization processes of Workers and Network Devices – which involve the registration of both with the Manager – are the first part of the discovery processes that find new resources in the D-Cloud. However, such processes are not sufficient to discover the connections between physical resources in the D-Cloud. Therefore, Nubilum employs a discovery strategy supported by NOX, which makes use of LLDP messages generated by the Manager and sent by each Network Device to its neighbors [24].
The link discovery strategy is illustrated in Figure 11. First, the Manager sends an Openflow message (number 1 in the figure) to a switch in the network requesting the forwarding of an LLDP packet to each of its neighbor switches. When these neighbors receive the LLDP packet (arrows marked as 2), they generate a new Openflow message (marked as 3) to the Manager reporting the receipt of this specific LLDP packet. The same process is executed for all switches concurrently, and the LLDP packet of each switch is specific to that switch in order to identify each link. Such a strategy guarantees the discovery of all links between Network Devices in the D-Cloud, since it is assumed that all Network Devices are Openflow-enabled. This process was extended to capture LLDP messages sent by Workers in order to discover links between servers and Network Devices.
Figure 11 Link discovery process using LLDP and Openflow
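The bookkeeping behind this strategy can be sketched independently of the controller API: the Manager remembers, for each LLDP probe it injects, the switch and port it left from; when the corresponding packet-in arrives from another switch, the pair of (switch, port) endpoints identifies one physical link. The code below is a logic sketch with hypothetical identifiers, not NOX code.

```python
# Probes injected by the Manager: probe id -> (source switch, source port)
pending_probes = {}
discovered_links = set()

def send_lldp_probe(probe_id, src_switch, src_port):
    """Record where an LLDP probe was injected (the actual packet-out is done via Openflow)."""
    pending_probes[probe_id] = (src_switch, src_port)

def on_lldp_packet_in(probe_id, dst_switch, dst_port):
    """Called when a neighbor switch reports the LLDP packet back to the Manager."""
    src_switch, src_port = pending_probes[probe_id]
    discovered_links.add(((src_switch, src_port), (dst_switch, dst_port)))

# Example: probe "p1" leaves switch s1 port 2 and is reported by switch s2 on port 1
send_lldp_probe("p1", "s1", 2)
on_lldp_packet_in("p1", "s2", 1)
print(discovered_links)  # {(("s1", 2), ("s2", 1))}
```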
Collecting information about the status of each resource involves retrieving information about
the Storage System, the Network Devices, and the Workers. The performance status of the Network
Devices is monitored by the Manager in a passive fashion, by polling the resources at a configured period. Such polling is made through well-defined Openflow messages, which can report several counters to Nubilum, for example, the number of received packets or bytes per virtual link in a network device (flows, in Openflow terminology).
Unlike the Network Devices, the Workers actively send status information through the Storage and Server Collector modules. The information offered by the Worker is closely related to the support provided by Libvirt. It is possible to obtain information about CPU and memory usage and used disk space for the server and its virtual machines.
4.4.3 Resource allocation processes
The resource allocation processes are started when developers, the Application Managers, or the D-Cloud Manager contact the Allocator requesting, respectively, resources for a new application, more resources for an existing application, or the reallocation of resources to meet the provider's requirements. Figure 12 shows all messages exchanged in the resource allocation process when a developer requests resources for a new application. The developer sends to the Allocator a POST message containing a CloudML document (see Chapter 5 for details about this language) describing the requirements of its application, i.e., a virtual network with virtual machines, their geographical positions, and virtual links. After that, the Allocator sends a GET message to the Manager to retrieve the current status of the entire D-Cloud, which will be used as input to the resource allocation algorithms.
Figure 12 Sequence diagram of the Resource Request process for a developer
[Sequence shown in Figure 12: Developer → Allocator: POST App (CloudML); Allocator → Manager: GET DCloud, answered with CloudML; resource allocation at the Allocator; Allocator → Manager: PUT /dcloud (CloudML); Manager → Workers: POST /virnode (CloudML) and respective replies; Manager → Network Devices: Create flow (Openflow); Manager → Allocator: Reply PUT (CloudML); Allocator → Developer: Reply POST (CloudML)]
After the resource allocation algorithm executes, the Allocator sends a PUT message to the Manager indicating which resources (and their respective configuration) must be dedicated to this given developer. The Manager then sends POST messages to each Worker in the D-Cloud that will host the requested virtual resources, receives the respective replies, and sends Openflow messages to each Network Device informing them of the flows for the setup of the new virtual links. Next, the Manager sends a reply PUT to the Allocator to confirm the allocation, and finally, the Allocator returns a reply POST to the developer indicating whether it was possible to allocate the application, along with its respective IP address.
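The Allocator side of this exchange can be sketched with plain HTTP calls, as below. The Manager address and the allocation function are hypothetical placeholders; the MIMEtypes and the GET/PUT operations on the dcloud resource follow the interfaces detailed in Section 5.2.1.

```python
import requests

MANAGER = "http://manager.example:8080"  # hypothetical Manager address

def allocate_application(app_request_xml, run_allocation_algorithm):
    # 1. Retrieve the current description/status of the entire D-Cloud
    dcloud = requests.get(MANAGER + "/dcloud")
    dcloud.raise_for_status()

    # 2. Run the (pluggable) resource allocation algorithm over the D-Cloud description
    new_dcloud_xml = run_allocation_algorithm(dcloud.text, app_request_xml)

    # 3. Submit the new D-Cloud description; the Manager enforces it on Workers
    #    (POST /virnode) and on Network Devices (Openflow flow setup)
    reply = requests.put(
        MANAGER + "/dcloud",
        data=new_dcloud_xml,
        headers={"Content-Type": "cloudml/infradescription+xml"},
    )
    return reply.status_code == 200
```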
The processes triggered by the D-Cloud Manager or the Application Managers are similar to
the one presented in Figure 12, except that these modules use the PUT instead of the POST method
in the initial message.
4.5 Related projects
This chapter presented the implementation guidelines of Nubilum – a resource management system
that offers a solution for challenges related to allocation, discovery, control, and monitoring of
resources for ISP-based Distributed Clouds (please see Section 3.1 for more details). The system
manages a D-Cloud as an IaaS Cloud offering developers an environment that manages the entire
life-cycle of their applications ranging in scope from the request of resources, which are allocated
into a virtualized infrastructure of scattered geo-distributed servers connected by network devices, to
their removal from the D-Cloud.
In order to obtain optimal or quasi-optimal solutions while maintaining scalability, Nubilum has a centralized decision element with two high-level components, the Allocator and the Manager, and a decentralized control element with three infrastructural components: Workers, Network Devices, and the Storage System. It also introduces a clear separation between the enforcement actions and the intelligence roles, played by the Manager and the Allocator, respectively. The Manager offers an abstraction of the overall D-Cloud to the Allocator, which, in turn, is described from a high-level perspective, in which only its functionalities and communication processes were defined. Thus, the proposed system intends to remain open to different model-driven resource allocation strategies.
Some architectural ideas present in Nubilum are similar to the ones found in open-source resource management systems for Cloud Computing such as Eucalyptus, OpenNebula, and Nimbus, which are compared in [24]. These systems propose solutions to some problems that can arise in a D-Cloud scenario, thus providing good starting points for the design of resource management systems for D-Clouds. Beyond the centralized resource management, such systems are also based on open interfaces and open tools, namely Web Service-based interfaces (REST in the case of Nubilum) and Libvirt as a hypervisor-agnostic solution. Particularly from OpenNebula, Nubilum leverages the idea of a conceptual stacked view separated from a functional view, as well as the separation between decision and control tasks. Despite these similarities, those systems lack direct support for D-Clouds, mainly with regard to the virtualization of the network, which is a main aspect of D-Clouds discussed in this Thesis.
Differently from the above-mentioned systems, the RESERVOIR project proposes a model, architecture, and functionalities for Open Federated Cloud Computing, a D-Cloud scenario in which a Cloud provider may dynamically partner with others to provide a seemingly infinite resource pool [59]. To achieve this goal, RESERVOIR leverages virtualization technologies and embeds autonomous management into the infrastructure. To cope with networking aspects, the architecture defines a Virtual Ethernet scheme [30], which offers isolation while sharing network resources between the federated Clouds composing the system. In contrast, following the design decision to use existing standards, Nubilum employs Openflow as a solution for network virtualization.
An initiative that works with an ISP-based D-Cloud scenario is the research project called GEYSERS [22]. This project extends standard GMPLS-related traffic engineering protocols to support virtual infrastructure reservation, i.e., the integrated allocation of computational and networking resources. Nubilum adopts a different approach, working with two different standards for communication: REST-based interfaces combined with Libvirt for the allocation of server and storage resources, and Openflow for allocating network links. The integration of these standards is made by the internal algorithms of the Manager. Such a strategy is interesting since the system can be deployed without support for additional protocols.
Also working with an ISP-based D-Cloud, one of the SAIL project objectives is to provide resource virtualization through the CloNe (Cloud Networking) architecture [60]. CloNe addresses the networking aspect of Cloud Computing and introduces the concept of an FNS (Flash Network Slice), a network resource that can be provisioned on a time scale comparable to existing compute and storage resources. The implementation of this concept is still an object of study in the SAIL project, but some solutions have been presented, ranging from VPNs to networks based on Openflow devices. Thus, the network virtualization solution presented in Nubilum, which is based on Openflow, can be viewed as a first implementation of the FNS proposed by CloNe.
Another distinction between Nubilum and other existing resource management systems and architectures is that Nubilum focuses strictly on resource management, whereas other practical aspects, such as security and fault tolerance, were not considered, although Nubilum could be extended to address those important aspects.
Hitherto, one specific criticism that can be made of Nubilum is that some aspects of the system were described generically. One example is its openness to the usage of different algorithms for resource allocation, which makes it difficult to see how the system can guarantee design decisions related to self-optimization. Such aspects will be discussed in the next chapters, which will further specialize the architectural specifications. Chapter 5 details the overall communication protocols and their respective messages used in the control plane of the system. Basically, the next chapter will describe the HTTP and Openflow messages used by the components, as well as the modeling language CloudML. Furthermore, algorithms for resource allocation derived for specific cases will be presented in Chapter 6.
5 Control Plane
“Si quam rem accures sobrie aut frugaliter, solet illa recte sub manus succedere.”
Plautus Persa
This chapter details and evaluates the control plane of Nubilum. It consists of two main elements:
the HTTP and Openflow messages for communication between the components in Nubilum, and
the Cloud Modeling Language (CloudML) that is used to describe services, resources and
requirements. Such elements are intrinsically correlated since the language represents data that is
exchanged by the HTTP messages and that defines the number and format of the Openflow
messages. Together, these elements specify how developers and the provider interact with the D-Cloud and how the effective resource allocation is made in the D-Cloud.
Following the above division of the control plane, this chapter is split into three main sections: it
starts by presenting CloudML and discussing its characteristics with respect to similar description
languages in Section 5.1; next, in Section 5.2, the communication interfaces and protocols are
explained; finally, Section 5.3 evaluates the overall control plane solution.
5.1 The Cloud Modeling Language
As explained in Chapter 3, the representation of user requirements and cloud resources is the first
challenge to face when considering an automatic resource management system. This chapter
introduces the Cloud Modeling Language (CloudML), an XML-based language intended to cope
with the aforementioned required representations. CloudML is proposed to model service profiles
and developer’s requirements, while at the same time, to represent physical and virtual resource
status in D-Clouds. This language was previously introduced in [26].
Considering previous languages ([9], [39], [32], [16], [73], [47]) for the representation of
resources and their respective limitations (more about this topic is discussed in Section 5.1.3), it was
decided to design the Cloud Modeling Language (CloudML) to ensure three clear objectives: a) the
language must represent all physical and virtual resources in a D-Cloud, including their current state;
b) the proposed language must be able to model the service supported by the provider; and c) the
language must represent a developer’s requirements while also relating them to the provider’s
services.
In order to grasp how CloudML integrates these objectives, please consider Figure 13. This figure depicts a scenario with three actors: the application developer, the Cloud provider, and Nubilum. Further, the figure shows the interactions between these actors through the use of the respective CloudML descriptions.
Figure 13 Integration of different descriptions using CloudML
First, the Cloud provider should describe all services offered by the D-Cloud, generating a
service description document (step number “1” in the figure). Next, a Cloud developer may use
these descriptions (step number "2") to verify whether its requests can be met by the D-Cloud in question. Note that CloudML allows different D-Cloud providers to provide their respective service descriptions. In this way, a developer may choose between different providers according to its own
criteria and convenience. Once a Cloud provider is selected, the developer composes a document
describing its requirements and may then submit its requests to the D-Cloud (step number “3”),
more specifically represented by the Cloud System Management, which will ultimately allocate
resources according to the requested resources and the current status of the D-Cloud. At any time,
the provider may asynchronously request status or the description of resources (step number “4”) of
the D-Cloud. This same information is the input of the resource management process performed by
Nubilum.
As defined in Section 4.1.1, Nubilum works as an IaaS Cloud. Thus, in CloudML the
developers’ applications are treated as virtual networks and the provider’s infrastructure is the
physical network. Another important design aspect is the use of XML technologies as the underlying structure for composing CloudML documents. By adopting this well-established technology, in
contrast to newer ones such as JSON [13], it is possible to use solutions and APIs present in the XML ecosystem to guarantee syntactic correctness, document querying, and other facilities.
The rest of this section is organized as follows: Section 5.1.1 presents and details the CloudML language and its XML Schemas; Section 5.1.2 illustrates the use of CloudML in a simple scenario; finally, a qualitative comparison and discussion between CloudML and existing languages is given in Section 5.1.3.
5.1.1 CloudML Schemas
The XML Schemas of CloudML were divided into three groups: schemas for resource description,
schemas for service description, and schemas for requirements description. For didactic purposes, it was opted to present those schemas through intuitive diagrams generated by the Web Tools Platform (an Eclipse plug-in [18]) instead of the tangled XML Schema code.
Resource Description
This first group has two subgroups: one for short reports on the status of resources, and another for complete descriptions of resources.
The basic XML element for reporting resources' status is the NodeStatusType (Figure 14), which represents the status of both physical servers and virtual machines (called simply nodes in our language). This type is composed of two required attributes (CPU and RAM) and a sequence of Storage elements. These attributes are expressed as percentage values, while the Storage element has a type defining the absolute value of the used space (Size), the Unit relative to this space (KB, MB, GB, etc.), and a logical ID, since a node can have many drives for storage.
Figure 14 Basic status type used in the composition of other types
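For illustration only, a status fragment of this kind could be assembled programmatically as below; the element and attribute names follow the types just described, while the concrete values and the namespace-free layout are assumptions of this sketch rather than the normative CloudML schema.

```python
import xml.etree.ElementTree as ET

# Build a NodeStatusType-like fragment: CPU and RAM as percentages, plus one Storage entry
status = ET.Element("Status", attrib={"CPU": "35", "RAM": "60"})
ET.SubElement(status, "Storage", attrib={"ID": "disk0", "Size": "12", "Unit": "GB"})

print(ET.tostring(status, encoding="unicode"))
# <Status CPU="35" RAM="60"><Storage ID="disk0" Size="12" Unit="GB" /></Status>
```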
The next type is the VirNodeStatusType, which is used to report the status of a specific virtual machine (Figure 15). This type has three attributes: the ID, a unique value used for identification purposes and defined when the VM is created; the Owner, the identification of the developer that owns the VM; and the VMState, which indicates the current state of the VM. CloudML defines three states for the VM: stopped, running, and suspended, which are self-descriptive. The type also has a Status element whose type is the NodeStatusType already described. The PhyNodeStatusType is similar to the one for virtual nodes, except for the omission of the VMState and Owner attributes.
Figure 15 Type for reporting status of the virtual nodes
The NodesStatusType gives information about the status of all the resources managed by the Worker. The NodesStatusType has only one root element, called Nodes, as shown in Figure 16.
Figure 16 XML Schema used to report the status of the physical node
One basic element for complete descriptions of physical nodes is the PhysicalNodeType
(Figure 17). This type has the ID attribute and four inner elements: NodeParams, PhyIface,
VirNodeID, and VirEnvironment.
Figure 17 Type for reporting complete description of the physical nodes
The NodeParametersType (Figure 18) describes relevant characteristics including: node
parameters (memory, processor, and storage), its geographical location (Location element), its
Functionality on the network (switch, server, etc…), its current status (which is an element of
the type NodeStatusType) and the OtherParams for general use and extension.
Figure 18 Type for reporting the specific parameters of any node
Here, there are two aspects that should be highlighted. First, the Location is the element that
enables a provider to know where resources are geo-located in the infrastructure. Second, the
OtherParams element can be used by providers or equipment vendors to extend CloudML
including other parameters not covered by this current version. In this way, CloudML presents itself
as an extensible language.
The PhysicalInterfaceType (Figure 19) is an extension of InterfaceType and is used to
describe physical links associated to the interface (PhysicalLinksID element) and virtual interfaces
(VirtualInterfacesID element) also related to the physical node. Such interfaces can be, for
example, interfaces from virtual nodes. The general InterfaceType has an ID, MAC, IPv4, and
IPv6 as attributes that are inherited by the PhysicalInterfaceType.
Figure 19 Type for reporting information about the physical interface
As part of the PhysicalNodeType, the VirNodeID is a simple list of the IDs of the virtual
machines hosted on the node, and the VirEnvironment is a list containing information about the
virtualization environment. Each item in the list informs its CPU architecture (32 or 64 bits), the
virtualization method (full or paravirtualized), and the hypervisor. Thus, an item indicates a type of
virtual machine supported.
The VirtualNodeType (Figure 20) gives a complete description of a virtual machine and is
similar to the physical node. The VirtualInterfaceType also inherits from the InterfaceType,
and the VirEnvironment contains only two attributes: one indicating the hypervisor and the other
indicating the virtualization mode of the VM.
Figure 20 Type for reporting information about a virtual machine
The VirNodesDescription and the NodesDescription are lists similar to the ones defined in the Status XML Schemas.
The InfraStructureType (Figure 21) is composed of a PhyInfra element and zero or more VirInfra elements. The PhyInfra element is a PhysicalInfraStructureType and corresponds to the collection of physical nodes and links. The VirtualInfraStructureType indicates virtual infrastructures currently hosted by the physical infrastructure.
Figure 21 Type for reporting information about the whole infrastructure
The PhysicalInfraStructureType (Figure 22) has an ID attribute and is composed of two elements, PhyNode and PhyLink, which represent the nodes (computers, switches, etc.) and their connections (cable links, radio links, etc.), respectively. The PhyNode element is of the type PhysicalNodeType, which was already described, whereas the PhyLink is of the type PhysicalLinkType (Figure 23).
Figure 22 Type for reporting information about the physical infrastructure
Figure 23 Type for reporting information about a physical link
The PhysicalLinkType describes physical links between physical nodes. It has an ID
attribute, a LinkParams element, and zero or more VirLinkID elements. The
LinkParametersType, just as the NodeParametersType, supports all relevant characteristics of
the link, which include: link technology (Ethernet, Wi-Fi, etc…), capacity, the current status (current
delay, current allocated rate and current bit error rate), and also an extensible element
(OtherParams) serving for future extension purposes. The VirLinkID element identifies the virtual
links currently allocated on this physical link.
Similarly to the physical infrastructure there is a type dedicated towards the collection of
virtual nodes and virtual links called VirtualInfraStructureType (Figure 24). It has an ID, an
Owner attribute (identifying the developer who owns these virtual resources) and can be composed
of one or more VirNode elements (of the described VirtualNodeType) and several VirLink
elements of the VirtualLinkType, which is very similar to the type for physical links.
Figure 24 Type for reporting information about the virtual infrastructure
Service Description
CloudML provides a profile-based method for describing a provider's services, which are described by an XML Schema whose root element is Services, from the ServiceType (Figure 25). This type has a Version attribute and a sequence of Node and Link profile elements. The Node element consists of the node profiles, which are described by the NodeProfileType, whereas the Link element uses the LinkProfileType. A Coverage element, from the CoverageType, is also described.
The NodeProfileType uses the MemoryType for the RAM and Storage elements and the CPUProfileType for CPU. The former has two attributes indicating the amount of memory, and the latter has three attributes indicating the following aspects: CPU frequency, number of cores, and CPU architecture. The LinkProfileType has only three intuitive attributes: ID for identification of the profile, Rate for the reserved rate, and Delay for the maximum delay.
The CoverageType is intended to inform the geographical areas that the provider covers.
Thus, this type is just a sequence of Location identified by three attributes: Country, State, and
City. It is important to notice that there is a Location element in NodeParametersType (already
explained) used to geo-locate nodes in the infrastructure. With these two elements (Location and
Coverage) a provider is able to identify the geographical location of its resources, which allows the
provider to offer location-aware services.
Figure 25 Type describing the service offered by the provider
Request Description
Developers describe their application requirements through a request document, whose root element is the RequestType (Figure 26). This type is composed of three attributes: an ID, an Owner, and a Tolerance (a delay value in milliseconds), which expresses how far the virtual nodes can be placed from their required location specified in the Node element.
The NodeSpecType and LinkSpecType have an attribute indicating their ID and an attribute indicating the ID of the corresponding profile (described in the ServiceType) chosen by the developer. The NodeSpecType also has a Location element indicating where the node must be positioned, which is defined using the LocationType. The LinkSpecType has several Node elements indicating the IDs of the requested nodes that the link will be connecting.
Figure 26 Type describing the requirements that can be requested by a developer
5.1.2 A CloudML usage example
This section considers an example of CloudML usage. Through CloudML, the provider is able to list the different profiles using the service description model, stating node and link configurations. A developer should state its requirements using the requirements description document. Moreover, Nubilum uses the description and status documents to describe the physical and virtual resources of the D-Cloud. These can be used for internal communication between equipment and components of the system and for external communication with the provider.
Next, these XML documents are described in more detail. Please notice that some irrelevant parts of the XML documents were omitted for better visualization.
Services XML
The XML document shown in Figure 27 represents the service defined by the D-Cloud provider.
There are two node profiles (nodeprofile01 and nodeprofile02), two link profiles
(linkprofile01 and linkprofile02), and three Coverage items with Country, State, and City
specifications.
The nodeprofile01 is a node running the Linux operating system with 2 GB of RAM, 80
GB of storage and acts as a server. The linkprofile01 represents a link with maximum delay and
capacity equal to 0.150 ms and 1.5 Mbps, respectively. The linkprofile02 is similar to that of
linkprofile01. The Coverage is defined as a set of Country, State, and City elements. In this case, the
Cloud provider informs that developers can request resources located at three different localities:
“Brazil, São Paulo, Campinas”, “Brazil, Pernambuco, Recife”, or “Brazil, Ceará, Fortaleza”.
Figure 27 Example of a typical Service description XML
The nodeprofile02 is of a different type from nodeprofile01. This node profile is a router, as indicated in the Functionality tag. In this Thesis, such a node type indicates an abstract entity used to create virtual links toward a specific location without the use of a more robust virtual node. Thus, a developer can request the use of this node to ask for a link with guarantees toward a specific geographical region.
Request XML
This example considers a developer making a simple request for two nodes and one link spanning between them. To compose this document, the developer should read the Service description offered by the provider, select the corresponding profiles, and then make its request. The XML document in Figure 28 represents such a simple request. The node01 is from nodeprofile01 and is located in the state of "São Paulo". The node02 is at "Recife" and is a router. The link is from linkprofile01.
Figure 28 Example of a Request XML
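Since the original figure is not reproduced here, the fragment below sketches what such a request could look like, built with Python's standard XML library; the element names follow the RequestType, NodeSpecType, and LinkSpecType just described, while attribute spellings and values are illustrative assumptions only.

```python
import xml.etree.ElementTree as ET

# Request with two nodes and one link, referencing the provider's profiles
request = ET.Element("Request", attrib={"ID": "req01", "Owner": "dev01", "Tolerance": "50"})

node1 = ET.SubElement(request, "Node", attrib={"ID": "node01", "Profile": "nodeprofile01"})
ET.SubElement(node1, "Location", attrib={"Country": "Brazil", "State": "São Paulo", "City": "Campinas"})

node2 = ET.SubElement(request, "Node", attrib={"ID": "node02", "Profile": "nodeprofile02"})
ET.SubElement(node2, "Location", attrib={"Country": "Brazil", "State": "Pernambuco", "City": "Recife"})

link = ET.SubElement(request, "Link", attrib={"ID": "link01", "Profile": "linkprofile01"})
ET.SubElement(link, "Node", attrib={"ID": "node01"})
ET.SubElement(link, "Node", attrib={"ID": "node02"})

print(ET.tostring(request, encoding="unicode"))
```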
Description XML
The description document represents the infrastructure of the D-Cloud, including all physical and
virtual nodes. Depending on the size of the D-Cloud, this document can be very long. Thus, for a
better visualization, the next examples will illustrate only some parts of CloudML: the physical
infrastructure, the virtual infrastructure, and the virtual links.
The XML document in Figure 29 presents a complete description and status of all physical nodes and physical links, with the first <PhyNode> tag informing the resource characteristics (such as CPU and RAM) of node 100. The node 101 description was omitted, since it is similar.
Figure 29 Physical infrastructure description
The <VirNodeID> tag informs the IDs of the virtual nodes that are running at the specific
physical node. In this case, according to our example, only the virtual node node01 is running at
physical node 100. There are also two physical links (<PhyLink> tags). The physical link
phylink01 has virtual link virlink01 associated to it. Further information about this link was
omitted here and will be described in the next Figure.
Figure 30 shows the description and the status of all the virtual nodes and virtual links of a specific owner in the D-Cloud. In particular, this example shows the virtual network allocated after receiving the request in Figure 28. Please note that this description is very similar to the physical infrastructure. The virtual node node01 has many characteristics, such as RAM, CPU, storage, network interface, and virtual environment. The virtual node node02, omitted in the document, is a simple virtual node with the Router functionality.
Figure 30 Virtual infrastructure description
Furthermore, this example is also about the description and the status of all virtual links
established in the D-Cloud. The virtual link virlink01 has information, such as technology and
rate, described in the <LinkParams> tag. Note that virlink01 was referenced previously in the
physical infrastructure description as a link associated to a physical one.
5.1.3 Comparison and discussion
Some alternatives for resource and requirements description in D-Clouds were presented in Section 3.2.1. In this section, CloudML's characteristics will be contrasted with them, and a comparison will be made in order to discuss these languages, highlighting their advantages and weaknesses.
CloudML presented in this work was developed to be a vendor-neutral language for
resource/request description on D-Clouds. Such neutrality is obtained both internally and externally
to the D-Cloud.
First, from the internal viewpoint, CloudML brings a common language that can be implemented by different Cloud equipment vendors in order to integrate their different solutions. Certainly, such integration cannot be achieved without some common protocol implemented by the vendors, but CloudML offers common ground for data representation, which is a crucial step towards interoperability. Moreover, CloudML supports vendor innovation by offering flexibility through the use of the OtherParams element in the description of virtual and physical nodes and links. Such an optional field can be used by different vendors to convey private information in order to tune equipment of the same vendor in the infrastructure. This characteristic is similar to OpenStack.
Second, from the external viewpoint, the supported neutrality allows developers to request services from different D-Cloud providers in order to compare the characteristics of each one and choose the appropriate services for their tasks. Here, it is important to notice that these providers should use some standardized interface, such as OCCI, to handle this information model.
All the languages covered in Section 3.2.1 describe, in some way, computational and network resources in the Cloud. Service description is also commonplace among description languages. However, these services are described in different manners. For example, CloudML uses profiles to represent the distinct types of nodes and links that compose services; VXDL is itself a representational way to describe Grid applications; OpenStack uses the idea of flavors, but it is restricted to computational resources. Request description is not treated by the VRD.
One interesting aspect of CloudML is that of geo-location. With this information, the Cloud
may offer services with location-awareness. This point is also covered by the VXDL, VRD, and
Manifest languages, but this aspect is described without details in the respective works.
In addition to these points, the main CloudML characteristic is description integration. With CloudML, different Cloud providers may easily describe their resources and services and make them available to developers. Thus, developers may search for the most suitable Cloud to which to submit their requests.
5.2 Communication interfaces and protocols
Figure 31 highlights the communication protocols used by Nubilum's components, which were introduced in Chapter 4. Note that the figure omits the Storage System, since it is controlled through the same interface available at the Workers. Basically, two different protocols are used for communication: the HTTP protocol, employed by the REST interfaces available in the Allocator, Manager, and Worker components, and the Openflow protocol, for communication with network devices. Together, those protocols cope with the integrated control and monitoring of all physical and virtual resources in the D-Cloud. The HTTP protocol is also used by the D-Cloud provider to describe the supported services and by the developers to submit their requests to the system.
Figure 31 Communication protocols employed in Nubilum
The next sections will detail the communication protocols employed in Nubilum. First, Section 5.2.1 will show the REST interfaces of each component: Allocator, Manager, and Worker. After that, Section 5.2.2 will discuss how Openflow is used by Nubilum to set up virtual links in the physical network.
5.2.1 REST Interfaces
As shown in Section 5.1, the XML Schemas defining CloudML were divided into three groups: schemas for resource description, schemas for service description, and schemas for requirements description. Here, these schemas are mapped onto seven MIMEtypes (Table II) that describe the specific data types (XML documents) to be used by the REST interfaces of Nubilum.
Table II MIMEtypes used in the overall communications
Mimetype Description
cloudml/nodesstatus+xml Status of physical and virtual nodes
cloudml/nodesdescription+xml Description of physical and virtual nodes
cloudml/virnodedescription+xml Description of a single virtual node
cloudml/infradescription+xml Description of the entire D-Cloud
cloudml/virinfradescription+xml Description of a particular virtual infrastructure
cloudml/servicesdescription+xml Description of the service defined by the Provider
cloudml/appdescription+xml Description of a developer’s request for their application
The cloudml/nodesstatus+xml is used to exchange volatile status information between
entities, and it refers to a short description of the status of all nodes managed by a worker, i.e., the
server (referred to as a physical node) and the virtual machines (or virtual nodes). The other
MIMEtypes offer complete descriptions of the resources. The cloudml/nodesdescription+xml
(corresponding to the NodesDescription XML Schema) gives a long description of the physical
node and virtual nodes and the cloudml/virnodedescription+xml (corresponding to the
VirNodesDescription XML Schema) refers to the complete description of a specific virtual node
hosted on the server. The type cloudml/infradescription+xml refers to the entire D-cloud and
it shows all the servers, physical links, virtual machines, and virtual links hosted on the D-Cloud,
whereas the cloudml/virinfradescription+xml refers to the complete description of the
resources in a virtual infrastructure. The cloudml/servicedescription+xml and the
cloudml/appdescription+xml describe the service offered by the provider and the developer’s
requests, respectively.
The REST interfaces were defined according to five URL resources: the virnode resource is
used to describe operations available at virtual machines; the worker resource is used for registering
and unregistering Workers at the Manager; the dcloud resource is the representation of the entire
D-Cloud infrastructure; the services resource is used to configure the profiles of virtual machines
and virtual links that can be requested by developers; and the app resource is used to represent
operations for requesting, updating, and releasing resources of a particular application. The next
sections offer more details about those resources.
Allocator
The interface available at the Allocator component can be accessed by developers and by the provider. Developers make use of operations to submit their requests to Nubilum, whereas the provider can configure the services offered by Nubilum. As stated earlier, through the services resource the provider can configure the profiles of virtual machines and virtual links that can be requested by developers. The operations are used to retrieve and update the list of services offered by the D-Cloud provider, as shown in Figure 32 and Figure 33. The operation for updating the offered services uses a PUT method on the URL "/services" and carries in its body the new parameters of the service in an XML document following the cloudml/servicesdescription+xml MIMEtype. Developers use the GET operation to request the description of the current services supported by the provider.
URL http://allocator_ip/services/
Method GET
Returns
200 OK & XML (cloudml/servicesdescription+xml)
401 Unauthorized
404 Not Found
Figure 32 REST operation for the retrieval of service information
URL http://allocator_ip/services
Method PUT
Request Body XML (cloudml/servicesdescription+xml)
Returns
201 Created & Location
400 Bad Request
401 Unauthorized
422 Unprocessable Entity
Figure 33 REST operation for updating information of a service
Using the app resource, a developer can submit, delete, or update an application (Figure 34, Figure 35, and Figure 36, respectively). Such operations use the cloudml/appdescription+xml MIMEtype, which describes the virtual network, i.e., the virtual nodes and virtual links requested by the developer. Moreover, some error messages were also defined to deal with exception cases: 400 for syntax errors in the XML document sent; 422 for any errors in the information contained in the XML, e.g., invalid identification numbers or server characteristics; and 401 to report authentication errors.
URL http://allocator_ip/app
Method POST
Request Body XML (cloudml/appdescription+xml)
Returns
201 Created & XML (cloudml/virinfradescription+xml)
400 Bad Request
401 Unauthorized
409 Conflict
422 Unprocessable Entity
Figure 34 REST operation for requesting resources for a new application
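From a developer's perspective, submitting an application therefore amounts to a single POST of the CloudML request document, for example as sketched below with the Python requests library; the Allocator address is a hypothetical placeholder, and the returned body follows the cloudml/virinfradescription+xml MIMEtype.

```python
import requests

ALLOCATOR = "http://allocator.example:8080"  # hypothetical Allocator address

def submit_application(app_xml):
    reply = requests.post(
        ALLOCATOR + "/app",
        data=app_xml.encode("utf-8"),
        headers={"Content-Type": "cloudml/appdescription+xml"},
    )
    if reply.status_code == 201:
        # Body describes the allocated virtual infrastructure (nodes, links, IPs)
        return reply.text
    raise RuntimeError("allocation failed: HTTP %d" % reply.status_code)
```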
URL http://allocator_ip/app/[id]
Method PUT
Request Body XML (cloudml/appdescription+xml)
Returns
201 Created & XML (cloudml/virinfradescription+xml)
400 Bad Request
401 Unauthorized
409 Conflict
422 Unprocessable Entity
Figure 35 REST operation for changing resources of a previous request
URL http://allocator_ip/app/[id]
Method DELETE
Returns
204 No Content
401 Unauthorized
404 Not Found
Figure 36 REST operation for releasing resources of an application
These operations define the external interface of Nubilum, but other standardized interfaces
may be used instead for the same purpose of external communication, such as OCNI (Open Cloud
Networking Interface), an extension of OCCI proposed by [60] to cover network requirements.
Manager
The Manager’s interface has five operations that manipulate two resources. The worker resource is
used by Workers for registering and unregistering at the Manager (Figure 37 and Figure 38,
respectively). The operation for registration uses an HTTP POST method whose body section contains an XML document corresponding to an entire description of the server and its virtual machines. The second operation over the worker resource is used by Workers to unregister from the Manager. In this operation, the Worker must inform the URL previously sent by the Manager in the reply to the POST operation, which includes an ID for this Worker. The third operation (Figure 39) is employed by Workers to update the Manager with information about the server and its virtual machines.
URL http://manager_ip/worker
Method POST
Request Body XML (cloudml/nodesdescription+xml)
Returns
201 Created & Location
400 Bad Request
401 Unauthorized
422 Unprocessable Entity
Figure 37 REST operation for registering a new Worker
URL http://manager_ip/worker/[id]
Method DELETE
Returns
204 No Content
401 Unauthorized
404 Not Found
Figure 38 REST operation to unregister a Worker
URL http://manager_ip/worker/[id]
Method PUT
Request Body XML (cloudml/nodesstatus+xml)
Returns
201 Created & Location
400 Bad Request
401 Unauthorized
404 Not Found
422 Unprocessable Entity
Figure 39 REST operation for update information of a Worker
The other two operations available at the Manager’s interface operate over the overall dcloud
resource, which is a representation of the entire D-Cloud infrastructure (Figure 40 and Figure 41).
A first operation is intended to retrieve the complete description of the D-Cloud whereas the other
one is used to submit a new description of the D-Cloud, which will then be enforced onto the
physical resources by the Manager. Both operations are used by the Allocator. Please note that these
operations use the cloudml/infradescription+xml MIMEtype.
URL http://manager_ip/dcloud
Method GET
Returns
200 OK & XML (cloudml/infradescription+xml)
401 Unauthorized
404 Not Found
Figure 40 REST operation for retrieving a description of the D-Cloud infrastructure
URL http://manager_ip/dcloud
Method PUT
Request Body XML (cloudml/infradescription+xml)
Returns
200 OK
400 Bad Request
401 Unauthorized
422 Unprocessable Entity
Figure 41 REST operation for updating the description of a D-Cloud infrastructure
Worker
The Worker interface is focused on operations on virtual machines. Through the virnode resource, the interface of the Worker component offers operations for the creation, updating, and removal of virtual machines, which are shown in Figure 42, Figure 43, and Figure 44, respectively.
URL http://worker_ip/virnode
Method POST
Request Body XML (cloudml/virnodedescription+xml)
Returns
201 Created & XML (cloudml/virnodedescription+xml)
400 Bad Request
401 Unauthorized
422 Unprocessable Entity
Figure 42 REST operation for the creation of a virtual node
The operation for the creation of a virtual node carries the node parameters in an XML document following the cloudml/virnodedescription+xml MIMEtype. If executed successfully, the operation returns a 201 message and an XML document containing the parameters of the allocated virtual machine. This document is very similar to the one passed in the request body, but with some additional information, such as the current state of the virtual machine. The other two operations are similar to the POST operation, except that they access the new virtual node through a different URI.
URL http://worker_ip/virnode/[id]
Method PUT
Request Body XML (cloudml/virnodedescription+xml)
Returns
201 Created & Location
400 Bad Request
404 Not Found
401 Unauthorized
422 Unprocessable Entity
Figure 43 REST operation for updating a virtual node
URL http://worker_ip/virnode/[id]
Method DELETE
Returns
204 No Content
401 Unauthorized
404 Not Found
Figure 44 REST operation for removal of a virtual node
5.2.2 Network Virtualization with Openflow
Nubilum uses the Openflow protocol for controlling and collecting information about the physical and virtual links. As discussed in Section 4.3.2, the Manager's modules coping with these functions are implemented in a NOX Openflow controller through a specific application, which also has a set of REST-based interfaces for communication with the other Manager modules. The next sections will describe in detail the introduced REST interfaces and will specify the process used to set up the correct flows in each network device when creating virtual links.
NOX REST Interfaces
The REST interface in NOX has only two resources: topo, which informs the topology of the physical network; and vlink, which allows the manipulation of virtual links.
The topo resource is subject to a single GET operation, detailed in Figure 45, used by the Resource Discoverer module to request the topology of the physical network, which is a result of the LLDP-based discovery processes detailed in Section 4.4.2. Note that when the operation proceeds correctly, an XML document of the cloudml/infradescription+xml MIMEtype is sent to the requester, but this description contains only reduced information about the physical resources and the links between them.
URL http://nox_ip/topo
Method GET
Returns
200 OK & XML (cloudml/infradescription+xml)
404 Not Found
Figure 45 REST operation for requesting the discovered physical topology
Over the vlink resource, three operations were defined, namely POST (Figure 46), PUT (Figure 47), and DELETE (Figure 48), for the creation, updating, and deletion of virtual links, respectively. A new MIMEtype is also introduced that contains information describing the virtual link: the <MAC,IP> pairs for the source and destination nodes of the virtual link, and the path of physical nodes/links that will host this virtual link. This path is informed because the calculation of the path is not a task of NOX, but rather a task performed by the Allocator component.
URL http://nox_ip/vlink
Method POST
Request Body XML (cloudml/virlinkdescription+xml)
Returns
201 Created & XML (cloudml/virlinkdescription+xml)
400 Bad Request
422 Unprocessable Entity
Figure 46 REST operation for the creation of a virtual link
URL http://nox_ip/vlink/[id]
Method PUT
Request Body XML (cloudml/virlinkdescription+xml)
Returns
201 Created & Location
400 Bad Request
404 Not Found
422 Unprocessable Entity
Figure 47 REST operation for updating a virtual link
URL http://nox_ip/vlink/[id]
Method DELETE
Returns
204 No Content
404 Not Found
Figure 48 REST operation for removal of a virtual link
Virtual links setup
The Openflow protocol leverages the usage of flow-based network devices that are configured by NOX. Thus, when the Manager calls NOX through a POST operation on the vlink resource, the Openflow Controller and Collector module sends the appropriate Openflow messages to each network device.
In our implementation, the entire D-Cloud network is an Ethernet network and each network device is a switch. Thus, the first aspect to be considered is the forwarding of ARP packets, which must be flooded in the network without causing the retransmission of multiple copies of each packet. This problem can be solved by creating a spanning tree in the graph in order to allow the communication of all servers in the D-Cloud. The spanning tree can be created using the Spanning Tree Protocol (STP), which is an optional protocol that Openflow switches can implement. The use of STP is documented in the Openflow specification document [52]. In order to support switches that do not perform STP, the following procedure is used in Nubilum.
Considering the physical network formed by servers, switches, and their respective links, a breadth-first search was implemented in order to find the edges of the graph pertaining to the spanning tree. Based on this, each network device is configured with a number of flows equal to the number of ports on the switch. For each port, the module checks whether the link associated with the switch port is part of the spanning tree; if it is, then all incoming ARP packets will be flooded to the other ports, except the port from which the packet arrived. If the link associated with the port is outside the spanning tree, ARP packets arriving from that port will be dropped. Figure 49 shows an example of the typical Openflow tuple of these flows. In the example, all ARP packets incoming from switch port 1 will be flooded to the other ports except port 1.
Switch Port | MAC src | MAC dst | Eth type | VLAN ID | IP src | IP dst | TCP sport | TCP dport | Action
1 | * | * | ARP | * | * | * | * | * | FLOOD
Figure 49 Example of a typical rule for ARP forwarding
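A sketch of the spanning-tree computation and the per-port ARP rules it induces is given below; the graph, the data structures, and the rule representation are illustrative assumptions, since the actual implementation lives inside the NOX application.

```python
from collections import deque

def spanning_tree_edges(adjacency):
    """Breadth-first search returning the set of (undirected) edges in the spanning tree."""
    root = next(iter(adjacency))
    visited, tree, queue = {root}, set(), deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                tree.add(frozenset((node, neighbor)))
                queue.append(neighbor)
    return tree

def arp_rules(switch, ports, tree):
    """One rule per port: FLOOD ARP if the port's link is in the tree, otherwise DROP."""
    return {
        port: ("FLOOD" if frozenset((switch, peer)) in tree else "DROP")
        for port, peer in ports.items()
    }

# Tiny example: three switches in a triangle; one edge is pruned by the tree
adjacency = {"s1": ["s2", "s3"], "s2": ["s1", "s3"], "s3": ["s1", "s2"]}
tree = spanning_tree_edges(adjacency)
print(arp_rules("s2", {1: "s1", 2: "s3"}, tree))  # {1: 'FLOOD', 2: 'DROP'}
```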
In Nubilum, virtual links are implemented by a set of flows configured in the network devices that form a path between the two virtual nodes at the endpoints of the virtual link. This path is calculated based on the developer's requirements and the specific goals of the allocation algorithms implemented in the Mapper module at the Allocator component; no optimization is done in NOX. Thus, the creation of virtual links is based on the XML document received. As explained in the previous section, each request indicates the 2-tuple <MAC,IP> for the source and destination nodes of the virtual link, as well as the path (a list of network devices).
Each network device in the path will be configured with two specific flows for bidirectional
communication. The flows have the characteristics presented in Figure 50. Note that these flows
include information about the MAC and IP addresses of the virtual machines connected by the
virtual link in order to restrict the virtual link to these machines.
(a)
Switch Port | MAC src | MAC dst | Eth type | VLAN ID | IP src | IP dst | TCP sport | TCP dport | Action
1 | 1:1:1:1:1:1 | 2:2:2:2:2:2 | IP | * | 1.1.1.1 | 2.2.2.2 | * | * | Output to port 2
(b)
Switch Port | MAC src | MAC dst | Eth type | VLAN ID | IP src | IP dst | TCP sport | TCP dport | Action
2 | 2:2:2:2:2:2 | 1:1:1:1:1:1 | IP | * | 2.2.2.2 | 1.1.1.1 | * | * | Output to port 1
Figure 50 Example of the typical rules created for virtual links: (a) direct, (b) reverse
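The per-device flow pairs of Figure 50 can be generated mechanically from the virtual link description received by NOX, as sketched below; the port-lookup function and the flow representation are illustrative assumptions of this sketch.

```python
def flows_for_virtual_link(src, dst, path, out_port):
    """Build one direct and one reverse flow per network device along the path.

    src and dst are (MAC, IP) pairs; path is the list of device IDs computed by
    the Allocator; out_port(device, towards_mac) returns the output port to use
    (how ports are resolved is an assumption of this sketch).
    """
    flows = []
    for device in path:
        flows.append({
            "device": device, "eth_type": "IP",
            "mac_src": src[0], "mac_dst": dst[0],
            "ip_src": src[1], "ip_dst": dst[1],
            "action": "output:%d" % out_port(device, dst[0]),   # direct direction
        })
        flows.append({
            "device": device, "eth_type": "IP",
            "mac_src": dst[0], "mac_dst": src[0],
            "ip_src": dst[1], "ip_dst": src[1],
            "action": "output:%d" % out_port(device, src[0]),   # reverse direction
        })
    return flows

# Example matching Figure 50, with a trivial port-lookup stub
lookup = lambda device, mac: 2 if mac == "2:2:2:2:2:2" else 1
print(flows_for_virtual_link(("1:1:1:1:1:1", "1.1.1.1"),
                             ("2:2:2:2:2:2", "2.2.2.2"),
                             ["switch-A"], lookup))
```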
In Section 5.1.2, the CloudML example introduced the use of virtual routers for the creation
of virtual links towards certain geographical locations. A link with a virtual router receives a different
treatment in Nubilum since it is intended to route data from a virtual node to clients and vice versa.
Thus, NOX creates two flows similar to the ones presented in Figure 50, but containing only the MAC and IP addresses of the virtual node. In this way, traffic incoming from clients in the geographical region can be routed to the virtual node.
5.3 Control Plane Evaluation
As discussed in this chapter, Nubilum's processes generate several control messages for monitoring and allocating resources, which naturally introduce load into the network. Thus, two
questions can be posed: "What is the impact of these control messages on the system?" and "How does this scale with the number of Workers and network elements?".
Considering both questions, this section evaluates the control load (total message size in bytes) generated by the system through some simple and useful models derived from measurements made on a prototype of the system. The prototype of Nubilum was implemented in a testbed. This prototype comprises all the components of the system: Allocator, Manager, Worker, Network Device, and Storage System. All the communication interfaces that make up Nubilum's control plane were also implemented as described in the previous sections.
After implementation, measurements were taken on this prototype in order to quantify the number and size of the HTTP and Openflow messages generated by three distinct events: the resource allocation for a new request to the system; the status update of the physical resources (Workers and Network Devices); and the release of the resources of a developer's application. Please note that this section does not evaluate the algorithms for resource allocation – which will be evaluated through simulations in Chapter 6 – it only measures the control load introduced by Nubilum in order to evaluate the impact that the system causes on the network due to its communication interfaces.
The messages exchanged between the components in Nubilum were measured in an actual
prototype of the system. One can divide these control messages into two major groups: HTTP
messages – used for communication with the developer, allocator, manager, and worker – and
Openflow messages – used for communication with the network devices. Furthermore, these
messages can be divided into three sub-groups according to their associated events: messages for
application allocation, messages for application release, and messages for status update.
Table III shows the number of bytes generated by the control messages between each component
of the system for each type of event. Each pair of lines in the table represents one interface of the
system: one between the Developer and the Allocator, another between the Allocator and the
Manager, others between the Manager and each Worker, and a last one between the Manager and
each Network Device.
The size of the messages depends on the specific parameters: VN = number of virtual nodes,
PN = number of physical nodes, VL = number of virtual links, PL = number of physical links, IF =
infrastructure description (in bytes, given by IF = 300+734*PN+857*VN+389*PL+314*VL), P =
number of ports in the network device. Thus, the length of an allocation message sent from
Developer to Allocator (first line in the table), for example, is proportional to the number of virtual
nodes (VN) and to the number of virtual links (VL) required by the Developer. Furthermore, it is
important to note that for each message there is a fixed number (in bytes) that represents the HTTP
header length, a fixed XML part, and the parameters’ description length. For example, with regard to
the message between the Developer and the Allocator, one can note that 505 bytes is the length of
the HTTP header and the fixed size XML part, 84 is the length of one virtual node description, and
74 is the length of one virtual link description. If there are 10 VN and 5 VL, the total length of an
allocation message submitted by a Developer to an Allocator is 1715 bytes.
Table III Models for the length of messages exchanged in the system in bytes

| Interface | Allocation event | Release event | Update event | Type |
|---|---|---|---|---|
| Developer → Allocator | 505 + 84*VN + 74*VL (GET) | 161 (DELETE) | N/A | HTTP |
| Allocator → Developer | 537 + 857*VN + 314*VL (Reply GET) | 46 (Reply DELETE) | N/A | HTTP |
| Allocator → Manager | 120 (GET), 221 + IF (PUT) | 221 + IF (PUT) | N/A | HTTP |
| Manager → Allocator | 237 + IF (Reply GET), 242 + IF (Reply PUT) | 242 + IF (Reply PUT) | N/A | HTTP |
| Manager → Worker | 978 (POST) | 169 (DELETE) | 639 + 180*VN (PUT) | HTTP |
| Worker → Manager | 1024 (Reply POST) | 46 (Reply DELETE) | 130 (Reply PUT) | HTTP |
| Manager → Network Device | 320 | 288 | 20 | Openflow |
| Network Device → Manager | N/A | 352 | 12 + 104*P | Openflow |
Examining Table III, one can note that the HTTP messages are bigger than the messages
generated by the Openflow protocol. This is expected since the HTTP messages carry CloudML
documents whose text format introduces flexibility but tends to be bigger than a byte-oriented
protocol. Another interesting aspect that can be highlighted is that the messages between Allocator
and Manager are bigger than other HTTP messages. This is due to the large amount of data that
needs to be exchanged between these two components because of the way the system was designed.
This needs to be done in order to allow the separation between the component that allocates new
requests and the one that gathers information about the whole system. Another reason for this
design choice is to allow the Allocator to be stateless with respect to the status of the resources. Thus,
each time a new request arrives, the Allocator must acquire the current status of the D-Cloud. Notice
also that part of this infrastructural information that needs to be exchanged provides valuable
feedback about the developers’ applications, as the allocator acts as an interface between developers
and the whole system.
The table also reveals that the control load increases linearly with the size of the physical
infrastructure (number of Workers and Network Devices) as well as the size of the requests. This is
an important validation, as it shows that there is no unusual increase of the control traffic in the
system with the number of Workers, which is an aspect that affects the system's scalability.
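As an illustration of how these models can be used, the sketch below evaluates the expressions of Table III for one allocation event. It is a rough estimator written for this text only: treating the number of Workers and Network Devices contacted per request as inputs is an assumption not fixed by the table, made here just to aggregate the per-interface models into a single figure.

```python
# Sketch of the message-length models of Table III (lengths in bytes).
# n_workers and n_devices are the numbers of Workers and Network Devices
# contacted for the request (an assumption made only for aggregation).

def infrastructure_description(PN, VN, PL, VL):
    return 300 + 734 * PN + 857 * VN + 389 * PL + 314 * VL

def allocation_control_load(VN, VL, PN, PL, n_workers, n_devices):
    IF = infrastructure_description(PN, VN, PL, VL)
    load = (505 + 84 * VN + 74 * VL)          # Developer -> Allocator (GET)
    load += (537 + 857 * VN + 314 * VL)       # Allocator -> Developer (Reply GET)
    load += 120 + (221 + IF)                  # Allocator -> Manager (GET + PUT)
    load += (237 + IF) + (242 + IF)           # Manager -> Allocator (replies)
    load += (978 + 1024) * n_workers          # Manager <-> Worker (POST + reply)
    load += 320 * n_devices                   # Manager -> Network Device (Openflow)
    return load

# The request of the example in the text (10 virtual nodes, 5 virtual links)
# yields a 505 + 84*10 + 74*5 = 1715-byte Developer message.
print(505 + 84 * 10 + 74 * 5)  # 1715
```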
6 Resource Allocation Strategies
“Fugiunt amici a male gerente res suas.”
Schottus, Adagia
Section 3.2 detailed four research challenges related to resource management on D-Clouds:
Resource Modeling, Resource Offering and Treatment, Resource Discovery and Monitoring, and
Resource Selection and Optimization. The first three research challenges were discussed in the
context of Chapters 4 and 5. In a complementary way, this chapter discusses the fourth challenge,
concerning resource allocation.
The problem of finding the best mapping of distributed applications onto a physical network
is not a trivial task due to the dynamicity, high algorithm complexity, and all the different
requirements (from developers and the provider) that must be contemplated by Nubilum. Moreover,
this problem is not restricted to the allocation of an incoming application, but it also involves the
continuous monitoring and maintenance of the application in order to guarantee developer’s
requirements. This chapter focuses on resource allocation problems in D-Clouds. The first
problem investigated (in Section 6.1) concerns the allocation of resources for the management
entities of the Cloud. Specifically, the Manager Positioning Problem will be addressed and solved.
After that, Section 6.2 discusses the way Nubilum deals with the problem of allocating virtual
networks and evaluates an algorithm for load balancing. Section 6.3 presents algorithms for the
creation of a virtual network when the request is composed only of virtual nodes. Finally,
Section 6.4 discusses and summarizes the main results obtained in this chapter.
6.1 Manager Positioning Problem
Recall, from Chapter 4, that Nubilum is composed of five components: Allocator, Manager,
Workers, Network Devices, and the Storage System. As described earlier, the Manager is at the
center of the entire system and maintains continuous communication with all the other components.
This implies that the relative position of the Manager in the network can influence the
performance of the entire system, since choosing a bad position will certainly increase communication
costs with part of the network and may also delay responses to urgent operational queries, leading to
performance degradation. The problem of finding an optimal position for the Manager entity is
referred to here as the Manager Positioning Problem (MPP).
Let us start by assuming, without loss of generality, that the communication between the
Manager and the Allocator is of minor importance in comparison with the communication with the
many Workers and Network Devices. This can be achieved, for example, if the two components are
executed in the same physical node. Assume, finally, that the D-Cloud is composed of nodes able to
work as Network Devices and Workers at the same time. This assumption only makes the MPP more
general, since each node in the D-Cloud is capable of hosting the Manager.
Figure 51 shows an example of a D-Cloud containing ten Worker/Network Device nodes
(represented by W nodes) and the Manager (the overlaid M node) already allocated in one of these
nodes. Furthermore, each link has an associated delay that is shown in the figure as values annotated
over the links. Thus, $D_{i,j}$ is defined as the shortest-path distance between nodes $i$ and $j$ considering
the delays as link weights. Note that this value is defined to be zero when $i = j$ and is positive
otherwise.
Figure 51 Example of a D-Cloud with ten workers and one Manager
The MPP consists in determining the position of the Manager in the D-Cloud in order to
minimize the cost function $\sum_{i \in N} D_{i,M}$, where $N$ is the set of nodes of the graph and $M$ is the node
hosting the Manager. Please note that this problem is equivalent to solving the 1-median problem,
which can be seen as a very simple form of the replica problems introduced in Section 3.2.4. It can
be solved by calculating the all-pairs shortest paths in a weighted graph, which can be obtained with
the Floyd-Warshall algorithm, whose complexity is $O(V^3)$ [41], where $V$ is the number of nodes.
Using the distances from each node to every other node, the sum of the distances can be calculated
for each candidate node for positioning the Manager. The node with the minimum sum of distances
is the solution to the MPP. The next example details how this simple algorithm can be used in a
particular instance of the MPP.
Considering the example in Figure 51 and using one of the algorithms for all-pairs shortest
path computation, one can obtain the distance matrix below (nodes are numbered from left to right
and from top to bottom):

$$\begin{bmatrix}
0 & 0.3 & 0.1 & 0.6 & 0.11 & 0.21 & 0.31 & 0.36 & 0.61 & 0.81 \\
0.3 & 0 & 0.2 & 0.7 & 0.21 & 0.31 & 0.41 & 0.46 & 0.71 & 0.91 \\
0.1 & 0.2 & 0 & 0.5 & 0.01 & 0.11 & 0.21 & 0.26 & 0.51 & 0.71 \\
0.6 & 0.7 & 0.5 & 0 & 0.51 & 0.61 & 0.71 & 0.76 & 1.01 & 1.21 \\
0.11 & 0.21 & 0.01 & 0.51 & 0 & 0.1 & 0.2 & 0.25 & 0.5 & 0.7 \\
0.21 & 0.31 & 0.11 & 0.61 & 0.1 & 0 & 0.3 & 0.35 & 0.6 & 0.8 \\
0.31 & 0.41 & 0.21 & 0.71 & 0.2 & 0.3 & 0 & 0.05 & 0.3 & 0.5 \\
0.36 & 0.46 & 0.26 & 0.76 & 0.25 & 0.35 & 0.05 & 0 & 0.35 & 0.55 \\
0.61 & 0.71 & 0.51 & 1.01 & 0.5 & 0.6 & 0.3 & 0.35 & 0 & 0.2 \\
0.81 & 0.91 & 0.71 & 1.21 & 0.7 & 0.8 & 0.5 & 0.55 & 0.2 & 0
\end{bmatrix}.$$
Summing up the values in the columns, each element of this cost vector will give the cost of
the function considering that M is hosted in the respective node and the minimal value indicates the
best node to allocate the Manager. For the matrix above, the cost vector is
$$[3.41 \;\; 4.21 \;\; 2.61 \;\; 6.61 \;\; 2.59 \;\; 3.39 \;\; 2.99 \;\; 3.39 \;\; 4.79 \;\; 6.39]$$
and the minimal cost is 2.59, corresponding to the fifth node, where the Manager is already allocated,
as shown in Figure 51.
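A compact sketch of this procedure is given below: Floyd-Warshall produces the distance matrix, and the node with the smallest total distance to all other nodes is selected. It is only an illustration of the method just described; the link-list input format is an assumption made for this example.

```python
# Sketch of the MPP solution: Floyd-Warshall for all-pairs shortest paths,
# then pick the node whose total distance to all other nodes is minimal.
INF = float("inf")

def solve_mpp(n, links):
    """n: number of nodes; links: iterable of (i, j, delay) with 0-based node ids."""
    d = [[0.0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, delay in links:
        d[i][j] = d[j][i] = min(d[i][j], delay)
    for k in range(n):                     # Floyd-Warshall, O(V^3)
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # Summing each row reproduces the cost vector shown above; the minimum
    # entry identifies the node that should host the Manager.
    costs = [sum(row) for row in d]
    best = min(range(n), key=costs.__getitem__)
    return best, costs
```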
Please note that the problem can be simplified when only a restricted subset of the nodes acts as
Workers. In this case, the same solution applies by restricting the cost summation to the Worker
nodes, while still using all links and intermediary nodes as inputs for the path calculation. The same
idea can also reduce the number of path computations by considering only Workers that already host
at least one virtual machine, since the status of these nodes must be maintained more accurately than
that of servers with no running VMs.
One may note that if static delay values are used, the path matrix can be computed only once,
saving execution time. However, a more practical approach to the problem would be to measure the
delay over time according to the traffic in the D-Cloud. In this case, the same solution could be applied
repeatedly, using a delay sample at each point in time (or any function over this value, including
functions over historical values) as the input for a new path matrix computation.
6.2 Virtual Network Allocation
In Nubilum, developers submit their requirements to the system as a virtual network, composed of
virtual nodes and virtual links. A priori, Nubilum’s resource model (condensed in the CloudML)
supports a wide range of virtual network allocation algorithms, considering diverse characteristics
such as geo-locality, network delay, Worker capacity, network capacity, and so on. Such characteristics
are used as inputs of the resource allocation algorithms and can figure as constraints of the problem
or as part of the objective function. Note that Nubilum neither restricts the objective functions that
can be used by the allocation algorithms nor constrains the type of the algorithm (please see Section
0 for a discussion on the topic of Network Virtualization). However, Nubilum confines the
algorithms to work with the characteristics of its resource model, which are summarized in Table
IV.
Table IV Characteristics present in Nubilum's resource model

| Components | Characteristics |
|---|---|
| Physical Network / Physical Nodes | Memory, Storage, CPU; Geo-location; Supported virtualization environment; Functionality; Physical network interfaces; Already allocated virtual machines |
| Physical Network / Physical Links | Capacity and allocated capacity; Current delay; Current bit error rate; Technology; Already allocated virtual links |
| Requested Virtual Network / Virtual Nodes | Memory, Storage, CPU; Geo-location and tolerance delay; OS type; Functionality |
| Requested Virtual Network / Virtual Links | Required rate; Maximum delay |
| Allocated Virtual Network / Virtual Nodes | Memory, Storage, CPU; Geo-location; Functionality; VM state; Owner |
| Allocated Virtual Network / Virtual Links | Capacity and current rate; Current delay; Current maximum bit error rate; Owner |
The characteristics in the table allow several resource allocation algorithms for virtual
networks to be implemented in the Mapper module (see Section 4.3.1) with just minor
modifications. For example, the algorithms D-ViNE and R-ViNE proposed in [10] consider aspects
such as node capacity, node location, maximum tolerance distance, and link capacity for the physical
and the requested virtual networks. The algorithms treat those aspects in an abstract way, but by
associating the abstract node capacity with a concrete characteristic such as Memory, Storage, and/or
CPU and by considering the distance measured in delay, these algorithms could be implemented
without further modifications. Similarly, other proposals cited in [31] and [3] could be adapted
for use in Nubilum.
This section presents a resource allocation algorithm that works with a subset of the
characteristics supported by Nubilum. The objective of this algorithm is to guarantee the load
balancing of the physical resources (servers and links). The problem is to allocate an incoming
virtual network on the physical network so as to balance the load of virtual resources allocated on
physical resources, subject to some constraints; i.e., the problem can be defined as one that
allocates the virtual network in order to minimize the sum of two components: the usage of
resources at physical nodes and the maximum number of virtual links on a given physical link.
The problem considers restrictions associated with virtual nodes and virtual links. In the case of
virtual nodes, memory, storage, CPU, geo-location, and functionality constraints are considered,
whereas in the case of virtual links the problem considers only a required maximum delay of the
physical path. Table V shows this set of characteristics. Note that, differently from other
approaches, the present problem does not consider link capacity, since unrestricted capacities seem
to provide better insights and more useful results in a pay-as-you-go business plan.
Table V Reduced set of characteristics considered by the proposed allocation algorithms

| Components | Characteristics |
|---|---|
| Physical Network / Physical Nodes | Memory, Storage, CPU; Geo-location; Functionality; Already allocated virtual machines |
| Physical Network / Physical Links | Current delay; Already allocated virtual links |
| Virtual Network Request / Virtual Nodes | Memory, Storage, CPU; Geo-location; Functionality |
| Virtual Network Request / Virtual Links | Maximum delay |
In the following, Section 6.2.1 formally defines the problem stated above. In order to solve
the problem, the solution is divided into two steps: virtual node allocation, which is covered in
Section 6.2.2, and the allocation of virtual links, discussed in Section 6.2.3. The proposed algorithms
are evaluated in Section 6.2.4.
6.2.1 Problem definition and modeling
The physical or substrate network is represented by an undirected graph $G^S = (N^S, E^S)$. Each
substrate node $v^S \in N^S$ has three distinct capacities: $c(v^S)$, $m(v^S)$, and $s(v^S)$, representing the
remaining CPU, memory, and storage capacities of each substrate node. Each substrate link has an
associated delay represented by the function $d^S: E^S \to \mathbb{R}$. The current stress $s(e^S)$ of a substrate link
$e^S \in E^S$ is the number of virtual links that already pass through this substrate link.
Each virtual network is given as a graph $G^V = (N^V, E^V)$. Each virtual node $v^V \in N^V$ has
three distinct capacity requirements: $c(v^V)$, $m(v^V)$, and $s(v^V)$, representing CPU, memory, and
storage consumption. Each virtual link $e^V \in E^V$ has an associated maximum delay given by a function
$d^V: E^V \to \mathbb{R}$. Each virtual node will be assigned to a substrate node. This assignment can be seen
as a function $M_N: N^V \to N^S$. For each virtual node $v \in N^V$, the set $N^S(v) \subset N^S$ indicates the
physical nodes where this virtual node can be allocated.
Each virtual link will be assigned to a simple substrate path between the substrate nodes that
host the virtual nodes at both ends of that virtual link. Let $P^S$ be the set of simple paths of $G^S$ and
$P^S(u, w)$ the set of simple paths of $G^S$ between physical nodes $u$ and $w$; this assignment can be seen
as a function $M_E: E^V \to P^S$ such that if $e^V = (u, w) \in E^V$, then $M_E(e^V) \in P^S(u, w)$. If $p \in P^S$ is any
simple path of $G^S$, then its delay $d^S(p)$ is the sum of the delays of the physical links of the path. For
$e^S \in E^S$, the stress $S(e^S)$ generated by the allocation of $G^V$ is defined as the number of virtual links
$e^V \in E^V$ that pass through the substrate link $e^S$, i.e., the number of virtual links $e^V \in E^V$ such that
$e^S$ is a substrate link of the path given by $M_E(e^V)$.
Given a substrate network $G^S = (N^S, E^S)$ and a virtual network $G^V = (N^V, E^V)$, the problem
is to find mappings $M_N: N^V \to N^S$ and $M_E: E^V \to P^S$ in order to minimize

$$\max_{e^S \in E^S}\left[s(e^S) + S(e^S)\right] + \max_{v \in N^S}\left[\sum_{M_N(u)=v} c(u) - c(v)\right] + \max_{v \in N^S}\left[\sum_{M_N(u)=v} m(u) - m(v)\right] + \max_{v \in N^S}\left[\sum_{M_N(u)=v} s(u) - s(v)\right]$$

subject to the constraints:

$$\forall v \in N^V,\; M_N(v) \in N^S(v) \qquad (1)$$
$$\forall e = (u, w) \in E^V,\; M_E(e) \in P^S\big(M_N(u), M_N(w)\big) \qquad (2)$$
$$\forall e \in E^V,\; d^S\big(M_E(e)\big) \le d^V(e) \qquad (3)$$
$$\forall v \in N^S,\; \sum_{M_N(u)=v} c(u) \le c(v) \qquad (4)$$
$$\forall v \in N^S,\; \sum_{M_N(u)=v} m(u) \le m(v) \qquad (5)$$
$$\forall v \in N^S,\; \sum_{M_N(u)=v} s(u) \le s(v) \qquad (6)$$
The objective function aims to minimize the sum of the maximum link stress and the node terms
related to the remaining CPU, memory, and storage capacities, thus balancing the load. The first
constraint (1) implies that each virtual node will be allocated in a required physical node, according
to the given geo-location restriction. The second constraint (2) enforces that the virtual link between
virtual nodes $u$ and $w$ is allocated on a path between the physical nodes in which these virtual nodes
are allocated. The third constraint (3) implies that the maximum delay of the virtual link will be
respected by the path to which it is allocated. The last three constraints (4), (5), and (6) enforce that
the capacity restrictions of the virtual nodes are fulfilled.
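For illustration only, the sketch below scores a candidate mapping under the formulation above and checks constraints (3) to (6). The dictionary-based data layout is an assumption made for this example; it is not Nubilum's Mapper implementation.

```python
# Illustrative scoring of a candidate mapping under the formulation above.
# The dictionary layout is an assumption made for this sketch only.

def objective(substrate, virtual, node_map, link_map):
    """node_map: virtual node -> substrate node; link_map: virtual link -> list of substrate links."""
    # Stress term: current stress plus the virtual links newly routed over each substrate link.
    added = {}
    for path in link_map.values():
        for e in path:
            added[e] = added.get(e, 0) + 1
    link_term = max(substrate["stress"][e] + added.get(e, 0) for e in substrate["links"])

    node_terms = 0
    for res in ("cpu", "mem", "sto"):
        demand = {}
        for u, v in node_map.items():
            demand[v] = demand.get(v, 0) + virtual[res][u]
        # Constraints (4)-(6): the summed demands must fit the remaining capacity.
        assert all(demand.get(v, 0) <= substrate[res][v] for v in substrate["nodes"])
        node_terms += max(demand.get(v, 0) - substrate[res][v] for v in substrate["nodes"])

    # Constraint (3): the path delay must respect the virtual link's maximum delay.
    for vlink, path in link_map.items():
        assert sum(substrate["delay"][e] for e in path) <= virtual["max_delay"][vlink]

    return link_term + node_terms
```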
6.2.2 Allocating virtual nodes
The problem of allocating virtual nodes on multiple physical nodes considering multiple constraints
can be reduced to that of the multidimensional multiple knapsacks, which is NP-complete [36].
Therefore, the algorithm for virtual node allocation proposed in this Thesis uses a greedy approach,
as shown in Figure 52. This algorithm consists in selecting, for each virtual machine of a new
request, the appropriate Workers according to the storage, memory and CPU capacities associated
with the virtual nodes and also the geo-location constraints.
The algorithm starts by sorting the virtual nodes (VMs) and the Workers to be processed.
This sorting is made in the following order: highest free memory, highest free CPU, and lastly highest
free storage space (line 4 for virtual machines and line 6 for Workers). For each virtual machine, the
algorithm tries to greedily allocate it to the Worker with the most remaining capacity that satisfies its
location restriction (modularized in a simple check in line 8). In other words, this check takes a
given virtual node and verifies whether it can be allocated in the Worker currently being tested.
If it is possible, the virtual node is allocated in this Worker (line 9) and the algorithm moves on to
the other virtual nodes; if it is not possible, the next Worker in the list of Workers is tested.
The algorithm allocates the virtual machines that need more resources first. Note that the
algorithm performs admission control: if a virtual machine in the list cannot be mapped to a
Worker, then the entire request is rejected (line 13).
Algorithm 1: Allocation of virtual nodes
1 Inputs: vmList, workerList;
2 Outputs: vmMapping;
3 begin
4 sortByDecreasingValueOfResources (vmList);
5 for each vm in vmList do
6 sortByRemainingResources (workerList);
7 for each worker in workerList do
8 if( tryToAllocate (vm, worker) ) then
9 vmMapping.Add(vm, worker);
10 update( worker );
11 stop;
12 end
13 if vm not in vmMapping then stop;
14 end
15 end
Figure 52 Algorithm for allocation of virtual nodes
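A compact Python rendering of Algorithm 1 is sketched below. Capacities are summarized as (memory, CPU, storage) tuples and the geo-location and functionality checks are folded into fits(); both are simplifying assumptions made for this illustration.

```python
# Sketch of Algorithm 1: greedy allocation of virtual nodes to Workers.
# Capacities are (memory, cpu, storage) tuples; fits() abstracts the capacity,
# geo-location, and functionality checks described in the text.

def fits(vm, worker):
    # In practice this would also test geo-location and functionality.
    return all(need <= free for need, free in zip(vm["req"], worker["free"]))

def allocate_nodes(vms, workers):
    mapping = {}
    vms = sorted(vms, key=lambda vm: vm["req"], reverse=True)   # biggest demands first
    for vm in vms:
        workers.sort(key=lambda w: w["free"], reverse=True)     # most remaining capacity first
        for worker in workers:
            if fits(vm, worker):
                mapping[vm["id"]] = worker["id"]
                worker["free"] = tuple(f - r for f, r in zip(worker["free"], vm["req"]))
                break
        else:
            return None                                         # admission control: reject the request
    return mapping
```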
6.2.3 Allocating virtual links
Regarding link allocation, the algorithm focuses on minimizing the maximum number of virtual
paths allocated over a physical link while obeying the restriction on the maximum delay. In order to
achieve link load balancing when allocating on physical links, a minimax path with delay restriction
algorithm was used to determine the paths that will be allocated.
A minimax path between two Workers is the path that minimizes the maximum weight of any
of its links. The link weight is given by the number of virtual paths already allocated on a
physical link, so the minimax path has the minimum maximum link weight. Figure 53 illustrates the
minimax path concept considering the paths between two nodes S and T. The numbers annotated on
the links, separated by a semicolon, are, on the left, the link weight used in the minimax path
calculation and, on the right, the delay used for checking the restriction. In the example, the path costs
of P1 and P2 are 4 and 2, respectively, which are the maximum weights of the links in each path. The
minimax path is P2, but P1 is the chosen path when the delay restriction is considered.
Figure 53 Example illustrating the minimax path
To determine the minimax path with delay restriction between two nodes, one can execute a
binary search on a vector ranging from the minimum link weight to the maximum link weight of the
graph. In each iteration of the binary search, a weight k is selected and all links with weight greater
than k are removed from the graph. The Dijkstra algorithm, using the delay as the cost, is then
executed to verify whether there is a path between the two nodes in this new graph that respects
the maximum delay. If there is such a path in this graph, the upper bound is adjusted to the current
k. Conversely, if there is no such path, the lower bound is adjusted to k. The binary search
is repeated until the vector is reduced to two elements. Then, k is chosen as the element at the upper
bound, and Dijkstra is used again to verify the delay constraint. If no path is found for that link, it
cannot be allocated in this network. This simple algorithm is shown only to illustrate a solution
to the problem, but more efficient algorithms can be used, like the ones proposed for the
bottleneck shortest path problem, which is a closely related problem [34].
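The sketch below illustrates this procedure with a binary search over the distinct stress values and a delay-weighted Dijkstra as the feasibility test. It returns only the minimax cost (the path itself could be recovered from Dijkstra's predecessors) and is meant as an illustration rather than the more efficient solutions mentioned above.

```python
# Sketch of the minimax path with delay restriction: binary search over the
# link-weight (stress) values, with a delay-based Dijkstra feasibility test
# on the subgraph of links whose weight does not exceed the candidate k.
import heapq

def dijkstra_delay(nodes, links, src, dst, k):
    """links: dict (u, v) -> (weight, delay), undirected. Min delay using only links with weight <= k."""
    adj = {n: [] for n in nodes}
    for (u, v), (w, d) in links.items():
        if w <= k:
            adj[u].append((v, d))
            adj[v].append((u, d))
    dist = {n: float("inf") for n in nodes}
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, dd in adj[u]:
            if d + dd < dist[v]:
                dist[v] = d + dd
                heapq.heappush(heap, (dist[v], v))
    return dist[dst]

def minimax_path_cost(nodes, links, src, dst, max_delay):
    """Smallest k such that src reaches dst within max_delay using links of weight <= k, or None."""
    weights = sorted({w for w, _ in links.values()})
    lo, hi, best = 0, len(weights) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        if dijkstra_delay(nodes, links, src, dst, weights[mid]) <= max_delay:
            best, hi = weights[mid], mid - 1
        else:
            lo = mid + 1
    return best
```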
Similarly to the problem of virtual node allocation, the allocation of several virtual links
focusing on minimizing the number of virtual links on each physical link is an NP-hard optimization
problem (see [74]); this is why the solution presented here is a greedy approach based on heuristics.
In the virtual link allocation algorithm (Figure 54), each iteration computes a minimax path
with delay restriction for each virtual link not yet allocated. If one of those virtual links cannot be
allocated, then the entire process is stopped (lines 7 and 8). The virtual link with the largest minimax
cost is allocated in the current iteration; the key idea behind this heuristic is that this is the most
restrictive virtual link. The next iterations continue with this behavior, but the computation of the
minimax path considers the newly allocated virtual link added to the graph. Thus, this greedy
algorithm performs link allocation considering the minimax path in order to achieve load balancing
without violating the delay restriction.
Algorithm 2: Allocation of virtual links
1 Inputs: vmMapping, vLinkList, physicalLinksState;
2 Outputs: linkMapping;
3 begin
4 while (vLinkList)
5 for each vlink in vLinkList do
6 status = minimaxPathWithDelay (vlink, vmMapping, physicalLinksState);
7 if status = reject then
8 stop;
9 end
10 selectedVLink = largestMinimaxCost;
11 linkMapping.Add( allocate(selectedVLink));
12 vLinkList.remove( selectedVLink );
13 update( physicalLinksState );
14 end
15 end
Figure 54 Algorithm for allocation of virtual links
6.2.4 Evaluation
An event-driven simulator for the allocation of virtual networks was implemented to compare the
performance of the algorithms proposed by this Thesis (Minimax Path Algorithm – MPA) against
the algorithm of Zhu and Ammar (ZAA), already described in Section 3.2.4. This specific
evaluation is interesting since the performance of ZAA is a reference for virtual network allocation
problems with load balancing, as can be seen in the results shown by Chowdhury et al. [10].
Among the contributions of Chowdhury et al. is a virtual network allocation algorithm
with load balancing that was shown to achieve worse results than ZAA. Moreover, a Hybrid solution
that applies the ZAA heuristic for node allocation and the MPA strategy for link allocation will be
tested. Thus, the Hybrid strategy allows comparing just the link allocation strategy of MPA against
the link allocation of ZAA, showing the impact of the minimax strategy.
The simulator performs an event-driven simulation, whose main events are arrival and
departure of virtual networks from a D-Cloud. The D-Cloud is represented as a topology composed
of Workers connected by links. The simulation model considers a simplified version of the
model introduced in the previous section, i.e., the several constraints such as link delay, CPU, memory,
storage, and geo-location were removed from this simulation model in order to evaluate the MPA in
the same scenario for which ZAA was developed. Thus, the simulation model considers only the
stress, which is defined for Workers and physical links. In the case of a Worker, the stress is the number
of virtual nodes currently allocated on that Worker; for a physical link, the stress is the
number of virtual links currently allocated on that physical link.
In this way, the node allocation strategy of MPA was modified to consider only the stress as
the metric for optimization, disregarding the aspects covered by the CPU, memory, storage, and
geo-location restrictions. Thus, the algorithm is similar to the least-load strategy compared in Zhu
and Ammar's paper [74], with the difference that MPA's link allocation algorithm uses the minimax
path while the least-load strategy studied in their paper uses the shortest path. Observe also that the
link allocation algorithm was changed to consider the delay of each link as one unit.
The adopted physical network is the backbone of the RNP (Rede Nacional de Ensino e Pesquisa),
a Brazilian academic ISP, which is currently composed of 28 nodes and 33 links, as shown in Figure
55(b)8. In this evaluation, an old topology of the RNP was also considered, composed
of 27 nodes and 29 links (shown in Figure 55(a)). As one can see, the current topology has 5 nodes
of degree one, while the old one has 17. The impact of this aspect, as will be shown in the results, is
that the number of disjoint paths tends to be greater in the current RNP topology, affecting the
performance of MPA.
Figure 55 The (a) old and (b) current network topologies of RNP used in simulations
The several factors varied in the simulation are shown in Table VI. Except for the network
topology, the other factors were obtained from the paper of Zhu and Ammar, since the idea is to
compare the algorithms in the same scenario proposed for ZAA. The virtual network requests
arrive according to a Poisson process whose average rate was set to 1, 5, 9, and 13 requests per unit
time in each scenario. The lifetime of each virtual network follows an exponential distribution with
mean equal to 275 time units. The evaluation includes only networks in a star topology, since the
authors of ZAA argued that their algorithm is better for this type of network. Observe that Zhu and
Ammar also propose an algorithm for graph division that subdivides a general network into several
star networks, but, since only star networks are used in this simulation, that algorithm is not
evaluated. The size of each virtual network is uniformly distributed from 5 to 15 nodes, including the
center node. Thus, the evaluation is carried out in a favorable case for ZAA. On the other hand, their
algorithm for adaptive reallocation is not used in this comparison, since the objective here is to
evaluate the allocation only.

8 http://www.rnp.br/backbone/index.php
Table VI Factors and levels used in the MPA's evaluation

| Factors | Levels |
|---|---|
| Physical network topology | Old RNP topology and current RNP topology (one Worker per PoP) |
| Virtual network request rate | Exponentially distributed with rate equal to 1, 5, 9, and 13 virtual networks per unit time |
| Virtual network lifetime | Exponentially distributed with mean equal to 275 time units |
| Virtual network topology | Star networks |
| Size of the virtual networks | Uniformly distributed between 5 and 15 nodes |
As in Zhu and Ammar's paper, the simulation is finished after 80,000 requests have
been serviced and, to reduce the effects of the warm-up period, the metrics are collected after 40,000
requests have been serviced. For each simulation run, the maximum node stress, the maximum link
stress, the mean link stress, and the mean path length are observed and averaged across the
simulation time. The results were calculated with 95% confidence intervals, which are shown when
necessary.
Regarding the maximum node stress (shown in Figure 56), the results show that the MPA
strategy reaches the same performance as the ZAA and Hybrid solutions. Actually, such a result
can be anticipated from ZAA's paper, which shows that their algorithm for allocating star networks
obtains a maximum node stress equivalent to that of the least-load algorithm. These results are
similar in both evaluated topologies.
Figure 56 Results for the maximum node stress in the (a) old and (b) current RNP topology
Regarding the maximum link stress, one can observe that the Hybrid solution and ZAA outperform
MPA in the old topology (Figure 57(a)). The maximum link stress of MPA is about 43% greater than
that of ZAA in the worst case, at an arrival rate equal to one; in the best case (at 13 requests per unit
time) MPA is 23% greater than ZAA. This occurs because the node allocation of ZAA
takes into consideration the stress of neighboring links, and its path cost has the property of adapting
between a shortest path and a least-loaded path according to the current load. Thus, initially, ZAA
tends to allocate nodes in less stressed regions of the graph connected through shorter paths. On
the other hand, MPA tends to allocate the virtual nodes in physical nodes that are not necessarily
close, which can cause the use of longer paths, even when the network is lightly stressed.
In this same scenario, the Hybrid strategy shows how the minimax path allocation strategy can
outperform the strategy adopted in ZAA. This strategy eliminates the bad effect of MPA's node
allocation by using the node allocation heuristic of ZAA. The gain in the maximum link stress
of the Hybrid strategy is about 35% and it is practically constant over the variation of the arrival
rate.
Figure 57 Results for the maximum link stress in the (a) old and (b) current RNP topology
MPA obtains better results than ZAA in a topology with more links available for allocation, as shown
in Figure 57(b). In this way, it was perceived that the addition of more links in the network reduces
the influence of the node positioning algorithm and makes the link positioning strategy more decisive.
When the arrival rate is one request per unit time, the gain of MPA over ZAA is about 4%, whereas as
the arrival rate increases the MPA gain grows to about 40% over ZAA. For the Hybrid approach, the
gain in the maximum link stress is about 13% and it is practically constant over the variation of the
arrival rate. MPA's better performance is due to the fact that the minimax path algorithm tries
to directly minimize the maximum number of virtual links on each physical link, which brings better
results than searching for the path whose stress is minimal, as is the case with the algorithm of
Zhu and Ammar.
The mean link stress is shown in Figure 58. These results show that the MPA link allocation
strategy obtains a mean link stress higher than that of ZAA, which is an expected result since ZAA
optimizes the mean link stress, while MPA directly optimizes the maximum link stress.
Figure 58 Results for the mean link stress in the (a) old and (b) current RNP topology
Figure 59 shows the mean path length, i.e., the mean length of each virtual link in the physical
network, measured as the number of physical links on which the virtual link is allocated. As
one can see, the MPA and Hybrid solutions use, on average, one link more than ZAA. This result
occurs because the minimax path allocation algorithm seeks alternative paths in the network, leading
to an increase of this metric.
Figure 59 Mean path length in the (a) old and (b) current RNP topology
Another point that must be highlighted is that the original problem solved by MPA is to
allocate the nodes considering several restrictions, including the geographical ones. In this way, if a
modified version of the ZAA node allocation algorithm were used in a D-Cloud scenario, it could
obtain poor performance, since the virtual nodes should be allocated in their respective
geographical regions, preventing ZAA from allocating the virtual nodes in less stressed regions of the
network connected by shorter distances. Differently, MPA's node allocation algorithm is
designed for the D-Cloud, as it tries to maintain a lower stress level inside each geographical region
of the D-Cloud.
6.3 Virtual Network Creation
In Nubilum, the Request Description XML Scheme allows a request to have virtual nodes only (see
Section 5.1.1). In this case, the submitted request is not a virtual network and finding a specific route
with a maximum delay is not necessary; simple routing through shortest paths would be sufficient.
However, the D-Cloud provider could use some solution to create a virtual network for
intercommunication between the virtual machines of this new request in order to obtain a better
usage of its network resources. Working with this scenario, this section investigates the problem of
creating a virtual network when a request is submitted without any specified links.
The problem consists in creating a network connecting a subset of the physical nodes (the
nodes where the virtual nodes were previously allocated by Algorithm 1) with the objective of
balancing the load across the physical links and, secondly, minimizing the energy consumption. Figure
60(a) shows an example of a given physical topology where the virtual machines are allocated in
nodes A, B, and E and the number in each physical link indicates the number of current virtual links
crossing this physical link. Figure 60(b) shows the virtual network (the grey lines) created to
interconnect those nodes. The created virtual network reaches the two objectives of balancing the
load in the network while reducing the number of used links, which leads to the reduction of the
energy consumption.
Figure 60 Example of creating a virtual network: (a) before the creation; (b) after the creation
This problem can be better defined as finding a minimum-length (in number of links) interconnection
for a subset of nodes such that the maximum weight of its links is minimal. The minimum Steiner
tree problem can be reduced to this one by considering the length in number of links and setting all
link weights to the same arbitrary value. On the other hand, if the value of the minimal
maximum weight is given, the problem reduces to finding a minimum length Steiner tree.
As can be seen, the proposed problem is NP-hard. Therefore, approximate solutions will be
proposed. The reduction of the problem to a Steiner tree gives an interesting idea for an
approximate algorithm. Such an algorithm consists in executing a binary search on a vector ranging
from the current minimum link weight to the maximum link weight of the graph. During each
iteration, a weight k is selected and all links with weight greater than k are removed from the graph.
An approximate minimum length Steiner tree algorithm [65] is executed on this subgraph to verify whether
there is a tree between the nodes. If there is a Steiner tree in this graph, the upper bound is adjusted
to the current k. Conversely, if there is no such tree in this graph, the lower bound is adjusted to k. The
binary search is repeated until the vector is reduced to two elements. Then, k is chosen as the
element at the upper bound, and the Steiner tree approximation gives the virtual network for the
problem.
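A sketch of this outer binary search is shown below. The inner steiner_tree routine (any of the algorithms of Section 6.3.1) is passed in as a black box that returns a tree or None; this interface is an assumption made for the illustration.

```python
# Sketch of the outer binary search used for virtual network creation.
# steiner_tree(nodes, links, terminals) is a black-box inner routine
# (e.g. STA, GHS, or OA from Section 6.3.1) returning a tree or None.

def create_virtual_network(nodes, links, terminals, steiner_tree):
    """links: dict (u, v) -> weight (current stress); terminals: nodes to interconnect."""
    weights = sorted(set(links.values()))
    lo, hi, best = 0, len(weights) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        subgraph = {e: w for e, w in links.items() if w <= weights[mid]}
        tree = steiner_tree(nodes, subgraph, terminals)
        if tree is not None:                  # a tree exists under this maximum weight
            best, hi = tree, mid - 1
        else:
            lo = mid + 1
    return best                               # minimal-maximum-weight tree, or None
```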
This algorithm is similar to the minimax path with delay restriction algorithm described
previously in Section 6.2.3, since it is defined by an outer function performing a binary search and an
inner algorithm that solves a specific problem (the minimum delay path in the previous algorithm, and
the minimum length Steiner tree in this one). The main idea behind this combined approach is that the
binary search provides the load balance by searching for the minimal maximum link weight, while the
Steiner tree approximation gives the minimum length virtual network, which minimizes energy. Next,
the algorithms used to solve the minimum length Steiner tree problem (Section 6.3.1) and an
evaluation of these algorithms (Section 6.3.2) are presented.
6.3.1 Minimum length Steiner tree algorithms
This section shows three algorithms to solve the minimum length Steiner tree problem in a graph.
The first algorithm is a well-known approximation based on the minimum spanning tree of the
distance graph [65]. The second and third algorithms are proposed by this Thesis: one uses a greedy
heuristic that searches for the best hubs in the network in order to minimize the path lengths, and
the other is an exponential algorithm that finds the minimum Steiner tree through successive attempts
at link removal.
Steiner Tree Approximation (STA)
Basically, the STA algorithm consists in transforming a general Steiner tree problem into a metric
Steiner tree problem, which is a variant of the general problem where the graph is complete and the
links satisfy the triangle inequality ($cost(u, v) \le cost(u, w) + cost(v, w)$). The transformation is
made by calculating the all-pairs shortest paths in the original graph $G$ and generating a new
connected graph $G'$ whose nodes are the nodes of the original graph and whose link costs are the
costs of the shortest paths between the nodes in the original graph.
In $G'$ one can find a minimum spanning tree $T'$ that is a 2-approximation of the minimum
Steiner tree in this distance graph. Given this tree, one can replace its links by the original paths to
obtain a subgraph of $G$. If this subgraph contains cycles, removing links will generate a
2-approximation of the minimum Steiner tree in the general graph. More details and proofs on this
process can be found in [65]. The complexity of the STA algorithm is $O(N^3)$, where $N$ is the
number of physical nodes.
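The sketch below follows the STA steps just described: all-pairs shortest paths, a minimum spanning tree on the distance graph (restricted here to the requested nodes, a common simplification), and expansion of the MST edges back into physical paths; the final cycle-removal step is only noted in a comment. It is an illustration written for this text, not the implementation evaluated in Section 6.3.2.

```python
# Sketch of the STA 2-approximation for the minimum length Steiner tree.
import heapq

def sta(nodes, links, terminals):
    """links: dict (u, v) -> length (use 1 for hop count); returns a set of physical links."""
    adj = {n: [] for n in nodes}
    for (u, v), w in links.items():
        adj[u].append((v, w))
        adj[v].append((u, w))

    def shortest_paths(src):
        dist = {n: float("inf") for n in nodes}
        prev = {}
        dist[src] = 0
        heap = [(0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue
            for v, w in adj[u]:
                if d + w < dist[v]:
                    dist[v], prev[v] = d + w, u
                    heapq.heappush(heap, (dist[v], v))
        return dist, prev

    info = {t: shortest_paths(t) for t in terminals}

    # Prim's MST on the complete distance graph over the terminals.
    in_tree, tree_edges = {terminals[0]}, []
    while len(in_tree) < len(terminals):
        u, v = min(((a, b) for a in in_tree for b in terminals if b not in in_tree),
                   key=lambda e: info[e[0]][0][e[1]])
        tree_edges.append((u, v))
        in_tree.add(v)

    # Expand each MST edge into its shortest path in the original graph.
    used = set()
    for u, v in tree_edges:
        prev, node = info[u][1], v
        while node != u:
            used.add(tuple(sorted((prev[node], node))))
            node = prev[node]
    # If the union contains cycles, a spanning tree of this subgraph removes them.
    return used
```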
Greedy Hub Selection (GHS)
The solution of the Steiner tree contains a subset of nodes of the graph that act as hubs
interconnecting two or more nodes of the Steiner tree (the nodes C and F in Figure 60(b)). Thus,
given a graph and a subset of physical nodes containing the requested virtual nodes, the objective of
the GHS algorithm is to find the hubs of the minimum length Steiner tree interconnecting the
virtual nodes. This problem is similar to the replica placement problems (Section 0), but with the
difference that the replicas must form a virtual tree topology, which makes the problem more
difficult.
The GHS algorithm starts with a tree formed by the physical nodes where a virtual node was
allocated. One of these nodes is chosen as the root of the tree and as the first hub. Following an
iterative approach, a new hub node is placed at the best location in the network, defined as the one
which achieves the minimal number of used links (the cost) among all the possible positions. This
location is then fixed. The new hub is then connected to other nodes in the tree through shortest
paths, and a heuristic is used to maintain the tree structure. The positioning of a new hub and the
immediate link rewiring reduce the cost, and the process continues while the positioning of a new
hub keeps reducing the cost.
Algorithm 3: Search Procedure
1 Inputs: nodeList;
2 Outputs: selectedNode, best;
3 best = infinite;
4 for each node in nodeList(available) do
5 cost = placementProcedure(node);
6 if(cost < best)
7 selectedNode = node;
8 best = cost;
9 end
10 undoInsertion(node);
11 end
Figure 61 Search procedure used by the GHS algorithm
The selection characterizes GHS as greedy: it selects the best possible configuration for the
current iteration. Thus, GHS searches for a solution with a relevant degree of optimality, calculating
the new value of the cost for each selected candidate physical node and selecting the one that
achieves the best cost. The pseudo-code of this search procedure is presented in Figure 61. Such a
procedure simply selects any available physical node as a candidate. The variable nodeList can be
indexed with available, hub, root, or requested in order to return, respectively: the available
nodes; the hub nodes other than the root; the root node; or the other requested nodes. Thus,
nodeList(available) returns the nodes of the physical network that are not already a hub, the
root, or a requested node. After that, the algorithm calls the placementProcedure (Figure 62),
adding a new hub in this candidate and rewiring the virtual links (line 5). The cost achieved by
placing this new hub is calculated and, if better than the others, the node is selected as the best
candidate (lines 7 and 8). Line 10 returns the network to the state before the hub insertion in order
to prepare the network for the next candidate hub.
The placement strategy (Figure 62) is a heuristic approach that links a candidate hub to a parent
and children maintaining the tree structure and guaranteeing optimality. The parent is selected firstly
as the nearest hub considering the shortest path length (line 3), but if the virtual link from the
current parent of this nearest hub crosses the candidate (line 4) the parent of the candidate will be
the parent of the nearest hub, and the candidate will be the new parent of this hub. After that, the
children of the new hub are selected amongst the other hub and requested nodes (line 10). A node is
chosen as a child if the new hub is a better parent (nearer) than the current parent and if the node is
not an ancestor of the new hub. After placing the new hub, the new number of used links is
returned.
Algorithm 4: Placement Procedure
1 Inputs: candidate
2 Outputs: newCost
3 nearest = nearestHub(candidate);
4 if(path(nearest,parent(nearest)).contains(candidate))
5 parent(candidate) = parent(nearest);
6 parent(nearest) = candidate;
7 else
8 parent(candidate) = nearest;
9 end
10 for each node in nodeList(hub or requested) do
11 if(distance(node, parent(node)) > distance(node, candidate) and not
isAncestor(node, candidate))
12 parent(node) = candidate;
13 end
14 end
15 newCost = calculateNewCost();
Figure 62 Placement procedure used by the GHS algorithm
In order to clarify the proposed heuristic, let’s consider the current virtual tree in Figure 63(a),
which is formed by the grey nodes, with the R node representing the selected root node, the H node
as a hub selected in a previous iteration, and the grey lines indicating the path of the current virtual
links. The white nodes are available for adding new hubs. Considering the current candidate as the
node A, one must select the nearest one as node H, since it is the nearest hub from A. However, as
the path that goes from H to its parent R passes through A, the parent of A is set as R and H as a
child of A. Finally, one must set all the other children of A, as any node that is not an ancestor of
the new hub (as the parent of A was already set, its ancestor is well defined as the node R) which has
a distance to A of less than its distance to its own parent. Notice that the nodes that can be its
children are 1, 2, and 3. For such nodes, only the node 1 is nearer to node A than to its own parent,
the others are nearer to node H. So, the new children of A are 1 and H. Figure 63(b) depicts the new
configuration. The cost function should be calculated now and compared to other candidate results.
Figure 63 Example of the placement procedure: (a) before and (b) after placement
These two procedures are the core of GHS, but there are two additional procedures. The
procedure for selecting the root node tries to select the virtual node that minimizes the sum of
the distances between all the allocated virtual nodes, which is equivalent to the MPP explained in
Section 6.1, where the link weights are all equal to one and the summation is made only over a subset
of nodes. This problem can be solved in the same way as the MPP. After choosing the root node,
the virtual links of the initial virtual tree are created through the shortest path between the root node
and each other requested node. If the virtual link between a requested node A and the root node
passes through a requested node B, node B will be the parent of A.
The other procedure is an external loop that calls the search procedure to greedily add new
hubs to the network until the cost achieved by the placement of a new hub is no longer reduced. The
complexity of GHS is $O(N^3)$.
Optimal Algorithm (OA)
Because a Steiner tree never contains a cycle, there exists a subgraph of the original graph
which is a tree and contains a minimal Steiner tree for any given subset of nodes. Observe that in
this subgraph there is only one path between any two nodes. Thus, if the graph is already a tree, the
minimal Steiner tree between the given subset of nodes can be found in linear time through a
depth-first search.
Considering such a property, an optimal algorithm to find the minimum length Steiner tree can
be proposed: for the connected component of the graph in which the subset of nodes lies – which
can be found by a breadth-first search after removing the links with weight greater than k – count
the minimal number of links that need to be removed in order to turn this graph into a tree. If
$N$ is the number of nodes and $L$ the number of links of the considered component, this number is
given by $m = L - N + 1$ [41]. Then, in this component, for each subset of $m$ links, remove them
and find the optimal Steiner tree in the resulting subgraph, which is a tree, through the
depth-first search, as observed previously. As the algorithm tries every possible way of removing
these links, one of them will find the tree which leads to an optimal Steiner tree. The complexity of
this algorithm is $O\big(\binom{L}{m}(N + L)\big)$.
6.3.2 Evaluation
This section evaluates the minimum length Steiner tree algorithms discussed in the previous section.
The section does not evaluate the overall solution for the creation of the virtual network, which has
the binary search as the outer procedure, but it evaluates only the inner Steiner tree procedure, i.e.,
this evaluation only covers the energy reduction objective.
The evaluation was made through Monte Carlo simulations considering a fixed physical
topology with the random positioning of the virtual nodes. In each simulation sample, the stress of
each physical link is drawn from a uniform distribution and the virtual nodes to be connected are
positioned uniformly at random. Note that each sample is independent, and the physical network is
returned to its initial values before a new set of requested nodes is attempted. The two RNP
topologies used in Section 6.2.4 and shown in Figure 55 are also used in this experiment. At each
run of the simulation, each algorithm (STA, GHS, and OA) is submitted to the same scenario and
the number of used physical links (the cost) in the resulting tree is measured for each algorithm
independently.
The factors varied in the experiment are shown in Table VII. For the old RNP topology, the
number of requested virtual nodes in the network is varied through the simulations from 3 to 27
nodes, which is equivalent to 11% and 100% of the physical nodes, respectively. For the current
RNP topology, this number ranges from 3 to 28. For each run, out of a total of 1000, the requested
nodes were positioned in different physical nodes at random, with every subset of the
physical nodes equally likely to be selected, without repetitions. For each sample, the relative error
between the costs of the GHS and STA algorithms was calculated against the optimal cost. All the
results shown use a confidence level of 95%, which is indicated in the graphs.
Table VII Factors and levels used in the GHS's evaluation

| Factors | Levels |
|---|---|
| Physical network topology | Old RNP topology and current RNP topology (one Worker per PoP) |
| Number of requested virtual nodes | 3 to 27 (old RNP); 3 to 28 (current RNP) |
Figure 64, Figure 65, Figure 66, and Figure 67 show the main results for the algorithms. The
graphs in Figure 64 and Figure 66 present the percentage of samples that reached the optimum
(relative error equal to zero), and the graphs in Figure 65 and Figure 67 present the percentage of
samples that reached a relative error below five percent. These percentages are shown
according to the number of requested nodes. Results for requests with 27 nodes in the old RNP
topology scenario are not shown because this scenario is simple and all the samples reach the
optimum. The same occurs for 28 nodes in the current RNP topology scenario.
As shown in Figure 64, the GHS algorithm achieves the optimum cost in 100% of the samples
for virtual networks with 3 nodes and in 95% to 99% of the samples for virtual networks of 4, 5, 6,
and 7 nodes. However, the performance of GHS tends to decrease as the number of requested
nodes increases. In the worst cases, from 14 to 22 requested nodes, about 75% of the GHS samples
reach the optimum. Moreover, for small virtual networks of up to 7 nodes, GHS statistically
outperforms STA, with the best cases occurring with 5 and 6 nodes, where the difference
is about 6%. The performance of STA is high for few requested nodes, decreases in the middle of
the range of virtual network sizes, and reaches the optimum when 26 nodes are requested.
Figure 64 Percentage of optimal samples for GHS and STA in the old RNP topology
As the number of nodes in the virtual network tends to the total number of nodes (27 in the old
RNP topology), the performance of STA improves and outperforms GHS, since the problem
tends to become that of computing the minimum spanning tree of the physical network. On the other
hand, GHS is better for small networks because its placement strategy is designed to find the hubs in
the physical network, whereas the STA strategy is to find the common links in the shortest paths
between the requested nodes; if there are no common links in these shortest paths, STA cannot find
the hubs that minimize the cost.
Looking only at the samples that reached the optimum, one could conclude that the GHS
algorithm is not adequate for bigger virtual networks. However, Figure 65 shows the performance of
each algorithm considering the samples that reached a relative cost error of less than 5% in relation
to the optimum. In this case, the GHS performance is significantly improved for virtual networks
with 19 nodes or more. For example, in the scenario with 26 nodes – the worst scenario for GHS
considering only optimum samples – all the samples of the GHS algorithm obtained a relative error
of less than 5%. Furthermore, the STA algorithm also improves for virtual networks with 19 nodes
or more. This shows that, considering all virtual network sizes, even though the GHS algorithm
reaches optimality only in some cases, most samples stay within 5% of the optimum in the old RNP
topology; the worst case was 69.1% of the samples with 16 nodes. For STA, the worst case
is 65.6% with 14 virtual nodes.
Figure 65 Percentage of samples reaching relative error ≤ 5% in the old RNP topology
The results for the current RNP topology are presented in Figure 66 and Figure 67. In this
scenario, the behavior of the results is similar to that of the previous scenario. Again, GHS
outperforms STA for smaller virtual networks, from 3 to 11 nodes, and the inverse occurs for
virtual networks from 16 to 27 nodes. Thus, it must be observed that increasing the number of
physical links can improve the results of the GHS algorithm, since in the old RNP topology GHS
outperforms STA only up to 7 nodes. Moreover, the performance of GHS is substantially improved
when considering samples that reached within 5% of the optimum, as occurred in the previous scenario.
Figure 66 Percentage of optimal samples for GHS and STA in the current RNP topology
Figure 67 Percentage of samples reaching relative error ≤ 5% in the current RNP topology
6.4 Discussion
This chapter presented how Nubilum performs the allocation of resources in a D-Cloud. It was
shown how the comprehensive resource model employed by Nubilum leverages the use of previous
virtual network allocation algorithms present in the literature. In addition, this chapter presented
the evaluation of the specific algorithms for self-optimization of resources in a D-Cloud considering
several aspects such as load balancing, energy consumption, network restrictions, and server
restrictions.
Considering particularly the problems involving virtual networks – both the allocation and the
creation of virtual networks – it was shown that, due to the complexity introduced when coping
with several different requirements and objectives (from developers and the provider), the problems
are NP-hard. Thus, the proposed solutions for such problems are greedy algorithms which employ
some sort of heuristic.
In the problem where the virtual network is given, the proposed solution tries to minimize the
number of virtual links allocated on each physical link, considering restrictions associated with virtual
nodes and virtual links. Note that the virtual link restriction does not consider link capacity, since
unconstrained capacities seem better suited to a D-Cloud with a pay-as-you-go business
plan; thus, the problem differs from several previous ones in the field of virtual network allocation.
Another aspect that can be highlighted is that the proposed solution is not restricted to the allocation
of an incoming virtual network. It could be used for continuous maintenance of the application in
order to guarantee the developer's requirements. Particularly, the algorithm for allocation of virtual links
(Figure 54) could be called by the Application Monitor to reallocate a virtual link whose current
delay is greater than the maximum required. Also, the same algorithm could be called by the D-
Cloud Monitor to rearrange the overall virtual links on the D-Cloud in order to obtain a better load
balance in the physical network.
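To make the minimax idea behind this link (re)allocation step concrete, the following Python sketch, offered as an illustration under stated assumptions rather than a reproduction of the thesis algorithm, picks for one virtual link the physical path whose most loaded link is the least loaded. It assumes the physical topology is a networkx-style graph whose edges carry a "stress" attribute counting the virtual links already mapped onto them; the delay restriction discussed above is omitted for brevity.

import heapq

def minimax_path(g, src, dst, stress="stress"):
    """Return the path from src to dst whose maximum edge stress is minimal.

    g is expected to be a networkx.Graph (or any structure with neighbors()
    and dict-style edge attribute access).
    """
    best = {src: 0}                   # best bottleneck value found per node
    heap = [(0, src, [src])]          # (bottleneck so far, node, path)
    while heap:
        bottleneck, node, path = heapq.heappop(heap)
        if node == dst:
            return path, bottleneck
        for nbr in g.neighbors(node):
            cand = max(bottleneck, g[node][nbr].get(stress, 0))
            if cand < best.get(nbr, float("inf")):
                best[nbr] = cand
                heapq.heappush(heap, (cand, nbr, path + [nbr]))
    return None, float("inf")

def allocate_virtual_link(g, host_a, host_b):
    """Map one virtual link onto the minimax path and update link stress."""
    path, _ = minimax_path(g, host_a, host_b)
    if path is None:
        raise RuntimeError("no physical path between the chosen hosts")
    for u, v in zip(path, path[1:]):
        g[u][v]["stress"] = g[u][v].get("stress", 0) + 1
    return path

A monitoring component could invoke allocate_virtual_link in the same way to move an existing virtual link onto a less loaded path, which is the kind of continuous maintenance discussed above.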
The other problem studied was to build a virtual network for a given set of virtual nodes in order
to minimize energy consumption while balancing the load of physical links. This problem
assumes that the developer requested only virtual nodes, with no virtual links; thus it captures the
provider’s requirements on load balancing and energy consumption. For this
problem, two algorithms were proposed: an optimum algorithm and a heuristic-based one.
The heuristic approach was compared with a known approximation algorithm for the
Steiner tree problem, and the results showed that the proposed heuristic is better suited for small
virtual networks, whereas the traditional approximation algorithm performs better for larger
ones. One possible implementation in a production scenario could therefore switch between
the two algorithms according to the size of the virtual network, in order to obtain
the best solution in each case, as sketched below.
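As a hedged sketch of that production strategy, a thin dispatcher could choose the creation algorithm from the size of the request. The threshold below is an invented placeholder to be calibrated per topology (in the evaluated scenarios the crossover lies somewhere between 11 and 16 nodes), and ghs and sta stand for implementations of the two algorithms compared above.

# Illustrative only: pick the creation algorithm from the virtual network size.
CROSSOVER = 12   # placeholder threshold; must be calibrated for the topology

def create_virtual_network(virtual_nodes, physical_topology, ghs, sta):
    """Run the algorithm expected to perform best for this request size."""
    if len(virtual_nodes) < CROSSOVER:
        return ghs(virtual_nodes, physical_topology)   # heuristic wins on small requests
    return sta(virtual_nodes, physical_topology)       # approximation wins on large ones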
The optimum algorithm was used in the experiments as a baseline against which the other
algorithms are compared. Observe that the optimum algorithm is well suited to the
physical network used in the test scenario, which is close to a tree (27 nodes and 29 links). In this case
the optimum algorithm could even be used for virtual network creation, since the number of possible
combinations is low, which keeps the required computing time low.
7 Conclusion
“Non datur denarius incipientibus, sed finientibus, et corona non currentibus, sed pervenientibus.”
Saint Augustine
Small cooperative datacenters can be attractive since they offer a cheaper, lower-power
alternative that reduces the costs of centralized Clouds. These small datacenters can be built in
different geographical regions and connected to form a Distributed Cloud, which reduces costs by
provisioning storage, servers, and network resources close to end-users. Users in a D-Cloud
are free to choose where to allocate their resources in order to serve specific market niches,
to meet constraints on the jurisdiction of software and data, or to satisfy quality of service requirements of their clients.
One of the most important design aspects of D-Clouds is the availability of “infinite”
computing resources. But providing on-demand computing instances and network resources in a
distributed scenario is not a trivial task, and the resource management system must be carefully
designed in order to guarantee that both user and provider requirements are met satisfactorily. Such a
design covers several aspects, such as the optimization algorithms for resource management, the
development of suitable resource and offering models, and the right mechanisms for resource
discovery and monitoring.
This Thesis introduced Nubilum, a resource management system that offers a self-managed
solution for the challenges of allocation, discovery, control, and monitoring of resources in Distributed
Clouds. This system can be seen as an integrated solution meeting the following requirements:
• A suitable information model – called CloudML – for describing the range of D-Cloud
resources and applications’ computational, topological, and geographic restrictions;
• An integrated control plane for network and computational resources, which combines
HTTP-based interfaces and the Openflow protocol to enforce decisions in the D-Cloud;
• A set of algorithms for allocating resources to developers’ applications based on several
different requirements and objectives.
Next, the main contributions of Nubilum are described in Section 7.1, whereas the related
publications produced during the PhD are presented in Section 7.2. Finally, directions for further work are
listed in Section 7.3.
7.1 Contributions
Nubilum introduces a clear separation between the enforcement and the intelligence roles
played by the Manager and the Allocator, respectively. The Manager offers an abstraction of the
available resources on the D-Cloud to the Allocator, which, in turn, uses algorithms to allocate
resources for incoming developers’ requests.
Note that the system defines a resource model, but it intends to remain open to different
model-driven resource allocation strategies. The resource model is condensed in CloudML, a
description language that expresses resources, services, and requests in an integrated way. CloudML
presents interesting characteristics for both providers and developers, and it is used in Nubilum
to represent virtual and physical infrastructure, services, and requests.
In terms of the solutions employed for controlling the D-Cloud, Nubilum uses open
protocols for communication between its components and open strategies to define its
processes. The overall communication is made through HTTP messages (using REST) and
Openflow messages, exchanged between the components for the coordinated
execution of each process in the D-Cloud. One of these processes is the discovery process, which
combines Openflow registration messages, REST operations, and LLDP messages into an integrated
and versatile solution for discovering servers, switches, and the network topology.
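The sketch below conveys the general shape of that kind of LLDP-based link discovery as it is commonly built on top of OpenFlow controllers; it is not Nubilum’s code, and the controller object, its send_packet_out() method, and the probe format are hypothetical placeholders for whatever controller framework is used.

# Illustrative sketch only: generic LLDP-over-OpenFlow link discovery.

def build_lldp_probe(chassis_id, port_id):
    # Stand-in for a real LLDP frame: only the two TLVs that matter for discovery.
    return {"chassis_id": chassis_id, "port_id": port_id}

def parse_lldp_probe(frame):
    return frame["chassis_id"], frame["port_id"]

def probe_all_ports(controller, switches):
    """Inject one LLDP probe on every port of every known switch (datapath)."""
    for dpid, ports in switches.items():
        for port in ports:
            controller.send_packet_out(dpid, port, build_lldp_probe(dpid, port))

def on_lldp_packet_in(topology_links, dst_dpid, dst_port, frame):
    """A probe that reappears at (dst_dpid, dst_port) reveals one physical link."""
    src_dpid, src_port = parse_lldp_probe(frame)
    topology_links.add(((src_dpid, src_port), (dst_dpid, dst_port)))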
In terms of the allocation process, Nubilum performs the allocation of virtual machines and
virtual links onto the physical servers and links. The use of Libvirt for managing the hypervisors
allows Nubilum to work with very heterogeneous infrastructures, whereas the use of Openflow
allows the versatile reorganization of the network when necessary. IP/MAC management is
centralized in the Manager, but the actual assignment of those addresses to virtual machines is
made by each Worker through a built-in DHCP server.
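As an illustration of the libvirt side of this process, the snippet below shows how a worker-like component could define and boot a virtual machine through the libvirt Python bindings. The domain XML is a deliberately minimal generic example, the connection URI assumes a local QEMU/KVM hypervisor, and the MAC address stands in for one handed out by a central manager; none of this reproduces Nubilum’s actual Worker implementation.

import libvirt

DOMAIN_XML = """
<domain type='kvm'>
  <name>vn-node-01</name>
  <memory unit='MiB'>512</memory>
  <vcpu>1</vcpu>
  <os><type arch='x86_64'>hvm</type></os>
  <devices>
    <disk type='file' device='disk'>
      <source file='/var/lib/libvirt/images/vn-node-01.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <interface type='bridge'>
      <source bridge='br0'/>
      <mac address='52:54:00:aa:bb:01'/>  <!-- address assigned centrally, enforced locally -->
    </interface>
  </devices>
</domain>
"""

conn = libvirt.open("qemu:///system")   # the URI selects the hypervisor driver
dom = conn.defineXML(DOMAIN_XML)        # register the domain with the hypervisor
dom.create()                            # boot the virtual machine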
Nubilum’s overall control plane solution was evaluated through measurements on a
prototype implementation. From those measurements, models were derived to
estimate the load introduced by Nubilum’s control messages when performing their main
operations. Such models show that the control load grows linearly with the number of resources
in the D-Cloud.
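Purely as an illustration of how such a linear model can be applied, the sketch below fits a line to control-load measurements and extrapolates it to a larger D-Cloud. The sample figures are invented placeholders, not the measurements reported in this Thesis.

import numpy as np

# Hypothetical measurements: number of managed resources vs. control load (kbit/s).
resources = np.array([10, 20, 40, 80, 160])
load_kbps = np.array([12.0, 23.5, 47.1, 93.8, 188.0])

slope, intercept = np.polyfit(resources, load_kbps, deg=1)   # least-squares linear fit

def predicted_load(n):
    """Extrapolate the control load for a D-Cloud with n resources."""
    return slope * n + intercept

print(f"load(n) ~ {slope:.2f} * n + {intercept:.2f}")
print(f"expected control load for 1000 resources: {predicted_load(1000):.1f} kbit/s")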
Finally, some optimization algorithms were developed for two different problems. The first
problem involves the on-line allocation of virtual networks in a physical network with the objective
of balancing the load over the physical infrastructure. This problem also considers restrictions on
geo-location, memory, processing, and network delay. Due to the complexity of this problem, its
solution is a procedure developed in two stages: one for node allocation and another for links.
Compared with a previous approach designed for load balancing, our procedure proved to be
adequate in scenarios with larger virtual networks.
The second studied case addresses the problem of allocating a virtual tree for connecting
nodes when the request does not contain any links. Again, load balancing is the main objective, with
energy reduction as a secondary one. A two-step algorithm was designed for this problem,
with an outer loop responsible for load balancing and an inner loop for energy reduction. The inner
loop problem can be reduced to a minimum-length Steiner tree problem, and two algorithms were
proposed for this step: one uses a greedy strategy for hub placement and the other uses a
combinatorial approach to find the optimum. The heuristic-based algorithm was compared against a
Steiner tree approximation algorithm, with the optimum algorithm as the baseline. The results
showed that the hub placement heuristic is better suited for small virtual networks, while the Steiner
approximation achieves better results in scenarios with larger virtual networks.
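For illustration, one classic approximation of that inner-loop step, not necessarily the exact STA variant evaluated here, is the metric-closure/minimum-spanning-tree 2-approximation for Steiner trees. The sketch below obtains such a tree over a toy physical topology with networkx; the graph and terminal names are invented.

import networkx as nx
from networkx.algorithms.approximation import steiner_tree

# Toy physical topology, roughly tree-like as in the RNP scenarios.
g = nx.Graph()
g.add_weighted_edges_from([
    ("rec", "sal", 1), ("sal", "bsb", 1), ("bsb", "spo", 1),
    ("spo", "rio", 1), ("bsb", "for", 1), ("spo", "cur", 1),
    ("rec", "for", 3),
])

# Physical nodes hosting the requested virtual nodes, to be interconnected.
terminals = ["rec", "rio", "cur"]

# Metric-closure/MST 2-approximation of the minimum-length Steiner tree.
tree = steiner_tree(g, terminals, weight="weight")
print(sorted(tree.edges()))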
7.2 Publications
Some results presented in this Thesis were developed in cooperation with the Location and
Allocation Mechanisms for Distributed Cloud Servers project, a research project carried out by GPRT
(Grupo de Pesquisa em Redes e Telecomunicações) and funded by Ericsson Telecomunicações S.A., Brazil.
Parts of this Thesis were published in the form of scientific papers in conferences and
journals. Table VIII shows all the papers produced, including the papers already accepted and the
submitted papers that are under review.
The results and analysis developed in papers #1 and #2 are not part of this Thesis, but they
are part of a previous evaluation of several open-source systems for the management of Clouds.
These earlier works were an important milestone in the development of this Thesis, since they
provided sufficient knowledge about the technologies used to build current Cloud
Computing setups. Paper #1, in particular, contributed the concept of programmability applied
to Cloud Computing.
Papers #3 and #4 form the group of conceptual contributions of this Thesis, which
discuss the research challenges related to resource management on Clouds and D-Clouds,
respectively. The book chapter is partially reproduced in Chapter 2, whereas Chapter 3 is composed
of parts of paper #4.
Papers #5, #6, #7, and #8 compose the group of effective contributions of this Thesis.
Part of paper #5, introducing CloudML, appears in Section 5.1 of this Thesis. The short paper #6
corresponds to Chapter 4 and presents the main components and modules of Nubilum.
Table VIII Scientific papers produced
1. P. Endo, G. Gonçalves, J. Kelner, D. Sadok. A Survey on Open-source Cloud Computing Solutions. VIII Workshop in Clouds, Grids e Applications, Brazil, June 2010. (Conference full paper; published)
2. T. Cordeiro, D. Damálio, N. Pereira, P. Endo, A. Palhares, G. Gonçalves, D. Sadok, J. Kelner, B. Melander, V. Souza, J. Mångs. Open Source Cloud Computing Platforms. 9th International Conference on Grid and Cloud Computing, China, November 2010. (Conference full paper; published)
3. G. Gonçalves, P. Endo, T. Cordeiro, A. Palhares, D. Sadok, J. Kelner, B. Melander, J. Mångs. Resource Allocation in Clouds: Concepts, Tools and Research Challenges. 29º Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos, 2011. (Book chapter; published)
4. P. Endo, A. Palhares, N. Pereira, G. Gonçalves, D. Sadok, J. Kelner, B. Melander, J. Mångs. Resource Allocation for Distributed Cloud – Concepts and Research Challenges. IEEE Network Magazine, July/August 2011. (Journal full paper; published)
5. G. Gonçalves, P. Endo, M. Santos, D. Sadok, J. Kelner, B. Melander, J. Mångs. CloudML: An Integrated Language for Resource, Service and Request Description for D-Clouds. IEEE Conference on Cloud Computing Technology and Science (CloudCom), December 2011. (Conference full paper; published)
6. G. Gonçalves, M. Santos, G. Charamba, P. Endo, D. Sadok, J. Kelner, B. Melander, J. Mångs. D-CRAS: Distributed Cloud Resource Allocation System. IEEE/IFIP Network Operations and Management Symposium (NOMS), 2012. (Conference short paper; published)
7.3 Future Work
As can be noticed throughout this Thesis, the Cloud Computing paradigm will certainly be present
in our lives during the years to come, and D-Clouds should gradually gain their niche. But, before
new developments can be seen, further research on automating the formation of such distributed environments is
still needed. This includes strategies for advertising and scavenging resources, and for loading and
freeing them automatically. Similar challenges include the use of different protocols for controlling
dedicated network devices such as radio stations, modems, and other devices that do not support the
Openflow protocol.
Future work on this Thesis includes testing the currently implemented version of Nubilum in an
ISP-scale environment. Such a case study would employ all the components of the system and some
of the proposed algorithms to instantiate applications in a D-Cloud environment. The main idea of
this case study is to obtain performance data of the system in a real environment, allowing
the identification of bottlenecks and, eventually, engineering challenges that could not be perceived in the
laboratory tests.
Another future possibility would be adding support for elasticity to Nubilum. This involves
revisiting several parts of the system. The first is a change to
the CloudML information model, which should be modified to support elasticity rules that would be
used by Nubilum to determine when and how to scale applications up and down. These
rules differ from the common practice in current Cloud Computing setups, since
creating a new virtual node can require the creation of one or several links, which would be
indicated by the elasticity rules informed by the developer. Moreover, specific algorithms should be
developed to determine the suitable physical resources for the elastic growth of the virtual networks.
References
[1] ARMBRUST, M., FOX, A., GRIFFITH, R., JOSEPH, A.D., KATZ, R.H., KONWINSKI,
A., LEE, G., PATTERSON, D.A., RABKIN, A., STOICA, I., and ZAHARIA, M. Above
the Clouds: A Berkeley View of Cloud Computing, Tech. Rep. UCB/EECS-2009-28,
EECS Department, University of California, Berkeley, 2009.
[2] BARONCELLI, F., MARTINI, B., and CASTOLDI, P. Network virtualization for cloud
computing, Annals of Telecommunications, v. 65, pp. 713-721, Springer-Verlag, 2010.
[3] BELBEKKOUCHE, A., HASAN, M., and KARMOUCH, A. Resource Discovery and
Allocation in Network Virtualization, IEEE Communications Surveys & Tutorials, n. 99,
pp. 1-15, 2012.
[4] BELOGLAZOV, A., BUYYA, R., LEE, Y. C., and ZOMAYA, A. A Taxonomy and
Survey of Energy-Efficient Data Centers and Cloud Computing Systems, Advances in
Computers, v. 82, Elsevier, pp. 47-111, 2011.
[5] BOLTE, M., SIEVERS, M., BIRKENHEUER, G., NIEHÖRSTER, O., and
BRINKMANN, A. Non-intrusive Virtualization Management using libvirt. Conference
on Design, Automation and Test in Europe (DATE), Germany, March 2010.
[6] BORTHAKUR, D. The Hadoop Distributed File System: Architecture and Design.
Available at: http://hadoop.apache.org/core/docs/current/hdfs_design.pdf. Last access:
February, 2012.
[7] BUYYA, R., BELOGLAZOV, A., and ABAWAJY, J. Energy-Efficient Management of
Data Center Resources for Cloud Computing: A Vision, Architectural Elements, and
Open Challenges. International Conference on Parallel and Distributed Processing
Techniques and Applications, pp. 6-17, 2010.
[8] BUYYA, R., YEO, C.S., VENUGOPAL, S., BROBERG, J., and BRANDIC, I. Cloud
computing and emerging IT platforms: Vision, hype, and reality for delivering
computing as the 5th utility. Future Generation Computer Systems, Elsevier B. V, 2009
[9] CHAPMAN, C., EMMERICH, W., MARQUEZ, F. G., CLAYMAN, S., and GALIS, A.
Software architecture definition for on-demand cloud provisioning. Proceedings of the
19th ACM International Symposium on High Performance Distributed Computing, pp. 61-
72, 2010.
[10] CHOWDHURY, N. M. M. K., RAHMAN, M. R., and BOUTABA, R. Virtual Network
Embedding with Coordinated Node and Link Mapping, IEEE INFOCOM, 2009.
[11] CHOWDHURY, N.M. M. K. and BOUTABA, R. A survey of network virtualization.
Computer Networks, Vol. 54, issue 5, pp. 862-876, Elsevier, 2010.
[12] CHURCH, K., GREENBERG, A., and HAMILTON, J. On Delivering Embarrassingly
Distributed Cloud Services, Workshop on Hot Topics in Networks (HotNets), 2008.
[13] CROCKFORD, D. JSON: The fat-free alternative to XML. Presented at XML 2006,
Boston, USA, December 2006. Available at: http://www.json.org/fatfree.html. Last access:
February, 2012.
[14] CULLY, B., LEFEBVRE, G., MEYER, D., FEELEY, M., HUTCHINSON, N. and
WARFIELD, A. Remus: High availability via asynchronous virtual machine replication.
5th USENIX Symposium on Networked Systems Design and Implementation, April 2008.
[15] DEAN, J. and GHEMAWAT, S. Mapreduce: simplified data processing on large
clusters. Proceedings of the 6th conference on Symposium on Operating Systems Design
& Implementation, Berkeley, CA, USA, 2004.
[16] DONGXI, L. and JOHN, Z. Cloud#: A Specification Language for Modeling Cloud,
IEEE International Conference on Cloud Computing, pp. 533-540, 2011.
[17] DUDKOWSKI, D., TAUHID, B., NUNZI, G., and BRUNNER, M. A Prototype for In-
Network Management in NaaS-enabled Networks, 12th IFIP/IEEE International
Symposium on Integrated Network Management, pp. 81-88, 2011.
[18] ECLIPSE FOUNDATION. Web Tools Platform, 2011. Available at:
http://www.eclipse.org/webtools/. Last access: February, 2012.
[19] ENDO, P. T., GONÇALVES, G. E., KELNER, J., and SADOK, D. A Survey on Open-
source Cloud Computing Solutions. Workshop em Clouds, Grids e Aplicações, Simpósio
Brasileiro de Redes de Computadores e Sistemas Distribuídos, 2010.
[20] ENDO, P. T., PALHARES, A. V. A., PEREIRA, N. N., GONÇALVES, G. E., SADOK, D.,
KELNER, J., MELANDER, B., and MANGS, J. E. Resource allocation for distributed
cloud: concepts and research challenges, IEEE Network Magazine, vol. 25, pp. 42-46,
July 2011.
[21] GALAN, F., Sampaio, A., Rodero-Merino, L., Loy, I., Gil, V., and Vaquero, L. M. Service
specification in cloud environments based on extensions to open standards.
Proceedings of the Fourth International ICST Conference on Communication System
software and middleware, 2009.
[22] GEYSERS Project, Initial GEYSERS Architecture & Interfaces Specification,
Deliverable D2.1 of the Generalised Architecture for Dynamic Infrastructure Services
(GEYSERS) FP7 Project, January 2010.
[23] GHEMAWAT, S., GOBIOFF, H., and LEUNG, S. The Google file system. 19th
Symposium on Operating Systems Principles, pages 29–43, Lake George, New York,
2003.
[24] GONÇALVES, G., ENDO, P., CORDEIRO, T., PALHARES, A., SADOK, D., KELNER,
J., MELANDER, B., and MÅNGS, J. Resource Allocation in Clouds: Concepts, Tools
and Research Challenges. Simpósio Brasileiro de Redes de Computadores e Sistemas
Distribuídos, June 2011.
[25] GONÇALVES, G., ENDO, P., SANTOS, M., SADOK, D., KELNER, J., MELANDER, B.
MÅNGS, J. CloudML: An Integrated Language for Resource, Service and Request
Description for D-Clouds. IEEE Conference on Cloud Computing Technology and
Science (CloudCom), December 2011.
[26] GONÇALVES, G., ENDO, P., SANTOS, M., SADOK, D., KELNER, J., MELANDER,
B., MÅNGS, J. CloudML: An Integrated Language for Resource, Service and Request
Description for D-Clouds. IEEE Conference on Cloud Computing Technology and
Science (CloudCom), December 2011.
[27] GREENBERG, A., HAMILTON, J., MALTZ, D. A., and PATEL, P. The cost of a cloud:
research problems in data center networks. SIGCOMM Comput. Commun. Rev. 39, n.
1, pp. 68-73, 2008.
[28] GU, Y., and GROSSMAN, R. Sector and Sphere: The Design and Implementation of a
High Performance Data Cloud, Philosophical Transactions: Series A, Mathematical,
physical, and engineering sciences, v. 367, pp. 2429-2445, June 2009.
[29] GUDE, N., KOPONEN, T., PETTIT, J., PFAFF, B., CASADO, M., MCKEOWN, N., and
SHENKER. S. NOX: towards an operating system for networks. ACM SIGCOMM
Computer Communication Review, v. 38, no. 3, July 2008.
[30] HADAS, D., GUENENDER, S., and ROCHWERGER, B. Virtual Network Services for
Federated Cloud Computing, Technical report H-0269, IBM, 2009.
[31] HAIDER, A., POTTER, R., and NAKAO, A. Challenges in Resource Allocation in
Network Virtualization. 20th ITC Specialist Seminar, pp. 18-20, Hoi An, Vietnam, May
2009.
[32] HOUIDI, I., LOUATI, W., ZEGHLACHE, D., and BAUCKE, S. Virtual Resource
Description and Clustering for Virtual Network Discovery, Proceedings of IEEE ICC
Workshop on the Network of the Future, 2009.
[33] HUMBLE, J. JavaSysMon, version 0.3.0, January 2010. Available at:
https://github.com/jezhumble/javasysmon. Last access: February, 2012.
[34] JUNGNICKEL, D. Graphs, Networks and Algorithms. Algorithms and Computation in
Mathematics, v. 5, 3rd Edition, Springer-Verlag, 2007.
[35] KARLSSON, M., KARAMANOLIS, C., and MAHALINGAM, M. A Framework for
Evaluating Replica Placement Algorithms. Technical Report HPL-2002, HP
Laboratories, July 2002.
[36] KELLERER, H., PFERSCHY, U., and PISINGER, D. Knapsack Problems, 1st Edition,
Springer-Verlag, 2004.
[37] KHOSHAFIAN, S. Service Oriented Enterprises. Auerbach Publications, 2007.
[38] KOOMEY, J. Growth in Data center electricity use 2005 to 2010. Analytics Press,
Oakland, August 2011. Available at: http://www.analyticspress.com/datacenters.html. Last
access: February, 2012.
[39] KOSLOVSKI, G.P., PRIMET, P.V.B., and CHARAO, A.S. VXDL: Virtual resources and
interconnection networks description language, Networks for Grid Applications, pp.
138-154, Springer, 2009.
[40] LAGAR-CAVILLA, H. A., WHITNEY, J. A., SCANNELL, A. M., PATCHIN, P.,
RUMBLE, S. M., LARA, E., BRUDNO, M., and SATYANARAYANAN, M. SnowFlock:
rapid virtual machine cloning for cloud computing. Fourth ACM European Conference
on Computer Systems, 2009.
[41] LEISERSON, C. E., STEIN, C., RIVEST, R. L., and CORMEN, T. H. Algoritmos: Teoria
e Prática. Campus, ed. 1, 2002.
[42] LISCHKA, J. and KARL, H. A virtual network mapping algorithm based on subgraph
isomorphism detection, ACM SIGCOMM Workshop on Virtualized Infastructure
Systems and Architectures, 2009.
[43] MARSHALL, P., KEAHEY, K., and FREEMAN, T. Elastic Site: Using Clouds to
Elastically Extend Site Resources, 10th IEEE/ACM International Conference on Cluster,
Cloud and Grid Computing, pp. 43-52, Australia, June, 2010.
[44] MCKEOWN, N., ANDERSON, T., BALAKRISHNAN, H., PARULKAR, G.,
PETERSON, L., REXFORD, J., SHENKER, S., and TURNER, J. OpenFlow: enabling
innovation in campus networks, ACM SIGCOMM Computer Communication Review,
2008.
[45] MELL, P. and GRANCE, T. The NIST Definition of Cloud Computing. National
Institute of Standards and Technology, Information Technology Laboratory, 2009.
[46] METSCH, T., EDMONDS, A., and NYRÉN, R. Open Cloud Computing Interface –
Core, Open Grid Forum, OCCI-WG, Specification Document. Available at:
http://forge.gridforum.org/sf/go/doc16161?nav=1, 2010. Last access: February, 2012.
[47] MORÁN, D., VAQUERO, L. M., and GALÁN, F. Elastically Ruling the Cloud:
Specifying Application’s Behavior in Federated Clouds, IEEE International Conference
on Cloud Computing, pp. 89-96, 2011.
[48] MOSHARAF, N., CHOWDHURY, K., and BOUTABA, R. A survey of network
virtualization. Computer Networks, v. 54, i. 5, pp. 862–876, April 2010.
[49] MURPHY, M, and GOASGUEN, S. Virtual Organization Clusters: Self-provisioned
clouds on the grid. In Future Generation Computer Systems, 2010.
[50] NEVES, T. A., DRUMMOND, L. M. A., OCHI, L. S., ALBUQUERQUE, C., and
UCHOA, E. Solving Replica Placement and Request Distribution in Content
Distribution Networks, Electronic Notes in Discrete Mathematics, Volume 36, pp. 89-96,
ISSN 1571-0653, 2010.
[51] OPENCLOUD. The Open Cloud Manifesto - Dedicated to the belief that the cloud
should be open, 2009. Available at: http://www.opencloudmanifesto.org/. Last access:
February, 2012.
[52] OPENFLOW PROJECT. OpenFlow Switch Specification, Version 1.1.0 Implemented,
February 28, 2011.
[53] OPENSTACK – Developer Guide – API v1.1. September, 2011. Available at:
http://docs.openstack.org/api/openstack-compute/1.1/os-compute-devguide-1.1.pdf. Last
access: February, 2012.
[54] PADALA, P. Automated management of virtualized data centers. Ph.D. Thesis, Univ.
Michigan, USA, 2010.
[55] PENG, B., CUI, B. and LI, X. Implementation Issues of a Cloud Computing Platform.
IEEE Data Engineering Bulletin, volume 32, issue 1, 2009.
[56] PRESTI, F. L., PETRIOLI, C., and VICARI, C. Distributed dynamic replica placement
and request redirection in content delivery networks, MASCOTS, pp. 366–373, 2007.
[57] QIU, L., PADMANABHAN, V., and VOELKER, G. On the Placement of Web Server
Replicas. Proceedings of IEEE INFOCOM, pages 1587–1596, April 2001.
[58] RAZZAQ, A. and RATHORE, M. S. An approach towards resource efficient virtual
network embedding, IEEE Conference on High Performance Switching and Routing,
2010.
[59] ROCHWERGER, B., BREITGAND, D., EPSTEIN, A., HADAS, D., LOY, I., NAGIN, K.,
TORDSSON, J., RAGUSA, C., VILLARI, M., CLAYMAN, S., LEVY, E.,
MARASCHINI, A., MASSONET, P., MUÑOZ, H., and TOFETTI, G. Reservoir - When
One Cloud Is Not Enough. IEEE Computer, v. 44, i. 3, pp. 44-51, 2011.
[60] SAIL Project, Cloud Network Architecture Description, Deliverable D-D.1 of the
Scalable and Adaptable Internet Solutions (SAIL) FP7 Project, July 2011.
[61] SHETH, A and RANABAHU, A. Semantic Modeling for Cloud Computing, Part I.
IEEE Computer Society - Semantics & Services, 2010.
[62] VALANCIUS, V., LAOUTARIS, N., MASSOULIE, L., DIOT, C., and Rodriguez, P.
Greening the Internet with Nano Data Centers. Proceedings of the 5th international
conference on Emerging networking experiments and technologies, pp. 37-48. 2009.
[63] VAQUERO, L. M., MERINO, L. R., and BUYYA, R. Dynamically Scaling Applications
in the Cloud, ACM SIGCOMM Computer Communication Review, v. 41, n. 1, pp. 45- 52,
January 2011.
[64] VAQUERO, L., MERINO, L., CACERES, J., and LINDNER, M. A Break in the Clouds:
Towards a Cloud Definition, vol. 39, pp. 50–55, January 2008.
[65] VAZIRANI, V. V. Approximation Algorithms, 2nd Edition, Springer-Verlag, 2003.
[66] VERAS, M. Datacenter: Componente Central da Infraestrutura de TI. Brasport Livros
e Multimídia, Rio de Janeiro, 2009.
[67] VERDI, F. L., ROTHENBERG, C. E., PASQUINI, R., and MAGALHÃES, M. Novas
arquiteturas de data center para cloud computing. SBRC 2010 – Minicursos, 2010.
[68] VOUK, M.A. Cloud Computing – Issues, Research and Implementations. Journal of
Computing and Information Technology, pages 235–246. University of Zagreb, Croatia,
2008.
[69] WHITE, S.R., HANSON, J.E., WHALLEY, I., CHESS, D.M., KEPHART, J.O. An
architectural approach to autonomic computing, Proceedings of the International Conference
on Autonomic Computing, pp. 2-9, 17-18 May 2004.
[70] XHAFA, F. and ABRAHAM, A. Computational models and heuristics methods for
Grid scheduling problems. In Future Generation Computer Systems, 2010.
[71] YOUSEFF, L., BUTRICO, M., and SILVA, D. Toward a unified ontology of cloud
computing. Grid Computing Environments Workshop, 2008.
[72] ZHANG, Q., CHENG, L., and BOUTABA, R. Cloud computing: state-of-the-art and
research challenges. Journal of Internet Service Applications, Springer, pp. 7-18, 2010.
[73] ZHOU, D., ZHONG, L., WO, T., and KAN, J. CloudView: Describe and Maintain
Resource View in Cloud, IEEE Cloud Computing Technology and Science (CloudCom),
pp. 151-158, 2010.
[74] ZHU, Y. and AMMAR, M. Algorithms for assigning substrate network resources to
virtual network components, IEEE INFOCOM, pp. 1-12, 2006.

More Related Content

PDF
Adapative Provisioning of Stream Processing Systems in the Cloud
PDF
A request skew aware heterogeneous distributed
PDF
05958007cloud
PDF
Cache and consistency in nosql
PDF
Scalability and Resilience of Multi-Tenant Distributed Clouds in the Big Serv...
PDF
Scalability and Resilience of Multi-Tenant Distributed Clouds in the Big Serv...
PDF
Classical Distributed Algorithms with DDS
PDF
ACM HPDC 2010参加報告
Adapative Provisioning of Stream Processing Systems in the Cloud
A request skew aware heterogeneous distributed
05958007cloud
Cache and consistency in nosql
Scalability and Resilience of Multi-Tenant Distributed Clouds in the Big Serv...
Scalability and Resilience of Multi-Tenant Distributed Clouds in the Big Serv...
Classical Distributed Algorithms with DDS
ACM HPDC 2010参加報告

What's hot (18)

PDF
DWDM-RAM: a data intensive Grid service architecture enabled by dynamic optic...
PDF
Efficient Tree-based Aggregation and Processing Time for Wireless Sensor Netw...
PPT
High Performance Cyberinfrastructure Required for Data Intensive Scientific R...
PDF
Distributeddatabasesforchallengednet
PDF
Addressing the Challenges of Tactical Information Management in Net-Centric S...
PPT
Session 46 - Principles of workflow management and execution
PPTX
Module 01 - Understanding Big Data and Hadoop 1.x,2.x
PPTX
Real time data management on wsn
PDF
High Performance Distributed Computing with DDS and Scala
PPTX
Collaborative Service Models: Building Support for Digital Scholarship
PDF
Managing Big Data (Chapter 2, SC 11 Tutorial)
PDF
Software-Defined Inter-Cloud Composition of Big Services
PPT
Archiving and managing a million or more data files on BiG Grid
PDF
Componentizing Big Services in the Internet
PPTX
High Performance Cyberinfrastructure Enables Data-Driven Science in the Glob...
PDF
NoSql And The Semantic Web
PDF
ANALYSIS OF ATTACK TECHNIQUES ON CLOUD BASED DATA DEDUPLICATION TECHNIQUES
PDF
The Distributed Cloud
DWDM-RAM: a data intensive Grid service architecture enabled by dynamic optic...
Efficient Tree-based Aggregation and Processing Time for Wireless Sensor Netw...
High Performance Cyberinfrastructure Required for Data Intensive Scientific R...
Distributeddatabasesforchallengednet
Addressing the Challenges of Tactical Information Management in Net-Centric S...
Session 46 - Principles of workflow management and execution
Module 01 - Understanding Big Data and Hadoop 1.x,2.x
Real time data management on wsn
High Performance Distributed Computing with DDS and Scala
Collaborative Service Models: Building Support for Digital Scholarship
Managing Big Data (Chapter 2, SC 11 Tutorial)
Software-Defined Inter-Cloud Composition of Big Services
Archiving and managing a million or more data files on BiG Grid
Componentizing Big Services in the Internet
High Performance Cyberinfrastructure Enables Data-Driven Science in the Glob...
NoSql And The Semantic Web
ANALYSIS OF ATTACK TECHNIQUES ON CLOUD BASED DATA DEDUPLICATION TECHNIQUES
The Distributed Cloud
Ad

Viewers also liked (16)

PPTX
Proyecto Escuela Saludable2009
PDF
Jonathan Teague Gsa Info Pack
PPT
Balton From Poland
PDF
Lean Mean & Agile 2009
PPT
PPTX
MiM Madison SO Conference Presentation
PPT
MiM Madison 2010 rally
PPT
Fda phacilitate2010final
PPTX
Year 8 cad toy instructions
PPT
Seventh grade mi m presentation
PDF
FDA 1997 Points to Consider for Monoclonal Antibodies
PPTX
The Three Branches Of Government Power Point
PDF
Burst TCP: an approach for benefiting mice flows
PDF
2. guru bimbingan kaunseling
PPT
So You Say You Want a Revolution? Evolving Agile Authority
PPTX
stress management
Proyecto Escuela Saludable2009
Jonathan Teague Gsa Info Pack
Balton From Poland
Lean Mean & Agile 2009
MiM Madison SO Conference Presentation
MiM Madison 2010 rally
Fda phacilitate2010final
Year 8 cad toy instructions
Seventh grade mi m presentation
FDA 1997 Points to Consider for Monoclonal Antibodies
The Three Branches Of Government Power Point
Burst TCP: an approach for benefiting mice flows
2. guru bimbingan kaunseling
So You Say You Want a Revolution? Evolving Agile Authority
stress management
Ad

Similar to Nubilum: Resource Management System for Distributed Clouds (20)

PDF
Hybrid Based Resource Provisioning in Cloud
PDF
Cloud computing Review over various scheduling algorithms
DOCX
On the Optimal Allocation of VirtualResources in Cloud Compu.docx
PDF
Rapport eucalyptus cloud computing
PDF
Rapport eucalyptus cloud computing
PDF
International Journal of Engineering Research and Development
PDF
Trustworthy service oriented architecture and platform for cloud computing (2...
PDF
Cloud Computing: Theory and Practice 3rd Edition Dan C. Marinescu
PDF
WJCAT2-13707877
PPT
Presentation cloud computing
PDF
Cloud Computing: Theory and Practice 3rd Edition Dan C. Marinescu
PDF
T04503113118
PDF
PDF
Iaetsd effective fault toerant resource allocation with cost
PDF
Mikel berdufi university_of_camerino_thesis
PDF
Virtualization Technology using Virtual Machines for Cloud Computing
PDF
An Efficient Cloud Scheduling Algorithm for the Conservation of Energy throug...
PDF
Am36234239
PDF
Cloud Computing: Concepts, Technology, Security, and Architecture, Second Edi...
PDF
Cloud computing and CloudStack
Hybrid Based Resource Provisioning in Cloud
Cloud computing Review over various scheduling algorithms
On the Optimal Allocation of VirtualResources in Cloud Compu.docx
Rapport eucalyptus cloud computing
Rapport eucalyptus cloud computing
International Journal of Engineering Research and Development
Trustworthy service oriented architecture and platform for cloud computing (2...
Cloud Computing: Theory and Practice 3rd Edition Dan C. Marinescu
WJCAT2-13707877
Presentation cloud computing
Cloud Computing: Theory and Practice 3rd Edition Dan C. Marinescu
T04503113118
Iaetsd effective fault toerant resource allocation with cost
Mikel berdufi university_of_camerino_thesis
Virtualization Technology using Virtual Machines for Cloud Computing
An Efficient Cloud Scheduling Algorithm for the Conservation of Energy throug...
Am36234239
Cloud Computing: Concepts, Technology, Security, and Architecture, Second Edi...
Cloud computing and CloudStack

More from Glauco Gonçalves (20)

PPTX
História da Igreja - Cruzadas
PPTX
Nubilum: Sistema para gerência de recursos em Nuvens Distribuídas
PPTX
A Santa Inquisição
PPTX
História da Igreja - Fátima e o Século XX
PPTX
História da Igreja - O Século XIX e as Revoluções
PPTX
História da Igreja - Revolução Francesa
PPTX
História da Igreja - Embates islâmico-cristãos
PPTX
História da Igreja - Reforma e Contra-reforma
PPTX
História da Igreja - O Renascimento
PPTX
História da Igreja - Visão Geral da Modernidade
PPTX
Igreja na Idade Média
PPTX
História da Igreja - O Cisma do Ocidente
PPTX
História da Igreja - Os gloriosos séculos XII e XIII
PPTX
História da Igreja - O Cisma do Oriente
PPTX
História da Igreja - Cluny e a reforma da Igreja
PPTX
História da Igreja - Francos: de Clóvis a Carlos Magno
PPTX
História da Igreja - Visão geral da Idade Média
PPTX
O Primado de São Pedro
PPTX
História da Igreja - A queda do Império Romano
PPTX
História da Igreja - Concílios de Nicéia e Constantinopla
História da Igreja - Cruzadas
Nubilum: Sistema para gerência de recursos em Nuvens Distribuídas
A Santa Inquisição
História da Igreja - Fátima e o Século XX
História da Igreja - O Século XIX e as Revoluções
História da Igreja - Revolução Francesa
História da Igreja - Embates islâmico-cristãos
História da Igreja - Reforma e Contra-reforma
História da Igreja - O Renascimento
História da Igreja - Visão Geral da Modernidade
Igreja na Idade Média
História da Igreja - O Cisma do Ocidente
História da Igreja - Os gloriosos séculos XII e XIII
História da Igreja - O Cisma do Oriente
História da Igreja - Cluny e a reforma da Igreja
História da Igreja - Francos: de Clóvis a Carlos Magno
História da Igreja - Visão geral da Idade Média
O Primado de São Pedro
História da Igreja - A queda do Império Romano
História da Igreja - Concílios de Nicéia e Constantinopla

Recently uploaded (20)

PPTX
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
PDF
Review of recent advances in non-invasive hemoglobin estimation
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PPTX
MYSQL Presentation for SQL database connectivity
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
Unlocking AI with Model Context Protocol (MCP)
PPTX
sap open course for s4hana steps from ECC to s4
DOCX
The AUB Centre for AI in Media Proposal.docx
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
MIND Revenue Release Quarter 2 2025 Press Release
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Dropbox Q2 2025 Financial Results & Investor Presentation
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
Review of recent advances in non-invasive hemoglobin estimation
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
MYSQL Presentation for SQL database connectivity
Advanced methodologies resolving dimensionality complications for autism neur...
Building Integrated photovoltaic BIPV_UPV.pdf
Understanding_Digital_Forensics_Presentation.pptx
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Unlocking AI with Model Context Protocol (MCP)
sap open course for s4hana steps from ECC to s4
The AUB Centre for AI in Media Proposal.docx
The Rise and Fall of 3GPP – Time for a Sabbatical?
Per capita expenditure prediction using model stacking based on satellite ima...

Nubilum: Resource Management System for Distributed Clouds

  • 1. Pós-Graduação em Ciência da Computação “Nubilum: Resource Management System for Distributed Clouds” Por Glauco Estácio Gonçalves Tese de Doutorado Universidade Federal de Pernambuco posgraduacao@cin.ufpe.br www.cin.ufpe.br/~posgraduacao RECIFE, 03/2012
  • 2. UNIVERSIDADE FEDERAL DE PERNAMBUCO CENTRO DE INFORMÁTICA PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO GLAUCO ESTÁCIO GONÇALVES “Nubilum: Resource Management System for Distributed Clouds" ESTE TRABALHO FOI APRESENTADO À PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO DO CENTRO DE INFORMÁTICA DA UNIVERSIDADE FEDERAL DE PERNAMBUCO COMO REQUISITO PARCIAL PARA OBTENÇÃO DO GRAU DE DOUTOR EM CIÊNCIA DA COMPUTAÇÃO. ORIENTADORA: Dra. JUDITH KELNER CO-ORIENTADOR: Dr. DJAMEL SADOK RECIFE, MARÇO/2012
  • 4. Tese de Doutorado apresentada por Glauco Estácio Gonçalves à Pós- Graduação em Ciência da Computação do Centro de Informática da Universidade Federal de Pernambuco, sob o título “Nubilum: Resource Management System for Distributed Clouds” orientada pela Profa. Judith Kelner e aprovada pela Banca Examinadora formada pelos professores: ___________________________________________________________ Prof. Paulo Romero Martins Maciel Centro de Informática / UFPE __________________________________________________________ Prof. Stênio Flávio de Lacerda Fernandes Centro de Informática / UFPE ____________________________________________________________ Prof. Kelvin Lopes Dias Centro de Informática / UFPE _________________________________________________________ Prof. José Neuman de Souza Departamento de Computação / UFC ___________________________________________________________ Profa. Rossana Maria de Castro Andrade Departamento de Computação / UFC Visto e permitida a impressão. Recife, 12 de março de 2012. ___________________________________________________ Prof. Nelson Souto Rosa Coordenador da Pós-Graduação em Ciência da Computação do Centro de Informática da Universidade Federal de Pernambuco.
  • 5. To my family Danielle, João Lucas, and Catarina.
  • 6. iv Acknowledgments I would like to express my gratitude to God, cause of all the things and also my existence; and to the Blessed Virgin Mary to whom I appealed many times in prayer, being attended always. I would like to thank my advisor Dr. Judith Kelner and my co-advisor Dr. Djamel Sadok, whose expertise and patience added considerably to my doctoral experience. Thanks for the trust in my capacity to conduct my doctorate at GPRT (Networks and Telecommunications Research Group). I am indebted to all the people from GPRT for their invaluable help for this work. A very special thanks goes out to Patrícia, Marcelo, and André Vítor, which have given valuable comments over the course of my PhD. I must also acknowledge my committee members, Dr. Jose Neuman, Dr. Otto Duarte, Dr. Rossana Andrade, Dr. Stênio Fernandes, Dr. Kelvin Lopes, and Dr. Paulo Maciel for reviewing my proposal and dissertation, offering helpful comments to improve my work. I would like to thank my wife Danielle for her prayer, patience, and love which gave me the necessary strength to finish this work. A special thanks to my children, João Lucas and Catarina. They are gifts of God that make life delightful. Finally, I would like to thank my parents, João and Fátima, and my sisters, Cynara and Karine, for their love. Their blessings have always been with me as I followed in my doctoral research.
  • 7. v Abstract The current infrastructure of Cloud Computing providers is composed of networking and computational resources that are located in large datacenters supporting as many as hundreds of thousands of diverse IT equipment. In such scenario, there are several management challenges related to the energy, failure and operational management and temperature control. Moreover, the geographical distance between resources and final users is a source of delay when accessing the services. An alternative to such challenges is the creation of Distributed Clouds (D-Clouds) with geographically distributed resources along to a network infrastructure with broad coverage. Providing resources in such a distributed scenario is not a trivial task, since, beyond the processing and storage resources, network resources must be taken in consideration offering users a connectivity service for data transportation (also called Network as a Service – NaaS). Thereby, the allocation of resources must consider the virtualization of servers and the network devices. Furthermore, the resource management must consider all steps from the initial discovery of the adequate resource for attending developers’ demand to its final delivery to the users. Considering those challenges in resource management in D-Clouds, this Thesis proposes then Nubilum, a system for resource management on D-Clouds considering geo- locality of resources and NaaS aspects. Through its processes and algorithms, Nubilum offers solutions for discovery, monitoring, control, and allocation of resources in D-Clouds in order to ensure the adequate functioning of the D-Cloud while meeting developers’ requirements. Nubilum and its underlying technologies and building blocks are described and their allocation algorithms are also evaluated to verify their efficacy and efficiency. Keywords: cloud computing, resource management mechamisms, network virtualization.
  • 8. vi Resumo Atualmente, a infraestrutura dos provedores de computação em Nuvem é composta por recursos de rede e de computação, que são armazenados em datacenters de centenas de milhares de equipamentos. Neste cenário, encontram-se diversos desafios quanto à gerência de energia e controle de temperatura, além de, devido à distância geográfica entre os recursos e os usuários, ser fonte de atraso no acesso aos serviços. Uma alternativa a tais desafios é o uso de Nuvens Distribuídas (Distributed Clouds – D-Clouds) com recursos distribuídos geograficamente ao longo de uma infraestrutura de rede com cobertura abrangente. Prover recursos em tal cenário distribuído não é uma tarefa trivial, pois, além de recursos computacionais e de armazenamento, deve-se considerar recursos de rede os quais são oferecidos aos usuários da nuvem como um serviço de conectividade para transporte de dados (também chamado Network as a Service – NaaS). Desse modo, o processo de alocação deve considerar a virtualização de ambos, servidores e elementos de rede. Além disso, a gerência de recursos deve considerar desde a descoberta dos recursos adequados para atender as demandas dos usuários até a manutenção da qualidade de serviço na sua entrega final. Considerando estes desafios em gerência de recursos em D-Clouds, este trabalho propõe Nubilum: um sistema para gerência de recursos em D-Cloud que considera aspectos de geo-localidade e NaaS. Por meio de seus processos e algoritmos, Nubilum oferece soluções para descoberta, monitoramento, controle e alocação de recursos em D-Clouds de forma a garantir o bom funcionamento da D-Cloud, além de atender os requisitos dos desenvolvedores. As diversas partes e tecnologias de Nubilum são descritos em detalhes e suas funções delineadas. Ao final, os algoritmos de alocação do sistema são também avaliadas de modo a verificar sua eficácia e eficiência. Palavras-chave: computação em nuvem, mecanismos de alocação de recursos, virtualização de redes.
  • 9. vii Contents Abstract v Resumo vi Abbreviations and Acronyms xii 1 Introduction 1 1.1 Motivation............................................................................................................................................. 2 1.2 Objectives ............................................................................................................................................. 4 1.3 Organization of the Thesis................................................................................................................. 4 2 Cloud Computing 6 2.1 What is Cloud Computing?................................................................................................................ 6 2.2 Agents involved in Cloud Computing.............................................................................................. 7 2.3 Classification of Cloud Providers...................................................................................................... 8 2.3.1 Classification according to the intended audience..................................................................................8 2.3.2 Classification according to the service type.............................................................................................8 2.3.3 Classification according to programmability.........................................................................................10 2.4 Mediation System............................................................................................................................... 11 2.5 Groundwork Technologies.............................................................................................................. 12 2.5.1 Service-Oriented Computing...................................................................................................................12 2.5.2 Server Virtualization..................................................................................................................................12 2.5.3 MapReduce Framework............................................................................................................................13 2.5.4 Datacenters.................................................................................................................................................14 3 Distributed Cloud Computing 15 3.1 Definitions.......................................................................................................................................... 15 3.2 Research Challenges inherent to Resource Management ............................................................ 
18 3.2.1 Resource Modeling....................................................................................................................................18 3.2.2 Resource Offering and Treatment..........................................................................................................20 3.2.3 Resource Discovery and Monitoring......................................................................................................22 3.2.4 Resource Selection and Optimization....................................................................................................23 3.2.5 Summary......................................................................................................................................................27 4 The Nubilum System 28 4.1 Design Rationale................................................................................................................................ 28 4.1.1 Programmability.........................................................................................................................................28 4.1.2 Self-optimization........................................................................................................................................29 4.1.3 Existing standards adoption.....................................................................................................................29 4.2 Nubilum’s conceptual view.............................................................................................................. 29 4.2.1 Decision plane............................................................................................................................................30 4.2.2 Management plane.....................................................................................................................................31 4.2.3 Infrastructure plane...................................................................................................................................32 4.3 Nubilum’s functional components.................................................................................................. 32 4.3.1 Allocator......................................................................................................................................................33 4.3.2 Manager.......................................................................................................................................................34
  • 10. viii 4.3.3 Worker.........................................................................................................................................................35 4.3.4 Network Devices.......................................................................................................................................36 4.3.5 Storage System ...........................................................................................................................................37 4.4 Processes............................................................................................................................................. 37 4.4.1 Initialization processes..............................................................................................................................37 4.4.2 Discovery and monitoring processes......................................................................................................38 4.4.3 Resource allocation processes..................................................................................................................39 4.5 Related projects.................................................................................................................................. 40 5 Control Plane 43 5.1 The Cloud Modeling Language ....................................................................................................... 43 5.1.1 CloudML Schemas.....................................................................................................................................45 5.1.2 A CloudML usage example......................................................................................................................52 5.1.3 Comparison and discussion .....................................................................................................................56 5.2 Communication interfaces and protocols...................................................................................... 57 5.2.1 REST Interfaces.........................................................................................................................................57 5.2.2 Network Virtualization with Openflow.................................................................................................63 5.3 Control Plane Evaluation ................................................................................................................. 65 6 Resource Allocation Strategies 68 6.1 Manager Positioning Problem ......................................................................................................... 68 6.2 Virtual Network Allocation.............................................................................................................. 
70 6.2.1 Problem definition and modeling ...........................................................................................................72 6.2.2 Allocating virtual nodes............................................................................................................................74 6.2.3 Allocating virtual links...............................................................................................................................75 6.2.4 Evaluation...................................................................................................................................................76 6.3 Virtual Network Creation................................................................................................................. 81 6.3.1 Minimum length Steiner tree algorithms ...............................................................................................82 6.3.2 Evaluation...................................................................................................................................................86 6.4 Discussion........................................................................................................................................... 89 7 Conclusion 91 7.1 Contributions ..................................................................................................................................... 92 7.2 Publications ........................................................................................................................................ 93 7.3 Future Work ....................................................................................................................................... 94 References 96
List of Figures

Figure 1 Agents in a typical Cloud Computing scenario (from [24])
Figure 2 Classification of Cloud types (from [71])
Figure 3 Components of an Archetypal Cloud Mediation System (adapted from [24])
Figure 4 Comparison between (a) a current Cloud and (b) a D-Cloud
Figure 5 ISP-based D-Cloud example
Figure 6 Nubilum's planes and modules
Figure 7 Functional components of Nubilum
Figure 8 Schematic diagram of Allocator's modules and relationships with other components
Figure 9 Schematic diagram of Manager's modules and relationships with other components
Figure 10 Schematic diagram of Worker modules and relationships with the server system
Figure 11 Link discovery process using LLDP and OpenFlow
Figure 12 Sequence diagram of the Resource Request process for a developer
Figure 13 Integration of different descriptions using CloudML
Figure 14 Basic status type used in the composition of other types
Figure 15 Type for reporting status of the virtual nodes
Figure 16 XML Schema used to report the status of the physical node
Figure 17 Type for reporting complete description of the physical nodes
Figure 18 Type for reporting the specific parameters of any node
Figure 19 Type for reporting information about the physical interface
Figure 20 Type for reporting information about a virtual machine
Figure 21 Type for reporting information about the whole infrastructure
Figure 22 Type for reporting information about the physical infrastructure
Figure 23 Type for reporting information about a physical link
Figure 24 Type for reporting information about the virtual infrastructure
Figure 25 Type describing the service offered by the provider
Figure 26 Type describing the requirements that can be requested by a developer
Figure 27 Example of a typical Service description XML
Figure 28 Example of a Request XML
Figure 29 Physical infrastructure description
Figure 30 Virtual infrastructure description
Figure 31 Communication protocols employed in Nubilum
Figure 32 REST operation for the retrieval of service information
Figure 33 REST operation for updating information of a service
Figure 34 REST operation for requesting resources for a new application
Figure 35 REST operation for changing resources of a previous request
Figure 36 REST operation for releasing resources of an application
Figure 37 REST operation for registering a new Worker
Figure 38 REST operation to unregister a Worker
Figure 39 REST operation for updating information of a Worker
Figure 40 REST operation for retrieving a description of the D-Cloud infrastructure
Figure 41 REST operation for updating the description of a D-Cloud infrastructure
Figure 42 REST operation for the creation of a virtual node
Figure 43 REST operation for updating a virtual node
Figure 44 REST operation for removal of a virtual node
Figure 45 REST operation for requesting the discovered physical topology
Figure 46 REST operation for the creation of a virtual link
Figure 47 REST operation for updating a virtual link
Figure 48 REST operation for removal of a virtual link
Figure 49 Example of a typical rule for ARP forwarding
Figure 50 Example of the typical rules created for virtual links: (a) direct, (b) reverse
Figure 51 Example of a D-Cloud with ten workers and one Manager
Figure 52 Algorithm for allocation of virtual nodes
Figure 53 Example illustrating the minimax path
Figure 54 Algorithm for allocation of virtual links
Figure 55 The (a) old and (b) current network topologies of RNP used in simulations
Figure 56 Results for the maximum node stress in the (a) old and (b) current RNP topology
Figure 57 Results for the maximum link stress in the (a) old and (b) current RNP topology
Figure 58 Results for the mean link stress in the (a) old and (b) current RNP topology
Figure 59 Mean path length in the (a) old and (b) current RNP topology
Figure 60 Example of creating a virtual network: (a) before the creation; (b) after the creation
Figure 61 Search procedure used by the GHS algorithm
Figure 62 Placement procedure used by the GHS algorithm
Figure 63 Example of the placement procedure: (a) before and (b) after placement
Figure 64 Percentage of optimal samples for GHS and STA in the old RNP topology
Figure 65 Percentage of samples reaching relative error ≤ 5% in the old RNP topology
Figure 66 Percentage of optimal samples for GHS and STA in the current RNP topology
Figure 67 Percentage of samples reaching relative error ≤ 5% in the current RNP topology
List of Tables

Table I Summary of the main aspects discussed
Table II MIME types used in the overall communications
Table III Models for the length of messages exchanged in the system (in bytes)
Table IV Characteristics present in Nubilum's resource model
Table V Reduced set of characteristics considered by the proposed allocation algorithms
Table VI Factors and levels used in the MPA's evaluation
Table VII Factors and levels used in the GHS's evaluation
Table VIII Scientific papers produced
Abbreviations and Acronyms

CDN Content Delivery Network
CloudML Cloud Modeling Language
D-Cloud Distributed Cloud
DHCP Dynamic Host Configuration Protocol
GHS Greedy Hub Selection
HTTP Hypertext Transfer Protocol
IaaS Infrastructure as a Service
ISP Internet Service Provider
LLDP Link Layer Discovery Protocol
MPA Minimax Path Algorithm
MPP Manager Positioning Problem
NaaS Network as a Service
NV Network Virtualization
OA Optimal Algorithm
OCCI Open Cloud Computing Interface
PoP Point of Presence
REST Representational State Transfer
RP Replica Placement
RPA Replica Placement Algorithm
STA Steiner Tree Approximation
VM Virtual Machine
VN Virtual Network
XML Extensible Markup Language
ZAA Zhu and Ammar Algorithm
1 Introduction

"A linea incipere."
Erasmus

Nowadays, it is common to access content across the Internet with little reference to the underlying datacenter hosting infrastructure maintained by content providers. The technology used to provide this level of locality transparency also offers a new model for the provisioning of computing services, known as Cloud Computing. This model is attractive because it allows resources to be provisioned according to users' requirements, leading to overall cost reduction. Cloud users can rent resources as they become necessary, in a much more scalable and elastic way. Moreover, such users can transfer operational risks to cloud providers. From the viewpoint of those providers, the model offers a way to better utilize their own infrastructure. Armbrust et al. [1] point out that this model benefits from a form of statistical multiplexing, since it allocates resources to several users concurrently on a demand basis. This statistical multiplexing of datacenters builds on several decades of research in areas such as distributed computing, Grid computing, web technologies, service computing, and virtualization.

Current Cloud Computing providers mainly use large, consolidated datacenters in order to offer their services. However, the ever-increasing need for over-provisioning to meet peak demands and to provide redundancy against failures, together with expensive cooling needs, are important factors driving up the energy costs of centralized datacenters [62]. In current datacenters, the cooling technologies used for heat dissipation account for as much as 50% of the total power consumption [38]. In addition to these aspects, it must be observed that the network between users and the Cloud is often an unreliable best-effort IP service, which can harm delay-constrained services and interactive applications.

To deal with these problems, there have been indications that small cooperative datacenters can be more attractive, since they offer a cheaper and lower-power alternative that reduces the infrastructure costs of centralized Clouds [12]. These small datacenters can be built in different geographical regions and connected by dedicated or public (provided by Internet Service Providers) networks, configuring a new type of Cloud, referred to as a Distributed Cloud.
Such Distributed Clouds [20], or just D-Clouds, can exploit the possibility of creating (virtual) links and the potential of sharing resources across geographic boundaries to provide latency-based allocation of resources and to fully utilize this emerging distributed computing power. D-Clouds can reduce communication costs by simply provisioning storage, servers, and network resources close to end-users. D-Clouds can be considered an additional step in the ongoing deployment of Cloud Computing: one that supports different requirements and leverages new opportunities for service providers. Users in a Distributed Cloud will be free to choose where to allocate their resources in order to serve specific market niches, constraints on the jurisdiction of software and data, or quality of service aspects of their clients.

1.1 Motivation

Similarly to Cloud Computing, one of the most important design aspects of D-Clouds is the availability of "infinite" computing resources which may be used on demand. Cloud users see this "infinite" resource pool because the Cloud offers continuous monitoring and management of its resources and allocates resources in an elastic way. Nevertheless, providing on-demand computing instances and network resources in a distributed scenario is not a trivial task. Dynamic allocation of resources and their possible reallocation are essential characteristics for accommodating unpredictable demands and, ultimately, contributing to the return on investment.

In the context of Clouds, the essential feature of any resource management system is to guarantee that both user and provider requirements are met satisfactorily. Particularly in D-Clouds, users may have network requirements, such as bandwidth and delay constraints, in addition to the common computational requirements, such as CPU, memory, and storage. Furthermore, other user requirements are relevant, including node locality, topology of nodes, jurisdiction, and application interaction.

The development of solutions to cope with resource management problems remains a very important topic in the field of Cloud Computing. With regard to this technology, there are solutions focused on grid computing ([49], [70]) and on datacenters in current Cloud Computing scenarios ([4]). However, such strategies do not fit D-Clouds well, as they are heavily based on assumptions that do not hold in Distributed Cloud scenarios. For example, such solutions are designed for over-provisioned networks and commonly do not take into consideration the cost of communication between resources, which is an important aspect of D-Clouds that must be cautiously monitored and/or reserved in order to meet users' requirements.
The design of a resource management system involves challenges other than the specific design of optimization algorithms for resource management. Since D-Clouds are composed of computational and network devices with different architectures, software, and hardware capabilities, the first challenge is the development of a suitable resource model covering all this heterogeneity [20]. The next challenge is to describe how resources are offered, which is important since the requirements supported by the D-Cloud provider are defined in this step.

The other challenges are related to the overall operation of the resource management system. When requests arrive, the system should be aware of the current status of resources, in order to determine whether there are sufficient available resources in the D-Cloud to satisfy the present request. Accordingly, the right mechanisms for resource discovery and monitoring should also be designed, allowing the system to be aware of the updated status of all its resources. Then, based on the current status and the requirements of the request, the system may select and allocate resources to serve the present request.

Please note that the solution to those challenges involves the fine-grained coordination of several distributed components and the orchestrated execution of the several subsystems composing the resource management system. At first glance, these subsystems can be organized into three parts: one responsible for the direct negotiation of requirements with users; another responsible for deciding what resources to allocate to given applications; and one last part responsible for the effective enforcement of these decisions on the resources. Designing such a system is a very interesting and challenging task, and it raises the following research questions that will be investigated in this thesis:

1. How do Cloud users describe their requirements? In order to enable automatic negotiation between users and the D-Cloud, the Cloud must recognize a language or formalism for requirements description. Thus, the investigation of this topic must determine the proper characteristics of such a language. In addition, it must survey existing approaches to this topic in related computing areas.

2. How can the resources available in the Cloud be represented? Correlated to the first question, the resource management system must also maintain an information model to represent all the resources in the Cloud, including their relationships (topology) and their current status.

3. How are users' applications mapped onto Cloud resources? This question concerns the core aspect of resource allocation, i.e., the algorithms, heuristics, and strategies that are used to decide the set of resources meeting the applications' requirements and optimizing a utility function.
4. How can the decisions made be enforced? The effective enforcement of the decisions involves the extension of communication protocols, or the development of new ones, in order to set up the state of the overall resources in the D-Cloud.

1.2 Objectives

The main objective of this Thesis is to propose an integrated solution to problems related to the management of resources in D-Clouds. This solution is presented as Nubilum, a resource management system that offers a self-managed approach to the challenges of discovery, control, monitoring, and allocation of resources in D-Clouds. Nubilum provides fine-grained orchestration of its components in order to allocate applications on a D-Cloud. The specific goals of this Thesis are strictly related to the research questions presented in Section 1.1; they are:

• Elaborate an information model to describe D-Cloud resources and application requirements, such as computational restrictions, topology, geographic location, and other correlated aspects that can be employed to request resources directly from the D-Cloud;
• Explore and extend communication protocols for the provisioning and allocation of computational and communication resources;
• Develop algorithms, heuristics, and strategies to find suitable D-Cloud resources based on several different application requirements;
• Integrate the information model, the algorithms, and the communication protocols into a single solution.

1.3 Organization of the Thesis

This Thesis identifies the challenges involved in resource management in Distributed Cloud Computing and presents solutions for some of these challenges. The remainder of this document is organized as follows.

The general concepts that form the basis for all the other chapters are introduced in the second chapter. Its main objective is to discuss Cloud Computing, exploring its definition and classifying the main approaches in this area.

The Distributed Cloud Computing concept and several important aspects of resource management in those scenarios are introduced in the third chapter. Moreover, this chapter makes a comparative analysis of related research areas and problems.
The fourth chapter introduces the first contribution of this Thesis: the Nubilum resource management system, which aggregates the several solutions proposed in this Thesis. Moreover, the chapter highlights the rationale behind Nubilum as well as its main modules and components.

The fifth chapter examines and evaluates the control plane of Nubilum. It describes the proposed Cloud Modeling Language and details the communication interfaces and protocols used for communication between Nubilum components.

The sixth chapter gives an overview of the resource allocation problems in Distributed Clouds and makes a thorough examination of the specific problems related to Nubilum. Some particular problems are analyzed, and a set of algorithms is presented and evaluated.

The seventh chapter of this Thesis reviews the obtained evaluation results, summarizes the contributions, and sets the path for future work and open issues on D-Clouds.
2 Cloud Computing

"Definitio est declaratio essentiae rei."
Legal Proverb

In this chapter, the main concepts of Cloud Computing are presented. It begins with a discussion of the definition of Cloud Computing (Section 2.1) and the main agents involved in Cloud Computing (Section 2.2). Next, classifications of Cloud initiatives are offered in Section 2.3. An exemplary and simple architecture of a Cloud Mediation System is presented in Section 2.4, followed by a presentation in Section 2.5 of the main technologies acting behind the scenes of Cloud Computing initiatives.

2.1 What is Cloud Computing?

A definition of Cloud Computing is given by the National Institute of Standards and Technology (NIST) of the United States: "Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction" [45].

The definition says that on-demand dynamic reconfiguration (elasticity) is a key characteristic. Additionally, the definition highlights another Cloud Computing characteristic: it assumes that minimal management effort is required to reconfigure resources. In other words, the Cloud must offer self-service solutions that serve requests on demand, excluding from the scope of Cloud Computing those initiatives that operate through the rental of computing resources on a weekly or monthly basis. Hence, it restricts Cloud Computing to systems that provide automatic mechanisms for resource rental in real time with minimal human intervention.

The NIST definition gives a satisfactory concept of Cloud Computing as a computing model, but it does not cover the main object of Cloud Computing: the Cloud. Thus, in this Thesis, Cloud Computing is defined as the computing model that operates based on Clouds. In turn, the Cloud is defined as a conceptual layer that operates above an infrastructure to provide elastic services in a timely manner.
This definition encompasses three main characteristics of Clouds. Firstly, it notes that a Cloud is primarily a concept, i.e., a Cloud is an abstraction over an infrastructure. Thus, it is independent of the employed technologies, and therefore one can accept different setups, like Amazon EC2 or Google App Engine, to be named Clouds. Moreover, the infrastructure is defined in a broad sense, since it can be composed of software, physical devices, and/or other Clouds. Secondly, all Clouds have the same purpose: to provide services. This means that a Cloud hides the complexity of the underlying infrastructure while exploiting the potential of overlying services and acting as a middleware. In addition, providing a service implicitly involves the use of some type of agreement that should be guaranteed by the Cloud. Such agreements can vary from pre-defined contracts to malleable agreements defining functional and non-functional requirements. Note that these services are qualified as elastic ones, which has the same meaning as the dynamic reconfiguration that appeared in the NIST definition. Last but not least, the Cloud must provide services as quickly as possible, such that the infrastructure resources are allocated and reallocated to meet users' needs.

2.2 Agents involved in Cloud Computing

Despite previous approaches ([64], [8], [72], and [68]), this Thesis focuses only on three distinct agents in Cloud Computing, as shown in Figure 1: clients, developers, and the provider. The first notable point is that the provider deals with two types of users, called developers and clients. Thus, clients are the customers of a service produced by a developer. Clients use services from developers, but such use generates demand for the provider that actually hosts the service, and therefore the client can also be considered a user of the Cloud. It is important to highlight that in some scenarios (like scientific computing or batch processing) a developer may behave as a client to the Cloud because it is the end-user of the applications. The text will use "users" when referring to both classes without distinction.

Figure 1 Agents in a typical Cloud Computing scenario (from [24])

Developers can be service providers, independent programmers, scientific institutions, and so on, i.e., all who build applications into the Cloud. They create and run their applications while
leaving decisions related to maintenance and management of the infrastructure to the provider. Please note that, a priori, developers do not need to know about the technologies that make up the Cloud infrastructure, nor about the specific location of each item in the infrastructure. Lastly, the term application is used to mean all types of services that can be developed on the Cloud. In addition, it is important to note that the type of applications supported by a Cloud depends exclusively on the goals of the Cloud as determined by the provider. Such a wide range of possible targets generates many different types of Cloud Providers, which are discussed in the next section.

2.3 Classification of Cloud Providers

Currently, there are several operational initiatives of Cloud Computing; however, despite all being called Clouds, they provide different types of services. For that reason, the academic community ([64], [8], [45], [72], and [71]) has classified these solutions in order to understand their relationships. Three complementary proposals for classification are as follows.

2.3.1 Classification according to the intended audience

This first simple taxonomy is suggested by NIST [45], which organizes providers according to the audience at which the Cloud is aimed. There are four classes in this classification: Private Clouds, Community Clouds, Public Clouds, and Hybrid Clouds. The first three classes accommodate providers in a gradual opening of the intended audience coverage. The Private Cloud class encompasses Clouds destined to be used solely by one organization, operating over its own datacenter or one leased from a third party for exclusive use. When the Cloud infrastructure is shared by a group of organizations with similar interests, it is classified as a Community Cloud. Furthermore, the Public Cloud class encompasses all initiatives intended to be used by the general public. Finally, Hybrid Clouds are simply the composition of two or more Clouds pertaining to different classes (Private, Community, or Public).

2.3.2 Classification according to the service type

In [71], the authors offer a classification as represented in Figure 2. Such a taxonomy divides Clouds into five categories: Cloud Application, Cloud Software Environment, Cloud Software Infrastructure, Software Kernel, and Firmware/Hardware. The authors arranged the different types of Clouds in a stack, showing that Clouds from higher levels are created using services in the lower levels. This idea pertains to the definitions of Cloud Computing discussed previously in Sections 2.1 and 2.2. Essentially, the Cloud provider does not need to be the owner of the infrastructure.
Figure 2 Classification of Cloud types (from [71])

The class at the top of the stack, also called Software-as-a-Service (SaaS), involves applications accessed through the Internet, including social networks, Webmail, and Office tools. Such services provide software to be used by the general public, whose main interest is to avoid tasks related to software management like installation and updating. From the point of view of the Cloud provider, SaaS can decrease software implementation costs when compared with traditional processes.

Similarly, the Cloud Software Environment, also called Platform-as-a-Service (PaaS), encloses Clouds that offer programming environments for developers. Through well-defined APIs, developers can use software modules for access control, authentication, distributed processing, and so on, in order to produce their own applications in the Cloud. Moreover, developers can contract services for automatic scalability of their software, databases, and storage services.

In the middle of the stack there is the Cloud Software Infrastructure class of initiatives. This class encompasses solutions that provide virtual versions of infrastructure devices found in datacenters, like servers, databases, and links. Clouds in this class can be divided into three subclasses according to the type of resource they offer. Computational resources are grouped in the Infrastructure-as-a-Service (IaaS) subclass, which provides generic virtual machines that can be used in many different ways by the contracting developer. Services for massive data storage are grouped in the Data-as-a-Service (DaaS) subclass, whose main mission is to store users' data remotely, allowing those users to access their data from anywhere and at any time. Finally, the third subclass, called Communications-as-a-Service (CaaS), is composed of solutions that offer virtual private links and routers through telecommunication infrastructures.

The last two classes do not offer Cloud services specifically, but they are included in the classification to show that providers offering Clouds in higher layers can have their own software and hardware infrastructure. The Software Kernel class includes all of the software necessary to provide services to the other categories, like operating systems, hypervisors, cloud management
middleware, programming APIs, and libraries. Finally, the Firmware/Hardware class covers all sale and rental services of physical servers and communication hardware.

2.3.3 Classification according to programmability

The five-class scheme presented above can classify and organize the current spectrum of Cloud Computing solutions, but such a model is limited because the number of classes and their relationships will need to be rearranged as new Cloud services emerge. Therefore, in this Thesis, a different classification model will be used, based on the programmability concept previously introduced by Endo et al. [19]. Borrowed from the realm of network virtualization [11], programmability is a concept related to the programming features a network element offers to developers, measuring how much freedom the developer has to manipulate resources and/or devices. This concept can be easily applied to the comparison of Cloud Computing solutions. More programmable Clouds offer environments where developers are free to choose programming paradigms, languages, and platforms. Less programmable Clouds restrict developers in some way: perhaps by forcing a set of programming languages or by providing support for only one application paradigm. On the other hand, programmability directly affects the way developers manage their leased resources. From this point of view, providers of less programmable Clouds are responsible for managing their infrastructure while being transparent to developers. In turn, a more programmable Cloud leaves more of these tasks to developers, thus introducing management difficulties due to the more heterogeneous programming environment.

Thus, Cloud Programmability can be defined as the level of sovereignty that developers have to manipulate services leased from a provider. Programmability is a relative concept, i.e., it is adopted to compare one Cloud with others. Also, programmability is directly proportional to heterogeneity in the infrastructure of the provider and inversely proportional to the amount of effort that developers must spend to manage leased resources.

To illustrate how this concept can be used, one can classify two current Clouds: Amazon EC2 and Google App Engine. Clearly, Amazon EC2 is more programmable, since in this Cloud developers can choose between different virtual machine classes, operating systems, and so on. After they lease one of these virtual machines, developers can configure it to work as they see fit: as a web server, as a content server, as a unit for batch processing, and so on. On the other hand, Google App Engine can be classified as a less programmable solution, because it allows developers to create Web applications that will be hosted by Google. This restricts developers to the Web paradigm and to some programming languages.
2.4 Mediation System

Figure 3 introduces an Archetypal Cloud Mediation System. This is a conceptual model that will be used as a reference for the discussion on Resource Management in this Thesis. The Archetypal Cloud Mediation System focuses on one principle: resource management as the main service of any Cloud Computing provider. Thus, other important services like authentication, accounting, and security are out of the scope of this conceptual system and are therefore kept separate from the Mediation System itself. Clients also do not factor into this view of the system, since resource management is mainly related to the allocation of developers' applications and meeting their requirements.

Figure 3 Components of an Archetypal Cloud Mediation System (adapted from [24])

The mediation system is responsible for the entire process of resource management in the Cloud. Such a process covers tasks that range from the automatic negotiation of developers' requirements to the execution of their applications. It has three main layers: negotiation, resource management, and resource control.

The negotiation layer deals with the interface between the Cloud and developers. In the case of Clouds selling infrastructure services, the interface can be a set of operations based on Web Services for control of the leased virtual machines. Alternatively, in the case of PaaS services, this interface can be an API for software development in the Cloud. Moreover, the negotiation layer handles the process of contract establishment between the enterprises and the Cloud. Currently, this process is simple and the contracts tend to be restrictive. One can expect that, in the future, Clouds will offer more sophisticated avenues for user interaction through high-level abstractions and service-level policies.
The resource management layer is responsible for the optimal allocation of applications in order to obtain the maximum usage of resources. This function requires advanced strategies and heuristics to allocate resources that meet the contractual requirements established with the application developer. These may include service quality restrictions, jurisdiction restrictions, and elastic adaptation, among others.

Metaphorically, one can say that while the resource management layer acts as the "brain" of the Cloud, the resource control layer plays the role of its "limbs". The resource control layer encompasses all functions needed to enforce the decisions generated by the upper layer. Beyond the tools used to effectively configure the Cloud resources, all communication protocols used by the Cloud are included in this layer.

2.5 Groundwork Technologies

Some of the main technologies used by current Cloud mediation systems (namely Service-Oriented Computing, Virtualization, MapReduce, and Datacenters) will be discussed next.

2.5.1 Service-Oriented Computing

Service-Oriented Computing defines a set of principles, architectural models, and technologies for the design and development of distributed applications. The recent development of software focusing on services gave rise to SOA (Service-Oriented Architecture), which can be defined as an architectural model "that supports the discovery, message exchange, and integration between loosely coupled services using industry standards" [37]. The common technology for the implementation of SOA principles is the Web Service, which defines a set of standards to implement services over the World Wide Web.

In Cloud Computing, SOA is the main paradigm for the development of functions on the several layers of the Cloud. Cloud providers publish APIs for their services on the web, allowing developers to use the Cloud and to automate several tasks related to the management of their applications. Such APIs can assume the form of WSDL documents or REST-based interfaces. Furthermore, providers can make Software Development Kits (SDKs) and other toolkits available for the manipulation of applications running on the Cloud.
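As an illustration of how such REST-based interfaces are typically consumed, the sketch below retrieves a service description and requests an additional resource over HTTP. The endpoint paths and fields are hypothetical, used only to convey the style of interaction; they do not correspond to the interface of any particular provider or to Nubilum's own protocol.

```python
import requests

BASE = "https://cloud.example.org/api"  # hypothetical Cloud provider endpoint

# Retrieve the description of a leased service (hypothetical resource path).
resp = requests.get(f"{BASE}/services/42", headers={"Accept": "application/xml"})
resp.raise_for_status()
print(resp.text)  # XML document describing the service

# Request an additional virtual machine for the same service.
new_vm = {"cpu": 2, "memoryMB": 2048, "location": "recife-br"}
resp = requests.post(f"{BASE}/services/42/nodes", json=new_vm)
print(resp.status_code, resp.headers.get("Location"))
```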
2.5.2 Server Virtualization

Server virtualization is a technique that allows a computer system to be partitioned into multiple isolated execution environments, called Virtual Machines (VMs), each offering a service similar to that of a single physical computer. Each VM can be configured independently, having its own operating system, applications, and network parameters. Commonly, such VMs are hosted on a physical server running a hypervisor, the software that effectively virtualizes the server and manages the VMs [54].

There are several hypervisor options that can be used for server virtualization. From the open-source community, one can cite Citrix's Xen (http://www.xen.org/products/cloudxen.html) and the Kernel-based Virtual Machine (KVM, http://www.linux-kvm.org/page/Main_Page). From the realm of proprietary solutions, some examples are VMware ESX (http://www.vmware.com/) and Microsoft's Hyper-V (http://www.microsoft.com/hyper-v-server/en/us/default.aspx).

The main factor that boosted the adoption of server virtualization within Cloud Computing is that such technology offers good flexibility regarding the dynamic reallocation of workloads across servers. Such flexibility allows, for example, providers to execute maintenance on servers without stopping developers' applications (which are running on VMs), or to implement strategies for better resource usage through the migration of VMs. Furthermore, server virtualization is well suited to the fast provisioning of new VMs through the use of templates, which enables providers to offer elastic services to application developers [43].

2.5.3 MapReduce Framework

MapReduce [15] is a programming framework developed by Google for the distributed processing of large data sets across computing infrastructures. Inspired by the map and reduce primitives present in functional languages, its authors developed an entire framework for the automatic distribution of computations. In this framework, developers are responsible for writing map and reduce operations and for using them according to their needs, much as in the functional paradigm. These map and reduce operations are executed by the MapReduce system, which transparently distributes computations across the computing infrastructure and handles all issues related to node communication, load balancing, and fault tolerance. For the distribution and synchronization of the data required by the application, the MapReduce system also requires the use of a specially tailored distributed file system called the Google File System (GFS) [23].

Despite being introduced by Google, there are some open-source implementations of the MapReduce system, like Hadoop [6] and TPlatform [55]. The former is popular open-source software used for running applications on large clusters built of commodity hardware. This software is used by large companies like Amazon, AOL, and IBM, as well as in different Web applications such as Facebook, Twitter, and Last.fm, among others. Basically, Hadoop is composed of two modules: a MapReduce environment for distributed computing and a distributed file system called the Hadoop Distributed File System (HDFS). The latter is an academic initiative that provides a development platform for Web mining applications. Similarly to Hadoop and Google's MapReduce, the TPlatform has a MapReduce module and a distributed file system known as the Tianwang File System (TFS) [55].

The use of MapReduce solutions is common groundwork technology in PaaS Clouds because it offers a versatile sandbox for developers. Differently from IaaS Clouds, PaaS developers using a general-purpose language with MapReduce support do not need to be concerned with software configuration, software updates, and network configuration. All these tasks are the responsibility of the Cloud provider, which, in turn, benefits from the fact that such configurations will be standardized across the overall infrastructure.
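To give a concrete feel for the programming style described above, the sketch below emulates a word-count job in plain Python. It only illustrates the map/reduce contract (it runs in a single process and ignores distribution, fault tolerance, and the distributed file system); it does not use Hadoop's or Google's actual APIs.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    # Emit one (key, value) pair per word occurrence.
    for word in text.split():
        yield word.lower(), 1

def reduce_phase(word, counts):
    # Aggregate all values emitted for the same key.
    return word, sum(counts)

def run_job(documents):
    # A real framework would shuffle pairs by key across workers;
    # here the grouping is done locally with a dictionary.
    groups = defaultdict(list)
    for doc_id, text in documents.items():
        for word, count in map_phase(doc_id, text):
            groups[word].append(count)
    return dict(reduce_phase(w, c) for w, c in groups.items())

if __name__ == "__main__":
    docs = {"d1": "the cloud hosts the app", "d2": "the app scales"}
    print(run_job(docs))  # {'the': 3, 'cloud': 1, 'hosts': 1, 'app': 2, 'scales': 1}
```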
2.5.4 Datacenters

Developers who host their applications on a Cloud wish to scale their leased resources, effectively increasing and decreasing their virtual infrastructure according to the demand of their clients. This is also the case for developers making use of their own private Clouds. Thus, independently of the class of Cloud under consideration, a robust and safe infrastructure is needed. Whereas virtualization and MapReduce provide the software solution required to meet this demand, the physical infrastructure of Clouds is based on datacenters, which are infrastructures composed of IT components providing processing capacity, storage, and network services for one or more organizations [66]. Currently, the size of a datacenter (in number of components) can vary from tens to tens of thousands of components, depending on the datacenter's mission. In addition, there are several different IT components in datacenters, including switches and routers, load balancers, storage devices, dedicated storage networks, and the main component of any datacenter: servers [27].

Cloud Computing datacenters provide the required power to meet developers' demands in terms of processing, storage, and networking capacities. A large datacenter, running a virtualization solution, allows a finer-grained division of the hardware's power through the statistical multiplexing of developers' applications.
3 Distributed Cloud Computing

"Quae non prosunt singula, multa iuvant."
Ovid

This chapter discusses the main concepts of Distributed Cloud (D-Cloud) Computing. It begins with a discussion of its definition (Section 3.1), in an attempt to distinguish D-Clouds from current Clouds and highlight their main characteristics. Next, the main research challenges regarding resource management in D-Clouds are described in Section 3.2.

3.1 Definitions

Current Cloud Computing setups involve huge investments in the datacenter, which is the common underlying infrastructure of Clouds, as previously detailed in Section 2.5.4. This centralized infrastructure brings many well-known challenges, such as the need for resource over-provisioning and the high cost of heat dissipation and temperature control. In addition to concerns with infrastructure costs, one must observe that those datacenters are not necessarily close to their clients, i.e., the network between end-users and the Cloud is often a long best-effort IP connection, which means longer round-trip delays.

Considering such limitations, industry and academia have presented indications that small datacenters can sometimes be more attractive, since they offer a cheaper and lower-power alternative while also reducing the infrastructure costs of centralized Clouds [12]. Moreover, Distributed Clouds, or just D-Clouds, as pointed out by Endo et al. in [20], can exploit the possibility of link creation and the potential of sharing resources across geographic boundaries to provide latency-based allocation of resources and ultimately fully utilize this distributed computing power. Thus, D-Clouds can reduce communication costs by simply provisioning data, servers, and links close to end-users.

Figure 4 illustrates how D-Clouds can reduce the cost of communication through the spread of computational power and the use of latency-based allocation of applications. In Figure 4(a), the client uses an application (App) running on the Cloud through the Internet, which is subject to the latency imposed by the best-effort network. In Figure 4(b), the client accesses the same App,
but in this case, the latency imposed by the network will be reduced due to the allocation of the App on a server in a small datacenter that is closer to the client than in the previous scenario.

Figure 4 Comparison between (a) a current Cloud and (b) a D-Cloud

Please note that Figure 4(b) intentionally does not specify the network connecting the infrastructure of the D-Cloud Provider. This network can be rented from different local ISPs (using the Internet for interconnection) or from an ISP with wide-area coverage. In addition, such an ISP could be the D-Cloud Provider itself. This may be the case as the D-Cloud paradigm introduces an organic change in the current Internet, where ISPs can start to act as D-Cloud providers. Thus, ISPs could offer their communication and computational resources to developers interested in deploying their applications in the specific markets covered by those ISPs. This idea is illustrated by Figure 5, which shows a D-Cloud offered by a hypothetical Brazilian ISP. In this example, a developer deployed its application (App) on two servers in order to serve requests from northern and southern clients. If the number of northeastern clients increases, the developer can deploy its App (represented by the dotted box) on one server close to the northeast region in order to improve its service quality. It is important to note that the contribution of this Thesis falls in this last scenario, i.e., a scenario where the network and computational resources are all controlled by the same provider.
Figure 5 ISP-based D-Cloud example

D-Clouds share similar characteristics with current Cloud Computing, including essential offerings such as scalability, on-demand usage, and pay-as-you-go business plans. Furthermore, the agents already stated for current Clouds (please see Figure 1) are exactly the same in the context of D-Clouds. Finally, the many different classifications discussed in Section 2.3 also apply. Despite the similarity, one may highlight two peculiarities of D-Clouds: support for geo-locality and Network as a Service (NaaS) provisioning ([2], [63], [17]).

The geographical diversity of resources potentially improves cost and performance and gives an advantage to several different applications, particularly those that do not require massive internal communication among large server pools. In this category, as pointed out by [12], one can emphasize, firstly, applications that are already deployed in a distributed manner, like VoIP (Voice over IP) and online games; and, secondly, applications that are good candidates for a distributed implementation, like traffic filtering and e-mail distribution. In addition, there are other types of applications that use software or data with specific legal restrictions on jurisdiction, and specific applications whose audience is restricted to one or more geographical areas, like the tracking of bus or subway routes, information about entertainment events, local news, etc. Support for geo-locality can be considered a step further in the deployment of Cloud Computing, one that leverages new opportunities for service providers. Thus, they will be free to choose where to allocate their resources in order to serve specific niches, constraints on the jurisdiction of software and data, or quality of service aspects of end-users.
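The latency-based allocation idea behind Figures 4 and 5 can be illustrated with a toy placement routine: given measured round-trip times between client regions and candidate sites, the application is placed at the site that minimizes the latency experienced by its dominant client population. This is only a didactic sketch with made-up numbers; the allocation algorithms actually used by Nubilum are the subject of Chapter 6.

```python
def pick_site(rtt_ms, client_share):
    """Choose the candidate site with the lowest demand-weighted latency.

    rtt_ms[site][region] is the measured round-trip time between a site and
    a client region; client_share[region] is the fraction of requests
    originating in that region (illustrative inputs only).
    """
    def weighted_latency(site):
        return sum(client_share[r] * rtt for r, rtt in rtt_ms[site].items())
    return min(rtt_ms, key=weighted_latency)

rtt_ms = {
    "recife":       {"north": 45, "northeast": 8,  "south": 60},
    "porto_alegre": {"north": 80, "northeast": 55, "south": 10},
}
client_share = {"north": 0.2, "northeast": 0.6, "south": 0.2}
print(pick_site(rtt_ms, client_share))  # -> 'recife' for this demand profile
```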
The NaaS (or Communication as a Service, CaaS, as cited in Section 2.3.2) allows service providers to manage network resources instead of just computational ones. The authors in [2] define NaaS as a service offering transport network connectivity with a level of virtualization suitable to be invoked by service providers. In this way, D-Clouds are able to manage their network resources according to their convenience, offering better response times for hosted applications. NaaS is close to the Network Virtualization (NV) research area [31], where the main problem consists in choosing how to allocate a virtual network over a physical one, meeting requirements and minimizing usage of the physical resources. Although NV and D-Clouds are subject to similar problems and scenarios, there is an essential difference between the two. While NV commonly models its resources at the infrastructure level (requests are always virtual networks mapped on graphs), a D-Cloud can be engineered to work with applications at a different abstraction level, exactly as occurs with actual Cloud service types like the ones described in Section 2.3.2. This way, one may see Network Virtualization simply as a particular instance of the D-Cloud. Other insights about NV are given in Section 3.3.2.

Finally, it must be highlighted that the D-Cloud does not compete with the current Cloud Computing paradigm, since the D-Cloud merely fits a certain type of application that has hard restrictions on geographical location, while the existing Clouds continue to be attractive for applications demanding massive computational resources or simple applications with minor or no restrictions on geographical location. Thus, the current Cloud Computing providers are the first potential candidates to take advantage of the D-Cloud paradigm, since the current Clouds could hire D-Cloud resources on demand and move applications to certain geographical locations in order to meet specific developers' requirements. In addition to the current Clouds, the D-Clouds can also serve developers directly.

3.2 Research Challenges inherent to Resource Management

D-Clouds face challenges similar to the ones presented in the context of current Cloud Computing. However, as stated in Chapter 1, the object of the present study is resource management in D-Clouds. Thus, this Section gives special emphasis to the challenges for resource management in D-Clouds, focusing on four categories as presented in [20]: a) resource modeling; b) resource offering and treatment; c) resource discovery and monitoring; and d) resource selection.

3.2.1 Resource Modeling

The first challenge is the development of a suitable resource model, which is essential to all operations in the D-Cloud, including management and control. Optimization algorithms are also strongly dependent on the resource modeling scheme used. In a D-Cloud environment, it is very important that resource modeling takes into account physical resources as well as virtual ones. On one hand, the amount of detail for each resource should be treated carefully: if resources are described in great detail, there is a risk that resource optimization becomes hard and complex, since taking the several modeled aspects into account can turn the optimization into an NP-hard problem. On the other hand, more details give more flexibility and leverage the usage of resources.
There are some alternatives for resource modeling in Clouds that could be applied to D-Clouds. One can cite, for example, the OpenStack software project [53], which is focused on producing an open-standard Cloud operating system. It defines a RESTful HTTP service that supports JSON and XML data formats and is used to request or exchange information about Cloud resources and action commands. OpenStack also offers ways to describe how to scale servers down or up (using pre-configured thresholds); it is extensible, allowing the seamless addition of new features; and it returns additional error messages in case of faults.

Another resource modeling alternative is the Virtual Resources and Interconnection Networks Description Language (VXDL) [39], whose main goal is to describe the resources that compose a virtual infrastructure, focusing on virtual grid applications. VXDL is able to describe the components of an infrastructure, their topology, and an execution chronogram. These three aspects compose the main parts of a VXDL document. The computational resource specification part describes resource parameters. Furthermore, some peculiarities of virtual Grids are also present, such as the allocation of virtual machines on the same hardware and location dependence. The specification of the virtual infrastructure can consider specific developers' requirements such as network topology and delay, bandwidth, and the direction of links. The execution chronogram specifies the period of resource utilization, allowing efficient scheduling, which is a clear concern for Grids rather than Cloud computing. Another interesting point of VXDL is the possibility of describing resources individually or in groups, according to application needs. VXDL lacks support for descriptions of distinct services, since it is focused on grid applications only.

The proposal presented in [32], called VRD hereafter, describes resources in a network virtualization scenario where infrastructure providers describe their virtual resources and services prior to offering them. It takes into consideration the integration between the properties of virtual resources and their relationships. An interesting point of the proposal is its use of functional and non-functional attributes. Functional attributes are related to characteristics, properties, and functions of components. Non-functional attributes specify criteria and constraints, such as performance, capacity, and QoS. Among the functional properties that must be highlighted is the set of component types: PhysicalNode, VirtualNode, Link, and Interface. Such properties suggest a flexibility that can be used to represent routers or servers, in the case of nodes, and wired or wireless links, in the case of communication links and interfaces.
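To make the flavour of such resource descriptions concrete, the sketch below encodes a small physical node with functional and non-functional attributes, loosely inspired by the functional/non-functional split of the VRD proposal. The field names are hypothetical and do not reproduce the actual syntax of OpenStack, VXDL, VRD, or CloudML.

```python
from dataclasses import dataclass, field

@dataclass
class Interface:
    name: str
    bandwidth_mbps: int          # functional: what the interface offers

@dataclass
class PhysicalNode:
    node_id: str
    node_type: str               # functional: e.g. "server" or "router"
    location: str                # functional: geographic placement (PoP)
    interfaces: list = field(default_factory=list)
    cpu_cores: int = 0           # non-functional: capacity constraint
    memory_mb: int = 0           # non-functional: capacity constraint
    cpu_load: float = 0.0        # non-functional: current status

node = PhysicalNode(
    node_id="pop-recife-01",
    node_type="server",
    location="Recife/BR",
    interfaces=[Interface("eth0", 1000)],
    cpu_cores=8,
    memory_mb=16384,
    cpu_load=0.35,
)
print(node)
```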
Another proposal, known as the Manifest language, was developed by Chapman et al. [9]. They proposed new meta-models to represent service requirements, constraints, and elasticity rules for software deployment in a Cloud. The building block of this framework is the OVF (Open Virtualization Format) standard, which was extended by Chapman et al. to realize the vision of D-Clouds by considering locality constraints. These two points are very relevant to our scenario. With regard to elasticity, it assumes a rule-based specification formed by three fields: a monitored condition related to the state of the service (such as workload), an operator (relational and logical ones are accepted), and an associated action to follow when the condition is met. The location constraints identify sites that should be favored or avoided when selecting a location for a service. Nevertheless, the Manifest language is focused on the software architecture. Hence, the language is not concerned with other aspects such as resource status or network resources.

Cloud# is a language for modeling Clouds proposed by [16] to be used as a basis for Cloud providers and clients to establish trust. The model is used by developers to understand the behavior of Cloud services. The main goal of Cloud# is to describe how services are delivered, taking into consideration the interaction among physical and virtual resources. The main syntactic construct within Cloud# is the computation unit CUnit, which can model Cloud systems, virtual machines, or operating systems. A CUnit is represented as a tuple of six components modeling characteristics and behaviors. This language gives developers a better understanding of the Cloud organization and how their applications are dealt with.

3.2.2 Resource Offering and Treatment

Once the D-Cloud resources are modeled, the next challenge is to describe how resources are offered to developers, which is important since the requirements supported by the provider are defined in this step. This challenge will also define the interfaces of the D-Cloud. It differs from resource modeling, since the modeling is independent of the way that resources are offered to developers. For example, the provider could model each resource individually, like independent items on a fine-grained scale such as GHz of CPU or GB of memory, but could offer them as a coupled collection of those items or a bundle, such as the VM templates cited in Section 2.5.2.

Recall that, in addition to computational requirements (CPU and memory) and traditional network requirements, such as bandwidth and delay, new requirements are present in D-Cloud scenarios. The topology of the nodes is a first interesting requirement to be described. Developers should be able to set inter-node relationships and communication restrictions (e.g., downlink and uplink rates). This is illustrated in the scenario where servers, configured and managed by developers, are distributed across different geographical localities while needing to communicate with each other in a specific way.

Jurisdiction is related to where (geographically) applications and their data must be stored and handled. Due to restrictions such as copyright laws, D-Cloud users may want to limit the location where their information will be stored (such as countries or continents). Other geographical constraints can be imposed by a maximum (or minimum) physical distance (or delay value) between nodes. Here, although developers do not know the actual topology of the nodes, they may merely establish some delay threshold value, for example.

Developers should also be able to describe scalability rules, which would specify how and when the application would grow and consume more resources from the D-Cloud. The authors in [21] and [9] define a way of doing this, allowing the Cloud user to specify actions that should be taken, like deploying new VMs, based on thresholds of metrics monitored by the D-Cloud itself.
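The rule-based style of elasticity description mentioned above (a monitored condition, an operator, and an action) can be sketched as follows. The rule format and field names are hypothetical, intended only to illustrate the idea; they do not reproduce the Manifest language of [9] or the notation of [21].

```python
import operator

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}

# A scalability rule: a condition (metric, operator, threshold) plus an action.
rules = [
    {"metric": "cpu_utilization", "op": ">", "threshold": 0.80, "action": "deploy_vm"},
    {"metric": "cpu_utilization", "op": "<", "threshold": 0.20, "action": "release_vm"},
]

def evaluate(rules, metrics):
    """Return the actions triggered by the currently monitored metrics."""
    triggered = []
    for rule in rules:
        value = metrics.get(rule["metric"])
        if value is not None and OPS[rule["op"]](value, rule["threshold"]):
            triggered.append(rule["action"])
    return triggered

print(evaluate(rules, {"cpu_utilization": 0.92}))  # ['deploy_vm']
```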
Additionally, resource offering is associated with interoperability. Current Cloud providers offer proprietary interfaces to access their services, which can lock users into their infrastructure, as applications cannot be easily migrated between providers [8]. It is hoped that Cloud providers will recognize this problem and work together to offer a standardized API. According to [61], Cloud interoperability faces two types of heterogeneity: vertical heterogeneity and horizontal heterogeneity. The first type is concerned with interoperability within a single Cloud and may be addressed by a common middleware throughout the entire infrastructure. The second, horizontal heterogeneity, is related to Clouds from different providers. Therefore, the key challenge is dealing with these differences. In this case, a high level of granularity in the modeling may help to address the problem.

An important effort in the search for horizontal standardization comes from the Open Cloud Manifesto (http://www.opencloudmanifesto.org/), an initiative supported by hundreds of companies that aims to discuss a way to produce open standards for Cloud Computing. Its major doctrines are collaboration and coordination of standardization efforts, adoption of open standards wherever appropriate, and the development of standards based on customer requirements. Participants of the Open Cloud Manifesto, through the Cloud Computing Use Case group, produced an interesting white paper [51] highlighting the requirements that need to be standardized in a Cloud environment to ensure interoperability in the most typical scenarios of interaction in Cloud Computing.
  • 36. 22 Another group involved with Cloud standards is the Open Grid Forum6 , which is intended to develop the specification of the Open Cloud Computing Interface (OCCI)7 . The goal of OCCI is to provide an easily extendable RESTful interface for Cloud management. Originally, OCCI was designed for IaaS setups, but its current specification [46] was extended to offer a generic scheme for the management of different Cloud services. 3.2.3 Resource Discovery and Monitoring When requests reach a D-Cloud, the system should be aware of the current status of resources, in order to determine whether there are available resources in the D-Cloud that could satisfy the requests. In this way, the right mechanisms for resource discovery and monitoring should also be designed, allowing the system to be aware of the updated status of all its resources. Then, based on the current status and the requests' requirements, the system may select and allocate resources to serve these new requests. Resource monitoring should be continuous and should help in taking allocation and reallocation decisions as part of the overall resource usage optimization. A careful analysis should be done to find an acceptable trade-off between the amount of control overhead and the frequency of resource information updates. The monitoring may be passive or active. It is considered passive when there are one or more entities collecting information. The entity may continuously send polling messages to nodes asking for information or may do this on demand when necessary. On the other hand, the monitoring is active when nodes are autonomous and may decide when to asynchronously send state information to some central entity. Naturally, D-Clouds may use both alternatives simultaneously to improve the monitoring solution. In this case, it is necessary to synchronize updates in repositories to maintain the consistency and validity of state information. The discovery and monitoring in a D-Cloud can be accompanied by the development of specific communication protocols. Such protocols act as a standard plane for control in the Cloud, allowing interoperability between devices. It is expected that such protocols can control the different elements present in the D-Cloud, including servers, switches, routers, load balancers, and storage components. One possible method of coping with this challenge is to use smart communication nodes with an open programming interface to create new services within the node. One example of this type of open node can be seen in the emerging Openflow-enabled switches [44]. 6 http://www.gridforum.org/ 7 http://occi-wg.org/about/specification/
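The passive and active monitoring styles just described can be contrasted with a small sketch. The class and method names below are illustrative and do not correspond to any existing monitoring API.

import time

class Node:
    """A managed resource that can report its own status."""
    def __init__(self, node_id):
        self.node_id = node_id

    def report_status(self):
        return {"cpu": 0.42, "ram": 0.61, "ts": time.time()}  # placeholder values

class PassiveMonitor:
    """Passive style: a central entity polls nodes for their status."""
    def __init__(self, nodes):
        self.nodes = nodes

    def poll_once(self):
        return {n.node_id: n.report_status() for n in self.nodes}

def on_alarm(node_id, status):
    print("asynchronous update from", node_id, status)

class ActiveNode(Node):
    """Active style: an autonomous node pushes its status when a threshold is crossed."""
    def __init__(self, node_id, cpu_threshold, notify=on_alarm):
        super().__init__(node_id)
        self.cpu_threshold, self.notify = cpu_threshold, notify

    def sample(self, cpu_load):
        if cpu_load > self.cpu_threshold:
            self.notify(self.node_id, {"cpu": cpu_load, "ts": time.time()})

print(PassiveMonitor([Node("w1"), Node("w2")]).poll_once())
ActiveNode("w3", cpu_threshold=0.8).sample(0.95)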
  • 37. 23 3.2.4 Resource Selection and Optimization With information regarding Cloud resource availability at hand, a set of appropriate candidates may then be highlighted. Next, the resource selection process finds the configuration that fulfills all requirements and optimizes the usage of the infrastructure. Selecting solutions from a set of available ones is not a trivial task due to the dynamicity, high algorithm complexity, and all different requirements that must be contemplated by the provider. The problem of resource allocation is recurrent on computer science, and several computing areas have faced such type of problem since early operating systems. Particularly in the Cloud Computing field, due to the heterogeneous and time-variant environment in Clouds, the resource allocation becomes a complex task, forcing the mediation system to respond with minimal turnaround time in order to maintain the developer’s quality requirements. Also, balancing resources’ load and projecting energy-efficient Clouds are major challenges in Cloud Computing. This last aspect is especially relevant as a result of the high demand for electricity to power and to cool the servers hosted on datacenters [7]. In a Cloud, energy savings may be achieved through many different strategies. Server consolidation, for example, is a useful strategy for minimizing energy consumption while maintaining high usage of servers’ resources. This strategy saves the energy migrating VMs onto some servers and putting idle servers into a standby state. Developing automated solutions for server consolidation can be a very complex task since these solutions can be mapped to bin-packing problems known to be NP-hard [72]. VM migration and cloning provides a technology to balance load over servers within a Cloud, provide fault tolerance to unpredictable errors, or reallocate applications before a programmed service interruption. But, although this technology is present in major industry hypervisors (like VMWare or Xen), there remains some open problems to be investigated. These include cloning a VM into multiple replicas on different hosts [40] and developing VM migration across wide-area networks [14]. Also, the VM migration introduces a network problem, since, after migration, VMs require adaptation of the link layer forwarding. Some of the strategies for new datacenter architectures explained in [67] offer solutions to this problem. Remodeling of datacenter architectures is other research field that tries to overcome limitations on scalability, stiffness of address spaces, and node congestion in Clouds. Authors in [67] surveyed this theme, highlighted the problems on network topologies of state-of-the-art datacenters, and discussed literature solutions for these problems. One of these solutions is the D-Cloud, as
  • 38. 24 pointed also by [72], which offers an energy efficient alternative for constructing a cloud and an adapted solution for time-critical services and interactive applications. Considering specifically the challenges on resource allocation in D-Clouds, one can highlight correlated studies based on the Placement of Replicas and Network Virtualization. The former is applied into Content Distribution Networks (CDNs) and it tries to decide where and when content servers should be positioned in order to improve system’s performance. Such problem is associated with the placement of applications in D-Clouds. The latter research field can be applied to D-Clouds considering that a virtual network is an application composed by servers, databases, and the network between them. Both research fields will be described in following sections. Replica Placement Replica Placement (RP) consists of a very broad class of problems. The main objective of this type of problems is to decide where, when, and by whom servers or their content should be positioned in order to improve CDN performance. The correspondent existing solutions to these problems are generally known as Replica Placement Algorithms (RPA) [35]. The general RP problem is modeled as a physical topology (represented by a graph), a set of clients requesting services, and some servers to place on the graph (costs per server can be considered instead). Generally, there is a pre-established cost function to be optimized that reflects service-related aspects, such as the load of user’s requests, the distance from the server, etc. As pointed out by [35], an RPA groups these aspects into two different components: the problem definition, which consists of a cost function to be minimized under some constraints, and a heuristic, which is used to search for near-optimal solutions in a feasible time frame, since the defined problems are usually NP-complete. Several different variants of this general problem were already studied. But, according to [57], they fall into two classes: facility location and minimum K-median. In the facility location problem, the main goal is to minimize the total cost of the graph through the placement of a number of servers, which have an associated cost. The minimum K-median problem, in turn, is similar but assumes the existence of a pre-defined number K of servers. More details on the modeling and comparison between different variants of the RP problem are provided by [35]. Different versions of this problem can be mapped onto resource allocation problems in D- Clouds. A very simple mapping can be defined considering an IaaS service where virtual machines can be allocated in a geo-distributed infrastructure. In such mapping, the topology corresponds to the physical infrastructure elements of the D-Cloud, the VMs requested by developers can be treated as servers, and the number of clients accessing each server would be their load.
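As a reference point for the discussion above, the minimum K-median variant can be stated as the following optimization (a standard formulation, with C the set of clients, V the candidate server locations, w_c the load of client c, and d(c, s) the distance or delay between client c and location s):

\min_{S \subseteq V,\; |S| = K} \; \sum_{c \in C} w_c \cdot \min_{s \in S} d(c, s)

In the D-Cloud mapping just described, V would correspond to the physical infrastructure elements of the provider and the K selected locations to the hosts of the requested VMs.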
  • 39. 25 Qiu et al. [57] proposed three different algorithms to solve the K-median problem in a CDN scenario: Tree-based algorithm, Greedy algorithm, and Hot Spot algorithm. The Tree-based solution assumes that the underlying graph is a tree that is divided into several small trees, placing each server in each small tree. The Greedy algorithm places servers one at a time in order to obtain a better solution in each step until all servers are allocated. Finally, the Hot Spot solution attempts to place servers in the vicinity of clients with the greatest demand. The results showed that the Greedy Algorithm for replica placement could provide CDNs with performance that is close to optimal. These solutions can be mapped onto D-Clouds considering the simple scenario of VM allocation on a geo-distributed infrastructure with the restriction that each developer has a fixed number of servers to attend their clients. In such case, this problem can be straightforwardly reduced to the K-median problem and the three solutions proposed could be applied. Basically, one could treat each developer as a different CDN and optimize each one independently still considering a limited capacity of the physical resources caused by the allocation of other developers. Presti et al. [56], treat a RP variant considering a trade-off between the load of requests per content and the number of replica additions and removals. Their solution considers that each server in the physical topology decides autonomously, based on thresholds, when to clone overloaded contents or to remove the underutilized ones. Such decisions also encompass the minimization of the distance between clients and the respective accessed replica. A similar problem is investigated in [50], but considering constraints on the QoS perceived by the client. The authors propose a mathematical offline formulation and an online version that uses a greedy heuristic. The results show that the heuristic presents good results with minor computational time. The main focus of these solutions is to provide scalability to the CDN according to the load caused by client requests. Thus, despite working only with the placement of content replicas, such solutions can be also applied to D-Clouds with some simple modifications. Considering replicas as allocated VMs, one can apply the threshold-based solution proposed in [56] to the simple scenario of VM scalability on a geo-distributed infrastructure. Network Virtualization The main problem of NV is the allocation of virtual networks over a physical network [10] and [3]. Analogously, D-Clouds’ main goal is to allocate application requests on physical resources according to some constraints while attempting to obtain a clever mapping between the virtual and physical resources. Therefore, problems on D-Clouds can be formulated as NV problems, especially in scenarios considering IaaS-level services.
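In the spirit of the Greedy algorithm of Qiu et al. discussed above, the sketch below places K replicas (or VMs, in the D-Cloud mapping) one at a time, each time choosing the candidate site that most reduces the total demand-weighted client-to-replica distance. It is a simplified illustration under the assumption of uncapacitated sites, not the authors' original code.

def greedy_placement(sites, clients, demand, dist, k):
    """sites: candidate locations; demand[c]: load of client c;
    dist[(c, s)]: distance (e.g. delay) from client c to site s; k: number of replicas."""
    chosen = []
    def total_cost(placed):
        return sum(demand[c] * min(dist[(c, s)] for s in placed) for c in clients)
    for _ in range(k):
        best_site, best_cost = None, float("inf")
        for s in sites:
            if s in chosen:
                continue
            cost = total_cost(chosen + [s])
            if cost < best_cost:
                best_site, best_cost = s, cost
        chosen.append(best_site)
    return chosen

# Example with two candidate sites and three clients.
sites = ["dc1", "dc2"]
clients = ["c1", "c2", "c3"]
demand = {"c1": 10, "c2": 5, "c3": 1}
dist = {("c1", "dc1"): 1, ("c1", "dc2"): 4,
        ("c2", "dc1"): 3, ("c2", "dc2"): 1,
        ("c3", "dc1"): 2, ("c3", "dc2"): 2}
print(greedy_placement(sites, clients, demand, dist, k=1))  # -> ['dc1']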
  • 40. 26 Several instances of the NV based resource allocation problem can be reduced to a NP-hard problem [48]. Even the versions where one knows beforehand all the virtual network requests that will arrive in the system is NP-hard. The basic solution strategy thus is to restrict the problem space making it easier to deal with and also consider the use of simple heuristic-based algorithms to achieve fast results. Given a model based on graphs to represent both physical and virtual servers, switches, and links [10], an algorithm that allocates virtual networks should consider the constraints of the problem (CPU, memory, location or bandwidth limits) and an objective function based on the algorithm objectives. In [31], the authors describe some possible objective functions to be optimized, like the ones related to maximize the revenue of the service provider, minimizing link and nodes stress, etc. They also survey heuristic techniques used when allocating the virtual networks dividing them in two types: static and dynamic. The dynamic type permits reallocating along the time by adding more resources to already allocated virtual networks in order to obtain a better performance. The static one means once a virtual network is allocated it will hardly ever change its setup. To exemplify the type of problems studied on NV, one can be driven to discuss the one studied by Chowdhury et al. [10]. Its authors propose an objective function related to the cost and revenue of the provider and constrained by capacity and geo-location restrictions. They reduce the problem to a mixed integer programming problem and then relax the integer constraints through the deriving of two different algorithms for the solution’s approximation. Furthermore, the paper also describes a Load Balancing algorithm, in which the original objective function is customized in order to avoid using nodes and links with low residual capacity. This approach implies in allocation on less loaded components and an increase of the revenue and acceptance ratio of the substrate network. Such type of problem and solutions can be applied to D-Clouds. One example could be the allocation of interactive servers with jurisdiction restrictions. In this scenario, the provider must allocate applications (which can be mapped on virtual networks) whose nodes are linked and that must be close to a certain geographical place according to a maximum tolerated delay. Thus, a provider could apply the proposed algorithms with minor simple adjustments. In the paper of Razzaq and Rathore [58], the virtual network embedding algorithm is divided in two steps: node mapping and link mapping. In the node mapping step, nodes with highest resource demand are allocated first. The link mapping step is based on an edge disjoint k-shortest path algorithm, by selecting the shortest path which can fulfill the virtual link bandwidth
  • 41. 27 requirement. In [42], a backtracking algorithm for the allocation of virtual networks onto substrate networks based on the graph isomorphism problem is proposed. The modeling considers multiple capacity constraints. Zhu and Ammar [74] proposed a set of four algorithms with the goal of balancing the load on the physical links and nodes, but their algorithms do not consider capacity aspects. Their algorithms perform the initial allocation and make adaptive optimizations to obtain better allocations. The key idea of the algorithms is to allocate virtual nodes considering the load of the node and the load of the neighbor links of that node. Thus one can say that they perform the allocation in a coordinated way. For virtual link allocation, the algorithm tries to select paths with few stressed links in the network. For more details about the algorithm see [74]. Considering the objectives of NV and RP problems, one may note that NV problems are a general form of the RP problem: RP problems try to allocate virtual servers whereas NV considers allocation of virtual servers and virtual links. Both categories of problems can be applied to D- Clouds. Particularly, RP and NV problems may be respectively mapped on two different classes of D-Clouds: less controllable D-Clouds and more controllable ones, respectively. The RP problems are suitable for scenarios where allocation of servers is more critical than links. In turn, the NV problems are especially adapted to situations where the provider is an ISP that has full control over the whole infrastructure, including the communication infrastructure. 3.2.5 Summary The D-Clouds’ domain brings several engineering and research challenges that were discussed in this section and whose main aspects are summarized in Table I. Such challenges are only starting to receive attention from the research community. Particularly, the system, models, languages, and algorithms presented in the next chapters will cope with some of these challenges. Table I Summary of the main aspects discussed Categories Aspects Resource Modeling Heterogeneity of resources Physical and virtual resources must be considered Complexity vs. Flexibility Resource Offering and Treatment Describe the resources offered to developers Describe the supported requirements New requirements: topology, jurisdiction, scalability Resource Discovery and Monitoring Monitoring must be continuous Control overhead vs. Updated information Resource Selection and Optimization Find resources to fulfill developer’s requirements Optimize usage of the D-Cloud infrastructure Complex problems solved by approximation algorithms
  • 42. 28 4 The Nubilum System “Expulsa nube, serenus fit saepe dies.” Popular Proverb Section 2.4 introduced an Archetypal Cloud Mediation system focusing specifically on the resource management process that ranges from the automatic negotiation of developers requirements to the execution of their applications. Further, this system was divided into three layers: negotiation, resource management, and resource control. Keeping in mind this simple archetypal mediation system, this chapter presents Nubilum a resource management system that offers a self-managed solution for challenges resulting from the discovery, monitoring, control, and allocation of resources in D-Clouds. This system appears previously in [25] under the name of D-CRAS (Distributed Cloud Resource Allocation System). Section 4.1 presents some decisions taken to guide the overall design and implementation of Nubilum. Section 4.2 presents a conceptual view of the Nubilum’s architecture highlighting their main modules. The functional components of Nubilum are detailed in Section 4.3. Section 4.4 presents the main processes performed by Nubilum. Section 4.5 closes this chapter by summarizing the contributions of the system and comparing them with correlated resource management systems. 4.1 Design Rationale As stated previously in Section 1.2, the objective of this Thesis is to develop a self-manageable system for resource management on D-Clouds. Before the development of the system and their correspondent architecture, some design decisions that will guide the development of the system must be delineated and justified. 4.1.1 Programmability The first aspect to be defined is the abstraction level in which Nubilum will act. Given that D- Clouds concerns can be mapped on previous approaches on Replica Placement (see Section 0) and Network Virtualization (see Section 0) research areas, a straightforward approach would be to consider a D-Cloud working at the same abstraction level. Therefore, knowing that proposals in both areas commonly seem to work at the IaaS level, i.e., providing virtualized infrastructures, Nubilum would naturally also operate at the IaaS level.
  • 43. 29 Nubilum offers a Network Virtualization service. Applications can be treated as virtual networks and the provider’s infrastructure is the physical network. In this way, the allocation problem is a virtual network assignment problem and previous solutions for the NV area can be applied. Note that such approach does not exclude previous Replica Placement solutions because such area can be viewed as a particular case of Network Virtualization. 4.1.2 Self-optimization As defined in Section 2.1, the Cloud must provide services in a timely manner, i.e., resources required by users must be configured as quickly as possible. In other words, to meet such restriction, Nubilum must operate as much as possible without human intervention, which is the very definition of self-management from Autonomic Computing [69]. The operation involves maintenance and adjustment of the D-Cloud resources in the face of changing application demands and innocent or malicious failures. Thus, Nubilum must provide solutions to cope with the four aspects leveraged by Autonomic Computing: self-configuration, self- healing, self-optimization, and self-protection. Particularly, this Thesis focuses on investigating self- optimization – and, at some levels possibly, self-configuration – on D-Clouds. The other two aspects are considered out of scope of this proposal. According to [69], self-optimization of a system involves letting its elements “continually seek ways to improve their operation, identifying and seizing opportunities to make themselves more efficient in performance or cost”. Such definition fits very well the aim of Nubilum, which must ensure an automatic monitoring and control of resources to guarantee the optimal functioning of the Cloud while meeting developers’ requirements. 4.1.3 Existing standards adoption The Open Cloud Manifesto, an industry initiative that aims to discuss a way to produce open standards for Cloud Computing, states that Cloud providers “must use and adopt existing standards wherever appropriate” [51]. The Manifesto argues that several efforts and investments have been made by the IT industry in standardization, so it seems more productive and economic to use such standards when appropriate. Following this same line, Nubilum will adopt some industry standards when possible. Such adoption is also extended to open processes and software tools. 4.2 Nubilum’s conceptual view As shown in Figure 6, the conceptual view of Nubilum’s architecture is composed of three planes: a Decision plane, a Management plane, and an Infrastructure plane. Starting from the bottom, the lower plane nestles all modules responsible for the appropriate virtualization of each resource in the
  • 44. 30 D-Cloud: servers, dedicated storage, and network, which includes links, routers, and switches. The Management plane is responsible for the monitoring and control of the D-Cloud, as well as the enforcement of allocation decisions taken by the upper layer. Finally, at the top of the architecture, the Decision plane is where the advanced strategies and heuristics to allocate resources are implemented. The modules of each plane are detailed in the following sections. Figure 6 Nubilum's planes and modules 4.2.1 Decision plane The Decision plane is composed of four modules: Negotiator, Mapper, Application Monitors, and the D-Cloud Monitor. The Negotiator module is the front-end of Nubilum and is similar to the Negotiation Layer presented in the Archetypal Mediation System in Section 2.4. Thus, the Negotiator offers an API that can be used by developers to contract and control their virtual resources. In this system, these operations are implemented as Web Services using a descriptive language whose definitions are presented in Chapter 5. The Mapper is responsible for deciding the mapping of the required virtual resources onto the corresponding physical resources. It is important to note that, in Nubilum, the mapping algorithms work with a model of the entire infrastructure, as the effective allocation is the responsibility of the lower planes. Thus, the Mapper's inputs are a representation of a developer's request and a representation of the current status of the D-Cloud, and its output is a new representation of the D-Cloud status to be enforced over the real system. It is important to stress that the Mapper module contains the main intelligence of the system: it is responsible for finding a configuration that fulfills all computational and network requirements and optimizes the usage of the D-Cloud infrastructure. The allocation algorithms used as part of Nubilum will be discussed in Chapter 6.
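A minimal sketch of the Mapper's role as described above: it consumes a representation of the developer's request plus the current D-Cloud state and produces a new state to be enforced by the lower planes. The function and field names, and the first-fit policy, are illustrative assumptions rather than the actual Nubilum code.

from dataclasses import dataclass, field

@dataclass
class DCloudState:
    """Simplified model of the infrastructure: free CPU cores per physical server."""
    free_cpu: dict = field(default_factory=dict)      # server id -> free cores
    placements: dict = field(default_factory=dict)    # VM id -> server id

def map_request(request, state):
    """Greedy first-fit mapping of requested VMs onto servers; returns a new state to enforce."""
    new_state = DCloudState(dict(state.free_cpu), dict(state.placements))
    for vm_id, cpu_needed in request.items():
        server = next((s for s, free in new_state.free_cpu.items() if free >= cpu_needed), None)
        if server is None:
            raise RuntimeError(f"no server can host {vm_id}")
        new_state.free_cpu[server] -= cpu_needed
        new_state.placements[vm_id] = server
    return new_state

state = DCloudState(free_cpu={"srv-a": 4, "srv-b": 8})
print(map_request({"vm-1": 2, "vm-2": 6}, state).placements)  # {'vm-1': 'srv-a', 'vm-2': 'srv-b'}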
  • 45. 31 The Decision plane also has a set of modules managing the applications of each developer: the Application Monitors. The Application Monitor is responsible for the monitoring part of the self-management loop of the system, since it periodically checks an application to guarantee that its requirements are fulfilled. Each application submitted to Nubilum has an Application Monitor associated with it. This module continuously checks the current status of the application against its requirements and selects the appropriate actions to improve the application's performance, if that is necessary. Please note that the Application Monitor does not make allocation decisions; it is merely responsible for detecting performance degradations and requesting new resources from the Allocator. The provider's requirements - with respect to the usage of physical resources - are constantly monitored and verified through the D-Cloud Monitor module. Similar to the Application Monitor modules, the D-Cloud Monitor is another module acting in the self-management loop, in this case through the monitoring of the physical resources. It checks the current state of the physical resources against the provider's requirements. Upon detecting that a threshold has been crossed (e.g., an increase of server load beyond an established limit), the D-Cloud Monitor notifies the Mapper, requesting the re-optimization of the infrastructure. 4.2.2 Management plane Nubilum divides D-Cloud resources into three categories: server, network, and storage. The server category comprises all the devices that can host virtual machines. The network resources represent links, network devices (routers and switches), and protocols that compose the underlying topology of the D-Cloud. Finally, the storage resources are the nodes dedicated to storing virtual machine images, files, or databases. Considering such division, the Management plane has different modules for controlling and monitoring each category of resources. Thus, the Server Controller and Server Collector are responsible, respectively, for the control of the servers and hosted VMs and for the acquisition of updated status from the servers and VMs. The Network Collector is responsible for obtaining updated information about the state of the network and the load of links, whereas the Network Controller is responsible for communicating decisions about the creation, modification, and removal of virtual links to the network devices. Further, this controller manages the assignment of IP and MAC addresses to virtual resources. Similarly, the Storage Controller and the Storage Collector are responsible for controlling the storage resources and collecting their status. All such resource monitoring information is sent to the D-Cloud Monitor and the Application Monitors.
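The monitoring part of the self-management loop described above can be sketched as follows. The metric names, the threshold, and the callables standing in for the Allocator interaction are illustrative assumptions.

import time

class ApplicationMonitor:
    """Periodically checks one application's status against its requirements (sketch)."""
    def __init__(self, app_id, requirements, get_status, request_resources, period_s=60):
        self.app_id = app_id
        self.requirements = requirements            # e.g. {"max_response_time_ms": 200}
        self.get_status = get_status                # callable returning current metrics
        self.request_resources = request_resources  # callable that contacts the Allocator
        self.period_s = period_s

    def check_once(self):
        status = self.get_status(self.app_id)
        if status["response_time_ms"] > self.requirements["max_response_time_ms"]:
            # Only detect the degradation and ask for more resources;
            # the allocation decision itself remains with the Mapper.
            self.request_resources(self.app_id, reason="response time above requirement")

    def run(self):
        while True:
            self.check_once()
            time.sleep(self.period_s)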
  • 46. 32 Associated with the individual controllers and collectors of each resource is the Resource Discoverer, which is responsible by the detection of resources in the D-Cloud and for maintaining information about their respective status, which will be collected by the respective monitors of computing, storage, and network resources. Finally, the Resource Maestro orchestrates the creation and removal of the virtual resources in the respective physical resources according to the decisions made by the modules at the Decision Plane. When a new request arrives to Nubilum and the Mapper module decides where the new virtual resources will be allocated, the Maestro enforces this decision by communicating it to the components involved. One important task of this module is that it must translate high level orders taken by the Mapper into low level requests that can be handled by the other modules. Such translations must be made carefully, considering the application’s requirements, once the order of the enforcement of any changes may alter the application’s execution. For example, consider a virtual network of two virtual nodes A and B and one direct virtual link between them. Positioning a new virtual node C in the middle (with one virtual link from A to C and another one between A and B) involves the coordinated creation of these two virtual links in order to minimize interruptions in the intercommunications of the virtual nodes. 4.2.3 Infrastructure plane The infrastructure plane offers tools used for the appropriate virtualization of servers, network elements and storage devices in the D-Cloud. As a result, this plane requires three modules to accomplish its tasks. The Server Virtualization is the module responsible for the effective management of resources in the physical servers. It also manages the creation and maintenance of virtual machines, i.e., it corresponds to the hypervisor installed in the physical servers. Thus, it is important to note that, differently from the Server Controller and Server Monitor modules already described, the Server Virtualization module is intended for the local control of a server and its resources. The Network Virtualization module corresponds to the platform used for the effective virtualization of the network and the associated protocols to accomplish this task. Similarly, the Storage Virtualization comprehends all the technologies used for the effective control of storage components in the D-Cloud. 4.3 Nubilum’s functional components This section describes how the modules presented in the conceptual view in Section 4.2 are rearranged to create the functional components of Nubilum. Moreover, this section provides
  • 47. 33 information about the technological solutions adopted to overcome some D-Cloud-related design issues. The functional components of Nubilum are: the Allocator (located at the Decision plane in the conceptual view), the Manager (which has modules from the Decision and Management planes), the Storage System (situated at the Infrastructure plane), the Network Devices (situated at the Infrastructure plane), and the Workers (situated between the Management plane and the Infrastructure plane). Figure 7 illustrates these five components and their respective modules. At the top are the Allocator and the Manager, responsible, respectively, for taking decisions in the allocation of incoming requests and for the overall control of the D-Cloud resources. The other three components are responsible for operational tasks in each type of physical resource. Figure 7 Functional components of Nubilum 4.3.1 Allocator The Allocator (shown in Figure 8) is responsible for handling the resource requests made by developers and for mapping the requested virtual resources onto the physical resources of the D-Cloud. Figure 8 Schematic diagram of Allocator's modules and relationships with other components
  • 48. 34 The Allocator needs to be as simple as possible in order to preserve system performance, since the optimization problems for resource allocation tend to be computationally intensive. Therefore, it has just the Mapper module (part of the Decision plane), which communicates with the Manager, through a REST-based API, to obtain information about the D-Cloud state and to send commands that enforce new decisions. The Negotiator module of the Decision plane is the external interface of Nubilum, which offers a REST API for communication with developers and the provider. 4.3.2 Manager The Manager is the central component in Nubilum that coordinates the overall control of the D-Cloud and maintains updated information about the status of all its resources. It has four modules from the Management plane (Network Controller, Network Collector, Resource Discoverer, and Resource Maestro) and two modules from the Decision plane (Application Monitor and D-Cloud Monitor), which are represented in Figure 9. This figure also shows the communication with other components [44]. Figure 9 Schematic diagram of Manager's modules and relationships with other components The Resource Discoverer module uses a simple process for discovering servers and network devices based on the prior registration of each individual device at the Manager. Thus, the known address of the Manager must be manually configured on the Workers and the Network Devices. Status information about the Network Devices is acquired through periodic queries to the Openflow Collector module, while the status of each Worker is sent by the Worker itself when a configured threshold is crossed. All this information is handled and maintained by the Resource Discoverer module. The Network Controller and Network Collector modules implement the functions of control and status collection of network devices individually. These modules assume the usage of Openflow as the platform for network virtualization. This platform was chosen because of its versatility: it allows configuring Network Devices in order to set up a path in the physical network corresponding to the virtual link of an application. These modules can be implemented using current APIs that support the Openflow protocol or, alternatively, by invoking Openflow controllers such as NOX [29].
  • 49. 35 4.3.3 Worker The basic component of Nubilum is the Worker. This component is installed on each physical server and its main function is to manage the entire life-cycle of the virtual machines hosted on the hypervisor. A schematic view of the relationship between the modules of this component is presented in Figure 10. Figure 10 Schematic diagram of Worker modules and relationships with the server system From the Infrastructure plane, the Workers execute the Server Virtualization module. The Workers are required to support third-party hypervisors such as Xen, VMWare, or KVM. To avoid implementing, within the Worker, a different driver for each of these hypervisors, the Libvirt [5] open-source API is used. Libvirt provides a common, generic, and stable layer for VM management and offers an abstraction to Cloud providers, since it supports execution on different hypervisors such as Xen, KVM, VirtualBox, and VMWare ESX. Through Libvirt, the Workers can effectively control and monitor several aspects of virtual machines, which can be simply consumed by the other modules. The advantage obtained when using Libvirt is evident, since this API offers a hypervisor-agnostic alternative for virtual machine control. In addition, Libvirt provides an abstraction of other aspects associated with the virtual machine, such as storage and network manipulation. Workers have several modules. For the control and monitoring of the physical server and the virtual machines hosted on it, there are the Server Controller and the Server Collector modules. The Server Controller is responsible for the overall tasks involved in the creation, maintenance, and destruction of virtual machines, while the Server Collector maintains records of the status of the
  • 50. 36 server and the virtual machines hosted on it. To accomplish these tasks, these modules coordinate the execution of the other modules hosted on the Worker, call the hypervisor through the Libvirt API, and invoke the operating system through some host monitoring API available in the programming language (in the case of Java, one could use the JavaSysMon system call library [33]). The Storage Controller and Storage Collector modules manage the devices used for storing virtual machine images. They maintain the storage pool while creating and deleting virtual machine images. For this, the modules use Libvirt, which offers a common interface to manage many different types of storage spaces, ranging from iSCSI devices to a simple directory in the same machine. Thus, these modules can access any type of storage, making Nubilum independent of the actual underlying storage scheme used on the D-Cloud; the only assumption made is that all files are potentially accessible from every machine in the D-Cloud. More about the storage in the D-Cloud can be found in the corresponding section. The Workers are responsible for the effective assignment of IPs and MACs to the virtual machines; this task is performed by the Network Controller module, which is composed of a built-in DHCP server and an LLDP (Link Layer Discovery Protocol) agent. The DHCP module is dynamically configured with a direct mapping between MAC and IP numbers, which is used to give reserved IPs to each virtual machine hosted on the server. Please note that the presence of several DHCP servers (one per Worker) does not cause interference between them, since each built-in DHCP server will only respond to requests from the virtual machines hosted on its Worker, whereas requests with unknown MAC addresses will be dropped. The LLDP agent is employed to advertise the LLDP messages used in the discovery of the Worker's links. The use of LLDP for discovery is detailed in Section 4.4.2. In addition to the modules used for the effective control and monitoring of the server and virtual machines, the Worker has an Interface for communicating with the Manager component. Such interface is a REST-based Web Service. 4.3.4 Network Devices Nubilum considers using a network virtualization platform for the setup and maintenance of the virtual links across the system. In the current version of Nubilum, it is required that all network devices in the system implement Openflow for network virtualization purposes. The advantage of adopting such a solution is that not only is there native support for the creation of virtual links, but also no adaptations must be made to the network devices. This component has only one module from the Conceptual View: the Network Virtualization module.
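Returning to the Worker described in Section 4.3.3, the sketch below shows how its Server Controller and Server Collector could use the Libvirt Python bindings to start a virtual machine and read basic status information. The connection URI and domain name are only examples, and error handling is omitted.

import libvirt

# Connect to the local hypervisor (the URI depends on the hypervisor, e.g. "qemu:///system" for KVM).
conn = libvirt.open("qemu:///system")

# Server Controller: start a previously defined virtual machine.
dom = conn.lookupByName("vm-example")
if not dom.isActive():
    dom.create()

# Server Collector: read basic state and resource usage of the VM and the host.
state, max_mem, mem, vcpus, cpu_time = dom.info()
host_info = conn.getInfo()  # [model, memory in MB, number of CPUs, MHz, ...]
print("VM state:", state, "memory (KiB):", mem, "vCPUs:", vcpus)
print("host memory (MB):", host_info[1], "host CPUs:", host_info[2])

conn.close()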
  • 51. 37 4.3.5 Storage System The Storage System covers the Storage Virtualization module, which is responsible for virtualizing all the storage resources spread across the D-Cloud. As stated earlier, Nubilum does not enforce the usage of any particular storage technology, but entrusts this to Libvirt's coverage of storage technologies. The only requirement is that all the virtual machine images are available to all servers in the D-Cloud. This is necessary because Nubilum depends on current virtual machine migration techniques, which have this particular requirement. Please note that the D-Cloud should not employ centralized storage solutions, but it can employ distributed storage solutions, such as the Sector project [28], a file system designed for distributed storage over WANs. 4.4 Processes Several processes in Nubilum are performed as a result of the coordination between the components. The present section discusses these processes, which can be organized into three classes: initialization processes, discovery processes, and resource request processes. 4.4.1 Initialization processes The initialization process of the components that make up Nubilum is simple, since each component is configured before its initialization. The first component to be initiated is the Storage System, as it is a third-party system. Similarly, the network devices must be configured with routes for accessing the overall physical resources in the system and with the indication of the Openflow controller address (i.e., the known IP address of the Manager component). The connection follows the standardized process specified in the Openflow documentation [52]. The last infrastructural component, the Worker, is installed on each server of the D-Cloud and its configuration includes the Manager's address for registration. This configuration also includes information about storage pools, geographical location, and base images to be used for virtual machine creation. When started, the Worker connects to the local Libvirt API and obtains information about the server and the virtual machines currently hosted on it. Using such information, the Worker prepares a description of the local server and the virtual machines it hosts and then initiates a simple registration process with the Manager, through the REST interface. If the Manager is not initialized yet, the Worker will try again after a random sleep period. The Manager can be initialized either on a reserved server or on a virtual machine hosted by a server. Its initialization process opens an HTTP server for listening to REST messages and waits for connections from the Workers and the Allocator components. Also, an Openflow controller is started to
  • 52. 38 listen for incoming connections of Network Devices. As with the Manager, the Allocator can be initialized on any server of the D-Cloud or, alternatively, on a virtual machine. The address of the Manager is configured before the Allocator's initialization. 4.4.2 Discovery and monitoring processes The resource discovery comprises the processes for finding new resources and acquiring their updated status. The initialization processes of Workers and Network Devices, which involve the registration of both with the Manager, are the first part of the discovery processes that find new resources in the D-Cloud. However, such processes are not sufficient to discover the connections between physical resources in the D-Cloud. Therefore, Nubilum employs a discovery strategy supported by NOX, which makes use of LLDP messages generated by the Manager and sent by each Network Device to its neighbors [24]. The link discovery strategy is illustrated in Figure 11. First, the Manager sends an Openflow message (number 1 in the figure) to a switch in the network requesting the forwarding of an LLDP packet to each of its neighbor switches. When such neighbors receive the LLDP packet (arrows marked as 2), they generate a new Openflow message (marked as 3) to the Manager reporting the reception of this specific LLDP packet. The same process is executed for all switches concurrently, and the LLDP packet of each switch is specific to that switch in order to identify each link. Such a strategy guarantees the discovery of all links between Network Devices in the D-Cloud, since it is assumed that all Network Devices are Openflow-enabled. This process was extended to capture LLDP messages sent by Workers in order to discover links between servers and Network Devices. Figure 11 Link discovery process using LLDP and Openflow Collecting information about the status of each resource involves retrieving information about the Storage System, the Network Devices, and the Workers. The performance status of the Network
  • 53. 39 Devices is monitored by the Manager in a passive fashion by polling the resources according to a configured period. Such polling is made through well-defined Openflow messages, which can report several counters to Nubilum, for example, the number of received packets or bytes per virtual link in a network device (flows, in Openflow terminology). Unlike the Network Devices, the Workers actively send status information through the Storage and Server Collector modules. The information offered by the Worker is closely related to the support provided by Libvirt. It is possible to obtain information about CPU and memory usage and the used disk space of the server and the virtual machines. 4.4.3 Resource allocation processes The resource allocation processes are started when developers, the Application Monitors, or the D-Cloud Monitor contact the Allocator requesting, respectively, resources for a new application, more resources for an existing application, or the reallocation of resources to meet the provider's requirements. Figure 12 shows all messages exchanged in the resource allocation process when a developer requests resources for a new application. The developer sends to the Allocator a POST message containing a CloudML document (see Chapter 5 for details about this language) describing the requirements of its application, i.e., a virtual network with virtual machines, their geographical position, and virtual links. After that, the Allocator sends a GET message to the Manager to retrieve the current status of the entire D-Cloud, which will be used as input to the resource allocation algorithms. Figure 12 Sequence diagram of the Resource Request process for a developer
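Before detailing the rest of the sequence, it is worth sketching what the developer-facing step of Figure 12 might look like in practice. The host, port, and path below are assumptions made only for illustration, since the concrete REST resources are defined by the Negotiator's API (Chapter 5).

import requests

ALLOCATOR = "http://allocator.example.org:8080"  # assumed address of the Negotiator/Allocator API

# CloudML document describing the requested virtual network (content abbreviated).
cloudml_request = """<?xml version="1.0"?>
<Application> <!-- virtual machines, geographic constraints, and virtual links go here --> </Application>"""

resp = requests.post(f"{ALLOCATOR}/app",
                     data=cloudml_request,
                     headers={"Content-Type": "application/xml"})

# The reply is also a CloudML document stating whether the allocation succeeded and the assigned IPs.
print(resp.status_code)
print(resp.text)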
  • 54. 40 After the resource allocation algorithm execution, the Allocator sends a PUT message to the Manager indicating what resources (and their respective configuration) must be dedicated to this given developer. Then, the Manager sends POST messages to each Worker in the D-Cloud that will host the requested virtual resources, receives respective replies, and sends Openflow messages to each Network Device informing them of the flows for the setup of new virtual links. Then, the Manager sends a reply PUT to the Allocator to confirm the allocation, and finally, the Allocator returns a reply POST to the developer indicating if it was possible to allocate their application and its respective IP address. The processes triggered by the D-Cloud Manager or the Application Managers are similar to the one presented in Figure 12, except that these modules use the PUT instead of the POST method in the initial message. 4.5 Related projects This chapter presented the implementation guidelines of Nubilum – a resource management system that offers a solution for challenges related to allocation, discovery, control, and monitoring of resources for ISP-based Distributed Clouds (please see Section 3.1 for more details). The system manages a D-Cloud as an IaaS Cloud offering developers an environment that manages the entire life-cycle of their applications ranging in scope from the request of resources, which are allocated into a virtualized infrastructure of scattered geo-distributed servers connected by network devices, to their removal from the D-Cloud. In order to obtain optimal or quasi-optimal solutions while maintaining scalability, Nubilum has a centralized decision ingredient with two high-level components: the Allocator and the Manager; and a decentralized control ingredient with three infrastructural components: Workers, Network Devices, and the Storage System. It also introduces a clear separation between the enforcement actions and the intelligence roles played by the Manager and the Allocator, respectively. The Manager offers an abstraction of the overall D-Cloud to the Allocator, which, in turn, is described in a high-level perspective, where only the functionalities and communication processes were defined. Thus, the proposed system intends to remain open for different model-driven resource allocation strategies. Some architectural ideas present in Nubilum are similar to the ones presented by open source resource management systems for Cloud Computing like, Eucalyptus, OpenNebula, and Nimbus, which are compared in [24]. These systems propose solutions to some problems that can arise in a D-Cloud scenario, thus providing good starting solutions for the design of resource management systems for D-Clouds. Beyond the centralized resource management, such systems also are based in
  • 55. 41 open interfaces and open tools like, respectively, Web Service-based interfaces (REST in case of Nubilum) and Libvirt as a hypervisor-agnostic solution. Particularly from OpenNebula, Nubilum leverages the idea of a conceptual stacked view separated from a functional view, as well as the separation between the decision and the control tasks. Despite these similarities, those systems have a lack of direct support to D-Clouds, mainly with regard to the virtualization of the network, which is a main aspect of D-Clouds discussed in this Thesis. Differently from the above-mentioned systems, the RESERVOIR project proposes a model, architecture, and functionalities for Open Federated Cloud Computing, which is a D-Cloud scenario where a Cloud provider may dynamically partner with others to provide a seemingly infinite resource pool [59]. To achieve this goal, RESERVOIR leverages virtualization technologies and embeds autonomous management into the infrastructure. To cope with networking aspects, the architecture designs a scheme of Virtual Ethernet [30], which offers isolation while sharing network resources between the federated Clouds composing the system. In contrast, following the design decision to use existing standards, Nubilum employs Openflow as a solution for network virtualization. An initiative that works with an ISP-based D-Cloud scenario is the research project called GEYSERS [22]. This project extends standard GMPLS-related traffic engineering protocols to virtual infrastructure reservation, i.e., for the integrated allocation of computational and networking resources. Nubilum adopts a different approach, working with two different standards for communication: REST-based interfaces combined with Libvirt for the allocation of server and storage resources and Openflow for allocating network links. The integration of these standards is made by the internal algorithms of the Manager. Such strategy is interesting since the system can be deployed without support of additional protocols. Also working with an ISP-based D-Cloud, one of the SAIL project objectives is to provide resource virtualization through the CloNe (Cloud Networking) architecture [60]. CloNe addresses the networking aspect in Cloud Computing and introduces the concept of a FNS (Flash Network Slice), a network resource that can be provisioned on a time scale comparable to existing compute and storage resources. The implementation of this concept already is an object of study in the SAIL project, but some solutions were presented ranging from VPNs to networks based on Openflow devices. Thus, the network virtualization solution presented in Nubilum, which is based on Openflow, can be viewed as a first implementation of the FNS proposed by CloNe. Another distinction between Nubilum and other existing resource management systems/architectures is that Nubilum is focused on the resource management properly whereas
  • 56. 42 other practical aspects, such as security and fault tolerance, were not considered, although Nubilum could be extended to cover those important aspects. Hitherto, one specific criticism that can be made of Nubilum is that some aspects of the system were described generically. One example is its openness to the usage of different algorithms for resource allocation, which makes it difficult to see how the system can guarantee the design decisions related to self-optimization. Such aspects will be discussed in the next chapters, which will further specialize the architectural specifications. Chapter 5 details the overall communication protocols and their respective messages used in the control plane of the system. Basically, the next chapter will describe the HTTP and Openflow messages used by the components, as well as the modeling language CloudML. Furthermore, algorithms for resource allocation derived for specific cases will be presented in Chapter 6.
  • 57. 43 5 Control Plane “Si quam rem accures sobrie aut frugaliter, solet illa recte sub manus succedere.” Plautus Persa This chapter details and evaluates the control plane of Nubilum. It consists of two main elements: the HTTP and Openflow messages for communication between the components in Nubilum, and the Cloud Modeling Language (CloudML) that is used to describe services, resources and requirements. Such elements are intrinsically correlated since the language represents data that is exchanged by the HTTP messages and that defines the number and format of the Openflow messages. These also specify how developers and the provider interact with the D-Cloud and how the effective resource allocation is made on the D-Cloud. Following the above division of the control plane, this chapter is split in three main sections: it starts by presenting CloudML and discussing its characteristics with respect to similar description languages in Section 5.1; next, in Section 5.2, the communication interfaces and protocols are explained; finally, Section 5.3 evaluates the overall control plane solution. 5.1 The Cloud Modeling Language As explained in Chapter 3, the representation of user requirements and cloud resources is the first challenge to face when considering an automatic resource management system. This chapter introduces the Cloud Modeling Language (CloudML), an XML-based language intended to cope with the aforementioned required representations. CloudML is proposed to model service profiles and developer’s requirements, while at the same time, to represent physical and virtual resource status in D-Clouds. This language was previously introduced in [26]. Considering previous languages ([9], [39], [32], [16], [73], [47]) for the representation of resources and their respective limitations (more about this topic is discussed in Section 5.1.3), it was decided to design the Cloud Modeling Language (CloudML) to ensure three clear objectives: a) the language must represent all physical and virtual resources in a D-Cloud, including their current state; b) the proposed language must be able to model the service supported by the provider; and c) the language must represent a developer’s requirements while also relating them to the provider’s services.
  • 58. 44 In order to grasp how CloudML offers the integration of such objectives, please consider Figure 13. This figure depicts a scenario with three actors: the application developer, the Cloud provider, and Nubilum. Further, the figure shows the interactions between these actors through the use of the respective descriptions in CloudML. Figure 13 Integration of different descriptions using CloudML First, the Cloud provider should describe all services offered by the D-Cloud, generating a service description document (step number "1" in the figure). Next, a Cloud developer may use these descriptions (step number "2") to verify whether its requests can be satisfied by the concerned D-Cloud. Note that CloudML allows different D-Cloud providers to publish their respective service descriptions. In this way, a developer may choose between different providers according to its own criteria and convenience. Once a Cloud provider is selected, the developer composes a document describing its requirements and may then submit its requests to the D-Cloud (step number "3"), more specifically to the Cloud System Management, which will ultimately allocate resources according to the requested resources and the current status of the D-Cloud. At any time, the provider may asynchronously request the status or the description of resources (step number "4") of the D-Cloud. This same information is the input of the resource management process performed by Nubilum. As defined in Section 4.1.1, Nubilum works as an IaaS Cloud. Thus, in CloudML the developers' applications are treated as virtual networks and the provider's infrastructure is the physical network. Another important design aspect is the use of XML technologies as the underlying structure to compose CloudML documents. By adopting this well-established technology, in
  • 59. 45 contrast to newer ones such as JSON [13], it is possible to use solutions and APIs present in the XML ecosystem to guarantee syntax correctness, querying of documents, and other facilities. The rest of this section is organized as follows: Section 5.1.1 presents and details the CloudML language and its XML Schemas; Section 5.1.2 illustrates the use of CloudML in a simple scenario; finally, a qualitative comparison and discussion between CloudML and existing languages are given in Section 5.1.3. 5.1.1 CloudML Schemas The XML Schemas of CloudML were divided into three groups: schemas for resource description, schemas for service description, and schemas for requirements description. For didactical purposes, these schemas are presented through intuitive diagrams generated by the Web Tools Platform (an Eclipse plug-in [18]) instead of the tangled XML Schema code. Resource Description This first group has two subgroups: one for short reports on the status of resources, and another that provides a complete description of resources. The basic XML element for reporting resources' status is the NodeStatusType (Figure 14), which represents the status of both physical servers and virtual machines (called just nodes in our language). This type is composed of two required attributes (CPU and RAM) and a sequence of Storage elements. The CPU and RAM attributes are expressed as percentage values, while the Storage element has a type defining the absolute value of the used space (Size), the Unit relative to this space (KB, MB, GB, etc.), and a logical ID, since a node can have many drives for storage. Figure 14 Basic status type used in the composition of other types
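To make the NodeStatusType concrete, the fragment below builds, with Python's xml.etree.ElementTree, a status report with CPU and RAM given as percentages and one Storage entry. The element and attribute spellings follow the schema description above, but the exact XML serialization shown here is illustrative rather than normative.

import xml.etree.ElementTree as ET

status = ET.Element("Status", CPU="35", RAM="60")                    # percentage values
ET.SubElement(status, "Storage", ID="disk0", Size="12", Unit="GB")   # absolute used space

print(ET.tostring(status, encoding="unicode"))
# Expected output (Python 3.8+):
# <Status CPU="35" RAM="60"><Storage ID="disk0" Size="12" Unit="GB" /></Status>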
  • 60. 46 already described. The PhyNodeStatusType is similar to the one for virtual nodes, except for the omission of the VMState and Owner attributes. Figure 15 Type for reporting status of the virtual nodes The NodesStatusType gives information about the status of the whole resources managed by the Worker. The NodesStatusType has only one root element called Nodes, as showed in Figure 16. Figure 16 XML Schema used to report the status of the physical node One basic element for complete descriptions of physical nodes is the PhysicalNodeType (Figure 17). This type has the ID attribute and four inner elements: NodeParams, PhyIface, VirNodeID, and VirEnvironment. Figure 17 Type for reporting complete description of the physical nodes The NodeParametersType (Figure 18) describes relevant characteristics including: node parameters (memory, processor, and storage), its geographical location (Location element), its
  • 61. 47 Functionality on the network (switch, server, etc…), its current status (which is an element of the type NodeStatusType) and the OtherParams for general use and extension. Figure 18 Type for reporting the specific parameters of any node Here, there are two aspects that should be highlighted. First, the Location is the element that enables a provider to know where resources are geo-located in the infrastructure. Second, the OtherParams element can be used by providers or equipment vendors to extend CloudML including other parameters not covered by this current version. In this way, CloudML presents itself as an extensible language. The PhysicalInterfaceType (Figure 19) is an extension of InterfaceType and is used to describe physical links associated to the interface (PhysicalLinksID element) and virtual interfaces (VirtualInterfacesID element) also related to the physical node. Such interfaces can be, for example, interfaces from virtual nodes. The general InterfaceType has an ID, MAC, IPv4, and IPv6 as attributes that are inherited by the PhysicalInterfaceType.
  • 62. 48 Figure 19 Type for reporting information about the physical interface As part of the PhysicalNodeType, the VirNodeID is a simple list of the IDs of the virtual machines hosted on the node, and the VirEnvironment is a list containing information about the virtualization environment. Each item in the list informs its CPU architecture (32 or 64 bits), the virtualization method (full or paravirtualized), and the hypervisor. Thus, an item indicates a type of virtual machine supported. The VirtualNodeType (Figure 20) gives a complete description of a virtual machine and is similar to the physical node. The VirtualInterfaceType also inherits from the InterfaceType, and the VirEnvironment contains only two attributes: one indicating the hypervisor and the other indicating the virtualization mode of the VM. Figure 20 Type for reporting information about a virtual machine
  • 63. 49 The VirNodesDescription and the NodesDescription are lists similar to the ones defined into the Status XML Schemas. The InfraStructureType (Figure 21) is composed by a PhyInfra element and zero or more VirInfra elements. The element PhyInfra is a PhysicalInfraStructureType and corresponds to the collection of physical nodes and links. The VirtualInfraStructureType indicates virtual infrastructures currently hosted by the physical infrastructure. Figure 21 Type for reporting information about the whole infrastructure The PhysicalInfraStructureType (Figure 22) has an ID attribute and is composed by two elements: PhyNode and PhyLink; which clearly represent the nodes (computers, switches, etc…) and their connections (cable links, radio links, etc…), respectively. The PhyNode element is of the type PhysicalNodeType which was already described whereas the PhyLink is of the type PhyisicalLinkType (Figure 23). Figure 22 Type for reporting information about the physical infrastructure
  • 64. 50 Figure 23 Type for reporting information about a physical link The PhysicalLinkType describes physical links between physical nodes. It has an ID attribute, a LinkParams element, and zero or more VirLinkID elements. The LinkParametersType, just as the NodeParametersType, supports all relevant characteristics of the link, which include: link technology (Ethernet, Wi-Fi, etc…), capacity, the current status (current delay, current allocated rate and current bit error rate), and also an extensible element (OtherParams) serving for future extension purposes. The VirLinkID element identifies the virtual links currently allocated on this physical link. Similarly to the physical infrastructure there is a type dedicated towards the collection of virtual nodes and virtual links called VirtualInfraStructureType (Figure 24). It has an ID, an Owner attribute (identifying the developer who owns these virtual resources) and can be composed of one or more VirNode elements (of the described VirtualNodeType) and several VirLink elements of the VirtualLinkType, which is very similar to the type for physical links. Figure 24 Type for reporting information about the virtual infrastructure Service Description CloudML provides a profile-based method for describing a provider’s services, which are described by an XML Scheme whose root element is Services from the ServiceType (Figure 25). This type has a Version attribute and a sequence of Node and Link profiles elements. The Node element
  • 65. 51 consists of the nodes profiles that are described by the NodeProfileType whereas the Link element uses the LinkProfileType. A Coverage element, from the CoverageType, is also described. The NodeProfileType uses the MemoryType for the RAM and Storage elements and the CPUProfileType for CPU. The first has two attributes indicating the amounts of memory and the second has three attributes indicating the following aspects: CPU frequency, number of cores, and CPU architecture. The LinkProfileType has only three intuitive attributes: ID for identification of the profile, the Rate for reserved rate, and the Delay for the maximum delay. The CoverageType is intended to inform the geographical areas that the provider covers. Thus, this type is just a sequence of Location identified by three attributes: Country, State, and City. It is important to notice that there is a Location element in NodeParametersType (already explained) used to geo-locate nodes in the infrastructure. With these two elements (Location and Coverage) a provider is able to identify the geographical location of its resources, which allows the provider to offer location-aware services. Figure 25 Type describing the service offered by the provider Request Description Developers describe their application requirements through a request document, whose root element is the RequestType (Figure 26). Such type is composed by three attributes: an ID, an
  • 66. 52 Owner, and a Tolerance (delay value in milliseconds), which expresses how far the virtual nodes can be placed from their required location specified on the Node element. The NodeSpecType and LinkSpecType have an attribute to indicate their ID and an attribute to indicate the ID of the corresponding of the respective profile (described in the ServiceType) chosen by the developer. The NodeSpecType has also a Location element indicating where the node must be positioned, which is defined using the LocationType. The LinkSpecType has several Node elements indicating the ID of the requested nodes that the link will be connecting. Figure 26 Type describing the requirements that can be requested by a developer 5.1.2 A CloudML usage example This section considers an example of the CloudML usage. Through CloudML, the provider is able to list the different profiles using the service description model, informing nodes and links configurations. A developer should inform its requirements using the requirements description document. Moreover, Nubilum uses the description and status documents to describe physical and virtual resources with regard to the D-Cloud. These can be used for internal communication between equipments and components of the system and for external communication with the provider. Next, these XML documents are described in more details. Please notice that some irrelevant parts of the XML documents were omitted for a better visualization. Services XML The XML document shown in Figure 27 represents the service defined by the D-Cloud provider. There are two node profiles (nodeprofile01 and nodeprofile02), two link profiles (linkprofile01 and linkprofile02), and three Coverage items with Country, State, and City specifications. The nodeprofile01 is a node running the Linux operating system with 2 GB of RAM, 80 GB of storage and acts as a server. The linkprofile01 represents a link with maximum delay and
  • 67. 53 capacity equal to 0.150 ms and 1.5 Mbps, respectively. The linkprofile02 is similar to that of linkprofile01. The Coverage is defined as a set of Country, State, and City elements. In this case, the Cloud provider informs that developers can request resources located at three different localities: “Brazil, São Paulo, Campinas”, “Brazil, Pernambuco, Recife”, or “Brazil, Ceará, Fortaleza”. Figure 27 Example of a typical Service description XML The nodeprofile02 is from a different type of the nodeprofile01. This node profile is a router as is indicated in the Functionality tag. In this Thesis, such type of node indicates an abstract entity used to create virtual links for a specific location without use of a more robust virtual node. Thus, a developer can request the use of this node to ask for a link with guarantees to attend a specific geographical region. Request XML This example considers a developer making a simple request for two nodes and one link spanning between them. To compose this document, the developer should read the Service description offered by the provider, select the correspondent profile, and then make its request. The XML document in Figure 28 represents such simple request. The node01 is from nodeprofile01 and is located at the state of “São Paulo”. The node02 is at “Recife” and it is a router. The link is linkprofile01. Figure 28 Example of a Request XML
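Since the XML listing of Figure 28 is not reproduced here, the following sketch gives a rough idea of what such a request document could look like and how it could be inspected programmatically. It is illustrative only: the element and attribute names (Request, Node, Link, Profile, Location) are assumptions made for this example, and the authoritative structure is the one defined by the RequestType, NodeSpecType, and LinkSpecType schemas presented earlier.

# Illustrative only: element and attribute names below are assumptions for the sake
# of the example; the authoritative structure is given by the CloudML schemas
# (RequestType, NodeSpecType, LinkSpecType) described above.
import xml.etree.ElementTree as ET

request_xml = """
<Request ID="app01" Owner="developer01" Tolerance="50">
  <Node ID="node01" Profile="nodeprofile01">
    <Location Country="Brazil" State="Sao Paulo" City="Campinas"/>
  </Node>
  <Node ID="node02" Profile="nodeprofile02">
    <Location Country="Brazil" State="Pernambuco" City="Recife"/>
  </Node>
  <Link ID="link01" Profile="linkprofile01">
    <Node>node01</Node>
    <Node>node02</Node>
  </Link>
</Request>
"""

root = ET.fromstring(request_xml)
for node in root.findall("Node"):
    loc = node.find("Location")
    print(node.get("ID"), node.get("Profile"), loc.get("City"))
for link in root.findall("Link"):
    endpoints = [n.text for n in link.findall("Node")]
    print(link.get("ID"), link.get("Profile"), "connects", endpoints)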
  • 68. 54 Description XML The description document represents the infrastructure of the D-Cloud, including all physical and virtual nodes. Depending on the size of the D-Cloud, this document can be very long. Thus, for a better visualization, the next examples will illustrate only some parts of CloudML: the physical infrastructure, the virtual infrastructure, and the virtual links. The XML document at Figure 29 presents a complete description and status of all physical nodes and physical links, with the first <PhyNode> tag informing resource characteristics (like CPU, RAM…) for node 100. The node 101 description was omitted, since it is similar. Figure 29 Physical infrastructure description
  • 69. 55 The <VirNodeID> tag informs the IDs of the virtual nodes that are running at the specific physical node. In this case, according to our example, only the virtual node node01 is running at physical node 100. There are also two physical links (<PhyLink> tags). The physical link phylink01 has virtual link virlink01 associated to it. Further information about this link was omitted here and will be described in the next Figure. Figure 30 shows the description and the status of all the virtual nodes and virtual links of a specific owner in the D-Cloud. Particularly, this example shows how the virtual network allocated resources after receiving the request in Figure 28. Please note that this description is very similar to the physical infrastructure. The virtual node node01 has many characteristics, such as RAM, CPU, storage, network interface, and virtual environment. In this case, as the two virtual nodes that resulted from the same type, the virtual node node02, omitted in the document, is a simple virtual node with the Router functionality. Figure 30 Virtual infrastructure description
  • 70. 56 Furthermore, this example is also about the description and the status of all virtual links established in the D-Cloud. The virtual link virlink01 has information, such as technology and rate, described in the <LinkParams> tag. Note that virlink01 was referenced previously in the physical infrastructure description as a link associated to a physical one. 5.1.3 Comparison and discussion Some alternatives for resource and requirements description in D-Clouds were presented at Section 3.2.1. In this section, CloudML characteristics will be contrasted, and a comparison will be done in order to discuss these languages, highlighting their advantages and weaknesses. CloudML presented in this work was developed to be a vendor-neutral language for resource/request description on D-Clouds. Such neutrality is obtained both internally and externally to the D-Cloud. First, in the internal viewpoint, CloudML brings a common language that can be implemented by different Cloud equipment vendors in order to integrate their different solutions. Certainly, such integration cannot be made without some common protocol implemented by the vendors, but CloudML offers a common terrain for data representation that is a crucial step towards interoperability. Moreover, CloudML supports vendors’ innovation offering flexibility through the use of the OtherParams element in the description of virtual and physical nodes and links. Such optional field can be used by different vendors to convey private information in order to tune equipments of the same vendor in the infrastructure. This characteristic is similar to OpenStack. In the second and external viewpoint, the supported neutrality allows developers to request services from different D-Cloud providers in order to compare characteristics from each one and choose the appropriated services for their tasks. Here, it is important to notice that these providers should use some standardized interface, such as OCCI, to handle this information model. All the languages covered in Section 3.2.1 describe, in some way, computational and network resources in the Cloud. Service description is also a common place for description languages. However, these services are described in different manners. For example, the CloudML uses profiles to represent distinct types of nodes and links that compose services; the VXDL is itself a representational way to describe Grid applications; the OpenStack uses flavors idea, but it is restricted to computational resources. Request description is not treated by the VRD. One interesting aspect of CloudML is that of geo-location. With this information, the Cloud may offer services with location-awareness. This point is also covered by the VXDL, VRD, and Manifest languages, but this aspect is described without details in the respective works.
  • 71. 57 In addition to these points, the main CloudML characteristic is the description integration. With CloudML, different Cloud providers may easily describe their resources and services and make them available to developers. Thus, developers may search for the most suitable Cloud to submit their requests to. 5.2 Communication interfaces and protocols Figure 31 highlights the communication protocols used by Nubilum’s components, which were introduced in Chapter 4. Note that the figure omits the Storage System since it is controlled through the same interface available at Workers. Basically, two different protocols are used for communication: the HTTP protocol employed by the REST interfaces available in the Allocator, Manager, and Worker components, and the Openflow protocol for communication with network devices. Together those protocols cope with the integrated control and monitoring of all physical and virtual resources in the D-Cloud. The HTTP protocol is also used by the D-Cloud provider to describe the supported services and by the developers to submit their requests to the system. Figure 31 Communication protocols employed in Nubilum The next sections will detail the communication protocols employed in Nubilum. First, Section 5.2.1 will show the REST interfaces of each component: Allocator, Manager, and Worker. After that, Section 5.2.2 will discuss how Openflow is used by Nubilum to set up virtual links in the physical network. 5.2.1 REST Interfaces As shown in Section 5.1, the XML Schemas defining CloudML were divided into three groups: schemas for resource description, schemas for service description, and schemas for requirements description. Here, these schemas are mapped onto seven MIMEtypes (Table II) that describe the specific data types (XML documents) to be used by the REST interfaces of Nubilum.
  • 72. 58 Table II MIMEtypes used in the overall communications
Mimetype | Description
cloudml/nodesstatus+xml | Status of physical and virtual nodes
cloudml/nodesdescription+xml | Description of physical and virtual nodes
cloudml/virnodedescription+xml | Description of a single virtual node
cloudml/infradescription+xml | Description of the entire D-Cloud
cloudml/virinfradescription+xml | Description of a particular virtual infrastructure
cloudml/servicesdescription+xml | Description of the service defined by the Provider
cloudml/appdescription+xml | Description of a developer’s request for their application
The cloudml/nodesstatus+xml is used to exchange volatile status information between entities, and it refers to a short description of the status of all nodes managed by a Worker, i.e., the server (referred to as a physical node) and the virtual machines (or virtual nodes). The other MIMEtypes offer complete descriptions of the resources. The cloudml/nodesdescription+xml (corresponding to the NodesDescription XML Schema) gives a long description of the physical node and virtual nodes, and the cloudml/virnodedescription+xml (corresponding to the VirNodesDescription XML Schema) refers to the complete description of a specific virtual node hosted on the server. The type cloudml/infradescription+xml refers to the entire D-Cloud and it shows all the servers, physical links, virtual machines, and virtual links hosted on the D-Cloud, whereas the cloudml/virinfradescription+xml refers to the complete description of the resources in a virtual infrastructure. The cloudml/servicesdescription+xml and the cloudml/appdescription+xml describe the service offered by the provider and the developer’s requests, respectively. The REST interfaces were defined according to five URL resources: the virnode resource is used to describe operations available at virtual machines; the worker resource is used for registering and unregistering Workers at the Manager; the dcloud resource is the representation of the entire D-Cloud infrastructure; the services resource is used to configure the profiles of virtual machines and virtual links that can be requested by developers; and the app resource is used to represent operations for requesting, updating, and releasing resources of a particular application. The next sections offer more details about those resources. Allocator The interface available at the Allocator component can be accessed by developers and the provider. The developers make use of operations to submit their requests to Nubilum, whereas the provider can configure the services offered by Nubilum. As stated earlier, through the services resource the provider can configure the profiles of virtual machines and virtual links that can be requested by
  • 73. 59 developers. The operations are used to retrieve and update the list of services offered by the D- Cloud provider, as showed in Figure 32 and Figure 33. The operation for updating the offered services uses a PUT method on the URL “/services” and it carries into the body the new parameters of the service in an XML document following the cloudml/servicesdescription+xml MIMEtype. Developers use the GET operation to request the description of the current services supported by the provider. URL http://allocator_ip/services/ Method GET Returns 200 OK & XML (cloudml/servicesdescription+xml) 401 Unauthorized 404 Not Found Figure 32 REST operation for the retrieval of service information URL http://allocator_ip/services Method PUT Request Body XML (cloudml/servicesdescription+xml) Returns 201 Created & Location 400 Bad Request 401 Unauthorized 422 Unprocessable Entity Figure 33 REST operation for updating information of a service Using the app resource, a developer can submit, delete, or update an application (Figure 34, Figure 35, and Figure 36, respectively). Such operations use the cloudml/appdescription+xml MIMEtype, which describes the virtual network, i.e., virtual nodes and virtual links requested by the developer. Moreover, some error messages were defined also to deal with exception cases: 400 for syntax errors in the XML document sent; 422 for any errors on the information contained in the XML, e.g., invalid identification numbers or server characteristics; and 401 to report authentication errors. URL http://allocator_ip/app Method POST Request Body XML (cloudml/appdescription+xml) Returns 201 Created & XML (cloudml/virinfradescription+xml) 400 Bad Request 401 Unauthorized 409 Conflict 422 Unprocessable Entity Figure 34 REST operation for requesting resources for a new application
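As an illustration of how a developer could invoke the operation of Figure 34, the sketch below submits a request document to the Allocator using a generic HTTP client. The host name, file name, and error handling are placeholders; only the URL path, HTTP method, MIMEtype, and status codes follow the interface defined above.

# Sketch of a developer submitting a request to the Allocator's app resource
# (Figure 34). The host name and document file are placeholders; only the URL
# path, method, MIMEtype, and status codes follow the interface above.
import requests  # third-party HTTP client, assumed available

ALLOCATOR = "http://allocator_ip"  # placeholder address

with open("request.xml", "rb") as f:          # a cloudml/appdescription+xml document
    body = f.read()

resp = requests.post(
    ALLOCATOR + "/app",
    data=body,
    headers={"Content-Type": "cloudml/appdescription+xml"},
)

if resp.status_code == 201:
    # On success the Allocator returns the allocated virtual infrastructure
    # as a cloudml/virinfradescription+xml document.
    print(resp.text)
else:
    print("request rejected:", resp.status_code)  # 400, 401, 409 or 422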
  • 74. 60 URL http://allocator_ip/app/[id] Method PUT Request Body XML (cloudml/appdescription+xml) Returns 201 Created & XML (cloudml/virinfradescription+xml) 400 Bad Request 401 Unauthorized 409 Conflict 422 Unprocessable Entity Figure 35 REST operation for changing resources of a previous request URL http://allocator_ip/app/[id] Method DELETE Returns 204 No Content 401 Unauthorized 404 Not Found Figure 36 REST operation for releasing resources of an application These operations define the external interface of Nubilum, but other standardized interfaces may be used instead for the same purpose of external communication, such as OCNI (Open Cloud Networking Interface), an extension of OCCI proposed by [60] to cover network requirements. Manager The Manager’s interface has five operations that manipulate two resources. The worker resource is used by Workers for registering and unregistering at the Manager (Figure 37 and Figure 38, respectively). The operation for registration uses a HTTP POST method whose body section contains a XML document corresponding to an entire description of the server and the virtual machines. The second operation over the worker resource is used by Workers to unregister from the Manager. In this operation, the Worker must inform the URL previously sent by the Manager in the POST operation, which includes an ID for this Worker. The third operation (Figure 39) is employed by Workers to update the Manager with information about the server and their virtual machines. URL http://manager_ip/worker Method POST Request Body XML (cloudml/nodesdescription+xml) Returns 201 Created & Location 400 Bad Request 401 Unauthorized 422 Unprocessable Entity Figure 37 REST operation for registering a new Worker
  • 75. 61 URL http://manager_ip/worker/[id] Method DELETE Returns 204 No Content 401 Unauthorized 404 Not Found Figure 38 REST operation to unregister a Worker URL http://manager_ip/worker/[id] Method PUT Request Body XML (cloudml/nodesstatus+xml) Returns 201 Created & Location 400 Bad Request 401 Unauthorized 404 Not Found 422 Unprocessable Entity Figure 39 REST operation for update information of a Worker The other two operations available at the Manager’s interface operate over the overall dcloud resource, which is a representation of the entire D-Cloud infrastructure (Figure 40 and Figure 41). A first operation is intended to retrieve the complete description of the D-Cloud whereas the other one is used to submit a new description of the D-Cloud, which will then be enforced onto the physical resources by the Manager. Both operations are used by the Allocator. Please note that these operations use the cloudml/infradescription+xml MIMEtype. URL http://manager_ip/dcloud Method GET Returns 200 OK & XML (cloudml/infradescription+xml) 401 Unauthorized 404 Not Found Figure 40 REST operation for retrieving a description of the D-Cloud infrastructure URL http://manager_ip/dcloud Method PUT Request Body XML (cloudml/infradescription+xml) Returns 200 OK 400 Bad Request 401 Unauthorized 422 Unprocessable Entity Figure 41 REST operation for updating the description of a D-Cloud infrastructure
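A minimal sketch of how the Allocator could use the two dcloud operations during an allocation cycle is given below, assuming a hypothetical compute_new_allocation() placeholder for the Mapper logic discussed in Chapter 6; it is not the actual implementation, only an illustration of the retrieve-compute-enforce pattern that these two operations support.

# Minimal sketch of the Allocator's interaction with the Manager's dcloud resource
# (Figures 40 and 41): fetch the current D-Cloud description, compute a new
# allocation, and push the updated description back. compute_new_allocation() is a
# hypothetical placeholder for the Mapper algorithms of Chapter 6.
import requests

MANAGER = "http://manager_ip"  # placeholder address

def allocate(request_doc: bytes) -> bytes:
    # 1. Retrieve the current infrastructure description (cloudml/infradescription+xml).
    current = requests.get(MANAGER + "/dcloud")
    current.raise_for_status()

    # 2. Compute the new mapping of virtual nodes/links onto physical resources.
    new_description = compute_new_allocation(current.content, request_doc)

    # 3. Enforce the new description on the physical resources.
    update = requests.put(
        MANAGER + "/dcloud",
        data=new_description,
        headers={"Content-Type": "cloudml/infradescription+xml"},
    )
    update.raise_for_status()
    return new_description

def compute_new_allocation(infra_xml: bytes, request_xml: bytes) -> bytes:
    raise NotImplementedError("placeholder for the Mapper algorithms of Chapter 6")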
  • 76. 62 Worker The Worker interface is focused on operations on virtual machines. Through the virnode resources, the interface of the Workers’ components offers operations for the creation, updating, and removal of virtual machines, which are respectively showed at Figure 42, Figure 43, and Figure 44. URL http://worker_ip/virnode Method POST Request Body XML (cloudml/virnodedescription+xml) Returns 201 Created & XML (cloudml/virnodedescription+xml) 400 Bad Request 401 Unauthorized 422 Unprocessable Entity Figure 42 REST operation for the creation of a virtual node The operation for the creation of a virtual node carries the node parameters in an XML document following the cloudml/virnodedescription+xml MIMEtype. If executed with success, the operation returns a 201 message and a XML document containing the parameters of the allocated virtual machine. This document is very similar to the one that was passed in the request body but with some additional information like the current state of the virtual machine. The other two operations are similar to the POST operation, except that they access the new virtual node through a different URI. URL http://worker_ip/virnode/[id] Method PUT Request Body XML (cloudml/virnodedescription+xml) Returns 201 Created & Location 400 Bad Request 404 Not Found 401 Unauthorized 422 Unprocessable Entity Figure 43 REST operation for updating a virtual node URL http://worker_ip/virnode/[id] Method DELETE Returns 204 No Content 401 Unauthorized 404 Not Found Figure 44 REST operation for removal of a virtual node
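For illustration, a server-side sketch of the virnode resource is shown below using Flask as an example framework. The framework choice and the create_vm/update_vm/destroy_vm helpers are assumptions made for the sketch; only the URL paths, methods, MIMEtype, and status codes follow the interface defined above.

# Illustrative server-side sketch of the Worker's virnode resource (Figures 42-44).
# Flask is used purely as an example framework; the helper functions are placeholders
# for the calls into the virtualization environment.
from flask import Flask, request, Response

app = Flask(__name__)
VIRNODE_MIME = "cloudml/virnodedescription+xml"

@app.route("/virnode", methods=["POST"])
def create_virnode():
    description = request.data             # cloudml/virnodedescription+xml body
    allocated = create_vm(description)     # hypothetical call into the hypervisor layer
    return Response(allocated, status=201, mimetype=VIRNODE_MIME)

@app.route("/virnode/<vm_id>", methods=["PUT"])
def update_virnode(vm_id):
    update_vm(vm_id, request.data)         # hypothetical
    return Response(status=201)

@app.route("/virnode/<vm_id>", methods=["DELETE"])
def delete_virnode(vm_id):
    destroy_vm(vm_id)                      # hypothetical
    return Response(status=204)

def create_vm(description):
    return description  # placeholder: would create the VM and return its updated description

def update_vm(vm_id, description):
    pass  # placeholder

def destroy_vm(vm_id):
    pass  # placeholder

if __name__ == "__main__":
    app.run()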
  • 77. 63 5.2.2 Network Virtualization with Openflow Nubilum uses the Openflow protocol for controlling and collecting information of the physical and virtual links. As discussed in Section 4.3.2, the Manager’s modules coping with these functions are implemented in a NOX Openflow controller through a specific application, which has also a set of REST-based interfaces for communication with other Manager’s modules. The next sections will describe in details the introduced REST interfaces and will specify the process to setup the correct flows in each network device when creating virtual links. NOX REST Interfaces The REST interface in NOX has only two resources: topo, to inform the topology of the physical network; and vlink that allows the manipulation of virtual links. The topo resource is subject to a single GET operation (see Figure 45) used by the Resource Discoverer module to request the topology of the physical network, which is a result of the LLDP- based discovery processes detailed in Section 4.4.2. The operation is detailed in Figure 45. Note that when the operation proceeds correctly, a XML document from the cloudml/infradescription+xml MIMEtype is sent to the requester, but this description contains only reduced information about the physical resources and the links between them. URL http://nox_ip/topo Method GET Returns 200 OK & XML (cloudml/infradescription+xml) 404 Not Found Figure 45 REST operation for requesting the discovered physical topology Over the vlink resource, three operations were defined, namely, POST (Figure 46), PUT (Figure 47), and DELETE (Figure 48), for the creation, updating, and deleting of virtual links, respectively. A new MIMEtype is also introduced that contains information describing the virtual link: the <MAC,IP> for source and destination nodes of the virtual link, and the path of physical nodes/links that will host this virtual link. This path is informed since the calculus of the path is not a NOX’s task but is rather a task performed by the Allocator component. URL http://nox_ip/vlink Method POST Request Body XML (cloudml/virlinkdescription+xml) Returns 201 Created & XML (cloudml/virlinkdescription+xml) 400 Bad Request 422 Unprocessable Entity Figure 46 REST operation for the creation of a virtual link
  • 78. 64 URL http://nox_ip/vlink/[id] Method PUT Request Body XML (cloudml/virlinkdescription+xml) Returns 201 Created & Location 400 Bad Request 404 Not Found 422 Unprocessable Entity Figure 47 REST operation for updating a virtual link URL http://nox_ip/vlink/[id] Method DELETE Returns 204 No Content 404 Not Found Figure 48 REST operation for removal of a virtual link Virtual links setup The Openflow protocol leverages the usage of flow-based network devices that are configured by NOX. Thus, when the Manager calls NOX through a POST operation in the vlink resource, the Openflow Controller and Collector module will send appropriated Openflow messages to each network device. In our implementation, the entire D-Cloud network is an Ethernet network and each network device is a switch. Thus, the first aspect to be considered is the forwarding of ARP packets which must be flooded in the network without causing the retransmission of multiple copies of each packet. This problem can be solved by creating a spanning tree in the graph in order to allow the communication of all servers in the D-Cloud. The spanning tree can be created using the Spanning Tree Protocol (STP) which is an optional protocol that Openflow switches can implement. The use of STP is documented in the Openflow specification document [52]. In order to support switches that do not perform STP, the next procedure is used in Nubilum. Considering the physical network formed by servers, switches, and their respective links, a breadth-first search was implemented in order to find the edges of the graph pertaining to the spanning tree. Based on this, each network device is configured with a number of flows equal to the number of ports on the switch. For each port, the module checks if the link associated with the switch port is part of the spanning tree, if it does then all incoming ARP packets will be flooded to the other ports except the port from where the packet arrived. If the link associated with the port is out of the spanning tree, the ARP packets arriving from that port will be dropped. Figure 49 shows an example of the typical Openflow tuple of these flows. In the example, all ARP packets incoming from switch port 1 will be flooded to the other ports except port 1.
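A minimal sketch of the spanning-tree construction and of the per-port flood/drop decision described above is given below, before the concrete rule of Figure 49. The adjacency-list representation of the discovered topology is an assumption made for illustration and does not correspond to the actual NOX application code.

# Minimal sketch of the spanning-tree construction and the per-port ARP rule
# decision described above. The topology layout is an assumption for illustration:
# {switch: [(neighbor, local_port), ...]}.
from collections import deque

def spanning_tree_links(adjacency, root):
    # BFS over the physical topology; returns the set of links kept in the tree.
    tree, visited, queue = set(), {root}, deque([root])
    while queue:
        sw = queue.popleft()
        for neighbor, port in adjacency[sw]:
            if neighbor not in visited:
                visited.add(neighbor)
                tree.add(frozenset((sw, neighbor)))
                queue.append(neighbor)
    return tree

def arp_actions(adjacency, tree):
    # For each switch port, decide whether incoming ARP packets are flooded or dropped.
    actions = {}
    for sw, neighbors in adjacency.items():
        for neighbor, port in neighbors:
            in_tree = frozenset((sw, neighbor)) in tree
            actions[(sw, port)] = "FLOOD" if in_tree else "DROP"
    return actions

# Example: three switches in a triangle; one link is cut by the spanning tree.
topo = {
    "s1": [("s2", 1), ("s3", 2)],
    "s2": [("s1", 1), ("s3", 2)],
    "s3": [("s1", 1), ("s2", 2)],
}
tree = spanning_tree_links(topo, "s1")
print(arp_actions(topo, tree))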
  • 79. 65 Figure 49 Example of a typical rule for ARP forwarding
Switch Port: 1 | MAC src: * | MAC dst: * | Eth type: ARP | VLAN ID: * | IP src: * | IP dst: * | TCP sport: * | TCP dport: * | Action: FLOOD
In Nubilum, virtual links are implemented by a set of flows configured in the network devices that form a path between the two virtual nodes at the borders of this virtual link. This path is calculated based on the developer’s requirements and the specific goals of the allocation algorithms implemented in the Mapper module at the Allocator component, and no optimization is done in NOX. Thus, the creation of virtual links is based on the XML document received. As explained in the previous section, each request indicates the 2-tuple <MAC,IP> for the source and destination nodes of the virtual link as well as the path (a list of network devices). Each network device in the path will be configured with two specific flows for bidirectional communication. The flows have the characteristics presented in Figure 50. Note that these flows include information about the MAC and IP addresses of the virtual machines connected by the virtual link in order to restrict the virtual link to these machines.
Figure 50 Example of the typical rules created for virtual links: (a) direct, (b) reverse
(a) Switch Port: 1 | MAC src: 1:1:1:1:1:1 | MAC dst: 2:2:2:2:2:2 | Eth type: IP | VLAN ID: * | IP src: 1.1.1.1 | IP dst: 2.2.2.2 | TCP sport: * | TCP dport: * | Action: Output to port 2
(b) Switch Port: 2 | MAC src: 2:2:2:2:2:2 | MAC dst: 1:1:1:1:1:1 | Eth type: IP | VLAN ID: * | IP src: 2.2.2.2 | IP dst: 1.1.1.1 | TCP sport: * | TCP dport: * | Action: Output to port 1
In Section 5.1.2, the CloudML example introduced the use of virtual routers for the creation of virtual links towards certain geographical locations. A link with a virtual router receives a different treatment in Nubilum since it is intended to route data from a virtual node to clients and vice versa. Thus, NOX creates two flows similar to the ones presented in Figure 50, but containing only the MAC and IP addresses of the virtual node. Thus, traffic incoming from clients in the geographical region can be routed to the virtual node. 5.3 Control Plane Evaluation As discussed in this Chapter, Nubilum’s processes generate several control messages for monitoring and allocating resources, which naturally introduce load in the network. Thus, two
  • 80. 66 questions can be posed: “What is the impact of these control messages on the system?” and “How does this scale up with the number of Workers and network elements?”. Considering both questions, this section evaluates the control load (total message size in bytes) generated by the system through some simple and useful models derived from measurements made on a prototype of the system. The prototype of Nubilum was implemented in a testbed. This prototype comprises all the components of the system: Allocator, Manager, Worker, Network Device, and Storage System. All the communication interfaces that make up Nubilum’s control plane were also implemented as described in the previous sections. After implementation, measurements were taken on this prototype to determine the number and size of HTTP and Openflow messages generated by three distinct events: the resource allocation for a new request to the system; the status update of the physical resources (Workers and Network Devices); and the release of resources of a developer’s application. Please note that this section does not evaluate the algorithms for resource allocation (which will be evaluated through simulations in Chapter 6); it only measures the control load introduced by Nubilum in order to evaluate the impact that the system causes on the network due to the communication interfaces. The messages exchanged between the components in Nubilum were measured in an actual prototype of the system. One can divide these control messages into two major groups: HTTP messages, used for communication with the developer, Allocator, Manager, and Worker, and Openflow messages, used for communication with the network devices. Furthermore, these messages can be divided into three sub-groups according to their associated events: messages for application allocation, messages for application release, and messages for status update. Table III shows the number of bytes generated by the control messages between each component of the system for each type of event. Each pair of lines in the table represents one interface of the system: one between the Developer and the Allocator, another between the Allocator and the Manager, others between the Manager and each Worker, and a last one between the Manager and each Network Device. The size of the messages depends on the specific parameters: VN = number of virtual nodes, PN = number of physical nodes, VL = number of virtual links, PL = number of physical links, IF = infrastructure description (in bytes, given by IF = 300+734*PN+857*VN+389*PL+314*VL), P = number of ports in the network device. Thus, the length of an allocation message sent from Developer to Allocator (first line in the table), for example, is proportional to the number of virtual nodes (VN) and to the number of virtual links (VL) required by the Developer. Furthermore, it is
  • 81. 67 important to note that for each message there is a fixed number (in bytes) that represents the HTTP header length, a fixed XML part, and the parameters’ description length. For example, with regard to the message between the Developer and the Allocator, one can note that 505 bytes is the length of the HTTP header and the fixed-size XML part, 84 is the length of one virtual node description, and 74 is the length of one virtual link description. If there are 10 VN and 5 VL, the total length of an allocation message submitted by a Developer to an Allocator is 1715 bytes.
Table III Models for the length of messages exchanged in the system in bytes
Interface | Allocation event | Release event | Update event | Type
Developer → Allocator | 505+84*VN+74*VL (GET) | 161 (DELETE) | N/A | HTTP
Allocator → Developer | 537+857*VN+314*VL (Reply GET) | 46 (Reply DELETE) | N/A | HTTP
Allocator → Manager | 120 (GET), 221+IF (PUT) | 221+IF (PUT) | N/A | HTTP
Manager → Allocator | 237+IF (Reply GET), 242+IF (Reply PUT) | 242+IF (Reply PUT) | N/A | HTTP
Manager → Worker | 978 (POST) | 169 (DELETE) | 639+180*VN (PUT) | HTTP
Worker → Manager | 1024 (Reply POST) | 46 (Reply DELETE) | 130 (Reply PUT) | HTTP
Manager → Network Device | 320 | 288 | 20 | Openflow
Network Device → Manager | N/A | 352 | 12+104*P | Openflow
Examining Table III, one can note that the HTTP messages are bigger than the messages generated by the Openflow protocol. This is expected, since the HTTP messages carry CloudML documents, whose text format introduces flexibility but tends to be bigger than a byte-oriented protocol. Another interesting aspect that can be highlighted is that the messages between Allocator and Manager are bigger than other HTTP messages. This is due to the large amount of data that needs to be exchanged between these two components because of the way the system was designed. This needs to be done in order to allow the separation between the component that allocates new requests and the one that gathers information about the whole system. Another reason for this design choice is to allow the Allocator to be stateless with respect to the status of the resources. Thus, each time a new request arrives the Allocator must acquire the current status of the D-Cloud. Notice also that part of this infrastructural information that needs to be exchanged provides valuable feedback about the developers’ applications, as the Allocator acts as an interface between developers and the whole system. The table also reveals that the control load increases linearly with the size of the physical infrastructure (number of Workers and Network Devices) as well as the size of the requests. This is an important validation, as it shows that there is no unusual increase of the control traffic in the system as the number of Workers grows, which is an aspect that affects the system’s scalability.
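The models of Table III can be turned into small helper functions, as sketched below; the code simply reproduces the formulas of the table and checks the worked example above (10 virtual nodes and 5 virtual links yield a 1715-byte allocation request).

# Small helpers reproducing the message-length models of Table III.
def infra_description_length(PN, VN, PL, VL):
    # IF = 300 + 734*PN + 857*VN + 389*PL + 314*VL (bytes)
    return 300 + 734 * PN + 857 * VN + 389 * PL + 314 * VL

def developer_to_allocator_alloc(VN, VL):
    # Allocation request from Developer to Allocator (first line of Table III).
    return 505 + 84 * VN + 74 * VL

def allocator_to_developer_alloc(VN, VL):
    # Allocation reply from Allocator to Developer (second line of Table III).
    return 537 + 857 * VN + 314 * VL

print(developer_to_allocator_alloc(10, 5))   # 1715 bytes, as in the example above
print(infra_description_length(2, 2, 2, 1))  # size of a small infrastructure description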
  • 82. 68 6 Resource Allocation Strategies “Fugiunt amici a male gerente res suas.” Schottus, Adagia Section 3.2 detailed four research challenges related to resource management on D-Clouds: Resource Modeling, Resource Offering and Treatment, Resource Discovery and Monitoring, and Resource Selection and Optimization. The first three research challenges were discussed in the context of Chapter 4 and Chapter 5. In a complementary form, this chapter will discuss the fourth challenge about resource allocation. The problem of finding the best mapping of distributed applications onto a physical network is not a trivial task due to the dynamicity, high algorithm complexity, and all the different requirements (from developers and the provider) that must be contemplated by Nubilum. Moreover, this problem is not restricted to the allocation of an incoming application, but it also involves the continuous monitoring and maintenance of the application in order to guarantee developer’s requirements. This chapter focuses on problems for resource allocation on D-Clouds. The first problem investigated (in Section 6.1) is relative to the allocation of resources for the management entities of the Cloud. Specifically, the Manager Positioning Problem will be addressed and solved. After that, Section 6.2 discusses the way Nubilum deals with the problem of allocating virtual networks and evaluates an algorithm for load balance. Section 6.3 presents algorithms for the creation of a virtual network when the request is composed only by the virtual nodes. Finally, Section 6.4 discusses and summarizes the main results obtained in this chapter. 6.1 Manager Positioning Problem Recall, from Chapter 4, that Nubilum is composed by five components: Allocator, Manager, Workers, Network Devices, and the Storage System. As described earlier, the Manager is at the center of the entire system and maintains continuous communication with all the other components. This implicates that the relative position of the Manager in the network can influence the performance of the entire system, since choosing a bad one will certainly increase communication costs with part of the network and it may also cause delayed responses to urgent system’s
  • 83. 69 operational queries, hence leading to performance degradation events. The problem of finding an optimal position for the Manager entity is referred to here as the Manager Positioning Problem (MPP). Let us begin by supposing, without loss of generality, that the communication between the Manager and the Allocator is of minor importance in comparison to the communication with the many Workers and Network Devices. This can be achieved, for example, if the two components are executed in the same physical node. Assume, finally, that the D-Cloud is composed of nodes able to work as Network Devices and Workers at the same time. This assumption only makes the MPP more general, since each node in the D-Cloud is capable of hosting the Manager. Figure 51 shows an example of a D-Cloud containing ten Worker/Network Device nodes (represented by W nodes) and the Manager (the overlaid M node) already allocated in one of these nodes. Furthermore, each link has an associated delay that is shown in the figure as annotated values over the links. Thus, D_{i,j} is defined as the delay of the shortest path between nodes i and j, considering the delays as link weights. Note that this value is defined to be zero when i = j and is positive otherwise. Figure 51 Example of a D-Cloud with ten Workers and one Manager The MPP can be defined as determining the position of the Manager in the D-Cloud in order to minimize the cost function Σ_{i∈N} D_{i,M}, where N is the set of nodes of the graph and M is the node where the Manager is hosted. Please note that this problem is equivalent to solving the 1-median problem, which is seen as a very simple form of the replica problems introduced in Section 3.2.4. This problem can be solved by calculating the all-pairs shortest paths in a weighted graph, which can be obtained with the Floyd-Warshall algorithm, whose complexity is O(V^3) [41], where V is the number of nodes. Using the distances between every pair of nodes, the sum of the distances can be calculated for each node in the graph that is a candidate for positioning the Manager. The node that has the minimum sum of distances is the solution to the MPP. The next example will detail how this simple algorithm can be used in a particular instance of the MPP.
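The selection step itself can be sketched in a few lines, assuming the all-pairs delay matrix D has already been computed (e.g., with the Floyd-Warshall algorithm); the example below then works through a concrete instance.

# Minimal sketch of the Manager positioning step: given the all-pairs shortest-path
# delay matrix D (computed beforehand, e.g., with Floyd-Warshall), pick the node
# whose total delay to every other node is minimal (the 1-median).
def place_manager(D):
    costs = [sum(D[i][j] for i in range(len(D))) for j in range(len(D))]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs

# Usage: best_node, cost_vector = place_manager(D)
# For the matrix of the example below, cost_vector[4] (about 2.59) is the minimum,
# so the fifth node is chosen, as discussed in the text.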
  • 84. 70 Considering the example in Figure 51 and using one of the algorithms for all-pairs shortest path computation, one can obtain the distance matrix below (nodes are numbered from left to right and from top to bottom):
D = [ 0     0.3   0.1   0.6   0.11  0.21  0.31  0.36  0.61  0.81
      0.3   0     0.2   0.7   0.21  0.31  0.41  0.46  0.71  0.91
      0.1   0.2   0     0.5   0.01  0.11  0.21  0.26  0.51  0.71
      0.6   0.7   0.5   0     0.51  0.61  0.71  0.76  1.01  1.21
      0.11  0.21  0.01  0.51  0     0.1   0.2   0.25  0.5   0.7
      0.21  0.31  0.11  0.61  0.1   0     0.3   0.35  0.6   0.8
      0.31  0.41  0.21  0.71  0.2   0.3   0     0.05  0.3   0.5
      0.36  0.46  0.26  0.76  0.25  0.35  0.05  0     0.35  0.55
      0.61  0.71  0.51  1.01  0.5   0.6   0.3   0.35  0     0.2
      0.81  0.91  0.71  1.21  0.7   0.8   0.5   0.55  0.2   0   ].
Summing up the values in each column gives a cost vector in which each element is the value of the cost function when M is hosted at the respective node, and the minimal value indicates the best node to allocate the Manager. For the matrix above, the cost vector is [3.41 4.21 2.61 6.61 2.59 3.39 2.99 3.39 4.79 6.39] and the minimal cost is 2.59, corresponding to the fifth node, where the Manager is already allocated as shown in Figure 51. Please note that the problem can be simplified by considering that only a restricted number of nodes act as Workers. In this case, the same solution can be applied by simply restricting the path computations to the Worker nodes and using links and intermediary nodes as the inputs for the calculation. Also, the same idea can be applied to reduce the number of path computations by considering only Workers that have at least one virtual machine already allocated, since the status of these nodes must be maintained more accurately than that of servers that have no running VMs. One may note that if static delay values are used, the path matrix can be computed only once, saving execution time. However, a more practical approach would be to measure the delay over time according to the traffic in the D-Cloud. In this case, the same solution could be used repeatedly while considering a delay sample in time (or any function over this value, including functions over historic values) as the input for the new path matrix computation. 6.2 Virtual Network Allocation In Nubilum, developers submit their requirements to the system as a virtual network, composed of virtual nodes and virtual links. A priori, Nubilum’s resource model (condensed in the CloudML) supports a wide range of virtual network algorithms while considering diverse characteristics such as geo-locality, network delay, Worker capacity, network capacity, and so on. Such characteristics are used
  • 85. 71 as inputs of the resource allocation algorithms and they can figure as constraints of the problem or as part of the objective function. Note that Nubilum neither restricts the objective functions that can be used by the allocation algorithms nor constrains the type of the algorithm (please see Section 0 for discussion on the topic of Network Virtualization). However, Nubilum confines the algorithms to work with the characteristics of their resource model that are summarized in Table IV. Table IV Characteristics present in Nubilum’s resource model Components Characteristics Physical Network Physical Nodes Memory, Storage, CPU Geo-location Supported virtualization environment Functionality Physical network interfaces Already allocated virtual machines Physical Links Capacity and allocated capacity Current delay Current bit error rate Technology Already allocated virtual links Requested Virtual Network Virtual Nodes Memory, Storage, CPU Geo-location and tolerance delay OS type Functionality Virtual Links Required rate Maximum delay Allocated Virtual Network Virtual Nodes Memory, Storage, CPU Geo-location Functionality VM state Owner Virtual Links Capacity and current rate Current delay Current maximum bit error rate Owner The characteristics in the table allow that several resource allocation algorithms for virtual networks can be implemented in the Mapper module (see Section 4.3.1) with just minor modifications. For example, the algorithms D-ViNE and R-ViNE proposed in [10] consider aspects as node capacity, node location, maximum tolerance distance, and link capacity for the physical and the requested virtual networks. The algorithms consider those aspects in a abstract way, but associating the abstract node capacity with a concrete characteristic as Memory, Storage, and/or CPU and considering the distance measured in delay, these algorithms could be implemented without any additional modifications. Similarly other proposals cited in [31] and [3] could be adapted for use in Nubilum.
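For illustration, the sketch below shows one possible way of representing a subset of these characteristics as plain data structures handed to a Mapper algorithm; the field names are assumptions made for the sketch and cover only a handful of the entries of Table IV.

# Illustrative data structures for (part of) the characteristics of Table IV as they
# might be passed to a Mapper algorithm; field names are assumptions for the sketch,
# not CloudML element names.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class PhysicalNode:
    node_id: str
    cpu: float                         # remaining CPU
    memory: float                      # remaining memory
    storage: float                     # remaining storage
    location: Tuple[str, str, str]     # (country, state, city)
    functionality: str                 # e.g. "Server" or "Router"
    hosted_vms: List[str] = field(default_factory=list)

@dataclass
class PhysicalLink:
    link_id: str
    endpoints: Tuple[str, str]
    delay: float                       # current delay
    hosted_vlinks: List[str] = field(default_factory=list)

@dataclass
class VirtualNodeRequest:
    node_id: str
    cpu: float
    memory: float
    storage: float
    location: Tuple[str, str, str]
    functionality: str

@dataclass
class VirtualLinkRequest:
    link_id: str
    endpoints: Tuple[str, str]
    max_delay: float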
  • 86. 72 This section presents a resource allocation algorithm that works with a subset of the characteristics supported by Nubilum. This algorithm has the objective of guaranteeing the load balancing of the physical resources (servers and links). The problem is to allocate an incoming virtual network in the physical network in order to balance the load of virtual resources allocated on physical resources, subject to some constraints; i.e., this problem can be better defined as one that allocates the virtual network in order to minimize the sum of two components: the usage of resources at physical nodes and the maximum number of virtual links in a given physical link. The problem considers restrictions associated with virtual nodes and virtual links. In the case of virtual nodes, memory, storage, CPU, geo-location, and functionality constraints are considered, whereas in the case of virtual links the problem considers only a required maximum delay of the physical path. Table V shows this set of characteristics. Note that, differently from other approaches, the present problem does not consider link capacity, since unrestricted capacity seems to provide better insights and more useful results in a pay-as-you-go business plan.
Table V Reduced set of characteristics considered by the proposed allocation algorithms
Physical Network - Physical Nodes: Memory, Storage, CPU; Geo-location; Functionality; Already allocated virtual machines
Physical Network - Physical Links: Current delay; Already allocated virtual links
Virtual Network Request - Virtual Nodes: Memory, Storage, CPU; Geo-location; Functionality
Virtual Network Request - Virtual Links: Maximum delay
In the following, Section 6.2.1 defines the problem stated above in a formal way. In order to solve the problem, the solution is divided into two steps: virtual node allocation, which is covered in Section 6.2.2, and the allocation of virtual links, discussed in Section 6.2.3. The proposed algorithms are evaluated in Section 6.2.4. 6.2.1 Problem definition and modeling The physical or substrate network is represented by an undirected graph G^S = (N^S, E^S). Each substrate node v^S ∈ N^S has three distinct capacities: c(v^S), m(v^S), and s(v^S), representing the CPU, memory, and storage remaining capacities on each substrate node. Each substrate link has an associated delay represented by the function d^S: E^S → R. The current stress s(e^S) of a substrate link e^S ∈ E^S is the number of virtual links that already pass through this substrate link. Each virtual network will be given as a graph G^V = (N^V, E^V). Each virtual node v^V ∈ N^V has three distinct capacity requirements: c(v^V), m(v^V), and s(v^V), representing CPU, memory and
  • 87. 73 storage consumption. For each virtual link e^V ∈ E^V, a maximum delay given by a function d^V: E^V → R is associated. Each virtual node will be assigned to a substrate node. This assignment can be seen as a function M_N: N^V → N^S. For each virtual node v ∈ N^V, the set N^S(v) ⊆ N^S indicates the physical nodes where this virtual node can be allocated. Each virtual link will be assigned to a substrate simple path between the corresponding substrate nodes that host the virtual nodes at both ends of that virtual link. Being P^S the set of simple paths of G^S and P^S(u, w) the set of simple paths of G^S between physical nodes u and w, this assignment can be seen as a function M_E: E^V → P^S such that if e^V = (u, w) ∈ E^V, then M_E(e^V) ∈ P^S(u, w). If p ∈ P^S is any simple path on G^S, then its delay d^S(p) is the sum of the delays of each physical link of the path. For e^S ∈ E^S, the stress S(e^S) generated by the allocation of G^V is defined as the number of virtual links e^V ∈ E^V that pass through the substrate link e^S, i.e., the number of virtual links e^V ∈ E^V such that e^S is a substrate link of the path given by M_E(e^V). Given a substrate network G^S = (N^S, E^S) and a virtual network G^V = (N^V, E^V), the problem is to find mappings M_N: N^V → N^S and M_E: E^V → P^S in order to minimize:
max_{e^S ∈ E^S} [ s(e^S) + S(e^S) ] + max_{v ∈ N^S} [ c(v) − Σ_{M_N(u)=v} c(u) ] + max_{v ∈ N^S} [ m(v) − Σ_{M_N(u)=v} m(u) ] + max_{v ∈ N^S} [ s(v) − Σ_{M_N(u)=v} s(u) ]
Subject to the constraints:
1) ∀v ∈ N^V, M_N(v) ∈ N^S(v)
2) ∀e = (u, w) ∈ E^V, M_E(e) ∈ P^S(M_N(u), M_N(w))
3) ∀e ∈ E^V, d^S(M_E(e)) ≤ d^V(e)
4) ∀v ∈ N^S, Σ_{M_N(u)=v} c(u) ≤ c(v)
5) ∀v ∈ N^S, Σ_{M_N(u)=v} m(u) ≤ m(v)
6) ∀v ∈ N^S, Σ_{M_N(u)=v} s(u) ≤ s(v)
The objective function aims to minimize the sum of the maximum link stress and the remaining CPU, memory, and storage, thus balancing the load. The first constraint (1) implies that each virtual node
  • 88. 74 will be allocated in a required physical node, according to the given geo-location restriction. The second constraint (2) enforces the virtual link between virtual nodes u and w to be allocated in the path between the physical nodes in which these virtual nodes are allocated. The third constraint (3) implies that the delay of the virtual link will be respected in the path in which it is allocated. The last three constraints (4), (5), and (6) enforce that the capacity restrictions for virtual nodes are fulfilled. 6.2.2 Allocating virtual nodes The problem of allocating virtual nodes on multiple physical nodes considering multiple constraints can be reduced to that of the multidimensional multiple knapsacks, which is NP-complete [36]. Therefore, the algorithm for virtual node allocation proposed in this Thesis uses a greedy approach, as shown in Figure 52. This algorithm consists of selecting, for each virtual machine of a new request, the appropriate Workers according to the storage, memory, and CPU capacities associated with the virtual nodes and also the geo-location constraints. The algorithm starts by sorting the virtual nodes (VMs) and the Workers that will be processed. This sorting is made in the following order: highest free memory, highest free CPU, and lastly highest free storage space (line 4 for virtual machines and line 6 for Workers). For each virtual machine, the algorithm tries to greedily allocate it to the Worker with the most remaining capacities that satisfies its location restriction (modularized in a simple check in line 8). In other words, this check takes a given virtual node and verifies whether this virtual node can be allocated in the current Worker being tested. If it is possible, the virtual node is allocated in this Worker (line 9) and the algorithm will try to allocate the other virtual nodes; if it is not possible, the next Worker in the list of Workers will be tested. The algorithm allocates the virtual machines that need more resources first. Note that the algorithm performs admission control: if a virtual machine of the list cannot be mapped to a Worker, then the entire request is rejected (line 13).
Algorithm 1: Allocation of virtual nodes
1 Inputs: vmList, workerList;
2 Outputs: vmMapping;
3 begin
4   sortByDecreasingValueOfResources (vmList);
5   for each vm in vmList do
6     sortByRemainingResources (workerList);
7     for each worker in workerList do
8       if( tryToAllocate (vm, worker) ) then
9         vmMapping.Add(vm, worker);
10        update( worker );
11        stop;
12      end
13    if vm not in vmMapping then stop;
14  end
15 end
Figure 52 Algorithm for allocation of virtual nodes
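A compact Python rendering of Algorithm 1 is sketched below. The dictionary-based resource fields and the simplified location check are assumptions made for the sketch; the pseudocode of Figure 52 remains the authoritative description.

# Sketch of the greedy virtual node allocation of Algorithm 1; data layout and the
# location check are simplified assumptions for illustration.
def allocate_virtual_nodes(vm_list, worker_list):
    # Returns {vm_id: worker_id} or None when the request must be rejected.
    vm_mapping = {}
    # Sort VMs by decreasing demand: memory first, then CPU, then storage (line 4).
    vm_list = sorted(vm_list, key=lambda vm: (vm["memory"], vm["cpu"], vm["storage"]),
                     reverse=True)
    for vm in vm_list:
        # Re-sort Workers by remaining capacity before each placement (line 6).
        worker_list.sort(key=lambda w: (w["memory"], w["cpu"], w["storage"]),
                         reverse=True)
        for worker in worker_list:
            if can_host(vm, worker):                    # feasibility check (line 8)
                vm_mapping[vm["id"]] = worker["id"]     # line 9
                for r in ("memory", "cpu", "storage"):  # line 10: update capacities
                    worker[r] -= vm[r]
                break
        if vm["id"] not in vm_mapping:                  # admission control (line 13)
            return None
    return vm_mapping

def can_host(vm, worker):
    # Capacity check plus a simplified geo-location check (exact match here, instead
    # of the delay-tolerance test used by Nubilum).
    fits = all(worker[r] >= vm[r] for r in ("memory", "cpu", "storage"))
    return fits and worker["location"] == vm["location"]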
  • 89. 75 6.2.3 Allocating virtual links Regarding link allocation, the algorithm focuses on minimizing the maximum number of virtual paths allocated over a physical link while obeying the restriction on the maximum delay. In order to achieve link load balancing when allocating on physical links, a minimax path with delay restriction algorithm was used to determine the paths that will be allocated. A minimax path between two Workers is the path that minimizes the maximum weight of any of its links. The link weight is given by the number of virtual paths that are already allocated on the physical link. So, the minimax path has the minimum maximum link weight. Figure 53 illustrates the minimax path concept considering the paths between two nodes S and T. The numbers annotated on the links and separated by a semicolon are, on the left, the link weight for the minimax path calculation and, on the right, the delay for checking the restriction. In the example, the path costs of P1 and P2 are 4 and 2, respectively, which are the maximum weights of the links in each path. The minimax path is P2, but P1 is the chosen path when considering the delay restriction. Figure 53 Example illustrating the minimax path (P1 traverses links annotated 2;0.1 and 4;0.2, P2 traverses links annotated 2;0.2 and 1;0.3; C_P1 = max(2,4) = 4 and C_P2 = max(2,1) = 2) To determine the minimax path with delay restriction between two nodes, one can execute a binary search on a vector ranging from the minimum link weight to the maximum link weight of the graph. In each iteration of the binary search, a weight k is selected and all links with weight greater than k are removed from the graph. The Dijkstra algorithm, considering the delay as the cost, is then executed to verify if there is a path between the two nodes in this new graph that is constrained by the maximum delay. If there is such a path in this graph, the upper bound is adjusted to the current k. Conversely, if there is no path in this graph, the lower bound is adjusted to k. The binary search is repeated until the vector is reduced to two elements. Thus, k is chosen as the element in the upper bound, and Dijkstra is used again to verify the delay constraint. If no path is found for that link, it cannot be allocated in this network. This simple algorithm is shown only to illustrate a solution to the problem, but more efficient algorithms can be used, like the ones proposed for the bottleneck shortest path problem, which is a closely related problem [34]. Similarly to the problem of virtual node allocation, the allocation of several virtual links focusing on minimizing the number of virtual links on each physical link is an NP-hard optimization problem (please see [74] to verify this aspect); this is the reason why the solution presented here is a greedy approach based on some heuristics.
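The minimax path with delay restriction search described above can be sketched as follows; the graph representation is an assumption made for illustration, and the small example reuses the weights and delays of Figure 53.

# Sketch of the minimax-path-with-delay-restriction search: binary search over the
# link-weight values, with a delay-weighted Dijkstra run at each step. Graph layout
# is an assumption: {node: [(neighbor, weight, delay), ...]}, where weight is the
# number of virtual links already allocated on the physical link.
import heapq

def delay_shortest_path(graph, src, dst, max_weight):
    # Dijkstra on delay, ignoring links whose weight exceeds max_weight.
    # Returns the total delay of the best path, or None if dst is unreachable.
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, weight, delay in graph[u]:
            if weight > max_weight:
                continue
            nd = d + delay
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return None

def minimax_path_with_delay(graph, src, dst, max_delay):
    # Returns the smallest weight bound k such that a path from src to dst exists
    # using only links of weight <= k and respecting max_delay, or None.
    weights = sorted({w for edges in graph.values() for _, w, _ in edges})
    feasible = None
    lo, hi = 0, len(weights) - 1
    while lo <= hi:                       # binary search over candidate bounds
        mid = (lo + hi) // 2
        d = delay_shortest_path(graph, src, dst, weights[mid])
        if d is not None and d <= max_delay:
            feasible, hi = weights[mid], mid - 1
        else:
            lo = mid + 1
    return feasible

# Example with the values of Figure 53: P1 has weights 2 and 4, P2 has weights 2 and 1.
g = {
    "S": [("A", 2, 0.1), ("B", 2, 0.2)],
    "A": [("S", 2, 0.1), ("T", 4, 0.2)],
    "B": [("S", 2, 0.2), ("T", 1, 0.3)],
    "T": [("A", 4, 0.2), ("B", 1, 0.3)],
}
print(minimax_path_with_delay(g, "S", "T", max_delay=0.6))  # 2: P2 is feasible
print(minimax_path_with_delay(g, "S", "T", max_delay=0.4))  # 4: only P1 meets the delay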
In the virtual link allocation algorithm (Figure 54), each iteration computes a minimax path with delay restriction for each virtual link not yet allocated. If one of those virtual links cannot be allocated, then the entire process is stopped (lines 7 and 8). The virtual link with the largest minimax cost is allocated in the current iteration. The key idea behind this heuristic is that this is the most restrictive virtual link. The next iterations continue with this behavior, but the calculation of the minimax path takes into account the virtual link just allocated, now added to the graph. Thus, this greedy algorithm performs link allocation considering the minimax path in order to achieve load balancing without violating the delay restriction.

Algorithm 2: Allocation of virtual links
1 Inputs: vmMapping, vLinkList, physicalLinksState;
2 Outputs: linkMapping;
3 begin
4 while (vLinkList)
5 for each vlink in vLinkList do
6 status = minimaxPathWithDelay (vlink, vmMapping, physicalLinksState);
7 if status = reject then
8 stop;
9 end
10 selectedVLink = largestMinimaxCost;
11 linkMapping.Add( allocate(selectedVLink));
12 vLinkList.remove( selectedVLink );
13 update( physicalLinksState );
14 end
15 end

Figure 54 Algorithm for allocation of virtual links

6.2.4 Evaluation

An event-driven simulator for the allocation of virtual networks was implemented to compare the performance of the algorithm proposed by this Thesis (Minimax Path Algorithm – MPA) with the algorithm of Zhu and Ammar (ZAA), already described in Section 3.2.4. This specific evaluation is interesting since the performance of ZAA is a reference for virtual network allocation problems with load balancing. This fact can be seen in the results shown by Chowdhury et al. [10]. Among their contributions, Chowdhury et al. proposed a virtual network allocation algorithm with load balancing that achieved worse results than ZAA. Moreover, a Hybrid solution that applies the ZAA heuristic for node allocation and the MPA strategy for link allocation will also be tested. The Hybrid strategy thus allows comparing just the link allocation strategy of MPA against the link allocation of ZAA, showing the impact of the minimax strategy. The simulator performs an event-driven simulation, whose main events are the arrival and departure of virtual networks in a D-Cloud. The D-Cloud is represented as a topology composed of Workers connected by links. The simulation model considers a simplified version of the model introduced in the previous section, i.e., the several constraints such as link delay, CPU, memory, storage, and geo-location were removed from this simulation model in order to evaluate the MPA in
the same scenario for which ZAA was developed. Thus, the simulation model considers only the stress, which is defined for Workers and physical links. In the case of a Worker, the stress is the number of virtual nodes currently allocated on that Worker; for a physical link, the stress is the number of virtual links currently allocated on that physical link. In this way, the node allocation strategy of MPA was modified to consider only the stress as the metric for optimization, disregarding the specific aspects captured by the CPU, memory, storage, and geo-location restrictions. Thus, the algorithm is similar to the least-load strategy compared in the Zhu and Ammar paper [74], with the difference that MPA's link allocation algorithm uses the minimax path while the least-load strategy studied in their paper uses the shortest path. Observe also that the link allocation algorithm was changed to consider the delay of each link as one unit.

The adopted physical network is the backbone of the RNP (Rede Nacional de Ensino e Pesquisa), a Brazilian academic ISP, which is currently composed of 28 nodes and 33 links, as shown in Figure 55(b) (topology available at http://www.rnp.br/backbone/index.php). This evaluation also considered an old topology of the RNP, which was composed of 27 nodes and 29 links (shown in Figure 55(a)). As one can see, the current topology has 5 nodes of degree one, and the old one has 17. The impact of this aspect, as will be shown in the results, is that the number of disjoint paths tends to be greater in the current RNP topology, affecting the performance of MPA.

Figure 55 The (a) old and (b) current network topologies of RNP used in simulations

The factors varied in the simulation are shown in Table VI. Except for the network topology, the factors were obtained from the paper of Zhu and Ammar, since the idea is to compare the algorithms in the same scenario proposed for ZAA. The virtual network requests arrive in a Poisson process whose average rate was set to 1, 5, 9, and 13 requests per unit time in each scenario. The lifetime of each virtual network follows an exponential distribution with mean equal to 275 time units. The evaluation includes only networks in a star topology, since the authors of ZAA argued that their algorithm is better for such type of networks. Observe that Zhu and
Ammar also propose an algorithm for graph division that subdivides a general network into several star networks; but, since only star networks are used in this simulation, such an algorithm will not be evaluated. The size of each virtual network is uniformly distributed from 5 to 15 nodes, including the center node. Thus, the evaluation occurs in a better case for ZAA. On the other hand, their algorithm for adaptive reallocation is not used in this comparison, since the objective here is to evaluate the allocation only.

Table VI Factors and levels used in the MPA's evaluation
• Physical network topology: old RNP topology and current RNP topology (one Worker per PoP)
• Virtual network request rate: exponentially distributed inter-arrival times with rate equal to 1, 5, 9, and 13 virtual networks per unit time
• Virtual network lifetime: exponentially distributed with mean equal to 275 time units
• Virtual network topology: star networks
• Size of the virtual networks: uniformly distributed between 5 and 15 nodes

As in the Zhu and Ammar paper, the simulation is finished after 80,000 requests have been serviced, and, to reduce the effects of the warm-up period, the metrics are collected after 40,000 requests have been serviced. For each simulation, the maximum node stress, the maximum link stress, the mean link stress, and the mean path length are observed, averaged across the simulation time. The results were calculated with 95% confidence intervals, which are shown when necessary.

Regarding the maximum node stress (shown in Figure 56), the results show that the MPA strategy reaches the same performance as the ZAA and Hybrid solutions. Actually, such an achievement can be observed in the ZAA paper, which shows that their algorithm for allocating star networks obtains a maximum node stress equivalent to the least-load algorithm. These results are similar in both evaluated topologies.

Figure 56 Results for the maximum node stress in the (a) old and (b) current RNP topology
About the maximum link stress, one can observe that the Hybrid solution and ZAA outperform MPA in the old topology (Figure 57(a)). The maximum link stress of MPA is about 43% greater than that of ZAA in the worst case, at an arrival rate equal to one; in the best case (at 13 requests per unit time) MPA is 23% greater than ZAA. This occurs because the node allocation of ZAA takes into consideration the stress of neighboring links, and its path cost has the property of adapting between a shortest path and a least-loaded path according to the current load. Thus, initially, ZAA tends to allocate nodes in less stressed regions of the graph connected through shorter links. On the other hand, MPA tends to allocate the virtual nodes in physical nodes that are not necessarily close, which can cause the use of longer paths even when the network is lightly stressed. In this same scenario, the Hybrid strategy shows how the minimax path allocation strategy can outperform the strategy adopted in ZAA. This strategy eliminates the bad effect of MPA's node allocation by using the node allocation heuristic of ZAA. The gain in the maximum link stress of the Hybrid strategy is about 35% and is practically constant over the variation of the arrival rate.

Figure 57 Results for the maximum link stress in the (a) old and (b) current RNP topology

MPA obtains better results than ZAA in a topology with more links available for allocation, as Figure 57(b) shows. It was thus perceived that the addition of more links to the network reduces the influence of the node positioning algorithm and makes the link positioning strategies more decisive. When the arrival rate is one request per unit time, the gain of MPA over ZAA is about 4%, whereas as the arrival rate increases the MPA gain grows to about 40% over ZAA. For the Hybrid approach, the gain in the maximum link stress is about 13% and is practically constant over the variation of the arrival rate. MPA's better performance is due to the fact that the minimax path algorithm tries to directly minimize the maximum number of virtual links on each physical link, which brings better results than searching for the path whose stress is minimal, as is the case with the algorithm of Zhu and Ammar.
The mean link stress is shown in Figure 58. These results show that the MPA link allocation strategy obtains a mean link stress higher than ZAA, which is an expected result since ZAA optimizes the mean link stress, while MPA directly optimizes the maximum link stress.

Figure 58 Results for the mean link stress in the (a) old and (b) current RNP topology

Figure 59 shows the mean path length, considering the mean of the lengths of the virtual links in the physical network, i.e., the number of physical links on which each virtual link is allocated. As one can see, the MPA and Hybrid solutions obtain, on average, one link more than ZAA. This result occurs because the minimax path allocation algorithm seeks alternative paths in the network, leading to an increase of this metric.

Figure 59 Mean path length in the (a) old and (b) current RNP topology

Another point that must be highlighted is that the original problem solved by MPA is to allocate the nodes considering several restrictions, including the geographical ones. In this way, if a modified version of the ZAA node allocation algorithm were used in a D-Cloud scenario, it could obtain poor performance, since the virtual nodes should be allocated in their respective geographical regions, preventing ZAA from allocating the virtual nodes in less stressed regions of the network connected by short distances. Differently, MPA's node allocation algorithm is designed for the D-Cloud, as it tries to maintain a lower stress level inside each geographical region of the D-Cloud.
6.3 Virtual Network Creation

In Nubilum, the Request Description XML Schema allows a request to have virtual nodes only (see Section 5.1.1). In this case, the submitted request is not a virtual network and finding a specific route with a maximum delay is not necessary; simple routing through shortest paths would be sufficient. However, the D-Cloud provider could use some solution to create a virtual network for intercommunication between the virtual machines of this new request in order to obtain a better usage of its network resources. Working with this scenario, this section investigates the problem of creating a virtual network when a request is submitted without any specified links. The problem consists of creating a network connecting a subset of the physical nodes (the nodes where the virtual nodes were previously allocated by Algorithm 1) with the primary objective of balancing the load across virtual links and, secondly, minimizing the energy consumption. Figure 60(a) shows an example of a given physical topology where the virtual machines are allocated in nodes A, B, and E, and the number on each physical link indicates the number of virtual links currently crossing that physical link. Figure 60(b) shows the virtual network (the grey lines) created to interconnect those nodes. The created virtual network reaches the two objectives of balancing the load in the network while reducing the number of used links, which leads to the reduction of the energy consumption.

Figure 60 Example creating a virtual network: (a) before the creation; (b) after the creation

This problem can be defined more precisely as finding a minimum length (in number of links) interconnect for a subset of nodes such that the maximum weight of its links is minimal. The minimum Steiner tree problem can be reduced to this one by measuring length in number of links and setting all link weights to the same arbitrary value. On the other hand, if the value of the minimal maximum weight is given, the problem reduces to finding a minimum length Steiner tree. As can be seen, the proposed problem is NP-hard. Therefore, approximate solutions will be proposed. The reduction of the problem to a Steiner tree gives an interesting idea for an approximate algorithm. Such an algorithm consists of executing a binary search on a vector ranging from the current minimum link weight to the maximum link weight of the graph. During each iteration, a weight k is selected and all links with weight greater than k are removed from the graph. An approximate minimum length Steiner tree algorithm [65] is executed in this subgraph to verify if
there is a tree connecting the nodes. If there is a Steiner tree in this subgraph, the upper bound is adjusted to the current k. Conversely, if there is no such tree, the lower bound is adjusted to k. The binary search is repeated until the vector is reduced to two elements. Then, k is chosen as the element at the upper bound, and the Steiner tree approximation gives the virtual network for the problem. This algorithm is similar to the minimax path with delay restriction algorithm described previously in Section 6.2.3, since it is defined by an outer function doing binary search and an inner algorithm solving a specific problem (minimum delay path in the previous algorithm, and minimum length Steiner tree in this one). The main idea behind this combined approach is that the binary search provides the load balance by searching for the minimal maximum link weight, while the Steiner tree approximation gives the minimum length virtual network, which minimizes energy. Next, the algorithms used to solve the minimum length Steiner tree problem (Section 6.3.1) and an evaluation of these algorithms (Section 6.3.2) are presented.

6.3.1 Minimum length Steiner tree algorithms

This section shows three algorithms to solve the minimum length Steiner tree problem in a graph. The first algorithm is a well-known approximation based on the minimum spanning tree in the distance graph [65]. The second and third algorithms are proposed by this Thesis: one uses a greedy heuristic that searches for the best hubs in the network in order to minimize the path lengths, and the other is an exponential algorithm that finds the minimum Steiner tree through successive tries of link removal.

Steiner Tree Approximation (STA)

Basically, the STA algorithm consists of transforming a general Steiner tree problem into a metric Steiner tree problem, which is a variant of the general problem where the graph is complete and the links satisfy the triangle inequality (cost(u, v) ≤ cost(u, w) + cost(v, w)). The transformation is made by calculating the all-pairs shortest paths in the original graph G and generating a new complete graph G' whose nodes are the nodes of the original graph and whose link weights are the costs of the shortest paths between the corresponding nodes in the original graph. In G' one can find a minimum spanning tree T' that is a 2-approximation of the minimum Steiner tree in this distance graph. Given this tree, one can replace its links by the original paths to obtain a subgraph of G. If this subgraph contains cycles, removing links will generate a 2-approximation of the minimum Steiner tree in the general graph. More details and proofs on this process can be found in [65]. The complexity of the STA algorithm is O(N³), where N is the number of physical nodes.
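To make the STA steps concrete, the Python sketch below, with illustrative names and unit link lengths assumed, builds the distance graph with all-pairs shortest paths, computes a minimum spanning tree over the requested (terminal) nodes, and expands its edges back into physical paths; the final cycle-removal step of the full 2-approximation is omitted for brevity.

def sta_steiner_tree(nodes, edges, terminals):
    """Approximate minimum length Steiner tree; returns the set of physical links used."""
    INF = float("inf")
    dist = {(u, v): (0 if u == v else INF) for u in nodes for v in nodes}
    nxt = {}
    for u, v in edges:                                # every physical link has length 1
        dist[(u, v)] = dist[(v, u)] = 1
        nxt[(u, v)], nxt[(v, u)] = v, u
    for k in nodes:                                   # Floyd-Warshall with path recovery
        for i in nodes:
            for j in nodes:
                if dist[(i, k)] + dist[(k, j)] < dist[(i, j)]:
                    dist[(i, j)] = dist[(i, k)] + dist[(k, j)]
                    nxt[(i, j)] = nxt[(i, k)]
    # Minimum spanning tree (Prim) over the complete distance graph of the terminals.
    in_tree, mst_edges = {terminals[0]}, []
    while len(in_tree) < len(terminals):
        a, b = min(((u, v) for u in in_tree for v in terminals if v not in in_tree),
                   key=lambda pair: dist[pair])
        in_tree.add(b)
        mst_edges.append((a, b))
    # Replace each distance-graph edge by the corresponding shortest physical path.
    used_links = set()
    for u, v in mst_edges:
        while u != v:
            step = nxt[(u, v)]
            used_links.add(frozenset((u, step)))
            u = step
    return used_links

In the creation problem, the outer binary search would call a routine of this kind on the subgraph obtained after removing the links whose weight exceeds the current threshold k.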
  • 97. 83 Greedy Hub Selection (GHS) The solution of the Steiner tree contains a subset of nodes of the graph that act like hubs interconnecting two or more nodes of the Steiner tree (the nodes C and F in Figure 60(b)). Thus, given a graph and a subset of physical nodes containing the requested virtual nodes, the objective of the GHS algorithm is to find the hubs of the minimum length Steiner tree interconnecting the virtual nodes. This problem is similar to the replica placement problems (Section 0) but with the difference that the replicas must form a virtual tree topology, which makes the problem more difficult. The GHS algorithm initiates with a tree formed by the physical nodes where a virtual node was allocated. One of these nodes is chosen as the root of the tree and as the first hub. Following an iterative approach, a new hub node is placed at the best location in the network – defined as the one which achieves the minimal number of used links (the cost) among all the possible positions. This location is then fixed. The new hub is then connected to other nodes in the tree through the shortest path, but a heuristic is used to maintain the tree. The positioning of a new hub and the immediate link rewiring reduces the cost and these processes follow while the positioning of a new hub reduces the cost. Algorithm 3: Search Procedure 1 Inputs: nodeList; 2 Outputs: selectedNode, best; 3 best = infinite; 4 for each node in nodeList(available) do 5 cost = placementProcedure(node); 6 if(cost < best) 7 selectedNode = node; 8 best = cost; 9 end 10 undoInsertion(node); 11 end Figure 61 Search procedure used by the GHS algorithm The selection characterizes GHS as greedy: it selects the best configuration possible for the current iteration. Thus, GHS searches for a solution with a relevant degree of optimality, calculating the new value of the cost for each selected candidate physical node, and selecting the one that achieves the best cost. The pseudo-code of this search procedure is presented in Figure 61. Such procedure simply selects any available physical node as a candidate. The variable nodeList can be indexed with available, hub, root, or requested in order to return respectively: the available nodes; the hub nodes other than the root; the root node; or the other requested nodes. Thus, nodeList(available) returns the nodes of the physical network that are not already a hub, the root, or a requested node. After that, the algorithm calls the placementProcedure (Figure 62) adding a new hub in this candidate and rewiring the virtual links (line 5). The cost achieved by
  • 98. 84 placing this new hub is calculated and if better than others it is selected as the best candidate node (lines 7 and 8). The line 10 returns the network to the state before the hub insertion in order to prepare the network for the next candidate hub. The placement strategy (Figure 62) is a heuristic approach that links a candidate hub to a parent and children maintaining the tree structure and guaranteeing optimality. The parent is selected firstly as the nearest hub considering the shortest path length (line 3), but if the virtual link from the current parent of this nearest hub crosses the candidate (line 4) the parent of the candidate will be the parent of the nearest hub, and the candidate will be the new parent of this hub. After that, the children of the new hub are selected amongst the other hub and requested nodes (line 10). A node is chosen as a child if the new hub is a better parent (nearer) than the current parent and if the node is not an ancestor of the new hub. After placing the new hub, the new number of used links is returned. Algorithm 4: Placement Procedure 1 Inputs: candidate 2 Outputs: newCost 3 nearest = nearestHub(candidate); 4 if(path(nearest,parent(nearest)).contains(candidate)) 5 parent(candidate) = parent(nearest); 6 parent(nearest) = candidate; 7 else 8 parent(candidate) = nearest; 9 end 10 for each node in nodeList(hub or requested) do 11 if(distance(node, parent(node)) > distance(node, candidate) and not isAncestor(node, candidate)) 12 parent(node) = candidate; 13 end 14 end 15 newCost = calculateNewCost(); Figure 62 Placement procedure used by the GHS algorithm In order to clarify the proposed heuristic, let’s consider the current virtual tree in Figure 63(a), which is formed by the grey nodes, with the R node representing the selected root node, the H node as a hub selected in a previous iteration, and the grey lines indicating the path of the current virtual links. The white nodes are available for adding new hubs. Considering the current candidate as the node A, one must select the nearest one as node H, since it is the nearest hub from A. However, as the path that goes from H to its parent R passes through A, the parent of A is set as R and H as a child of A. Finally, one must set all the other children of A, as any node that is not an ancestor of the new hub (as the parent of A was already set, its ancestor is well defined as the node R) which has a distance to A of less than its distance to its own parent. Notice that the nodes that can be its children are 1, 2, and 3. For such nodes, only the node 1 is nearer to node A than to its own parent, the others are nearer to node H. So, the new children of A are 1 and H. Figure 63(b) depicts the new configuration. The cost function should be calculated now and compared to other candidate results.
Figure 63 Example of the placement procedure: (a) before and (b) after placement

These two procedures are the core of GHS, but there are two additional procedures. The procedure for selecting the root node tries to select the virtual node that minimizes the sum of the distances between all the allocated virtual nodes, which is equivalent to the MPP explained in Section 6.1, where the link weights are all equal to one and the summation is made only over a subset of nodes. This problem can be solved in the same way as the MPP. After choosing the root node, the virtual links of the initial virtual tree are created through the shortest path between the root node and each other requested node. If the virtual link between a requested node A and the root node passes through a requested node B, then node B will be the parent of A. The other procedure is an external loop that calls the search procedure to greedily add new hubs to the network until the placement of a new hub no longer reduces the cost. The complexity of GHS is O(N³).

Optimal Algorithm (OA)

Because a Steiner tree never contains a cycle, there exists a subgraph of the original graph which is a tree and contains a minimal Steiner tree for any given subset of nodes. Observe that in this subgraph there is only one path between any two nodes. Thus, if the graph is already a tree, the minimal Steiner tree between the given subset of nodes can be found in linear time through a depth-first search. Considering such a property, an optimal algorithm to find the minimum length Steiner tree can be proposed: for the connected component of the graph in which the subset of nodes lies – which can be found by a breadth-first search after removing the links with weight greater than k – count the minimal number of links that need to be removed in order to turn this component into a tree. If N is the number of nodes and L the number of links of the considered component, this number is given by m = L − N + 1 [41]. Then, in this component, for each subset of m links, remove them and find the optimal Steiner tree in the resulting subgraph, which is a tree, through the depth-first search, as observed previously. As the algorithm tries every possible way of removing
these links, one of them will find the tree which leads to an optimal Steiner tree. The complexity of this algorithm is O(C(L, m) · (N + L)), where C(L, m) denotes the number of ways of choosing m links out of L.

6.3.2 Evaluation

This section evaluates the minimum length Steiner tree algorithms discussed in the previous section. The section does not evaluate the overall solution for the creation of the virtual network, which has the binary search as the outer procedure; it evaluates only the inner Steiner tree procedure, i.e., this evaluation covers only the energy reduction objective. The evaluation was made through Monte Carlo simulations considering a fixed physical topology with random positioning of the virtual nodes. In each simulation sample, the stress of each physical link is drawn from a uniform distribution and the virtual nodes to be connected are positioned uniformly at random. Note that each sample is independent, and the physical network is returned to its initial values before a new set of requested nodes is attempted. The two RNP topologies used in Section 6.2.4 and shown in Figure 55 are also used in this experiment. At each run of the simulation, each algorithm (STA, GHS, and OA) is submitted to the same scenario and the number of used physical links (the cost) of the resulting tree is measured for each algorithm independently.

The factors varied in the experiment are shown in Table VII. For the old RNP topology, the number of requested virtual nodes is varied through the simulations from 3 to 27 nodes, which is equivalent to 11% and 100% of the physical nodes, respectively. For the current RNP topology, this number ranges from 3 to 28. For each run, out of a total of 1000, the requested nodes were positioned in different physical nodes at random, with every subset of the physical nodes equally likely to be selected, without repetitions. For each sample, the relative error between the costs of the GHS and STA algorithms was calculated against the optimal cost. All the results use a confidence level of 95%, which is shown in the graphs.

Table VII Factors and levels used in the GHS's evaluation
• Physical network topology: old RNP topology and current RNP topology (one Worker per PoP)
• Number of requested virtual nodes: 3 to 27 (old RNP); 3 to 28 (current RNP)

Figure 64, Figure 65, Figure 66, and Figure 67 show the main results for the algorithms. The graphs in Figure 64 and Figure 66 present the percentage of samples that reached the optimum (relative error equal to zero), and the graphs in Figure 65 and Figure 67 present the percentage of samples that reached a relative error below five percent. These percentages are shown
according to the number of requested nodes. Results for requests with 27 nodes in the old RNP topology scenario are not shown because this scenario is simple and all the samples reach the optimum. The same occurs for 28 nodes in the current RNP topology scenario. As shown by Figure 64, the GHS algorithm achieves the optimum cost in 100% of the samples for virtual networks with 3 nodes and in 95% to 99% of the samples for virtual networks of 4, 5, 6, and 7 nodes. However, the performance of GHS tends to decrease when the number of requested nodes increases. In the worst case, about 75% of the samples of GHS reach the optimum in the cases from 14 to 22 requested nodes. Moreover, for small virtual networks of up to 7 nodes, GHS statistically outperforms STA, with the best cases occurring with 5 and 6 nodes, where the difference is about 6%. The performance of STA is high for few requested nodes, decreases in the middle of the range of virtual network sizes, and reaches the optimum when 26 nodes are requested.

Figure 64 Percentage of optimal samples for GHS and STA in the old RNP topology

As the number of nodes in the virtual network approaches the total number of nodes (27 in the old RNP topology), the performance of STA improves and outperforms GHS, since the problem tends to become computing the minimum spanning tree of the physical network. On the other hand, GHS is better for small networks because the placement strategy is designed to find the hubs in the physical network, whereas the STA strategy is to find the common links in the shortest paths between the requested nodes; if there are no common links in these shortest paths, STA cannot find the hubs that minimize the cost. Looking only at the samples that reached the optimum, one could conclude that the GHS algorithm is not adequate for bigger virtual networks. But Figure 65 shows the performance of each algorithm considering the samples that reached a relative cost error of less than 5% in relation to the optimum. In this case, the GHS performance is significantly better for virtual networks with 19 nodes or more. For example, in the scenario with 26 nodes – the worst scenario for GHS
considering only optimum samples – all the samples of the GHS algorithm obtained a relative error of less than 5%. Furthermore, the STA algorithm also improves for virtual networks with 19 nodes or more. This shows that, considering all virtual network sizes, even though the GHS algorithm reaches optimality only in a few cases, most samples reached within 5% of the optimum in the old RNP topology, and the worst case was 69.1% of the samples with 16 nodes. For STA, the worst case is 65.6% with 14 virtual nodes.

Figure 65 Percentage of samples reaching relative error ≤ 5% in the old RNP topology

The results for the current RNP topology are presented in Figure 66 and Figure 67. In this scenario, the behavior of the results is similar to the behavior in the previous scenario. Again, GHS outperforms STA for smaller virtual networks from 3 to 11 nodes and the inverse occurs for virtual networks from 16 to 27 nodes. Thus, it must be observed that increasing the quantity of physical links can improve the results of the GHS algorithm, since in the old RNP topology GHS outperforms STA only up to 7 nodes. Moreover, the performance of GHS is substantially improved when considering samples that reached within 5% of the optimum, as occurred in the previous scenario.

Figure 66 Percentage of optimal samples for GHS and STA in the current RNP topology
Figure 67 Percentage of samples reaching relative error ≤ 5% in the current RNP topology

6.4 Discussion

This chapter presented how Nubilum performs the allocation of resources in a D-Cloud. It was shown how the comprehensive resource model employed by Nubilum leverages the usage of previous algorithms for virtual network allocation present in the literature. In addition, this chapter presented the evaluation of the specific algorithms for self-optimization of resources in a D-Cloud considering several aspects such as load balancing, energy consumption, network restrictions, and server restrictions. Considering particularly the problems involving virtual networks – both the allocation and the creation of virtual networks – it was shown that, due to the complexity introduced when coping with several different requirements and objectives (from developers and the provider), the problems are NP-hard. Thus, the proposed solutions for such problems are greedy algorithms which employ some sort of heuristic. In the problem where the virtual network is given, the proposed solution tries to minimize the number of virtual links allocated on each physical link considering restrictions associated with virtual nodes and virtual links. Note that the virtual link restriction does not consider link capacity, since considering unconstrained capacities seems better suited to a D-Cloud with a pay-as-you-go business plan; thus, the problem differs from several previous ones in the field of virtual network allocation. Another aspect that can be highlighted is that the proposed solution is not restricted to the allocation of an incoming virtual network. It could be used for continuous maintenance of the application in order to guarantee the developer's requirements. Particularly, the algorithm for allocation of virtual links (Figure 54) could be called by the Application Monitor to reallocate a virtual link whose current delay is greater than the maximum required. Also, the same algorithm could be called by the
D-Cloud Monitor to rearrange the overall set of virtual links on the D-Cloud in order to obtain a better load balance in the physical network.

The other studied problem was to build a virtual network for a given set of virtual nodes in order to minimize the energy consumption while balancing the load of physical links. Such a problem considers that the developer requested only virtual nodes with no virtual links; thus, the problem tries to capture the provider's requirements on load balancing and energy consumption. For this problem, two algorithms were proposed: an optimal algorithm and one based on heuristics. The heuristic approach was compared with a known approximation algorithm for the Steiner tree problem, and the results showed that the proposed heuristic is better suited for small virtual networks whereas the traditional approximation algorithm is better for bigger virtual networks. One possible implementation in a production scenario could alternate between these algorithms according to the size of the virtual network in order to obtain the best solution in each case. The optimal algorithm was used in the experiments as a baseline against which the performance of the other algorithms is measured. Observe that the optimal algorithm is suited to the physical network used in the test scenario, which is close to a tree (27 nodes and 29 links). In this case the optimal algorithm could be used for virtual network creation, since the number of possible combinations is low, lowering the computing time required.
7 Conclusion

"Non datur denarius incipientibus, sed finientibus, et corona non currentibus, sed pervenientibus." Saint Augustine

Small cooperative datacenters can be attractive since they offer a cheaper, low-power alternative that reduces the costs of centralized Clouds. These small datacenters can be built in different geographical regions and connected to form a Distributed Cloud, which reduces costs by simply provisioning storage, servers, and network resources close to end-users. Users in a D-Cloud are free to choose where to allocate their resources in order to attend to specific market niches, constraints on jurisdiction of software and data, or quality of service aspects of their clients. One of the most important design aspects of D-Clouds is the availability of "infinite" computing resources. But providing on-demand computing instances and network resources in a distributed scenario is not a trivial task, and the resource management system must be carefully designed in order to guarantee that both user and provider requirements are met satisfactorily. Such a design covers several aspects: the optimization algorithms for resource management, the development of suitable resource and offering models, and the right mechanisms for resource discovery and monitoring.

This Thesis introduced Nubilum, a resource management system that offers a self-managed solution to the challenges of allocation, discovery, control, and monitoring of resources in Distributed Clouds. This system can be seen as an integrated solution meeting the following requirements:
• A suitable information model – called CloudML – for describing the range of D-Cloud resources and the applications' computational, topological, and geographic restrictions;
• An integrated control plane for network and computational resources, which combines HTTP-based interfaces and the Openflow protocol to enforce decisions in the D-Cloud;
• A set of algorithms for allocation of resources to developers' applications based on several different requirements and objectives.

Next, the main contributions of Nubilum are described in Section 7.1, whereas the related publications obtained during the PhD are presented in Section 7.2. Finally, some further work is listed in Section 7.3.
  • 106. 92 7.1 Contributions Nubilum introduces a clear separation between the enforcement and the intelligence roles played by the Manager and the Allocator, respectively. The Manager offers an abstraction of the available resources on the D-Cloud to the Allocator, which, in turn, uses algorithms to allocate resources for incoming developer’s requests. Note that the system defines a resource model but it intends to remain open for different model-driven resource allocation strategies. The resource model is condensed in the CloudML, a description language that expresses resources, services, and requests in an integrated way. CloudML presents interesting characteristics for both providers and developers. CloudML is used in Nubilum for representing virtual and physical infrastructure, services, and requests. In terms of the solutions employed for controlling the D-Cloud, Nubilum uses open protocols for communication between their components and open strategies to define their processes. The overall communication is made through HTTP messages (using REST) and Openflow messages. These messages are exchanged between the components for the coordinated execution of each process in the D-Cloud. One of these processes is the discovery process that combines Openflow registration messages, REST operations, and LLDP messages in an integrated and versatile solution for discovering of servers, switches, and the network topology. In terms of the allocation process, Nubilum performs the allocation of virtual machines and virtual links on the physical servers and links. The use of Libvirt for management of the hypervisors allows Nubilum to work with very heterogeneous infrastructures, whereas the use of Openflow allows the versatile reorganization of the network when necessary. The IP/MAC management is centralized into the Manager, but the actual attribution of those addresses to virtual machines is made by each Worker through a built-in DHCP server. The overall Nubilum’s control plane solution was evaluated through measurements in a prototype implementation of Nubilum. From those measurements some models were derived to estimate the load introduced by Nubilum’s control messages when performing their main operations. Such models show the linear growth of the control load in respect to the growth of the number of resources in the D-Cloud. Finally, some optimization algorithms were developed for two different problems. The first problem involves the on-line allocation of virtual networks in a physical network with the objective of balancing the load over the physical infrastructure. Such problem also considers restrictions on geo-location, memory, processing, and network delay. Due to the complexity of this problem, its solution is a procedure developed in two stages: one for node allocation and another for links.
  • 107. 93 Compared with a previous approach designed for load balancing, our procedure showed to be adequate in scenarios with bigger virtual networks. The second studied case addresses the problem of allocating a virtual tree for connecting nodes, when the request does not contain any links. Again load balancing is the main objective, with the energy reduction as a secondary objective. A two-step algorithm is designed for this problem, with an outer loop responsible for load balancing and an inner loop for energy reduction. The inner loop problem can be reduced to a minimum length Steiner tree problem and two algorithms are proposed for this step. One uses a greedy strategy for hub placement and the other uses a combinatorial approach to find the optimum. The heuristic-based algorithm is compared against a Steiner tree approximation algorithm with the optimum algorithm as the baseline. The results showed that the hub placement heuristic is better suited for small virtual networks, while the Steiner approximation has better results in scenarios with greater virtual networks. 7.2 Publications Some results presented in this Thesis were developed in cooperation with the Location and Allocation Mechanisms for Distributed Cloud Servers, a research project carried by the GPRT (Grupo de Pesquisa em Redes e Telecomunicações) and funded by Ericsson Telecomunicações S.A., Brazil. Parts of this Thesis were published in the form of scientific papers at conferences and journals. Table VIII shows all the papers produced, including the papers already accepted and the submitted papers that are under revision. The results and analysis developed in papers #1 and #2 are not part of this Thesis, but they make part of a previous evaluation of several open-source systems for management of Clouds. These earlier works were an important milestone in the development of this Thesis since they provided sufficient knowledge about the technologies being used to leverage current Cloud Computing setups. Paper #1, particularly, contributed with the concept of programmability applied to Cloud Computing. Papers #3 and #4 are in the group of the conceptual contributions of this Thesis which discuss the research challenges related to resource management on Clouds and D-Clouds, respectively. The book chapter is partially reproduced in Chapter 2, whereas Chapter 3 is composed by some parts of paper #4. The papers #5, #6, #7, and #8 compose the group of effective contributions of this Thesis. Part of the paper #5 introducing CloudML appears in Section 5.1 of this Thesis. The short paper #6 corresponds to Chapter 4 and it presents the main components and modules of Nubilum.
  • 108. 94 Table VIII Scientific papers produced # Reference Type Status 1 P. Endo, G. Gonçalves, J. Kelner, D. Sadok. A Survey on Open-source Cloud Computing Solutions. VIII Workshop in Clouds, Grids e Applications, Brazil, June, 2010. Conference Full Paper Published 2 T. Cordeiro, D. Damálio, N. Pereira, P. Endo, A. Palhares, G. Gonçalves, D. Sadok, J. Kelner, B. Melander, V. Souza, J. Mångs. Open Source Cloud Computing Platforms. 9th International Conference on Grid and Cloud Computing, China, November 2010. Conference Full Paper Published 3 G. Gonçalves, P. Endo, T. Cordeiro, A. Palhares, D., J. Kelner, B. Melander, J. Mångs. Resource Allocation in Clouds: Concepts, Tools and Research Challenges. 29º Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos, 2011. Book Chapter Published 4 P. Endo, A. Palhares, N. Pereira, G. Gonçalves, D. Sadok, J. Kelner, B. Melander, J. Mångs. Resource Allocation for Distributed Cloud – Concepts and Research Challenges. IEEE Network Magazine, July/August 2011. Journal Full Paper Published 5 G. Gonçalves, P. Endo, M. Santos, D. Sadok, J. Kelner, B. Melander, J. Mångs. CloudML: An Integrated Language for Resource, Service and Request Description for D- Clouds. IEEE Conference on Cloud Computing Technology and Science (CloudCom), December 2011. Conference Full Paper Published 6 G. Gonçalves, M. Santos, G. Charamba, P. Endo, D. Sadok, J. Kelner, B. Melander, J. Mångs. D-CRAS: Distributed Cloud Resource Allocation System, IEEE/IFIP Network Operations and Management Symposium (NOMS), 2012 Conference Short Paper Published 7.3 Future Work As can be noticed throughout this Thesis, the Cloud Computing paradigm will certainly be present in our lives during the years to come. Also, D-Clouds should gain its niche slowly. But, before seeing new developments, new research for automating the formation of such distributed environments is still needed. These include strategies for advertising and scavenging for resources, loading and freeing these automatically. Similar challenges include usage of different protocols for controlling dedicated network devices as radio stations, modems, and other devices that do not support the Openflow protocol. Future works on this Thesis include testing the current implemented version of Nubilum in an ISP-scale environment. Such case study would employ all the components of the system and some of the proposed algorithms to instantiate applications on a D-cloud environment. The main idea of this case study is to obtain performance data of the system in a real environment, allowing identifying the bottlenecks and, eventually, engineering challenges that could be not perceived in the tests made in laboratory.
  • 109. 95 Another future possibility would be adding support for elasticity on Nubilum. This aspect involves reviewing several aspects of the system. One first aspect to be considered is the change on the CloudML information model, which should be modified to support elasticity rules that would be used by Nubilum to determine when and how to scale up and down the applications. These very specific rules differs from the common practice used in current Cloud Computing setups since creating a new virtual node can require the creation of one or several links, which would be indicated by the elasticity rules informed by the developer. Moreover, specific algorithms should be developed to determine the suitable physical resources for the elastic growth of the virtual networks.
  • 110. 96 References [1] ARMBRUST, M., FOX, A., GRIFFITH, R., JOSEPH, A.D., KATZ, R.H., KONWINSKI, A., LEE, G., PATTERSON, D.A., RABKIN, A., STOICA, I., and ZAHARIA, M. Above the Clouds: A Berkeley View of Cloud Computing, Tech. Rep. UCB/EECS-2009-28, EECS Department, University of California, Berkeley, 2009. [2] BARONCELLI, F., MARTINI, B., and CASTOLDI, P. Network virtualization for cloud computing, Annals of Telecommunications, v. 65, pp. 713-721, Springer-Verlag, 2010. [3] BELBEKKOUCHE, A., HASAN, M., and KARMOUCH, A. Resource Discovery and Allocation in Network Virtualization, IEEE Communications Surveys & Tutorials, n. 99, pp. 1-15, 2012. [4] BELOGLAZOV, A., BUYYA, R., LEE, Y. C., and ZOMAYA, A. A Taxonomy and Survey of Energy-Efficient Data Centers and Cloud Computing Systems, Advances in Computers, v. 82, Elsevier, pp. 47-111, 2011. [5] BOLTE, M., SIEVERS, M., BIRKENHEUER, G., NIEHÖRSTER, O., and BRINKMANN, A. Non-intrusive Virtualization Management using libvirt. Conference on Design, Automation and Test in Europe (DATE), Germany, March 2010. [6] BORTHAKUR, D. The Hadoop Distributed File System: Architecture and Design. Available at: http://hadoop.apache.org/core/docs/current/hdfs_design.pdf. Last access: February, 2012. [7] BUYYA, R., BELOGLAZOV, A., and ABAWAJY, J. Energy-Efficient Management of Data Center Resources for Cloud Computing: A Vision, Architectural Elements, and Open Challenges. International Conference on Parallel and Distributed Processing Techniques and Applications, pp. 6-17, 2010. [8] BUYYA, R., YEO, C.S., VENUGOPAL, S., BROBERG, J., and BRANDIC, I. Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future Generation Computer Systems, Elsevier B. V, 2009 [9] CHAPMAN, C., EMMERICH, W., MARQUEZ, F. G., CLAYMAN, S., and GALIS, A. Software architecture definition for on-demand cloud provisioning. Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pp. 61- 72, 2010. [10] CHOWDHURY, N. M. M. K., RAHMAN, M. R., and BOUTABA, R. Virtual Network Embedding with Coordinated Node and Link Mapping, IEEE INFOCOM, 2009. [11] CHOWDHURY, N.M. M. K. and BOUTABA, R. A survey of network virtualization. Computer Networks, Vol. 54, issue 5, pp. 862-876, Elsevier, 2010. [12] CHURCH, K., GREENBREG, A., and HAMILTON, J. On Delivering Embarrassingly Distributed Cloud Services, Workshop on Hot Topics in Networks (HotNets), 2008. [13] CROCKFORD, D. JSON: The fat-free alternative to XML. Presented at XML 2006, Boston, USA, December 2006. Available at: http://www.json.org/fatfree.html. Last access: February, 2012. [14] CULLY, B., LEFEBVRE, G., MEYER, D., FEELEY, M., HUTCHINSON, N. and WARFIELD, A. Remus: High availability via asyncronous virtual machine replication. 5th USENIX Symposium on Networked Systems Design and Implementation, April 2008.
  • 111. 97 [15] DEAN, J. and GHEMAWAT, S. Mapreduce: simplified data processing on large clusters. Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation, Berkeley, CA, USA, 2004. [16] DONGXI, L. and JOHN, Z. Cloud#: A Specification Language for Modeling Cloud, IEEE International Conference on Cloud Computing, pp. 533-540, 2011. [17] DUDKOWSKI, D., TAUHID, B., NUNZI, G., and BRUNNER, M. A Prototype for In- Network Management in NaaS-enabled Networks, 12th IFIP/IEEE International Symposium on Integrated Network Management, pp. 81-88, 2011. [18] ECLIPSE FOUNDATION. Web Tools Platform, 2011. Available at: http://www.eclipse.org/webtools/. Last access: February, 2012. [19] ENDO, P. T., GONÇALVES, G. E., KELNER, J., and SADOK, D. A Survey on Open- source Cloud Computing Solutions. Workshop em Clouds, Grids e Aplicações, Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos, 2010. [20] ENDO, P. T., PALHARES, A. V. A., PEREIRA, N. N., GONÇALVES, G. E., SADOK, D., KELNER, J., MELANDER, B., and MANGS, J. E. Resource allocation for distributed cloud: concepts and research challenges, IEEE Network Magazine, vol. 25, pp. 42-46, July 2011. [21] GALAN, F., Sampaio, A., Rodero-Merino, L., Loy, I., Gil, V., and Vaquero, L. M. Service specification in cloud environments based on extensions to open standards. Proceedings of the Fourth International ICST Conference on Communication System software and middleware, 2009. [22] GEYSERS Project, Initial GEYSERS Architecture & Interfaces Specification, Deliverable D2.1 of the Generalised Architecture for Dynamic Infrastructure Services (GEYSERS) FP7 Project, January 2010. [23] GHEMAWAT, S., GOBIOFF, H., and LEUNG, S. The Google file system. 19th Symposium on Operating Systems Principles, pages 29–43, Lake George, New York, 2003. [24] GONÇALVES, G., ENDO, P., CORDEIRO, T., PALHARES, A., SADOK, D., KELNER, J., MELANDER, B., and MÅNGS, J. Resource Allocation in Clouds: Concepts, Tools and Research Challenges. Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos, June 2011. [25] GONÇALVES, G., ENDO, P., SANTOS, M., SADOK, D., KELNER, J., MELANDER, B. MÅNGS, J. CloudML: An Integrated Language for Resource, Service and Request Description for D-Clouds. IEEE Conference on Cloud Computing Technology and Science (CloudCom), December 2011. [26] GONÇALVES, G., ENDO, P., SANTOS, M., SADOK, D., KELNER, J., MELANDER, B., MÅNGS, J. CloudML: An Integrated Language for Resource, Service and Request Description for D-Clouds. IEEE Conference on Cloud Computing Technology and Science (CloudCom), December 2011. [27] GREENBERG, A., HAMILTON, J., MALTZ, D. A., and PATEL, P. The cost of a cloud: research problems in data center networks. SIGCOMM Comput. Commun. Rev. 39, n. 1, pp. 68-73, 2008. [28] GU, Y., and GROSSMAN, R. Sector and Sphere: The Design and Implementation of a High Performance Data Cloud, Philosophical Transactions: Series A, Mathematical, physical, and engineering sciences, v. 367, pp. 2429-2445, June 2009.
  • 112. 98 [29] GUDE, N., KOPONEN, T., PETTIT, J., PFAFF, B., CASADO, M., MCKEOWN, N., and SHENKER. S. NOX: towards an operating system for networks. ACM SIGCOMM Computer Communication Review, v. 38, no. 3, July 2008. [30] HADAS, D., GUENENDER, S., and ROCHWERGER, B. Virtual Network Services for Federated Cloud Computing, Technical report H-0269, IBM, 2009. [31] HAIDER, A., POTTER, R., and NAKAO, A. Challenges in Resource Allocation in Network Virtualization. 20th ITC Specialist Seminar, pp. 18-20, Hoi An, Vietnam, May 2009. [32] HOUIDI, I., LOUATI, W., ZEGHLACHE, D., and BAUCKE, S. Virtual Resource Description and Clustering for Virtual Network Discovery, Proceedings of IEEE ICC Workshop on the Network of the Future, 2009. [33] HUMBLE, J. JavaSysMon, version 0.3.0, January 2010. Available at: https://github.com/jezhumble/javasysmon. Last access: February, 2012. [34] JUNGNICKEL, D. Graphs, Networks and Algorithms. Algorithms and Computation in Mathematics, v.5, 3rd Edition, Springer-Verlag, 2007. [35] KARLSSON, M., KARAMANOLIS, C., and MAHALINGAM, M. A Framework for Evaluating Replica Placement Algorithms. Technical Report HPL-2002, HP Laboratories, July 2002. [36] KELLERER, H., PFERSCHY, U., and PISINGER, D. Knapsack Problems, 1st Edition, Springer-Verlag, 2004. [37] KHOSHAFIAN, S. Service Oriented Enterprises. Auerbach Publications, 2007. [38] KOOMEY, J. Growth in Data center electricity use 2005 to 2010. Analytics Press, Oakland, August 2011. Available at: http://www.analyticspress.com/datacenters.html. Last access: February, 2012. [39] KOSLOVSKI, G.P., PRIMET, P.V.B., and CHARAO, A.S. VXDL: Virtual resources and interconnection networks description language, Networks for Grid Applications, pp. 138-154, Springer, 2009. [40] LAGAR-CAVILLA, H. A., WHITNEY, J. A., SCANNELL, A. M., PATCHIN, P., RUMBLE, S. M., LARA, E., BRUDNO, M., and SATYANARAYANAN, M. SnowFlock: rapid virtual machine cloning for cloud computing. Fourth ACM European Conference on Computer Systems, 2009. [41] LEISERSON, C. E., STEIN, C., RIVEST, R. L., and CORMEN, T. H. Algoritmos: Teoria e Prática. Campus, ed. 1, 2002. [42] LISCHKA, J. and KARL, H. A virtual network mapping algorithm based on subgraph isomorphism detection, ACM SIGCOMM Workshop on Virtualized Infastructure Systems and Architectures, 2009. [43] MARSHALL, P., KEAHEY, K., and FREEMAN, T. Elastic Site: Using Clouds to Elastically Extend Site Resources, 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, pp. 43-52, Australia, June, 2010. [44] MCKEOWN, N., ANDERSON, T., BALAKRISHNAN, H., PARULKAR, G., PETERSON, L., REXFORD, J., SHENKER, S., and TURNER, J. OpenFlow: enabling innovation in campus networks, ACM SIGCOMM Computer Communication Review, 2008. [45] MELL, P. and GRANCE, T. The NIST Definition of Cloud Computing. National Institute of Standards and Technology, Information Technology Laboratory, 2009.
  • 113. 99 [46] METSCH, T., EDMONDS, A., and NYRÉN, R. Open Cloud Computing Interface – Core, Open Grid Forum, OCCI-WG, Specification Document. Available at: http://forge.gridforum.org/sf/go/doc16161?nav=1, 2010. Last access: February, 2012. [47] MORÁN, D., VAQUERO, L. M., and GALÁN, F. Elastically Ruling the Cloud: Specifying Application’s Behavior in Federated Clouds, IEEE International Conference on Cloud Computing, pp. 89-96, 2011. [48] MOSHARAF, N., CHOWDHURY, K., and BOUTABA, R. A survey of network virtualization. Computer Networks, v. 54, i. 5, pp. 862–876, April 2010. [49] MURPHY, M, and GOASGUEN, S. Virtual Organization Clusters: Self-provisioned clouds on the grid. In Future Generation Computer Systems, 2010. [50] NEVES, T. A., DRUMMOND, L. M. A., OCHI, L. S., ALBUQUERQUE, C., and UCHOA, E. Solving Replica Placement and Request Distribution in Content Distribution Networks, Electronic Notes in Discrete Mathematics, Volume 36, pp. 89-96, ISSN 1571-0653, 2010. [51] OPENCLOUD. The Open Could Manifesto - Dedicated to the belief that the cloud should be open, 2009. Available at: http://www.opencloudmanifesto.org/. Last access: February, 2012. [52] OPENFLOW PROJECT. OpenFlow Switch Specification, Version 1.1.0 Implemented, February 28, 2011. [53] OPENSTACK – Developer Guide – API v1.1. September, 2011. Available at: http://docs.openstack.org/api/openstack-compute/1.1/os-compute-devguide-1.1.pdf. Last access: February, 2012. [54] PADALA, P. Automated management of virtualized data centers. Ph.D. Thesis, Univ. Michigan, USA, 2010. [55] PENG, B., CUI, B. and LI, X. Implementation Issues of a Cloud Computing Platform. IEEE Data Engineering Bulletin, volume 32, issue 1, 2009. [56] PRESTI, F. L., PETRIOLI, C., and VICARI, C. Distributed dynamic replica placement and request redirection in content delivery networks, MASCOTS, pp. 366–373, 2007. [57] QIU, L., PADMANABHAN, V., and VOELKER, G. On the Placement of Web Server Replicas. Proceedings of IEEE INFOCOM, pages 1587–1596, April 2001. [58] RAZZAQ, A. and RATHORE, M. S. An approach towards resource efficient virtual network embedding, IEEE Conference on High Performance Switching and Routing, 2010. [59] ROCHWERGER, B., BREITGAND, D., EPSTEIN, A., HADAS, D., LOY, I., NAGIN, K., TORDSSON, J., RAGUSA, C., VILLARI, M., CLAYMAN, S., LEVY, E., MARASCHINI, A., MASSONET, P., MUÑOZ, H., and TOFETTI, G. Reservoir - When One Cloud Is Not Enough. IEEE Computer, v. 44, i. 3, pp. 44-51, 2011. [60] SAIL Project, Cloud Network Architecture Description, Deliverable D-D.1 of the Scalable and Adaptable Internet Solutions (SAIL) FP7 Project, July 2011. [61] SHETH, A and RANABAHU, A. Semantic Modeling for Cloud Computing, Part I. IEEE Computer Society - Semantics & Services, 2010. [62] VALANCIUS, V., LAOUTARIS, N., MASSOULIE, L., DIOT, C., and Rodriguez, P. Greening the Internet with Nano Data Centers. Proceedings of the 5th international conference on Emerging networking experiments and technologies, pp. 37-48. 2009.
  • 114. 100 [63] VAQUERO, L. M., MERINO, L. R., and BUYYA, R. Dynamically Scaling Applications in the Cloud, ACM SIGCOMM Computer Communication Review, v. 41, n. 1, pp. 45- 52, January 2011. [64] VAQUERO, L., MERINO, L., CACERES, J., and LINDNER, M. A Break in the Clouds: Towards a Cloud Definition, vol. 39, pp. 50–55, January 2008. [65] VAZIRANI, V. V. Approximation Algorithms, 2nd Edition, Springer-Verlag, 2003. [66] VERAS, M. Datacenter: Componente Central da Infraestrutura de TI. Brasport Livros e Multimídia, Rio de Janeiro, 2009. [67] VERDI, F. L., ROTHENBERG, C. E., PASQUINI, R., and MAGALHÃES, M. Novas arquiteturas de data center para cloud computing. SBRC 2010 – Minicursos, 2010. [68] VOUK, M.A. Cloud Computing – Issues, Research and Implementations. Journal of Computing and Information Technology, pages 235–246. University of Zagreb, Croatia, 2008. [69] WHITE, S.R., HANSON, J.E., WHALLEY, I., CHESS, D.M., KEPHART, J.O. An architectural approach to autonomic computing, Proceedings. International Conference on Autonomic Computing, vol., no., pp. 2- 9, 17-18, May 2004. [70] XHAFA, F. and ABRAHAM, A. Computational models and heuristics methods for Grid scheduling problems. In Future Generation Computer Systems, 2010. [71] YOUSEFF, L., BUTRICO, M., and SILVA, D. Toward a unified ontology of cloud computing. Grid Computing Environments Workshop, 2008. [72] ZHANG, Q., CHENG, L., and BOUTABA, R. Cloud computing: state-of-the-art and research challenges. Journal of Internet Service Applications, Springer, pp. 7-18, 2010. [73] ZHOU, D., ZHONG, L., WO, T., and KAN, J. CloudView: Describe and Maintain Resource View in Cloud, IEEE Cloud Computing Technology and Science (CloudCom), pp. 151-158, 2010. [74] ZHU, Y. and AMMAR, M. Algorithms for assigning substrate network resources to virtual network components, IEEE INFOCOM, pp. 1-12, 2006.