Load balancing and handoff in lte

Load balancing and handover joint
optimization in LTE networks using Fuzzy
Logic and Reinforcement Learning
ARTICLE in COMPUTER NETWORKS · NOVEMBER 2014
Impact Factor: 1.26 · DOI: 10.1016/j.comnet.2014.10.027
3 AUTHORS:
P. Muñoz
University of Malaga
Raquel Barco
Isabel de la Bandera Cascales

Load balancing and handover joint optimization in LTE
networks using Fuzzy Logic and Reinforcement Learning
P. Muñoz *, R. Barco, I. de la Bandera
University of Málaga, Communications Engineering Dept., Campus de Teatinos, 29071 Málaga, Spain
Article info
Article history:
Received 10 June 2014
Received in revised form 29 October 2014
Accepted 31 October 2014
Available online 13 November 2014
Keywords:
Load balancing
Handover
Self-organizing networks
Long-term evolution
Fuzzy logic
Reinforcement learning
Abstract
With the growing deployment of cellular networks, operators have to devote significant
manual effort to network management. As a result, Self-Organizing Networks (SONs) have
become increasingly important in order to raise the level of automated operation in cellular
technologies. In this context, Load Balancing (LB) and Handover Optimization (HOO) have
been identified by industry as key self-organizing mechanisms for the Radio Access Net-
works (RANs). However, most efforts have been focused on developing a stand-alone entity
for each self-organizing mechanism, which will run in parallel with other entities, as well
as designing coordination mechanisms in charge of stabilizing the network as a whole. Due
to the importance of LB and HOO, in this paper, a unified self-management mechanism
based on Fuzzy Logic and Reinforcement Learning is proposed. In particular, the proposed
algorithm modifies handover parameters to optimize the main Key Performance Indicators
related to LB and HOO. Results show that the proposed scheme effectively provides better
performance than independent entities running simultaneously in the network.
© 2014 Elsevier B.V. All rights reserved.
1. Introduction
In recent years, cellular networks have experienced a
large increase in size and complexity. As a result, mobile
operators have focused attention on reducing capital
expenditures (CAPEX) and operational expenditures
(OPEX) of their networks [1]. This fact has stimulated
strong research activity in the field of Self-Organizing
Networks (SON), which is a set of principles and concepts
defined by the 3rd Generation Partnership Project (3GPP)
for automating network management while improving
network quality [2]. In the context of SON, certain
functions have been identified as key enablers by the
3GPP, among which are Load Balancing (LB) and Handover
Optimization (HOO). The former is an automated function
where cells suffering occasional congestion can transfer
load to neighbor cells, which have spare resources, by e.g.
adjusting mobility parameters. The latter is a solution for
automatic detection and correction of errors and subopti-
mal settings in the mobility configuration, which may lead
to a degradation of user performance. Many efforts in the
research community have been devoted to the so-called
Mobility LB (MLB) and Mobility Robustness Optimization
(MRO), for which the 3GPP has specified particular
features [3]. Typically, these functionalities are implemented
at a low level in the network architecture, meaning that
they operate quickly (i.e. at time scales of the order of
seconds or less) and are located in each base station of
the access network. In contrast, little or no attention has
been paid to LB and HOO at higher levels, e.g. at the level
of the Operations, Administration, and Maintenance (OAM)
system, whose functions typically operate more slowly (i.e.
at time scales of the order of minutes or even hours) and
are not necessarily located in the base stations (e.g. they
can be located in a
http://dx.doi.org/10.1016/j.comnet.2014.10.027
1389-1286/© 2014 Elsevier B.V. All rights reserved.
* Corresponding author. Tel.: +34 952 134 164; fax: +34 952 132 027.
E-mail addresses: pabloml@ic.uma.es (P. Muñoz), rbm@ic.uma.es
(R. Barco), ibanderac@ic.uma.es (I. de la Bandera).
Computer Networks 76 (2015) 112–125

server on the core network). Thus, network management at
this level copes with slower changes in the network, whose
impact on performance can be even more important, since
the underlying variations to be tracked are typically rather
slow as well [4]. In addition, the data available in the OAM
system is much more abundant than in the base stations,
thereby allowing more efficient and powerful network
management. As a result, implementing this kind of
algorithm at the OAM level can provide operators with
significant benefits and cost savings.
As the deployment of stand-alone SON functions is
growing, the number of conflicts and dependencies
between them increases. A conflict can happen if, for
example, two individual SON functions optimize the same
parameter with different goals at a network element [5]. As
expected, conflicts may have a negative impact on network
performance. The common solution in SON research has
been to create an additional entity, usually called coordina-
tor, which manages the conflict. Typically, an entity
causing conflict is switched off or limited in the control
strategy, e.g. by decreasing the allowed range, the
maximum allowed step sizes or the periodicity at which
parameter control takes place. SON coordination has
recently been addressed in the literature. On
the one hand, there are several studies with the aim of
developing a functional framework for SON coordination
[6–10]. On the other hand, further efforts have been
devoted to specific solutions for coordination of two or
more SON functionalities [11]. Special attention has been
devoted to the coordination of MLB and MRO, addressed
by the SOCRATES project [12]. In particular, the study
assumes the control parameters of the MLB and MRO
algorithms to be independent of each other, i.e. the two
algorithms do not tune the same parameters. While the
MLB function adjusts the HO margin (HOM), the MRO
function adjusts the Time-To-Trigger (TTT) and hysteresis
parameters. The interactions exist because these two func-
tions influence the same Key Performance Indicators (KPIs)
that are used as input for the optimization algorithms. In
[13], a constraint for the connection quality more restric-
tive than the one assumed in the SOCRATES project is con-
sidered. In this sense, the MLB function is restrained in
favor of the HO performance optimization. In [14], to avoid
the conflict between MLB and MRO, the HOM range of MLB
is dynamically adjusted according to the TTT and the hys-
teresis parameters, which are first adjusted considering the
effect of the user speed.
Although the coordinator-based schemes have been
well accepted by the research community, there are some
related issues. Specifically, the definition of operator
policies becomes a complex task, since there exists a
trade-off between proper controllability and ease of use
[10]. In addition, when some limitations are applied to
the control strategy (e.g. by restricting the step size), the
optimal configuration may lie outside the space of possible
solutions. Another problem is related to the prioritization
of SON functions in a centralized coordination scheme,
which is the typical implementation due to the required
integration with a (centralized) legacy OAM system [15].
Under this situation, the coordination entity has to process
many parameter configuration requests, so that the risk of
monopolization by high priority functions is high. Due to
this, the joint optimization of SON functions has also been
addressed. In [16], the problem of coordinating capacity
and coverage optimization and MLB is addressed. Instead
of implementing an additional entity that coordinates the
outcomes of each independent function, these functions
are combined into one algorithm and then the cellular net-
work is optimized towards a joint target. Similarly, in [17],
instead of controlling the conflict between independent
MRO and MLB functions, a joint optimization algorithm is
proposed. Such an algorithm adjusts the same HO param-
eters for individual users (i.e. each user has individual val-
ues of the same HO parameters). This solution reduces
unnecessary HOs for some users that should not be handed
over to the neighbor cell. However, at the level of the OAM
system, this feature is hard to implement, since statistics
are rarely available at the per-user level and this kind of
optimization would involve a high signaling cost. In [18],
the proposed MRO and MLB algorithm prioritizes the MRO
part, since KPIs related to the connection quality (e.g. the
radio link failure) are considered first. However, other
important KPIs from the MRO viewpoint, such as those
associated with unnecessary HOs, are not taken into account
in the study, which makes it more difficult to achieve
optimal performance.
For all those reasons, in this paper, a novel unified algo-
rithm for both LB and HOO in Long-Term Evolution (LTE)
networks is proposed. This algorithm is based on a Fuzzy
System (FS) that tunes the handover (HO) parameters at
the cell adjacency level to improve network performance.
The FS is optimized by the Q-Learning algorithm, which
drives it to select the most appropriate action for LB
and/or HOO reasons. The decision of which action the FS
should take depends on past actions taken by the FS, whose
impact on network performance was measured through the
KPIs. With the proposed solution, the complexity of the SON
coordination entity would be reduced, as it is freed from
the coordination of two important SON functions. In
addition, the proposed algorithm is expected to achieve
better performance, as its space of candidate solutions is
not as restricted as it would be if a coordinator-based
scheme or some type of prioritization algorithm were used.
The rest of the paper is organized as follows. Section 2
formulates the problem and introduces the mobility
algorithm in LTE networks and the system performance
metrics. In Section 3, the design of the proposed FS as well
as its optimization process is described. Section 4 presents
the simulation setup and discusses the simulation results.
Finally, Section 5 presents the main conclusions of the
study.
2. System model
The HO is the procedure that preserves the connection
when the user moves around the network. As LTE is being
deployed with a frequency reuse of one (i.e. the same fre-
quency is shared by all cells), the intra-frequency HO is
very common in these networks. More specifically, the
most widely extended algorithm for the HO-triggering
decision is the 3GPP A3 event [19]. Roughly, this algorithm
triggers the execution of an HO if the neighbor cell
becomes offset better than the serving cell during a specific
time period determined by the TTT parameter. Formally, it
is expressed as:
$$\mathrm{RSRP}_j > \mathrm{RSRP}_i + \mathrm{HOM}_{i \to j}, \qquad (1)$$
where $\mathrm{RSRP}_i$ and $\mathrm{RSRP}_j$ are the averaged values of the
Reference Signal Received Power (RSRP) measured for serving
cell $i$ and target cell $j$, respectively, and $\mathrm{HOM}_{i \to j}$ is the HOM
from cell $i$ to cell $j$. Note that the symmetric $\mathrm{HOM}_{j \to i}$ is also
defined in the opposite direction of the adjacency (i.e. a
pair of cells that are neighbors).
In contrast with HO-triggering decisions based on
absolute comparisons (e.g. the serving/neighbor cell
below/above a threshold), the A3 event consists of a rela-
tive comparison that simplifies the configuration of its
parameters since they are independent of the absolute
received power levels, which may depend on diverse con-
text factors. However, the HOM in Eq. (1) is broken down
into several terms by the 3GPP, so that:
$$\mathrm{HOM}_{i \to j} = \mathrm{Hys} + \mathrm{Of}_i - \mathrm{Of}_j + \mathrm{Oc}_i - \mathrm{Oc}_j + \mathrm{Off}, \qquad (2)$$
where Hys and Off are the hysteresis and offset parameters,
respectively, for this event, $\mathrm{Of}_i$ and $\mathrm{Oc}_i$ are the frequency-
and cell-specific offsets, respectively, for serving cell $i$,
and $\mathrm{Of}_j$ and $\mathrm{Oc}_j$ are the frequency- and cell-specific offsets,
respectively, for neighbor cell $j$. While only one value of
Hys and Off corresponding to the A3 event can be used
for all the cells and deployed frequencies in the network,
$\mathrm{Oc}_i$ and $\mathrm{Oc}_j$ can be defined per cell and $\mathrm{Of}_i$ and $\mathrm{Of}_j$ can be
defined per frequency layer. In addition to this, the
definition of Hys implies the existence of another inequality in
which this term has opposite sign:
$$\mathrm{RSRP}_j < \mathrm{RSRP}_i - \mathrm{Hys} + \mathrm{Of}_i - \mathrm{Of}_j + \mathrm{Oc}_i - \mathrm{Oc}_j + \mathrm{Off}. \qquad (3)$$
This inequality is called the leaving condition for this
event. Assuming that the entering condition given by Eq.
(1) was previously satisfied, the leaving condition must
be satisfied to reset the TTT parameter. By optimizing
Hys, the impact of signal fluctuations on the handover pro-
cess can be effectively reduced.
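The interplay between the entering condition, the leaving condition and the TTT can be sketched as a small state machine. This is only an illustrative sketch under stated assumptions, not 3GPP or author code: the function name `a3_triggers_handover`, the fixed sampling period and the sample values are hypothetical, and all frequency and cell offsets are folded into a single `offsets_db` term.

```python
def a3_triggers_handover(rsrp_serving, rsrp_neighbor, hys_db, offsets_db,
                         ttt_ms, sample_period_ms):
    """Return the index of the sample at which the HO is triggered, or None.

    offsets_db stands for Of_i - Of_j + Oc_i - Oc_j + Off (assumed grouping).
    """
    timer_ms = None  # None means the TTT timer is not running
    for n, (rs, rn) in enumerate(zip(rsrp_serving, rsrp_neighbor)):
        entering = rn > rs + hys_db + offsets_db   # Eqs. (1)-(2)
        leaving = rn < rs - hys_db + offsets_db    # Eq. (3)
        if timer_ms is None:
            if entering:
                timer_ms = 0                       # start the TTT timer
        elif leaving:
            timer_ms = None                        # leaving condition resets it
        else:
            timer_ms += sample_period_ms           # timer keeps running
        if timer_ms is not None and timer_ms >= ttt_ms:
            return n
    return None


serving = [-90.0] * 6
steady = [-95.0, -86.0, -86.0, -86.0, -86.0, -86.0]
fluctuating = [-86.0, -86.0, -95.0, -86.0, -86.0, -86.0]
print(a3_triggers_handover(serving, steady, 3.0, 0.0, 100, 40))
print(a3_triggers_handover(serving, fluctuating, 3.0, 0.0, 100, 40))
```

In the second trace a short fade of the neighbor cell satisfies the leaving condition and restarts the timer, which is how a larger Hys filters out handovers caused by signal fluctuations.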
In general, the HOO function is directly related to the
parameter Hys, so that its optimization could only be
performed at the event level. Conversely, the LB function is
more related to HO parameters that are defined at the cell
level (e.g. $\mathrm{Oc}_i$ and $\mathrm{Oc}_j$). In practice, different parameter
values are optimal for different cell pairs (e.g. due to
shadowing variations). Hence, the model adopted in this paper is
based on a joint optimization of LB and HOO at the cell
level. Specifically, this model requires only one formula
(e.g. the entering condition given by Eq. (1)) and one
parameter, denoted as HOM and defined per adjacency.
The only condition is that both HOM values in the adjacency
(i.e. $\mathrm{HOM}_{i \to j}$ and $\mathrm{HOM}_{j \to i}$) are simultaneously tuned
to perform the joint optimization of LB and HOO. Note that
this model complies with the 3GPP specifications if the
parameter Hys is set to zero and the rest of the parameters
are grouped into a single parameter defined per adjacency.
To facilitate the joint optimization, the parameters need
to be expressed in a more understandable way, according
to the following relationship in an adjacency x:
$$\begin{cases} H^{(x)} + O^{(x)} = \mathrm{HOM}^{(x)}_{i \to j}, \\ H^{(x)} - O^{(x)} = \mathrm{HOM}^{(x)}_{j \to i}, \end{cases} \qquad (4)$$
where $H^{(x)}$ is the parameter related to HOO that represents
a hysteresis and $O^{(x)}$ is the parameter related to LB that
represents a certain offset. Intuitively, the HOM determines
the area where the users connected to a cell would perform
an HO toward a neighbor cell. On the one hand, in the
context of HOO and assuming that $O^{(x)} = 0$, the parameter
$\mathrm{HOM}^{(x)}_{i \to j}$ and the symmetric $\mathrm{HOM}^{(x)}_{j \to i}$ can be set to the same
positive value (given by $H^{(x)}$) so that a certain symmetric
region between the two cells is ensured in order to avoid
unnecessary HOs. In Fig. 1(a), the RSRP of the neighbor
and the serving cells are represented. For instance, if a user
connected to cell $i$ moves to cell $j$, it connects to cell $j$ when
the RSRP from cell $j$ is equal to the RSRP from cell $i$ plus
$\mathrm{HOM}^{(x)}_{i \to j}$. As in this case the HOMs are assumed to be
symmetric, the same value is applied to the opposite situation
(i.e. when the user moves from cell $j$ to $i$, $\mathrm{HOM}^{(x)}_{j \to i}$ is used).
Decreasing such a symmetric region (i.e. both HOMs)
makes it easier for User Equipments (UEs) to perform an HO
to a neighbor cell, because the UE needs a lower received
power from the target cell to trigger an HO. Conversely,
increasing the HOMs makes it more difficult to perform an HO,
meaning that the user spends more time attached to the
serving cell while the connection quality is getting worse,
which could lead to call dropping. However, under this
configuration, possible unnecessary HOs due to signal
fluctuations can be avoided. In this sense, the HOO function aims
to reduce the inefficient usage of network resources due to
unnecessary HOs provided that the level of call dropping is
low enough. An important KPI closely related to the problem
of unnecessary HOs is the HO Ratio (HOR), defined in this
work as the number of HOs divided by the total number
of carried calls.
Fig. 1. Adjustment of HOM for (a) MRO and (b) MLB purposes.
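Since Eq. (4) is a linear change of variables, the pair of components and the pair of HOMs can be converted in both directions. The following minimal sketch illustrates this together with the HOR definition; the helper names and the numeric values are hypothetical and chosen only for illustration.

```python
def hom_from_components(h_db, o_db):
    """Eq. (4): return (HOM_i->j, HOM_j->i) from hysteresis H and offset O."""
    return h_db + o_db, h_db - o_db


def components_from_hom(hom_ij_db, hom_ji_db):
    """Inverse of Eq. (4): return (H, O) from the two HOMs of an adjacency."""
    return (hom_ij_db + hom_ji_db) / 2, (hom_ij_db - hom_ji_db) / 2


def handover_ratio(num_handovers, num_carried_calls):
    """HOR: number of HOs divided by the total number of carried calls."""
    return num_handovers / num_carried_calls if num_carried_calls else 0.0


# Pure HOO setting (O = 0): both margins are equal to H.
print(hom_from_components(3.0, 0.0))
# Adding an offset keeps the sum of the margins but skews them.
print(hom_from_components(3.0, 2.0))
print(components_from_hom(5.0, 1.0))
```

With O = 0 the two margins coincide, which is the symmetric case of Fig. 1(a); a non-zero O moves the two margins in opposite directions while their sum stays at 2H.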
On the other hand, maintaining the symmetric region,
both HOMs can also be jointly tuned by means of $O^{(x)}$ (e.g.
$\mathrm{HOM}^{(x)}_{i \to j}$ is increased while $\mathrm{HOM}^{(x)}_{j \to i}$ is decreased) so that
the service area of these cells is modified for LB purposes.
In this case, both HOMs are modified with the same
magnitude to preserve the hysteresis region. However,
those variations in HOM should have opposite sign to
modify the service area of the two cells. In Fig. 1(b), it is observed
that $\mathrm{HOM}^{(x)}_{i \to j}$ has been increased, while $\mathrm{HOM}^{(x)}_{j \to i}$ has been
decreased. As a result, the service area of the cell $i$ is larger.
Considering, for example, that cell j is overloaded, its service
area is reduced while the service area of the adjacent cells
with spare resources (e.g. cell i) is increased to take users
from the congested cell edge. Thus, cells suffering occa-
sional congestion can transfer load to neighbor cells, which
have free resources, by adjusting mobility parameters. Note
that, since HOMs are defined on an adjacency basis, cell
service areas can be not only re-sized but also re-shaped.
The most important benefit of LB is that call blocking in the
network is reduced, especially in highly loaded cells.
The function responsible for accepting or blocking a call is
the Call Admission Control (CAC). Such a function checks
the availability of free resources in the candidate cell before
taking a decision. In this paper, a ‘worst-case’ criterion has
been adopted to accept calls, i.e. the user is accepted
if the highest number of radio resources needed to maintain
a connection (worst-case requirement) is less than or equal to
the number of radio resources available in the candidate
cell. If the condition is not satisfied by any candidate cell,
then the call is blocked. To quantify the call blocking,
network operators typically use the Call Blocking Ratio
(CBR), defined as the number of blocked calls divided by
the number of call attempts.
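The worst-case admission criterion and the CBR definition can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, resources are counted as plain integers, and trying the candidate cells in list order is an assumption that the text does not specify.

```python
def admit_call(worst_case_resources, free_resources_per_candidate):
    """Return the index of the first candidate cell that can admit the call.

    A cell admits the call if the worst-case requirement is less than or
    equal to its free radio resources. Returns None if the call is blocked.
    """
    for idx, free in enumerate(free_resources_per_candidate):
        if worst_case_resources <= free:
            return idx
    return None


def call_blocking_ratio(blocked_calls, call_attempts):
    """CBR: number of blocked calls divided by the number of call attempts."""
    return blocked_calls / call_attempts if call_attempts else 0.0


print(admit_call(4, [2, 4, 10]))    # second candidate has enough resources
print(admit_call(4, [2, 3]))        # no candidate can admit: call is blocked
print(call_blocking_ratio(3, 100))
```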
Finally, the actions performed by LB and HOO may also
involve a decrease in the connection quality. On the one
hand, consider that the HOM is decreased for LB reasons.
The target cell is then more likely to be preferred to the
serving cell, even if its connection quality is worse due to
negative HOM values.
If this is the case, some users, usually located in the cell
edge, will be handed over to the target cell experiencing
worse radio conditions as a result of the performed HO.
Thus, negative values of the HOM will increase the risk of
dropping. On the other hand, when the HOM is increased
for either LB or HOO reasons, the user will spend more time
attached to the serving cell, delaying an HO towards cells
with better radio conditions. In this case, the probability
of call dropping will also increase. For these reasons, a KPI
widely used by network operators is the Call Dropping
Ratio (CDR), defined as the number of dropped calls divided
by the number of finished calls. Typically, call dropping
may occur because the connection quality is bad, but also
because there are no available resources due to an overload
situation. In this paper, the calculation of the CDR includes
those dropped calls due to bad radio conditions. In particular,
a call is dropped when the Signal-to-Interference-plus-Noise
Ratio (SINR) is below a certain threshold during a
specific time interval. The call dropping due to an overload
situation is assumed to be negligible due to the correct
operation of the CAC in the network, since enough
resources are guaranteed by the CAC for the accepted calls.
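The dropping criterion and the CDR can be illustrated with a short sketch. It assumes that "during a specific time interval" means a run of consecutive SINR samples below the threshold; that interpretation, the function names and the numeric values are assumptions made here for illustration, since the text gives neither the threshold nor the interval.

```python
def call_is_dropped(sinr_db_samples, threshold_db, max_bad_samples):
    """True if SINR stays below threshold_db for max_bad_samples in a row."""
    consecutive = 0
    for sinr in sinr_db_samples:
        consecutive = consecutive + 1 if sinr < threshold_db else 0
        if consecutive >= max_bad_samples:
            return True
    return False


def call_dropping_ratio(dropped_calls, finished_calls):
    """CDR: number of dropped calls divided by the number of finished calls."""
    return dropped_calls / finished_calls if finished_calls else 0.0


print(call_is_dropped([-2.0, -8.0, -9.0, -10.0, 0.0], -6.0, 3))
print(call_is_dropped([-8.0, -9.0, 0.0, -10.0, -7.0], -6.0, 3))
print(call_dropping_ratio(1, 4))
```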
3. Joint optimization algorithm
This section explains the proposed algorithm for LB and
HOO. The first part comprises the design of the FS, describ-
ing the inputs, the outputs and the behavior of the system.
After this, the second part of the section is devoted to the
optimization technique that is used to lead the FS in the
action selection.
3.1. The fuzzy system
As an alternative to Classical Logic, Fuzzy Logic is a
mathematical discipline that introduces a degree of
vagueness when an assertion is made [20]. The design of
a FS for control problems is one of the most important
application areas of Fuzzy Logic [21]. Its main benefit is
that controlling a system can be performed by using lin-
guistic terms such as high or low instead of providing a
numerical value when defining the reference values of
the controller. Experience has shown that Fuzzy Logic Con-
trollers (FLCs) provide results superior to those obtained by
conventional control algorithms. In particular, the method-
ology of the FLC becomes very useful when the processes
are too complex for analysis by conventional quantitative
techniques or when the available sources of information
are interpreted qualitatively, inexactly, or uncertainly [22].
The proposed FS is designed on the basis of FLCs, but
there are some differences. From the operational perspec-
tive, it combines the functionalities of LB and HOO, i.e. to
decrease the call blocking and the HO signaling load,
respectively, while at the same time the connection quality
is preserved. To achieve this, the LB part is inspired by the
LB algorithm proposed in [23] and the HOO part is inspired
by the HOO algorithm proposed in [24]. In both cases, the
developed algorithms implement a single SON
function which iteratively adjusts the HOM to optimize
the respective KPIs. However, in the case of LB, the HOM
variations in both directions of any adjacency have the
same magnitude but opposite sign. In this paper, this is
equivalent to modifying $O^{(x)}$, as shown in Eq. (4). In the case
of HOO, only the magnitude of the HOM variations is
changed (i.e. the sign remains unchanged), which is equivalent
to adjusting $H^{(x)}$ in Eq. (4).
The design of an FS that integrates both functionalities
poses new challenges. Typically, in the context of FLCs, if
the system is composed of two outputs, the design is
broken down into two new FLCs with one output each. In
this paper, the proposed joint optimization algorithm
affects two different components of the HOM parameter,
$O^{(x)}$ and $H^{(x)}$, whose combination results in the values of
$\mathrm{HOM}^{(x)}_{i \to j}$ and $\mathrm{HOM}^{(x)}_{j \to i}$, both set in the adjacency $x$. In
principle, this design may require two separate FLCs as two
outputs are involved. However, such a solution is directly
related to the coordinator-based schemes mentioned in
Section 1, in which the FLCs would be coordinated by an
upper-level entity. To avoid the problems linked to this
kind of solution, in this paper, the proposed FS integrates
both functionalities into one entity, whose structure is
depicted in Fig. 2. It is assumed that each adjacency in
the network has this entity implemented. As observed,
the inputs of the FS in the adjacency $x$ are $\mathrm{CBR}^{(x)}_{ji}$,
$\mathrm{HOR}^{(x)}$ and $\mathrm{HOM}^{(x)}_{ij}$. The first input is a derived KPI,
calculated as:
$$\mathrm{CBR}^{(x)}_{ji} = \mathrm{CBR}^{(x)}_{j} - \mathrm{CBR}^{(x)}_{i}, \qquad (5)$$
where $\mathrm{CBR}^{(x)}_{i}$ and $\mathrm{CBR}^{(x)}_{j}$ are the CBR measured in cell $i$ and
cell $j$, respectively. This input allows the traffic to be
balanced between adjacent cells. In principle, $\mathrm{CBR}^{(x)}_{ji}$ could take
negative values if cell $i$ has a higher CBR than cell $j$. However,
to simplify the behavior of the FS, the FS is always applied
in the direction of the adjacency in which cell $j$ has equal or
higher CBR than cell $i$. The second input, $\mathrm{HOR}^{(x)}$, is used to
reduce the HO signaling load when possible. This KPI is
calculated considering the HOs and calls carried in both
cells of the adjacency. The third input, $\mathrm{HOM}^{(x)}_{ij}$, is the
current value of the HOM, whose aim is to determine when
the HOM is reaching high values, in which case the
connection quality would be significantly impacted. More
precisely, the current value of HOM is the one taken in the
opposite direction of the CBR difference, i.e. from cell $i$ to
cell $j$. Finally, the output of the FS, called $Y^{(x)}$, is a variable
whose possible values refer to simultaneous variations in
$O^{(x)}$ and $H^{(x)}$. More specifically, one output value is given
by the concatenation of two fields, the former
corresponding to the variation in $O^{(x)}$ and the latter to the variation in
$H^{(x)}$, i.e.:
$$Y^{(x)} = (\Delta O^{(x)}, \Delta H^{(x)}). \qquad (6)$$
This solution overcomes the limitation of FLCs in
which the output is a single variable that can only take
discrete or continuous values in a certain range. Let us also
assume that the variation for the component $O^{(x)}$ in a
certain step of the algorithm can only be $-d$, $0$ or $+d$, while
the variation for $H^{(x)}$ can only be $-s$, $0$ or $+s$. For example,
if the output is $Y^{(x)} = (+d, 0)$, then the assignment can be
formally expressed as:
$$O^{(x)}(t+1) \leftarrow O^{(x)}(t) + d, \qquad (7)$$
$$H^{(x)}(t+1) \leftarrow H^{(x)}(t) + 0. \qquad (8)$$
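The update in Eqs. (7) and (8) can be illustrated with a short sketch. The function names and the numeric step sizes are hypothetical; the text fixes only the value sets {-d, 0, +d} and {-s, 0, +s}, and `allowed_outputs` simply enumerates their Cartesian product, of which Table 1 uses a subset.

```python
from itertools import product


def allowed_outputs(d, s):
    """All (delta_O, delta_H) pairs built from {-d, 0, +d} and {-s, 0, +s}."""
    return list(product((-d, 0.0, d), (-s, 0.0, s)))


def apply_output(o_db, h_db, y):
    """Eqs. (7)-(8): add the two fields of y = (delta_O, delta_H) to (O, H)."""
    delta_o, delta_h = y
    return o_db + delta_o, h_db + delta_h


d, s = 1.0, 0.5                       # assumed step sizes in dB
print(len(allowed_outputs(d, s)))     # 3 x 3 combinations
print(apply_output(2.0, 3.0, (+d, 0.0)))
```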
Once the inputs and the output have been determined,
the next step is their characterization from the fuzzy logic
perspective. Starting with the inputs, it is necessary to
define the fuzzy sets and membership functions associated
with them, as shown in Fig. 3. Each fuzzy set should be
identified with a linguistic term (e.g. ‘low’ or ‘very low’).
The need to work with fuzzy sets comes from the existence
of concepts with no clear boundaries in their definition. In
this context, when working with KPIs, it is often difficult to
determine from which value a KPI is considered to be jeop-
ardized. For this reason, two fuzzy sets, ‘low’ and ‘high’,
have been defined for each input. In the case of HOM, the
objective of defining two fuzzy sets is to identify when
the HOM is close to saturation, since the CDR may then be
negatively affected. In addition, for each fuzzy set of the
inputs, a membership function, denoted by $\mu_V(u)$,
quantifies the degree of membership of a given input value $u$ in
a certain fuzzy set $V$, with a value between 0 and 1. Thus,
unlike in classical sets, the transition between both values
is gradual. For simplicity, as shown in Fig. 3, the selected
membership functions are triangular or trapezoidal.
Fig. 2. Scheme of the proposed FS. (Blocks: Fuzzifier, Inference, Conversion, Network. Inputs: $\mathrm{CBR}^{(x)}_{ji}$, $\mathrm{HOR}^{(x)}$, $\mathrm{HOM}^{(x)}_{ij}$. Output: $Y = (\Delta O^{(x)}, \Delta H^{(x)})$, converted into $O^{(x)}$, $H^{(x)}$ and the HOM values.)
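A pair of complementary 'low' and 'high' membership functions of this kind can be sketched as below. This is an assumption-laden illustration: the exact shapes are not recoverable from the text, the functions are taken to be complementary shoulder-shaped trapezoids, and the breakpoints of 6 dB and 8 dB for the HOM input are read from the tick labels of Fig. 3 rather than stated by the authors.

```python
def mu_high(u, a, b):
    """Membership in 'high': 0 up to a, rising linearly to 1 at b (a < b)."""
    if u <= a:
        return 0.0
    if u >= b:
        return 1.0
    return (u - a) / (b - a)


def mu_low(u, a, b):
    """Membership in 'low', assumed here to be the complement of 'high'."""
    return 1.0 - mu_high(u, a, b)


# Hypothetical breakpoints for the HOM input, in dB.
for hom_db in (5.0, 7.0, 9.0):
    print(hom_db, mu_low(hom_db, 6.0, 8.0), mu_high(hom_db, 6.0, 8.0))
```

Between the two breakpoints an input belongs partly to both sets, which is the gradual transition described above.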
The core of the FS is given by a rule base that represents
the dynamic behavior of the FS through a set of linguistic
rules derived from the expert knowledge. Such a rule base
comprises a collection of fuzzy rules following a syntax of
the type IF-THEN to set the control strategy, e.g.:
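$$\text{IF } (\mathrm{CBR}^{(x)}_{ji} \text{ is high}) \;\&\; (\mathrm{HOR}^{(x)} \text{ is low}) \;\&\; (\mathrm{HOM}^{(x)}_{ij} \text{ is low}) \text{ THEN } Y^{(x)} = (+d, 0). \qquad (9)$$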
To define these rules, the knowledge and experience of
human experts is normally required. Each antecedent of
the rules represents an input state and the number of rules
is derived from the combination of all fuzzy sets among the
different inputs. In this sense, the definition of the rule
base must be complete (i.e. all fuzzy rules defined), so that
the FS can generate an appropriate action for every input
state in the system. The rule base for the proposed FS is
shown in Table 1. The definition of each rule is as follows.
First, rule 1 is activated when the CBR is balanced, and the
HOR and HOM remain at low values, meaning that no
change in HOM is needed (i.e. $Y^{(x)} = [0, 0]$). Rules 2–4 have
in common that they are triggered when the HOM is low.
As stated before, a low HOM means that the connection
quality is not jeopardized due to changes in HOM, so that
both LB and HOO actions can be applied without any
restriction. For this reason, rule 2, which is triggered when
only the CBR presents undesired behavior, implements an
LB action (i.e. $Y^{(x)} = [+d, 0]$). Specifically, this rule shrinks
the service area of the congested cell, in order to send
traffic to the adjacent cell. Rule 3 is activated
due to HOO reasons, i.e. when the HOR reaches high values.
In this case, the proposed solution is to increase the
symmetric region between adjacent cells (i.e. $Y^{(x)} = [0, +s]$),
so that the number of unnecessary HOs can be reduced.
To activate rule 4, the CBR and HOR must have high values
simultaneously. In principle, it is unclear whether the
performed action should be related to LB or to HOO. This
decision may depend on the particular scenario, so that
trial-and-error strategies (e.g. reinforcement learning) are
appropriate in this case. The next section explains how
to select the optimal consequent for this rule.
The remaining rules (i.e. rules 5, 6, 7 and 8) are all linked
to high values of the HOM, which may indicate that the con-
nection quality is very poor, especially for cell-edge users.
In rule 5, the CBR and HOR must exhibit low values to acti-
vate this rule, meaning that the problems related to LB/HOO
were mitigated by increasing HOM. The objective of rule 5
will be to decrease the HOM, since the high HOM may
already be unnecessary. If not, the concerned KPI will be
affected and, therefore, the FS will perform the appropriate
action. The problem is that such a decrease in HOM can be
due to LB or HOO reasons. As in rule 4, the application of
trial-and-error strategies will help to make this decision
in those situations. Rules 6 and 7 are related to more
extreme situations in which the existing problem has led
the FS to reach high values of HOM, but has not been
mitigated. In addition, the fact that the HOM reaches large
values due to successive LB and HOO actions may
negatively affect the network performance. In this sense, there
are essentially two different situations related to this
issue. One is severe congestion, in which LB actions would
greatly modify service areas. As a result, any HOO action to
reduce unnecessary HOs (e.g. due to high mobility) would
involve larger HOM values that may degrade cell-edge users’
performance. Given that the HOM is saturated due to LB, the
action of rule 6 should be to lead the HOM towards lower
values or leave it unchanged, so that subsequent HOO
actions could be effectively applied. The other situation is
given by the presence of very high mobility in the network,
in which the HOM will reach high values as a result of HOO
actions. Under this assumption, any congestion arising in
the network would lead the LB part to work with large
HOM values, which is not desirable. Thus, rule 7 is intended
to leave unchanged, or even reduce, the symmetric region,
provided that the HOM is saturated due to HOO reasons.
Fig. 3. Membership functions of the fuzzy sets (‘low’ and ‘high’) for each input: (a) $\mathrm{CBR}^{(x)}_{ji}$, (b) $\mathrm{HOR}^{(x)}$ and (c) $\mathrm{HOM}^{(x)}_{ij}$ [dB]. Vertical axes: $\mu$ from 0 to 1. Horizontal-axis tick labels: 0.03 in (a), 1 in (b), 6 and 8 in (c).
Table 1
Proposed sets of fuzzy rules.
Rule no. | Input 1: CBR_ji^(x) | Input 2: HOR^(x) | Input 3: HOM_ij^(x) | Candidate action(s) [ΔO^(x), ΔH^(x)]
1        | L                   | L                | L                   | [0, 0]
2        | H                   | L                | L                   | [+d, 0]
3        | L                   | H                | L                   | [0, +s]
4        | H                   | H                | L                   | [+d, 0], [0, +s]
5        | L                   | L                | H                   | [−d, 0], [0, −s]
6        | H                   | L                | H                   | [0, 0], [−d, 0]
7        | L                   | H                | H                   | [0, 0], [0, −s]
8        | H                   | H                | H                   | [−d, 0], [0, −s]

Finally, rule 8 refers to a situation in which the CBR and
HOR are simultaneously high, which may occur for example
when there are both a severe congestion and high mobility
users in the network. In this case, the solution could be to
enlarge the service area of the congested cell or to reduce
the symmetric region in the adjacency. Since the objective
would be to favor HOO or LB, respectively, the option that
simultaneously enlarges the service area of the congested
cell and reduces the symmetric region in the adjacency is
discarded. Other alternatives would cause the connection
quality to be significantly worsened. The specific action
for this rule will also be determined by the optimization
algorithm explained in the next section.
Once the FS has been defined, its operation is as follows.
As shown in Fig. 2, the first step is given by the fuzzifier,
the process by which the assignment of membership val-
ues (one for each fuzzy value of the linguistic variable) to
a numerical input value is made by using the membership
functions. The next step of the FS is given by the inference,
which calculates the degree of truth of each activated rule
as follows:
$$\alpha_k^{(x)} = \mu_{K_1}\left(CBR_{ji}^{(x)}\right) \ast \mu_{K_2}\left(HOR^{(x)}\right) \ast \mu_{K_3}\left(HOM_{ij}^{(x)}\right), \qquad (10)$$
where $\alpha_k^{(x)}$ is the degree of truth for the rule $k$ in the
adjacency $x$, and $\mu_{K_1}$, $\mu_{K_2}$ and $\mu_{K_3}$ are the membership
functions corresponding to the fuzzy sets $K_1$, $K_2$ and $K_3$,
respectively, involved in the rule $k$. The intersection of
the fuzzy sets, denoted here by '$\ast$', is implemented by using
the min-operator, which takes the minimum value of the
arguments. Finally, unlike the structure of a typical FLC,
the proposed FS does not implement the module known
as defuzzifier, where the activated fuzzy rules are all
aggregated to produce a non-fuzzy value. The reason for this is
that the output of the proposed scheme is given by a
two-dimensional variable whose elements are not correlated
with each other (e.g. $O^{(x)}$ can be either increased or
decreased while $H^{(x)}$ does not change). Thus, the fuzziness
between different consequents would not be applicable in
this work. Instead, the output of the proposed FS is
given by the consequent of the rule whose degree of truth
is the highest. It can be formally expressed as:
$$output^{(x)} = Y^{(x)}_{\arg\max_k \alpha_k^{(x)}}. \qquad (11)$$
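The fuzzifier, the min-based inference of Eq. (10) and the winner-takes-all output of Eq. (11) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the piecewise-linear shape of the membership functions and the CBR and HOR breakpoints are assumptions (only the axis marks 0.03, 1, and 6 and 8 dB are visible in Fig. 3), and the names `ramp_high`, `fuzzify`, `infer` and `BREAKPOINTS` are hypothetical.

```python
def ramp_high(x, lo, hi):
    """Assumed 'high' membership: 0 below lo, 1 above hi, linear in between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

# Hypothetical (lo, hi) breakpoints of the 'high' set for each input.
# Only the HOM pair (6, 8) dB is read directly from the Fig. 3 axis marks;
# the CBR and HOR ramps are assumed to be centred on 0.03 and 1.
BREAKPOINTS = {"CBR": (0.02, 0.04), "HOR": (0.5, 1.5), "HOM": (6.0, 8.0)}

# Antecedents of Table 1: rule number -> (CBR, HOR, HOM) labels.
RULES = {
    1: ("L", "L", "L"), 2: ("H", "L", "L"), 3: ("L", "H", "L"), 4: ("H", "H", "L"),
    5: ("L", "L", "H"), 6: ("H", "L", "H"), 7: ("L", "H", "H"), 8: ("H", "H", "H"),
}

def fuzzify(name, value):
    """Return the membership degrees of 'low' (L) and 'high' (H) for one input."""
    high = ramp_high(value, *BREAKPOINTS[name])
    return {"L": 1.0 - high, "H": high}

def infer(cbr, hor, hom):
    """Degree of truth of every rule (min-operator) and the winning rule (arg max)."""
    m_cbr, m_hor, m_hom = fuzzify("CBR", cbr), fuzzify("HOR", hor), fuzzify("HOM", hom)
    truth = {k: min(m_cbr[a], m_hor[b], m_hom[c]) for k, (a, b, c) in RULES.items()}
    winner = max(truth, key=truth.get)
    return winner, truth
```

For instance, `infer(0.10, 0.2, 2.0)` activates rule 2 (high CBR, low HOR, low HOM) under these assumed breakpoints; the candidate consequent of the winning rule is then taken from Table 1.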
Finally, to apply these changes to the network
configuration, two more steps are necessary. As represented
in Fig. 2, the former is the update of $O^{(x)}$ and $H^{(x)}$
considering the parameter variations given by the FS and
the latter is the conversion from $O^{(x)}$ and $H^{(x)}$ to
$HOM^{(x)}_{i \to j}$ and $HOM^{(x)}_{j \to i}$ given their
relationship expressed in Eq. (4).
3.2. Optimization of the fuzzy system
In the proposed FS, there are certain rules (4–8 in Table 1)
with more than one consequent defined. This means that, a
priori, it is unclear which is the most appropriate action for
these rules, since it may depend on many context factors
(e.g. the environment, the traffic distribution patterns, the
user mobility, etc.) at the moment of the execution of the
rules or simply because of the interactions between the
objectives of LB and HOO. Different strategies have been
investigated to create, adapt or refine rules [25–29]. In this
sense, mobile operators usually do not have the complete
knowledge to take proper actions in every network state.
Thus, due to the complex nature of network management,
Reinforcement Learning (RL) is of particular interest in this
context, as the system is able to learn from its own experi-
ence. In addition, unlike other mathematical approaches
(e.g. supervised learning in Neural Networks), in RL, a train-
ing data set is not required. For this reason, in this work, the
popular RL algorithm known as Q-Learning has been
adopted, so that the best consequent for each fuzzy rule
can be found through learning from interaction.
The combination of fuzzy logic and RL has been
addressed in some previous works [29–33]. However, the
proposed optimization algorithm differs from the common
implementation of the fuzzy Q-Learning algorithm [34].
This is because, in the case of a typical FLC, the q-function
(i.e. a characteristic function of fuzzy Q-Learning optimiza-
tion) is updated according to the degree of activation of each
triggered rule of the FLC. As a consequence, the q-function
can be updated for more than one input state, i.e. there
exists a certain degree of fuzziness. Conversely, in the case
of the proposed fuzzy system, the update of the q-function
is only made for one input state, as the number of rules that
can be activated at each optimization step is only one.
In RL, an agent is driven to take actions in an environ-
ment in order to maximize a cumulative reward. The
optimization scheme showing the combination of the FS
and the learning entity is depicted in Fig. 4. The basic ele-
ments in RL are the agent, the environment, the states,
the actions, the policy, the reward and the value function.
In this case, the agent that takes the actions is the proposed
FS, while the environment corresponds to the cellular net-
work. The states are given by the combination of the fuzzy
sets of the FS. Note that, for each state, there is one fuzzy
rule defined. The actions are given by the candidate conse-
quents of the rules and they represent a specific variation in
the HOM. The policy defines how the agent has to act at a
given time. The reward is a numerical value that expresses
the intrinsic desirability of being in a certain state. While
the reward indicates what is good in an immediate sense,
the value function specifies what is good in the long run.
In particular, the value function is a mapping between each
state and the total amount of reward that an agent can
expect to accumulate over the future, starting from that
state. In this sense, the objective of the agent is not to obtain
the maximum immediate reward, but to maximize the total
reward that the agent receives in the long run.

Fig. 4. Optimization scheme. [Diagram: the agent (fuzzy system, with the
Q-Learning value function and policy) interacts with the environment
(network); state: CBR, HOR, HOM; action: HOM; reward: CDR.]
RL methods are characterized by two important fea-
tures: the trial-and-error search and the fact that actions
may affect not only the immediate reward but also the
subsequent rewards. As previously stated, the agent has
to maximize the received reward in the long-term (or
expected cumulative reward), which is the sum of the
rewards that will be obtained from the input states visited
in the future:
$$R_t = r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \cdots = \sum_{k=0}^{\infty} \gamma^k r_{t+k+1}, \qquad (12)$$
where $r$ is the numerical reward obtained at each optimization
step after performing an action and $\gamma$ is the discount
rate determining the relative importance of future
rewards. In this paper, the action performed by an acti-
vated fuzzy rule will be rewarded positively if the connec-
tion quality is not significantly degraded. The immediate
reward, r, can be formally expressed by defining a specific
threshold for the CDR, which is the KPI that estimates the
connection quality, as stated in Section 2. Then, those
actions leading to a CDR equal to or less than the threshold
should be rewarded with a positive value, while those
actions producing a CDR higher than the threshold should
be punished with a negative value. Considering this, the
formula for the reward is expressed as:
$$r^{(x)} = \begin{cases} c & \text{if } CDR^{(x)}_{measured} \le CDR_{th}, \\ -c & \text{otherwise,} \end{cases} \qquad (13)$$
where $r^{(x)}$ is the reward for the adjacency $x$ and $c$ is a
constant that can be expressed as a common factor in the
definition of the reward in Eq. (12), so that the effect is a
scaling transformation that can be used to avoid
underflow/overflow issues in storing the q-function. In this
paper, $c = 10$ is assumed. In addition, $CDR_{th}$ is the threshold
defined at the network level to determine bad quality and
$CDR^{(x)}_{measured}$ is the maximum CDR between both cells in the
adjacency, i.e.:
$$CDR^{(x)}_{measured} = \max\left\{ CDR^{(x)}_i,\; CDR^{(x)}_j \right\}. \qquad (14)$$
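The reward of Eqs. (13) and (14), together with the discounted return of Eq. (12) truncated to a finite list of rewards, can be illustrated with the short sketch below. The function names and the threshold used in the usage note are hypothetical; only the constant c = 10 is taken from the text.

```python
def measured_cdr(cdr_i, cdr_j):
    """Eq. (14): the worse (maximum) CDR of the two cells in the adjacency."""
    return max(cdr_i, cdr_j)

def reward(cdr_i, cdr_j, cdr_th, c=10.0):
    """Eq. (13): +c if the measured CDR does not exceed the threshold, -c otherwise."""
    return c if measured_cdr(cdr_i, cdr_j) <= cdr_th else -c

def discounted_return(rewards, gamma):
    """Eq. (12) truncated to a finite sequence r_{t+1}, r_{t+2}, ... of rewards."""
    total = 0.0
    for k, r in enumerate(rewards):
        total += (gamma ** k) * r
    return total
```

With an assumed threshold of 0.02, `reward(0.01, 0.03, 0.02)` is negative because the worse cell exceeds the threshold, whereas `reward(0.01, 0.02, 0.02)` is positive.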
In the proposed FS, to quantify the benefits of executing
a certain rule consequent (i.e. the action) provided that a
rule has been activated (i.e. the state), the value q of a
state-action pair $[s, a]$ is defined. It is a discrete function,
denoted by $q[s, a]$, that expresses the expected cumulative
reward that can be received when taking action a from
state s. In this work, a discrete version of the Q-Learning
algorithm is considered, where the learned q-function
directly approximates the optimal one independently of
the policy followed by the agent [35].
The pseudo-code of the optimization algorithm is
shown in Fig. 5. After initializing the q-function, the
selection of the consequent for each rule (step 1) is made
by using a certain exploration/exploitation policy. Explora-
tion is needed since trying actions that have not yet been
selected is the only way to discover new actions that could
provide much more reward than other actions already
tested. Exploitation is also needed since the current knowl-
edge must be exploited to obtain reward. A widely-used
policy is the so-called $\epsilon$-greedy policy, defined as:
$$a_i = \arg\max_k q[i, k] \quad \text{with probability } 1 - \epsilon, \qquad (15)$$
$$a_i = \mathrm{random}\{a_k,\; k = 1, 2, \ldots, J\} \quad \text{with probability } \epsilon, \qquad (16)$$
where $a_i$ is the selected consequent for rule $i$ and $\epsilon$
determines the trade-off between exploration and exploitation
during the optimization process (e.g. $\epsilon = 0$ means no
exploration, so that the best action is always selected).
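A minimal sketch of this exploration/exploitation policy is given below. It assumes that the q-values of the candidate consequents of the activated rule are stored in a plain list; the function name `select_consequent` and its parameters are hypothetical.

```python
import random

def select_consequent(q_row, epsilon, rng=random):
    """Pick the index of a candidate consequent for the activated rule.

    q_row   -- list of q-values, one per candidate consequent of the rule
    epsilon -- exploration probability in [0, 1]
    rng     -- random source (the random module or a random.Random instance)
    """
    if rng.random() < epsilon:
        # Exploration: any candidate consequent, chosen uniformly at random.
        return rng.randrange(len(q_row))
    # Exploitation: the consequent with the highest stored q-value.
    return q_row.index(max(q_row))
```

With `epsilon = 0` the call always returns the index of the best-known consequent, which matches the no-exploration case mentioned above.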
Each time an action (i.e. a variation in HOM) is performed,
the network should evolve to a new state, $s'$, in
which the KPIs are collected again. At this time, the reward
of the action is computed by using Eq. (13), as stated in
step 2 (Fig. 5). Then, the so-called value of the new state,
denoted by $v[s']$, is calculated as:
$$v[s'] = \max_k q[s', a_k]. \qquad (17)$$
While the q-function quantifies the value of taking an
action when starting from a given state, the v-function
estimates the value of being in that state regardless of
the action to be taken. Note also that the new state $s'$ is
specified by the new activated fuzzy rule in the FS. From
$v[s']$, an error signal is calculated as follows:
$$\Delta q = r + \gamma \cdot v[s'] - q[s, a_i], \qquad (18)$$
where $\gamma$ is a discount factor. As observed, the first part of
the formula is the q-function calculated as the sum of the
immediate reward $r$ for state $s$ and the expected value of
the next state, $v[s']$. This is equivalent to Eq. (12), where
the immediate reward and future rewards (i.e. the
expected value of the next state) are accumulated. The last
part in Eq. (18) is taken from the stored $q[s, a]$. As a result,
$q[s, a]$ will be updated in the direction of the optimal
q-function independently of the policy followed by the agent
(step 3 in Fig. 5). Such an update is made by utilizing an
ordinary gradient descent, i.e.:
$$q[s, a_i] \leftarrow q[s, a_i] + \eta \cdot \Delta q, \qquad (19)$$
where $\eta$ is a learning rate. The above-described process is
repeated for the new current state (steps 4 and 5 in
Fig. 5) starting with the action selection (step 1).

Fig. 5. Pseudo-code of the optimization algorithm.
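The update described by Eqs. (17) to (19) can be sketched as a single function. This is an illustrative sketch rather than the pseudo-code of Fig. 5: the q-function is assumed to be stored as a dictionary that maps each rule number (the state) to a list with one q-value per candidate consequent in Table 1, and the discount factor and learning rate values used in the usage note are hypothetical, since they are not given in this section.

```python
# Number of candidate consequents per rule, as listed in Table 1.
N_CANDIDATES = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 2, 8: 2}

def init_q():
    """q-function initialised to zero for every [rule, consequent] pair."""
    return {rule: [0.0] * n for rule, n in N_CANDIDATES.items()}

def q_update(q, s, a, r, s_next, gamma, eta):
    """Apply one Q-Learning step in place and return the error signal.

    s, s_next -- previously and newly activated rule (state)
    a         -- index of the consequent applied in state s
    r         -- reward obtained for that action
    gamma     -- discount factor; eta -- learning rate
    """
    v_next = max(q[s_next])                  # Eq. (17): value of the new state
    delta_q = r + gamma * v_next - q[s][a]   # Eq. (18): error signal
    q[s][a] += eta * delta_q                 # Eq. (19): q-function update
    return delta_q
```

As a usage example with assumed values `gamma = 0.5` and `eta = 0.5`: starting from `q = init_q()`, the call `q_update(q, 4, 0, 10.0, 8, 0.5, 0.5)` raises `q[4][0]` to 5.0, and a later punished action, `q_update(q, 8, 1, -10.0, 4, 0.5, 0.5)`, lowers `q[8][1]` to -3.75.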
4. Performance analysis
4.1. Analysis setup
To assess the performance of the proposed joint optimi-
zation algorithm, a dynamic system-level simulator for LTE
macrocells has been used [36]. This simulator executes a
selectable number of optimization loops to emulate the
tuning process. Each loop comprises 7000 simulation steps,
equivalent to 12 min of actual network time. Each simula-
tion step includes updating user positions, propagation
computation, generation of new calls, and radio resource
management algorithms. At the end of each loop, measure-
ments and reliable statistics are obtained to be used in the
following optimization loop. Thus, in a certain loop, the
steps 1–5 of the algorithm described in Fig. 5 are executed
once.
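As a rough consistency check of these figures, and assuming that each simulation step corresponds to the 100 ms time resolution listed in Table 2, the loop length and the approximate number of optimization loops in a full run work out as:

$$7000 \times 0.1\ \mathrm{s} = 700\ \mathrm{s} \approx 11.7\ \mathrm{min} \approx 12\ \mathrm{min}, \qquad \frac{3200\ \mathrm{min}}{12\ \mathrm{min}} \approx 267\ \text{loops}.$$

The second figure uses the nominal 12 min loop time and the 3200 min simulation duration from Table 2, so it is only an estimate of how many times steps 1–5 are executed.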
The simulated scenario includes a macro-cellular envi-
ronment with a layout consisting of 19 tri-sectorized sites
evenly distributed in the scenario, as shown in Fig. 6. The
main simulation parameters are summarized in Table 2.
For simplicity, only the downlink is considered in the sim-
ulation. The service provided to users is the voice call as it is
the main service affected by the tuning process. The traffic
is unevenly distributed in space, where some
cells in the center of the scenario have higher traffic density
than the surrounding cells. In addition, to thoroughly assess
the proposed algorithm, three different configurations have
been considered. Firstly, the simulated high-load scenario is
given by the presence of a greater number of users moving
at low speed (3 km/h) around the scenario, where the CBR
is expected to be high. Secondly, the simulated high-mobility
scenario is given by the presence of high-speed users
(50 km/h), which in principle would lead to a high HOR.
In this case, the number of users is not high, but the
unevenly distributed traffic in the scenario can lead to
congestion situations, especially in the central area. Finally, the
third scenario is a combination of the two previous scenarios,
so that high-load and high-speed users are simulated.

Fig. 6. Block diagram of the simulation process. [Diagram: the scenario,
with its congested area, provides indicators (including CBR and CDR) to the
evaluated methods (load balancing, handover optimization, uncoordinated,
baseline algorithm and fuzzy system), which return parameters.]

Table 2
Simulation parameters.

Parameter                Configuration
Cellular layout          Hexagonal grid, 57 cells (3 × 19 sites), cell radius 0.5 km
Transmission direction   Downlink
Carrier frequency        2.0 GHz
System bandwidth         1.4 MHz
Frequency reuse          1
Propagation model        Okumura–Hata with wrap-around;
                         log-normal slow fading, $\sigma_{sf}$ = 8 dB and correlation distance = 50 m
Channel model            Multipath fading, EPA model
Mobility model           Random direction; low speed = 3 km/h; high speed = 50 km/h
Service model            Constant bit rate (voice call), Poisson traffic arrival,
                         mean call duration 120 s, 16 kbps
Base station model       Tri-sectorized antenna, SISO, EIRPmax = 43 dBm
Scheduler                Time domain: Round-Robin; frequency domain: Best Channel
Power control            Equal transmit power per PRB
Link adaptation          Fast, CQI based, perfect estimation
Handover                 Time-To-Trigger = 100 ms; HOM: [−24, 24] dB
Call dropping            SINR: −6.9 dB
Traffic distribution     Unevenly distributed in space
Time resolution          100 TTI (100 ms)
Loop time                12 min
Simulation duration      3200 min
Optimization algorithm   d = 1 dB, s = 0.5 dB
To compare the proposed method with reference cases,
as shown in Fig. 6, the independent SON functions of LB
and HOO, taken from [23,24] respectively, have been
implemented and simulated in two different ways. In one
of them, only one function is active in the network, while
in the other configuration both LB and HOO functions are
simultaneously executed in an uncoordinated way. In
addition, a baseline optimization scheme following the
main principles addressed in [18] has been implemented.
This scheme prioritizes the HOO part depending on
whether the connection quality is jeopardized or not. More
specifically, if the CDR is above a certain threshold, only the
HOO function is executed. The performance of these
approaches will be assessed by looking at the main related
KPIs, in particular, the overall HOR, CBR and CDR. A Figure-
of-Merit (FoM), U, that combines the previous KPIs into a
scalar value has also been considered. This FoM character-
izes, qualitatively, the overall performance of the evaluated
approaches. Formally, U is defined as [37]:
$$U = k \cdot \left( CBR[\%] + \left(1 - CBR[\%]/100\right) \cdot CDR[\%] \right) + HOR, \qquad (20)$$
where k is a constant weight determining the relative
importance of the CBR and CDR (both related to user dis-
satisfaction) compared with the HO signaling cost given
by HOR. In this study, k equal to 1 is assumed.
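The figure of merit of Eq. (20) is straightforward to compute from the three KPIs. The sketch below assumes that CBR and CDR are given in percent and HOR as the ratio used in the paper; the function name `figure_of_merit` is hypothetical.

```python
def figure_of_merit(cbr_pct, cdr_pct, hor, k=1.0):
    """Eq. (20): weighted user dissatisfaction plus handover signaling cost.

    cbr_pct, cdr_pct -- CBR and CDR in percent
    hor              -- handover ratio
    k                -- weight of the dissatisfaction term (1 in this study)
    """
    # Only calls that were not blocked (1 - CBR/100) can be dropped.
    dissatisfaction = cbr_pct + (1.0 - cbr_pct / 100.0) * cdr_pct
    return k * dissatisfaction + hor
```

For example, a CBR of 5%, a CDR of 2% and an HOR of 1 give U = 5 + 0.95 · 2 + 1 = 7.9, so lower values of U indicate a better trade-off.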
4.2. Simulation results
First, a sensitivity analysis for determining the optimal
values of d and s (i.e. the variation of O and H components,
respectively) has been carried out. Fig. 7 shows the mean of
the related KPIs and the proposed FoM, U, for the three dif-
ferent situations: high-load, high-mobility and both
together. As observed, U is a combination of the KPIs
related to user dissatisfaction (i.e. the CBR and CDR) and
the KPI related to the HO signaling cost (i.e. the HOR).
In the high-load scenario (Fig. 7(a)), the variations of d
and s have low impact on HOR since users have low mobil-
ity, meaning that the impact of HOR on U will also be
minor. Due to this, the variations in U are mainly given
by the user dissatisfaction. In this sense, there is a clear
trade-off between CBR and CDR, i.e. while CBR is reduced
(by increasing d), CDR is greater. However, for high values
of d, the variations in CBR are greater than in CDR. As a con-
sequence, the best values of U (i.e. the lowest) correspond
to high values of d. This is in contrast to the situations with
high-mobility, as explained below.
Fig. 7. Sensitivity analysis for d and s in different scenarios: (a) high-load, (b) high-mobility and (c) high-load and high-mobility.

The second scenario given by high-mobility (Fig. 7(b))
shows that HOR increases for larger values of d, especially
when s is 1 dB. The reason for this is that resizing the cell
service areas for load balancing purposes leads cell-edge
users to be under worse radio conditions after performing
an HO, so that the probability to perform a new HO to
other neighbor cells is increased. Conversely, provided that
d is low (avoiding the effect of load balancing on HOR), for
larger values of s, HOR decreases. This is in line with the
optimization of the H component, i.e. increasing H makes
it more difficult to perform an HO and reduces the HO
frequency. The main drawback for this case is that the CDR is
negatively affected. The high-mobility scenario also pro-
duces lower values of CBR and CDR because the traffic is
geographically dispersed due to the high speed of the
users. The configuration (d = 1, s = 0.5) dB provides the
lowest value of U, as a result of a better trade-off between
HO signaling and user dissatisfaction.
The above analysis can also be extended to the scenario
that combines both high-load and high-mobility (Fig. 7(c)).
Since an important objective of the proposed algorithm is to
optimize mobility and load balancing without jeopardizing
the connection quality, the high values of CDR measured in
this scenario establish the possible range of variation of d
and s. In particular, it is observed that values of s above
0.5 dB involve a CDR greater than 5%, which would cause
serious inconvenience to operators. Leaving s fixed to
0.5 dB, the increase of d can also lead to high values of
CDR. In particular, values above 3 dB would significantly
jeopardize the CDR. For this reason, the range of d and s
analyzed in this work does not exceed the limits shown in
Fig. 7. As in the previous high-mobility scenario, the opti-
mal configuration is (d = 1, s = 0.5) dB, meaning that this
setting can be reasonably used to evaluate the performance
of the proposed algorithm against other approaches.
The comparison of the proposed fuzzy system with
other approaches is represented in Fig. 8, where the evolution
over time of the KPIs for each strategy is
depicted. For the sake of clarity, the represented values
have been averaged with the six subsequent samples.
The initial situation is given by low traffic and low mobility.
After about 200 min, the central cells of the scenario
become crowded, so that many users are blocked, increas-
ing the CBR. Looking more closely at this indicator, the
evaluated approaches reach values of CBR ≈5% when the
traffic change occurs. The HOO configuration is not able
to solve this problem, keeping the CBR at such high values,
while the LB configuration achieves a reduction of 2% in a
few optimization steps. Conversely, the gain in CBR
obtained by the uncoordinated alternative, the baseline
scheme and the proposed fuzzy system is more moderate.
A higher number of users also means more interference in
the network, so that the connection quality of the users is
worse, increasing the CDR. This increase is more pro-
nounced in the case of the uncoordinated approach. To
explain this, note that the LB and HOO functions are simul-
taneously changing the HOM from the first optimization
steps. As the CDR is not significantly affected by these
changes (due to the low interference conditions), the
HOM reaches large values. As a result, when the congestion
situation occurs, the HOM values are so large that the CDR
becomes high. After this, the SON functions attempt to
reduce this KPI. In the case of the baseline approach, this
effect on CDR is attenuated because the LB function is
switched off when the CDR becomes high. Since HOMs
are not adjusted by the LB function, the level of CDR is
not as high as with the uncoordinated scheme. The remaining
configurations, i.e. LB, HOO and the fuzzy system, keep
the CDR constant at around 2%. Due to the presence of only
low-speed users, the HOR is about 1.
The offered traffic experiences a small reduction at
around min. 1000, but it is not until around min. 1200
that the users start moving at high speed. The high-mobility
scenario starts at this moment and the HOR
abruptly increases to values above 10, except in the case
of the proposed fuzzy system, whose values remain
below 7 over time. Thus, the performance of the proposed
technique in terms of HOR is clearly better than the rest of
the strategies. It is also noted that the trajectory of HOR
followed by the uncoordinated and baseline approaches
is very similar since the HOO function is active during
the entire simulation. Regarding the CBR and CDR, the LB
and the proposed approaches lead to values below 1%,
while the rest of strategies produce undesirably higher val-
ues. Note also that, for all the cases, the CBR decreases sig-
nificantly from the previous situation (i.e. before min.
1200) because the traffic load is geographically dispersed
due to the presence of fast users.
The situation after ≈2100 min is given by a new increase
of the offered traffic, so that the last part of the simulation
includes both high-load and high-mobility. Looking at the
HOR, the proposed method remains at low values, being
the best approach from this perspective at any time. Simi-
larly, the LB approach keeps a relatively constant but higher
level of HOR values, since no actions to reduce HO signaling
take place in this case. The HOO, the uncoordinated and the
baseline approaches lead to a gradual increase in this KPI.
The reason for this is that these strategies implement the
same HOO function, which attempts to decrease the high
peak in the CDR at the expense of increasing the number
of HOs. However, the impact of these three methods on
the CDR is not the same. In particular, the baseline approach
provides lower CDR values than those obtained by the
uncoordinated approach because the LB function is
switched off when the CDR is jeopardized after the
variation in traffic load. The HOO approach gives even
lower values of CDR since the LB function is not executed
during the entire simulation. From the CBR perspective,
the LB approach provides the lowest values while the CDR
is also quite low, similar to the fuzzy system. The proposed
method gives better CBR than other approaches and, as pre-
viously stated, the HOR is the lowest as well.
The evolution of U over time (Fig. 8(d))
shows the suitability of the evaluated methods in each sce-
nario. It is noted that the strategy with the lowest value of
U will establish a good trade-off between HO signaling and
user dissatisfaction. In the first scenario, given by high-
load conditions, the best method is the execution of LB
alone, which significantly reduces the CBR but at the
expense of an increase in the CDR that is higher than in
the case of the proposed scheme. This is because the
scenario has low mobility and does not require any

optimization from the HOO perspective. In the second sce-
nario, determined by the presence of high-speed users, the
proposed fuzzy system provides the lowest value of U,
since it considerably reduces the HOR. At the beginning
of the third scenario (a combination of the two previous),
the proposed scheme also achieves lower U values than
those obtained by the LB approach since this latter method
needs more iterations to reduce the CBR. Thus, it can be
highlighted that the proposed joint optimization method
is the only solution that, in the presence of mobility and
congestion problems (i.e. scenarios two and three), reduces
both the HOR and the CBR, which are the objectives of the
HOO and LB, respectively. In this sense, note that the LB
approach does not reduce the HOR in the second scenario,
which is mainly determined by high-mobility.
5. Conclusion
In this paper, a novel joint optimization algorithm for
LB and HOO functions has been proposed. First, the
optimized parameter HOM is broken down into two com-
ponents, O(x)
and H(x)
, which are directly related to LB and
HOO, respectively. Then, an FS that adjusts the HOM com-
ponents at the cell adjacency level for the joint optimiza-
tion of both functions is proposed. Finally, the FS is
teamed with the Q-Learning algorithm, which leads the
FS to select suitable actions from the LB/HOO perspective,
without jeopardizing the connection quality of the active
users in the network. The proposed technique has been
compared with a baseline scheme based on the existing
literature and the reference cases in which LB and
HOO operate separately or even simultaneously in an
uncoordinated way. In addition, these techniques have
been assessed in extreme scenarios in which the HOM
reaches large values, such as those with high traffic load
and/or high mobility.

Fig. 8. Temporal evolution of (a) HOR, (b) CBR, (c) CDR and (d) U for different
approaches (legend: LB, HOO, Uncoordinated, Baseline, Fuzzy System).
Results show that the proposed scheme effectively
improves network performance over the reference cases.
In particular, the HOR in the presence of high-mobility
users can be reduced by half, while the user
dissatisfaction in terms of the CBR and CDR remains
similar to the baseline schemes. In addition, it is the only
solution that is able to partially alleviate a congestion
situation and to reduce the number of HOs, which are
the main objectives of the LB and HOO, respectively. Unlike
other reference methods, the proposed technique does not
produce high peaks in the KPIs when the situation changes
abruptly, e.g. some cells become congested. In the context
of SON, it is highlighted that the complexity of the SON
entity that coordinates SON specific functions would be
reduced, as it is freed from the coordination of the two
important SON functions, LB and HOO. Finally, an advantage
of using fuzzy logic is that the proposed design is
easy to implement.
Acknowledgment
This work has partially been supported by the Junta de
Andalucía (Excellence Research Program, Projects P08-TIC-
4052 and P12-TIC-2905).
References
[1] L.C. Schmelz et al., Self-configuration, -optimisation and -healing in
wireless networks, in: Wireless World Research Forum Meeting, vol.
20, 2008.
[2] 3GPP, Evolved Universal Terrestrial Radio Access (E-UTRA) and
Evolved Universal Terrestrial Radio Access Network (E-UTRAN);
Overall description; Stage 2, version 11.4.0 (2012-12), TS 36.300.
[3] 3GPP, Self-Organizing Networks (SON) Policy Network Resource
Model (NRM) Integration Reference Point (IRP); Requirements,
version 11.1.0 (2012-12), TS 32.521.
[4] I. Viering, M. Döttling, A. Lobinger, A mathematical perspective of
self-optimizing wireless networks, in: Proc. of International
Conference on Communications (ICC ’09), 2009.
[5] 3GPP, Self-Organizing Networks (SON) Policy Network Resource
Model (NRM) Integration Reference Point (IRP); Information Service
(IS), version 11.4.0 (2012-12), TS 32.522.
[6] K. Tsagkaris, N. Koutsouris, P. Demestichas, R. Combes, SON
coordination in a unified management framework, in: Proc. of IEEE
77th Vehicular Technology Conference (VTC), Spring, 2013.
[7] X. Gelabert, B. Sayrac, S. Ben Jemaa, A heuristic coordination
framework for self-optimizing mechanisms in LTE HetNets, IEEE
Trans. Veh. Technol. 63 (3) (2013) 1320–1334.
[8] R. Combes, Z. Altman, E. Altman, Coordination of autonomic
functionalities in communications networks, in: CoRR abs/
1209.1236, 2012.
[9] H. Lateef, A. Imran, A. Abu-Dayya, A framework for classification of
self-organising network conflicts and coordination algorithms, in:
Proc. of IEEE 24th International Symposium on Personal Indoor and
Mobile Radio Communications (PIMRC), 2013.
[10] L. Schmelz, M. Amirijoo, A. Eisenblaetter, R. Litjens, M. Neuland, J.
Turk, A coordination framework for self-organisation in LTE
networks, in: Proc. of IEEE International Symposium on Integrated
Network Management (IM), 2011 IFIP, 2011, pp. 193–200.
[11] P. Vlacheas, E. Thomatos, K. Tsagkaris, P. Demestichas, Operator-
governed SON coordination in downlink LTE networks, in: Proc. of
Future Network Mobile Summit (FutureNetw), 2012.
[12] INFSO-ICT-216284 SOCRATES, Framework for the Development of
Self-organisation Methods, Tech. Rep. Deliverable D2.4, Version
1.0.3, September, 2008.
[13] W. Li, X. Duan, S. Jia, L. Zhang, Y. Liu, J. Lin, A dynamic hysteresis-
adjusting algorithm in LTE self-organization networks, in: Proc. of
IEEE 75th Vehicular Technology Conference (VTC), Spring, 2012.
[14] Y. Li, M. Li, B. Cao, Y. Wang, W. Liu, Dynamic optimization of
handover parameters adjustment for conflict avoidance in long term
evolution, China Commun. 10 (1) (2013) 56–71.
[15] R. Romeikat, H. Sanneck, T. Bandh, Efficient, dynamic coordination of
request batches in C-SON systems, in: Proc. of IEEE 77th Vehicular
Technology Conference (VTC), Spring, 2013.
[16] H. Klessig, A. Fehske, G. Fettweis, J. Voigt, Improving coverage and
load conditions through joint adaptation of antenna tilts and cell
selection rules in mobile networks, in: Proc. of International
Symposium on Wireless Communication Systems (ISWCS), 2012.
[17] J. Chen, H. Zhuang, B. Andrian, Y. Li, Difference-based joint
parameter configuration for MRO and MLB, in: Proc. of IEEE 75th
Vehicular Technology Conference (VTC), Spring, 2012.
[18] W.-Y. Li, X. Zhang, S.-C. Jia, X.-Y. Gu, L. Zhang, X.-Y. Duan, J.-R. Lin, A
novel dynamic adjusting algorithm for load balancing and handover
co-optimization in LTE SON, J. Comput. Sci. Technol. 28 (3) (2013)
437–444.
[19] 3GPP, Evolved Universal Terrestrial Radio Access (E-UTRA); Radio
Resource Control (RRC); Protocol specification, version 11.2.0 (2012-
12), TS 36.331.
[20] T. Ross, Fuzzy Logic with Engineering Applications, Wiley, 2010.
[21] A. Engelbrecht, Computational Intelligence: An Introduction, John
Wiley & Sons, 2007.
[22] C. Lee, Fuzzy logic in control systems: fuzzy logic controller. I, IEEE
Trans. Syst., Man Cybernet. 20 (2) (1990) 404–418.
[23] P. Muñoz, R. Barco, I. de la Bandera, Optimization of load balancing
using fuzzy Q-Learning for next generation wireless networks,
Expert Syst. Appl. 40 (4) (2013) 984–994.
[24] P. Muñoz, R. Barco, I. de la Bandera, On the potential of handover
parameter optimization for self-organizing networks, IEEE Trans.
Veh. Technol. 62 (5) (2013) 1895–1905.
[25] K.C. Foong, C.T. Chee, L.S. Wei, Adaptive network fuzzy inference
system (ANFIS) handoff algorithm, in: Proc. of the International
Conference on Future Computer and Communication (ICFCC), 2009.
[26] A. Çalhan, C. Çeken, An optimum vertical handoff decision algorithm
based on adaptive fuzzy logic and genetic algorithm, Wireless Pers.
Commun. (2010) 1–18.
[27] L. Giupponi, R. Agustí, J. Pérez-Romero, O. Sallent, A framework for
JRRM with resource reservation and multiservice provisioning in
heterogeneous networks, Mobile Networks Appl. 11 (2006) 825–
846.
[28] M. Dirani, Z. Altman, Self-organizing networks in next generation
radio access networks: application to fractional power control,
Comput. Networks 55 (2) (2011) 431–438.
[29] R. Nasri, A. Samhat, Z. Altman, A new approach of UMTS-WLAN load
balancing; algorithm and its dynamic optimization, in: Proc. of IEEE
International Symposium on a World of Wireless, Mobile and
Multimedia Networks, 2007.
[30] A. Galindo-Serrano, L. Giupponi, Downlink femto-to-macro
interference management based on fuzzy Q-learning, in: Proc. of
International Symposium on Modeling and Optimization in Mobile,
Ad Hoc and Wireless Networks (WiOpt), 2011.
[31] M. Haddad, Z. Altman, S. Elayoubi, E. Altman, A Nash–Stackelberg
fuzzy Q-learning decision approach in heterogeneous cognitive
networks, in: Proc. of IEEE Global Telecommunications Conference
(GLOBECOM), 2010.
[32] R. Razavi, S. Klein, H. Claussen, A fuzzy reinforcement learning
approach for self-optimization of coverage in LTE networks, Bell
Labs Tech. J. 15 (3) (2010) 153–175.
[33] Y.H. Chen, C.J. Chang, C.Y. Huang, Fuzzy Q-learning admission
control for WCDMA/WLAN heterogeneous networks with
multimedia traffic, IEEE Trans. Mobile Comput. 8 (11) (2009)
1469–1479.
[34] P.Y. Glorennec, Fuzzy Q-learning and dynamical fuzzy Q-learning,
in: Proc. of the Third IEEE Conference on Fuzzy Systems, vol. 1, 1994,
pp. 474–479.
[35] C. Watkins, P. Dayan, Technical note: Q-learning, Mach. Learn. 8 (3)
(1992) 279–292.

[36] P. Muñoz, I. de la Bandera, F. Ruiz, S. Luna-Ramírez, R. Barco, M. Toril,
P. Lázaro, J. Rodríguez, Computationally-efficient design of a
dynamic system-level LTE simulator, Int. J. Electron. Telecommun.
57 (3) (2011) 347–358.
[37] J. Ruiz-Avilés, S. Luna-Ramírez, M. Toril, F. Ruiz, Traffic steering by
self-tuning controllers in enterprise LTE femtocells, EURASIP J.
Wireless Commun. Network. 2012 (337) (2012).

Pablo Muñoz received his M.Sc. and Ph.D. degrees in Telecommunication
Engineering from the University of Málaga (Spain) in 2008 and 2013,
respectively. He is currently working with the Communications
Engineering Department at the same university. Since September 2009,
he has been a Ph.D. Fellow, working on self-optimization of mobile
radio access networks and radio resource management.

Raquel Barco received the M.Sc. degree in Telecommunication
Engineering in 1997 and the Ph.D. degree in 2007 from the University
of Málaga, Spain. From 1998 to 2000, she worked at the European Space
Agency, Darmstadt, Germany. From 2000 to 2003, she worked part-time
for Nokia Networks. Currently, she is an Associate Professor at the
Communication Engineering Department, University of Málaga. She has
published more than 50 papers in international journals and
conferences and has been involved in several projects with companies.
Her research interests are in the field of mobile communication
systems, especially Self-Organizing Networks.

Isabel de la Bandera received her M.Sc. degree in Telecommunication
Engineering from the University of Málaga (Spain) in 2009. In 2008,
she worked on RFID projects with the Communications Engineering
Department at the same university. Since February 2010, she has been
with the same department, working on projects on radio resource
management in next generation mobile networks, and she is working
toward the Ph.D. degree in Telecommunications Engineering.
