CPS 230
DESIGN AND ANALYSIS
OF ALGORITHMS
Fall 2008
Instructor: Herbert Edelsbrunner
Teaching Assistant: Zhiqiang Gu
CPS 230 Fall Semester of 2008
Table of Contents
1 Introduction
I DESIGN TECHNIQUES
2 Divide-and-Conquer
3 Prune-and-Search
4 Dynamic Programming
5 Greedy Algorithms
First Homework Assignment
II SEARCHING
6 Binary Search Trees
7 Red-Black Trees
8 Amortized Analysis
9 Splay Trees
Second Homework Assignment
III PRIORITIZING
10 Heaps and Heapsort
11 Fibonacci Heaps
12 Solving Recurrence Relations
Third Homework Assignment
IV GRAPH ALGORITHMS
13 Graph Search
14 Shortest Paths
15 Minimum Spanning Trees
16 Union-Find
Fourth Homework Assignment
V TOPOLOGICAL ALGORITHMS
17 Geometric Graphs
18 Surfaces
19 Homology
Fifth Homework Assignment
VI GEOMETRIC ALGORITHMS
20 Plane-Sweep
21 Delaunay Triangulations
22 Alpha Shapes
Sixth Homework Assignment
VII NP-COMPLETENESS
23 Easy and Hard Problems
24 NP-Complete Problems
25 Approximation Algorithms
Seventh Homework Assignment
1 Introduction
Meetings. We meet twice a week, on Tuesdays and
Thursdays, from 1:15 to 2:30pm, in room D106 LSRC.
Communication. The course material will be delivered
in the two weekly lectures. A written record of the lec-
tures will be available on the web, usually a day after the
lecture. The web also contains other information, such as
homework assignments, solutions, useful links, etc. The
main supporting text is
TARJAN. Data Structures and Network Algorithms. SIAM,
1983.
The book focuses on fundamental data structures and
graph algorithms, and additional topics covered in the
course can be found in the lecture notes or other texts in
algorithms such as
KLEINBERG AND TARDOS. Algorithm Design. Pearson Ed-
ucation, 2006.
Examinations. There will be a final exam (covering the
material of the entire semester) and a midterm (at the be-
ginning of October). You may want to freshen up your
math skills before going into this course. The weighting
of exams and homework used to determine your grades is
homework 35%,
midterm 25%,
final 40%.
Homework. We have seven homeworks scheduled
throughout this semester, one per main topic covered in
the course. The solutions to each homework are due one
and a half weeks after the assignment. More precisely,
they are due at the beginning of the third lecture after the
assignment. The seventh homework may help you prepare
for the final exam and solutions will not be collected.
Rule 1. The solution to any one homework question must
fit on a single page (together with the statement of the
problem).
Rule 2. The discussion of questions and solutions before
the due date is not discouraged, but you must formu-
late your own solution.
Rule 3. The deadline for turning in solutions is 10 min-
utes after the beginning of the lecture on the due date.
Overview. The main topics to be covered in this course
are
I Design Techniques;
II Searching;
III Prioritizing;
IV Graph Algorithms;
V Topological Algorithms;
VI Geometric Algorithms;
VII NP-completeness.
The emphasis will be on algorithm design and on algo-
rithm analysis. For the analysis, we frequently need ba-
sic mathematical tools. Think of analysis as the measure-
ment of the quality of your design. Just like you use your
sense of taste to check your cooking, you should get into
the habit of using algorithm analysis to justify design de-
cisions when you write an algorithm or a computer pro-
gram. This is a necessary step to reach the next level in
mastering the art of programming. I encourage you to im-
plement new algorithms and to compare the experimental
performance of your program with the theoretical predic-
tion gained through analysis.
I DESIGN TECHNIQUES
2 Divide-and-Conquer
3 Prune-and-Search
4 Dynamic Programming
5 Greedy Algorithms
First Homework Assignment
2 Divide-and-Conquer
We use quicksort as an example for an algorithm that fol-
lows the divide-and-conquer paradigm. It has the repu-
tation of being the fastest comparison-based sorting algo-
rithm. Indeed it is very fast on the average but can be slow
for some input, unless precautions are taken.
The algorithm. Quicksort follows the general paradigm
of divide-and-conquer, which means it divides the un-
sorted array into two, it recurses on the two pieces, and it
finally combines the two sorted pieces to obtain the sorted
array. An interesting feature of quicksort is that the divide
step separates small from large items. As a consequence,
combining the sorted pieces happens automatically with-
out doing anything extra.
void QUICKSORT(int ℓ, r)
if ℓ < r then m = SPLIT(ℓ, r);
QUICKSORT(ℓ, m − 1);
QUICKSORT(m + 1, r)
endif.
We assume the items are stored in A[0..n − 1]. The array
is sorted by calling QUICKSORT(0, n − 1).
Splitting. The performance of quicksort depends heav-
ily on the performance of the split operation. The effect of
splitting from ℓ to r is:
• x = A[ℓ] is moved to its correct location at A[m];
• no item in A[ℓ..m − 1] is larger than x;
• no item in A[m + 1..r] is smaller than x.
Figure 1 illustrates the process with an example. The nine
items are split by moving a pointer i from left to right
and another pointer j from right to left. The process stops
when i and j cross. To get splitting right is a bit delicate,
in particular in special cases. Make sure the algorithm is
correct for (i) x is smallest item, (ii) x is largest item, (iii)
all items are the same.
int SPLIT(int ℓ, r)
x = A[ℓ]; i = ℓ; j = r + 1;
repeat repeat i++ until x ≤ A[i];
repeat j-- until x ≥ A[j];
if i < j then SWAP(i, j) endif
until i ≥ j;
SWAP(ℓ, j); return j.
Figure 1: First, i and j stop at items 9 and 1, which are then swapped. Second, i and j cross and the pivot, 7, is swapped with item 2.
Special cases (i) and (iii) are ok but case (ii) requires a
stopper at A[r + 1]. This stopper must be an item at least
as large as x. If r < n − 1 this stopper is automatically
given. For r = n − 1, we create such a stopper by setting
A[n] = +∞.
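The following is a minimal C sketch of this split step, under stated assumptions: the items sit in a global array A with one spare slot at the end for the stopper, and the lower-case names split and swap are ours rather than the notes'.

#include <limits.h>

int A[128];                 /* the items; one extra slot at the end for the stopper */

static void swap(int i, int j) { int t = A[i]; A[i] = A[j]; A[j] = t; }

/* Splits A[l..r] around the pivot x = A[l] and returns its final position m:
   afterwards no item in A[l..m-1] is larger and no item in A[m+1..r] is
   smaller than x. Assumes A[r+1] holds a stopper at least as large as x. */
int split(int l, int r) {
    int x = A[l], i = l, j = r + 1;
    for (;;) {
        do { i++; } while (A[i] < x);    /* repeat i++ until x <= A[i] */
        do { j--; } while (A[j] > x);    /* repeat j-- until x >= A[j] */
        if (i >= j) break;
        swap(i, j);
    }
    swap(l, j);
    return j;
}

Calling split(0, n − 1) after setting A[n] = INT_MAX mirrors the stopper A[n] = +∞ described above.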
Running time. The actions taken by quicksort can be
expressed using a binary tree: each (internal) node repre-
sents a call and displays the length of the subarray; see
Figure 2. The worst case occurs when A is already sorted.
Figure 2: The total amount of time is proportional to the sum of lengths, which are the numbers of nodes in the corresponding subtrees. In the displayed case this sum is 29.
In this case the tree degenerates to a list without branch-
ing. The sum of lengths can be described by the following
recurrence relation:
T(n) = n + T(n − 1) = \sum_{i=1}^{n} i = \binom{n+1}{2}.

The running time in the worst case is therefore in O(n^2). In the best case the tree is completely balanced and the sum of lengths is described by the recurrence relation

T(n) = n + 2 · T((n − 1)/2).
If we assume n = 2^k − 1 we can rewrite the relation as

U(k) = (2^k − 1) + 2 · U(k − 1)
     = (2^k − 1) + 2 · (2^{k−1} − 1) + . . . + 2^{k−1} · (2 − 1)
     = k · 2^k − \sum_{i=0}^{k−1} 2^i
     = 2^k · k − (2^k − 1)
     = (n + 1) · \log_2(n + 1) − n.

The running time in the best case is therefore in O(n log n).
Randomization. One of the drawbacks of quicksort, as
described until now, is that it is slow on rather common
almost sorted sequences. The reason is pivots that tend
to create unbalanced splittings. Such pivots tend to oc-
cur in practice more often than one might expect. Hu-
man and often also machine generated data is frequently
biased towards certain distributions (in this case, permuta-
tions), and it has been said that 80% of the time or more,
sorting is done on either already sorted or almost sorted
files. Such situations can often be helped by transferring
the algorithm’s dependence on the input data to internally
made random choices. In this particular case, we use ran-
domization to make the choice of the pivot independent of
the input data. Assume RANDOM(ℓ, r) returns an integer
p ∈ [ℓ, r] with uniform probability:
Prob[RANDOM(ℓ, r) = p] = \frac{1}{r − ℓ + 1}

for each ℓ ≤ p ≤ r. In other words, each p ∈ [ℓ, r] is
equally likely. The following algorithm splits the array
with a random pivot:
int RSPLIT(int ℓ, r)
p = RANDOM(ℓ, r); SWAP(ℓ, p);
return SPLIT(ℓ, r).
We get a randomized implementation by substituting
RSPLIT for SPLIT. The behavior of this version of quick-
sort depends on p, which is produced by a random number
generator.
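As a hedged sketch, the same idea in C, using the standard library's rand() in place of the RANDOM function assumed above (the slight modulo bias of rand() is ignored here) and reusing split and swap from the earlier sketch.

#include <stdlib.h>

/* Chooses a pivot position uniformly in [l, r], moves it to the front,
   and reuses the deterministic split. */
int rsplit(int l, int r) {
    int p = l + rand() % (r - l + 1);
    swap(l, p);
    return split(l, r);
}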
Average analysis. We assume that the items in A[0..n−
1] are pairwise different. The pivot splits A into
A[0..m − 1], A[m], A[m + 1..n − 1].
By assumption on function RSPLIT, the probability for each m ∈ [0, n − 1] is 1/n. Therefore the average sum of array lengths split by QUICKSORT is

T(n) = n + \frac{1}{n} · \sum_{m=0}^{n−1} (T(m) + T(n − m − 1)).

To simplify, we multiply with n and obtain a second relation by substituting n − 1 for n:

n · T(n) = n^2 + 2 · \sum_{i=0}^{n−1} T(i),                       (1)
(n − 1) · T(n − 1) = (n − 1)^2 + 2 · \sum_{i=0}^{n−2} T(i).       (2)

Next we subtract (2) from (1), we divide by n(n + 1), we use repeated substitution to express T(n) as a sum, and finally split the sum in two:

\frac{T(n)}{n + 1} = \frac{T(n − 1)}{n} + \frac{2n − 1}{n(n + 1)}
                   = \frac{T(n − 2)}{n − 1} + \frac{2n − 3}{(n − 1)n} + \frac{2n − 1}{n(n + 1)}
                   = \sum_{i=1}^{n} \frac{2i − 1}{i(i + 1)}
                   = 2 · \sum_{i=1}^{n} \frac{1}{i + 1} − \sum_{i=1}^{n} \frac{1}{i(i + 1)}.
Bounding the sums. The second sum is solved directly
by transformation to a telescoping series:
\sum_{i=1}^{n} \frac{1}{i(i + 1)} = \sum_{i=1}^{n} \left( \frac{1}{i} − \frac{1}{i + 1} \right) = 1 − \frac{1}{n + 1}.
The first sum is bounded from above by the integral of 1/x for x ranging from 1 to n + 1; see Figure 3. The sum of 1/(i + 1) is the sum of areas of the shaded rectangles, and because all rectangles lie below the graph of 1/x we get a bound for the total rectangle area:

\sum_{i=1}^{n} \frac{1}{i + 1} ≤ \int_{1}^{n+1} \frac{dx}{x} = \ln(n + 1).
Figure 3: The areas of the rectangles are the terms in the sum,
and the total rectangle area is bounded by the integral from 1
through n + 1.
We plug this bound back into the expression for the aver-
age running time:
T(n) ≤ (n + 1) · \sum_{i=1}^{n} \frac{2}{i + 1}
     ≤ 2 · (n + 1) · \ln(n + 1)
     = \frac{2}{\log_2 e} · (n + 1) · \log_2(n + 1).
In words, the running time of quicksort in the average case
is only a factor of about 2/ log2 e = 1.386 . . . slower than
in the best case. This also implies that the worst case can-
not happen very often, for else the average performance
would be slower.
Stack size. Another drawback of quicksort is the recur-
sion stack, which can reach a size of Ω(n) entries. This
can be improved by always first sorting the smaller side
and simultaneously removing the tail-recursion:
void QUICKSORT(int ℓ, r)
i = ℓ; j = r;
while i < j do
m = RSPLIT(i, j);
if m − i < j − m
then QUICKSORT(i, m − 1); i = m + 1
else QUICKSORT(m + 1, j); j = m − 1
endif
endwhile.
In each recursive call to QUICKSORT, the length of the ar-
ray is at most half the length of the array in the preceding
call. This implies that at any moment of time the stack
contains no more than 1 + log2 n entries. Note that with-
out removal of the tail-recursion, the stack can reach Ω(n)
entries even if the smaller side is sorted first.
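A C sketch of this version, assuming the rsplit and swap helpers from the previous sketches; the smaller side is sorted by the recursive call and the larger side is handled by the loop, so at most about log2 n calls are pending at any time.

/* Quicksort with the tail recursion replaced by a loop; only the smaller
   side of each split is sorted recursively. */
void quicksort(int l, int r) {
    while (l < r) {
        int m = rsplit(l, r);
        if (m - l < r - m) {            /* left piece is the smaller one */
            quicksort(l, m - 1);
            l = m + 1;                  /* continue with the right piece */
        } else {
            quicksort(m + 1, r);
            r = m - 1;                  /* continue with the left piece */
        }
    }
}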
Summary. Quicksort incorporates two design tech-
niques to efficiently sort n numbers: divide-and-conquer
for reducing large to small problems and randomization
for avoiding the sensitivity to worst-case inputs. The av-
erage running time of quicksort is in O(n log n) and the
extra amount of memory it requires is in O(log n). For
the deterministic version, the average is over all n! per-
mutations of the input items. For the randomized version
the average is the expected running time for every input
sequence.
3 Prune-and-Search
We use two algorithms for selection as examples for the
prune-and-search paradigm. The problem is to find the
i-smallest item in an unsorted collection of n items. We
could first sort the list and then return the item in the i-th
position, but just finding the i-th item can be done faster
than sorting the entire list. As a warm-up exercise consider
selecting the 1-st or smallest item in the unsorted array
A[1..n].
min = 1;
for j = 2 to n do
if A[j] < A[min] then min = j endif
endfor.
The index of the smallest item is found in n − 1 com-
parisons, which is optimal. Indeed, there is an adversary
argument, that is, with fewer than n − 1 comparisons we
can change the minimum without changing the outcomes
of the comparisons.
Randomized selection. We return to finding the i-
smallest item for a fixed but arbitrary integer 1 ≤ i ≤ n,
which we call the rank of that item. We can use the split-
ting function of quicksort also for selection. As in quick-
sort, we choose a random pivot and split the array, but we
recurse only for one of the two sides. We invoke the func-
tion with the range of indices of the current subarray and
the rank of the desired item, i. Initially, the range consists
of all indices between ℓ = 1 and r = n, limits included.
int RSELECT(int ℓ, r, i)
q = RSPLIT(ℓ, r); m = q − ℓ + 1;
if i < m then return RSELECT(ℓ, q − 1, i)
elseif i = m then return q
else return RSELECT(q + 1, r, i − m)
endif.
For small sets, the algorithm is relatively inefficient and
its running time can be improved by switching over to
sorting when the size drops below some constant thresh-
old. On the other hand, each recursive step makes some
progress so that termination is guaranteed even without
special treatment of small sets.
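A short C sketch of randomized selection along these lines, reusing rsplit from the previous section; the function name rselect and the omission of a switch-to-sorting threshold are our choices, not the notes'.

/* Returns the index of the i-smallest item of A[l..r] (i is 1-based). */
int rselect(int l, int r, int i) {
    int q = rsplit(l, r);
    int m = q - l + 1;                  /* rank of the pivot in A[l..r] */
    if (i < m)  return rselect(l, q - 1, i);
    if (i == m) return q;
    return rselect(q + 1, r, i - m);
}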
Expected running time. For each 1 ≤ m ≤ n, the probability that the array is split into subarrays of sizes m − 1 and n − m is 1/n. For convenience we assume that n is even. The expected running time increases with increasing number of items, T(k) ≤ T(m) if k ≤ m. Hence,

T(n) ≤ n + \frac{1}{n} \sum_{m=1}^{n} \max\{T(m − 1), T(n − m)\}
     ≤ n + \frac{2}{n} \sum_{m=n/2+1}^{n} T(m − 1).
Assume inductively that T(m) ≤ cm for m < n and a sufficiently large positive constant c. Such a constant c can certainly be found for m = 1, since for that case the running time of the algorithm is only a constant. This establishes the basis of the induction. The case of n items reduces to cases of m < n items for which we can use the induction hypothesis. We thus get

T(n) ≤ n + \frac{2c}{n} \sum_{m=n/2+1}^{n} (m − 1)
     = n + c · (n − 1) − \frac{c}{2} · (\frac{n}{2} + 1)
     = n + \frac{3c}{4} · n − \frac{3c}{2}.
Assuming c ≥ 4 we thus have T (n) ≤ cn as required.
Note that we just proved that the expected running time of
RSELECT is only a small constant times that of RSPLIT.
More precisely, that constant factor is no larger than four.
Deterministic selection. The randomized selection al-
gorithm takes time proportional to n^2 in the worst case, for example if each split is as unbalanced as possible. It is however possible to select in O(n) time even in the worst case. The median of the set plays a special role in this algorithm. It is defined as the i-smallest item where i = (n + 1)/2 if n is odd and i = n/2 or (n + 2)/2 if n is even. The deterministic algorithm takes five steps to select:
Step 1. Partition the n items into ⌈n/5⌉ groups of size at most 5 each.
Step 2. Find the median in each group.
Step 3. Find the median of the medians recursively.
Step 4. Split the array using the median of the medians
as the pivot.
Step 5. Recurse on one side of the pivot.
It is convenient to define k = ⌈n/5⌉ and to partition such
and to partition such
that each group consists of items that are multiples of k
positions apart. This is what is shown in Figure 4 provided
we arrange the items row by row in the array.
Figure 4: The 43 items are partitioned into seven groups of 5 and
two groups of 4, all drawn vertically. The shaded items are the
medians and the dark shaded item is the median of medians.
Implementation with insertion sort. We use insertion
sort on each group to determine the medians. Specifically,
we sort the items in positions ℓ, ℓ+k, ℓ+2k, ℓ+3k, ℓ+4k
of array A, for each ℓ.
void ISORT(int ℓ, k, n)
j = ℓ + k;
while j ≤ n do i = j;
while i > ℓ and A[i] < A[i − k] do
SWAP(i, i − k); i = i − k
endwhile;
j = j + k
endwhile.
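For concreteness, a C transcription of ISORT under the same conventions as the earlier sketches (global array A, helper swap); it sorts the subsequence A[l], A[l+k], A[l+2k], . . . restricted to indices at most n.

/* Insertion sort on the items k positions apart, starting at index l. */
void isort(int l, int k, int n) {
    for (int j = l + k; j <= n; j += k)
        for (int i = j; i > l && A[i] < A[i - k]; i -= k)
            swap(i, i - k);
}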
Although insertion sort takes quadratic time in the worst
case, it is very fast for small arrays, as in this applica-
tion. We can now combine the various pieces and write
the selection algorithm in pseudo-code. Starting with the
code for the randomized algorithm, we first remove the
randomization and second add code for Steps 1, 2, and 3.
Recall that i is the rank of the desired item in A[ℓ..r]. Af-
ter sorting the groups, we have their medians arranged in
the middle fifth of the array, A[ℓ+2k..ℓ+3k−1], and we
compute the median of the medians by recursive applica-
tion of the function.
int SELECT(int ℓ, r, i)
k = ⌈(r − ℓ + 1)/5⌉;
for j = 0 to k − 1 do ISORT(ℓ + j, k, r) endfor;
m′ = SELECT(ℓ + 2k, ℓ + 3k − 1, ⌊(k + 1)/2⌋);
SWAP(ℓ, m′); q = SPLIT(ℓ, r); m = q − ℓ + 1;
if i < m then return SELECT(ℓ, q − 1, i)
elseif i = m then return q
else return SELECT(q + 1, r, i − m)
endif.
Observe that the algorithm makes progress as long as there
are at least three items in the set, but we need special treat-
ment of the cases of one or of two items. The role of the
median of medians is to prevent an unbalanced split of
the array so we can safely use the deterministic version of
splitting.
Worst-case running time. To simplify the analysis, we
assume that n is a multiple of 5 and ignore ceiling and
floor functions. We begin by arguing that the number of
items less than or equal to the median of medians is at least 3n/10. These are the first three items in the sets with medians less than or equal to the median of medians. In Figure 4, these items are highlighted by the box to the left and above but containing the median of medians. Symmetrically, the number of items greater than or equal to the median of medians is at least 3n/10. The first recursion works on a set of n/5 medians, and the second recursion works on a set of at most 7n/10 items. We have

T(n) ≤ n + T(n/5) + T(7n/10).
We prove T(n) = O(n) by induction assuming T(m) ≤ c · m for m < n and c a large enough constant:

T(n) ≤ n + \frac{c}{5} · n + \frac{7c}{10} · n = (1 + \frac{9c}{10}) · n.
Assuming c ≥ 10 we have T (n) ≤ cn, as required. Again
the running time is at most some constant times that of
splitting the array. The constant is about two and a half
times the one for the randomized selection algorithm.
A somewhat subtle issue is the presence of equal items
in the input collection. Such occurrences make the func-
tion SPLIT unpredictable since they could occur on either
side of the pivot. An easy way out of the dilemma is to
make sure that the items that are equal to the pivot are
treated as if they were smaller than the pivot if they occur
in the first half of the array and they are treated as if they
were larger than the pivot if they occur in the second half
of the array.
Summary. The idea of prune-and-search is very similar
to divide-and-conquer, which is perhaps the reason why
some textbooks make no distinction between the two. The
characteristic feature of prune-and-search is that the recur-
sion covers only a constant fraction of the input set. As we
have seen in the analysis, this difference implies a better
running time.
It is interesting to compare the randomized with the de-
terministic version of selection:
• the use of randomization leads to a simpler algorithm
but it requires a source of randomness;
• upon repeating the algorithm for the same data, the
deterministic version goes through the exact same
steps while the randomized version does not;
• we analyze the worst-case running time of the deter-
ministic version and the expected running time (for
the worst-case input) of the randomized version.
All three differences are fairly universal and apply to other
algorithms for which we have the choice between a deter-
ministic and a randomized implementation.
4 Dynamic Programming
Sometimes, divide-and-conquer leads to overlapping sub-
problems and thus to redundant computations. It is not
uncommon that the redundancies accumulate and cause
an exponential amount of wasted time. We can avoid
the waste using a simple idea: solve each subproblem
only once. To be able to do that, we have to add a cer-
tain amount of book-keeping to remember subproblems
we have already solved. The technical name for this de-
sign paradigm is dynamic programming.
Edit distance. We illustrate dynamic programming us-
ing the edit distance problem, which is motivated by ques-
tions in genetics. We assume a finite set of characters
or letters, Σ, which we refer to as the alphabet, and we
consider strings or words formed by concatenating finitely
many characters from the alphabet. The edit distance be-
tween two words is the minimum number of letter inser-
tions, letter deletions, and letter substitutions required to
transform one word to the other. For example, the edit
distance between FOOD and MONEY is at most four:
FOOD → MOOD → MOND → MONED → MONEY
A better way to display the editing process is the gap rep-
resentation that places the words one above the other, with
a gap in the first word for every insertion and a gap in the
second word for every deletion:
F O O D
M O N E Y
Columns with two different characters correspond to sub-
stitutions. The number of editing steps is therefore the
number of columns that do not contain the same character
twice.
Prefix property. It is not difficult to see that you cannot
get from FOOD to MONEY in less than four steps. However,
for longer examples it seems considerably more difficult
to find the minimum number of steps or to recognize an
optimal edit sequence. Consider for example
A L G O R I T H M
A L T R U I S T I C
Is this optimal or, equivalently, is the edit distance between
ALGORITHM and ALTRUISTIC six? Instead of answering
this specific question, we develop a dynamic program-
ming algorithm that computes the edit distance between
an m-character string A[1..m] and an n-character string
B[1..n]. Let E(i, j) be the edit distance between the pre-
fixes of length i and j, that is, between A[1..i] and B[1..j].
The edit distance between the complete strings is therefore
E(m, n). A crucial step towards the development of this
algorithm is the following observation about the gap rep-
resentation of an optimal edit sequence.
PREFIX PROPERTY. If we remove the last column of an
optimal edit sequence then the remaining columns
represent an optimal edit sequence for the remaining
substrings.
We can easily prove this claim by contradiction: if the
substrings had a shorter edit sequence, we could just glue
the last column back on and get a shorter edit sequence for
the original strings.
Recursive formulation. We use the Prefix Property to
develop a recurrence relation for E. The dynamic pro-
gramming algorithm will be a straightforward implemen-
tation of that relation. There are a couple of obvious base
cases:
• Erasing: we need i deletions to erase an i-character
string, E(i, 0) = i.
• Creating: we need j insertions to create a j-
character string, E(0, j) = j.
In general, there are four possibilities for the last column
in an optimal edit sequence.
• Insertion: the last entry in the top row is empty,
E(i, j) = E(i, j − 1) + 1.
• Deletion: the last entry in the bottom row is empty,
E(i, j) = E(i − 1, j) + 1.
• Substitution: both rows have characters in the last
column that are different, E(i, j) = E(i − 1, j −
1) + 1.
• No action: both rows end in the same character,
E(i, j) = E(i − 1, j − 1).
Let P be the logical proposition A[i] ≠ B[j] and denote by |P| its indicator variable: |P| = 1 if P is true and |P| = 0 if P is false. We can now summarize and for i, j > 0 get the edit distance as the smallest of the possibilities:

E(i, j) = min { E(i, j − 1) + 1, E(i − 1, j) + 1, E(i − 1, j − 1) + |P| }.
The algorithm. If we turned this recurrence relation di-
rectly into a divide-and-conquer algorithm, we would have
the following recurrence for the running time:
T (m, n) = T (m, n − 1) + T (m − 1, n)
+ T (m − 1, n − 1) + 1.
The solution to this recurrence is exponential in m and n,
which is clearly not the way to go. Instead, let us build
an m + 1 times n + 1 table of possible values of E(i, j).
We can start by filling in the base cases, the entries in the
0-th row and column. To fill in any other entry, we need
to know the values directly to the left, directly above, and
both to the left and above. If we fill the table from top to
bottom and from left to right then whenever we reach an
entry, the entries it depends on are already available.
int EDITDISTANCE(int m, n)
for i = 0 to m do E[i, 0] = i endfor;
for j = 1 to n do E[0, j] = j endfor;
for i = 1 to m do
for j = 1 to n do
E[i, j] = min{E[i, j − 1] + 1, E[i − 1, j] + 1,
E[i − 1, j − 1] + |A[i] ≠ B[j]|}
endfor
endfor;
return E[m, n].
Since there are (m+1)(n+1) entries in the table and each
takes a constant time to compute, the total running time is
in O(mn).
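A self-contained C sketch of this table-filling algorithm; the fixed MAXLEN bound and the name edit_distance are ours, and the strings are 0-indexed in C while the table keeps the 0..m by 0..n layout described above.

#include <string.h>

#define MAXLEN 100                       /* assumed upper bound on m and n */

int edit_distance(const char *a, const char *b) {
    int m = (int)strlen(a), n = (int)strlen(b);
    static int E[MAXLEN + 1][MAXLEN + 1];
    for (int i = 0; i <= m; i++) E[i][0] = i;        /* erasing a prefix */
    for (int j = 1; j <= n; j++) E[0][j] = j;        /* creating a prefix */
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++) {
            int best = E[i][j - 1] + 1;                              /* insertion */
            if (E[i - 1][j] + 1 < best) best = E[i - 1][j] + 1;      /* deletion */
            int s = E[i - 1][j - 1] + (a[i - 1] != b[j - 1]);        /* substitution or no action */
            if (s < best) best = s;
            E[i][j] = best;
        }
    return E[m][n];
}

With this sketch, edit_distance("ALGORITHM", "ALTRUISTIC") evaluates to 6, matching the lower right corner of the table in Figure 5.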
An example. The table constructed for the conversion of
ALGORITHM to ALTRUISTIC is shown in Figure 5. Boxed
numbers indicate places where the two strings have equal
characters. The arrows indicate the predecessors that de-
fine the entries. Each direction of arrow corresponds to a
different edit operation: horizontal for insertion, vertical
for deletion, and diagonal for substitution. Dotted diago-
nal arrows indicate free substitutions of a letter for itself.
Recovering the edit sequence. By construction, there
is at least one path from the upper left to the lower right
corner, but often there will be several. Each such path
describes an optimal edit sequence. For the example at
hand, we have three optimal edit sequences:
A L G O R I T H M
A L T R U I S T I C
A L G O R I T H M
A L T R U I S T I C
A L G O R I T H M
A L T R U I S T I C

Figure 5: The table of edit distances between all prefixes of ALGORITHM and of ALTRUISTIC. The shaded area highlights the optimal edit sequences, which are paths from the upper left to the lower right corner.
They are easily recovered by tracing the paths backward,
from the end to the beginning. The following algorithm
recovers an optimal solution that also minimizes the num-
ber of insertions and deletions. We call it with the lengths
of the strings as arguments, R(m, n).
void R(int i, j)
if i > 0 or j > 0 then
switch incoming arrow:
case ց: R(i − 1, j − 1); print(A[i], B[j])
case ↓: R(i − 1, j); print(A[i], )
case →: R(i, j − 1); print( , B[j]).
endswitch
endif.
Summary. The structure of dynamic programming is
again similar to divide-and-conquer, except that the sub-
problems to be solved overlap. As a consequence, we get
different recursive paths to the same subproblems. To de-
velop a dynamic programming algorithm that avoids re-
dundant solutions, we generally proceed in two steps:
1. We formulate the problem recursively. In other
words, we write down the answer to the whole prob-
lem as a combination of the answers to smaller sub-
problems.
2. We build solutions from bottom up. Starting with the
base cases, we work our way up to the final solution
and (usually) store intermediate solutions in a table.
For dynamic programming to be effective, we need a
structure that leads to at most some polynomial number
of different subproblems. Most commonly, we deal with
sequences, which have linearly many prefixes and suffixes
and quadratically many contiguous substrings.
5 Greedy Algorithms
The philosophy of being greedy is shortsightedness. Al-
ways go for the seemingly best next thing, always op-
timize the presence, without any regard for the future,
and never change your mind about the past. The greedy
paradigm is typically applied to optimization problems. In
this section, we first consider a scheduling problem and
second the construction of optimal codes.
A scheduling problem. Consider a set of activities
1, 2, . . . , n. Activity i starts at time si and finishes
at time fi > si. Two activities i and j overlap if
[si, fi] ∩ [sj, fj] ≠ ∅. The objective is to select a maxi-
mum number of pairwise non-overlapping activities. An
example is shown in Figure 6.

Figure 6: A best schedule is c, e, f, but there are also others of the same size.

The largest number of activities can be scheduled by choosing activities with early finish times first. We first sort and reindex such that i < j implies fi ≤ fj.
S = {1}; last = 1;
for i = 2 to n do
if flast < si then
S = S ∪ {i}; last = i
endif
endfor.
The running time is O(n log n) for sorting plus O(n) for
the greedy collection of activities.
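A hedged C sketch of this greedy schedule; the Activity struct, the qsort comparator, and the function names are ours. Sorting by finish time costs O(n log n) and the scan costs O(n), as stated above.

#include <stdlib.h>

typedef struct { int s, f; } Activity;          /* start and finish time */

static int by_finish(const void *a, const void *b) {
    return ((const Activity *)a)->f - ((const Activity *)b)->f;
}

/* Fills chosen[] with indices (into the sorted array) of a maximum set of
   pairwise non-overlapping activities and returns its size. */
int schedule(Activity act[], int n, int chosen[]) {
    qsort(act, n, sizeof(Activity), by_finish);
    int count = 0, last_finish = 0;
    for (int i = 0; i < n; i++)
        if (count == 0 || act[i].s > last_finish) {  /* no overlap with last chosen */
            chosen[count++] = i;
            last_finish = act[i].f;
        }
    return count;
}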
It is often difficult to determine how close to the opti-
mum the solutions found by a greedy algorithm really are.
However, for the above scheduling problem the greedy
algorithm always finds an optimum. For the proof let
1 = i1 < i2 < . . . < ik be the greedy schedule con-
structed by the algorithm. Let j1 < j2 < . . . < jℓ be any
other feasible schedule. Since i1 = 1 has the earliest finish
time of any activity, we have fi1 ≤ fj1 . We can therefore
add i1 to the feasible schedule and remove at most one ac-
tivity, namely j1. Among the activities that do not overlap
i1, i2 has the earliest finish time, hence fi2 ≤ fj2 . We can
again add i2 to the feasible schedule and remove at most
one activity, namely j2 (or possibly j1 if it was not re-
moved before). Eventually, we replace the entire feasible
schedule by the greedy schedule without decreasing the
number of activities. Since we could have started with a
maximum feasible schedule, we conclude that the greedy
schedule is also maximum.
Binary codes. Next we consider the problem of encod-
ing a text using a string of 0s and 1s. A binary code maps
each letter in the alphabet of the text to a unique string
of 0s and 1s. Suppose for example that the letter ‘t’ is
encoded as ‘001’, ‘h’ is encoded as ‘101’, and ‘e’ is en-
coded as ‘01’. Then the word ‘the’ would be encoded as
the concatenation of codewords: ‘00110101’. This partic-
ular encoding is unambiguous because the code is prefix-
free: no codeword is prefix of another codeword.

Figure 7: Letters correspond to leaves and codewords correspond to maximal paths. A left edge is read as ‘0’ and a right edge as ‘1’. The tree to the right is full and improves the code.

There is
a one-to-one correspondence between prefix-free binary
codes and binary trees where each leaf is a letter and the
corresponding codeword is the path from the root to that
leaf. Figure 7 illustrates the correspondence for the above
3-letter code. Being prefix-free corresponds to leaves not
having children. The tree in Figure 7 is not full because
three of its internal nodes have only one child. This is an
indication of waste. The code can be improved by replac-
ing each node with one child by its child. This changes
the above code to ‘00’ for ‘t’, ‘1’ for ‘h’, and ‘01’ for ‘e’.
Huffman trees. Let wi be the frequency of the letter ci
in the given text. It will be convenient to refer to wi as
the weight of ci or of its external node. To get an effi-
cient code, we choose short codewords for common let-
ters. Suppose δi is the length of the codeword for ci. Then
the number of bits for encoding the entire text is
P = \sum_{i} w_i · δ_i.
Since δi is the depth of the leaf ci, P is also known as the
weighted external path length of the corresponding tree.
The Huffman tree for the ci minimizes the weighted ex-
ternal path length. To construct this tree, we start with n
nodes, one for each letter. At each stage of the algorithm,
we greedily pick the two nodes with smallest weights and
make them the children of a new node with weight equal
to the sum of two weights. We repeat until only one node
remains. The resulting tree for a collection of nine letters
with displayed weights is shown in Figure 8.

Figure 8: The numbers in the external nodes (squares) are the weights of the corresponding letters, and the ones in the internal nodes (circles) are the weights of these nodes. The Huffman tree is full by construction.

Figure 9: The weighted external path length is 15 + 15 + 18 + 12 + 5 + 15 + 24 + 27 + 42 = 173.

Ties that
arise during the algorithm are broken arbitrarily. We re-
draw the tree and order the children of a node as left and
right child arbitrarily, as shown in Figure 9.
The algorithm works with a collection N of nodes
which are the roots of the trees constructed so far. Ini-
tially, each leaf is a tree by itself. We denote the weight
of a node by w(µ) and use a function EXTRACTMIN that
returns the node with the smallest weight and, at the same
time, removes this node from the collection.
Tree HUFFMAN
loop µ = EXTRACTMIN(N);
if N = ∅ then return µ endif;
ν = EXTRACTMIN(N);
create node κ with children µ and ν
and weight w(κ) = w(µ) + w(ν);
add κ to N
forever.
Straightforward implementations use an array or a linked
list and take time O(n) for each operation involving N.
There are fewer than 2n extractions of the minimum and
fewer than n additions, which implies that the total run-
ning time is O(n^2). We will see later that there are better
ways to implement N leading to running time O(n log n).
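A straightforward C sketch in this spirit: the roots are kept in a plain array and EXTRACTMIN is a linear scan, which gives the O(n^2) bound mentioned above. Struct and function names are ours.

#include <stdlib.h>

typedef struct HNode {
    int w;                               /* weight of the node */
    struct HNode *left, *right;          /* children; NULL for leaves */
} HNode;

/* Removes and returns the node of smallest weight from pool[0..*size-1]. */
static HNode *extract_min(HNode *pool[], int *size) {
    int best = 0;
    for (int i = 1; i < *size; i++)
        if (pool[i]->w < pool[best]->w) best = i;
    HNode *min = pool[best];
    pool[best] = pool[--*size];          /* fill the gap with the last root */
    return min;
}

/* pool[0..n-1] initially holds the n leaves; returns the root of the tree. */
HNode *huffman(HNode *pool[], int n) {
    int size = n;
    while (size > 1) {
        HNode *mu = extract_min(pool, &size);
        HNode *nu = extract_min(pool, &size);
        HNode *kappa = malloc(sizeof *kappa);
        kappa->w = mu->w + nu->w;
        kappa->left = mu;
        kappa->right = nu;
        pool[size++] = kappa;            /* add the new root back */
    }
    return pool[0];
}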
An inequality. We prepare the proof that the Huffman
tree indeed minimizes the weighted external path length.
Let T be a full binary tree with weighted external path
length P(T ). Let Λ(T ) be the set of leaves and let µ and
ν be any two leaves with smallest weights. Then we can
construct a new tree T ′
with
(1) set of leaves Λ(T ′
) = (Λ(T ) − {µ, ν}) ˙
∪ {κ} ,
(2) w(κ) = w(µ) + w(ν),
(3) P(T ′
) ≤ P(T ) − w(µ) − w(ν), with equality if µ
and ν are siblings.
We now argue that T′ really exists. If µ and ν are siblings then we construct T′ from T by removing µ and ν and declaring their parent, κ, as the new leaf. Then

P(T′) = P(T) − w(µ) · δ − w(ν) · δ + w(κ) · (δ − 1)
      = P(T) − w(µ) − w(ν),

where δ = δ(µ) = δ(ν) = δ(κ) + 1 is the common depth of µ and ν.

Figure 10: The increase in the depth of ν is compensated by the decrease in depth of the leaves in the subtree of σ.

Otherwise, assume δ(µ) ≥ δ(ν) and let σ be
the sibling of µ, which may or may not be a leaf. Exchange
ν and σ. Since the length of the path from the root to σ
is at least as long as the path to µ, the weighted external
path length can only decrease; see Figure 10. Then do the
same as in the other case.
Proof of optimality. The optimality of the Huffman tree
can now be proved by induction.
HUFFMAN TREE THEOREM. Let T be the Huffman tree
and X another tree with the same set of leaves and
weights. Then P(T ) ≤ P(X).
PROOF. If there are only two leaves then the claim is obvi-
ous. Otherwise, let µ and ν be the two leaves selected by
the algorithm. Construct trees T′ and X′ with

P(T′) = P(T) − w(µ) − w(ν),
P(X′) ≤ P(X) − w(µ) − w(ν).

T′ is the Huffman tree for n − 1 leaves so we can use the inductive assumption and get P(T′) ≤ P(X′). It follows that

P(T) = P(T′) + w(µ) + w(ν)
     ≤ P(X′) + w(µ) + w(ν)
     ≤ P(X).
Huffman codes are binary codes that correspond to
Huffman trees as described. They are commonly used to
compress text and other information. Although Huffman
codes are optimal in the sense defined above, there are
other codes that are also sensitive to the frequency of se-
quences of letters and this way outperform Huffman codes
for general text.
Summary. The greedy algorithm for constructing Huff-
man trees works bottom-up by stepwise merging, rather
than top-down by stepwise partitioning. If we run the
greedy algorithm backwards, it becomes very similar to
dynamic programming, except that it pursues only one of
many possible partitions. Often this implies that it leads
to suboptimal solutions. Nevertheless, there are problems
that exhibit enough structure that the greedy algorithm
succeeds in finding an optimum, and the scheduling and
coding problems described above are two such examples.
First Homework Assignment
Write the solution to each problem on a single page. The
deadline for handing in solutions is September 18.
Problem 1. (20 points). Consider two sums, X = x1 +
x2 + . . . + xn and Y = y1 + y2 + . . . + ym. Give an
algorithm that finds indices i and j such that swap-
ping xi with yj makes the two sums equal, that is,
X − xi + yj = Y − yj + xi, if they exist. Analyze
your algorithm. (You can use sorting as a subroutine.
The amount of credit depends on the correctness of
the analysis and the running time of your algorithm.)
Problem 2. (20 = 10 + 10 points). Consider dis-
tinct items x1, x2, . . . , xn with positive weights
w1, w2, . . . , wn such that
\sum_{i=1}^{n} wi = 1.0. The weighted median is the item xk that satisfies

\sum_{xi < xk} wi < 0.5   and   \sum_{xj > xk} wj ≤ 0.5.
(a) Show how to compute the weighted median
of n items in worst-case time O(n log n) using
sorting.
(b) Show how to compute the weighted median in
worst-case time O(n) using a linear-time me-
dian algorithm.
Problem 3. (20 = 6 + 14 points). A game-board has n
columns, each consisting of a top number, the cost of
visiting the column, and a bottom number, the maxi-
mum number of columns you are allowed to jump to
the right. The top number can be any positive integer,
while the bottom number is either 1, 2, or 3. The ob-
jective is to travel from the first column off the board,
to the right of the nth column. The cost of a game is
the sum of the costs of the visited columns.
Assuming the board is represented in a two-
dimensional array, B[2, n], the following recursive
procedure computes the cost of the cheapest game:
int CHEAPEST(int i)
if i > n then return 0 endif;
x = B[1, i] + CHEAPEST(i + 1);
y = B[1, i] + CHEAPEST(i + 2);
z = B[1, i] + CHEAPEST(i + 3);
case B[2, i] = 1: return x;
B[2, i] = 2: return min{x, y};
B[2, i] = 3: return min{x, y, z}
endcase.
(a) Analyze the asymptotic running time of the pro-
cedure.
(b) Describe and analyze a more efficient algorithm
for finding the cheapest game.
Problem 4. (20 = 10 + 10 points). Consider a set of n
intervals [ai, bi] that cover the unit interval, that is,
[0, 1] is contained in the union of the intervals.
(a) Describe an algorithm that computes a mini-
mum subset of the intervals that also covers
[0, 1].
(b) Analyze the running time of your algorithm.
(For question (b) you get credit for the correctness of
your analysis but also for the running time of your
algorithm. In other words, a fast algorithm earns you
more points than a slow algorithm.)
Problem 5. (20 = 7 + 7 + 6 points). Let A[1..m] and
B[1..n] be two strings.
(a) Modify the dynamic programming algorithm
for computing the edit distance between A and
B for the case in which there are only two al-
lowed operations, insertions and deletions of in-
dividual letters.
(b) A (not necessarily contiguous) subsequence of
A is defined by the increasing sequence of its
indices, 1 ≤ i1 < i2 < . . . < ik ≤ m. Use
dynamic programming to find the longest com-
mon subsequence of A and B and analyze its
running time.
(c) What is the relationship between the edit dis-
tance defined in (a) and the longest common
subsequence computed in (b)?
II SEARCHING
6 Binary Search Trees
7 Red-black Trees
8 Amortized Analysis
9 Splay Trees
Second Homework Assignment
6 Binary Search Trees
One of the purposes of sorting is to facilitate fast search-
ing. However, while a sorted sequence stored in a lin-
ear array is good for searching, it is expensive to add and
delete items. Binary search trees give you the best of both
worlds: fast search and fast update.
Definitions and terminology. We begin with a recursive
definition of the most common type of tree used in algo-
rithms. A (rooted) binary tree is either empty or a node
(the root) with a binary tree as left subtree and binary tree
as right subtree. We store items in the nodes of the tree.
It is often convenient to say the items are the nodes. A
binary tree is sorted if each item is between the smaller or
equal items in the left subtree and the larger or equal items
in the right subtree. For example, the tree illustrated in
Figure 11 is sorted assuming the usual ordering of English
characters. Terms for relations between family members
such as child, parent, sibling are also used for nodes in a
tree. Every node has one parent, except the root which has
no parent. A leaf or external node is one without children;
all other nodes are internal. A node ν is a descendent of µ
if ν = µ or ν is a descendent of a child of µ. Symmetri-
cally, µ is an ancestor of ν if ν is a descendent of µ. The
subtree of µ consists of all descendents of µ. An edge is a
parent-child pair.
Figure 11: The parent, sibling and two children of the dark node
are shaded. The internal nodes are drawn as circles while the
leaves are drawn as squares.
The size of the tree is the number of nodes. A binary
tree is full if every internal node has two children. Every
full binary tree has one more leaf than internal node. To
count its edges, we can either count 2 for each internal
node or 1 for every node other than the root. Either way,
the total number of edges is one less than the size of the
tree. A path is a sequence of contiguous edges without
repetitions. Usually we only consider paths that descend
or paths that ascend. The length of a path is the number
of edges. For every node µ, there is a unique path from
the root to µ. The length of that path is the depth of µ.
The height of the tree is the maximum depth of any node.
The path length is the sum of depths over all nodes, and
the external path length is the same sum restricted to the
leaves in the tree.
Searching. A binary search tree is a sorted binary tree.
We assume each node is a record storing an item and point-
ers to two children:
struct Node{item info; Node ∗ ℓ, ∗ r};
typedef Node ∗ Tree.
Sometimes it is convenient to also store a pointer to the
parent, but for now we will do without. We can search in
a binary search tree by tracing a path starting at the root.
Node ∗ SEARCH(Tree ̺, item x)
case ̺ = NULL: return NULL;
x < ̺ → info: return SEARCH(̺ → ℓ, x);
x = ̺ → info: return ̺;
x > ̺ → info: return SEARCH(̺ → r, x)
endcase.
The running time depends on the length of the path, which
is at most the height of the tree. Let n be the size. In the
worst case the tree is a linked list and searching takes time
O(n). In the best case the tree is perfectly balanced and
searching takes only time O(log n).
Insert. To add a new item is similarly straightforward:
follow a path from the root to a leaf and replace that leaf
by a new node storing the item. Figure 12 shows the tree obtained after adding w to the tree in Figure 11.

Figure 12: The shaded nodes indicate the path from the root we traverse when we insert w into the sorted tree.

The running time depends again on the length of the path. If the
insertions come in a random order then the tree is usually
close to being perfectly balanced. Indeed, the tree is the
same as the one that arises in the analysis of quicksort.
The expected number of comparisons for a (successful)
search is one n-th of the expected running time of quick-
sort, which is roughly 2 ln n.
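A C sketch of this insertion, following the Node record from the searching code but with the fields renamed to ASCII (info, l, r) and the item type specialized to int; items equal to one already stored are simply ignored in this sketch.

#include <stdlib.h>

typedef struct Node { int info; struct Node *l, *r; } Node;

/* Follows the search path and replaces the empty subtree reached by a new
   node storing x; returns the (possibly new) root of the subtree. */
Node *insert(Node *root, int x) {
    if (root == NULL) {
        Node *nu = malloc(sizeof *nu);
        nu->info = x;
        nu->l = nu->r = NULL;
        return nu;
    }
    if (x < root->info)
        root->l = insert(root->l, x);
    else if (x > root->info)
        root->r = insert(root->r, x);
    return root;
}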
Delete. The main idea for deleting an item is the same
as for inserting: follow the path from the root to the node
ν that stores the item.
Case 1. ν has no internal node as a child. Remove ν.
Case 2. ν has one internal child. Make that child the
child of the parent of ν.
Case 3. ν has two internal children. Find the rightmost
internal node in the left subtree, remove it, and sub-
stitute it for ν, as shown in Figure 13.
Figure 13: Store J in ν and delete the node that used to store J.
The analysis of the expected search time in a binary search
tree constructed by a random sequence of insertions and
deletions is considerably more challenging than if no dele-
tions are present. Even the definition of a random se-
quence is ambiguous in this case.
Optimal binary search trees. Instead of hoping the in-
cremental construction yields a shallow tree, we can con-
struct the tree that minimizes the search time. We con-
sider the common problem in which items have different
probabilities to be the target of a search. For example,
some words in the English dictionary are more commonly
searched than others and are therefore assigned a higher
probability. Let a1 < a2 < . . . < an be the items and
pi the corresponding probabilities. To simplify the discus-
sion, we only consider successful searches and thus as-
sume \sum_{i=1}^{n} p_i = 1. The expected number of comparisons for a successful search in a binary search tree T storing the n items is

1 + C(T) = \sum_{i=1}^{n} p_i · (δ_i + 1) = 1 + \sum_{i=1}^{n} p_i · δ_i,
where δi is the depth of the node that stores ai. C(T )
is the weighted path length or the cost of T . We study
the problem of constructing a tree that minimizes the cost.
To develop an example, let n = 3 and p1 = 1/2, p2 = 1/3, p3 = 1/6. Figure 14 shows the five binary trees with
three nodes and states their costs. It can be shown that the
Figure 14: There are five different binary trees of three nodes. From left to right their costs are 2/3, 5/6, 2/3, 7/6, 4/3. The first tree and the third tree are both optimal.
number of different binary trees with n nodes is \frac{1}{n+1} \binom{2n}{n},
which is exponential in n. This is far too large to try all
possibilities, so we need to look for a more efficient way
to construct an optimum tree.
Dynamic programming. We write T_i^j for the optimum weighted binary search tree of ai, ai+1, . . . , aj, C_i^j for its cost, and p_i^j = \sum_{k=i}^{j} p_k for the total probability of the items in T_i^j. Suppose we know that the optimum tree stores item ak in its root. Then the left subtree is T_i^{k−1} and the right subtree is T_{k+1}^j. The cost of the optimum tree is therefore C_i^j = C_i^{k−1} + C_{k+1}^j + p_i^j − p_k. Since we do not know which item is in the root, we try all possibilities and find the minimum:

C_i^j = \min_{i ≤ k ≤ j} { C_i^{k−1} + C_{k+1}^j + p_i^j − p_k }.
This formula can be translated directly into a dynamic programming algorithm. We use three two-dimensional arrays, one for the sums of probabilities, p_i^j, one for the costs of optimum trees, C_i^j, and one for the indices of the items stored in their roots, R_i^j. We assume that the first array has already been computed. We initialize the other two arrays along the main diagonal and add one dummy diagonal for the cost.
for k = 1 to n do
C[k, k − 1] = C[k, k] = 0; R[k, k] = k
endfor;
C[n + 1, n] = 0.
We fill the rest of the two arrays one diagonal at a time.
for ℓ = 2 to n do
for i = 1 to n − ℓ + 1 do
j = i + ℓ − 1; C[i, j] = ∞;
for k = i to j do
cost = C[i, k − 1] + C[k + 1, j]
+ p[i, j] − p[k, k];
if cost < C[i, j] then
C[i, j] = cost; R[i, j] = k
endif
endfor
endfor
endfor.
The main part of the algorithm consists of three nested
loops each iterating through at most n values. The running
time is therefore in O(n^3).
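A C sketch of this dynamic program, with the p, C, R arrays as described; the size bound NMAX and the use of a large constant in place of ∞ are our assumptions, and the array p must be filled with the probability sums beforehand.

#define NMAX 50

double p[NMAX + 2][NMAX + 2];    /* p[i][j]: probability sum of items i..j */
double C[NMAX + 2][NMAX + 2];    /* C[i][j]: cost of the optimum tree */
int    R[NMAX + 2][NMAX + 2];    /* R[i][j]: root of the optimum tree */

void optimal_bst(int n) {
    for (int k = 1; k <= n; k++) {
        C[k][k - 1] = C[k][k] = 0.0;
        R[k][k] = k;
    }
    C[n + 1][n] = 0.0;                       /* dummy diagonal */
    for (int len = 2; len <= n; len++)       /* one diagonal at a time */
        for (int i = 1; i <= n - len + 1; i++) {
            int j = i + len - 1;
            C[i][j] = 1e30;                  /* stands in for infinity */
            for (int k = i; k <= j; k++) {
                double cost = C[i][k - 1] + C[k + 1][j] + p[i][j] - p[k][k];
                if (cost < C[i][j]) { C[i][j] = cost; R[i][j] = k; }
            }
        }
}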
Example. Table 1 shows the partial sums of probabil-
ities for the data in the earlier example. Table 2 shows
6p   j=1  j=2  j=3
i=1    3    5    6
i=2         2    3
i=3              1

Table 1: Six times the partial sums of probabilities used by the dynamic programming algorithm.

the costs and the indices of the roots of the optimum trees computed for all contiguous subsequences. The optimum

6C   j=1  j=2  j=3        R   j=1  j=2  j=3
i=1    0    2    4        i=1   1    1    1
i=2         0    1        i=2        2    2
i=3              0        i=3             3

Table 2: Six times the costs and the roots of the optimum trees.
tree can be constructed from R as follows. The root stores
the item with index R[1, 3] = 1. The left subtree is there-
fore empty and the right subtree stores a2, a3. The root
of the optimum right subtree stores the item with index
R[2, 3] = 2. Again the left subtree is empty and the right
subtree consists of a single node storing a3.
Improved running time. Notice that the array R in Ta-
ble 2 is monotonic, both along rows and along columns.
Indeed it is possible to prove R_i^{j−1} ≤ R_i^j in every row and R_i^j ≤ R_{i+1}^j in every column. We omit the proof and show how the two inequalities can be used to improve the dynamic programming algorithm. Instead of trying all roots from i through j we restrict the innermost for-loop to

for k = R[i, j − 1] to R[i + 1, j] do

The monotonicity property implies that this change does not alter the result of the algorithm. The running time of a single iteration of the outer for-loop is now

U_ℓ(n) = \sum_{i=1}^{n−ℓ+1} (R_{i+1}^j − R_i^{j−1} + 1).

Recall that j = i + ℓ − 1 and note that most terms cancel, giving

U_ℓ(n) = R_{n−ℓ+2}^n − R_1^{ℓ−1} + (n − ℓ + 1) ≤ 2n.

In words, each iteration of the outer for-loop takes only time O(n), which implies that the entire algorithm takes only time O(n^2).
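In the optimal_bst sketch above, this restriction amounts to changing only the innermost loop bounds; the entries R[i][j−1] and R[i+1][j] are already available because the table is filled one diagonal at a time.

/* restricted innermost loop, a sketch of the speed-up described above */
for (int k = R[i][j - 1]; k <= R[i + 1][j]; k++) {
    double cost = C[i][k - 1] + C[k + 1][j] + p[i][j] - p[k][k];
    if (cost < C[i][j]) { C[i][j] = cost; R[i][j] = k; }
}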
7 Red-Black Trees
Binary search trees are an elegant implementation of the
dictionary data type, which requires support for
item SEARCH (item),
void INSERT (item),
void DELETE (item),
and possible additional operations. Their main disadvan-
tage is the worst case time Ω(n) for a single operation.
The reasons are insertions and deletions that tend to get
the tree unbalanced. It is possible to counteract this ten-
dency with occasional local restructuring operations and
to guarantee logarithmic time per operation.
2-3-4 trees. A special type of balanced tree is the 2-3-4
tree. Each internal node stores one, two, or three items
and has two, three, or four children. Each leaf has the
same depth. As shown in Figure 15, the items in the in-
ternal nodes separate the items stored in the subtrees and
thus facilitate fast searching. In the smallest 2-3-4 tree of
Figure 15: A 2-3-4 tree of height two. All items are stored in
internal nodes.
height h, every internal node has exactly two children, so we have 2^h leaves and 2^h − 1 internal nodes. In the largest 2-3-4 tree of height h, every internal node has four children, so we have 4^h leaves and (4^h − 1)/3 internal nodes. We can store a 2-3-4 tree in a binary tree by expanding a node with i > 1 items and i + 1 children into i nodes each with one item, as shown in Figure 16.
Red-black trees. Suppose we color each edge of a bi-
nary search tree either red or black. The color is conve-
niently stored in the lower node of the edge. Such an edge-
colored tree is a red-black tree if
(1) there are no two consecutive red edges on any de-
scending path and every maximal such path ends with
a black edge;
(2) all maximal descending paths have the same number
of black edges.
Figure 16: Transforming a 2-3-4 tree into a binary tree. Bold edges are called red and the others are called black.
The number of black edges on a maximal descending path
is the black height, denoted as bh(̺). When we transform
a 2-3-4 tree into a binary tree as in Figure 16, we get a red-
black tree. The result of transforming the tree in Figure 15
is shown in Figure 17.

Figure 17: A red-black tree obtained from the 2-3-4 tree in Figure 15.
HEIGHT LEMMA. A red-black tree with n internal nodes
has height at most 2 log2(n + 1).
PROOF. The number of leaves is n + 1. Contract each
red edge to get a 2-3-4 tree with n + 1 leaves. Its height
is h ≤ log2(n + 1). We have bh(̺) = h, and by Rule
(1) the height of the red-black tree is at most 2bh(̺) ≤
2 log2(n + 1).
Rotations. Restructuring a red-black tree can be done
with only one operation (and its symmetric version): a ro-
tation that moves a subtree from one side to another, as
shown in Figure 18. The ordered sequence of nodes in the
left tree of Figure 18 is
. . . , order(A), ν, order(B), µ, order(C), . . . ,
and this is also the ordered sequence of nodes in the right
tree. In other words, a rotation maintains the ordering.
Function ZIG below implements the right rotation:
Figure 18: From left to right a right rotation and from right to left a left rotation.
Node ∗ ZIG(Node ∗ µ)
assert µ ≠ NULL and ν = µ → ℓ ≠ NULL;
µ → ℓ = ν → r; ν → r = µ; return ν.
Function ZAG is symmetric and performs a left rotation.
Occasionally, it is necessary to perform two rotations in
sequence, and it is convenient to combine them into a sin-
gle operation referred to as a double rotation, as shown in Figure 19.

Figure 19: The double right rotation at µ is the concatenation of a single left rotation at ν and a single right rotation at µ.

We use a function ZIGZAG to implement a
double right rotation and the symmetric function ZAGZIG
to implement a double left rotation.
Node ∗ ZIGZAG(Node ∗ µ)
µ → ℓ = ZAG(µ → ℓ); return ZIG(µ).
The double right rotation is the composition of two single
rotations: ZIGZAG(µ) = ZIG(µ) ◦ ZAG(ν). Remember
that the composition of functions is written from right to
left, so the single left rotation of ν precedes the single right
rotation of µ. Single rotations preserve the ordering of
nodes and so do double rotations.
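A C transcription of ZIG and its mirror image ZAG on the Node struct from the binary search tree sketch; the caller is responsible for storing the returned subtree root in the parent, and the color bookkeeping of red-black trees is omitted here.

/* Right rotation: the left child nu of mu becomes the root of the subtree. */
Node *zig(Node *mu) {
    Node *nu = mu->l;            /* assumes mu != NULL and mu->l != NULL */
    mu->l = nu->r;
    nu->r = mu;
    return nu;
}

/* Left rotation, symmetric to zig. */
Node *zag(Node *mu) {
    Node *nu = mu->r;            /* assumes mu != NULL and mu->r != NULL */
    mu->r = nu->l;
    nu->l = mu;
    return nu;
}

/* Double right rotation: a left rotation at mu->l followed by a right
   rotation at mu. */
Node *zigzag(Node *mu) {
    mu->l = zag(mu->l);
    return zig(mu);
}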
Insertion. Before studying the details of the restructur-
ing algorithms for red-black trees, we look at the trees that
arise in a short insertion sequence, as shown in Figure 20.
After adding 10, 7, 13, 4, we have two red edges in se-
quence and repair this by promoting 10 (A). After adding
2, we repair the two red edges in sequence by a single ro-
tation of 7 (B). After adding 5, we promote 4 (C), and after
adding 6, we do a double rotation of 7 (D).
Figure 20: Sequence of red-black trees generated by inserting
the items 10, 7, 13, 4, 2, 5, 6 in this sequence.
An item x is added by substituting a new internal node
for a leaf at the appropriate position. To satisfy Rule (2)
of the red-black tree definition, color the incoming edge
of the new node red, as shown in Figure 21.

Figure 21: The incoming edge of a newly added node is always red.

Start the
adjustment of color and structure at the parent ν of the new
node. We state the properties maintained by the insertion
algorithm as invariants that apply to a node ν traced by the
algorithm.
INVARIANT I. The only possible violation of the red-
black tree properties is that of Rule (1) at the node
ν, and if ν has a red incoming edge then it has ex-
actly one red outgoing edge.
Observe that Invariant I holds right after adding x. We
continue with the analysis of all the cases that may arise.
The local adjustment operations depend on the neighbor-
hood of ν.
Case 1. The incoming edge of ν is black. Done.
Case 2. The incoming edge of ν is red. Let µ be the
parent of ν and assume ν is left child of µ.
Case 2.1. Both outgoing edges of µ are red, as
in Figure 22. Promote µ. Let ν be the parent of
µ and recurse.
Figure 22: Promotion of µ. (The colors of the outgoing edges of
ν may be the other way round).
Case 2.2. Only one outgoing edge of µ is red,
namely the one from µ to ν.
Case 2.2.1. The left outgoing edge of ν is
red, as in Figure 23 to the left. Right rotate
µ. Done.
Figure 23: Right rotation of µ to the left and double right rotation
of µ to the right.
Case 2.2.2. The right outgoing edge of ν
is red, as in Figure 23 to the right. Double
right rotate µ. Done.
Case 2 has a symmetric case where left and right are in-
terchanged. An insertion may cause logarithmically many
promotions but at most two rotations.
Deletion. First find the node π that is to be removed. If
necessary, we substitute the inorder successor for π so we
can assume that both children of π are leaves. If π is last
in inorder we substitute symmetrically. Replace π by a
leaf ν, as shown in Figure 24. If the incoming edge of π is
red then change it to black. Otherwise, remember the in-
coming edge of ν as ‘double-black’, which counts as two
black edges. Similar to insertions, it helps to understand
the deletion algorithm in terms of a property it maintains.
INVARIANT D. The only possible violation of the red-
black tree properties is a double-black incoming edge
of ν.
Figure 24: Deletion of node π. The dashed edge counts as two
black edges when we compute the black depth.
Note that Invariant D holds right after we remove π. We
now present the analysis of all the possible cases. The ad-
justment operation is chosen depending on the local neigh-
borhood of ν.
Case 1. The incoming edge of ν is black. Done.
Case 2. The incoming edge of ν is double-black. Let
µ be the parent and κ the sibling of ν. Assume ν is
left child of µ and note that κ is internal.
Case 2.1. The edge from µ to κ is black.
Case 2.1.1. Both outgoing edges of κ are
black, as in Figure 25. Demote µ. Recurse
for ν = µ.
Figure 25: Demotion of µ.
Case 2.1.2. The right outgoing edge of κ
is red, as in Figure 26 to the left. Change
the color of that edge to black and left ro-
tate µ. Done.
Figure 26: Left rotation of µ to the left and double left rotation
of µ to the right.
Case 2.1.3. The right outgoing edge of
κ is black, as in Figure 26 to the right.
Change the color of the left outgoing edge
to black and double left rotate µ. Done.
Case 2.2. The edge from µ to κ is red, as in Fig-
ure 27. Left rotate µ. Recurse for ν.
Figure 27: Left rotation of µ.
Case 2 has a symmetric case in which ν is the right child of
µ. Case 2.2 seems problematic because it recurses without
moving ν any closer to the root. However, the configura-
tion excludes the possibility of Case 2.2 occurring again.
If we enter Cases 2.1.2 or 2.1.3 then the termination is im-
mediate. If we enter Case 2.1.1 then the termination fol-
lows because the incoming edge of µ is red. The deletion
may cause logarithmically many demotions but at most
three rotations.
Summary. The red-black tree is an implementation
of the dictionary data type and supports the operations
search, insert, delete in logarithmic time each. An inser-
tion or deletion requires the equivalent of at most three
single rotations. The red-black tree also supports finding
the minimum, maximum and the inorder successor, prede-
cessor of a given node in logarithmic time each.
8 Amortized Analysis
Amortization is an analysis technique that can influence
the design of algorithms in a profound way. Later in this
course, we will encounter data structures that owe their
very existence to the insight gained in performance due to
amortized analysis.
Binary counting. We illustrate the idea of amortization
by analyzing the cost of counting in binary. Think of an
integer as a linear array of bits, n = Σ_{i≥0} A[i] · 2^i. The
following loop keeps incrementing the integer stored in A.
loop i = 0;
while A[i] = 1 do A[i] = 0; i++ endwhile;
A[i] = 1.
forever.
We define the cost of counting as the total number of bit
changes that are needed to increment the number one by
one. What is the cost to count from 0 to n? Figure 28
shows that counting from 0 to 15 requires 26 bit changes.
Since n takes only 1 + ⌊log2 n⌋ bits or positions in A,
Figure 28: The numbers are written vertically from top to bot-
tom. The boxed bits change when the number is incremented.
a single increment does at most 2 + log2 n steps. This
implies that the cost of counting from 0 to n is at most
n log2 n+2n. Even though the upper bound of 2 +log2 n
is almost tight for the worst single step, we can show that
the total cost is much less than n times that. We do this
with two slightly different amortization methods referred
to as aggregation and accounting.
Aggregation. The aggregation method takes a global
view of the problem. The pattern in Figure 28 suggests
we define bi equal to the number of 1s and ti equal to
the number of trailing 1s in the binary notation of i. Ev-
ery other number has no trailing 1, every other number
of the remaining ones has one trailing 1, etc. Assuming
n = 2^k − 1, we therefore have exactly j − 1 trailing 1s for
2^{k−j} = (n + 1)/2^j integers between 0 and n − 1. The
total number of bit changes is therefore
T(n) = Σ_{i=0}^{n−1} (ti + 1) = (n + 1) · Σ_{j=1}^{k} j/2^j.
We use index transformation to show that the sum on the
right is less than 2:
Σ_{j≥1} j/2^j = Σ_{j≥1} (j − 1)/2^{j−1}
             = 2 · Σ_{j≥1} j/2^j − Σ_{j≥1} 1/2^{j−1}.
The last sum equals 2, so solving for Σ_{j≥1} j/2^j shows that it
is exactly 2. Hence the cost is T(n) < 2(n + 1). The amortized
cost per operation is T(n)/n, which is about 2.
Accounting. The idea of the accounting method is to
charge each operation what we think its amortized cost is.
If the amortized cost exceeds the actual cost, then the sur-
plus remains as a credit associated with the data structure.
If the amortized cost is less than the actual cost, the accu-
mulated credit is used to pay for the cost overflow. Define
the amortized cost of a bit change 0 → 1 as $2 and that
of 1 → 0 as $0. When we change 0 to 1 we pay $1 for
the actual expense and $1 stays with the bit, which is now
1. This $1 pays for the (later) cost of changing the 1 to 0.
Each increment has amortized cost $2, and together with
the money in the system, this is enough to pay for all the
bit changes. The cost is therefore at most 2n.
We see how a little trick, like making the 0 → 1 changes
pay for the 1 → 0 changes, leads to a very simple analysis
that is even more accurate than the one obtained by aggre-
gation.
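To see the bound concretely, the following short C program (a sketch of
my own; the array size and the test value of n are arbitrary choices)
simulates the counter above, counts every bit change, and checks that
the total stays below 2n.

#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BITS 64   /* enough positions for the count below */

/* Increment the binary counter stored in A and return the number of
   bit changes (the 1 -> 0 flips plus the final 0 -> 1 flip). */
static int increment(int A[BITS]) {
    int i = 0, changes = 0;
    while (A[i] == 1) { A[i] = 0; i++; changes++; }
    A[i] = 1; changes++;
    return changes;
}

int main(void) {
    int A[BITS];
    memset(A, 0, sizeof A);
    long n = 1000000, total = 0;
    for (long k = 0; k < n; k++)
        total += increment(A);
    printf("total bit changes = %ld, 2n = %ld\n", total, 2 * n);
    assert(total < 2 * n);   /* matches the bound T(n) < 2(n + 1) */
    return 0;
}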
Potential functions. We can further formalize the amor-
tized analysis by using a potential function. The idea is
similar to accounting, except there is no explicit credit
saved anywhere. The accumulated credit is an expres-
sion of the well-being or potential of the data structure.
Let ci be the actual cost of the i-th operation and Di the
data structure after the i-th operation. Let Φi = Φ(Di)
be the potential of Di, which is some numerical value
depending on the concrete application. Then we define
ai = ci + Φi − Φi−1 as the amortized cost of the i-th
operation. The sum of amortized costs of n operations is
Σ_{i=1}^{n} ai = Σ_{i=1}^{n} (ci + Φi − Φi−1) = Σ_{i=1}^{n} ci + Φn − Φ0.
We aim at choosing the potential such that Φ0 = 0 and
Φn ≥ 0 because then we get Σ ai ≥ Σ ci. In words,
the sum of amortized costs covers the sum of actual costs.
To apply the method to binary counting we define the po-
tential equal to the number of 1s in the binary notation,
Φi = bi. It follows that
Φi − Φi−1 = bi − bi−1
= (bi−1 − ti−1 + 1) − bi−1
= 1 − ti−1.
The actual cost of the i-th operation is ci = 1 + ti−1,
and the amortized cost is ai = ci + Φi − Φi−1 = 2.
We have Φ0 = 0 and Φn ≥ 0 as desired, and therefore
Σ ci ≤ Σ ai = 2n, which is consistent with the analysis
of binary counting with the aggregation and the account-
ing methods.
2-3-4 trees. As a more complicated application of amor-
tization we consider 2-3-4 trees and the cost of restructur-
ing them under insertions and deletions. We have seen
2-3-4 trees earlier when we talked about red-black trees.
A set of keys is stored in sorted order in the internal nodes
of a 2-3-4 tree, which is characterized by the following
rules:
(1) each internal node has 2 ≤ d ≤ 4 children and stores
d − 1 keys;
(2) all leaves have the same depth.
As for binary trees, being sorted means that the left-to-
right order of the keys is sorted. The only meaningful def-
inition of this ordering is the ordered sequence of the first
subtree followed by the first key stored in the root followed
by the ordered sequence of the second subtree followed by
the second key, etc.
To insert a new key, we attach a new leaf and add the key
to the parent ν of that leaf. All is fine unless ν overflows
because it now has five children. If it does, we repair the
violation of Rule (1) by climbing the tree one node at a
time. We call an internal node non-saturated if it has fewer
than four children.
Case 1. ν has five children and a non-saturated sibling
to its left or right. Move one child from ν to that
sibling, as in Figure 29.
Figure 29: The overflowing node gives one child to a non-
saturated sibling.
Case 2. ν has five children and no non-saturated sib-
ling. Split ν into two nodes and recurse for the parent
of ν, as in Figure 30. If ν has no parent then create a
new root whose only children are the two nodes ob-
tained from ν.
Figure 30: The overflowing node is split into two and the parent
is treated recursively.
Deleting a key is done in a similar fashion, although there
we have to battle with nodes ν that have too few children
rather than too many. Let ν have only one child. We repair
Rule (1) by adopting a child from a sibling or by merging
ν with a sibling. In the latter case the parent of ν loses a
child and needs to be visited recursively. The two opera-
tions are illustrated in Figures 31 and 32.
Figure 31: The underflowing node receives one child from a sib-
ling.
Amortized analysis. The worst case for inserting a new
key occurs when all internal nodes are saturated. The in-
sertion then triggers logarithmically many splits. Sym-
metrically, the worst case for a deletion occurs when all
Figure 32: The underflowing node is merged with a sibling and
the parent is treated recursively.
internal nodes have only two children. The deletion then
triggers logarithmically many mergers. Nevertheless, we
can show that in the amortized sense there are at most a
constant number of split and merge operations per inser-
tion and deletion.
We use the accounting method and store money in the
internal nodes. The best internal nodes have three children
because then they are flexible in both directions. They
require no money, but all other nodes are given a posi-
tive amount to pay for future expenses caused by split and
merge operations. Specifically, we store $4, $1, $0, $3,
$6 in each internal node with 1, 2, 3, 4, 5 children. As il-
lustrated in Figures 29 and 31, an adoption moves money
only from ν to its sibling. The operation keeps the total
amount the same or decreases it, which is even better. As
shown in Figure 30, a split frees up $5 from ν and spends
at most $3 on the parent. The extra $2 pay for the split
operation. Similarly, a merger frees $5 from the two af-
fected nodes and spends at most $3 on the parent. This
is illustrated in Figure 32. An insertion makes an initial
investment of at most $3 to pay for creating a new leaf.
Similarly, a deletion makes an initial investment of at most
$3 for destroying a leaf. If we charge $2 for each split and
each merge operation, the money in the system suffices to
cover the expenses. This implies that for n insertions and
deletions we get a total of at most 3n/2 split and merge
operations. In other words, the amortized number of split
and merge operations is at most 3/2.
Recall that there is a one-to-one correspondence be-
tween 2-3-4 trees and red-black trees. We can thus trans-
late the above update procedure and get an algorithm for
red-black trees with an amortized constant restructuring
cost per insertion and deletion. We already proved that for
red-black trees the number of rotations per insertion and
deletion is at most a constant. The above argument im-
plies that also the number of promotions and demotions is
at most a constant, although in the amortized and not in
the worst-case sense as for the rotations.
9 Splay Trees
Splay trees are similar to red-black trees except that they
guarantee good shape (small height) only on the average.
They are simpler to code than red-black trees and have the
additional advantage of giving faster access to items that
are more frequently searched. The reason for both is that
splay trees are self-adjusting.
Self-adjusting binary search trees. Instead of explic-
itly maintaining the balance using additional information
(such as the color of edges in the red-black tree), splay
trees maintain balance implicitly through a self-adjusting
mechanism. Good shape is a side-effect of the operations
that are applied. These operations are applied while splay-
ing a node, which means moving it up to the root of the
tree, as illustrated in Figure 33. A detailed analysis will
Figure 33: The node storing 1 is splayed using three single rota-
tions.
reveal that single rotations do not imply good amortized
performance but combinations of single rotations in pairs
do. Aside from double rotations, we use roller-coaster
rotations that compose two single left or two single right
rotations, as shown in Figure 35. The sequence of the two
single rotations is important, namely first the higher then
the lower node. Recall that ZIG(κ) performs a single right
rotation and returns the new root of the rotated subtree.
The roller-coaster rotation to the right is then
Node ∗ ZIGZIG(Node ∗ κ)
return ZIG(ZIG(κ)).
Function ZAGZAG is symmetric, exchanging left and
right, and functions ZIGZAG and ZAGZIG are the two
double rotations already used for red-black trees.
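For reference, here is a minimal C sketch of ZIG and ZIGZIG on a plain
binary node; the struct layout and the lower-case function names are my
own assumptions and not part of the notes.

#include <stdio.h>

typedef struct Node {
    int info;
    struct Node *left, *right;
} Node;

/* Single right rotation (ZIG): the left child of kappa becomes the
   new root of the rotated subtree and kappa becomes its right child. */
static Node *zig(Node *kappa) {
    Node *mu = kappa->left;
    kappa->left = mu->right;
    mu->right = kappa;
    return mu;
}

/* Roller-coaster rotation to the right (ZIGZIG): first rotate the
   higher node, then the lower one, as described above. */
static Node *zigzig(Node *kappa) {
    return zig(zig(kappa));
}

int main(void) {
    /* left-leaning chain 3 -> 2 -> 1; after ZIGZIG the node storing 1
       is the root, with 2 and 3 hanging off to the right */
    Node a = {1, 0, 0}, b = {2, &a, 0}, c = {3, &b, 0};
    Node *root = zigzig(&c);
    printf("new root: %d\n", root->info);
    return 0;
}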
Splay. A splay operation finds an item and uses rotations
to move the corresponding node up to the root position.
Whenever possible, a double rotation or a roller-coaster
rotation is used. We dispense with special cases and show
Function SPLAY for the case the search item x is less than
the item in the root.
if x < ̺ → info then µ = ̺ → ℓ;
    if x < µ → info then
        µ → ℓ = SPLAY(µ → ℓ, x);
        return ZIGZIG(̺)
    elseif x > µ → info then
        µ → r = SPLAY(µ → r, x);
        return ZIGZAG(̺)
    else
        return ZIG(̺)
    endif.
If x is stored in one of the children of ̺ then it is moved
to the root by a single rotation. Otherwise, it is splayed
recursively to the third level and moved to the root either
by a double or a roller-coaster rotation. The number of
rotations depends on the length of the path from ̺ to x.
Specifically, if the path is i edges long then x is splayed in
⌊i/2⌋ double and roller-coaster rotations and zero or one
single rotation. In the worst case, a single splay operation
takes almost as many rotations as there are nodes in the
tree. We will see shortly that the amortized number of
rotations is at most logarithmic in the number of nodes.
Amortized cost. Recall that the amortized cost of an op-
eration is the actual cost minus the cost for work put into
improving the data structure. To analyze the cost, we use a
potential function that measures the well-being of the data
structure. We need definitions:
the size s(ν) is the number of descendents of node ν, in-
cluding ν,
the balance β(ν) is twice the floor of the binary logarithm
of the size, β(ν) = 2⌊log2 s(ν)⌋,
the potential Φ of a tree or a collection of trees is the sum
of balances over all nodes, Φ = Σ_ν β(ν),
the actual cost ci of the i-th splay operation is 1 plus the
number of single rotations (counting a double or
roller-coaster rotation as two single rotations),
the amortized cost ai of the i-th splay operation is ai =
ci + Φi − Φi−1.
We have Φ0 = 0 for the empty tree and Φi ≥ 0 in general.
This implies that the total actual cost does not exceed the
total amortized cost, Σ ci = Σ ai − Φn + Φ0 ≤ Σ ai.
To get a feeling for the potential, we compute Φ for
the two extreme cases. Note first that the integral of the
natural logarithm is ∫ ln x dx = x ln x − x and therefore
∫ log2 x dx = x log2 x − x/ln 2. In the extreme unbal-
anced case, the balance of the i-th node from the bottom
is 2⌊log2 i⌋ and the potential is
Φ = 2 Σ_{i=1}^{n} ⌊log2 i⌋ = 2n log2 n − O(n).
In the balanced case, we bound Φ from above by 2U(n),
where U(n) = 2U(n/2) + log2 n. We prove that U(n) < 2n
for the case when n = 2^k. Consider the perfectly balanced
tree with n leaves. The height of the tree is k = log2 n.
We encode the term log2 n of the recurrence relation by
drawing the hook-like path from the root to the right child
and then following left edges until we reach the leaf level.
Each internal node encodes one of the recursively surfac-
ing log-terms by a hook-like path starting at that node. The
paths are pairwise edge-disjoint, which implies that their
total length is at most the number of edges in the tree,
which is 2n − 2.
Investment. The main part of the amortized time analy-
sis is a detailed study of the three types of rotations: sin-
gle, roller-coaster, and double. We write β(ν) for the bal-
ance of a node ν before the rotation and β′(ν) for the bal-
ance after the rotation. Let ν be the lowest node involved
in the rotation. The goal is to prove that the amortized
cost of a roller-coaster and a double rotation is at most
3[β′(ν) − β(ν)] each, and that of a single rotation is at
most 1 + 3[β′(ν) − β(ν)]. Summing these terms over the
rotations of a splay operation gives a telescoping series in
which all terms cancel except the first and the last. To this
we add 1 for the at most one single rotation and another 1
for the constant cost in the definition of the actual cost.
INVESTMENT LEMMA. The amortized cost of splaying a
node ν in a tree ̺ is at most 2 + 3[β(̺) − β(ν)].
Before looking at the details of the three types of rota-
tions, we prove that if two siblings have the same balance
then their common parent has a larger balance. Because
balances are even integers this means that the balance of
the parent exceeds the balance of its children by at least 2.
BALANCE LEMMA. If µ has children ν, κ and β(ν) =
β(κ) = β then β(µ) ≥ β + 2.
PROOF. By definition β(ν) = 2⌊log2 s(ν)⌋ and therefore
s(ν) ≥ 2^{β/2}. We have s(µ) = 1 + s(ν) + s(κ) ≥ 2^{1+β/2},
and therefore β(µ) ≥ β + 2.
Single rotation. The amortized cost of a single rotation
shown in Figure 34 is 1 for performing the rotation plus
the change in the potential:
a = 1 + β′(ν) + β′(µ) − β(ν) − β(µ)
  ≤ 1 + 3[β′(ν) − β(ν)]
because β′(µ) ≤ β(µ) and β(ν) ≤ β′(ν).
Figure 34: The size of µ decreases and that of ν increases from
before to after the rotation.
Roller-coaster rotation. The amortized cost of a roller-
coaster rotation shown in Figure 35 is
a = 2 + β′(ν) + β′(µ) + β′(κ) − β(ν) − β(µ) − β(κ)
  ≤ 2 + 2[β′(ν) − β(ν)]
because β′(κ) ≤ β(κ), β′(µ) ≤ β′(ν), and β(ν) ≤ β(µ).
We distinguish two cases to prove that a is bounded from
above by 3[β′(ν) − β(ν)]. In both cases, the drop in the
Figure 35: If in the middle tree the balance of ν is the same as
the balance of µ then by the Balance Lemma the balance of κ is
less than that common balance.
potential pays for the two single rotations.
Case β′(ν) > β(ν). The difference between the balance
of ν before and after the roller-coaster rotation is at
least 2. Hence a ≤ 3[β′(ν) − β(ν)].
Case β′(ν) = β(ν) = β. Then the balances of nodes ν
and µ in the middle tree in Figure 35 are also equal
to β. The Balance Lemma thus implies that the bal-
ance of κ in that middle tree is at most β − 2. But
since the balance of κ after the roller-coaster rotation
is the same as in the middle tree, we have β′(κ) < β.
Hence a ≤ 0 = 3[β′(ν) − β(ν)].
Double rotation. The amortized cost of a double rota-
tion shown in Figure 36 is
a = 2 + β′(ν) + β′(µ) + β′(κ) − β(ν) − β(µ) − β(κ)
  ≤ 2 + [β′(ν) − β(ν)]
because β′(κ) ≤ β(κ) and β′(µ) ≤ β(µ). We again dis-
tinguish two cases to prove that a is bounded from above
by 3[β′(ν) − β(ν)]. In both cases, the drop in the potential
pays for the two single rotations.
Case β′(ν) > β(ν). The difference is at least 2, which
implies a ≤ 3[β′(ν) − β(ν)], as before.
Case β′(ν) = β(ν) = β. Then β(µ) = β(κ) = β. We
have β′(µ) < β′(ν) or β′(κ) < β′(ν) by the Balance
Lemma. Hence a ≤ 0 = 3[β′(ν) − β(ν)].
Figure 36: In a double rotation, the sizes of µ and κ decrease
from before to after the operation.
Dictionary operations. In summary, we showed that the
amortized cost of splaying a node ν in a binary search tree
with root ̺ is at most 2 + 3[β(̺) − β(ν)]. We now use this
result to show that splay trees have good amortized perfor-
mance for all standard dictionary operations and more.
To access an item x, we first splay it to the root and return
the root even if it does not contain x. The amortized cost
is O(β(̺)).
Given an item x, we can split a splay tree into two,
one containing all items smaller than or equal to x and the
other all items larger than x, as illustrated in Figure 37.
The amortized cost is the amortized cost for splaying plus
Figure 37: After splaying x to the root, we split the tree by un-
linking the right subtree.
the increase in the potential, which we denote as Φ′ − Φ.
Recall that the potential of a collection of trees is the sum
of the balances of all nodes. Splitting the tree decreases
the number of descendents and therefore the balance of
the root, which implies that Φ′ − Φ < 0. It follows that
the amortized cost of a split operation is less than that of a
splay operation and therefore in O(β(̺)).
Two splay trees can be joined into one if all items in
one tree are smaller than all items in the other tree, as il-
lustrated in Figure 38. The cost for splaying the maximum
Figure 38: We first splay the maximum in the tree with the
smaller items and then link the two trees.
in the first tree is O(β(̺1)). The potential increase caused
by linking the two trees is
Φ′ − Φ ≤ 2⌊log2(s(̺1) + s(̺2))⌋
       ≤ 2 log2 s(̺1) + 2 log2 s(̺2).
The amortized cost of joining is thus O(β(̺1) + β(̺2)).
To insert a new item, x, we split the tree. If x is al-
ready in the tree, we undo the split operation by linking
the two trees. Otherwise, we make the two trees the left
and right subtrees of a new node storing x. The amortized
cost for splaying is O(β(̺)). The potential increase caused
by linking is
Φ′ − Φ ≤ 2⌊log2(s(̺1) + s(̺2) + 1)⌋ = β(̺).
The amortized cost of an insertion is thus O(β(̺)).
To delete an item, we splay it to the root, remove the
root, and join the two subtrees. Removing x decreases the
potential, and the amortized cost of joining the two sub-
trees is at most O(β(̺)). This implies that the amortized
cost of a deletion is at most O(β(̺)).
Weighted search. A nice property of splay trees not
shared by most other balanced trees is that they automat-
ically adapt to biased search probabilities. It is plausible
that this would be the case because items that are often
accessed tend to live at or near the root of the tree. The
analysis is somewhat involved and we only state the re-
sult. Each item or node has a positive weight, w(ν) > 0,
and we define W = Σ_ν w(ν). We have the following
generalization of the Investment Lemma, which we state
without proof.
WEIGHTED INVESTMENT LEMMA. The amortized cost
of splaying a node ν in a tree with total weight W
is at most 2 + 3 log2(W/w(ν)).
It can be shown that this result is asymptotically best pos-
sible. In other words, the amortized search time in a splay
tree is at most a constant times the optimum, which is
what we achieve with an optimum weighted binary search
tree. In contrast to splay trees, optimum trees are expen-
sive to construct and they require explicit knowledge of
the weights.
Second Homework Assignment
Write the solution to each problem on a single page. The
deadline for handing in solutions is October 02.
Problem 1. (20 = 12 + 8 points). Consider an array
A[1..n] for which we know that A[1] ≥ A[2] and
A[n − 1] ≤ A[n]. We say that i is a local minimum if
A[i − 1] ≥ A[i] ≤ A[i + 1]. Note that A has at least
one local minimum.
(a) We can obviously find a local minimum in time
O(n). Describe a more efficient algorithm that
does the same.
(b) Analyze your algorithm.
Problem 2. (20 points). A vertex cover for a tree is a sub-
set V of its vertices such that each edge has at least
one endpoint in V . It is minimum if there is no other
vertex cover with a smaller number of vertices. Given
a tree with n vertices, describe an O(n)-time algo-
rithm for finding a minimum vertex cover. (Hint: use
dynamic programming or the greedy method.)
Problem 3. (20 points). Consider a red-black tree formed
by the sequential insertion of n > 1 items. Argue that
the resulting tree has at least one red edge.
[Notice that we are talking about a red-black tree
formed by insertions. Without this assumption, the
tree could of course consist of black edges only.]
Problem 4. (20 points). Prove that 2n rotations suffice to
transform any binary search tree into any other binary
search tree storing the same n items.
Problem 5. (20 = 5 + 5 + 5 + 5 points). Consider a
collection of items, each consisting of a key and a
cost. The keys come from a totally ordered universe
and the costs are real numbers. Show how to maintain
a collection of items under the following operations:
(a) ADD(k, c): assuming no item in the collection
has key k yet, add an item with key k and cost
c to the collection;
(b) REMOVE(k): remove the item with key k from
the collection;
(c) MAX(k1, k2): assuming k1 ≤ k2, report the
maximum cost among all items with keys k ∈
[k1, k2].
(d) COUNT(c1, c2): assuming c1 ≤ c2, report the
number of items with cost c ∈ [c1, c2];
Each operation should take at most O(log n) time in
the worst case, where n is the number of items in the
collection when the operation is performed.
III PRIORITIZING
10 Heaps and Heapsort
11 Fibonacci Heaps
12 Solving Recurrence Relations
Third Homework Assignment
10 Heaps and Heapsort
A heap is a data structure that stores a set and allows fast
access to the item with highest priority. It is the basis of
a fast implementation of selection sort. On the average,
this algorithm is a little slower than quicksort but it is not
sensitive to the input ordering or to random bits and runs
about as fast in the worst case as on the average.
Priority queues. A data structure implements the prior-
ity queue abstract data type if it supports at least the fol-
lowing operations:
void INSERT (item),
item FINDMIN (void),
void DELETEMIN (void).
The operations are applied to a set of items with priori-
ties. The priorities are totally ordered so any two can be
compared. To avoid any confusion, we will usually refer
to the priorities as ranks. We will always use integers as
priorities and follow the convention that smaller ranks rep-
resent higher priorities. In many applications, FINDMIN
and DELETEMIN are combined:
item EXTRACTMIN(void)
r = FINDMIN; DELETEMIN; return r.
Function EXTRACTMIN removes and returns the item
with smallest rank.
Heap. A heap is a particularly compact priority queue.
We can think of it as a binary tree with items stored in the
internal nodes, as in Figure 39. Each level is full, except
Figure 39: Ranks increase or, more precisely, do not decrease
from top to bottom.
possibly the last, which is filled from left to right until
we run out of items. The items are stored in heap-order:
every node µ has a rank larger than or equal to the rank of
its parent. Symmetrically, µ has a rank less than or equal
to the ranks of both its children. As a consequence, the
root contains the item with smallest rank.
We store the nodes of the tree in a linear array, level
by level from top to bottom and each level from left to
right, as shown in Figure 40. The embedding saves ex-
Figure 40: The binary tree is layed out in a linear array. The root
is placed in A[1], its children follow in A[2] and A[3], etc.
plicit pointers otherwise needed to establish parent-child
relations. Specifically, we can find the children and par-
ent of a node by index computation: the left child of A[i]
is A[2i], the right child is A[2i + 1], and the parent is
A[⌊i/2⌋]. The item with minimum rank is stored in the
first element:
item FINDMIN(int n)
assert n ≥ 1; return A[1].
Since the index along a path at least doubles each step,
paths can have length at most log2 n.
Deleting the minimum. We first study the problem of
repairing the heap-order if it is violated at the root, as
shown in Figure 41. Let n be the length of the array. We
Figure 41: The root is exchanged with the smaller of its two
children. The operation is repeated along a single path until the
heap-order is repaired.
repair the heap-order by a sequence of swaps along a sin-
gle path. Each swap is between an item and the smaller of
its children:
void SIFT-DN(int i, n)
if 2i ≤ n then
k = arg min{A[2i], A[2i + 1]}
if A[k] < A[i] then SWAP(i, k);
SIFT-DN(k, n)
endif
endif.
Here we assume that A[n + 1] is defined and larger than
A[n]. Since a path has at most log2 n edges, the time to re-
pair the heap-order takes time at most O(log n). To delete
the minimum we overwrite the root with the last element,
shorten the heap, and repair the heap-order:
void DELETEMIN(int ∗ n)
A[1] = A[∗n]; ∗n−−; SIFT-DN(1, ∗n).
Instead of the variable that stores n, we pass a pointer to
that variable, ∗n, in order to use it as input and output
parameter.
Inserting. Consider repairing the heap-order if it is vio-
lated at the last position of the heap. In this case, the item
moves up the heap until it reaches a position where its rank
is at least as large as that of its parent.
void SIFT-UP(int i)
if i ≥ 2 then k = ⌊i/2⌋;
if A[i] < A[k] then SWAP(i, k);
SIFT-UP(k)
endif
endif.
An item is added by first expanding the heap by one ele-
ment, placing the new item in the position that just opened
up, and repairing the heap-order.
void INSERT(int ∗ n, item x)
∗n++; A[∗n] = x; SIFT-UP(∗n).
A heap supports FINDMIN in constant time and INSERT
and DELETEMIN in time O(log n) each.
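A small runnable C version of SIFT-UP and INSERT, with the same 1-based
indexing, might look as follows; the fixed capacity and the iterative
loop in place of recursion are my own simplifications.

#include <stdio.h>

static int A[64];   /* min-heap stored in A[1..n] */

static void swap(int i, int k) { int t = A[i]; A[i] = A[k]; A[k] = t; }

/* Move A[i] up until its rank is at least that of its parent A[i/2]. */
static void sift_up(int i) {
    while (i >= 2 && A[i] < A[i / 2]) {
        swap(i, i / 2);
        i = i / 2;
    }
}

/* Expand the heap by one element, place x there, repair the order. */
static void insert(int *n, int x) {
    (*n)++; A[*n] = x; sift_up(*n);
}

int main(void) {
    int n = 0;
    int input[] = {7, 2, 9, 5, 13, 6};
    for (int i = 0; i < 6; i++) insert(&n, input[i]);
    printf("minimum rank = %d\n", A[1]);   /* FINDMIN */
    return 0;
}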
Sorting. Priority queues can be used for sorting. The
first step throws all items into the priority queue, and the
second step takes them out in order. Assuming the items
are already stored in the array, the first step can be done
by repeated heap repair:
for i = 1 to n do SIFT-UP(i) endfor.
In the worst case, the i-th item moves up all the way to
the root. The number of exchanges is therefore at most
Σ_{i=1}^{n} log2 i ≤ n log2 n. The upper bound is asymptot-
ically tight because half the terms in the sum are at least
log2(n/2) = log2 n − 1. It is also possible to construct the ini-
tial heap in time O(n) by building it from bottom to top.
We modify the first step accordingly, and we implement
the second step to rearrange the items in sorted order:
void HEAPSORT(int n)
for i = n downto 1 do SIFT-DN(i, n) endfor;
for i = n downto 1 do
SWAP(i, 1); SIFT-DN(1, i − 1)
endfor.
At each step of the first for-loop, we consider the sub-
tree with root A[i]. At this moment, the items in the left
and right subtrees rooted at A[2i] and A[2i + 1] are al-
ready heaps. We can therefore use one call to function
SIFT-DN to make the subtree with root A[i] a heap. We
will prove shortly that this bottom-up construction of the
heap takes time only O(n). Figure 42 shows the array
after each iteration of the second for-loop. Note how
the heap gets smaller by one element each step. A sin-
Figure 42: Each step moves the last heap element to the root and
thus shrinks the heap. The circles mark the items involved in the
sift-down operation.
gle sift-down operation takes time O(log n), and in total
HEAPSORT takes time O(n log n). In addition to the in-
put array, HEAPSORT uses a constant number of variables
and memory for the recursion stack used by SIFT-DN.
We can save the memory for the stack by writing func-
tion SIFT-DN as an iteration. The sort can be changed to
non-decreasing order by reversing the order of items in the
heap.
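For completeness, here is a compact, runnable C version of HEAPSORT;
it is my own rendering, switching to 0-based indexing, so the children
of position i are 2i + 1 and 2i + 2, and, like the pseudocode above,
it leaves the array in non-increasing order.

#include <stdio.h>

static void swap(int A[], int i, int j) {
    int t = A[i]; A[i] = A[j]; A[j] = t;
}

/* Sift A[i] down within A[0..n-1], swapping with the smaller child. */
static void sift_dn(int A[], int i, int n) {
    while (2 * i + 1 < n) {
        int k = 2 * i + 1;                       /* left child */
        if (k + 1 < n && A[k + 1] < A[k]) k++;   /* smaller child */
        if (A[k] < A[i]) { swap(A, i, k); i = k; } else break;
    }
}

/* Min-heap heapsort: build the heap bottom-up, then repeatedly move
   the minimum to the end of the shrinking heap. */
static void heapsort_desc(int A[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--) sift_dn(A, i, n);
    for (int i = n - 1; i >= 1; i--) {
        swap(A, i, 0);
        sift_dn(A, 0, i);
    }
}

int main(void) {
    int A[] = {2, 7, 9, 6, 5, 8, 15, 7, 8, 10, 12, 13};
    int n = sizeof A / sizeof A[0];
    heapsort_desc(A, n);
    for (int i = 0; i < n; i++) printf("%d ", A[i]);
    printf("\n");
    return 0;
}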
Analysis of heap construction. We return to proving
that the bottom-up approach to constructing a heap takes
only O(n) time. Assuming the worst case, in which ev-
ery node sifts down all the way to the last level, we draw
the swaps as edges in a tree; see Figure 43. To avoid
Figure 43: Each node generates a path that shares no edges with
the paths of the other nodes.
drawing any edge twice, we always first swap to the right
and then continue swapping to the left until we arrive at
the last level. This introduces only a small inaccuracy in
our estimate. The paths cover each edge once, except for
the edges on the leftmost path, which are not covered at
all. The number of edges in the tree is n − 1, which im-
plies that the total number of swaps is less than n. Equiv-
alently, the amortized number of swaps per item is less
than 1. There is a striking difference in time-complexity
to sorting, which takes an amortized number of about
log2 n comparisons per item. The difference between 1
and log2 n may be interpreted as a measure of how far
from sorted a heap-ordered array still is.
11 Fibonacci Heaps
The Fibonacci heap is a data structure implementing the
priority queue abstract data type, just like the ordinary
heap but more complicated and asymptotically faster for
some operations. We first introduce binomial trees, which
are special heap-ordered trees, and then explain Fibonacci
heaps as collections of heap-ordered trees.
Binomial trees. The binomial tree of height h is a tree
obtained from two binomial trees of height h − 1, by link-
ing the root of one to the other. The binomial tree of height
0 consists of a single node. Binomial trees of heights up
to 4 are shown in Figure 44. Each step in the construc-
Figure 44: Binomial trees of heights 0, 1, 2, 3, 4. Each tree is
obtained by linking two copies of the previous tree.
tion increases the height by one, increases the degree (the
number of children) of the root by one, and doubles the
size of the tree. It follows that a binomial tree of height h
has root degree h and size 2^h. The root has the largest de-
gree of any node in the binomial tree, which implies that
every node in a binomial tree with n nodes has degree at
most log2 n.
To store any set of items with priorities, we use a small
collection of binomial trees. For an integer n, let ni be
the i-th bit in the binary notation, so we can write
n = Σ_{i≥0} ni·2^i. To store n items, we use a binomial tree
of size 2^i for each ni = 1. The total number of binomial trees
is thus the number of 1’s in the binary notation of n, which
is at most log2(n + 1). The collection is referred to as a
binomial heap. The items in each binomial tree are stored
in heap-order. There is no specific relationship between
the items stored in different binomial trees. The item with
minimum key is thus stored in one of the logarithmically
many roots, but it is not prescribed ahead of time in which
one. An example is shown in Figure 45 where 11₁₀ = 1011₂
items are stored in three binomial trees with sizes
8, 2, and 1. In order to add a new item to the set, we create
a new binomial tree of size 1 and we successively link
binomial trees as dictated by the rules of adding 1 to the
=
+
10
4
11
13
12
15 7
15
9 8
9
15
10
11
13
15
12
4 7
9
5
8
5
9
Figure 45: Adding the shaded node to a binomial heap consisting
of three binomial trees.
binary notation of n. In the example, we get 1011₂ + 1₂ =
1100₂. The new collection thus consists of two binomial
trees with sizes 8 and 4. The size 8 tree is the old one, and
the size 4 tree is obtained by first linking the two size 1
trees and then linking the resulting size 2 tree to the old
size 2 tree. All this is illustrated in Figure 45.
Fibonacci heaps. A Fibonacci heap is a collection of
heap-ordered trees. Ideally, we would like it to be a col-
lection of binomial trees, but we need more flexibility. It
will be important to understand how exactly the nodes of a
Fibonacci heap are connected by pointers. Siblings are or-
ganized in doubly-linked cyclic lists, and each node has a
pointer to its parent and a pointer to one of its children, as
shown in Figure 46. Besides the pointers, each node stores
Figure 46: The Fibonacci heap representation of the first collec-
tion of heap-ordered trees in Figure 45.
a key, its degree, and a bit that can be used to mark or un-
mark the node. The roots of the heap-ordered trees are
doubly-linked in a cycle, and there is an explicit pointer to
the root that stores the item with the minimum key. Figure
47 illustrates a few basic operations we perform on a Fi-
bonacci heap. Given two heap-ordered trees, we link them
by making the root with the bigger key the child of the
other root. To unlink a heap-ordered tree or subtree, we
remove its root from the doubly-linked cycle. Finally, to
merge two cycles, we cut both open and connect them at
Figure 47: Cartoons for linking two trees, unlinking a tree, and
merging two cycles.
their ends. Any one of these three operations takes only
constant time.
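Because the pointer structure is easy to get wrong, here is a minimal
C sketch of the node layout and of the constant-time link operation;
the type and function names are mine, and a complete implementation
would also remove the losing root from the root cycle and maintain
the min pointer.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef struct FibNode {
    int key;
    int degree;                    /* number of children */
    bool marked;                   /* used later for cascading cuts */
    struct FibNode *parent;
    struct FibNode *child;         /* any one of the children */
    struct FibNode *left, *right;  /* doubly-linked cyclic sibling list */
} FibNode;

/* Splice the single node b into the cyclic list that contains a. */
static void splice(FibNode *a, FibNode *b) {
    b->left = a;
    b->right = a->right;
    a->right->left = b;
    a->right = b;
}

/* Link two heap-ordered trees: the root with the bigger key becomes
   a child of the other root.  Returns the surviving root. */
static FibNode *link_trees(FibNode *a, FibNode *b) {
    if (b->key < a->key) { FibNode *t = a; a = b; b = t; }
    b->parent = a;
    if (a->child == NULL) {
        a->child = b;
        b->left = b->right = b;    /* b forms its own child cycle */
    } else {
        splice(a->child, b);
    }
    a->degree++;
    b->marked = false;
    return a;
}

int main(void) {
    FibNode x = {4, 0, false, NULL, NULL, &x, &x};
    FibNode y = {7, 0, false, NULL, NULL, &y, &y};
    FibNode *root = link_trees(&x, &y);
    printf("root key %d, degree %d, child key %d\n",
           root->key, root->degree, root->child->key);
    return 0;
}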
Potential function. A Fibonacci heap supports a vari-
ety of operations, including the standard ones for priority
queues. We use a potential function to analyze their amor-
tized cost applied to an initially empty Fibonacci heap.
Letting ri be the number of roots in the root cycle and
mi the number of marked nodes, the potential after the
i-th operation is Φi = ri +2mi. When we deal with a col-
lection of Fibonacci heaps, we define its potential as the
sum of individual potentials. The initial Fibonacci heap is
empty, so Φ0 = 0. As usual, we let ci be the actual cost
and ai = ci + Φi − Φi−1 the amortized cost of the i-th
operation. Since Φ0 = 0 and Φi ≥ 0 for all i, the actual
cost is less than the amortized cost:
Σ_{i=1}^{n} ci ≤ Σ_{i=1}^{n} ai = rn + 2mn + Σ_{i=1}^{n} ci.
For some of the operations, it is fairly easy to compute the
amortized cost. We get the minimum by returning the key
in the marked root. This operation does not change the po-
tential and its amortized and actual cost is ai = ci = 1.
We meld two Fibonacci heaps, H1 and H2, by first merg-
ing the two root cycles and second adjusting the pointer to
the minimum key. We have
ri(H) = ri−1(H1) + ri−1(H2),
mi(H) = mi−1(H1) + mi−1(H2),
which implies that there is no change in potential. The
amortized and actual cost is therefore ai = ci = 1. We
insert a key into a Fibonacci heap by first creating a new
Fibonacci heap that stores only the new key and second
melding the two heaps. We have one more node in the
root cycle so the change in potential is Φi − Φi−1 = 1.
The amortized cost is therefore ai = ci + 1 = 2.
Deletemin. Next we consider the somewhat more in-
volved operation of deleting the minimum key, which is
done in four steps:
Step 1. Remove the node with minimum key from the
root cycle.
Step 2. Merge the root cycle with the cycle of children
of the removed node.
Step 3. As long as there are two roots with the same
degree link them.
Step 4. Recompute the pointer to the minimum key.
For Step 3, we use a pointer array R. Initially, R[i] =
NULL for each i. For each root ̺ in the root cycle, we
execute the following iteration.
i = ̺ → degree;
while R[i] ≠ NULL do
    ̺′ = R[i]; R[i] = NULL; ̺ = LINK(̺, ̺′); i++
endwhile;
R[i] = ̺.
To analyze the amortized cost for deleting the minimum,
let D(n) be the maximum possible degree of any node
in a Fibonacci heap of n nodes. The number of linking
operations in Step 3 is the number of roots we start with,
which is less than ri−1 +D(n), minus the number of roots
we end up with, which is ri. After Step 3, all roots have
different degrees, which implies ri ≤ D(n)+1. It follows
that the actual cost for the four steps is
ci ≤ 1 + 1 + (ri−1 + D(n) − ri) + (D(n) + 1)
= 3 + 2D(n) + ri−1 − ri.
The potential change is Φi −Φi−1 = ri −ri−1. The amor-
tized cost is therefore ai = ci + Φi − Φi−1 ≤ 2D(n) + 3.
We will prove next time that the maximum possible de-
gree is at most logarithmic in the size of the Fibonacci
heap, D(n) < 2 log2(n + 1). This implies that deleting
the minimum has logarithmic amortized cost.
Decreasekey and delete. Besides deletemin, we also
have operations that delete an arbitrary item and that de-
crease the key of an item. Both change the structure of
the heap-ordered trees and are the reason why a Fibonacci
heap is not a collection of binomial trees but of more gen-
eral heap-ordered trees. The decreasekey operation re-
places the item with key x stored in the node ν by x − ∆,
where ∆ ≥ 0. We will see that this can be done more effi-
ciently than to delete x and to insert x − ∆. We decrease
the key in four steps.
Step 1. Unlink the tree rooted at ν.
Step 2. Decrease the key in ν by ∆.
Step 3. Add ν to the root cycle and possibly update
the pointer to the minimum key.
Step 4. Do cascading cuts.
We will explain cascading cuts shortly, after explaining
the four steps we take to delete a node ν. Before we delete
a node ν, we check whether ν = min, and if it is then we
delete the minimum as explained above. Assume therefore
that ν 6= min.
Step 1. Unlink the tree rooted at ν.
Step 2. Merge the root-cycle with the cycle of ν’s chil-
dren.
Step 3. Dispose of ν.
Step 4. Do cascading cuts.
Figure 48 illustrates the effect of decreasing a key and of
deleting a node. Both operations create trees that are not
Figure 48: A Fibonacci heap initially consisting of three bino-
mial trees modified by a decreasekey operation (decreasing 12 to 2)
and a delete operation (deleting 4).
binomial, and we use cascading cuts to make sure that the
shapes of these trees are not very different from the shapes
of binomial trees.
Cascading cuts. Let ν be a node that becomes the child
of another node at time t. We mark ν when it loses its first
child after time t. Then we unmark ν, unlink it, and add it
to the root-cycle when it loses its second child thereafter.
We call this operation a cut, and it may cascade because
one cut can cause another, and so on. Figure 49 illus-
trates the effect of cascading in a heap-ordered tree with
two marked nodes. The first step decreases key 10 to 7,
and the second step cuts first node 5 and then node 4.
Figure 49: The effect of cascading after decreasing 10 to 7.
Marked nodes are shaded.
Summary analysis. As mentioned earlier, we will prove
D(n) < 2 log2(n + 1) next time. Assuming this bound, we
are able to compute the amortized cost of all operations.
The actual cost of Step 4 in decreasekey or in delete is the
number of cuts, ci. The potential changes because there
are ci new roots and ci fewer marked nodes. Also, the last
cut may introduce a new mark. Thus
Φi − Φi−1 = ri − ri−1 + 2mi − 2mi−1
≤ ci − 2ci + 2
= −ci + 2.
The amortized cost is therefore ai = ci + Φi − Φi−1 ≤
ci + (2 − ci) = 2. The first three steps of a decreasekey
operation take only a constant amount of actual time and
increase the potential by at most a constant amount. It
follows that the amortized cost of decreasekey, including
the cascading cuts in Step 4, is only a constant. Similarly,
the actual cost of a delete operation is at most a constant,
but Step 2 may increase the potential of the Fibonacci heap
by as much as D(n). The rest is bounded from above by
a constant, which implies that the amortized cost of the
delete operation is O(log n). We summarize the amortized
cost of the various operations supported by the Fibonacci
heap:
find the minimum O(1)
meld two heaps O(1)
insert a new item O(1)
delete the minimum O(log n)
decrease the key of a node O(1)
delete a node O(log n)
We will later see graph problems for which the difference
in the amortized cost of the decreasekey and delete op-
erations implies a significant improvement in the running
time.
12 Solving Recurrence Relations
Recurrence relations are perhaps the most important tool
in the analysis of algorithms. We have encountered sev-
eral methods that can sometimes be used to solve such
relations, such as guessing the solution and proving it by
induction, or developing the relation into a sum for which
we find a closed form expression. We now describe a new
method to solve recurrence relations and use it to settle
the remaining open question in the analysis of Fibonacci
heaps.
Annihilation of sequences. Suppose we are given an in-
finite sequence of numbers, A = ⟨a0, a1, a2, . . .⟩. We can
multiply with a constant, shift to the left and add another
sequence:
kA = ⟨ka0, ka1, ka2, . . .⟩,
LA = ⟨a1, a2, a3, . . .⟩,
A + B = ⟨a0 + b0, a1 + b1, a2 + b2, . . .⟩.
As an example, consider the sequence of powers of two,
ai = 2^i. Multiplying with 2 and shifting to the left give
the same result. Therefore,
LA − 2A = ⟨0, 0, 0, . . .⟩.
We write LA − 2A = (L − 2)A and think of L − 2 as an
operator that annihilates the sequence of powers of 2. In
general, L − k annihilates any sequence of the form ⟨c·k^i⟩.
What does L − k do to other sequences A = ⟨c·ℓ^i⟩, when
ℓ ≠ k?
(L − k)A = ⟨cℓ, cℓ^2, cℓ^3, . . .⟩ − ⟨ck, ckℓ, ckℓ^2, . . .⟩
         = (ℓ − k)⟨c, cℓ, cℓ^2, . . .⟩
         = (ℓ − k)A.
We see that the operator L − k annihilates only one type
of sequence and multiplies other similar sequences by a
constant.
Multiple operators. Instead of just one, we can ap-
ply several operators to a sequence. We may multiply
with two constants, k(ℓA) = (kℓ)A, multiply and shift,
L(kA) = k(LA), and shift twice, L(LA) = L^2 A. For
example, (L − k)(L − ℓ) annihilates all sequences of the
form ⟨c·k^i + d·ℓ^i⟩, where we assume k ≠ ℓ. Indeed, L − k
annihilates ⟨c·k^i⟩ and leaves behind ⟨(ℓ − k)d·ℓ^i⟩, which is
annihilated by L − ℓ. Furthermore, (L − k)(L − ℓ) anni-
hilates no other sequences. More generally, we have
FACT. (L − k1)(L − k2) · · · (L − kn) annihilates all se-
    quences of the form ⟨c1·k1^i + c2·k2^i + . . . + cn·kn^i⟩.
What if k = ℓ? To answer this question, we consider
(L − k)^2 ⟨i·k^i⟩ = (L − k)⟨(i + 1)k^{i+1} − i·k^{i+1}⟩
               = (L − k)⟨k^{i+1}⟩
               = ⟨0⟩.
More generally, we have
FACT. (L − k)^n annihilates all sequences of the form
    ⟨p(i)·k^i⟩, with p(i) a polynomial of degree n − 1.
Since operators annihilate only certain types of sequences,
we can determine the sequence if we know the annihilating
operator. The general method works in five steps:
1. Write down the annihilator for the recurrence.
2. Factor the annihilator.
3. Determine what sequence each factor annihilates.
4. Put the sequences together.
5. Solve for the constants of the solution by using initial
conditions.
Fibonacci numbers. We put the method to a test by con-
sidering the Fibonacci numbers defined recursively as fol-
lows:
F0 = 0,
F1 = 1,
Fj = Fj−1 + Fj−2, for j ≥ 2.
Writing a few of the initial numbers, we get the sequence
⟨0, 1, 1, 2, 3, 5, 8, . . .⟩. We notice that L^2 − L − 1 annihi-
lates the sequence because
(L^2 − L − 1)⟨Fj⟩ = L^2⟨Fj⟩ − L⟨Fj⟩ − ⟨Fj⟩
                 = ⟨Fj+2⟩ − ⟨Fj+1⟩ − ⟨Fj⟩
                 = ⟨0⟩.
If we factor the operator into its roots, we get
L^2 − L − 1 = (L − ϕ)(L − ϕ̄),
where
ϕ = (1 + √5)/2 = 1.618 . . . ,
ϕ̄ = 1 − ϕ = (1 − √5)/2 = −0.618 . . . .
The first root is known as the golden ratio because it repre-
sents the aspect ratio of a rectangular piece of paper from
which we may remove a square to leave a smaller rect-
angular piece of the same ratio: ϕ : 1 = 1 : ϕ − 1.
Thus we know that (L − ϕ)(L − ϕ̄) annihilates ⟨Fj⟩ and
this means that the j-th Fibonacci number is of the form
Fj = c·ϕ^j + c̄·ϕ̄^j. We get the constant factors from the
initial conditions:
F0 = 0 = c + c̄,
F1 = 1 = cϕ + c̄ϕ̄.
Solving the two linear equations in two unknowns, we get
c = 1/√5 and c̄ = −1/√5. This implies that
Fj = (1/√5)·((1 + √5)/2)^j − (1/√5)·((1 − √5)/2)^j.
From this viewpoint, it seems surprising that Fj turns out
to be an integer for all j. Note that |ϕ| > 1 and |ϕ̄| < 1.
It follows that for growing exponent j, ϕ^j goes to infinity
and ϕ̄^j goes to zero. This implies that Fj is approximately
ϕ^j/√5, and that this approximation becomes more and
more accurate as j grows.
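This is easy to check numerically; the small C program below (a sketch
of my own) compares the recurrence with ϕ^j/√5 rounded to the nearest
integer.

#include <math.h>
#include <stdio.h>

int main(void) {
    double phi = (1.0 + sqrt(5.0)) / 2.0;
    long F[41];
    F[0] = 0; F[1] = 1;
    for (int j = 2; j <= 40; j++)        /* the recurrence */
        F[j] = F[j - 1] + F[j - 2];
    for (int j = 0; j <= 40; j += 10) {  /* the closed form */
        long approx = (long) llround(pow(phi, j) / sqrt(5.0));
        printf("F_%d = %ld, round(phi^j / sqrt 5) = %ld\n",
               j, F[j], approx);
    }
    return 0;
}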
Maximum degree. Recall that D(n) is the maximum
possible degree of any one node in a Fibonacci heap of
size n. We need two easy facts about the kind of trees that
arise in Fibonacci heaps in order to show that D(n) is at
most logarithmic in n. Let ν be a node of degree j, and
let µ1, µ2, . . . , µj be its children ordered by the time they
were linked to ν.
DEGREE LEMMA. The degree of µi is at least i − 2.
PROOF. Recall that nodes are linked only during the
deletemin operation. Right before the linking happens, the
two nodes are roots and have the same degree. It follows
that the degree of µi was at least i − 1 at the time it was
linked to ν. The degree of µi might have been even higher
because it is possible that ν lost some of the older children
after µi had been linked. After being linked, µi may have
lost at most one of its children, for else it would have been
cut. Its degree is therefore at least i − 2, as claimed.
SIZE LEMMA. The number of descendents of ν (includ-
ing ν) is at least Fj+2.
PROOF. Let sj be the minimum number of descendents a
node of degree j can have. We have s0 = 1 and s1 = 2.
For larger j, we get sj from sj−1 by adding the size of a
minimum tree with root degree j−2, which is sj−2. Hence
sj = sj−1 + sj−2, which is the same recurrence relation
that defines the Fibonacci numbers. The initial values are
shifted two positions so we get sj = Fj+2, as claimed.
Consider a Fibonacci heap with n nodes and let ν be a
node with maximum degree D = D(n). The Size Lemma
implies n ≥ F_{D+2}. The Fibonacci number with index
D + 2 is roughly ϕ^{D+2}/√5. Because ϕ^{D+2} > √5, we
have
n ≥ (1/√5)·ϕ^{D+2} − 1.
After rearranging the terms and taking the logarithm to the
base ϕ, we get
D ≤ logϕ(√5 (n + 1)) − 2.
Recall that logϕ x = log2 x / log2 ϕ and use the calculator
to verify that log2 ϕ = 0.694 . . . > 0.5 and logϕ √5 =
1.672 . . . < 2. Hence
D ≤ log2(n + 1)/log2 ϕ + logϕ √5 − 2
  < 2 log2(n + 1).
Non-homogeneous terms. We now return to the anni-
hilation method for solving recurrence relations and con-
sider
aj = aj−1 + aj−2 + 1.
This is similar to the recurrence that defines Fibonacci
numbers and describes the minimum number of nodes in
an AVL tree, also known as height-balanced tree. It is de-
fined by the requirement that the height of the two sub-
trees of a node differ by at most 1. The smallest tree
of height j thus consists of the root, a subtree of height
j − 1 and another subtree of height j − 2. We refer to the
terms involving ai as the homogeneous terms of the re-
lation and the others as the non-homogeneous terms. We
know that L^2 − L − 1 annihilates the homogeneous part,
aj = aj−1 + aj−2. If we apply it to the entire relation we
get
(L^2 − L − 1)⟨aj⟩ = ⟨aj+2⟩ − ⟨aj+1⟩ − ⟨aj⟩
                 = ⟨1, 1, . . .⟩.
The remaining sequence of 1s is annihilated by L − 1.
In other words, (L − ϕ)(L − ϕ̄)(L − 1) annihilates ⟨aj⟩
implying that aj = c·ϕ^j + c̄·ϕ̄^j + c′·1^j. It remains to find
the constants, which we get from the boundary conditions
a0 = 1, a1 = 2 and a2 = 4:
c + c̄ + c′ = 1,
ϕc + ϕ̄c̄ + c′ = 2,
ϕ^2 c + ϕ̄^2 c̄ + c′ = 4.
Noting that ϕ^2 = ϕ + 1, ϕ̄^2 = ϕ̄ + 1, and ϕ − ϕ̄ = √5,
we get c = (5 + 2√5)/5, c̄ = (5 − 2√5)/5, and c′ = −1.
The minimum number of nodes of a height-j AVL tree is
therefore roughly the constant c times ϕ^j. Conversely, the
maximum height of an AVL tree with n = c·ϕ^j nodes is
roughly j = logϕ(n/c) = 1.440 . . . · log2 n + O(1). In
words, the height-balancing condition implies logarithmic
height.
Transformations. We extend the set of recurrences we
can solve by employing transformations that produce rela-
tions amenable to the annihilation method. We demon-
strate this by considering mergesort, which is another
divide-and-conquer algorithm that can be used to sort a
list of n items:
Step 1. Recursively sort the left half of the list.
Step 2. Recursively sort the right half of the list.
Step 3. Merge the two sorted lists by simultaneously
scanning both from beginning to end.
The running time is described by the solution to the recur-
rence
T (1) = 1,
T (n) = 2T (n/2) + n.
We have no way to work with terms like T (n/2) yet.
However, we can transform the recurrence into a more
manageable form. Defining n = 2i
and ti = T (2i
) we
get
t0 = 1,
ti = 2ti−1 + 2i
.
The homogeneous part is annihilated by L − 2. Similarly,
the non-homogeneous part is annihilated by L − 2. Hence,
(L − 2)^2 annihilates the entire relation and we get
ti = (c·i + c′)·2^i. Expressed in the original notation we thus
have T(n) = (c·log2 n + c′)·n = O(n log n). This result is of
course no surprise and reconfirms what we learned earlier
about sorting.
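To make Steps 1–3 concrete, here is a short, runnable C sketch of
mergesort; the function names and the scratch buffer are my choices
rather than anything prescribed by the notes.

#include <stdio.h>
#include <string.h>

/* Merge the sorted halves A[lo..mid-1] and A[mid..hi-1] using tmp. */
static void merge(int A[], int tmp[], int lo, int mid, int hi) {
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];
    while (i < mid) tmp[k++] = A[i++];
    while (j < hi)  tmp[k++] = A[j++];
    memcpy(A + lo, tmp + lo, (size_t)(hi - lo) * sizeof(int));
}

/* Sort A[lo..hi-1]: recurse on both halves, then merge. */
static void mergesort_rec(int A[], int tmp[], int lo, int hi) {
    if (hi - lo <= 1) return;
    int mid = lo + (hi - lo) / 2;
    mergesort_rec(A, tmp, lo, mid);   /* Step 1 */
    mergesort_rec(A, tmp, mid, hi);   /* Step 2 */
    merge(A, tmp, lo, mid, hi);       /* Step 3 */
}

int main(void) {
    int A[] = {9, 2, 7, 4, 4, 13, 1, 8};
    int tmp[8];
    mergesort_rec(A, tmp, 0, 8);
    for (int i = 0; i < 8; i++) printf("%d ", A[i]);
    printf("\n");
    return 0;
}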
The Master Theorem. It is sometimes more convenient
to look up the solution to a recurrence relation than play-
ing with different techniques to see whether any one can
make it to yield. Such a cookbook method for recurrence
relations of the form
T (n) = aT (n/b) + f(n)
is provided by the following theorem. Here we assume
that a ≥ 1 and b > 1 are constants and that f is a well-
behaved positive function.
MASTER THEOREM. Define c = logb a and let ε be an
    arbitrarily small positive constant. Then
    T(n) = O(n^c)        if f(n) = O(n^{c−ε}),
    T(n) = O(n^c log n)  if f(n) = O(n^c),
    T(n) = O(f(n))       if f(n) = Ω(n^{c+ε}).
The last of the three cases also requires a usually satis-
fied technical condition, namely that af(n/b) ≤ δf(n)
for some constant δ strictly less than 1. For example, this
condition is satisfied in T(n) = 2T(n/2) + n^2, which im-
plies T(n) = O(n^2).
As another example consider the relation T (n) =
2T (n/2) + n that describes the running time of merge-
sort. We have c = log2 2 = 1 and f(n) = n = O(n^c).
The middle case of the Master Theorem applies and we
get T (n) = O(n log n), as before.
Third Homework Assignment
Write the solution to each problem on a single page. The
deadline for handing in solutions is October 14.
Problem 1. (20 = 10 + 10 points). Consider a lazy ver-
sion of heapsort in which each item in the heap is
either smaller than or equal to every other item in its
subtree, or the item is identified as uncertified. To
certify an item, we certify its children and then ex-
change it with the smaller child provided it is smaller
than the item itself. Suppose A[1..n] is a lazy heap
with all items uncertified.
(a) How much time does it take to certify A[1]?
(b) Does certifying A[1] turn A into a proper heap
in which every item satisfies the heap property?
(Justify your answer.)
Problem 2. (20 points). Recall that Fibonacci numbers
are defined recursively as F0 = 0, F1 = 1, and Fn =
Fn−1 +Fn−2. Prove the square of the n-th Fibonacci
number differs from the product of the two adjacent
numbers by one: Fn^2 = Fn−1 · Fn+1 + (−1)^{n+1}.
Problem 3. (20 points). Professor Pinocchio claims that
the height of an n-node Fibonacci heap is at most
some constant times log2 n. Show that the Profes-
sor is mistaken by exhibiting, for any integer n, a
sequence of operations that create a Fibonacci heap
consisting of just one tree that is a linear chain of n
nodes.
Problem 4. (20 = 10 + 10 points). To search in a sorted
array takes time logarithmic in the size of the array,
but to insert a new item takes linear time. We can
improve the running time for insertions by storing the
items in several sorted arrays instead of just one. Let
n be the number of items, let k = ⌈log2(n + 1)⌉,
and write n = nk−1nk−2 . . . n0 in binary notation.
We use k sorted arrays Ai (some possibly empty),
where Ai stores ni·2^i items. Each item is stored ex-
actly once, and the total size of the arrays is indeed
Σ_i ni·2^i = n. Although each individual array is
sorted, there is no particular relationship between the
items in different arrays.
(a) Explain how to search in this data structure and
analyze your algorithm.
(b) Explain how to insert a new item into the data
structure and analyze your algorithm, both in
worst-case and in amortized time.
Problem 5. (20 = 10 + 10 points). Consider a full bi-
nary tree with n leaves. The size of a node, s(ν), is
the number of leaves in its subtree and the rank is
the floor of the binary logarithm of the size, r(ν) =
⌊log2 s(ν)⌋.
(a) Is it true that every internal node ν has a child
whose rank is strictly less than the rank of ν?
(b) Prove that there exists a leaf whose depth
(length of path to the root) is at most log2 n.
IV GRAPH ALGORITHMS
13 Graph Search
14 Shortest Paths
15 Minimum Spanning Trees
16 Union-find
Fourth Homework Assignment
13 Graph Search
We can think of graphs as generalizations of trees: they
consist of nodes and edges connecting nodes. The main
difference is that graphs do not in general represent hier-
archical organizations.
Types of graphs. Different applications require differ-
ent types of graphs. The most basic type is the simple
undirected graph that consists of a set V of vertices and a
set E of edges. Each edge is an unordered pair (a set) of
two vertices. We always assume V is finite, and we write
Figure 50: A simple undirected graph with vertices 0, 1, 2, 3, 4
and edges {0, 1}, {1, 2}, {2, 3}, {3, 0}, {3, 4}.

(V choose 2) for the collection of all unordered pairs. Hence E is a
subset of (V choose 2). Note that because E is a set, each edge can
occur only once. Similarly, because each edge is a set (of
two vertices), it cannot connect to the same vertex twice.
Vertices u and v are adjacent if {u, v} ∈ E. In this case u
and v are called neighbors. Other types of graphs are
directed: E ⊆ V × V .
weighted: has a weighting function w : E → R.
labeled: has a labeling function ℓ : V → Z.
non-simple: there are loops and multi-edges.
A loop is like an edge, except that it connects to the same
vertex twice. A multi-edge consists of two or more edges
connecting the same two vertices.
Representation. The two most popular data structures
for graphs are direct representations of adjacency. Let
V = {0, 1, . . ., n − 1} be the set of vertices. The ad-
jacency matrix is the n-by-n matrix A = (aij) with
aij = 1 if {i, j} ∈ E, and aij = 0 if {i, j} ∉ E.
For undirected graphs, we have aij = aji, so A is sym-
metric. For weighted graphs, we encode more informa-
tion than just the existence of an edge and define aij as
the weight of the edge connecting i and j. The adjacency
matrix of the graph in Figure 50 is
A = ( 0 1 0 1 0
      1 0 1 0 0
      0 1 0 1 0
      1 0 1 0 1
      0 0 0 1 0 ),
which is symmetric. Irrespective of the number of edges,
Figure 51: The adjacency list representation of the graph in Fig-
ure 50. Each edge is represented twice, once for each endpoint.
the adjacency matrix has n^2 elements and thus requires a
quadratic amount of space. Often, the number of edges
is quite small, maybe not much larger than the number of
vertices. In these cases, the adjacency matrix wastes mem-
ory, and a better choice is a sparse matrix representation
referred to as adjacency lists, which is illustrated in Fig-
ure 51. It consists of a linear array V for the vertices and
a list of neighbors for each vertex. For most algorithms,
we assume that vertices and edges are stored in structures
containing a small number of fields:
struct Vertex {int d, f, π; Edge *adj};
struct Edge {int v; Edge *next}.
The d, f, π fields will be used to store auxiliary informa-
tion used or created by the algorithms.
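For concreteness, here is a minimal Python sketch of the two representations for the graph of Figure 50. It uses a list of neighbor lists in place of the linked Edge records above, which is an implementation convenience rather than the structure prescribed by the notes.

# Adjacency matrix and adjacency lists for the graph of Figure 50.
# Vertices are 0..4; the linked Edge records of the notes are
# replaced by Python lists.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (3, 4)]
n = 5
A = [[0] * n for _ in range(n)]         # n-by-n adjacency matrix
adj = [[] for _ in range(n)]            # one neighbor list per vertex
for u, v in edges:
    A[u][v] = A[v][u] = 1               # undirected, hence symmetric
    adj[u].append(v)                    # each edge stored twice,
    adj[v].append(u)                    # once for each endpoint
print(A[3])      # [1, 0, 1, 0, 1]
print(adj[3])    # [2, 0, 4]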
Depth-first search. Since graphs are generally not or-
dered, there are many sequences in which the vertices can
be visited. In fact, it is not entirely straightforward to make
sure that each vertex is visited once and only once. A use-
ful method is depth-first search. It uses a global variable,
time, which is incremented and used to leave time-stamps
behind to avoid repeated visits.
46
void VISIT(int i)
1 time++; V [i].d = time;
forall outgoing edges ij do
2 if V [j].d = 0 then
3 V [j].π = i; VISIT(j)
endif
endfor;
4 time++; V [i].f = time.
The test in line 2 checks whether the neighbor j of i has
already been visited. The assignment in line 3 records that
the vertex is visited from vertex i. A vertex is first stamped
in line 1 with the time at which it is encountered. A vertex
is second stamped in line 4 with the time at which its visit
has been completed. To prepare the search, we initialize
the global time variable to 0, label all vertices as not yet
visited, and call VISIT for all yet unvisited vertices.
time = 0;
forall vertices i do V [i].d = 0 endfor;
forall vertices i do
if V [i].d = 0 then V [i].π = 0; VISIT(i) endif
endfor.
Let n be the number of vertices and m the number of edges
in the graph. Depth-first search visits every vertex once
and examines every edge twice, once for each endpoint.
The running time is therefore O(n + m), which is propor-
tional to the size of the graph and therefore optimal.
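As a hedged illustration, the following Python sketch mirrors the VISIT routine and its initialization: 0 again marks vertices that are not yet visited, and the arrays d, f, pi play the roles of the fields V[i].d, V[i].f, V[i].π.

# Depth-first search with discovery (d) and finish (f) time-stamps.
# adj[i] lists the neighbors of vertex i; vertices are 0..n-1.
def dfs(adj):
    n = len(adj)
    d = [0] * n                    # 0 means not yet visited
    f = [0] * n
    pi = [-1] * n
    time = 0

    def visit(i):
        nonlocal time
        time += 1; d[i] = time     # first stamp: i is encountered
        for j in adj[i]:
            if d[j] == 0:          # neighbor j not yet visited
                pi[j] = i          # j is visited from i
                visit(j)
        time += 1; f[i] = time     # second stamp: visit of i is complete

    for i in range(n):
        if d[i] == 0:
            visit(i)
    return d, f, pi

For very large graphs the recursion can be replaced by an explicit stack, but the O(n + m) bound is the same.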
DFS forest. Figure 52 illustrates depth-first search by
showing the time-stamps d and f and the pointers π in-
dicating the predecessors in the traversal. We call an edge
{i, j} ∈ E a tree edge if i = V [j].π or j = V [i].π and a
back edge, otherwise. The tree edges form the DFS forest
Figure 52: The traversal starts at the vertex with time-stamp 1.
Each node is stamped twice, once when it is first encountered
and another time when its visit is complete.
of the graph. The forest is a tree if the graph is connected
and a collection of two or more trees if it is not connected.
Figure 53 shows the DFS forest of the graph in Figure 52
which, in this case, consists of a single tree. The time-
Figure 53: Tree edges are solid and back edges are dotted.
stamps d are consistent with the preorder traversal of the
DFS forest. The time-stamps f are consistent with the
postorder traversal. The two stamps can be used to decide,
in constant time, whether two nodes in the forest live in
different subtrees or one is a descendent of the other.
NESTING LEMMA. Vertex j is a proper descendent of
vertex i in the DFS forest iff V [i].d < V [j].d as well
as V [j].f < V [i].f.
Similarly, if you have a tree and the preorder and postorder
numbers of the nodes, you can determine the relation be-
tween any two nodes in constant time.
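As a small illustration of the Nesting Lemma, the following hypothetical helper answers the descendent question from the two stamps computed by the sketch above.

# Constant-time test: j is a proper descendent of i in the DFS forest
# iff i is discovered before j and finished after j.
def is_proper_descendent(d, f, i, j):
    return d[i] < d[j] and f[j] < f[i]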
Directed graphs and relations. As mentioned earlier,
we have a directed graph if all edges are directed. A
directed graph is a way to think and talk about a mathe-
matical relation. A typical problem where relations arise
is scheduling. Some tasks are in a definite order while
others are unrelated. An example is the scheduling of
undergraduate computer science courses, as illustrated in
Figure 54. Abstractly, a relation is a pair (V, E), where
Figure 54: A subgraph of the CPS course offering. The courses
CPS104 and CPS108 are incomparable, CPS104 is a predecessor
of CPS110, and so on.
V = {0, 1, . . ., n − 1} is a finite set of elements and
E ⊆ V × V is a finite set of ordered pairs. Instead of
47
(i, j) ∈ E we write i ≺ j and instead of (V, E) we write
(V, ≺). If i ≺ j then i is a predecessor of j and j is a suc-
cessor of i. The terms relation, directed graph, digraph,
and network are all synonymous.
Directed acyclic graphs. A cycle in a relation is a se-
quence i0 ≺ i1 ≺ . . . ≺ ik ≺ i0. Even i0 ≺ i0
is a cycle. A linear extension of (V, ≺) is an ordering
j0, j1, . . . , jn−1 of the elements that is consistent with the
relation. Formally this means that jk ≺ jℓ implies k < ℓ.
A directed graph without a cycle is a directed acyclic graph.
EXTENSION LEMMA. (V, ≺) has a linear extension iff it
contains no cycle.
PROOF. “=⇒” is obvious. We prove “⇐=” by induction.
A vertex s ∈ V is called a source if it has no predecessor.
Assuming (V, ≺) has no cycle, we can prove that V has
a source by following edges against their direction. If we
return to a vertex that has already been visited, we have
a cycle and thus a contradiction. Otherwise we get stuck
at a vertex s, which can only happen because s has no
predecessor, which means s is a source.
Let U = V −{s} and note that (U, ≺) is a relation that is
smaller than (V, ≺). Hence (U, ≺) has a linear extension
by induction hypothesis. Call this extension X and note
that s, X is a linear extension of (V, ≺).
Topological sorting with queue. The problem of con-
structing a linear extension is called topological sorting.
A natural and fast algorithm follows the idea of the proof:
find a source s, print s, remove s, and repeat. To expedite
the first step of finding a source, each vertex maintains
its number of predecessors and a queue stores all sources.
First, we initialize this information.
forall vertices j do V [j].d = 0 endfor;
forall vertices i do
forall successors j of i do V [j].d++ endfor
endfor;
forall vertices j do
if V [j].d = 0 then ENQUEUE(j) endif
endfor.
Next, we compute the linear extension by repeated dele-
tion of a source.
while queue is non-empty do
s = DEQUEUE;
forall successors j of s do
V [j].d--;
if V [j].d = 0 then ENQUEUE(j) endif
endfor
endwhile.
The running time is linear in the number of vertices and
edges, namely O(n+m). What happens if there is a cycle
in the digraph? We illustrate the above algorithm for the
directed acyclic graph in Figure 55. The sequence of ver-
Figure 55: The numbers next to each vertex count the predeces-
sors, which decrease during the algorithm.
tices added to the queue is also the linear extension com-
puted by the algorithm. If the process starts at vertex a
and if the successors of a vertex are ordered by name then
we get a, f, d, g, c, h, b, e, which we can check is indeed a
linear extension of the relation.
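A direct Python transcription of the two phases above may look as follows; it uses collections.deque for the queue. If the digraph has a cycle, the queue runs empty before all vertices are output, which is one way to answer the question raised above.

from collections import deque

# Topological sorting with a queue. succ[i] lists the successors of
# vertex i. Returns None if a cycle prevents a linear extension.
def topological_sort(succ):
    n = len(succ)
    pred_count = [0] * n
    for i in range(n):
        for j in succ[i]:
            pred_count[j] += 1                 # count predecessors
    queue = deque(j for j in range(n) if pred_count[j] == 0)
    order = []
    while queue:
        s = queue.popleft()                    # s is a source
        order.append(s)
        for j in succ[s]:
            pred_count[j] -= 1
            if pred_count[j] == 0:
                queue.append(j)
    return order if len(order) == n else None  # None signals a cycle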
Topological sorting with DFS. Another algorithm that
can be used for topological sorting is depth-first search.
We output a vertex when its visit has been completed, that
is, when all its successors and their successors and so on
have already been printed. The linear extension is there-
fore generated from back to front. Figure 56 shows the
Figure 56: The numbers next to each vertex are the two time
stamps applied by the depth-first search algorithm. The first
number gives the time the vertex is encountered, and the second
when the visit has been completed.
same digraph as Figure 55 and labels vertices with time
48
stamps. Consider the sequence of vertices in the order of
decreasing second time stamp:
a(16), f(14), g(13), h(12), d(9), c(8), e(7), b(5).
Although this sequence is different from the one computed
by the earlier algorithm, it is also a linear extension of the
relation.
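For comparison, a minimal sketch of the depth-first variant: a vertex is appended when its visit is complete, so the list is the linear extension from back to front and is reversed at the end. It assumes the digraph is acyclic.

# Topological sorting with depth-first search.
def topological_sort_dfs(succ):
    n = len(succ)
    visited = [False] * n
    order = []

    def visit(i):
        visited[i] = True
        for j in succ[i]:
            if not visited[j]:
                visit(j)
        order.append(i)        # visit of i is complete

    for i in range(n):
        if not visited[i]:
            visit(i)
    order.reverse()            # generated from back to front
    return order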
49
14 Shortest Paths
One of the most common operations in graphs is finding
shortest paths between vertices. This section discusses
three algorithms for this problem: breadth-first search
for unweighted graphs, Dijkstra’s algorithm for weighted
graphs, and the Floyd-Warshall algorithm for computing
distances between all pairs of vertices.
Breadth-first search. We call a graph connected if there
is a path between every pair of vertices. A (connected)
component is a maximal connected subgraph. Breadth-
first search, or BFS, is a way to search a graph. It is sim-
ilar to depth-first search, but while DFS goes as deep as
quickly as possible, BFS is more cautious and explores a
broad neighborhood before venturing deeper. The starting
point is a vertex s. An example is shown in Figure 57. As
Figure 57: A sample graph with eight vertices and ten edges
labeled by breadth-first search. The label increases from a vertex
to its successors in the search.
before, we call an edge a tree edge if it is traversed by the
algorithm. The tree edges define the BFS tree, which we
can use to redraw the graph in a hierarchical manner, as in
Figure 58. In the case of an undirected graph, no non-tree
edge can connect a vertex to an ancestor in the BFS tree.
Why? We use a queue to turn the idea into an algorithm.
Figure 58: The tree edges in the redrawing of the graph in Figure
57 are solid, and the non-tree edges are dotted.
First, the graph and the queue are initialized.
forall vertices i do V [i].d = −1 endfor;
V [s].d = 0;
MAKEQUEUE; ENQUEUE(s); SEARCH.
A vertex is processed by adding its unvisited neighbors to
the queue. They will be processed in turn.
void SEARCH
while queue is non-empty do
i = DEQUEUE;
forall neighbors j of i do
if V [j].d = −1 then
V [j].d = V [i].d + 1; V [j].π = i;
ENQUEUE(j)
endif
endfor
endwhile.
The label V [i].d assigned to vertex i during the traversal is
the minimum number of edges of any path from s to i. In
other words, V [i].d is the length of the shortest path from
s to i. The running time of BFS for a graph with n vertices
and m edges is O(n + m).
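A compact Python version of the initialization and the SEARCH routine; as above, −1 marks unvisited vertices and dist[i] plays the role of V[i].d.

from collections import deque

# Breadth-first search from source s. adj[i] lists the neighbors of i.
# Returns the edge-count distances and the predecessors in the BFS tree.
def bfs(adj, s):
    n = len(adj)
    dist = [-1] * n
    pi = [-1] * n
    dist[s] = 0
    queue = deque([s])
    while queue:
        i = queue.popleft()
        for j in adj[i]:
            if dist[j] == -1:              # j not yet visited
                dist[j] = dist[i] + 1
                pi[j] = i
                queue.append(j)
    return dist, pi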
Single-source shortest path. BFS can be used to find
shortest paths in unweighted graphs. We now extend the
algorithm to weighted graphs. Assume V and E are the
sets of vertices and edges of a simple, undirected graph
with a positive weighting function w : E → R+. The
length or weight of a path is the sum of the weights of
its edges. The distance between two vertices is the length
of the shortest path connecting them. For a given source
s ∈ V , we study the problem of finding the distances and
shortest paths to all other vertices. Figure 59 illustrates the
problem by showing the shortest paths to the source s. In
Figure 59: The bold edges form shortest paths and together the
shortest path tree with root s. It differs by one edge from the
breadth-first tree shown in Figure 57.
the non-degenerate case, in which no two paths have the
same length, the union of all shortest paths to s is a tree,
referred to as the shortest path tree. In the degenerate case,
we can break ties such that the union of paths is a tree.
As before, we grow a tree starting from s. Instead of a
queue, we use a priority queue to determine the next vertex
to be added to the tree. It stores all vertices not yet in the
50
tree and uses V [i].d for the priority of vertex i. First, we
initialize the graph and the priority queue.
V [s].d = 0; V [s].π = −1; INSERT(s);
forall vertices i ≠ s do
V [i].d = ∞; INSERT(i)
endfor.
After initialization the priority queue stores s with priority
0 and all other vertices with priority ∞.
Dijkstra’s algorithm. We mark vertices in the tree to
distinguish them from vertices that are not yet in the tree.
The priority queue stores all unmarked vertices i with pri-
ority equal to the length of the shortest path that goes from
i in one edge to a marked vertex and then to s using only
marked vertices.
while priority queue is non-empty do
i = EXTRACTMIN; mark i;
forall neighbors j of i do
if j is unmarked then
V [j].d = min{w(ij) + V [i].d, V [j].d}
endif
endfor
endwhile.
Table 3 illustrates the algorithm by showing the informa-
tion in the priority queue after each iteration of the while-
loop operating on the graph in Figure 59. The mark-
s 0
a ∞ 5 5
b ∞ 10 10 9 9
c ∞ 4
d ∞ 5 5 5
e ∞ ∞ ∞ 10 10 10
f ∞ ∞ ∞ 15 15 15 15
g ∞ ∞ ∞ ∞ 15 15 15 15
Table 3: Each column shows the contents of the priority queue.
Time progresses from left to right.
ing mechanism is not necessary but clarifies the process.
The algorithm performs n EXTRACTMIN operations and
at most m DECREASEKEY operations. We compare the
running time under three different data structures used to
represent the priority queue. The first is a linear array, as
originally proposed by Dijkstra, the second is a heap, and
the third is a Fibonacci heap. The results are shown in
Table 4. We get the best result with Fibonacci heaps for
which the total running time is O(n log n + m).
                 array     heap        F-heap
EXTRACTMINs      n^2       n log n     n log n
DECREASEKEYs     m         m log m     m
Table 4: Running time of Dijkstra’s algorithm for three different
implementations of the priority queue holding the yet unmarked
vertices.
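The sketch below follows the marking formulation above but, instead of a Fibonacci heap, uses Python's heapq with lazy deletions (stale entries are skipped when popped), so it corresponds to the ordinary heap column of Table 4 rather than the best bound.

import heapq

# Dijkstra's algorithm for a weighted graph. adj[i] is a list of
# (j, w) pairs with w = w(ij) > 0. Lazy deletions replace DECREASEKEY.
def dijkstra(adj, s):
    n = len(adj)
    dist = [float('inf')] * n
    dist[s] = 0
    marked = [False] * n
    heap = [(0, s)]
    while heap:
        d, i = heapq.heappop(heap)
        if marked[i]:
            continue                   # stale entry, i was extracted before
        marked[i] = True               # i = EXTRACTMIN; mark i
        for j, w in adj[i]:
            if not marked[j] and d + w < dist[j]:
                dist[j] = d + w        # V[j].d = min{w(ij) + V[i].d, V[j].d}
                heapq.heappush(heap, (dist[j], j))
    return dist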
Correctness. It is not entirely obvious that Dijkstra’s al-
gorithm indeed finds the shortest paths to s. To show that
it does, we inductively prove that it maintains the follow-
ing two invariants.
(A) For every unmarked vertex j, V [j].d is the length of
the shortest path from j to s that uses only marked
vertices other than j.
(B) For every marked vertex i, V [i].d is the length of the
shortest path from i to s.
PROOF. Invariant (A) is true at the beginning of Dijkstra’s
algorithm. To show that it is maintained throughout the
process, we need to make sure that shortest paths are com-
puted correctly. Specifically, if we assume Invariant (B)
for vertex i then the algorithm correctly updates the prior-
ities V [j].d of all neighbors j of i, and no other priorities
change.
Figure 60: The vertex y is the last unmarked vertex on the hypo-
thetically shortest, dashed path that connects i to s.
At the moment vertex i is marked, it minimizes V [j].d
over all unmarked vertices j. Suppose that, at this mo-
ment, V [i].d is not the length of the shortest path from i to
s. Because of Invariant (A), there is at least one other un-
marked vertex on the shortest path. Let the last such vertex
be y, as shown in Figure 60. But then V [y].d < V [i].d,
which is a contradiction to the choice of i.
We used (B) to prove (A) and (A) to prove (B). To make
sure we did not create a circular argument, we parametrize
the two invariants with the number k of vertices that are
51
marked and thus belong to the currently constructed por-
tion of the shortest path tree. To prove (Ak) we need (Bk)
and to prove (Bk) we need (Ak−1). Think of the two in-
variants as two recursive functions, and for each pair of
calls, the parameter decreases by one and thus eventually
becomes zero, which is when the argument arrives at the
base case.
All-pairs shortest paths. We can run Dijkstra’s algo-
rithm n times, once for each vertex as the source, and thus
get the distance between every pair of vertices. The run-
ning time is O(n^2 log n + nm) which, for dense graphs, is
the same as O(n^3). Cubic running time can be achieved
with a much simpler algorithm using the adjacency matrix
to store distances. The idea is to iterate n times, and after
the k-th iteration, the computed distance between vertices
i and j is the length of the shortest path from i to j that,
other than i and j, contains only vertices of index k or less.
for k = 1 to n do
for i = 1 to n do
for j = 1 to n do
A[i, j] = min{A[i, j], A[i, k] + A[k, j]}
endfor
endfor
endfor.
The only information needed to update A[i, j] during the
k-th iteration of the outer for-loop are its old value and
values in the k-th row and the k-th column of the prior
adjacency matrix. This row remains unchanged in this it-
eration and so does this column. We therefore do not have
to use two arrays, writing the new values right into the old
matrix. We illustrate the algorithm by showing the adja-
cency, or distance matrix before the algorithm in Figure
61 and after one iteration in Figure 62.
Figure 61: Adjacency, or distance matrix of the graph in Figure
57. All blank entries store ∞.
Figure 62: Matrix after each iteration. The k-th row and column
are shaded and the new, improved distances are highlighted.
The algorithm works for weighted undirected as well
as for weighted directed graphs. Its correctness is easily
verified inductively. The running time is O(n^3).
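The triple loop translates directly into Python; this sketch updates the distance matrix in place, with float('inf') standing in for the blank (infinite) entries of Figure 61.

# Floyd-Warshall: after iteration k, A[i][j] is the length of the
# shortest path from i to j whose intermediate vertices have index
# at most k. Missing edges are represented by float('inf').
def all_pairs_shortest_paths(A):
    n = len(A)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if A[i][k] + A[k][j] < A[i][j]:
                    A[i][j] = A[i][k] + A[k][j]
    return A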
52
15 Minimum Spanning Trees
When a graph is connected, we may ask how many edges
we can delete before it stops being connected. Depending
on the edges we remove, this may happen sooner or later.
The slowest strategy is to remove edges until the graph
becomes a tree. Here we study the somewhat more dif-
ficult problem of removing edges with a maximum total
weight. The remaining graph is then a tree with minimum
total weight. Applications that motivate this question can
be found in life support systems modeled as graphs or net-
works, such as telephone, power supply, and sewer sys-
tems.
Free trees. An undirected graph (U, T ) is a free tree if
it is connected and contains no cycle. We could impose a
hierarchy by declaring any one vertex as the root and thus
obtain a rooted tree. Here, we have no use for a hierarchi-
cal organization and exclusively deal with free trees. The
Figure 63: Adding the edge dg to the tree creates a single cycle
with vertices d, g, h, f, e, a.
number of edges of a free tree is always one less than the
number of vertices. Whenever we add a new edge (con-
necting two old vertices) we create exactly one cycle. This
cycle can be destroyed by deleting any one of its edges,
and we get a new free tree, as in Figure 63. Let (V, E) be
a connected and undirected graph. A subgraph is another
graph (U, T ) with U ⊆ V and T ⊆ E. It is a spanning
tree if it is a free tree with U = V .
Minimum spanning trees. For the remainder of this
section, we assume that we also have a weighting func-
tion, w : E → R. The weight of a subgraph is then the
total weight of its edges, w(T) = ∑_{e∈T} w(e). A mini-
mum spanning tree, or MST of G is a spanning tree that
minimizes the weight. The definitions are illustrated in
Figure 64 which shows a graph of solid edges with a min-
imum spanning tree of bold edges. A generic algorithm
for constructing an MST grows a tree by adding more and
Figure 64: The bold edges form a spanning tree of weight 0.9 +
1.2 + 1.3 + 1.4 + 1.1 + 1.2 + 1.6 + 1.9 = 10.6.
more edges. Let A ⊆ E be a subset of some MST of a
connected graph (V, E). An edge uv ∈ E − A is safe for
A if A ∪ {uv} is also subset of some MST. The generic
algorithm adds safe edges until it arrives at an MST.
A = ∅;
while (V, A) is not a spanning tree do
find a safe edge uv; A = A ∪ {uv}
endwhile.
As long as A is a proper subset of an MST there are safe
edges. Specifically, if (V, T ) is an MST and A ⊆ T then
all edges in T − A are safe for A. The algorithm will
therefore succeed in constructing an MST. The only thing
that is not yet clear is how to find safe edges quickly.
Cuts. To develop a mechanism for identifying safe
edges, we define a cut, which is a partition of the vertex
set into two complementary sets, V = W ∪ (V − W). It is
crossed by an edge uv ∈ E if u ∈ W and v ∈ V −W, and
it respects an edge set A if A contains no crossing edge.
The definitions are illustrated in Figure 65.
Figure 65: The vertices inside and outside the shaded regions
form a cut that respects the collection of solid edges. The dotted
edges cross the cut.
53
CUT LEMMA. Let A be a subset of an MST and consider a
cut W ∪ (V − W) that respects A. If uv is a crossing
edge with minimum weight then uv is safe for A.
PROOF. Consider a minimum spanning tree (V, T ) with
A ⊆ T. If uv ∈ T then we are done. Otherwise, let
T′ = T ∪ {uv}. Because T is a tree, there is a unique
path from u to v in T . We have u ∈ W and v ∈ V − W,
so the path switches at least once between the two sets.
Suppose it switches along xy, as in Figure 66. Edge xy
Figure 66: Adding uv creates a cycle and deleting xy destroys
the cycle.
crosses the cut, and since A contains no crossing edges we
have xy ∉ A. Because uv has minimum weight among
crossing edges we have w(uv) ≤ w(xy). Define T′′ =
T′ − {xy}. Then (V, T′′) is a spanning tree and because
w(T′′) = w(T) − w(xy) + w(uv) ≤ w(T)
it is a minimum spanning tree. The claim follows because
A ∪ {uv} ⊆ T′′.
A typical application of the Cut Lemma takes a compo-
nent of (V, A) and defines W as the set of vertices of that
component. The complementary set V − W contains all
other vertices, and crossing edges connect the component
with its complement.
Prim’s algorithm. Prim’s algorithm chooses safe edges
to grow the tree as a single component from an arbitrary
first vertex s. Similar to Dijkstra’s algorithm, the vertices
that do not yet belong to the tree are stored in a priority
queue. For each vertex i outside the tree, we define its
priority V [i].d equal to the minimum weight of any edge
that connects i to a vertex in the tree. If there is no such
edge then V [i].d = ∞. In addition to the priority, we store
the index of the other endpoint of the minimum weight
edge. We first initialize this information.
V [s].d = 0; V [s].π = −1; INSERT(s);
forall vertices i ≠ s do
V [i].d = ∞; INSERT(i)
endfor.
The main algorithm expands the tree by one edge at a time.
It uses marks to distinguish vertices in the tree from ver-
tices outside the tree.
while priority queue is non-empty do
i = EXTRACTMIN; mark i;
forall neighbors j of i do
if j is unmarked and w(ij) < V [j].d then
V [j].d = w(ij); V [j].π = i
endif
endfor
endwhile.
After running the algorithm, the MST can be recovered
from the π-fields of the vertices. The algorithm together
with its initialization phase performs n = |V | insertions
into the priority queue, n extractmin operations, and at
most m = |E| decreasekey operations. Using the Fi-
bonacci heap implementation, we get a running time of
O(n log n + m), which is the same as for constructing the
shortest-path tree with Dijkstra’s algorithm.
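A Python sketch of Prim's algorithm in the same style as the Dijkstra sketch earlier, again with heapq and lazy deletions in place of a Fibonacci heap. The MST is read off from the returned π entries.

import heapq

# Prim's algorithm from vertex s. adj[i] is a list of (j, w) pairs.
# Returns pi; the MST consists of the edges {i, pi[i]} for i != s.
def prim(adj, s=0):
    n = len(adj)
    d = [float('inf')] * n
    pi = [-1] * n
    d[s] = 0
    marked = [False] * n
    heap = [(0, s)]
    while heap:
        _, i = heapq.heappop(heap)
        if marked[i]:
            continue                    # stale entry
        marked[i] = True
        for j, w in adj[i]:
            if not marked[j] and w < d[j]:
                d[j] = w                # cheapest known edge from j into the tree
                pi[j] = i
                heapq.heappush(heap, (w, j))
    return pi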
Kruskal’s algorithm. Kruskal’s algorithm is another
implementation of the generic algorithm. It adds edges in
a sequence of non-decreasing weight. At any moment, the
chosen edges form a collection of trees. These trees merge
to form larger and fewer trees, until they eventually com-
bine into a single tree. The algorithm uses a priority queue
for the edges and a set system for the vertices. In this
context, the term ‘system’ is just another word for ‘set’,
but we will use it exclusively for sets whose elements are
themselves sets. Implementations of the set system will
be discussed in the next lecture. Initially, A = ∅, the pri-
ority queue contains all edges, and the system contains a
singleton set for each vertex, C = {{u} | u ∈ V }. The
algorithm finds an edge with minimum weight that con-
nects two components defined by A. We set W equal to
the vertex set of one component and use the Cut Lemma
to show that this edge is safe for A. The edge is added to
A and the process is repeated. The algorithm halts when
only one tree is left, which is the case when A contains
n − 1 = |V | − 1 edges.
A = ∅;
while |A| < n − 1 do
uv = EXTRACTMIN;
find P, Q ∈ C with u ∈ P and v ∈ Q;
if P ≠ Q then
A = A ∪ {uv}; merge P and Q
endif
endwhile.
54
The running time is O(m log m) for the priority queue op-
erations plus some time for maintaining C. There are two
operations for the set system, namely finding the set that
contains a given element, and merging two sets into one.
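A hedged sketch of Kruskal's algorithm: it sorts the edges instead of keeping them in a priority queue and uses a small weighted union-find as the set system C, anticipating the implementation discussed in the next lecture.

# Kruskal's algorithm. edges is a list of (w, u, v) triples, n the
# number of vertices. Returns the list A of MST edges.
def kruskal(edges, n):
    parent = list(range(n))
    size = [1] * n

    def find(i):                       # root of the up-tree containing i
        while parent[i] != i:
            i = parent[i]
        return i

    A = []
    for w, u, v in sorted(edges):      # non-decreasing weight
        p, q = find(u), find(v)
        if p != q:                     # u and v lie in different components
            A.append((u, v))
            if size[p] < size[q]:
                p, q = q, p
            parent[q] = p              # weighted union
            size[p] += size[q]
        if len(A) == n - 1:
            break
    return A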
An example. We illustrate Kruskal’s algorithm by ap-
plying it to the weighted graph in Figure 64. The sequence
of edges sorted by weight is cd, fi, fh, ad, ae, hi, de, ef,
ac, gh, dg, bf, eg, bi, ab. The evolution of the set system
Figure 67: Eight union operations merge the nine singleton sets
into one set.
is illustrated in Figure 67, and the MST computed with
Kruskal’s algorithm and indicated with dotted edges is the
same as in Figure 64. The edges cd, fi, fh, ad, ae are all
added to the tree. The next two edges, hi and de, are not
added because they each have both endpoints in the same
component, and adding either edge would create a cycle.
Edge ef is added to the tree giving rise to a set in the sys-
tem that contains all vertices other than g and b. Edge ac
is not added, gh is added, dg is not, and finally bf is added
to the tree. At this moment the system consists of a single
set that contains all vertices of the graph.
As suggested by Figure 67, the evolution of the con-
struction can be interpreted as a hierarchical clustering of
the vertices. The specific method that corresponds to the
evolution created by Kruskal’s algorithm is referred to as
single-linkage clustering.
55
16 Union-Find
In this lecture, we present two data structures for the dis-
joint set system problem we encountered in the implemen-
tation of Kruskal’s algorithm for minimum spanning trees.
An interesting feature of the problem is that m operations
can be executed in a time that is only ever so slightly more
than linear in m.
Abstract data type. A disjoint set system is an abstract
data type that represents a partition C of a set [n] =
{1, 2, . . ., n}. In other words, C is a set of pairwise dis-
joint subsets of [n] such that the union of all sets in C is
[n]. The data type supports
set FIND(i): return P ∈ C with i ∈ P;
void UNION(P, Q) : C = C − {P, Q} ∪ {P ∪ Q}.
In most applications, the sets themselves are irrelevant,
and it is only important to know when two elements be-
long to the same set and when they belong to different sets
in the system. For example, Kruskal’s algorithm executes
the operations only in the following sequence:
P = FIND(i); Q = FIND(j);
if P ≠ Q then UNION(P, Q) endif.
This is similar to many everyday situations where it is usu-
ally not important to know what it is as long as we recog-
nize when two are the same and when they are different.
Linked lists. We construct a fairly simple and reason-
ably efficient first solution using linked lists for the sets.
We use a table of length n, and for each i ∈ [n], we store
the name of the set that contains i. Furthermore, we link
the elements of the same set and use the name of the first
element as the name of the set. Figure 68 shows a sample
set system and its representation. It is convenient to also
store the size of the set with the first element.
To perform a UNION operation, we need to change the
name for all elements in one of the two sets. To save time,
we do this only for the smaller set. To merge the two lists
without traversing the longer one, we insert the shorter list
between the first two elements of the longer list.
Figure 68: The system consists of three sets, each named by the
bold element. Each element stores the name of its set, possibly
the size of its set, and possibly a pointer to the next element in
the same set.
void UNION(int P, Q)
if C[P].size < C[Q].size then P ↔ Q endif;
C[P].size = C[P].size + C[Q].size;
second = C[P].next; C[P].next = Q; t = Q;
while t ≠ 0 do
C[t].set = P; u = t; t = C[t].next
endwhile; C[u].next = second.
In the worst case, a single UNION operation takes time
Θ(n). The amortized performance is much better because
we spend time only on the elements of the smaller set.
WEIGHTED UNION LEMMA. n − 1 UNION operations
applied to a system of n singleton sets take time
O(n log n).
PROOF. For an element, i, we consider the cardinality of
the set that contains it, σ(i) = C[FIND(i)].size. Each time
the name of the set that contains i changes, σ(i) at least
doubles. After changing the name k times, we have σ(i) ≥ 2^k
and therefore k ≤ log2 n. In other words, i can be in
the smaller set of a UNION operation at most log2 n times.
The claim follows because a UNION operation takes time
proportional to the cardinality of the smaller set.
Up-trees. Thinking of names as pointers, the above data
structure stores each set in a tree of height one. We can
use more general trees and get more efficient UNION op-
erations at the expense of slower FIND operations. We
consider a class of algorithms with the following common-
alities:
56
• each set is a tree and the name of the set is the index
of the root;
• FIND traverses a path from a node to the root;
• UNION links two trees.
It suffices to store only one pointer per node, namely the
pointer to the parent. This is why these trees are called
up-trees. It is convenient to let the root point to itself.
Figure 69: The UNION operations create a tree by linking the
root of the first set to the root of the second set.
Figure 70: The table stores indices which function as pointers as
well as names of elements and of sets. The white dot represents
a pointer to itself.
Figure 69 shows the up-tree generated by executing the
following eleven UNION operations on a system of twelve
singleton sets: 2 ∪ 3, 4 ∪ 7, 2 ∪ 4, 1 ∪ 2, 4 ∪ 10, 9 ∪ 12,
12 ∪ 2, 8 ∪ 11, 8 ∪ 2, 5 ∪ 6, 6 ∪ 1. Figure 70 shows the
embedding of the tree in a table. UNION takes constant
time and FIND takes time proportional to the length of the
path, which can be as large as n − 1.
Weighted union. The running time of FIND can be im-
proved by linking smaller to larger trees. This is the idea
of weighted union again. Assume a field C[i].p for the
index of the parent (C[i].p = i if i is a root), and a field
C[i].size for the number of elements in the tree rooted at i.
We need the size field only for the roots and we need the
index to the parent field everywhere except for the roots.
The FIND and UNION operations can now be implemented
as follows:
int FIND(int i)
if C[i].p ≠ i then return FIND(C[i].p) endif;
return i.
void UNION(int i, j)
if C[i].size < C[j].size then i ↔ j endif;
C[i].size = C[i].size + C[j].size; C[j].p = i.
The size of a subtree increases by at least a factor of 2 from
a node to its parent. The depth of a node can therefore not
exceed log2 n. It follows that FIND takes at most time
O(log n). We formulate the result on the height for later
reference.
HEIGHT LEMMA. An up-tree created from n singleton
nodes by n − 1 weighted union operations has height
at most log2 n.
Path compression. We can further improve the time for
FIND operations by linking traversed nodes directly to the
root. This is the idea of path compression. The UNION
operation is implemented as before and there is only one
modification in the implementation of the FIND operation:
int FIND(int i)
if C[i].p ≠ i then C[i].p = FIND(C[i].p) endif;
return C[i].p.
Figure 71: The operations and up-trees develop from top to bot-
tom and within each row from left to right.
If i is not a root then the recursion makes it the child of a
root, which is then returned. If i is a root, it returns itself
57
because in this case C[i].p = i, by convention. Figure 71
illustrates the algorithm by executing a sequence of eight
operations i ∪ j, which is short for finding the sets that
contain i and j, and performing a UNION operation if the
sets are different. At the beginning, every element forms
its own one-node tree. With path compression, it is diffi-
cult to imagine that long paths can develop at all.
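Putting the two ideas together, a compact Python version of the up-tree data structure with weighted union and path compression; the fields C[i].p and C[i].size become two arrays. Unlike the pseudocode above, this union accepts arbitrary elements and locates their roots first.

# Up-trees with weighted union and path compression.
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))   # parent[i] == i marks a root
        self.size = [1] * n            # maintained for roots only

    def find(self, i):
        if self.parent[i] != i:
            self.parent[i] = self.find(self.parent[i])   # path compression
        return self.parent[i]

    def union(self, i, j):
        i, j = self.find(i), self.find(j)
        if i == j:
            return
        if self.size[i] < self.size[j]:
            i, j = j, i                # link smaller tree to larger one
        self.parent[j] = i
        self.size[i] += self.size[j]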
Iterated logarithm. We will prove shortly that the iter-
ated logarithm is an upper bound on the amortized time
for a FIND operation. We begin by defining the function
from its inverse. Let F(0) = 1 and F(i + 1) = 2^F(i). We
have F(1) = 2, F(2) = 2^2, and F(3) = 2^(2^2). In general,
F(i) is the tower of i 2s. Table 5 shows the values of F
for the first six arguments. For i ≤ 3, F is very small, but
i    0    1    2    3    4         5
F    1    2    4    16   65,536    2^65,536
Table 5: Values of F.
for i = 5 it already exceeds the number of atoms in our
universe. Note that the binary logarithm of a tower of i 2s
is a tower of i−1 2s. The iterated logarithm is the number
of times we can take the binary logarithm before we drop
down to one or less. In other words, the iterated logarithm
is the inverse of F,
log∗ n = min{i | F(i) ≥ n}
       = min{i | log2 log2 . . . log2 n ≤ 1},
where the binary logarithm is taken i times. As n goes to
infinity, log∗ n goes to infinity, but very slowly.
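The definitions of F and of the iterated logarithm translate into a few lines; these hypothetical helpers merely make the claim that log* n grows very slowly easy to check.

from math import log2

# F(i) is the tower of i 2s; log_star(n) counts how often we can take
# the binary logarithm before dropping to 1 or less.
def F(i):
    return 1 if i == 0 else 2 ** F(i - 1)

def log_star(n):
    i = 0
    while n > 1:
        n = log2(n)
        i += 1
    return i

print(F(4), log_star(2 ** 16))   # 65536 4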
Levels and groups. The analysis of the path com-
pression algorithm uses two Census Lemmas discussed
shortly. Let A1, A2, . . . , Am be a sequence of UNION and
FIND operations, and let T be the collection of up-trees
we get by executing the sequence, but without path com-
pression. In other words, the FIND operations have no
influence on the trees. The level λ(µ) of a node µ is the
height of its subtree in T plus one.
LEVEL CENSUS LEMMA. There are at most n/2^(ℓ−1)
nodes at level ℓ.
PROOF. We use induction to show that a node at level ℓ
has a subtree of at least 2^(ℓ−1) nodes. The claim follows
because subtrees of nodes on the same level are disjoint.
Note that if µ is a proper descendent of another node
ν at some moment during the execution of the operation
sequence then µ is a proper descendent of ν in T . In this
case λ(µ) < λ(ν).
Figure 72: A schematic drawing of the tree T between the col-
umn of level numbers on the left and the column of group num-
bers on the right. The tree is decomposed into five groups, each
a sequence of contiguous levels.
Define the group number of a node µ as the iterated
logarithm of the level, g(µ) = log∗ λ(µ). Because the
level does not exceed n, we have g(µ) ≤ log∗ n, for every
node µ in T. The definition of g decomposes an up-tree
into at most 1 + log∗ n groups, as illustrated in Figure 72.
The number of levels in group g is F(g)−F(g−1), which
gets large very fast. On the other hand, because levels get
smaller at an exponential rate, the number of nodes in a
group is not much larger than the number of nodes in the
lowest level of that group.
GROUP CENSUS LEMMA. There are at most 2n/F(g)
nodes with group number g.
PROOF. Each node with group number g has level between
F(g − 1) + 1 and F(g). We use the Level Census Lemma
to bound their number:
∑_{ℓ=F(g−1)+1}^{F(g)} n/2^(ℓ−1) ≤ n · (1 + 1/2 + 1/4 + . . .) / 2^(F(g−1)) = 2n/F(g),
as claimed.
Analysis. The analysis is based on the interplay between
the up-trees obtained with and without path compression.
58
The latter are constructed by the weighted union opera-
tions and eventually form a single tree, which we denote
as T . The former can be obtained from the latter by the
application of path compression. Note that in T , the level
strictly increases from a node to its parent. Path compres-
sion preserves this property, so levels also increase when
we climb a path in the actual up-trees.
We now show that any sequence of m ≥ n UNION and
FIND operations on a ground set [n] takes time at most
O(m log∗ n) if weighted union and path compression are
used. We can focus on FIND because each UNION opera-
tion takes only constant time. For a FIND operation Ai, let
Xi be the set of nodes along the traversed path. The total
time for executing all FIND operations is proportional to
x = ∑_i |Xi|.
For µ ∈ Xi, let pi(µ) be the parent during the execution of
Ai. We partition Xi into the topmost two nodes, the nodes
just below boundaries between groups, and the rest:
Yi = {µ ∈ Xi | µ is root or child of root},
Zi = {µ ∈ Xi − Yi | g(µ) < g(pi(µ))},
Wi = {µ ∈ Xi − Yi | g(µ) = g(pi(µ))}.
Clearly, |Yi| ≤ 2 and |Zi| ≤ log∗ n. It remains to bound
the total size of the Wi, w = ∑_i |Wi|. Instead of count-
ing, for each Ai, the nodes in Wi, we count, for each node
µ, the FIND operations Aj for which µ ∈ Wj. In other
words, we count how often µ can change parent until its
parent has a higher group number than µ. Each time µ
changes parent, the new parent has higher level than the
old parent. It follows that the number of changes is at
most F(g(µ)) − F(g(µ) − 1). The number of nodes with
group number g is at most 2n/F(g) by the Group Census
Lemma. Hence
w ≤ ∑_{g=0}^{log∗ n} (2n/F(g)) · (F(g) − F(g − 1))
  ≤ 2n · (1 + log∗ n).
This implies that
x ≤ 2m + m log∗ n + 2n(1 + log∗ n)
  = O(m log∗ n),
assuming m ≥ n. This is an upper bound on the total time
it takes to execute m FIND operations. The amortized cost
per FIND operation is therefore at most O(log∗ n), which
for all practical purposes is a constant.
Summary. We proved an upper bound on the time
needed for m ≥ n UNION and FIND operations. The
bound is more than constant per operation, although for
all practical purposes it is constant. The log∗ n bound can
be improved to an even smaller function, usually referred
to as α(n) or the inverse of the Ackermann function, that
goes to infinity even slower than the iterated logarithm.
It can also be proved that (under some mild assumptions)
there is no algorithm that can execute general sequences
of UNION and FIND operations in amortized time that is
asymptotically less than α(n).
59
Fourth Homework Assignment
Write the solution to each problem on a single page. The
deadline for handing in solutions is October 30.
Problem 1. (20 = 10 + 10 points). Consider a free tree
and let d(u, v) be the number of edges in the path
connecting u to v. The diameter of the tree is the
maximum d(u, v) over all pairs of vertices in the tree.
(a) Give an efficient algorithm to compute the di-
ameter of a tree.
(b) Analyze the running time of your algorithm.
Problem 2. (20 points). Design an efficient algorithm to
find a spanning tree for a connected, weighted, undi-
rected graph such that the weight of the maximum
weight edge in the spanning tree is minimized. Prove
the correctness of your algorithm.
Problem 3. (7 + 6 + 7 points). A weighted graph G =
(V, E) is a near-tree if it is connected and has at most
n + 8 edges, where n is the number of vertices. Give
an O(n)-time algorithm to find a minimum weight
spanning tree for G.
Problem 4. (10 + 10 points). Given an undirected
weighted graph and vertices s, t, design an algorithm
that computes the number of shortest paths from s to
t in the case:
(a) All weights are positive numbers.
(b) All weights are real numbers.
Analyze your algorithm for both (a) and (b).
Problem 5. (20 = 10 + 10 points). The off-line mini-
mum problem is about maintaining a subset of [n] =
{1, 2, . . ., n} under the operations INSERT and EX-
TRACTMIN. Given an interleaved sequence of n in-
sertions and m min-extractions, the goal is to deter-
mine which key is returned by which min-extraction.
We assume that each element i ∈ [n] is inserted ex-
actly once. Specifically, we wish to fill in an array
E[1..m] such that E[i] is the key returned by the i-
th min-extraction. Note that the problem is off-line,
in the sense that we are allowed to process the entire
sequence of operations before determining any of the
returned keys.
(a) Describe how to use a union-find data structure
to solve the problem efficiently.
(b) Give a tight bound on the worst-case running
time of your algorithm.
60
V TOPOLOGICAL ALGORITHMS
17 Geometric Graphs
18 Surfaces
19 Homology
Fifth Homework Assignment
61
17 Geometric Graphs
In the abstract notion of a graph, an edge is merely a pair of
vertices. The geometric (or topological) notion of a graph
is closer to our intuition in which we think of an edge as a
curve that connects two vertices.
Embeddings. Let G = (V, E) be a simple, undirected
graph and write R^2 for the two-dimensional real plane.
A drawing maps every vertex v ∈ V to a point ε(v) in
R^2, and it maps every edge {u, v} ∈ E to a curve with
endpoints ε(u) and ε(v). The drawing is an embedding if
1. different vertices map to different points;
2. the curves have no self-intersections;
3. the only points of a curve that are images of vertices
are its endpoints;
4. two curves intersect at most in their endpoints.
We can always map the vertices to points and the edges
to curves in R^3 so they form an embedding. On the other
hand, not every graph has an embedding in R^2. The graph
G is planar if it has an embedding in R^2. As illustrated
in Figure 73, a planar graph has many drawings, not all of
which are embeddings. A straight-line drawing or embed-
Figure 73: Three drawings of K4, the complete graph with four
vertices. From left to right: a drawing that is not an embedding,
an embedding with one curved edge, a straight-line embedding.
ding is one in which each edge is mapped to a straight line
segment. It is uniquely determined by the mapping of the
vertices, ε : V → R^2. We will see later that every planar
graph has a straight-line embedding.
Euler’s formula. A face of an embedding ε of G is a
component of the thus defined decomposition of R^2. We
write n = |V |, m = |E|, and ℓ for the number of faces.
Euler’s formula says these numbers satisfy a linear rela-
tion.
EULER’S FORMULA. If G is connected and ε is an em-
bedding of G in R^2 then n − m + ℓ = 2.
PROOF. Choose a spanning tree (V, T ) of G = (V, E). It
has n vertices, |T | = n − 1 edges, and one (unbounded)
face. We have n − (n − 1) + 1 = 2, which proves the for-
mula if G is a tree. Otherwise, draw the remaining edges,
one at a time. Each edge decomposes one face into two.
The number of vertices does not change, m increases by
one, and ℓ increases by one. Since the graph satisfies the
linear relation before drawing the edge, it satisfies the re-
lation also after drawing the edge.
A planar graph is maximally connected if adding any
one new edge violates planarity. Not surprisingly, a planar
graph of three or more vertices is maximally connected
iff every face in an embedding is bounded by three edges.
Indeed, suppose there is a face bounded by four or more
edges. Then we can find two vertices in its boundary that
are not yet connected and we can connect them by draw-
ing a curve that passes through the face; see Figure 74.
For obvious reasons, we call an embedding of a maxi-
Figure 74: Drawing the edge from a to c decomposes the quad-
rangle into two triangles. Note that we cannot draw the edge
from b to d since it already exists outside the quadrangle.
mally connected planar graph with n ≥ 3 vertices a tri-
angulation. For such graphs, we have an additional linear
relation, namely 3ℓ = 2m. We can thus rewrite Euler’s
formula and get n − m + 2m/3 = 2 and n − 3ℓ/2 + ℓ = 2 and
therefore
m = 3n − 6;
ℓ = 2n − 4.
Every planar graph can be completed to a maximally con-
nected planar graph. For n ≥ 3 this implies that the planar
graph has at most 3n − 6 edges and at most 2n − 4 faces.
Forbidden subgraphs. We can use Euler’s relation to
prove that the complete graph of five vertices is not planar.
It has n = 5 vertices and m = 10 edges, contradicting the
upper bound of at most 3n − 6 = 9 edges. Indeed, every
drawing of K5 has at least two edges crossing; see Figure
75. Similarly, we can prove that the complete bipartite
62
Figure 75: A drawing of K5 on the left and of K3,3 on the right.
graph with three plus three vertices is not planar. It has
n = 6 vertices and m = 9 edges. Every cycle in a bipartite
graph has an even number of edges. Hence, 4ℓ ≤ 2m.
Plugging this into Euler’s formula, we get n − m + m/2 ≥ 2
and therefore m ≤ 2n − 4 = 8, again a contradiction.
In a sense, K5 and K3,3 are the quintessential non-
planar graphs. To make this concrete, we still need an
operation that creates or removes degree-2 vertices. Two
graphs are homeomorphic if one can be obtained from the
other by a sequence of operations, each deleting a degree-2
vertex and replacing its two edges by the one that connects
its two neighbors, or the other way round.
KURATOWSKI’S THEOREM. A graph G is planar iff no
subgraph of G is homeomorphic to K5 or to K3,3.
The proof of this result is a bit lengthy and omitted.
Pentagons are star-convex. Euler’s formula can also be
used to show that every planar graph has a straight-line
embedding. Note that the sum of vertex degrees counts
each edge twice, that is, ∑_{v∈V} deg(v) = 2m. For planar
graphs, twice the number of edges is less than 6n which
implies that the average degree is less than six. It follows
that every planar graph has at least one vertex of degree
5 or less. This can be strengthened by saying that every
planar graph with n ≥ 4 vertices has at least four vertices
of degree at most 5 each. To see this, assume the planar
graph is maximally connected and note that every vertex
has degree at least 3. The deficiency from degree 6 is thus
at most 3. The total deficiency is 6n − ∑_{v∈V} deg(v) =
12 which implies that we have at least four vertices with
positive deficiency.
We need a little bit of geometry to prepare the construc-
tion of a straight-line embedding. A region R ⊆ R^2 is
convex if x, y ∈ R implies that the entire line segment
connecting x and y is contained in R. Figure 76 shows
regions of either kind. We call R star-convex if there is
a point z ∈ R such that for every point x ∈ R the line
segment connecting x with z is contained in R. The set of
Figure 76: A convex region on the left and a non-convex star-
convex region on the right.
such points z is the kernel of R. Clearly, every convex re-
gion is star-convex but not every star-convex region is con-
vex. Similarly, there are regions that are not star-convex,
even rather simple ones such as the hexagon in Figure 77.
However, every pentagon is star-convex. Indeed, the pen-
Figure 77: A non-star-convex hexagon on the left and a star-
convex pentagon on the right. The dark region inside the pen-
tagon is its kernel.
tagon can be decomposed into three triangles by drawing
two diagonals that share an endpoint. Extending the inci-
dent sides into the pentagon gives locally the boundary of
the kernel. It follows that the kernel is non-empty and has
interior points.
Fáry’s construction. We construct a straight-line em-
bedding of a planar graph G = (V, E) assuming G is
maximally connected. Choose three vertices, a, b, c, con-
nected by three edges to form the outer triangle. If G has
only n = 3 vertices we are done. Else it has at least one
vertex u ∈ V − {a, b, c} with deg(u) ≤ 5.
Step 1. Remove u together with the k = deg(u) edges
incident to u. Add k − 3 edges to make the graph
maximally connected again.
Step 2. Recursively construct a straight-line embed-
ding of the smaller graph.
Step 3. Remove the added k − 3 edges and map u to
a point ε(u) in the interior of the kernel of the result-
ing k-gon. Connect ε(u) with line segments to the
vertices of the k-gon.
63
Figure 78 illustrates the recursive construction. It is
straightforward to implement but there are numerical is-
sues in the choice of ε(u) that limit the usefulness of this
construction.
Figure 78: We fix the outer triangle, remove the degree-5 vertex,
recursively construct a straight-line embedding of the rest, and
finally add the vertex back.
Tutte’s construction. A more useful construction of a
straight-line embedding goes back to the work of Tutte.
We begin with a definition. Given a finite set of points,
x1, x2, . . . , xj, the average is x = (1/j) ∑_{i=1}^{j} xi.
For j = 2, it is the midpoint of the edge and for j = 3,
it is the centroid of the triangle. In general, the average
is a point somewhere between the xi. Let G = (V, E)
be a maximally connected planar graph and a, b, c three
vertices connected by three edges. We now follow Tutte’s
construction to get a mapping ε : V → R^2 so that the
straight-line drawing of G is a straight-line embedding.
Step 1. Map a, b, c to points ε(a), ε(b), ε(c) spanning
a triangle in R^2.
Step 2. For each vertex u ∈ V − {a, b, c}, let Nu be
the set of neighbors of u. Map u to the average of the
images of its neighbors, that is,
ε(u) = (1/|Nu|) ∑_{v∈Nu} ε(v).
The fact that the resulting mapping ε : V → R^2 gives a
straight-line embedding of G is known as Tutte’s Theo-
rem. It holds even if G is not quite maximally connected
and if the points are not quite the averages of their neigh-
bors. The proof is a bit involved and omitted.
The points ε(u) can be computed by solving a system of
linear equations. We illustrate this for the graph in Figure
78. We set ε(a) = (−1, −1), ε(b) = (1, −1), ε(c) = (0, 1). The
other five points are computed by solving the system of
linear equations Av = 0, where
A  =  [ 0  0  1 −5  1  1  1  1
        0  0  1  1 −3  1  0  0
        1  1  1  1  1 −6  1  0
        0  1  1  1  0  1 −5  1
        0  0  1  1  0  0  1 −3 ]
and v is the column vector of points ε(a) to ε(y). There
are really two linear systems, one for the horizontal and
the other for the vertical coordinates. In each system, we
have n − 3 equations and a total of n − 3 unknowns. This
gives a unique solution provided the equations are linearly
independent. Proving that they are is part of the proof of
Tutte’s Theorem. Solving the linear equations is a numeri-
cal problem that is studied in detail in courses on numerical
analysis.
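A hedged numerical sketch of Tutte's construction using numpy: the interior coordinates are obtained by solving the linear system described above, with one right-hand side per coordinate. The graph and the positions of the fixed outer vertices are supplied by the caller; nothing here is specific to the example of Figure 78.

import numpy as np

# Tutte's construction: every non-fixed vertex is placed at the
# average of its neighbors. adj[u] lists the neighbors of u (vertices
# 0..n-1); fixed maps each outer vertex to its point in the plane.
def tutte_embedding(adj, fixed):
    inner = [u for u in range(len(adj)) if u not in fixed]
    index = {u: k for k, u in enumerate(inner)}
    A = np.zeros((len(inner), len(inner)))
    b = np.zeros((len(inner), 2))
    for u in inner:
        k = index[u]
        A[k, k] = len(adj[u])          # deg(u) * eps(u) = sum over neighbors
        for v in adj[u]:
            if v in fixed:
                b[k] += fixed[v]       # known neighbor goes to the right-hand side
            else:
                A[k, index[v]] -= 1
    pos = dict(fixed)
    for u, p in zip(inner, np.linalg.solve(A, b)):
        pos[u] = p
    return pos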
64
18 Surfaces
Graphs may be drawn in two, three, or higher dimen-
sions, but they are still intrinsically only 1-dimensional.
One step up in dimensions, we find surfaces, which are
2-dimensional.
Topological 2-manifolds. The simplest kind of surfaces
are the ones that on a small scale look like the real plane.
A space M is a 2-manifold if every point x ∈ M is
locally homeomorphic to R^2. Specifically, there is an
open neighborhood N of x and a continuous bijection
h : N → R^2 whose inverse is also continuous. Such a
bicontinuous map is called a homeomorphism. Examples
of 2-manifolds are the open disk and the sphere. The for-
mer is not compact because it has covers that do not have
finite subcovers. Figure 79 shows examples of compact 2-
manifolds. If we add the boundary circle to the open disk
Figure 79: Three compact 2-manifolds, the sphere, the torus, and
the double torus.
we get a closed disk which is compact but not every point
is locally homeomorphic to R^2. Specifically, a point on
the circle has an open neighborhood homeomorphic to the
closed half-plane, H^2 = {(x1, x2) ∈ R^2 | x1 ≥ 0}. A
space whose points have open neighborhoods homeomor-
phic to R^2 or H^2 is called a 2-manifold with boundary;
see Figure 80 for examples. The boundary is the subset
Figure 80: Three 2-manifolds with boundary, the closed disk, the
cylinder, and the Möbius strip.
of points with neighborhoods homeomorphic to H^2. It is
a 1-manifold (without boundary), that is, every point is
locally homeomorphic to R. There is only one type of
compact, connected 1-manifold, namely the closed curve.
In topology, we do not distinguish spaces that are home-
omorphic to each other. Hence, every closed curve is like
every other one and they are all homeomorphic to the unit
circle, S^1 = {x ∈ R^2 | ‖x‖ = 1}.
Triangulations. A standard representation of a compact
2-manifold uses triangles that are glued to each other
along shared edges and vertices. A collection K of tri-
angles, edges, and vertices is a triangulation of a compact
2-manifold if
I. for every triangle in K, its three edges belong to K,
and for every edge in K, its two endpoints are ver-
tices in K;
II. every edge belongs to exactly two triangles and every
vertex belongs to a single ring of triangles.
An example is shown in Figure 81. To simplify language,
we call each element of K a simplex. If we need to be spe-
cific, we add the dimension, calling a vertex a 0-simplex,
an edge a 1-simplex, and a triangle a 2-simplex. A face
of a simplex τ is a simplex σ ⊆ τ. For example, a trian-
gle has seven faces, its three vertices, its three edges, and
itself. We can now state Condition I more succinctly: if
σ is a face of τ and τ ∈ K then σ ∈ K. To talk about
Figure 81: A triangulation of the sphere. The eight triangles are
glued to form the boundary of an octahedron which is homeo-
morphic to the sphere.
the inverse of the face relation, we define the star of a
simplex σ as the set of simplices that contain σ as a face,
St σ = {τ ∈ K | σ ⊆ τ}. Sometimes we think of the
star as a set of simplices and sometimes as a set of points,
namely the union of interiors of the simplices in the star.
With the latter interpretation, we can now express Condi-
tion II more succinctly: the star of every simplex in K is
homeomorphic to R^2.
Data structure. When we store a 2-manifold, it is use-
ful to keep track of which side we are facing and where
we are going so that we can move around efficiently.
The core piece of our data structure is a representation
of the symmetry group of a triangle. This group is iso-
morphic to the group of permutations of three elements,
65
the vertices of the triangle. We call each permutation
an ordered triangle and use cyclic shifts and transposi-
tions to move between them; see Figure 82. We store
Figure 82: The symmetry group of the triangle consists of six
ordered versions. Each ordered triangle has a lead vertex and a
lead directed edge.
the entire symmetry group in a single node of an abstract
graph, with arcs between neighboring triangles. Further-
more, we store the vertices in a linear array, V [1..n]. For
each ordered triangle, we store the index of the lead ver-
tex and a pointer to the neighboring triangle that shares
the same directed lead edge. A pointer in this context
is the address of a node together with a three-bit inte-
ger, ι, that identifies the ordered version of the triangle
we refer to. Suppose for example that we identify the
ordered versions abc, bca, cab, bac, cba, acb of a triangle
with ι = 0, 1, 2, 4, 5, 6, in this sequence. Then we can
move between different ordered versions of the same tri-
angle using the following functions.
ordTri ENEXT(µ, ι)
if ι ≤ 2 then return (µ, (ι + 1) mod 3)
else return (µ, (ι + 1) mod 3 + 4)
endif.
ordTri SYM(µ, ι)
return (µ, (ι + 4) mod 8).
To get the index of the lead vertex, we use the integer func-
tion ORG(µ, ι) and to get the pointer to the neighboring
triangle, we use FNEXT(µ, ι).
Orientability. A 2-manifold is orientable if it has two
distinct sides, that is, if we move around on one we stay
there and never cross over to the other side. The one exam-
ple of a non-orientable manifold we have seen so far is the
Möbius strip in Figure 80. There are also non-orientable,
compact 2-manifolds (without boundary), as we can see in
Figure 83. We use the data structure to decide whether or
Figure 83: Two non-orientable, compact 2-manifolds, the pro-
jective plane on the left and the Klein bottle on the right.
not a 2-manifold is orientable. Note that the cyclic shift
partitions the set of six ordered triangles into two orien-
tations, each consisting of three triangles. We say two
neighboring triangles are consistently oriented if they dis-
agree on the direction of the shared edge, as in Figure 81.
Using depth-first search, we visit all triangles and orient
them consistently, if possible. At the first visit, we ori-
ent the triangle consistent with the preceding, neighboring
triangle. At subsequent visits, we check for consistent
orientation.
boolean ISORNTBL(µ, ι)
if µ is unmarked then
mark µ; choose the orientation that contains ι;
bx = ISORNTBL(FNEXT(SYM(µ, ι)));
by = ISORNTBL(FNEXT(ENEXT(SYM(µ, ι))));
bz = ISORNTBL(FNEXT(ENEXT²(SYM(µ, ι))));
return bx and by and bz
else
return [orientation of µ contains ι]
endif.
There are two places where we return a boolean value. At
the second place, it indicates whether or not we have con-
sistent orientation in spite of the visited triangle being ori-
ented prior to the visit. At the first place, the boolean value
indicates whether or not we have found a contradiction to
orientability so far. A value of FALSE anywhere during the
computation is propagated to the root of the search tree
telling us that the 2-manifold is non-orientable. The run-
ning time is proportional to the number of triangles in the
triangulation of the 2-manifold.
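As a sketch of this procedure, the following Python function decides orientability for a triangulation given as a list of vertex triples in arbitrary orientations; the representation and the names are illustrative, and it searches over shared edges rather than using the ordered-triangle data structure above.

    from collections import defaultdict, deque

    def is_orientable(triangles):
        # triangles: list of vertex triples, each in an arbitrary orientation.
        # Two neighbors are consistently oriented if they traverse their
        # shared edge in opposite directions.
        edge_to_tris = defaultdict(list)
        for t, (a, b, c) in enumerate(triangles):
            for u, v in ((a, b), (b, c), (c, a)):
                edge_to_tris[frozenset((u, v))].append(t)

        def directed_edges(t, flipped):
            a, b, c = triangles[t]
            edges = [(a, b), (b, c), (c, a)]
            return [(v, u) for (u, v) in edges] if flipped else edges

        flip = {}                                 # chosen orientation per triangle
        for start in range(len(triangles)):
            if start in flip:
                continue
            flip[start] = False
            queue = deque([start])
            while queue:                          # search one component
                t = queue.popleft()
                for u, v in directed_edges(t, flip[t]):
                    for s in edge_to_tris[frozenset((u, v))]:
                        if s == t:
                            continue
                        # s must use the shared edge as (v, u); if its given
                        # orientation uses (u, v), it has to be flipped
                        need = (u, v) in directed_edges(s, False)
                        if s not in flip:
                            flip[s] = need
                            queue.append(s)
                        elif flip[s] != need:
                            return False          # contradiction: non-orientable
        return True

    # The octahedron (a triangulation of the sphere) is orientable.
    octahedron = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1),
                  (5, 2, 1), (5, 3, 2), (5, 4, 3), (5, 1, 4)]
    print(is_orientable(octahedron))   # True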
Classification. For the sphere and the torus, it is easy
to see how to make them out of a sheet of paper. Twist-
ing the paper gives a non-orientable 2-manifold. Perhaps
most difficult to understand is the projective plane. It is
obtained by gluing each point of the sphere to its antipodal
point. This way, the entire northern hemisphere is glued
to the southern hemisphere. This gives the disk except
that we still need to glue points of the bounding circle (the
equator) in pairs, as shown in the third paper construction
in Figure 84. The Klein bottle is easier to imagine as it
is obtained by twisting the paper just once, same as in the
construction of the Möbius strip.
Figure 84: From left to right: the sphere, the torus, the projective
plane, and the Klein bottle.
There is a general method here that can be used to clas-
sify the compact 2-manifolds. Given two of them, we con-
struct a new one by removing an open disk each and glu-
ing the 2-manifolds along the two circles. The result is
called the connected sum of the two 2-manifolds, denoted
as M#N. For example, the double torus is the connected
sum of two tori, T^2 # T^2. We can cut up the g-fold torus
into a flat sheet of paper, and the canonical way of doing
this gives a 4g-gon with edges identified in pairs as shown
in Figure 85 on the left. The number g is called the genus
of the manifold. Similarly, we can get new non-orientable
manifolds from the projective plane, P^2, by forming con-
nected sums. Cutting up the g-fold projective plane gives
a 2g-gon with edges identified in pairs as shown in Figure
85 on the right.
Figure 85: The polygonal schema in standard form for the double
torus and the double Klein bottle.
We note that the constructions of the pro-
jective plane and the Klein bottle in Figure 84 are both not
in standard form. A remarkable result which is now more
than a century old is that every compact 2-manifold can be
cut up to give a standard polygonal schema. This implies
a classification of the possibilities.
CLASSIFICATION THEOREM. The members of the fami-
lies S^2, T^2, T^2 # T^2, . . . and P^2, P^2 # P^2, . . . are non-
homeomorphic and they exhaust the family of com-
pact 2-manifolds.
Euler characteristic. Suppose we are given a triangula-
tion, K, of a compact 2-manifold, M. We already know
how to decide whether or not M is orientable. To deter-
mine its type, we just need to find its genus, which we do
by counting simplices. The Euler characteristic is
χ = #vertices − #edges + #triangles.
Let us look at the orientable case first. We have a 4g-gon
which we triangulate. This is a planar graph with n −
m + ℓ = 2. However, 2g edges are counted double, the 4g
vertices of the 4g-gon are all the same, and the outer face
is not a triangle in K. Hence,
χ = (n − 4g + 1) − (m − 2g) + (ℓ − 1)
= (n − m + ℓ) − 2g
which is equal to 2 − 2g. The same analysis can be used
in the non-orientable case in which we get χ = (n − 2g +
1) − (m − g) + (ℓ − 1) = 2 − g. To decide whether
two compact 2-manifolds are homeomorphic it suffices to
determine whether they are both orientable or both non-
orientable and, if they are, whether they have the same
Euler characteristic. This can be done in time linear in the
number of simplices in their triangulations.
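A minimal Python sketch of this decision, assuming we are given the simplex counts and the outcome of the orientability test; the names are illustrative.

    def surface_type(n_vertices, n_edges, n_triangles, orientable):
        # chi = #vertices - #edges + #triangles, and
        # chi = 2 - 2g (orientable) or chi = 2 - g (non-orientable).
        chi = n_vertices - n_edges + n_triangles
        if orientable:
            g = (2 - chi) // 2
            return "sphere" if g == 0 else f"orientable surface of genus {g}"
        g = 2 - chi
        return f"non-orientable surface of genus {g}"

    # The octahedron triangulates the sphere: 6 vertices, 12 edges, 8 triangles.
    print(surface_type(6, 12, 8, orientable=True))   # sphere, chi = 2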
This result is in sharp contrast to the higher-dimensional
case. The classification of compact 3-manifolds has been
a longstanding open problem in Mathematics. Perhaps
the recent proof of the Poincaré conjecture by Perelman
brings us close to a resolution. Beyond three dimensions,
the situation is hopeless, that is, deciding whether or not
two triangulated compact manifolds of dimension four or
higher are homeomorphic is undecidable.
19 Homology
In topology, the main focus is not on geometric size but
rather on how a space is connected. The most elementary
notion distinguishes whether we can go from one place
to another. If not then there is a gap we cannot bridge.
Next we would ask whether there is a loop going around
an obstacle, or whether there is a void missing in the space.
Homology is a formalization of these ideas. It gives a way
to define and count holes using algebra.
The cyclomatic number of a graph. To motivate the
more general concepts, consider a connected graph, G,
with n vertices and m edges. A spanning tree has n − 1
edges and every additional edge forms a unique cycle to-
gether with edges in this tree; see Figure 86.
Figure 86: A tree with three additional edges defining the same
number of cycles.
Every other
cycle in G can be written as a sum of these m − (n − 1)
cycles. To make this concrete, we define a cycle as a sub-
set of the edges such that every vertex belongs to an even
number of these edges. A cycle does not need to be con-
nected. The sum of two cycles is the symmetric difference
of the two sets such that multiple edges erase each other
in pairs. Clearly, the sum of two cycles is again a cy-
cle. Every cycle, γ, in G contains some positive number
of edges that do not belong to the spanning tree. Call-
ing these edges e1, e2, . . . , ek and the cycles they define
γ1, γ2, . . . , γk, we claim that
γ = γ1 + γ2 + . . . + γk.
To see this assume that δ = γ1 + γ2 + . . .+ γk is different
from γ. Then γ+δ is again a cycle but it contains no edges
that do not belong to the spanning tree. Hence γ + δ = ∅
and therefore γ = δ, as claimed. This implies that the
m−n+1 cycles form a basis of the group of cycles which
motivates us to call m − n + 1 the cyclomatic number of
the graph. Note that the basis depends on the choice of
spanning tree while the cyclomatic number is independent
of that choice.
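A small Python sketch of this count, growing a spanning tree with union-find and counting the non-tree edges; the names are illustrative.

    def cyclomatic_number(n, edges):
        # vertices are 0..n-1; assumes the graph is connected
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path halving
                x = parent[x]
            return x

        non_tree = 0
        for u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                non_tree += 1                   # closes a unique cycle
            else:
                parent[ru] = rv                 # spanning tree edge
        return non_tree                         # equals m - (n - 1)

    # A square with both diagonals: m - n + 1 = 6 - 4 + 1 = 3.
    print(cyclomatic_number(4, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3)]))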
Simplicial complexes. We begin with a combinatorial
representation of a topological space. Using a finite
ground set of vertices, V , we call a subset σ ⊆ V an
abstract simplex. Its dimension is one less than the car-
dinality, dim σ = |σ| − 1. A face is a subset τ ⊆ σ.
DEFINITION. An abstract simplicial complex over V is a
system K ⊆ 2^V such that σ ∈ K and τ ⊆ σ implies
τ ∈ K.
The dimension of K is the largest dimension of any sim-
plex in K. A graph is thus a 1-dimensional abstract sim-
plicial complex. Just like for graphs, we sometimes think
of K as an abstract structure and at other times as a geo-
metric object consisting of geometric simplices. In the lat-
ter interpretation, we glue the simplices along shared faces
to form a geometric realization of K, denoted as |K|. We
say K triangulates a space X if there is a homeomorphism
h : X → |K|. We have seen 1- and 2-dimensional exam-
ples in the preceding sections. The boundary of a simplex
σ is the collection of co-dimension one faces,
∂σ = {τ ⊆ σ | dim τ = dim σ − 1}.
If dim σ = p then the boundary consists of p + 1 (p − 1)-
simplices. Every (p − 1)-simplex has p (p − 2)-simplices
in its own boundary. This way we get (p + 1)p (p − 2)-
simplices, counting each of the C(p + 1, p − 1) = C(p + 1, 2)
(p − 2)-dimensional faces of σ twice.
Chain complexes. We now generalize the cycles in
graphs to cycles of different dimensions in simplicial com-
plexes. A p-chain is a set of p-simplices in K. The sum
of two p-chains is their symmetric difference. We usually
write the sets as formal sums,
c = a1σ1 + a2σ2 + . . . + anσn;
d = b1σ1 + b2σ2 + . . . + bnσn,
where the ai and bi are either 0 or 1. Addition can then be
done using modulo 2 arithmetic,
c +2 d = (a1 +2 b1)σ1 + . . . + (an +2 bn)σn,
where ai +2 bi is the exclusive or operation. We simplify
notation by dropping the subscript but note that the two
plus signs are different, one modulo two and the other a
formal notation separating elements in a set. The p-chains
form a group, which we denote as (Cp, +) or simply Cp.
Note that the boundary of a p-simplex is a (p − 1)-chain,
an element of Cp−1. Extending this concept linearly, we
define the boundary of a p-chain as the sum of boundaries
of its simplices, ∂c = a1∂σ1+. . .+an∂σn. The boundary
is thus a map between chain groups and we sometimes
write the dimension as index for clarity,
∂p : Cp → Cp−1.
It is a homomorphism since ∂p(c + d) = ∂pc + ∂pd. The
infinite sequence of chain groups connected by boundary
homomorphisms is called the chain complex of K. All
groups of dimension smaller than 0 and larger than the di-
mension of K are trivial. It is convenient to keep them
around to avoid special cases at the ends. A p-cycle is a
p-chain whose boundary is zero. The sum of two p-cycles
is again a p-cycle so we get a subgroup, Zp ⊆ Cp. A
p-boundary is a p-chain that is the boundary of a (p + 1)-
chain. The sum of two p-boundaries is again a p-boundary
so we get another subgroup, Bp ⊆ Cp. Taking the bound-
ary twice in a row gives zero for every simplex and thus
for every chain, that is, ∂p(∂p+1 d) = 0. It follows that
Bp is a subgroup of Zp. We can therefore draw the chain
complex as in Figure 87.
Figure 87: The chain complex consisting of a linear sequence
of chain, cycle, and boundary groups connected by homomor-
phisms.
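The following Python sketch implements mod-2 chains as sets of simplices, with the symmetric difference as addition; it illustrates the boundary map for p ≥ 1 and the fact that the boundary of a boundary is zero. The representation is an illustrative choice.

    from itertools import combinations

    def boundary(chain):
        # chain: set of p-simplices (p >= 1), each a frozenset of vertices;
        # adding mod 2 means faces that appear twice cancel.
        result = set()
        for simplex in chain:
            for face in combinations(sorted(simplex), len(simplex) - 1):
                result ^= {frozenset(face)}
        return result

    tri = {frozenset("abc")}
    print(sorted(map(sorted, boundary(tri))))   # the three edges ab, ac, bc
    print(boundary(boundary(tri)))              # set(): boundary of a boundary is zero

    # All four triangles of the tetrahedron abcd form a 2-cycle:
    all_tris = {frozenset(t) for t in combinations("abcd", 3)}
    print(boundary(all_tris))                   # set(): every edge cancels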
Homology groups. We would like to talk about cycles
but ignore the boundaries since they do not go around a
hole. At the same time, we would like to consider two
cycles the same if they differ by a boundary. See Figure
88 for a few 1-cycles, some of which are 1-boundaries and
some of which are not. This is achieved by taking the
quotient of the cycle group and the boundary group. The
result is the p-th homology group,
Hp = Zp/Bp.
Its elements are of the form [c] = c + Bp, where c is a p-
cycle. [c] is called a homology class, c is a representative
of [c], and any two cycles in [c] are homologous, denoted
as c ∼ c′.
Figure 88: The 1-cycles γ and δ are not 1-boundaries. Adding
the 1-boundary ε to δ gives a 1-cycle homologous to δ.
Note that [c] = [c′] whenever c ∼ c′. Also note
that [c + d] = [c′ + d′] whenever c ∼ c′ and d ∼ d′. We
use this as a definition of addition for homology classes, so
we again have a group. For example, the 1-st homology
group of the torus consists of four elements, [0] = B1,
[γ] = γ +B1, [δ] = δ +B1, and [γ +δ] = γ +δ +B1. We
often draw the elements as the corners of a cube of some
dimension; see Figure 89. If the dimension is β then it has
2^β corners.
Figure 89: The four homology classes of H1 are generated by
two classes, [γ] and [δ].
The dimension is also the number of classes
needed to generate the group, the size of the basis. For
the p-th homology group, this number is βp = rank Hp =
log2 |Hp|, the p-th Betti number. For the torus we have
β0 = 1;
β1 = 2;
β2 = 1,
and βp = 0 for all p ≠ 0, 1, 2. Every 0-chain is a 0-
cycle. Two 0-cycles are homologous if they are both the
sum of an even number or both of an odd number of ver-
tices. Hence β0 = log2 2 = 1. We have seen the reason
for β1 = 2 before. Finally, there are only two 2-cycles,
namely 0 and the set of all triangles. The latter is not a
boundary, hence β2 = log2 2 = 1.
Boundary matrices. To compute homology groups and
Betti numbers, we use a matrix representation of the sim-
plicial complex. Specifically, we store the boundary ho-
momorphism for each dimension, setting ∂p[i, j] = 1 if
the i-th (p − 1)-simplex is in the boundary of the j-th p-
simplex, and ∂p[i, j] = 0, otherwise. For example, if the
complex consists of all faces of the tetrahedron, then the
boundary matrices are
∂0 = [ 0 0 0 0 ];

∂1 = [ 1 1 1 0 0 0 ]
     [ 1 0 0 1 1 0 ]
     [ 0 1 0 1 0 1 ]
     [ 0 0 1 0 1 1 ];

∂2 = [ 1 1 0 0 ]
     [ 1 0 1 0 ]
     [ 0 1 1 0 ]
     [ 1 0 0 1 ]
     [ 0 1 0 1 ]
     [ 0 0 1 1 ];

∂3 = [ 1 ]
     [ 1 ]
     [ 1 ]
     [ 1 ].
Given a p-chain as a column vector, v, its boundary is
computed by matrix multiplication, ∂pv. The result is a
combination of columns in the p-th boundary matrix, as
specified by v. Thus, v is a p-cycle iff ∂pv = 0 and v is a
p-boundary iff there is u such that ∂p+1u = v.
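As an illustration, the cycle test with the boundary matrix ∂1 of the tetrahedron can be carried out by a matrix-vector product mod 2; the sketch below assumes NumPy is available and fixes the edge order ab, ac, ad, bc, bd, cd used in the matrices above.

    import numpy as np

    d1 = np.array([[1, 1, 1, 0, 0, 0],
                   [1, 0, 0, 1, 1, 0],
                   [0, 1, 0, 1, 0, 1],
                   [0, 0, 1, 0, 1, 1]], dtype=int)

    # The 1-chain ab + ac + bc is the boundary of the triangle abc, hence a 1-cycle:
    v = np.array([1, 1, 0, 1, 0, 0])
    print((d1 @ v) % 2)        # [0 0 0 0], so v is a 1-cycle

    # The single edge ab is not a cycle:
    w = np.array([1, 0, 0, 0, 0, 0])
    print((d1 @ w) % 2)        # [1 1 0 0], its boundary is a + b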
Matrix reduction. Letting np be the number of p-
simplices in K, we note that it is also the rank of the p-th
chain group, np = rank Cp. The p-th boundary matrix
thus has np−1 rows and np columns. To figure the sizes of
the cycle and boundary groups, and thus of the homology
groups, we reduce the matrix to normal form, as shown
in Figure 90.
Figure 90: The p-th boundary matrix in normal form. The entries
in the shaded portion of the diagonal are 1 and all other entries
are 0.
The algorithm of choice uses column and
row operations similar to Gaussian elimination for solv-
ing a linear system. We write it recursively, calling it with
m = 1.
void REDUCE(m)
if ∃k, l ≥ m with ∂p[k, l] = 1 then
exchange rows m and k and columns m and l;
for i = m + 1 to np−1 do
if ∂p[i, m] = 1 then
add row m to row i
endif
endfor;
for j = m + 1 to np do
if ∂p[m, j] = 1 then
add column m to column j
endif
endfor;
REDUCE(m + 1)
endif.
For each recursive call, we have at most a linear number
of row and column operations. The total running time is
therefore at most cubic in the number of simplices. Figure
90 shows how we interpret the result. Specifically, the
number of zero columns is the rank of the cycle group,
Zp, and the number of 1s in the diagonal is the rank of the
boundary group, Bp−1. The Betti number is the difference,
βp = rank Zp − rankBp,
taking the rank of the boundary group from the reduced
matrix one dimension up. Working on our example, we
get the following reduced matrices.
∂0 = [ 0 0 0 0 ];

∂1 = [ 1 0 0 0 0 0 ]
     [ 0 1 0 0 0 0 ]
     [ 0 0 1 0 0 0 ]
     [ 0 0 0 0 0 0 ];

∂2 = [ 1 0 0 0 ]
     [ 0 1 0 0 ]
     [ 0 0 1 0 ]
     [ 0 0 0 0 ]
     [ 0 0 0 0 ]
     [ 0 0 0 0 ];

∂3 = [ 1 ]
     [ 0 ]
     [ 0 ]
     [ 0 ].
Writing zp = rank Zp and bp = rank Bp, we get z0 = 4
from the zeroth and b0 = 3 from the first reduced bound-
ary matrix. Hence β0 = z0 − b0 = 1. Furthermore,
z1 = 3 and b1 = 3 giving β1 = 0, z2 = 1 and b2 = 1
giving β2 = 0, and z3 = 0 giving β3 = 0. These are the
Betti numbers of the closed ball.
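A Python sketch of the reduction and of the Betti number computation follows; it mirrors the recursive pseudocode with full pivoting over Z/2 and assumes NumPy is available. The function names are illustrative.

    import numpy as np

    def reduce_rank(M):
        # rank of a boundary matrix over Z/2, by row/column reduction
        A = np.array(M, dtype=int) % 2
        rows, cols = A.shape
        rank = 0
        for m in range(min(rows, cols)):
            pivots = np.argwhere(A[m:, m:] == 1)    # a 1 in the remaining block
            if len(pivots) == 0:
                break
            k, l = pivots[0] + m
            A[[m, k], :] = A[[k, m], :]             # exchange rows m and k
            A[:, [m, l]] = A[:, [l, m]]             # exchange columns m and l
            for i in range(m + 1, rows):            # clear the rest of column m
                if A[i, m] == 1:
                    A[i, :] = (A[i, :] + A[m, :]) % 2
            for j in range(m + 1, cols):            # clear the rest of row m
                if A[m, j] == 1:
                    A[:, j] = (A[:, j] + A[:, m]) % 2
            rank += 1
        return rank

    def betti_numbers(boundary_matrices, num_simplices):
        # beta_p = rank Z_p - rank B_p = (n_p - rank d_p) - rank d_{p+1}
        ranks = [reduce_rank(d) for d in boundary_matrices]
        betti = []
        for p, n_p in enumerate(num_simplices):
            z_p = n_p - ranks[p]
            b_p = ranks[p + 1] if p + 1 < len(ranks) else 0
            betti.append(z_p - b_p)
        return betti

    d0 = np.zeros((1, 4), dtype=int)
    d1 = np.array([[1,1,1,0,0,0],[1,0,0,1,1,0],[0,1,0,1,0,1],[0,0,1,0,1,1]])
    d2 = np.array([[1,1,0,0],[1,0,1,0],[0,1,1,0],[1,0,0,1],[0,1,0,1],[0,0,1,1]])
    d3 = np.array([[1],[1],[1],[1]])
    print(betti_numbers([d0, d1, d2, d3], [4, 6, 4, 1]))   # [1, 0, 0, 0]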
Euler-Poincaré Theorem. The Euler characteristic of a
simplicial complex is the alternating sum of simplex num-
bers,
χ = Σ_{p≥0} (−1)^p np.
Recalling that np is the rank of the p-th chain group and
that it equals the rank of the p-th cycle group plus the rank
of the (p − 1)-st boundary group, we get
χ = Σ_{p≥0} (−1)^p (zp + bp−1)
  = Σ_{p≥0} (−1)^p (zp − bp),
which is the same as the alternating sum of Betti num-
bers. To appreciate the beauty of this result, we need to
know that the Betti numbers do not depend on the trian-
gulation chosen for the space. The proof of this property
is technical and omitted. This now implies that the Euler
characteristic is an invariant of the space, same as the Betti
numbers.
EULER-POINCARÉ THEOREM. χ = Σ_{p≥0} (−1)^p βp.
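For instance, for the complex of all faces of the tetrahedron used above, χ = 4 − 6 + 4 − 1 = 1, while the alternating sum of Betti numbers is 1 − 0 + 0 − 0 = 1, as the theorem requires.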
Fifth Homework Assignment
Write the solution to each problem on a single page. The
deadline for handing in solutions is November 13.
Problem 1. (20 points). Let G = (V, E) be a maxi-
mally connected planar graph and recall that [k] =
{1, 2, . . ., k}. A vertex k-coloring is a mapping γ :
V → [k] such that γ(u) ≠ γ(v) whenever u ≠ v
are adjacent, and an edge k-coloring is a mapping
η : E → [k] such that η(e) ≠ η(f) whenever e ≠ f
bound a common triangle. Prove that if G has a ver-
tex 4-coloring then it also has an edge 3-coloring.
Problem 2. (20 = 10 + 10 points). Let K be a set of
triangles together with their edges and vertices. The
vertices are represented by a linear array, as usual, but
there is no particular ordering information in the way
the edges and triangles are given. In other words, the
edges are just a list of index pairs and the triangles
are a list of index triplets into the vertex array.
(a) Give an algorithm that decides whether or not
K is a triangulation of a 2-manifold.
(b) Analyze your algorithm and collect credit
points if the running time of your algorithm is
linear in the number of triangles.
Problem 3. (20 = 5+7+8 points). Determine the type of
2-manifold with boundary obtained by the following
constructions.
(a) Remove a cylinder from a torus in such a way
that the rest of the torus remains connected.
(b) Remove a disk from the projective plane.
(c) Remove a Möbius strip from a Klein bottle.
Whenever we remove a piece, we do this like cutting
with scissors so that the remainder is still closed, in
each case a 2-manifold with boundary.
Problem 4. (20 = 5 + 5 + 5 + 5 points). Recall that the
sphere is the space of points at unit distance from the
origin in three-dimensional Euclidean space, S^2 =
{x ∈ R^3 | ‖x‖ = 1}.
(a) Give a triangulation of S^2.
(b) Give the corresponding boundary matrices.
(c) Reduce the boundary matrices.
(d) Give the Betti numbers of S^2.
Problem 5. (20 = 10 + 10 points). The dunce cap is ob-
tained by gluing the three edges of a triangular sheet
of paper to each other. [After gluing the first two
edges you get a cone, with the glued edges forming a
seam connecting the cone point with the rim. In the
final step, wrap the seam around the rim, gluing all
three edges to each other. To imagine how this works,
it might help to think of the final result as similar to
the shell of a snail.]
(a) Is the dunce cap a 2-manifold? Justify your an-
swer.
(b) Give a triangulation of the dunce cap, making
sure that no two edges connect the same two
vertices and no two triangles connect the same
three vertices.
VI GEOMETRIC ALGORITHMS
20 Plane-Sweep
21 Delaunay Triangulations
22 Alpha Shapes
Sixth Homework Assignment
20 Plane-Sweep
Plane-sweep is an algorithmic paradigm that emerges in
the study of two-dimensional geometric problems. The
idea is to sweep the plane with a line and perform the com-
putations in the sequence the data is encountered. In this
section, we solve three problems with this paradigm: we
construct the convex hull of a set of points, we triangulate
the convex hull using the points as vertices, and we test a
set of line segments for crossings.
Convex hull. Let S be a finite set of points in the plane,
each given by its two coordinates. The convex hull of S,
denoted by conv S, is the smallest convex set that con-
tains S. Figure 91 illustrates the definition for a set of
nine points. Imagine the points as solid nails in a planar
board. An intuitive construction stretches a rubber band
around the nails. After letting go, the nails prevent the
complete relaxation of the rubber band which will then
trace the boundary of the convex hull.
Figure 91: The convex hull of nine points, which we represent
by the counterclockwise sequence of boundary vertices: 1, 3, 6,
8, 9, 2.
To construct the counterclockwise cyclic sequence of
boundary vertices representing the convex hull, we sweep
a vertical line from left to right over the data. At any mo-
ment in time, the points in front (to the right) of the line
are untouched and the points behind (to the left) of the line
have already been processed.
Step 1. Sort the points from left to right and relabel
them in this sequence as x1, x2, . . . , xn.
Step 2. Construct a counterclockwise triangle from
the first three points: x1x2x3 or x1x3x2.
Step 3. For i from 4 to n, add the next point xi to the
convex hull of the preceding points by finding the
two lines that pass through xi and support the con-
vex hull.
The algorithm is illustrated in Figure 92, which shows the
addition of the sixth point in the data set.
Figure 92: The vertical sweep-line passes through point 6. To
add 6, we substitute 6 for the sequence of vertices on the bound-
ary between 3 and 5.
Orientation test. A critical test needed to construct the
convex hull is to determine the orientation of a sequence
of three points. In other words, we need to be able to dis-
tinguish whether we make a left-turn or a right-turn as we
go from the first to the middle and then the last point in
the sequence. A convenient way to determine the orien-
tation evaluates the determinant of a three-by-three ma-
trix. More precisely, the points a = (a1, a2), b = (b1, b2),
c = (c1, c2) form a left-turn iff
    [ 1 a1 a2 ]
det [ 1 b1 b2 ] > 0.
    [ 1 c1 c2 ]
The three points form a right-turn iff the determinant is
negative and they lie on a common line iff the determinant
is zero.
boolean LEFT(Points a, b, c)
return [a1(b2 − c2) + b1(c2 − a2)
+ c1(a2 − b2) > 0].
To see that this formula is correct, we may convince our-
selves that it is correct for three non-collinear points, e.g.
a = (0, 0), b = (1, 0), and c = (0, 1). Remember also
that the determinant measures the area of the triangle and
is therefore a continuous function that passes through zero
only when the three points are collinear. Since we can
continuously move every left-turn to every other left-turn
without leaving the class of left-turns, it follows that the
sign of the determinant is the same for all of them.
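A direct Python transcription of this test, with two sample evaluations; the function name is an illustrative choice.

    def left_turn(a, b, c):
        # True iff going a -> b -> c makes a left-turn, i.e. the determinant
        # (twice the signed area of the triangle abc) is positive.
        return (a[0] * (b[1] - c[1]) +
                b[0] * (c[1] - a[1]) +
                c[0] * (a[1] - b[1])) > 0

    print(left_turn((0, 0), (1, 0), (0, 1)))   # True, a left-turn
    print(left_turn((0, 0), (0, 1), (1, 0)))   # False, a right-turn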
Finding support lines. We use a doubly-linked cyclic
list of vertices to represent the convex hull boundary. Each
node in the list contains pointers to the next and the previ-
ous nodes. In addition, we have a pointer last to the last
vertex added to the list. This vertex is also the rightmost
in the list. We add the i-th point by connecting it to the
vertices µ → pt and λ → pt identified in a counterclock-
wise and a clockwise traversal of the cycle starting at last,
as illustrated in Figure 93.
Figure 93: The upper support line passes through the first point
µ → pt that forms a left-turn from ν → pt to µ → next → pt.
We simplify notation by using
nodes in the parameter list of the orientation test instead
of the points they store.
µ = λ = last; create new node with ν → pt = i;
while RIGHT(ν, µ, µ → next) do
µ = µ → next
endwhile;
while LEFT(ν, λ, λ → prev) do
λ = λ → prev
endwhile;
ν → next = µ; ν → prev = λ;
µ → prev = λ → next = ν; last = ν.
The effort to add the i-th point can be large, but if it is
then we remove many previously added vertices from the
list. Indeed, each iteration of the for-loop adds only one
vertex to the cyclic list. We charge $2 for the addition,
one dollar for the cost of adding and the other to pay for
the future deletion, if any. The extra dollars pay for all
iterations of the while-loops, except for the first and the
last. This implies that we spend only constant amortized
time per point. After sorting the points from left to right,
we can therefore construct the convex hull of n points in
time O(n).
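The following Python sketch puts Steps 1 to 3 and the support-line search together, using dictionaries as the nodes of the cyclic doubly-linked list. It assumes general position (no repeated or vertically aligned points, no three collinear points) and at least three input points; the names are illustrative.

    def orient(a, b, c):
        # twice the signed area of triangle abc; > 0 for a left-turn
        return (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) +
                c[0] * (a[1] - b[1]))

    def convex_hull(points):
        pts = sorted(points)                       # Step 1: left to right
        p0, p1, p2 = pts[0], pts[1], pts[2]
        first = [p0, p1, p2] if orient(p0, p1, p2) > 0 else [p0, p2, p1]
        nodes = [{'pt': q} for q in first]         # Step 2: ccw triangle
        for i in range(3):
            nodes[i]['next'] = nodes[(i + 1) % 3]
            nodes[i]['prev'] = nodes[(i - 1) % 3]
        last = next(nd for nd in nodes if nd['pt'] == p2)

        for p in pts[3:]:                          # Step 3: add points
            mu, lam = last, last
            while orient(p, mu['pt'], mu['next']['pt']) < 0:    # walk to the
                mu = mu['next']                                  # upper support vertex
            while orient(p, lam['pt'], lam['prev']['pt']) > 0:   # walk to the
                lam = lam['prev']                                 # lower support vertex
            nu = {'pt': p, 'next': mu, 'prev': lam}
            mu['prev'] = nu
            lam['next'] = nu
            last = nu

        hull, node = [last['pt']], last['next']    # read off the ccw cycle
        while node is not last:
            hull.append(node['pt'])
            node = node['next']
        return hull

    print(convex_hull([(0, 0), (1, 2), (2, 1), (3, 3), (4, 0)]))
    # [(4, 0), (3, 3), (1, 2), (0, 0)]; the point (2, 1) is interior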
Triangulation. The same plane-sweep algorithm can be
used to decompose the convex hull into triangles. All
we need to change is that points and edges are never re-
moved and a new point is connected to every point exam-
ined during the two while-loops. We define a (geometric)
triangulation of a finite set of points S in the plane as a
maximally connected straight-line embedding of a planar
graph whose vertices are mapped to points in S. Figure 94
shows the triangulation of the nine points in Figure 91 con-
structed by the plane-sweep algorithm.
Figure 94: Triangulation constructed with the plane-sweep algo-
rithm.
A triangulation is
not necessarily a maximally connected planar graph since
the prescribed placement of the points fixes the boundary
of the outer face to be the boundary of the convex hull.
Letting k be the number of edges of that boundary, we
would have to add k − 3 more edges to get a maximally
connected planar graph. It follows that the triangulation
has m = 3n − (k + 3) edges and ℓ = 2n − (k + 2)
triangles.
Line segment intersection. As a third application of the
plane-sweep paradigm, we consider the problem of decid-
ing whether or not n given line segments have pairwise
disjoint interiors. We allow line segments to share end-
points but we do not allow them to cross or to overlap. We
may interpret this problem as deciding whether or not a
straight-line drawing of a graph is an embedding. To sim-
plify the description of the algorithm, we assume no three
endpoints are collinear, so we only have to worry about
crossings and not about other overlaps.
How can we decide whether or not a line segment
with endpoint u = (u1, u2) and v = (v1, v2) crosses
another line segment with endpoints p = (p1, p2) and
q = (q1, q2)? Figure 95 illustrates the question by show-
ing the four different cases of how two line segments and
the lines they span can intersect. The line segments cross
iff uv intersects the line of pq and pq intersects the line of
uv. This condition can be checked using the orientation
test.
boolean CROSS(Points u, v, p, q)
return [(LEFT(u, v, p) xor LEFT(u, v, q)) and
(LEFT(p, q, u) xor LEFT(p, q, v))].
We can use the above function to test all n(n − 1)/2 pairs
of line segments, which takes time O(n^2).
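A short Python sketch of this quadratic test; the helper names are illustrative.

    from itertools import combinations

    def left(a, b, c):
        return (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) +
                c[0] * (a[1] - b[1])) > 0

    def cross(u, v, p, q):
        # uv and pq cross iff each straddles the line spanned by the other
        return ((left(u, v, p) != left(u, v, q)) and
                (left(p, q, u) != left(p, q, v)))

    def any_crossing(segments):
        # brute force over all n(n-1)/2 pairs, O(n^2) time
        return any(cross(*s, *t) for s, t in combinations(segments, 2))

    print(any_crossing([((0, 0), (2, 2)), ((0, 2), (2, 0))]))   # True
    print(any_crossing([((0, 0), (1, 0)), ((0, 1), (1, 1))]))   # False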
Figure 95: Three pairs of non-crossing and one pair of crossing
line segments.
Plane-sweep algorithm. We obtain a faster algorithm
by sweeping the plane with a vertical line from left to
right, as before. To avoid special cases, we assume that
no two endpoints are the same or lie on a common verti-
cal line. During the sweep, we maintain the subset of line
segments that intersect the sweep-line in the order they
meet the line, as shown in Figure 96.
Figure 96: Five of the line segments intersect the sweep-line at
its current position and two of them cross.
We store this subset
in a dictionary, which is updated at every endpoint. Only
line segments that are adjacent in the ordering along the
sweep-line are tested for crossings. Indeed, two line seg-
ments that cross are adjacent right before the sweep-line
passes through the crossing, if not earlier.
Step 1. Sort the 2n endpoints from left to right and re-
label them in this sequence as x1, x2, . . . , x2n. Each
point still remembers the index of the other endpoint
of its line segment.
Step 2. For i from 1 to 2n, process the i-th endpoint
as follows:
Case 2.1 xi is left endpoint of the line segment
xixj. Therefore, i < j. Insert xixj into
the dictionary and let uv and pq be its prede-
cessor and successor. If CROSS(u, v, xi, xj)
or CROSS(p, q, xi, xj) then report the crossing
and stop.
Case 2.2 xi is right endpoint of the line segment
xixj. Therefore, i > j. Let uv and pq be
the predecessor and the successor of xixj. If
CROSS(u, v, p, q) then report the crossing and
stop. Delete xixj from the dictionary.
We do an insertion into the dictionary for each left end-
point and a deletion from the dictionary for each right
endpoint, both in time O(log n). In addition, we do at
most two crossing tests per endpoint, which takes constant
time. In total, the algorithm takes time O(n log n) to test
whether a set of n line segments contains two that cross.
21 Delaunay Triangulations
The triangulations constructed by plane-sweep are typi-
cally of inferior quality, that is, there are many long and
skinny triangles and therefore many small and many large
angles. We study Delaunay triangulations, which distin-
guish themselves from all other triangulations by a num-
ber of nice properties, including that they have fast algo-
rithms and that they avoid small angles to the extent possible.
Plane-sweep versus Delaunay triangulation. Figures
97 and 98 show two triangulations of the same set of
points, one constructed by plane-sweep and the other the
Delaunay triangulation.
Figure 97: Triangulation constructed by plane-sweep. Points on
the same vertical line are processed from bottom to top.
Figure 98: Delaunay triangulation of the same twenty-one points
triangulated in Figure 97.
The angles in the Delaunay trian-
gulation seem consistently larger than those in the plane-
sweep triangulation. This is not a coincidence and it can
be proved that the Delaunay triangulation maximizes the
minimum angle for every input set. Both triangulations
contain the edges that bound the convex hull of the input
set.
Voronoi diagram. We introduce the Delaunay triangu-
lation indirectly, by first defining a particular decomposi-
tion of the plane into regions, one per point in the finite
data set S. The region of the point u in S contains all
points x in the plane that are at least as close to u as to any
other point in S, that is,
Vu = {x ∈ R^2 | ‖x − u‖ ≤ ‖x − v‖, v ∈ S},
where ‖x − u‖ = [(x1 − u1)^2 + (x2 − u2)^2]^{1/2} is the Eu-
clidean distance between the points x and u. We refer to
Vu as the Voronoi region of u. It is closed and its bound-
ary consists of Voronoi edges which Vu shares with neigh-
boring Voronoi regions. A Voronoi edge ends in Voronoi
vertices which it shares with other Voronoi edges. The
Voronoi diagram of S is the collection of Voronoi regions,
edges and vertices. Figure 99 illustrates the definitions.
Let n be the number of points in S. We list some of the
properties that will be important later.
Figure 99: The (solid) Voronoi diagram drawn above the (dot-
ted) Delaunay triangulation of the same twenty-one points trian-
gulated in Figures 97 and 98. Some of the Voronoi edges are too
far out to fit into the picture.
• Each Voronoi region is a convex polygon constructed
as the intersection of n − 1 closed half-planes.
• The Voronoi region Vu is bounded (finite) iff u lies in
the interior of the convex hull of S.
• The Voronoi regions have pairwise disjoint interiors
and together cover the entire plane.
Delaunay triangulation. We define the Delaunay trian-
gulation as the straight-line dual of the Voronoi diagram.
Specifically, for every pair of Voronoi regions Vu and Vv
that share an edge, we draw the line segment from u to v.
By construction, every Voronoi vertex, x, has j ≥ 3 clos-
est input points. Usually there are exactly three closest
points, u, v, w, in which case the triangle they span be-
longs to the Delaunay triangulation. Note that x is equally
far from u, v, and w and further from all other points in
S. This implies the empty circle property of Delaunay tri-
angles: all points of S − {u, v, w} lie outside the circum-
scribed circle of uvw. Similarly, for each Delaunay edge
uv, there is a circle that passes through u and v such that
all points of S − {u, v} lie outside the circle. For exam-
ple, the circle centered at the midpoint of the Voronoi edge
shared by Vu and Vv is empty in this sense. This property
can be used to prove that the edge skeleton of the Delau-
nay triangulation is a straight-line embedding of a planar
graph.
Figure 100: A Voronoi vertex of degree 5 and the corresponding
pentagon in the Delaunay triangulation. The dotted edges com-
plete the triangulation by decomposing the pentagon into three
triangles.
Now suppose there is a vertex with degree j > 3. It cor-
responds to a polygon with j > 3 edges in the Delaunay
triangulation, as illustrated in Figure 100. Strictly speak-
ing, the Delaunay triangulation is no longer a triangulation
but we can complete it to a triangulation by decompos-
ing each j-gon into j − 2 triangles. This corresponds to
perturbing the data points ever so slightly such that the
degree-j Voronoi vertices are resolved into trees in which
j − 2 degree-3 vertices are connected by j − 3 tiny edges.
Local Delaunayhood. Given a triangulation of a finite
point set S, we can test whether or not it is the Delaunay
triangulation by testing each edge against the two trian-
gles that share the edge. Suppose the edge uv in the tri-
angulation T is shared by the triangles uvp and uvq. We
call uv locally Delaunay, or lD for short, if q lies on or
outside the circle that passes through u, v, p. The condi-
tion is symmetric in p and q because the circle that passes
through u, v, q intersects the first circle in the points u and v.
It follows that p lies on or outside the circle of u, v, q iff q
lies on or outside the circle of u, v, p. We also call uv lo-
cally Delaunay if it bounds the convex hull of S and thus
belongs to only one triangle. The local condition on the
edges implies a global property.
DELAUNAY LEMMA. If every edge in a triangulation K
of S is locally Delaunay then K is the Delaunay tri-
angulation of S.
Although every edge of the Delaunay triangulation is lo-
cally Delaunay, the Delaunay Lemma is not trivial. In-
deed, K may contain edges that are locally Delaunay but
do not belong to the Delaunay triangulation, as shown in
Figure 101. We omit the proof of the lemma.
Figure 101: The edge uv is locally Delaunay but does not belong
to the Delaunay triangulation.
Edge-flipping. The Delaunay Lemma suggests we con-
struct the Delaunay triangulation by first constructing an
arbitrary triangulation of the point set S and then modify-
ing it locally to make all edges lD. The idea is to look for
non-lD edges and to flip them, as illustrated in Figure 102.
Indeed, if uv is a non-lD edge shared by triangles uvp and
uvq then upvq is a convex quadrilateral and flipping uv
means substituting one diagonal for the other, namely pq
for uv.
Figure 102: The edge uv is non-lD and can be flipped to the edge
pq, which is lD.
Note that if uv is non-lD then pq is lD. It is im-
portant that the algorithm finds non-lD edges quickly. For
this purpose, we use a stack of edges. Initially, we push
all edges on the stack and mark them.
while stack is non-empty do
pop edge uv from stack and unmark it;
if uv is non-lD then
substitute pq for uv;
for ab ∈ {up, pv, vq, qu} do
if ab is unmarked then
push ab on the stack and mark it
endif
endfor
endif
endwhile.
The marks avoid multiple copies of the same edge on the
stack. This implies that at any one moment the size of the
stack is less than 3n. Note also that initially the stack con-
tains all non-lD edges and that this property is maintained
as an invariant of the algorithm. The Delaunay Lemma
implies that when the algorithm halts, which is when the
stack is empty, then the triangulation is the Delaunay tri-
angulation. However, it is not yet clear that the algorithm
terminates. Indeed, the stack can grow and shrink dur-
ing the course of the algorithm, which makes it difficult to
prove that it ever runs empty.
In-circle test. Before studying the termination of the al-
gorithm, we look into the question of distinguishing lD
from non-lD edges. As before we assume that the edge uv
is shared by the triangles uvp and uvq in the current trian-
gulation. Recall that uv is lD iff q lies outside the circle
that passes through u, v, p. Let f : R^2 → R be defined by
f(x) = x1^2 + x2^2. As illustrated in Figure 103, the graph
of this function is a paraboloid in three-dimensional space
and we write x^+ = (x1, x2, f(x)) for the vertical projec-
tion of the point x onto the paraboloid. Assuming the three
points u, v, p do not lie on a common line then the points
u^+, v^+, p^+ lie on a non-vertical plane that is the graph of
a function h(x) = αx1 + βx2 + γ. The projection of the
intersection of the paraboloid and the plane back into R^2
is given by
0 = f(x) − h(x) = x1^2 + x2^2 − αx1 − βx2 − γ,
which is the equation of a circle. This circle passes
through u, v, p so it is the circle we have to compare q
against.
Figure 103: The plane passing through u^+, v^+, p^+ intersects the
paraboloid in an ellipse whose projection into R^2 passes through
the points u, v, p. The point q^+ lies below the plane iff q lies
inside the circle.
We note that q lies inside the circle iff q^+ lies be-
low the plane. The latter test can be based on the sign of
the determinant of the 4-by-4 matrix
    [ 1 u1 u2 u1^2 + u2^2 ]
∆ = [ 1 v1 v2 v1^2 + v2^2 ]
    [ 1 p1 p2 p1^2 + p2^2 ]
    [ 1 q1 q2 q1^2 + q2^2 ].
Exchanging two rows in the matrix changes the sign.
While the in-circle test should be insensitive to the order
of the first three points, the sign of the determinant is not.
We correct the change using the sign of the determinant of
the 3-by-3 matrix that keeps track of the ordering of u, v, p
along the circle,
    [ 1 u1 u2 ]
Γ = [ 1 v1 v2 ]
    [ 1 p1 p2 ].
Now we claim that q is inside the circle of u, v, p iff the
two determinants have opposite signs:
boolean INCIRCLE(Points u, v, p, q)
return det Γ · det ∆ < 0.
We first show that the boolean function is correct for u =
(0, 0), v = (1, 0), p = (0, 1), and q = (0, 0.5). The sign
of the product of determinants remains unchanged if we
continuously move the points and avoid the configurations
that make either determinant zero, which are when u, v, p
are collinear and when u, v, p, q are cocircular. We can
change any configuration where q is inside the circle of
u, v, p continuously into the special configuration without
going through zero, which implies the correctness of the
function for general input points.
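A Python sketch of the test, evaluating the two determinants numerically; it assumes NumPy is available and uses the special configuration from the text as a check.

    import numpy as np

    def in_circle(u, v, p, q):
        # True iff q lies inside the circle through u, v, p; the product of
        # the determinants of Gamma and Delta is negative exactly then.
        gamma = np.array([[1, u[0], u[1]],
                          [1, v[0], v[1]],
                          [1, p[0], p[1]]], dtype=float)
        delta = np.array([[1, u[0], u[1], u[0]**2 + u[1]**2],
                          [1, v[0], v[1], v[0]**2 + v[1]**2],
                          [1, p[0], p[1], p[0]**2 + p[1]**2],
                          [1, q[0], q[1], q[0]**2 + q[1]**2]], dtype=float)
        return np.linalg.det(gamma) * np.linalg.det(delta) < 0

    print(in_circle((0, 0), (1, 0), (0, 1), (0, 0.5)))   # True, q is inside
    print(in_circle((0, 0), (1, 0), (0, 1), (2, 2)))     # False, q is outside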
Termination and running time. To prove the edge-flip
algorithm terminates, we imagine the triangulation lifted
to R^3. We do this by projecting the vertices vertically
onto the paraboloid, as before, and connecting them with
straight edges and triangles in space. Let uv be an edge
shared by triangles uvp and uvq that is flipped to pq by
the algorithm. It follows that the line segments uv and pq cross
and their endpoints form a convex quadrilateral, as shown
in Figure 104. After lifting the two line segments, we get
u^+v^+ passing above p^+q^+.
Figure 104: A flip in the plane lifts to a tetrahedron in space in
which the lD edge passes below the non-lD edge.
We may thus think of the flip
as gluing the tetrahedron u^+v^+p^+q^+ underneath the sur-
face obtained by lifting the triangulation. The surface is
pushed down by each flip and never pushed back up. The
removed edge is now above the new surface and can there-
fore not be reintroduced by a later flip. It follows that the
algorithm performs at most n(n − 1)/2 flips and thus takes
at most time O(n^2) to construct the Delaunay triangulation of S.
There are faster algorithms that work in time O(n log n)
but we prefer the suboptimal method because it is simpler
and it reveals more about Delaunay triangulations than the
other algorithms.
The lifting of the input points to R^3 leads to an interest-
ing interpretation of the edge-flip algorithm. Starting with
a monotone triangulated surface passing through the lifted
points, we glue tetrahedra below the surface until we reach
the unique convex surface that passes through the points.
The projection of this convex surface is the Delaunay tri-
angulation of the points in the plane. This also gives a
reinterpretation of the Delaunay Lemma in terms of con-
vex and concave edges of the surface.
22 Alpha Shapes
Many practical applications of geometry have to do with
the intuitive but vague concept of the shape of a finite point
set. To make this idea concrete, we use the distances be-
tween the points to identify subcomplexes of the Delaunay
triangulation that represent that shape at different levels of
resolution.
Union of disks. Let S be a set of n points in R^2. For
each r ≥ 0, we write Bu(r) = {x ∈ R^2 | ‖x − u‖ ≤ r}
for the closed disk with center u and radius r. Let
U(r) = ⋃_{u∈S} Bu(r) be the union of the n disks. We de-
compose this union into convex sets of the form Ru(r) =
Bu(r) ∩ Vu. Then
(i) Ru(r) is closed and convex for every point u ∈ S
and every radius r ≥ 0;
(ii) Ru(r) and Rv(r) have disjoint interiors whenever the
two points, u and v, are different;
(iii) U(r) = ⋃_{u∈S} Ru(r).
We illustrate this decomposition in Figure 105. Each re-
gion Ru(r) is the intersection of n − 1 closed half-planes
and a closed disk. All these sets are closed and convex,
which implies (i). The Voronoi regions have disjoint inte-
riors, which implies (ii). Finally, take a point x ∈ U(r)
and let u be a point in S with x ∈ Vu. Then x ∈ Bu(r)
and therefore x ∈ Ru(r). This implies (iii).
Figure 105: The Voronoi decomposition of a union of eight disks
in the plane and superimposed dual alpha complex.
Nerve. Similar to defining the Delaunay triangulation as
the dual of the Voronoi diagram, we define the alpha com-
plex as the dual of the Voronoi decomposition of the union
of disks. This time around, we do this more formally. Let-
ting C be a finite collection of sets, the nerve of C is the
system of subcollections that have a non-empty common
intersection,
Nrv C = {X ⊆ C | ⋂X ≠ ∅}.
This is an abstract simplicial complex since ⋂X ≠ ∅ and
Y ⊆ X implies ⋂Y ≠ ∅. For example, if C is the collec-
tion of Voronoi regions then Nrv C is an abstract version
of the Delaunay triangulation. More specifically, this is
true provided the points are in general position and in par-
ticular no four points lie on a common circle. We will as-
sume this for the remainder of this section. We say the De-
launay triangulation is a geometric realization of Nrv C,
namely the one obtained by mapping each Voronoi region
(a vertex in the abstract simplicial complex) to the gener-
ating point. All edges and triangles are just convex hulls
of their incident vertices. To go from the Delaunay trian-
gulation to the alpha complex, we substitute the regions
Ru(r) for the Vu. Specifically,
Alpha(r) = Nrv {Ru(r) | u ∈ S}.
Clearly, this is isomorphic to a subcomplex of the nerve
of Voronoi regions. We can therefore draw Alpha(r) as
a subcomplex of the Delaunay triangulation; see Figure
105. We call this geometric realization of Alpha(r) the
alpha complex for radius r, denoted as A(r). The alpha
shape for the same radius is the underlying space of the
alpha complex, |A(r)|.
The nerve preserves the way the union is connected.
In particular, their Betti numbers are the same, that is,
βp(U(r)) = βp(A(r)) for all dimensions p and all radii
r. This implies that the union and the alpha shape have
the same number of components and the same number of
holes. For example, in Figure 105 both have one compo-
nent and two holes. We omit the proof of this property.
Filtration. We are interested in the sequence of alpha
shapes as the radius grows from zero to infinity. Since
growing r grows the regions Ru(r), the nerve can only
get bigger. In other words, A(r) ⊆ A(s) whenever r ≤ s.
There are only finitely many subcomplexes of the Delau-
nay triangulation. Hence, we get a finite sequence of alpha
complexes. Writing Ai for the i-th alpha complex, we get
the following nested sequence,
S = A1 ⊂ A2 ⊂ . . . ⊂ Ak = D,
where D denotes the Delaunay triangulation of S. We
call such a sequence of complexes a filtration. We illus-
trate this construction in Figure 106.
Figure 106: A finite sequence of unions of disks, all decomposed
by the same Voronoi diagram.
The sequence of al-
pha complexes begins with a set of n isolated vertices, the
points in S. To go from one complex to the next, we either
add an edge, we add a triangle, or we add a pair consisting
of a triangle with one of its edges. In Figure 106, we be-
gin with eight vertices and get the following sequence of
alpha complexes.
A1 = {a, b, c, d, e, f, g, h};
A2 = A1 ∪ {ah};
A3 = A2 ∪ {bc};
A4 = A3 ∪ {ab, ef };
A5 = A4 ∪ {de};
A6 = A5 ∪ {gh};
A7 = A6 ∪ {cd};
A8 = A7 ∪ {fg};
A9 = A8 ∪ {cg}.
Going from A7 to A8, we get for the first time a 1-cycle,
which bounds a hole in the embedding. In A9, this hole is
cut into two. This is the alpha complex depicted in Figure
105. We continue.
A10 = A9 ∪ {cf };
A11 = A10 ∪ {abh, bh};
A12 = A11 ∪ {cde, ce};
A13 = A12 ∪ {cfg};
A14 = A13 ∪ {cef };
A15 = A14 ∪ {bch, ch};
A16 = A15 ∪ {cgh}.
At this moment, we have a triangulated disk but not yet the
entire Delaunay triangulation since the triangle bcd and the
edge bd are still missing. Each step is generic except when
we add two equally long edges to A3.
Compatible ordering of simplices. We can represent
the entire filtration of alpha complexes compactly by sort-
ing the simplices in the order they join the growing com-
plex. An ordering σ1, σ2, . . . , σm of the Delaunay sim-
plices is compatible with the filtration if
1. the simplices in Ai precede the ones not in Ai for
each i;
2. the faces of a simplex precede the simplex.
For example, the sequence
a, b, c, d, e, f, g, h; ah; bc; ab, ef ;
de; gh; cd; fg; cg; cf ; bh, abh; ce,
cde; cfg; cef ; ch, bch; cgh; bd; bcd
is compatible with the filtration in Figure 106. Every alpha
complex is a prefix of the compatible sequence but not
necessarily the other way round. Condition 2 guarantees
that every prefix is a complex, whether an alpha complex
or not. We thus get a finer filtration of complexes
∅ = K0 ⊂ K1 ⊂ . . . ⊂ Km = D,
where Ki is the set of simplices from σ1 to σi. To con-
struct the compatible ordering, we just need to compute
for each Delaunay simplex the radius ri = r(σi) such that
σi ∈ A(r) iff r ≥ ri. For a vertex, this radius is zero.
For a triangle, this is the radius of the circumcircle. For
an edge, we have two cases.
Figure 107: Left: the middle edge belongs to two acute triangles.
Right: it belongs to an obtuse and an acute triangle.
Let ϕ and ψ be the angles
opposite the edge σi inside the two incident triangles. We
have ϕ + ψ < 180° because of the empty circle property.
CASE 1. ϕ < 90° and ψ < 90°. Then ri = r(σi) is half
the length of the edge.
CASE 2. ϕ ≥ 90°. Then ri = rj, where σj is the incident
triangle with angle ϕ.
Both cases are illustrated in Figure 107. In Case 2, the
edge σi enters the growing alpha complex together with
the triangle σj. The total number of simplices in the De-
launay triangulation is m < 6n. The threshold radii can
be computed in time O(n). Sorting the simplices into
the compatible ordering can therefore be done in time
O(n log n).
Betti numbers. In two dimensions, Betti numbers can
be computed directly, without resorting to boundary matri-
ces. The only two possibly non-zero Betti numbers are β0,
the number of components, and β1, the number of holes.
We compute the Betti numbers of Kj by adding the sim-
plices in order.
β0 = β1 = 0;
for i = 1 to j do
switch dim σi:
case 0: β0 = β0 + 1;
case 1: let u, v be the endpoints of σi;
if FIND(u) = FIND(v) then β1 = β1 + 1
else β0 = β0 − 1;
UNION(u, v)
endif
case 2: β1 = β1 − 1
endswitch
endfor.
All we need is to tell apart the two cases when σi is an edge.
This is done using a union-find data structure maintaining
the components of the alpha complex in amortized time
α(n) per simplex. The total running time of the algorithm
for computing Betti numbers is therefore O(nα(n)).
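The following Python sketch transcribes this algorithm with a simple union-find using path halving; the input is a compatible ordering of the simplices and the names are illustrative. The last lines check it on the ordering of the complex A9 from above, which has one component and two holes.

    def incremental_betti(simplices):
        # vertices are single labels, edges are pairs, triangles are triples
        parent = {}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        beta0 = beta1 = 0
        for s in simplices:
            if not isinstance(s, tuple):            # a vertex
                parent[s] = s
                beta0 += 1
            elif len(s) == 2:                       # an edge
                u, v = (find(x) for x in s)
                if u == v:
                    beta1 += 1                      # the edge closes a 1-cycle
                else:
                    beta0 -= 1                      # the edge merges two components
                    parent[u] = v
            else:                                   # a triangle
                beta1 -= 1                          # the triangle fills a hole
        return beta0, beta1

    order = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h',
             ('a', 'h'), ('b', 'c'), ('a', 'b'), ('e', 'f'),
             ('d', 'e'), ('g', 'h'), ('c', 'd'), ('f', 'g'), ('c', 'g')]
    print(incremental_betti(order))    # (1, 2): one component and two holes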
Sixth Homework Assignment
Write the solution to each problem on a single page. The
deadline for handing in solutions is November 25.
Problem 1. (20 points). Let S be a set of n unit disks
in the Euclidean plane, each given by its center and
radius, which is one. Give an algorithm that decides
whether any two of the disks in S intersect.
Problem 2. (20 = 10 + 10 points). Let S be a set of
n points in the Euclidean plane. The Gabriel graph
connects points u, v ∈ S with a straight edge if
‖u − v‖^2 ≤ ‖u − p‖^2 + ‖v − p‖^2
for every point p in S.
(a) Show that the Gabriel graph is a subgraph of
the edge skeleton of the Delaunay triangulation.
(b) Is the Gabriel graph necessarily connected?
Justify your answer.
Problem 3. (20 = 10 + 10 points). Consider a set of n ≥
3 closed disks in the Euclidean plane. The disks are
allowed to touch but no two of them have an interior
point in common.
(a) Show that the number of touching pairs of disks
is at most 3n − 6.
(b) Give a construction that achieves the upper
bound in (a) for any n ≥ 3.
Problem 4. (20 = 10 + 10 points). Let K be a triangula-
tion of a set of n ≥ 3 points in the plane. Let L be a
line that avoids all the points.
(a) Prove that L intersects at most 2n − 4 of the
edges in K.
(b) Give a construction for which L achieves the
upper bound in (a) for any n ≥ 3.
Problem 5. (20 points). Let S be a set of n points in the
Euclidean plane, consider its Delaunay triangulation
and the corresponding filtration of alpha complexes,
S = A1 ⊂ A2 ⊂ . . . ⊂ Ak.
Under what conditions is it true that Ai and Ai+1 dif-
fer by a single simplex for every 1 ≤ i ≤ k − 1?
VII NP-COMPLETENESS
23 Easy and Hard Problems
24 NP-Complete Problems
25 Approximation Algorithms
Seventh Homework Assignment
23 Easy and Hard Problems
The theory of NP-completeness is an attempt to draw a
line between tractable and intractable problems. The most
important question is whether there is indeed a difference
between the two, and this question is still unanswered.
Typical results are therefore relative statements such as “if
problem B has a polynomial-time algorithm then so does
problem C” and its equivalent contra-positive “if prob-
lem C has no polynomial-time algorithm then neither has
problem B”. The second formulation suggests we remem-
ber hard problems C and for a new problem B we first see
whether we can prove the implication. If we can then we
may not want to even try to solve problem B efficiently. A
good deal of formalism is necessary for a proper descrip-
tion of results of this kind, of which we will introduce only
a modest amount.
What is a problem? An abstract decision problem is a
function I → {0, 1}, where I is the set of problem in-
stances and 0 and 1 are interpreted to mean FALSE and
TRUE, as usual. To completely formalize the notion, we
encode the problem instances in strings of zeros and ones:
I → {0, 1}∗. A concrete decision problem is then a func-
tion Q : {0, 1}∗ → {0, 1}. Following the usual conven-
tion, we map bit-strings that do not correspond to mean-
ingful problem instances to 0.
As an example consider the shortest-path problem. A
problem instance is a graph and a pair of vertices, u and
v, in the graph. A solution is a shortest path from u and
v, or the length of such a path. The decision problem ver-
sion specifies an integer k and asks whether or not there
exists a path from u to v whose length is at most k. The
theory of NP-completeness really only deals with deci-
sion problems. Although this is a loss of generality, the
loss is not dramatic. For example, given an algorithm for
the decision version of the shortest-path problem, we can
determine the length of the shortest path by repeated de-
cisions for different values of k. Decision problems are
always easier (or at least not harder) than the correspond-
ing optimization problems. So in order to prove that an
optimization problem is hard it suffices to prove that the
corresponding decision problem is hard.
Polynomial time. An algorithm solves a concrete deci-
sion problem Q in time T (n) if for every instance x ∈
{0, 1}∗ of length n the algorithm produces Q(x) in time
at most T (n). Note that this is the worst-case notion of
time-complexity. The problem Q is polynomial-time solv-
able if T (n) = O(n^k) for some constant k independent of
n. The first important complexity class of problems is
P = set of concrete decision problems
that are polynomial-time solvable.
The problems Q ∈ P are called tractable or easy and the
problems Q ∉ P are called intractable or hard. Algo-
rithms that take only polynomial time are called efficient
and algorithms that require more than polynomial time
are inefficient. In other words, until now in this course
we only talked about efficient algorithms and about easy
problems. This terminology is adopted because the rather
fine grained classification of algorithms by complexity we
practiced until now is not very useful in gaining insights
into the rather coarse distinction between polynomial and
non-polynomial.
It is convenient to recast the scenario in a formal lan-
guage framework. A language is a set L ⊆ {0, 1}∗. We
can think of it as the set of problem instances, x, that
have an affirmative answer, Q(x) = 1. An algorithm
A : {0, 1}∗ → {0, 1} accepts x ∈ {0, 1}∗ if A(x) = 1
and it rejects x if A(x) = 0. The language accepted by A
is the set of strings x ∈ {0, 1}∗ with A(x) = 1. There is
a subtle difference between accepting and deciding a lan-
guage L. The latter means that A accepts every x ∈ L and
rejects every x ∉ L. For example, there is an algorithm
that accepts every program that halts, but there is no algo-
rithm that decides the language of such programs. Within
the formal language framework we redefine the class of
polynomial-time solvable problems as
P = {L ⊆ {0, 1}∗ | L is accepted by a polynomial-time algorithm}
  = {L ⊆ {0, 1}∗ | L is decided by a polynomial-time algorithm}.
Indeed, a language that can be accepted in polynomial
time can also be decided in polynomial time: we keep
track of the time and if too much goes by without x be-
ing accepted, we turn around and reject x. This is a non-
constructive argument since we may not know the con-
stants in the polynomial. However, we know such con-
stants exist which suffices to show that a simulation as
sketched exists.
Hamiltonian cycles. We use a specific graph problem to
introduce the notion of verifying a solution to a problem,
as opposed to solving it. Let G = (V, E) be an undi-
rected graph. A hamiltonian cycle contains every vertex
v ∈ V exactly once. The graph G is hamiltonian if it has
a hamiltonian cycle. Figure 108 shows a hamiltonian cy-
cle of the edge graph of a Platonic solid. How about the
edge graphs of the other four Platonic solids?
Figure 108: The edge graph of the dodecahedron and one of its
hamiltonian cycles.
Define L = {G | G is hamiltonian}. We can thus ask whether or not
L ∈ P, that is, whether or not there is a polynomial-time
algorithm that decides whether or not a graph is hamilto-
nian. The answer to this question is currently not known,
but there is evidence that the answer might be negative. On
the other hand, suppose y is a hamiltonian cycle of G. The
language L′ = {(G, y) | y is a hamiltonian cycle of G} is
certainly in P because we just need to make sure that y
and G have the same number of vertices and every edge of
y is also an edge of G.
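A small Python sketch of such a verification algorithm; the graph representation (a dictionary of neighbor sets) and the names are illustrative.

    def verify_hamiltonian_cycle(graph, cycle):
        # The certificate 'cycle' must visit every vertex of 'graph' exactly
        # once, and cyclically consecutive vertices must be adjacent.
        if len(cycle) != len(graph) or set(cycle) != set(graph):
            return False
        return all(cycle[(i + 1) % len(cycle)] in graph[cycle[i]]
                   for i in range(len(cycle)))

    # A 4-cycle with one chord; (0, 1, 2, 3) is a hamiltonian cycle.
    square = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
    print(verify_hamiltonian_cycle(square, (0, 1, 2, 3)))   # True
    print(verify_hamiltonian_cycle(square, (0, 2, 1, 3)))   # False: 1-3 is not an edge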
Non-deterministic polynomial time. More generally, it
seems easier to verify a given solution than to come up
with one. In a nutshell, this is what NP-completeness is
about, namely finding out whether this is indeed the case
and whether the difference between accepting and verify-
ing can be used to separate hard from easy problems.
Call y ∈ {0, 1}∗ a certificate. An algorithm A verifies
a problem instance x ∈ {0, 1}∗ if there exists a certificate
y with A(x, y) = 1. The language verified by A is the set
of strings x ∈ {0, 1}∗ verified by A. We now define a new
class of problems,
NP = {L ⊆ {0, 1}∗ | L is verified by a polynomial-time algorithm}.
More formally, L is in NP if for every problem instance
x ∈ L there is a certificate y whose length is bounded
from above by a polynomial in the length of x such that
A(x, y) = 1 and A runs in polynomial time. For exam-
ple, deciding whether or not G is hamiltonian is in NP.
The name NP is an abbreviation for non-deterministic
polynomial time, because a non-deterministic computer
can guess a certificate and then verify that certificate. In a
parallel emulation, the computer would generate all possi-
ble certificates and then verify them in parallel. Generat-
ing one certificate is easy, because it only has polynomial
length, but generating all of them is hard, because there
are exponentially many strings of polynomial length.
Figure 109: Four possible relations between the complexity
classes P, NP, and co-NP.
Non-deterministic machines are at least as powerful as
deterministic machines. It follows that every problem in
P is also in NP, P ⊆ NP. Define
co-NP = {L | {x | x ∉ L} ∈ NP},
which is the class of languages whose complement can
be verified in non-deterministic polynomial time. It is
not known whether or not NP = co-NP. For example,
it seems easy to verify that a graph is hamiltonian but
it seems hard to verify that a graph is not hamiltonian.
We said earlier that if L ∈ P then its complement is also
in P. Therefore, P ⊆ co-NP. Hence, only the four relationships between
the three complexity classes shown in Figure 109 are pos-
sible, but at this time we do not know which one is correct.
Problem reduction. We now develop the concept of re-
ducing one problem to another, which is key in the con-
struction of the class of NP-complete problems. The idea
is to map or transform an instance of a first problem to an
instance of a second problem and to map the solution to
the second problem back to a solution to the first problem.
For decision problems, the solutions are the same and need
no transformation.
Language L1 is polynomial-time reducible to language L2, denoted L1 ≤P L2, if there is a polynomial-time computable function f : {0, 1}∗ → {0, 1}∗ such that x ∈ L1 iff f(x) ∈ L2, for all x ∈ {0, 1}∗. Now suppose that L1 is polynomial-time reducible to L2 and that L2 has a polynomial-time algorithm A2 that decides L2,

    x  --f-->  f(x)  --A2-->  {0, 1}.

We can compose the two algorithms and obtain a polynomial-time algorithm A1 = A2 ◦ f that decides L1. In other words, we gained an efficient algorithm for L1 just by reducing it to L2.
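As a sketch of this composition (not from the notes, with invented function names), a decider for L1 is obtained by chaining the reduction f with a decider for L2:

    def compose(f, decide_L2):
        # Chain a polynomial-time reduction f with a polynomial-time decider
        # for L2 to obtain a decider for L1: x is in L1 iff f(x) is in L2.
        def decide_L1(x):
            return decide_L2(f(x))
        return decide_L1

    # Toy example: L2 = binary strings ending in 0, f doubles the string;
    # then L1 = strings whose doubled form ends in 0, i.e. the same language.
    decide_L2 = lambda s: s.endswith("0")
    decide_L1 = compose(lambda s: s + s, decide_L2)
    print(decide_L1("10"))   # True

If both f and decide_L2 run in polynomial time, so does decide_L1, which is the content of the lemma below.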
REDUCTION LEMMA. If L1 ≤P L2 and L2 ∈ P then
L1 ∈ P.
In words, if L1 is polynomial-time reducible to L2 and
L2 is easy then L1 is also easy. Conversely, if we know
that L1 is hard then we can conclude that L2 is also hard.
This motivates the following definition. A language L ⊆ {0, 1}∗ is NP-complete if

(1) L ∈ NP;
(2) L′ ≤P L, for every L′ ∈ NP.

Since every L′ ∈ NP is polynomial-time reducible to L, all L′ have to be easy for L to have a chance to be easy. The L′ thus only provide evidence that L might indeed be hard. We say L is NP-hard if it satisfies (2) but not necessarily (1). The problems that satisfy (1) and (2) form the complexity class

    NPC = {L | L is NP-complete}.
All these definitions would not mean much if we could
not find any problems in NPC. The first step is the most
difficult one. Once we have one problem in NPC we can
get others using reductions.
Satisfying boolean formulas. Perhaps surprisingly, a
first NP-complete problem has been found, namely the
problem of satisfiability for logical expressions. A
boolean formula, ϕ, consists of variables, x1, x2, . . ., operators, ¬, ∧, ∨, =⇒, . . ., and parentheses. A truth assignment maps each variable to a boolean value, 0 or 1. A truth assignment satisfies the formula if the formula evaluates to 1. The formula is satisfiable if there exists a satisfying truth assignment. Define SAT = {ϕ | ϕ is satisfiable}. As an example consider the formula

    ψ = (x1 =⇒ x2) ⇐⇒ (x2 ∨ ¬x1).

If we set x1 = x2 = 1 we get (x1 =⇒ x2) = 1, (x2 ∨ ¬x1) = 1 and therefore ψ = 1. It follows that ψ ∈ SAT. In fact, ψ evaluates to 1 under every truth assignment, which means that ψ is really a tautology. More generally, a boolean formula, ϕ, is satisfiable iff ¬ϕ is not a tautology.
SATISFIABILITY THEOREM. We have SAT ∈ NP and L′ ≤P SAT for every L′ ∈ NP.

That SAT is in the class NP is easy to prove: just guess an assignment and verify that it satisfies the formula. However, to prove that every L′ ∈ NP can be reduced to SAT in polynomial time is quite technical and we omit the proof. The main idea is to use the polynomial-time algorithm that verifies L′ and to construct a boolean formula from this algorithm. To formalize this idea, we would need a formal model of a computer, a Turing machine, which is beyond the scope of this course.
24 NP-Complete Problems
In this section, we discuss a number of NP-complete problems, with the goal of developing a feeling for what hard problems look like. Recognizing hard problems is an important part of judging the difficulty of a problem reliably and of choosing the most promising approach to a solution.
Of course, for NP-complete problems, it seems futile to
work toward polynomial-time algorithms and instead we
would focus on finding approximations or circumventing
the problems altogether. We begin with a result on differ-
ent ways to write boolean formulas.
Reduction to 3-satisfiability. We call a boolean vari-
able or its negation a literal. The conjunctive normal
form is a sequence of clauses connected by ∧s, and each
clause is a sequence of literals connected by ∨s. A for-
mula is in 3-CNF if it is in conjunctive normal form and
each clause consists of three literals. It turns out that de-
ciding the satisfiability of a boolean formula in 3-CNF
is no easier than for a general boolean formula. Define
3-SAT = {ϕ ∈ SAT | ϕ is in 3-CNF}. We prove the
above claim by reducing SAT to 3-SAT.
SATISFIABILITY LEMMA. SAT ≤P 3-SAT.
PROOF. We take a boolean formula ϕ and transform it into
3-CNF in three steps.
Step 1. Think of ϕ as an expression and represent it as
a binary tree. Each node is an operation that gets the
input from its two children and forwards the output
to its parent. Introduce a new variable for the output
and define a new formula ϕ′ for each node, relating the two input edges with the one output edge. Figure 110 shows the tree representation of the formula ϕ = (x1 =⇒ x2) ⇐⇒ (x2 ∨ ¬x1). The new formula is

    ϕ′ = (y2 ⇐⇒ (x1 =⇒ x2)) ∧ (y3 ⇐⇒ (x2 ∨ ¬x1)) ∧ (y1 ⇐⇒ (y2 ⇐⇒ y3)) ∧ y1.

It should be clear that there is a satisfying assignment for ϕ iff there is one for ϕ′.
Step 2. Convert each clause into disjunctive normal form. The most mechanical way uses the truth table for each clause, as illustrated in Table 6. Each clause has at most three literals. For example, the negation of y2 ⇐⇒ (x1 =⇒ x2) is equivalent to the disjunction of the conjunctions in the rightmost column. It follows that y2 ⇐⇒ (x1 =⇒ x2) is equivalent to the negation of that disjunction, which by de Morgan's law is (y2 ∨ x1 ∨ x2) ∧ (y2 ∨ x1 ∨ ¬x2) ∧ (y2 ∨ ¬x1 ∨ ¬x2) ∧ (¬y2 ∨ ¬x1 ∨ x2).

Figure 110: The tree representation of the formula ϕ. Incidentally, ϕ is a tautology, which means it is satisfied by every truth assignment. Equivalently, ¬ϕ is not satisfiable.

    y2  x1  x2 | y2 ⇔ (x1 ⇒ x2) | prohibited
     0   0   0 |       0        | ¬y2 ∧ ¬x1 ∧ ¬x2
     0   0   1 |       0        | ¬y2 ∧ ¬x1 ∧ x2
     0   1   0 |       1        |
     0   1   1 |       0        | ¬y2 ∧ x1 ∧ x2
     1   0   0 |       1        |
     1   0   1 |       1        |
     1   1   0 |       0        | y2 ∧ x1 ∧ ¬x2
     1   1   1 |       1        |

Table 6: Conversion of a clause into a disjunction of conjunctions of at most three literals each.
Step 3. The clauses with fewer than three literals can
be expanded by adding new variables. For example
a ∨ b is expanded to (a ∨ b ∨ p) ∧ (a ∨ b ∨ ¬p) and
(a) is expanded to (a ∨ p ∨ q) ∧ (a ∨ p ∨ ¬q) ∧ (a ∨
¬p ∨ q) ∧ (a ∨ ¬p ∨ ¬q).
Each step takes only polynomial time. At the end, we get
an equivalent formula in 3-conjunctive normal form.
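Step 3 is easy to mechanize. The following sketch (not part of the notes) represents a literal as a pair of a variable name and a sign and pads short clauses with fresh variables; the helper names are hypothetical.

    import itertools

    _fresh = itertools.count()

    def fresh_variable():
        # hypothetical helper producing a brand-new variable name
        return "aux%d" % next(_fresh)

    def pad_clause(clause):
        # Expand a clause with fewer than three literals into an equivalent
        # conjunction of clauses with exactly three literals each (Step 3).
        if len(clause) == 3:
            return [clause]
        if len(clause) == 2:
            a, b = clause
            p = fresh_variable()
            return [[a, b, (p, True)], [a, b, (p, False)]]
        if len(clause) == 1:
            a = clause[0]
            p, q = fresh_variable(), fresh_variable()
            return [[a, (p, True), (q, True)], [a, (p, True), (q, False)],
                    [a, (p, False), (q, True)], [a, (p, False), (q, False)]]
        raise ValueError("expected a clause with one, two, or three literals")

    # Example: the one-literal clause (a) becomes four clauses of three literals.
    print(pad_clause([("a", True)]))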
We note that clauses of length three are necessary to
make the satisfiability problem hard. Indeed, there is a
polynomial-time algorithm that decides the satisfiability
of a formula in 2-CNF.
NP-completeness proofs. Using polynomial-time re-
ductions, we can show fairly mechanically that problems
are NP-complete, if they are. A key property is the transitivity of ≤P, that is, if L′ ≤P L1 and L1 ≤P L2 then L′ ≤P L2, as can be seen by composing the two polynomial-time computable functions to get a third one.

REDUCTION LEMMA. Let L1, L2 ⊆ {0, 1}∗ and assume L1 ≤P L2. If L1 is NP-hard and L2 ∈ NP then L2 ∈ NPC.
A generic NP-completeness proof thus follows the steps outlined below.

Step 1. Prove that L2 ∈ NP.
Step 2. Select a known NP-hard problem, L1, and find a polynomial-time computable function, f, with x ∈ L1 iff f(x) ∈ L2.

This is what we did for L2 = 3-SAT and L1 = SAT. Therefore 3-SAT ∈ NPC. Currently, there are thousands of problems known to be NP-complete. This is often considered evidence that P ≠ NP, which can be the case only if P ∩ NPC = ∅, as drawn in Figure 111.

Figure 111: Possible relation between P, NPC, and NP.
Cliques and independent sets. There are many NP-
complete problems on graphs. A typical such problem
asks for the largest complete subgraph. Define a clique
in an undirected graph G = (V, E) as a subgraph (W, F) with F = (W choose 2), that is, all pairs of vertices in W are edges. Given G and an integer k, the CLIQUE problem asks whether or not there is a clique of k or more vertices.
CLAIM. CLIQUE ∈ NPC.
PROOF. Given k vertices in G, we can verify in poly-
nomial time whether or not they form a complete graph.
Thus CLIQUE ∈ NP. To prove property (2), we show
that 3-SAT ≤P CLIQUE. Let ϕ be a boolean formula in
3-CNF consisting of k clauses. We construct a graph as
follows:
(i) each clause is replaced by three vertices;
(ii) two vertices are connected by an edge if they do not
belong to the same clause and they are not negations
of each other.
In a satisfying truth assignment, there is at least one true
literal in each clause. The true literals form a clique. Con-
versely, a clique of k or more vertices covers all clauses
and thus implies a satisfying truth assignment.
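The graph used in this reduction is easy to build explicitly. Below is a sketch (not from the notes) that takes a 3-CNF formula as a list of clauses, each clause a list of (variable, sign) literals, and returns the vertices and edges of the constructed graph; a clique of size k in this graph corresponds to a satisfying truth assignment.

    def clause_graph(clauses):
        # Vertices are (clause index, literal); two vertices are adjacent iff
        # they come from different clauses and are not negations of each other.
        vertices = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
        edges = set()
        for a in range(len(vertices)):
            for b in range(a + 1, len(vertices)):
                (i, (xa, sa)), (j, (xb, sb)) = vertices[a], vertices[b]
                if i != j and not (xa == xb and sa != sb):
                    edges.add((vertices[a], vertices[b]))
        return vertices, edges

    # Example: (x1 or x1 or x2) and (not x1 or not x1 or not x2), so k = 2.
    phi = [[("x1", True), ("x1", True), ("x2", True)],
           [("x1", False), ("x1", False), ("x2", False)]]
    V, E = clause_graph(phi)
    print(len(V), len(E))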
It is easy to decide in time O(k^2 n^{k+2}) whether or not a graph of n vertices has a clique of size k. If k is a constant, the running time of this algorithm is polynomial in n. For the CLIQUE problem to be NP-complete it is therefore essential that k be a variable that can be arbitrarily large.
We use the NP-completeness of finding large cliques to
prove the NP-completeness of large sets of pairwise non-
adjacent vertices. Let G = (V, E) be an undirected graph.
A subset W ⊆ V is independent if none of the vertices in
W are adjacent or, equivalently, if E ∩ (W choose 2) = ∅. Given G and an integer k, the INDEPENDENT SET problem asks whether or not there is an independent set of k or more vertices.
CLAIM. INDEPENDENT SET ∈ NPC.
PROOF. It is easy to verify that there is an independent set
of size k: just guess a subset of k vertices and verify that
no two are adjacent.
Figure 112: The four shaded vertices form an independent set in
the graph on the left and a clique in the complement graph on the
right.
We complete the proof by reducing the CLIQUE to the INDEPENDENT SET problem. As illustrated in Figure 112, W ⊆ V is independent iff W defines a clique in the complement graph, which has vertex set V and edge set (V choose 2) − E. To prove CLIQUE ≤P INDEPENDENT SET, we transform an instance H, k of the CLIQUE problem to the instance G, k of the INDEPENDENT SET problem, where G is the complement graph of H. G has an independent set of size k or larger iff H has a clique of size k or larger.
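The complement graph is just as easy to construct, which makes the reduction straightforward to implement. A sketch, not from the notes:

    from itertools import combinations

    def complement_graph(vertices, edges):
        # Edge set of the complement: all pairs of distinct vertices that are
        # not connected in the original graph (in either orientation).
        return {(u, v) for u, v in combinations(sorted(vertices), 2)
                if (u, v) not in edges and (v, u) not in edges}

    # A triangle on 1, 2, 3 plus the isolated vertex 4: the triangle is a clique
    # in H and an independent set in the complement graph.
    H_vertices = {1, 2, 3, 4}
    H_edges = {(1, 2), (2, 3), (1, 3)}
    print(complement_graph(H_vertices, H_edges))   # {(1, 4), (2, 4), (3, 4)}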
Various NP-complete graph problems. We now de-
scribe a few NP-complete problems for graphs without
proving that they are indeed NP-complete. Let G =
(V, E) be an undirected graph with n vertices and k a pos-
itive integer, as before. The following problems defined
for G and k are NP-complete.
An ℓ-coloring of G is a function χ : V → [ℓ] with χ(u) ≠ χ(v) whenever u and v are adjacent. The CHROMATIC NUMBER problem asks whether or not G has an ℓ-coloring with ℓ ≤ k. The problem remains NP-complete for fixed k ≥ 3. For k = 2, the CHROMATIC NUMBER problem asks whether or not G is bipartite, for which there is a polynomial-time algorithm.
The bandwidth of G is the minimum ℓ such that there
is a bijection β : V → [n] with |β(u) − β(v)| ≤ ℓ for
all adjacent vertices u and v. The BANDWIDTH problem
asks whether or not the bandwidth of G is k or less. The
problem arises in linear algebra, where we permute rows
and columns of a matrix to move all non-zero elements of
a square matrix as close to the diagonal as possible. For
example, if the graph is a simple path then the bandwidth
is 1, as can be seen in Figure 113. We can transform the adjacency matrix of G such that all non-zero diagonals are at most the bandwidth of G away from the main diagonal.

Figure 113: Simple path and adjacency matrix with rows and columns ordered along the path.
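To make the definition concrete, the width of one particular labeling is computed below (a sketch, not from the notes); the bandwidth of G is the minimum of this quantity over all bijections β : V → [n].

    def labeling_width(edges, beta):
        # beta maps each vertex to a distinct position 1..n; the width of the
        # labeling is the largest stretch |beta(u) - beta(v)| over all edges.
        return max(abs(beta[u] - beta[v]) for u, v in edges)

    # The simple path 1-2-3-4 labeled along the path has width 1.
    path_edges = [(1, 2), (2, 3), (3, 4)]
    print(labeling_width(path_edges, {1: 1, 2: 2, 3: 3, 4: 4}))   # 1
    print(labeling_width(path_edges, {1: 1, 2: 3, 3: 2, 4: 4}))   # 2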
Assume now that the graph G is complete, E = (V choose 2), and that each edge, uv, has a positive integer weight, w(uv). The TRAVELING SALESMAN problem asks whether there is a permutation u0, u1, . . . , un−1 of the vertices such that the sum of edges connecting contiguous vertices (and the last vertex to the first) is k or less,

    ∑_{i=0}^{n−1} w(ui ui+1) ≤ k,
where indices are taken modulo n. The problem remains NP-complete if w : E → {1, 2} (by a reduction from the HAMILTONIAN CYCLE problem), and also if the vertices are points
in the plane and the weight of an edge is the Euclidean
distance between the two endpoints.
Set systems. Simple graphs are set systems in which the
sets contain only two elements. We now list a few NP-
complete problems for more general set systems. Letting
V be a finite set, C ⊆ 2^V a set system, and k a positive integer, the following problems are NP-complete.
The PACKING problem asks whether or not C has k or
more mutually disjoint sets. The problem remains NP-
complete if no set in C contains more than three elements,
and there is a polynomial-time algorithm if every set con-
tains two elements. In the latter case, the set system is a
graph and a maximum packing is a maximum matching.
The COVERING problem asks whether or not C has k
or fewer subsets whose union is V . The problem remains
NP-complete if no set in C contains more than three ele-
ments, and there is a polynomial-time algorithm if every
set contains two elements. In the latter case, the set sys-
tem is a graph and the minimum cover can be constructed
in polynomial time from a maximum matching.
Suppose every element v ∈ V has a positive integer
weight, w(v). The PARTITION problem asks whether
there is a subset U ⊆ V with
    ∑_{u∈U} w(u) = ∑_{v∈V−U} w(v).
The problem remains NP-complete if we require that U
and V − U have the same number of elements.
25 Approximation Algorithms
Many important problems are NP-hard and just ignoring
them is not an option. There are indeed many things one
can do. For problems of small size, even exponential-
time algorithms can be effective and special subclasses
of hard problems sometimes have polynomial-time algo-
rithms. We consider a third coping strategy appropriate
for optimization problems, which is computing almost op-
timal solutions in polynomial time. In case the aim is to maximize a positive cost, a ̺(n)-approximation algorithm is one that guarantees to find a solution with cost C ≥ C∗/̺(n), where C∗ is the maximum cost. For minimization problems, we would require C ≤ C∗ · ̺(n). Note that ̺(n) ≥ 1 and if ̺(n) = 1 then the algorithm produces optimal solutions. Ideally, ̺ is a constant but sometimes even this is not achievable in polynomial time.
Vertex cover. The first problem we consider is finding
the minimum set of vertices in a graph G = (V, E) that
covers all edges. Formally, a subset V′ ⊆ V is a vertex cover if every edge has at least one endpoint in V′. Observe that V′ is a vertex cover iff V − V′ is an independent set. Finding a minimum vertex cover is therefore equivalent to finding a maximum independent set. Since the latter problem is NP-complete, we conclude that finding a minimum vertex cover is also NP-complete. Here is a straightforward algorithm that achieves approximation ratio ̺(n) = 2, for all n = |V|.

  V′ = ∅; E′ = E;
  while E′ ≠ ∅ do
    select an arbitrary edge uv in E′;
    add u and v to V′;
    remove all edges incident to u or v from E′
  endwhile.
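A compact Python version of the same idea follows (a sketch, not part of the notes); the graph is given as a list of edges and the loop examines every edge once.

    def vertex_cover_2approx(edges):
        # Repeatedly pick an edge that is still uncovered and add both of its
        # endpoints to the cover; the selected edges form a matching.
        cover = set()
        for u, v in edges:
            if u not in cover and v not in cover:   # edge uv is still uncovered
                cover.add(u)
                cover.add(v)
        return cover

    # Example: the path 1-2-3-4; the optimum cover {2, 3} has size 2, and the
    # algorithm may return a cover of size 4, within the factor of 2.
    print(vertex_cover_2approx([(1, 2), (2, 3), (3, 4)]))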
Clearly, V′ is a vertex cover. Using adjacency lists with links between the two copies of an edge, the running time is O(n + m), where m is the number of edges. Furthermore, we have ̺ = 2 because every cover must pick at least one vertex of each edge uv selected by the algorithm, hence C ≤ 2C∗. Observe that this result does not imply a constant approximation ratio for the maximum independent set problem. We have |V − V′| = n − C ≥ n − 2C∗, which we have to compare with n − C∗, the size of the maximum independent set. For C∗ = n/2, the approximation ratio is unbounded.
Let us contemplate the argument we used to relate C and C∗. The set of edges uv selected by the algorithm is a matching, that is, a subset of the edges so that no two share a vertex. The size of the minimum vertex cover is at least the size of the largest possible matching. The algorithm finds a matching and since it picks two vertices per edge, we are guaranteed at most twice as many vertices as needed. This pattern of bounding C∗ by the size of another quantity (in this case the size of the largest matching) is common in the analysis of approximation algorithms. Incidentally, for bipartite graphs, the size of the largest matching is equal to the size of the smallest vertex cover. Furthermore, there is a polynomial-time algorithm for computing them.
Traveling salesman. Second, we consider the traveling
salesman problem, which is formulated for a complete
graph G = (V, E) with a positive integer cost function
c : E → Z+. A tour in this graph is a Hamiltonian cycle and the problem is finding the tour, A, with minimum total cost, c(A) = ∑_{uv∈A} c(uv). Let us first assume that the cost function satisfies the triangle inequality, c(uw) ≤ c(uv) + c(vw) for all u, v, w ∈ V. It can be shown that the problem of finding the shortest tour remains NP-complete even if we restrict it to weighted graphs that satisfy this inequality. We formulate an algorithm based on the observation that the cost of every tour is at least the cost of the minimum spanning tree, C∗ ≥ c(T).
1 Construct the minimum spanning tree T of G.
2 Return the preorder sequence of vertices in T .
Using Prim's algorithm for the minimum spanning tree, the running time is O(n^2). Figure 114 illustrates the algorithm. The preorder sequence is only defined if we have a root and the neighbors of each vertex are ordered, but we may choose both arbitrarily.

Figure 114: The solid minimum spanning tree, the dotted traversal using each edge of the tree twice, and the solid tour obtained by taking short-cuts.

The cost of the returned
tour is at most twice the cost of the minimum spanning
tree. To see this, consider traversing each edge of the min-
imum spanning tree twice, once in each direction. When-
ever a vertex is visited more than once, we take the direct
edge connecting the two neighbors of the second copy as a
short-cut. By the triangle inequality, this substitution can
only decrease the overall cost of the traversal. It follows
that C ≤ 2c(T) ≤ 2C∗.
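Under the triangle inequality, the two steps of the algorithm can be sketched as follows (not from the notes); for brevity it uses a simple O(n^2) Prim-style construction of the minimum spanning tree and a stack-based preorder traversal.

    def tsp_2approx(vertices, cost):
        # cost[(u, v)] = cost[(v, u)] is a metric weight on the complete graph.
        vertices = list(vertices)
        root = vertices[0]
        in_tree = {root}
        children = {v: [] for v in vertices}
        # Step 1: Prim's algorithm for the minimum spanning tree T.
        while len(in_tree) < len(vertices):
            u, v = min(((a, b) for a in in_tree for b in vertices if b not in in_tree),
                       key=lambda e: cost[e])
            children[u].append(v)
            in_tree.add(v)
        # Step 2: return the preorder sequence of vertices in T as the tour.
        tour, stack = [], [root]
        while stack:
            x = stack.pop()
            tour.append(x)
            stack.extend(reversed(children[x]))
        return tour

    # Example: four points with Manhattan distances, which satisfy the
    # triangle inequality.
    V = ["a", "b", "c", "d"]
    coords = {"a": (0, 0), "b": (0, 1), "c": (1, 0), "d": (1, 1)}
    cost = {(u, v): abs(coords[u][0] - coords[v][0]) + abs(coords[u][1] - coords[v][1])
            for u in V for v in V if u != v}
    print(tsp_2approx(V, cost))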
The triangle inequality is essential in finding a constant
approximation. Indeed, without it we can construct in-
stances of the problem for which finding a constant ap-
proximation is NP-hard. To see this, transform an unweighted graph G′ = (V′, E′) to the complete weighted graph G = (V, E) with

    c(uv) = 1 if uv ∈ E′, and c(uv) = ̺n + 1 otherwise.

Any ̺-approximation algorithm must return the Hamiltonian cycle of G′, if there is one.
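A sketch of this construction (not from the notes): given the edge set of G′ and the target ratio ̺ (called rho below), assign the weights of the complete graph G.

    from itertools import combinations

    def hardness_gadget(vertices, edges, rho):
        # Edges of G' get weight 1; all other pairs get weight rho*n + 1, so a
        # rho-approximate tour of total weight at most rho*n can only use
        # edges of G', i.e. it must be a hamiltonian cycle of G'.
        n = len(vertices)
        cost = {}
        for u, v in combinations(sorted(vertices), 2):
            in_g = (u, v) in edges or (v, u) in edges
            cost[(u, v)] = cost[(v, u)] = 1 if in_g else rho * n + 1
        return cost

    print(hardness_gadget({1, 2, 3}, {(1, 2), (2, 3)}, rho=2))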
Set cover. Third, we consider the problem of covering
a set X with sets chosen from a set system F. We as-
sume the set is the union of sets in the system, X = ⋃F. More precisely, we are looking for a smallest subsystem F′ ⊆ F with X = ⋃F′. The cost of this subsystem is the number of sets it contains, |F′|. See Figure 115 for an illustration of the problem.

Figure 115: The set X of twelve dots can be covered with four of the five sets in the system.

The vertex cover problem is a special case: X = E and F contains all subsets of
edges incident to a common vertex. It is special because
each element (edge) belongs to exactly two sets. Since we
no longer have a bound on the number of sets containing
a single element, it is not surprising that the algorithm for
vertex covers does not extend to a constant-approximation
algorithm for set covers. Instead, we consider the follow-
ing greedy approach that selects, at each step, the set con-
taining the maximum number of yet uncovered elements.
  F′ = ∅; X′ = X;
  while X′ ≠ ∅ do
    select S ∈ F maximizing |S ∩ X′|;
    F′ = F′ ∪ {S}; X′ = X′ − S
  endwhile.
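In Python, the greedy rule reads as follows (a sketch, not from the notes); the set system is a dictionary from set names to Python sets.

    def greedy_set_cover(X, system):
        # At each step pick the set that covers the most yet-uncovered elements.
        uncovered = set(X)
        chosen = []
        while uncovered:
            best = max(system, key=lambda name: len(system[name] & uncovered))
            if not system[best] & uncovered:
                raise ValueError("the system does not cover X")
            chosen.append(best)
            uncovered -= system[best]
        return chosen

    X = {1, 2, 3, 4, 5, 6}
    F = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5, 6}, "S4": {1, 4}}
    print(greedy_set_cover(X, F))   # for example ['S1', 'S3']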
Using a sparse matrix representation of the set system
(similar to an adjacency list representation of a graph), we
can run the algorithm in time proportional to the total size
of the sets in the system, n = ∑_{S∈F} |S|. We omit the details.
Analysis. More interesting than the running time is the
analysis of the approximation ratio the greedy algorithm
achieves. It is convenient to have short notation for the d-th harmonic number, Hd = ∑_{i=1}^{d} 1/i for d ≥ 0. Recall that Hd ≤ 1 + ln d for d ≥ 1. Let the size of the largest set in the system be m = max{|S| | S ∈ F}.
CLAIM. The greedy method is an Hm-approximation al-
gorithm for the set cover problem.
PROOF. For each set S selected by the algorithm, we distribute $1 over the |S ∩ X′| elements covered for the first time. Let cx be the cost allocated this way to x ∈ X. We have |F′| = ∑_{x∈X} cx. If x is covered the first time by the i-th selected set, Si, then

    cx = 1 / |Si − (S1 ∪ . . . ∪ Si−1)|.
We have |F′| ≤ ∑_{S∈F∗} ∑_{x∈S} cx because the optimal cover, F∗, contains each element x at least once. We will prove shortly that ∑_{x∈S} cx ≤ H|S| for every set S ∈ F. It follows that

    |F′| ≤ ∑_{S∈F∗} H|S| ≤ Hm · |F∗|,

as claimed.
For m = 3, we get ̺ = H3 = 11/6. This implies that for graphs with vertex-degrees at most 3, the greedy algorithm guarantees a vertex cover of size at most 11/6 times the optimum, which is better than the ratio 2 guaranteed by our first algorithm.
We still need to prove that the sum of costs cx over the
elements of a set S in the system is bounded from above
by H|S|. Let ui be the number of elements in S that are
not covered by the first i selected sets, ui = |S − (S1 ∪
. . . ∪ Si)|, and observe that the numbers do not increase.
Let uk−1 be the last non-zero number in the sequence, so |S| = u0 ≥ . . . ≥ uk−1 > uk = 0. Since ui−1 − ui is the number of elements in S covered the first time by Si, we have

    ∑_{x∈S} cx = ∑_{i=1}^{k} (ui−1 − ui) / |Si − (S1 ∪ . . . ∪ Si−1)|.
We also have ui−1 ≤ |Si − (S1 ∪ . . . ∪ Si−1)|, for all
i ≤ k, because of the greedy choice of Si. If this were
not the case, the algorithm would have chosen S instead
of Si in the construction of F′. The problem thus reduces to bounding the sum of ratios (ui−1 − ui)/ui−1. It is not difficult to see that this sum can be at least logarithmic in the size of S. Indeed, if we choose ui about half the size of ui−1, for all i ≥ 1, then we have logarithmically many terms, each roughly 1/2. We use a sequence of simple arithmetic manipulations to prove that this lower bound is asymptotically tight:

    ∑_{x∈S} cx ≤ ∑_{i=1}^{k} (ui−1 − ui)/ui−1 = ∑_{i=1}^{k} ∑_{j=ui+1}^{ui−1} 1/ui−1.
We now replace the denominator by j ≤ ui−1 to form a
telescoping series of harmonic numbers and get
    ∑_{x∈S} cx ≤ ∑_{i=1}^{k} ∑_{j=ui+1}^{ui−1} 1/j = ∑_{i=1}^{k} ( ∑_{j=1}^{ui−1} 1/j − ∑_{j=1}^{ui} 1/j ) = ∑_{i=1}^{k} (Hui−1 − Hui).

This is equal to Hu0 − Huk = H|S|, which fills the gap left in the analysis of the greedy algorithm.
Seventh Homework Assignment
The purpose of this assignment is to help you prepare for
the final exam. Solutions will neither be graded nor even
collected.
Problem 1. (20 = 5 + 15 points). Consider the class
of satisfiable boolean formulas in conjunctive nor-
mal form in which each clause contains two literals,
2-SAT = {ϕ ∈ SAT | ϕ is 2-CNF}.
(a) Is 2-SAT ∈ NP?
(b) Is there a polynomial-time algorithm for decid-
ing whether or not a boolean formula in 2-CNF
is satisfiable? If your answer is yes, then de-
scribe and analyze your algorithm. If your an-
swer is no, then show that 2-SAT ∈ NPC.
Problem 2. (20 points). Let A be a finite set and f a func-
tion that maps every a ∈ A to a positive integer f(a).
The PARTITION problem asks whether or not there is
a subset B ⊆ A such that
    ∑_{b∈B} f(b) = ∑_{a∈A−B} f(a).
We have learned that the PARTITION problem is
NP-complete. Given positive integers j and k, the
SUM OF SQUARES problem asks whether or not
A can be partitioned into j disjoint subsets, A =
B1 ∪ B2 ∪ · · · ∪ Bj, such that

    ∑_{i=1}^{j} ( ∑_{a∈Bi} f(a) )^2 ≤ k.
Prove that the SUM OF SQUARES problem is NP-
complete.
Problem 3. (20 = 10+10 points). Let G be an undirected
graph. A path in G is simple if it contains each ver-
tex at most once. Specifying two vertices u, v and a
positive integer k, the LONGEST PATH problem asks
whether or not there is a simple path connecting u
and v whose length is k or longer.
(a) Give a polynomial-time algorithm for the
LONGEST PATH problem or show that it is NP-
hard.
(b) Revisit (a) under the assumption that G is di-
rected and acyclic.
Problem 4. (20 = 10 + 10 points). Let A ⊆ 2^V be an abstract simplicial complex over the finite set V and let k be a positive integer.
(a) Is it NP-hard to decide whether A has k or more
disjoint simplices?
(b) Is it NP-hard to decide whether A has k or
fewer simplices whose union is V ?
Problem 5. (20 points). Let G = (V, E) be an undi-
rected, bipartite graph and recall that there is a
polynomial-time algorithm for constructing a max-
imum matching. We are interested in computing a
minimum set of matchings such that every edge of
the graph is a member of at least one of the selected
matchings. Give a polynomial-time algorithm con-
structing an O(log n) approximation for this prob-
lem.
95

More Related Content

PDF
Sienna 4 divideandconquer
PPTX
Introduction to Algorithms
PPTX
Divide and Conquer
PPT
MergesortQuickSort.ppt
PPT
presentation_mergesortquicksort_1458716068_193111.ppt
Sienna 4 divideandconquer
Introduction to Algorithms
Divide and Conquer
MergesortQuickSort.ppt
presentation_mergesortquicksort_1458716068_193111.ppt

Similar to Book.pdf01_Intro.ppt algorithm for preperation stu used (20)

PPT
Algorithm.ppt
PPT
Lec 6 Divide and conquer of Data Structures & Algortihms
PPTX
UNIT V Searching Sorting Hashing Techniques [Autosaved].pptx
PPTX
UNIT V Searching Sorting Hashing Techniques [Autosaved].pptx
PPT
Introduction to basic algorithm knowledge.ppt
PPTX
Quick sort.pptx
PDF
Data Structures and Algorithm Analysis in C++, 3rd Edition by Dr. Clifford A....
PPTX
Lecture 3.3.4 Quick sort.pptxIIIIIIIIIII
PPTX
Sortings .pptx
PPTX
Algorithim lec1.pptx
PPT
lecture4.ppt
PPTX
Merge sort and quick sort
PPT
s4_quick_sort.ppt
PDF
Skiena algorithm 2007 lecture08 quicksort
PPTX
data structures and algorithms Unit 3
PDF
Divide and conquer
DOCX
Merge sort lab mannual
PPTX
09 QUICK SORT Design and Analysis of algorithms
PDF
Introduction To Algorithms 4th Thomas H Cormen Charles E Leiserson
Algorithm.ppt
Lec 6 Divide and conquer of Data Structures & Algortihms
UNIT V Searching Sorting Hashing Techniques [Autosaved].pptx
UNIT V Searching Sorting Hashing Techniques [Autosaved].pptx
Introduction to basic algorithm knowledge.ppt
Quick sort.pptx
Data Structures and Algorithm Analysis in C++, 3rd Edition by Dr. Clifford A....
Lecture 3.3.4 Quick sort.pptxIIIIIIIIIII
Sortings .pptx
Algorithim lec1.pptx
lecture4.ppt
Merge sort and quick sort
s4_quick_sort.ppt
Skiena algorithm 2007 lecture08 quicksort
data structures and algorithms Unit 3
Divide and conquer
Merge sort lab mannual
09 QUICK SORT Design and Analysis of algorithms
Introduction To Algorithms 4th Thomas H Cormen Charles E Leiserson
Ad

More from archu26 (6)

PPTX
811109685-CS3401-Algorithms-Unit-IV.pptx
PPT
7.2 Cook's Theorem.ppt 01_Intro.ppt algorithm for preperation stu used
PPT
02-asymp.ppt01_Intro.ppt algorithm for preperation stu used
PPT
01_Intro.ppt algorithm for preperation stu used
PPTX
algorithm cs3401 ppt unit 3 - cover all topics
PPTX
UNIT DAA PPT cover all topics 2021 regulation
811109685-CS3401-Algorithms-Unit-IV.pptx
7.2 Cook's Theorem.ppt 01_Intro.ppt algorithm for preperation stu used
02-asymp.ppt01_Intro.ppt algorithm for preperation stu used
01_Intro.ppt algorithm for preperation stu used
algorithm cs3401 ppt unit 3 - cover all topics
UNIT DAA PPT cover all topics 2021 regulation
Ad

Recently uploaded (20)

PDF
composite construction of structures.pdf
PPTX
Lecture Notes Electrical Wiring System Components
PPTX
UNIT-1 - COAL BASED THERMAL POWER PLANTS
PPTX
additive manufacturing of ss316l using mig welding
PPTX
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PPTX
web development for engineering and engineering
PPTX
Welding lecture in detail for understanding
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PDF
Operating System & Kernel Study Guide-1 - converted.pdf
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PPT
Mechanical Engineering MATERIALS Selection
PPT
Project quality management in manufacturing
PPTX
OOP with Java - Java Introduction (Basics)
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
PPTX
Geodesy 1.pptx...............................................
PPTX
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
composite construction of structures.pdf
Lecture Notes Electrical Wiring System Components
UNIT-1 - COAL BASED THERMAL POWER PLANTS
additive manufacturing of ss316l using mig welding
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
web development for engineering and engineering
Welding lecture in detail for understanding
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
Operating System & Kernel Study Guide-1 - converted.pdf
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
Mechanical Engineering MATERIALS Selection
Project quality management in manufacturing
OOP with Java - Java Introduction (Basics)
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
Geodesy 1.pptx...............................................
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx

Book.pdf01_Intro.ppt algorithm for preperation stu used

  • 1. CPS 230 DESIGN AND ANALYSIS OF ALGORITHMS Fall 2008 Instructor: Herbert Edelsbrunner Teaching Assistant: Zhiqiang Gu
  • 2. CPS 230 Fall Semester of 2008 Table of Contents 1 Introduction 3 I DESIGN TECHNIQUES 4 2 Divide-and-Conquer 5 3 Prune-and-Search 8 4 Dynamic Programming 11 5 Greedy Algorithms 14 First Homework Assignment 17 II SEARCHING 18 6 Binary Search Trees 19 7 Red-Black Trees 22 8 Amortized Analysis 26 9 Splay Trees 29 Second Homework Assignment 33 III PRIORITIZING 34 10 Heaps and Heapsort 35 11 Fibonacci Heaps 38 12 Solving Recurrence Relations 41 Third Homework Assignment 44 IV GRAPH ALGORITHMS 45 13 Graph Search 46 14 Shortest Paths 50 15 Minimum Spanning Trees 53 16 Union-Find 56 Fourth Homework Assignment 60 V TOPOLOGICAL ALGORITHMS 61 17 Geometric Graphs 62 18 Surfaces 65 19 Homology 68 Fifth Homework Assignment 72 VI GEOMETRIC ALGORITHMS 73 20 Plane-Sweep 74 21 Delaunay Triangulations 77 22 Alpha Shapes 81 Sixth Homework Assignment 84 VII NP-COMPLETENESS 85 23 Easy and Hard Problems 86 24 NP-Complete Problems 89 25 Approximation Algorithms 92 Seventh Homework Assignment 95 2
  • 3. 1 Introduction Meetings. We meet twice a week, on Tuesdays and Thursdays, from 1:15 to 2:30pm, in room D106 LSRC. Communication. The course material will be delivered in the two weekly lectures. A written record of the lec- tures will be available on the web, usually a day after the lecture. The web also contains other information, such as homework assignments, solutions, useful links, etc. The main supporting text is TARJAN. Data Structures and Network Algorithms. SIAM, 1983. The book focuses on fundamental data structures and graph algorithms, and additional topics covered in the course can be found in the lecture notes or other texts in algorithms such as KLEINBERG AND TARDOS. Algorithm Design. Pearson Ed- ucation, 2006. Examinations. There will be a final exam (covering the material of the entire semester) and a midterm (at the be- ginning of October), You may want to freshen up your math skills before going into this course. The weighting of exams and homework used to determine your grades is homework 35%, midterm 25%, final 40%. Homework. We have seven homeworks scheduled throughout this semester, one per main topic covered in the course. The solutions to each homework are due one and a half weeks after the assignment. More precisely, they are due at the beginning of the third lecture after the assignment. The seventh homework may help you prepare for the final exam and solutions will not be collected. Rule 1. The solution to any one homework question must fit on a single page (together with the statement of the problem). Rule 2. The discussion of questions and solutions before the due date is not discouraged, but you must formu- late your own solution. Rule 3. The deadline for turning in solutions is 10 min- utes after the beginning of the lecture on the due date. Overview. The main topics to be covered in this course are I Design Techniques; II Searching; III Prioritizing; IV Graph Algorithms; V Topological Algorithms; VI Geometric Algorithms; VII NP-completeness. The emphasis will be on algorithm design and on algo- rithm analysis. For the analysis, we frequently need ba- sic mathematical tools. Think of analysis as the measure- ment of the quality of your design. Just like you use your sense of taste to check your cooking, you should get into the habit of using algorithm analysis to justify design de- cisions when you write an algorithm or a computer pro- gram. This is a necessary step to reach the next level in mastering the art of programming. I encourage you to im- plement new algorithms and to compare the experimental performance of your program with the theoretical predic- tion gained through analysis. 3
  • 4. I DESIGN TECHNIQUES 2 Divide-and-Conquer 3 Prune-and-Search 4 Dynamic Programming 5 Greedy Algorithms First Homework Assignment 4
  • 5. 2 Divide-and-Conquer We use quicksort as an example for an algorithm that fol- lows the divide-and-conquer paradigm. It has the repu- tation of being the fasted comparison-based sorting algo- rithm. Indeed it is very fast on the average but can be slow for some input, unless precautions are taken. The algorithm. Quicksort follows the general paradigm of divide-and-conquer, which means it divides the un- sorted array into two, it recurses on the two pieces, and it finally combines the two sorted pieces to obtain the sorted array. An interesting feature of quicksort is that the divide step separates small from large items. As a consequence, combining the sorted pieces happens automatically with- out doing anything extra. void QUICKSORT(int ℓ, r) if ℓ < r then m = SPLIT(ℓ, r); QUICKSORT(ℓ, m − 1); QUICKSORT(m + 1, r) endif. We assume the items are stored in A[0..n − 1]. The array is sorted by calling QUICKSORT(0, n − 1). Splitting. The performance of quicksort depends heav- ily on the performance of the split operation. The effect of splitting from ℓ to r is: • x = A[ℓ] is moved to its correct location at A[m]; • no item in A[ℓ..m − 1] is larger than x; • no item in A[m + 1..r] is smaller than x. Figure 1 illustrates the process with an example. The nine items are split by moving a pointer i from left to right and another pointer j from right to left. The process stops when i and j cross. To get splitting right is a bit delicate, in particular in special cases. Make sure the algorithm is correct for (i) x is smallest item, (ii) x is largest item, (iii) all items are the same. int SPLIT(int ℓ, r) x = A[ℓ]; i = ℓ; j = r + 1; repeat repeat i++ until x ≤ A[i]; repeat j-- until x ≥ A[j]; if i < j then SWAP(i, j) endif until i ≥ j; SWAP(ℓ, j); return j. i j m j i 1 9 3 5 4 2 4 1 9 2 7 5 7 4 2 9 4 2 1 7 3 3 5 4 2 4 2 Figure 1: First, i and j stop at items 9 and 1, which are then swapped. Second, i and j cross and the pivot, 7, is swapped with item 2. Special cases (i) and (iii) are ok but case (ii) requires a stopper at A[r + 1]. This stopper must be an item at least as large as x. If r < n − 1 this stopper is automatically given. For r = n − 1, we create such a stopper by setting A[n] = +∞. Running time. The actions taken by quicksort can be expressed using a binary tree: each (internal) node repre- sents a call and displays the length of the subarray; see Figure 2. The worst case occurs when A is already sorted. 1 2 1 1 2 5 7 9 1 Figure 2: The total amount of time is proportional to the sum of lengths, which are the numbers of nodes in the corresponding subtrees. In the displayed case this sum is 29. In this case the tree degenerates to a list without branch- ing. The sum of lengths can be described by the following recurrence relation: T (n) = n + T (n − 1) = n X i=1 i = n + 1 2 . The running time in the worst case is therefore in O(n2 ). In the best case the tree is completely balanced and the sum of lengths is described by the recurrence relation T (n) = n + 2 · T n − 1 2 . 5
  • 6. If we assume n = 2k − 1 we can rewrite the relation as U(k) = (2k − 1) + 2 · U(k − 1) = (2k − 1) + 2(2k−1 − 1) + . . . + 2k−1 (2 − 1) = k · 2k − k−1 X i=0 2i = 2k · k − (2k − 1) = (n + 1) · log2(n + 1) − n. The running time in the best case is therefore in O(n log n). Randomization. One of the drawbacks of quicksort, as described until now, is that it is slow on rather common almost sorted sequences. The reason are pivots that tend to create unbalanced splittings. Such pivots tend to oc- cur in practice more often than one might expect. Hu- man and often also machine generated data is frequently biased towards certain distributions (in this case, permuta- tions), and it has been said that 80% of the time or more, sorting is done on either already sorted or almost sorted files. Such situations can often be helped by transferring the algorithm’s dependence on the input data to internally made random choices. In this particular case, we use ran- domization to make the choice of the pivot independent of the input data. Assume RANDOM(ℓ, r) returns an integer p ∈ [ℓ, r] with uniform probability: Prob[RANDOM(ℓ, r) = p] = 1 r − ℓ + 1 for each ℓ ≤ p ≤ r. In other words, each p ∈ [ℓ, r] is equally likely. The following algorithm splits the array with a random pivot: int RSPLIT(int ℓ, r) p = RANDOM(ℓ, r); SWAP(ℓ, p); return SPLIT(ℓ, r). We get a randomized implementation by substituting RSPLIT for SPLIT. The behavior of this version of quick- sort depends on p, which is produced by a random number generator. Average analysis. We assume that the items in A[0..n− 1] are pairwise different. The pivot splits A into A[0..m − 1], A[m], A[m + 1..n − 1]. By assumption on function RSPLIT, the probability for each m ∈ [0, n − 1] is 1 n . Therefore the average sum of array lengths split by QUICKSORT is T (n) = n + 1 n · n−1 X m=0 (T (m) + T (n − m − 1)) . To simplify, we multiply with n and obtain a second rela- tion by substituting n − 1 for n: n · T (n) = n2 + 2 · n−1 X i=0 T (i), (1) (n − 1) · T (n − 1) = (n − 1)2 + 2 · n−2 X i=0 T (i). (2) Next we subtract (2) from (1), we divide by n(n + 1), we use repeated substitution to express T (n) as a sum, and finally split the sum in two: T (n) n + 1 = T (n − 1) n + 2n − 1 n(n + 1) = T (n − 2) n − 1 + 2n − 3 (n − 1)n + 2n − 1 n(n + 1) = n X i=1 2i − 1 i(i + 1) = 2 · n X i=1 1 i + 1 − n X i=1 1 i(i + 1) . Bounding the sums. The second sum is solved directly by transformation to a telescoping series: n X i=1 1 i(i + 1) = n X i=1 1 i − 1 i + 1 = 1 − 1 n + 1 . The first sum is bounded from above by the integral of 1 x for x ranging from 1 to n + 1; see Figure 3. The sum of 1 i+1 is the sum of areas of the shaded rectangles, and because all rectangles lie below the graph of 1 x we get a bound for the total rectangle area: n X i=1 1 i + 1 Z n+1 1 dx x = ln(n + 1). 6
  • 7. x x 1/ 4 3 2 1 Figure 3: The areas of the rectangles are the terms in the sum, and the total rectangle area is bounded by the integral from 1 through n + 1. We plug this bound back into the expression for the aver- age running time: T (n) (n + 1) · n X i=1 2 i + 1 2 · (n + 1) · ln(n + 1) = 2 log2 e · (n + 1) · log2(n + 1). In words, the running time of quicksort in the average case is only a factor of about 2/ log2 e = 1.386 . . . slower than in the best case. This also implies that the worst case can- not happen very often, for else the average performance would be slower. Stack size. Another drawback of quicksort is the recur- sion stack, which can reach a size of Ω(n) entries. This can be improved by always first sorting the smaller side and simultaneously removing the tail-recursion: void QUICKSORT(int ℓ, r) i = ℓ; j = r; while i j do m = RSPLIT(i, j); if m − i j − m then QUICKSORT(i, m − 1); i = m + 1 else QUICKSORT(m + 1, j); j = m − 1 endif endwhile. In each recursive call to QUICKSORT, the length of the ar- ray is at most half the length of the array in the preceding call. This implies that at any moment of time the stack contains no more than 1 + log2 n entries. Note that with- out removal of the tail-recursion, the stack can reach Ω(n) entries even if the smaller side is sorted first. Summary. Quicksort incorporates two design tech- niques to efficiently sort n numbers: divide-and-conquer for reducing large to small problems and randomization for avoiding the sensitivity to worst-case inputs. The av- erage running time of quicksort is in O(n log n) and the extra amount of memory it requires is in O(log n). For the deterministic version, the average is over all n! per- mutations of the input items. For the randomized version the average is the expected running time for every input sequence. 7
  • 8. 3 Prune-and-Search We use two algorithms for selection as examples for the prune-and-search paradigm. The problem is to find the i-smallest item in an unsorted collection of n items. We could first sort the list and then return the item in the i-th position, but just finding the i-th item can be done faster than sorting the entire list. As a warm-up exercise consider selecting the 1-st or smallest item in the unsorted array A[1..n]. min = 1; for j = 2 to n do if A[j] A[min] then min = j endif endfor. The index of the smallest item is found in n − 1 com- parisons, which is optimal. Indeed, there is an adversary argument, that is, with fewer than n − 1 comparisons we can change the minimum without changing the outcomes of the comparisons. Randomized selection. We return to finding the i- smallest item for a fixed but arbitrary integer 1 ≤ i ≤ n, which we call the rank of that item. We can use the split- ting function of quicksort also for selection. As in quick- sort, we choose a random pivot and split the array, but we recurse only for one of the two sides. We invoke the func- tion with the range of indices of the current subarray and the rank of the desired item, i. Initially, the range consists of all indices between ℓ = 1 and r = n, limits included. int RSELECT(int ℓ, r, i) q = RSPLIT(ℓ, r); m = q − ℓ + 1; if i m then return RSELECT(ℓ, q − 1, i) elseif i = m then return q else return RSELECT(q + 1, r, i − m) endif. For small sets, the algorithm is relatively ineffective and its running time can be improved by switching over to sorting when the size drops below some constant thresh- old. On the other hand, each recursive step makes some progress so that termination is guaranteed even without special treatment of small sets. Expected running time. For each 1 ≤ m ≤ n, the probability that the array is split into subarrays of sizes m − 1 and n − m is 1 n . For convenience we assume that n is even. The expected running time increases with increas- ing number of items, T (k) ≤ T (m) if k ≤ m. Hence, T (n) ≤ n + 1 n n X m=1 max{T (m − 1), T (n − m)} ≤ n + 2 n n X m= n 2 +1 T (m − 1). Assume inductively that T (m) ≤ cm for m n and a sufficiently large positive constant c. Such a constant c can certainly be found for m = 1, since for that case the running time of the algorithm is only a constant. This establishes the basis of the induction. The case of n items reduces to cases of m n items for which we can use the induction hypothesis. We thus get T (n) ≤ n + 2c n n X m= n 2 +1 m − 1 = n + c · (n − 1) − c 2 · n 2 + 1 = n + 3c 4 · n − 3c 2 . Assuming c ≥ 4 we thus have T (n) ≤ cn as required. Note that we just proved that the expected running time of RSELECT is only a small constant times that of RSPLIT. More precisely, that constant factor is no larger than four. Deterministic selection. The randomized selection al- gorithm takes time proportional to n2 in the worst case, for example if each split is as unbalanced as possible. It is however possible to select in O(n) time even in the worst case. The median of the set plays a special role in this al- gorithm. It is defined as the i-smallest item where i = n+1 2 if n is odd and i = n 2 or n+2 2 if n is even. The determinis- tic algorithm takes five steps to select: Step 1. Partition the n items into n 5 groups of size at most 5 each. Step 2. Find the median in each group. Step 3. Find the median of the medians recursively. Step 4. Split the array using the median of the medians as the pivot. Step 5. Recurse on one side of the pivot. 
It is convenient to define k = n 5 and to partition such that each group consists of items that are multiples of k positions apart. This is what is shown in Figure 4 provided we arrange the items row by row in the array. 8
  • 9. Figure 4: The 43 items are partitioned into seven groups of 5 and two groups of 4, all drawn vertically. The shaded items are the medians and the dark shaded item is the median of medians. Implementation with insertion sort. We use insertion sort on each group to determine the medians. Specifically, we sort the items in positions ℓ, ℓ+k, ℓ+2k, ℓ+3k, ℓ+4k of array A, for each ℓ. void ISORT(int ℓ, k, n) j = ℓ + k; while j ≤ n do i = j; while i ℓ and A[i] A[i − k] do SWAP(i, i − k); i = i − k endwhile; j = j + k endwhile. Although insertion sort takes quadratic time in the worst case, it is very fast for small arrays, as in this applica- tion. We can now combine the various pieces and write the selection algorithm in pseudo-code. Starting with the code for the randomized algorithm, we first remove the randomization and second add code for Steps 1, 2, and 3. Recall that i is the rank of the desired item in A[ℓ..r]. Af- ter sorting the groups, we have their medians arranged in the middle fifth of the array, A[ℓ+2k..ℓ+3k−1], and we compute the median of the medians by recursive applica- tion of the function. int SELECT(int ℓ, r, i) k = ⌈(r − ℓ + 1)/5⌉; for j = 0 to k − 1 do ISORT(ℓ + j, k, r) endfor; m′ = SELECT(ℓ + 2k, ℓ + 3k − 1, ⌊(k + 1)/2⌋); SWAP(ℓ, m′ ); q = SPLIT(ℓ, r); m = q − ℓ + 1; if i m then return SELECT(ℓ, q − 1, i) elseif i = m then return q else return SELECT(q + 1, r, i − m) endif. Observe that the algorithm makes progress as long as there are at least three items in the set, but we need special treat- ment of the cases of one or of two items. The role of the median of medians is to prevent an unbalanced split of the array so we can safely use the deterministic version of splitting. Worst-case running time. To simplify the analysis, we assume that n is a multiple of 5 and ignore ceiling and floor functions. We begin by arguing that the number of items less than or equal to the median of medians is at least 3n 10 . These are the first three items in the sets with medians less than or equal to the median of medians. In Figure 4, these items are highlighted by the box to the left and above but containing the median of medians. Symmetrically, the number of items greater than or equal to the median of medians is at least 3n 10 . The first recursion works on a set of n 5 medians, and the second recursion works on a set of at most 7n 10 items. We have T (n) ≤ n + T n 5 + T 7n 10 . We prove T (n) = O(n) by induction assuming T (m) ≤ c · m for m n and c a large enough constant. T (n) ≤ n + c 5 · n + 7c 10 · n = 1 + 9c 10 · n. Assuming c ≥ 10 we have T (n) ≤ cn, as required. Again the running time is at most some constant times that of splitting the array. The constant is about two and a half times the one for the randomized selection algorithm. A somewhat subtle issue is the presence of equal items in the input collection. Such occurrences make the func- tion SPLIT unpredictable since they could occur on either side of the pivot. An easy way out of the dilemma is to make sure that the items that are equal to the pivot are treated as if they were smaller than the pivot if they occur in the first half of the array and they are treated as if they were larger than the pivot if they occur in the second half of the array. Summary. The idea of prune-and-search is very similar to divide-and-conquer, which is perhaps the reason why some textbooks make no distinction between the two. 
The characteristic feature of prune-and-search is that the recur- sion covers only a constant fraction of the input set. As we have seen in the analysis, this difference implies a better running time. It is interesting to compare the randomized with the de- terministic version of selection: 9
  • 10. • the use of randomization leads to a simpler algorithm but it requires a source of randomness; • upon repeating the algorithm for the same data, the deterministic version goes through the exact same steps while the randomized version does not; • we analyze the worst-case running time of the deter- ministic version and the expected running time (for the worst-case input) of the randomized version. All three differences are fairly universal and apply to other algorithms for which we have the choice between a deter- ministic and a randomized implementation. 10
  • 11. 4 Dynamic Programming Sometimes, divide-and-conquer leads to overlapping sub- problems and thus to redundant computations. It is not uncommon that the redundancies accumulate and cause an exponential amount of wasted time. We can avoid the waste using a simple idea: solve each subproblem only once. To be able to do that, we have to add a cer- tain amount of book-keeping to remember subproblems we have already solved. The technical name for this de- sign paradigm is dynamic programming. Edit distance. We illustrate dynamic programming us- ing the edit distance problem, which is motivated by ques- tions in genetics. We assume a finite set of characters or letters, Σ, which we refer to as the alphabet, and we consider strings or words formed by concatenating finitely many characters from the alphabet. The edit distance be- tween two words is the minimum number of letter inser- tions, letter deletions, and letter substitutions required to transform one word to the other. For example, the edit distance between FOOD and MONEY is at most four: FOOD → MOOD → MOND → MONED → MONEY A better way to display the editing process is the gap rep- resentation that places the words one above the other, with a gap in the first word for every insertion and a gap in the second word for every deletion: F O O D M O N E Y Columns with two different characters correspond to sub- stitutions. The number of editing steps is therefore the number of columns that do not contain the same character twice. Prefix property. It is not difficult to see that you cannot get from FOOD to MONEY in less than four steps. However, for longer examples it seems considerably more difficult to find the minimum number of steps or to recognize an optimal edit sequence. Consider for example A L G O R I T H M A L T R U I S T I C Is this optimal or, equivalently, is the edit distance between ALGORITHM and ALTRUISTIC six? Instead of answering this specific question, we develop a dynamic program- ming algorithm that computes the edit distance between an m-character string A[1..m] and an n-character string B[1..n]. Let E(i, j) be the edit distance between the pre- fixes of length i and j, that is, between A[1..i] and B[1..j]. The edit distance between the complete strings is therefore E(m, n). A crucial step towards the development of this algorithm is the following observation about the gap rep- resentation of an optimal edit sequence. PREFIX PROPERTY. If we remove the last column of an optimal edit sequence then the remaining columns represent an optimal edit sequence for the remaining substrings. We can easily prove this claim by contradiction: if the substrings had a shorter edit sequence, we could just glue the last column back on and get a shorter edit sequence for the original strings. Recursive formulation. We use the Prefix Property to develop a recurrence relation for E. The dynamic pro- gramming algorithm will be a straightforward implemen- tation of that relation. There are a couple of obvious base cases: • Erasing: we need i deletions to erase an i-character string, E(i, 0) = i. • Creating: we need j insertions to create a j- character string, E(0, j) = j. In general, there are four possibilities for the last column in an optimal edit sequence. • Insertion: the last entry in the top row is empty, E(i, j) = E(i, j − 1) + 1. • Deletion: the last entry in the bottom row is empty, E(i, j) = E(i − 1, j) + 1. • Substitution: both rows have characters in the last column that are different, E(i, j) = E(i − 1, j − 1) + 1. 
• No action: both rows end in the same character, E(i, j) = E(i − 1, j − 1). Let P be the logical proposition A[i] 6= B[j] and denote by |P| its indicator variable: |P| = 1 if P is true and |P| = 0 if P is false. We can now summarize and for i, j 0 get the edit distance as the smallest of the possibilities: E(i, j) = min    E(i, j − 1) + 1 E(i − 1, j) + 1 E(i − 1, j − 1) + |P|    . 11
  • 12. The algorithm. If we turned this recurrence relation di- rectly into a divide-and-conqueralgorithm, we would have the following recurrence for the running time: T (m, n) = T (m, n − 1) + T (m − 1, n) + T (m − 1, n − 1) + 1. The solution to this recurrence is exponential in m and n, which is clearly not the way to go. Instead, let us build an m + 1 times n + 1 table of possible values of E(i, j). We can start by filling in the base cases, the entries in the 0-th row and column. To fill in any other entry, we need to know the values directly to the left, directly above, and both to the left and above. If we fill the table from top to bottom and from left to right then whenever we reach an entry, the entries it depends on are already available. int EDITDISTANCE(int m, n) for i = 0 to m do E[i, 0] = i endfor; for j = 1 to n do E[0, j] = j endfor; for i = 1 to m do for j = 1 to n do E[i, j] = min{E[i, j − 1] + 1, E[i − 1, j] + 1, E[i − 1, j − 1] + |A[i] 6= B[j]|} endfor endfor; return E[m, n]. Since there are (m+1)(n+1) entries in the table and each takes a constant time to compute, the total running time is in O(mn). An example. The table constructed for the conversion of ALGORITHM to ALTRUISTIC is shown in Figure 5. Boxed numbers indicate places where the two strings have equal characters. The arrows indicate the predecessors that de- fine the entries. Each direction of arrow corresponds to a different edit operation: horizontal for insertion, vertical for deletion, and diagonal for substitution. Dotted diago- nal arrows indicate free substitutions of a letter for itself. Recovering the edit sequence. By construction, there is at least one path from the upper left to the lower right corner, but often there will be several. Each such path describes an optimal edit sequence. For the example at hand, we have three optimal edit sequences: A L G O R I T H M A L T R U I S T I C A L I G O R T H M A L T R U I S T I C 0 4 1 2 3 5 6 7 8 3 9 4 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 10 1 0 1 2 3 4 5 6 7 8 2 1 1 2 3 4 5 6 7 8 3 2 2 2 3 4 5 6 7 8 4 3 2 3 4 5 6 7 8 5 4 4 3 3 3 4 5 6 7 6 5 4 4 4 4 5 6 4 7 6 5 5 5 5 5 5 5 6 8 7 6 6 6 6 6 6 6 6 Figure 5: The table of edit distances between all prefixes of ALGORITHM and of ALTRUISTIC. The shaded area highlights the optimal edit sequences, which are paths from the upper left to the lower right corner. A L G O R I T H M A L T R U I S T I C A L G O R I T H M A L T R U I S T I C They are easily recovered by tracing the paths backward, from the end to the beginning. The following algorithm recovers an optimal solution that also minimizes the num- ber of insertions and deletions. We call it with the lengths of the strings as arguments, R(m, n). void R(int i, j) if i 0 or j 0 then switch incoming arrow: case ց: R(i − 1, j − 1); print(A[i], B[j]) case ↓: R(i − 1, j); print(A[i], ) case →: R(i, j − 1); print( , B[j]). endswitch endif. Summary. The structure of dynamic programming is again similar to divide-and-conquer, except that the sub- problems to be solved overlap. As a consequence, we get different recursive paths to the same subproblems. To de- velop a dynamic programming algorithm that avoids re- dundant solutions, we generally proceed in two steps: 12
  • 13. 1. We formulate the problem recursively. In other words, we write down the answer to the whole prob- lem as a combination of the answers to smaller sub- problems. 2. We build solutions from bottom up. Starting with the base cases, we work our way up to the final solution and (usually) store intermediate solutions in a table. For dynamic programming to be effective, we need a structure that leads to at most some polynomial number of different subproblems. Most commonly, we deal with sequences, which have linearly many prefixes and suffixes and quadratically many contiguous substrings. 13
  • 14. 5 Greedy Algorithms The philosophy of being greedy is shortsightedness. Al- ways go for the seemingly best next thing, always op- timize the presence, without any regard for the future, and never change your mind about the past. The greedy paradigm is typically applied to optimization problems. In this section, we first consider a scheduling problem and second the construction of optimal codes. A scheduling problem. Consider a set of activities 1, 2, . . . , n. Activity i starts at time si and finishes at time fi si. Two activities i and j overlap if [si, fi] ∩ [sj, fj] 6= ∅. The objective is to select a maxi- mum number of pairwise non-overlapping activities. An example is shown in Figure 6. The largest number of ac- c d b h g a e time f [ [ [ [ [ [ [ [ ] ] ] ] ] ] ] ] Figure 6: A best schedule is c, e, f, but there are also others of the same size. tivities can be scheduled by choosing activities with early finish times first. We first sort and reindex such that i j implies fi ≤ fj. S = {1}; last = 1; for i = 2 to n do if flast si then S = S ∪ {i}; last = i endif endfor. The running time is O(n log n) for sorting plus O(n) for the greedy collection of activities. It is often difficult to determine how close to the opti- mum the solutions found by a greedy algorithm really are. However, for the above scheduling problem the greedy algorithm always finds an optimum. For the proof let 1 = i1 i2 . . . ik be the greedy schedule con- structed by the algorithm. Let j1 j2 . . . jℓ be any other feasible schedule. Since i1 = 1 has the earliest finish time of any activity, we have fi1 ≤ fj1 . We can therefore add i1 to the feasible schedule and remove at most one ac- tivity, namely j1. Among the activities that do not overlap i1, i2 has the earliest finish time, hence fi2 ≤ fj2 . We can again add i2 to the feasible schedule and remove at most one activity, namely j2 (or possibly j1 if it was not re- moved before). Eventually, we replace the entire feasible schedule by the greedy schedule without decreasing the number of activities. Since we could have started with a maximum feasible schedule, we conclude that the greedy schedule is also maximum. Binary codes. Next we consider the problem of encod- ing a text using a string of 0s and 1s. A binary code maps each letter in the alphabet of the text to a unique string of 0s and 1s. Suppose for example that the letter ‘t’ is encoded as ‘001’, ‘h’ is encoded as ‘101’, and ‘e’ is en- coded as ‘01’. Then the word ‘the’ would be encoded as the concatenation of codewords: ‘00110101’. This partic- ular encoding is unambiguous because the code is prefix- free: no codeword is prefix of another codeword. There is 1 1 0 h 1 1 0 0 1 h t e 0 e t 0 1 Figure 7: Letters correspond to leaves and codewords correspond to maximal paths. A left edge is read as ‘0’ and a right edge as ‘1’. The tree to the right is full and improves the code. a one-to-one correspondence between prefix-free binary codes and binary trees where each leaf is a letter and the corresponding codeword is the path from the root to that leaf. Figure 7 illustrates the correspondence for the above 3-letter code. Being prefix-free corresponds to leaves not having children. The tree in Figure 7 is not full because three of its internal nodes have only one child. This is an indication of waste. The code can be improved by replac- ing each node with one child by its child. This changes the above code to ‘00’ for ‘t’, ‘1’ for ‘h’, and ‘01’ for ‘e’. Huffman trees. 
Let w_i be the frequency of the letter c_i in the given text. It will be convenient to refer to w_i as the weight of c_i or of its external node. To get an efficient code, we choose short codewords for common letters. Suppose δ_i is the length of the codeword for c_i. Then the number of bits for encoding the entire text is P = Σ_i w_i · δ_i. Since δ_i is the depth of the leaf c_i, P is also known as the weighted external path length of the corresponding tree. 14
  • 15. The Huffman tree for the ci minimizes the weighted ex- ternal path length. To construct this tree, we start with n nodes, one for each letter. At each stage of the algorithm, we greedily pick the two nodes with smallest weights and make them the children of a new node with weight equal to the sum of two weights. We repeat until only one node remains. The resulting tree for a collection of nine letters with displayed weights is shown in Figure 8. Ties that 38 17 61 23 13 7 10 4 21 8 6 3 9 5 3 1 5 Figure 8: The numbers in the external nodes (squares) are the weights of the corresponding letters, and the ones in the internal nodes (circles) are the weights of these nodes. The Huffman tree is full by construction. 001 000 11 101 100 01110 01111 0110 010 5 61 23 38 10 13 3 1 4 3 5 17 6 21 9 8 7 Figure 9: The weighted external path length is 15 + 15 + 18 + 12 + 5 + 15 + 24 + 27 + 42 = 173. arise during the algorithm are broken arbitrarily. We re- draw the tree and order the children of a node as left and right child arbitrarily, as shown in Figure 9. The algorithm works with a collection N of nodes which are the roots of the trees constructed so far. Ini- tially, each leaf is a tree by itself. We denote the weight of a node by w(µ) and use a function EXTRACTMIN that returns the node with the smallest weight and, at the same time, removes this node from the collection. Tree HUFFMAN loop µ = EXTRACTMIN(N); if N = ∅ then return µ endif; ν = EXTRACTMIN(N); create node κ with children µ and ν and weight w(κ) = w(µ) + w(ν); add κ to N forever. Straightforward implementations use an array or a linked list and take time O(n) for each operation involving N. There are fewer than 2n extractions of the minimum and fewer than n additions, which implies that the total run- ning time is O(n2 ). We will see later that there are better ways to implement N leading to running time O(n log n). An inequality. We prepare the proof that the Huffman tree indeed minimizes the weighted external path length. Let T be a full binary tree with weighted external path length P(T ). Let Λ(T ) be the set of leaves and let µ and ν be any two leaves with smallest weights. Then we can construct a new tree T ′ with (1) set of leaves Λ(T ′ ) = (Λ(T ) − {µ, ν}) ˙ ∪ {κ} , (2) w(κ) = w(µ) + w(ν), (3) P(T ′ ) ≤ P(T ) − w(µ) − w(ν), with equality if µ and ν are siblings. We now argue that T ′ really exists. If µ and ν are siblings then we construct T ′ from T by removing µ and ν and declaring their parent, κ, as the new leaf. Then µ ν µ σ ν σ Figure 10: The increase in the depth of ν is compensated by the decrease in depth of the leaves in the subtree of σ. P(T ′ ) = P(T ) − w(µ)δ − w(ν)δ + w(κ)(δ − 1) = P(T ) − w(µ) − w(ν), where δ = δ(µ) = δ(ν) = δ(κ) + 1 is the common depth of µ and ν. Otherwise, assume δ(µ) ≥ δ(ν) and let σ be 15
  • 16. the sibling of µ, which may or may not be a leaf. Exchange ν and σ. Since the length of the path from the root to σ is at least as long as the path to µ, the weighted external path length can only decrease; see Figure 10. Then do the same as in the other case. Proof of optimality. The optimality of the Huffman tree can now be proved by induction. HUFFMAN TREE THEOREM. Let T be the Huffman tree and X another tree with the same set of leaves and weights. Then P(T ) ≤ P(X). PROOF. If there are only two leaves then the claim is obvi- ous. Otherwise, let µ and ν be the two leaves selected by the algorithm. Construct trees T ′ and X′ with P(T ′ ) = P(T ) − w(µ) − w(ν), P(X′ ) ≤ P(X) − w(µ) − w(ν). T ′ is the Huffman tree for n − 1 leaves so we can use the inductive assumption and get P(T ′ ) ≤ P(X′ ). It follows that P(T ) = P(T ′ ) + w(µ) + w(ν) ≤ P(X′ ) + w(µ) + w(ν) ≤ P(X). Huffman codes are binary codes that correspond to Huffman trees as described. They are commonly used to compress text and other information. Although Huffman codes are optimal in the sense defined above, there are other codes that are also sensitive to the frequency of se- quences of letters and this way outperform Huffman codes for general text. Summary. The greedy algorithm for constructing Huff- man trees works bottom-up by stepwise merging, rather than top-down by stepwise partitioning. If we run the greedy algorithm backwards, it becomes very similar to dynamic programming, except that it pursues only one of many possible partitions. Often this implies that it leads to suboptimal solutions. Nevertheless, there are problems that exhibit enough structure that the greedy algorithm succeeds in finding an optimum, and the scheduling and coding problems described above are two such examples. 16
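Under the assumption that activities are given as (start, finish) pairs and letter frequencies as a dictionary, the following Python sketch illustrates both greedy algorithms of this section; a binary heap stands in for EXTRACTMIN, which gives the O(n log n) running time mentioned above.

import heapq

def schedule(activities):
    # Greedy activity selection: sort by finish time, then repeatedly
    # take the next activity that starts after the last chosen finish.
    chosen, last_finish = [], float("-inf")
    for s, f in sorted(activities, key=lambda a: a[1]):
        if s > last_finish:          # no overlap with the last chosen activity
            chosen.append((s, f))
            last_finish = f
    return chosen

def huffman(weights):
    # Bottom-up Huffman construction; 'weights' maps letters to frequencies.
    # Returns a dictionary from letters to codewords.
    heap = [(w, i, {c: ""}) for i, (c, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, code1 = heapq.heappop(heap)   # the two smallest weights
        w2, _, code2 = heapq.heappop(heap)
        merged = {c: "0" + s for c, s in code1.items()}
        merged.update({c: "1" + s for c, s in code2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

print(schedule([(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9)]))
print(huffman({"t": 1, "h": 1, "e": 2}))

The integer tiebreak field only serves to break ties between equal weights, mirroring the remark that ties in the algorithm are broken arbitrarily.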
  • 17. First Homework Assignment Write the solution to each problem on a single page. The deadline for handing in solutions is September 18. Problem 1. (20 points). Consider two sums, X = x1 + x2 + . . . + xn and Y = y1 + y2 + . . . + ym. Give an algorithm that finds indices i and j such that swap- ping xi with yj makes the two sums equal, that is, X − xi + yj = Y − yj + xi, if they exist. Analyze your algorithm. (You can use sorting as a subroutine. The amount of credit depends on the correctness of the analysis and the running time of your algorithm.) Problem 2. (20 = 10 + 10 points). Consider dis- tinct items x1, x2, . . . , xn with positive weights w1, w2, . . . , wn such that Pn i=1 wi = 1.0. The weighted median is the item xk that satisfies X xixk wi 0.5 and X xj xk wj ≤ 0.5. (a) Show how to compute the weighted median of n items in worst-case time O(n log n) using sorting. (b) Show how to compute the weighted median in worst-case time O(n) using a linear-time me- dian algorithm. Problem 3. (20 = 6 + 14 points). A game-board has n columns, each consisting of a top number, the cost of visiting the column, and a bottom number, the maxi- mum number of columns you are allowed to jump to the right. The top number can be any positive integer, while the bottom number is either 1, 2, or 3. The ob- jective is to travel from the first column off the board, to the right of the nth column. The cost of a game is the sum of the costs of the visited columns. Assuming the board is represented in a two- dimensional array, B[2, n], the following recursive procedure computes the cost of the cheapest game: int CHEAPEST(int i) if i n then return 0 endif; x = B[1, i] + CHEAPEST(i + 1); y = B[1, i] + CHEAPEST(i + 2); z = B[1, i] + CHEAPEST(i + 3); case B[2, i] = 1: return x; B[2, i] = 2: return min{x, y}; B[2, i] = 3: return min{x, y, z} endcase. (a) Analyze the asymptotic running time of the pro- cedure. (b) Describe and analyze a more efficient algorithm for finding the cheapest game. Problem 4. (20 = 10 + 10 points). Consider a set of n intervals [ai, bi] that cover the unit interval, that is, [0, 1] is contained in the union of the intervals. (a) Describe an algorithm that computes a mini- mum subset of the intervals that also covers [0, 1]. (b) Analyze the running time of your algorithm. (For question (b) you get credit for the correctness of your analysis but also for the running time of your algorithm. In other words, a fast algorithm earns you more points than a slow algorithm.) Problem 5. (20 = 7 + 7 + 6 points). Let A[1..m] and B[1..n] be two strings. (a) Modify the dynamic programming algorithm for computing the edit distance between A and B for the case in which there are only two al- lowed operations, insertions and deletions of in- dividual letters. (b) A (not necessarily contiguous) subsequence of A is defined by the increasing sequence of its indices, 1 ≤ i1 i2 . . . ik ≤ m. Use dynamic programming to find the longest com- mon subsequence of A and B and analyze its running time. (c) What is the relationship between the edit dis- tance defined in (a) and the longest common subsequence computed in (b)? 17
  • 18. II SEARCHING 6 Binary Search Trees 7 Red-black Trees 8 Amortized Analysis 9 Splay Trees Second Homework Assignment 18
  • 19. 6 Binary Search Trees One of the purposes of sorting is to facilitate fast search- ing. However, while a sorted sequence stored in a lin- ear array is good for searching, it is expensive to add and delete items. Binary search trees give you the best of both worlds: fast search and fast update. Definitions and terminology. We begin with a recursive definition of the most common type of tree used in algo- rithms. A (rooted) binary tree is either empty or a node (the root) with a binary tree as left subtree and binary tree as right subtree. We store items in the nodes of the tree. It is often convenient to say the items are the nodes. A binary tree is sorted if each item is between the smaller or equal items in the left subtree and the larger or equal items in the right subtree. For example, the tree illustrated in Figure 11 is sorted assuming the usual ordering of English characters. Terms for relations between family members such as child, parent, sibling are also used for nodes in a tree. Every node has one parent, except the root which has no parent. A leaf or external node is one without children; all other nodes are internal. A node ν is a descendent of µ if ν = µ or ν is a descendent of a child of µ. Symmetri- cally, µ is an ancestor of ν if ν is a descendent of µ. The subtree of µ consists of all descendents of µ. An edge is a parent-child pair. m k l z v i j d b r g y c Figure 11: The parent, sibling and two children of the dark node are shaded. The internal nodes are drawn as circles while the leaves are drawn as squares. The size of the tree is the number of nodes. A binary tree is full if every internal node has two children. Every full binary tree has one more leaf than internal node. To count its edges, we can either count 2 for each internal node or 1 for every node other than the root. Either way, the total number of edges is one less than the size of the tree. A path is a sequence of contiguous edges without repetitions. Usually we only consider paths that descend or paths that ascend. The length of a path is the number of edges. For every node µ, there is a unique path from the root to µ. The length of that path is the depth of µ. The height of the tree is the maximum depth of any node. The path length is the sum of depths over all nodes, and the external path length is the same sum restricted to the leaves in the tree. Searching. A binary search tree is a sorted binary tree. We assume each node is a record storing an item and point- ers to two children: struct Node{item info; Node ∗ ℓ, ∗ r}; typedef Node ∗ Tree. Sometimes it is convenient to also store a pointer to the parent, but for now we will do without. We can search in a binary search tree by tracing a path starting at the root. Node ∗ SEARCH(Tree ̺, item x) case ̺ = NULL: return NULL; x ̺ → info: return SEARCH(̺ → ℓ, x); x = ̺ → info: return ̺; x ̺ → info: return SEARCH(̺ → r, x) endcase. The running time depends on the length of the path, which is at most the height of the tree. Let n be the size. In the worst case the tree is a linked list and searching takes time O(n). In the best case the tree is perfectly balanced and searching takes only time O(log n). Insert. To add a new item is similarly straightforward: follow a path from the root to a leaf and replace that leaf by a new node storing the item. Figure 12 shows the tree obtained after adding w to the tree in Figure 11. 
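A minimal Python version of the search and insertion procedures just described; the node fields mirror the Node record above, and the insertion order in the example is illustrative rather than the one that produced Figure 11.

class Node:
    # Mirrors the record used above: an item plus pointers to two children.
    def __init__(self, info):
        self.info = info
        self.l = None
        self.r = None

def search(node, x):
    # Trace a path from the root; return the node storing x, or None.
    if node is None:
        return None
    if x < node.info:
        return search(node.l, x)
    if x > node.info:
        return search(node.r, x)
    return node

def insert(node, x):
    # Follow a root-to-leaf path and replace the empty subtree reached
    # by a new node storing x; return the (possibly new) subtree root.
    if node is None:
        return Node(x)
    if x < node.info:
        node.l = insert(node.l, x)
    elif x > node.info:
        node.r = insert(node.r, x)
    return node

root = None
for c in "mkzvilrdgybjc":
    root = insert(root, c)
print(search(root, "w") is None)   # True before inserting 'w'
root = insert(root, "w")
print(search(root, "w").info)      # 'w'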
[Figure 12: The shaded nodes indicate the path from the root we traverse when we insert w into the sorted tree.] The running time depends again on the length of the path. If the insertions come in a random order then the tree is usually
  • 20. close to being perfectly balanced. Indeed, the tree is the same as the one that arises in the analysis of quicksort. The expected number of comparisons for a (successful) search is one n-th of the expected running time of quick- sort, which is roughly 2 ln n. Delete. The main idea for deleting an item is the same as for inserting: follow the path from the root to the node ν that stores the item. Case 1. ν has no internal node as a child. Remove ν. Case 2. ν has one internal child. Make that child the child of the parent of ν. Case 3. ν has two internal children. Find the rightmost internal node in the left subtree, remove it, and sub- stitute it for ν, as shown in Figure 13. ν ν K J J Figure 13: Store J in ν and delete the node that used to store J. The analysis of the expected search time in a binary search tree constructed by a random sequence of insertions and deletions is considerably more challenging than if no dele- tions are present. Even the definition of a random se- quence is ambiguous in this case. Optimal binary search trees. Instead of hoping the in- cremental construction yields a shallow tree, we can con- struct the tree that minimizes the search time. We con- sider the common problem in which items have different probabilities to be the target of a search. For example, some words in the English dictionary are more commonly searched than others and are therefore assigned a higher probability. Let a1 a2 . . . an be the items and pi the corresponding probabilities. To simplify the discus- sion, we only consider successful searches and thus as- sume Pn i=1 pi = 1. The expected number of comparisons for a successful search in a binary search tree T storing the n items is 1 + C(T ) = n X i=1 pi · (δi + 1) = 1 + n X i=1 pi · δi, where δi is the depth of the node that stores ai. C(T ) is the weighted path length or the cost of T . We study the problem of constructing a tree that minimizes the cost. To develop an example, let n = 3 and p1 = 1 2 , p2 = 1 3 , p3 = 1 6 . Figure 14 shows the five binary trees with three nodes and states their costs. It can be shown that the a2 3 a a 1 a2 a2 a 1 1 a1 a1 a a3 a2 a 3 2 a3 a3 a Figure 14: There are five different binary trees of three nodes. From left to right their costs are 2 3 , 5 6 , 2 3 , 7 6 , 4 3 . The first tree and the third tree are both optimal. number of different binary trees with n nodes is 1 n+1 2n n , which is exponential in n. This is far too large to try all possibilities, so we need to look for a more efficient way to construct an optimum tree. Dynamic programming. We write T j i for the optimum weighted binary search tree of ai, ai+1, . . . , aj, Cj i for its cost, and pj i = Pj k=i pk for the total probability of the items in T j i . Suppose we know that the optimum tree stores item ak in its root. Then the left subtree is T k−1 i and the right subtree is T j k+1. The cost of the optimum tree is therefore Cj i = Ck−1 i + Cj k+1 + pj i − pk. Since we do not know which item is in the root, we try all possibili- ties and find the minimum: Cj i = min i≤k≤j {Ck−1 i + Cj k+1 + pj i − pk}. This formula can be translated directly into a dynamic pro- gramming algorithm. We use three two-dimensional ar- rays, one for the sums of probabilities, pj i , one for the costs of optimum trees, Cj i , and one for the indices of the items stored in their roots, Rj i . We assume that the first array has already been computed. We initialize the other two arrays along the main diagonal and add one dummy diagonal for the cost. 20
  • 21. for k = 1 to n do C[k, k − 1] = C[k, k] = 0; R[k, k] = k endfor; C[n + 1, n] = 0. We fill the rest of the two arrays one diagonal at a time. for ℓ = 2 to n do for i = 1 to n − ℓ + 1 do j = i + ℓ − 1; C[i, j] = ∞; for k = i to j do cost = C[i, k − 1] + C[k + 1, j] + p[i, j] − p[k, k]; if cost C[i, j] then C[i, j] = cost; R[i, j] = k endif endfor endfor endfor. The main part of the algorithm consists of three nested loops each iterating through at most n values. The running time is therefore in O(n3 ). Example. Table 1 shows the partial sums of probabil- ities for the data in the earlier example. Table 2 shows 6p 1 2 3 1 3 5 6 2 2 3 3 1 Table 1: Six times the partial sums of probabilities used by the dynamic programming algorithm. the costs and the indices of the roots of the optimum trees computed for all contiguous subsequences. The optimum 6C 1 2 3 1 0 2 4 2 0 1 3 0 R 1 2 3 1 1 1 1 2 2 2 3 3 Table 2: Six times the costs and the roots of the optimum trees. tree can be constructed from R as follows. The root stores the item with index R[1, 3] = 1. The left subtree is there- fore empty and the right subtree stores a2, a3. The root of the optimum right subtree stores the item with index R[2, 3] = 2. Again the left subtree is empty and the right subtree consists of a single node storing a3. Improved running time. Notice that the array R in Ta- ble 2 is monotonic, both along rows and along columns. Indeed it is possible to prove Rj−1 i ≤ Rj i in every row and Rj i ≤ Rj i+1 in every column. We omit the proof and show how the two inequalities can be used to improve the dy- namic programming algorithm. Instead of trying all roots from i through j we restrict the innermost for-loop to for k = R[i, j − 1] to R[i + 1, j] do The monotonicity property implies that this change does not alter the result of the algorithm. The running time of a single iteration of the outer for-loop is now Uℓ(n) = n−ℓ+1 X i=1 (Rj i+1 − Rj−1 i + 1). Recall that j = i + ℓ − 1 and note that most terms cancel, giving Uℓ(n) = Rn n−ℓ+2 − Rℓ−1 1 + (n − ℓ + 1) ≤ 2n. In words, each iteration of the outer for-loop takes only time O(n), which implies that the entire algorithm takes only time O(n2 ). 21
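The dynamic program translates directly into Python. The sketch below is the plain O(n^3) version, without the monotonicity restriction of the innermost loop; the arrays play the roles of the partial sums, the costs C, and the roots R, and the probabilities are those of the three-item example.

def optimal_bst(p):
    # p[1..n] are the access probabilities; index 0 is unused so the
    # indexing matches the text. Returns C[1][n] and the root table R.
    n = len(p) - 1
    P = [[0.0] * (n + 2) for _ in range(n + 2)]   # partial sums p_i + ... + p_j
    for i in range(1, n + 1):
        s = 0.0
        for j in range(i, n + 1):
            s += p[j]
            P[i][j] = s
    C = [[0.0] * (n + 2) for _ in range(n + 2)]
    R = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        R[i][i] = i
    for length in range(2, n + 1):                # one diagonal at a time
        for i in range(1, n - length + 2):
            j = i + length - 1
            best = float("inf")
            for k in range(i, j + 1):             # try every root
                cost = C[i][k - 1] + C[k + 1][j] + P[i][j] - p[k]
                if cost < best:
                    best, R[i][j] = cost, k
            C[i][j] = best
    return C[1][n], R

cost, R = optimal_bst([0.0, 1/2, 1/3, 1/6])
print(cost)        # about 2/3, matching the optimal tree in Figure 14
print(R[1][3])     # 1, the root of the optimal tree

Restricting k to the range R[i][j-1] .. R[i+1][j] would give the improved O(n^2) algorithm described above.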
  • 22. 7 Red-Black Trees Binary search trees are an elegant implementation of the dictionary data type, which requires support for item SEARCH (item), void INSERT (item), void DELETE (item), and possible additional operations. Their main disadvan- tage is the worst case time Ω(n) for a single operation. The reasons are insertions and deletions that tend to get the tree unbalanced. It is possible to counteract this ten- dency with occasional local restructuring operations and to guarantee logarithmic time per operation. 2-3-4 trees. A special type of balanced tree is the 2-3-4 tree. Each internal node stores one, two, or three items and has two, three, or four children. Each leaf has the same depth. As shown in Figure 15, the items in the in- ternal nodes separate the items stored in the subtrees and thus facilitate fast searching. In the smallest 2-3-4 tree of 7 15 25 20 4 17 9 2 Figure 15: A 2-3-4 tree of height two. All items are stored in internal nodes. height h, every internal node has exactly two children, so we have 2h leaves and 2h −1 internal nodes. In the largest 2-3-4 tree of height h, every internal node has four chil- dren, so we have 4h leaves and (4h − 1)/3 internal nodes. We can store a 2-3-4 tree in a binary tree by expanding a node with i 1 items and i + 1 children into i nodes each with one item, as shown in Figure 16. Red-black trees. Suppose we color each edge of a bi- nary search tree either red or black. The color is conve- niently stored in the lower node of the edge. Such a edge- colored tree is a red-black tree if (1) there are no two consecutive red edges on any de- scending path and every maximal such path ends with a black edge; (2) all maximal descending paths have the same number of black edges. b a c or a b b a c a a b b Figure 16: Transforming a 2-3-4 tree into a binary tree. Bold edges are called red and the others are called black. The number of black edges on a maximal descending path is the black height, denoted as bh(̺). When we transform a 2-3-4 tree into a binary tree as in Figure 16, we get a red- black tree. The result of transforming the tree in Figure 15 17 20 15 25 2 9 7 4 Figure 17: A red-black tree obtained from the 2-3-4 tree in Fig- ure 15. is shown in Figure 17. HEIGHT LEMMA. A red-black tree with n internal nodes has height at most 2 log2(n + 1). PROOF. The number of leaves is n + 1. Contract each red edge to get a 2-3-4 tree with n + 1 leaves. Its height is h ≤ log2(n + 1). We have bh(̺) = h, and by Rule (1) the height of the red-black tree is at most 2bh(̺) ≤ 2 log2(n + 1). Rotations. Restructuring a red-black tree can be done with only one operation (and its symmetric version): a ro- tation that moves a subtree from one side to another, as shown in Figure 18. The ordered sequence of nodes in the left tree of Figure 18 is . . . , order(A), ν, order(B), µ, order(C), . . . , and this is also the ordered sequence of nodes in the right tree. In other words, a rotation maintains the ordering. Function ZIG below implements the right rotation: 22
  • 23. C B A B A C right rotation left rotation Zig Zag ν µ ν µ Figure 18: From left to right a right rotation and from right to left a left rotation. Node ∗ ZIG(Node ∗ µ) assert µ 6= NULL and ν = µ → ℓ 6= NULL; µ → ℓ = ν → r; ν → r = µ; return ν. Function ZAG is symmetric and performs a left rotation. Occasionally, it is necessary to perform two rotations in sequence, and it is convenient to combine them into a sin- gle operation referred to as a double rotation, as shown in Figure 19. We use a function ZIGZAG to implement a A right rotation ZigZag double ν µ κ κ ν µ C B A D B C D Figure 19: The double right rotation at µ is the concatenation of a single left rotation at ν and a single right rotation at µ. double right rotation and the symmetric function ZAGZIG to implement a double left rotation. Node ∗ ZIGZAG(Node ∗ µ) µ → ℓ = ZAG(µ → ℓ); return ZIG(µ). The double right rotation is the composition of two single rotations: ZIGZAG(µ) = ZIG(µ) ◦ ZAG(ν). Remember that the composition of functions is written from right to left, so the single left rotation of ν precedes the single right rotation of µ. Single rotations preserve the ordering of nodes and so do double rotations. Insertion. Before studying the details of the restructur- ing algorithms for red-black trees, we look at the trees that arise in a short insertion sequence, as shown in Figure 20. After adding 10, 7, 13, 4, we have two red edges in se- quence and repair this by promoting 10 (A). After adding 2, we repair the two red edges in sequence by a single ro- tation of 7 (B). After adding 5, we promote 4 (C), and after adding 6, we do a double rotation of 7 (D). 5 4 13 2 4 5 2 7 10 6 5 13 7 2 7 6 10 4 A 7 D C B 13 4 2 10 13 4 10 10 13 7 Figure 20: Sequence of red-black trees generated by inserting the items 10, 7, 13, 4, 2, 5, 6 in this sequence. An item x is added by substituting a new internal node for a leaf at the appropriate position. To satisfy Rule (2) of the red-black tree definition, color the incoming edge of the new node red, as shown in Figure 21. Start the ν ν x Figure 21: The incoming edge of a newly added node is always red. adjustment of color and structure at the parent ν of the new node. We state the properties maintained by the insertion algorithm as invariants that apply to a node ν traced by the algorithm. INVARIANT I. The only possible violation of the red- black tree properties is that of Rule (1) at the node ν, and if ν has a red incoming edge then it has ex- actly one red outgoing edge. Observe that Invariant I holds right after adding x. We continue with the analysis of all the cases that may arise. The local adjustment operations depend on the neighbor- hood of ν. Case 1. The incoming edge of ν is black. Done. 23
  • 24. Case 2. The incoming edge of ν is red. Let µ be the parent of ν and assume ν is left child of µ. Case 2.1. Both outgoing edges of µ are red, as in Figure 22. Promote µ. Let ν be the parent of µ and recurse. ν µ µ ν Figure 22: Promotion of µ. (The colors of the outgoing edges of ν may be the other way round). Case 2.2. Only one outgoing edge of µ is red, namely the one from µ to ν. Case 2.2.1. The left outgoing edge of ν is red, as in Figure 23 to the left. Right rotate µ. Done. σ ν µ µ ν µ ν σ ν µ Figure 23: Right rotation of µ to the left and double right rotation of µ to the right. Case 2.2.2. The right outgoing edge of ν is red, as in Figure 23 to the right. Double right rotate µ. Done. Case 2 has a symmetric case where left and right are in- terchanged. An insertion may cause logarithmically many promotions but at most two rotations. Deletion. First find the node π that is to be removed. If necessary, we substitute the inorder successor for π so we can assume that both children of π are leaves. If π is last in inorder we substitute symmetrically. Replace π by a leaf ν, as shown in Figure 24. If the incoming edge of π is red then change it to black. Otherwise, remember the in- coming edge of ν as ‘double-black’, which counts as two black edges. Similar to insertions, it helps to understand the deletion algorithm in terms of a property it maintains. INVARIANT D. The only possible violation of the red- black tree properties is a double-black incoming edge of ν. π ν Figure 24: Deletion of node π. The dashed edge counts as two black edges when we compute the black depth. Note that Invariant D holds right after we remove π. We now present the analysis of all the possible cases. The ad- justment operation is chosen depending on the local neigh- borhood of ν. Case 1. The incoming edge of ν is black. Done. Case 2. The incoming edge of ν is double-black. Let µ be the parent and κ the sibling of ν. Assume ν is left child of µ and note that κ is internal. Case 2.1. The edge from µ to κ is black. Case 2.1.1. Both outgoing edges of κ are black, as in Figure 25. Demote µ. Recurse for ν = µ. κ µ κ ν ν µ Figure 25: Demotion of µ. Case 2.1.2. The right outgoing edge of κ is red, as in Figure 26 to the left. Change the color of that edge to black and left ro- tate µ. Done. µ κ κ κ σ ν ν µ ν ν µ µ κ σ Figure 26: Left rotation of µ to the left and double left rotation of µ to the right. Case 2.1.3. The right outgoing edge of κ is black, as in Figure 26 to the right. Change the color of the left outgoing edge to black and double left rotate µ. Done. Case 2.2. The edge from µ to κ is red, as in Fig- ure 27. Left rotate µ. Recurse for ν. 24
  • 25. µ ν κ µ κ ν Figure 27: Left rotation of µ. Case 2 has a symmetric case in which ν is the right child of µ. Case 2.2 seems problematic because it recurses without moving ν any closer to the root. However, the configura- tion excludes the possibility of Case 2.2 occurring again. If we enter Cases 2.1.2 or 2.1.3 then the termination is im- mediate. If we enter Case 2.1.1 then the termination fol- lows because the incoming edge of µ is red. The deletion may cause logarithmically many demotions but at most three rotations. Summary. The red-black tree is an implementation of the dictionary data type and supports the operations search, insert, delete in logarithmic time each. An inser- tion or deletion requires the equivalent of at most three single rotations. The red-black tree also supports finding the minimum, maximum and the inorder successor, prede- cessor of a given node in logarithmic time each. 25
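All of the restructuring above is built from the two single rotations and their compositions. The following Python sketch mirrors the ZIG, ZAG, and ZIGZAG functions and checks on a made-up tree that a double rotation preserves the inorder sequence; it manipulates plain nodes and is not a full red-black insertion.

class Node:
    def __init__(self, info, l=None, r=None):
        self.info, self.l, self.r = info, l, r

def zig(mu):
    # Single right rotation: the left child becomes the new subtree root.
    nu = mu.l
    mu.l = nu.r
    nu.r = mu
    return nu

def zag(mu):
    # Single left rotation, the mirror image of zig.
    nu = mu.r
    mu.r = nu.l
    nu.l = mu
    return nu

def zigzag(mu):
    # Double right rotation: first a left rotation at the left child,
    # then a right rotation at mu, exactly as in the composition above.
    mu.l = zag(mu.l)
    return zig(mu)

def inorder(node):
    # Rotations preserve the sorted (inorder) sequence of the nodes.
    return [] if node is None else inorder(node.l) + [node.info] + inorder(node.r)

t = Node(5, Node(2, Node(1), Node(3, None, Node(4))), Node(6))
print(inorder(t))           # [1, 2, 3, 4, 5, 6]
print(inorder(zigzag(t)))   # the same sequence after the double rotation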
  • 26. 8 Amortized Analysis Amortization is an analysis technique that can influence the design of algorithms in a profound way. Later in this course, we will encounter data structures that owe their very existence to the insight gained in performance due to amortized analysis. Binary counting. We illustrate the idea of amortization by analyzing the cost of counting in binary. Think of an integer as a linear array of bits, n = P i≥0 A[i] · 2i . The following loop keeps incrementing the integer stored in A. loop i = 0; while A[i] = 1 do A[i] = 0; i++ endwhile; A[i] = 1. forever. We define the cost of counting as the total number of bit changes that are needed to increment the number one by one. What is the cost to count from 0 to n? Figure 28 shows that counting from 0 to 15 requires 26 bit changes. Since n takes only 1 + ⌊log2 n⌋ bits or positions in A, 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 5 4 3 2 1 0 Figure 28: The numbers are written vertically from top to bot- tom. The boxed bits change when the number is incremented. a single increment does at most 2 + log2 n steps. This implies that the cost of counting from 0 to n is at most n log2 n+2n. Even though the upper bound of 2 +log2 n is almost tight for the worst single step, we can show that the total cost is much less than n times that. We do this with two slightly different amortization methods referred to as aggregation and accounting. Aggregation. The aggregation method takes a global view of the problem. The pattern in Figure 28 suggests we define bi equal to the number of 1s and ti equal to the number of trailing 1s in the binary notation of i. Ev- ery other number has no trailing 1, every other number of the remaining ones has one trailing 1, etc. Assuming n = 2k − 1, we therefore have exactly j − 1 trailing 1s for 2k−j = (n + 1)/2j integers between 0 and n − 1. The total number of bit changes is therefore T (n) = n−1 X i=0 (ti + 1) = (n + 1) · k X j=1 j 2j . We use index transformation to show that the sum on the right is less than 2: X j≥1 j 2j = X j≥1 j − 1 2j−1 = 2 · X j≥1 j 2j − X j≥1 1 2j−1 = 2. Hence the cost is T (n) 2(n + 1). The amortized cost per operation is T (n) n , which is about 2. Accounting. The idea of the accounting method is to charge each operation what we think its amortized cost is. If the amortized cost exceeds the actual cost, then the sur- plus remains as a credit associated with the data structure. If the amortized cost is less than the actual cost, the accu- mulated credit is used to pay for the cost overflow. Define the amortized cost of a bit change 0 → 1 as $2 and that of 1 → 0 as $0. When we change 0 to 1 we pay $1 for the actual expense and $1 stays with the bit, which is now 1. This $1 pays for the (later) cost of changing the 1 to 0. Each increment has amortized cost $2, and together with the money in the system, this is enough to pay for all the bit changes. The cost is therefore at most 2n. We see how a little trick, like making the 0 → 1 changes pay for the 1 → 0 changes, leads to a very simple analysis that is even more accurate than the one obtained by aggre- gation. Potential functions. We can further formalize the amor- tized analysis by using a potential function. The idea is similar to accounting, except there is no explicit credit saved anywhere. 
The accumulated credit is an expression of the well-being or potential of the data structure. Let c_i be the actual cost of the i-th operation and D_i the data structure after the i-th operation. Let Φ_i = Φ(D_i) be the potential of D_i, which is some numerical value depending on the concrete application. Then we define a_i = c_i + Φ_i − Φ_{i−1} as the amortized cost of the i-th 26
  • 27. operation. The sum of amortized costs of n operations is n X i=1 ai = n X i=1 (ci + Φi − Φi−1) = n X i=1 ci + Φn − Φ0. We aim at choosing the potential such that Φ0 = 0 and Φn ≥ 0 because then we get P ai ≥ P ci. In words, the sum of amortized costs covers the sum of actual costs. To apply the method to binary counting we define the po- tential equal to the number of 1s in the binary notation, Φi = bi. It follows that Φi − Φi−1 = bi − bi−1 = (bi−1 − ti−1 + 1) − bi−1 = 1 − ti−1. The actual cost of the i-th operation is ci = 1 + ti−1, and the amortized cost is ai = ci + Φi − Φi−1 = 2. We have Φ0 = 0 and Φn ≥ 0 as desired, and therefore P ci ≤ P ai = 2n, which is consistent with the analysis of binary counting with the aggregation and the account- ing methods. 2-3-4 trees. As a more complicated application of amor- tization we consider 2-3-4 trees and the cost of restructur- ing them under insertions and deletions. We have seen 2-3-4 trees earlier when we talked about red-black trees. A set of keys is stored in sorted order in the internal nodes of a 2-3-4 tree, which is characterized by the following rules: (1) each internal node has 2 ≤ d ≤ 4 children and stores d − 1 keys; (2) all leaves have the same depth. As for binary trees, being sorted means that the left-to- right order of the keys is sorted. The only meaningful def- inition of this ordering is the ordered sequence of the first subtree followed by the first key stored in the root followed by the ordered sequence of the second subtree followed by the second key, etc. To insert a new key, we attach a new leaf and add the key to the parent ν of that leaf. All is fine unless ν overflows because it now has five children. If it does, we repair the violation of Rule (1) by climbing the tree one node at a time. We call an internal node non-saturated if it has fewer than four children. Case 1. ν has five children and a non-saturated sibling to its left or right. Move one child from ν to that sibling, as in Figure 29. $1 $0 $6 $3 Figure 29: The overflowing node gives one child to a non- saturated sibling. Case 2. ν has five children and no non-saturated sib- ling. Split ν into two nodes and recurse for the parent of ν, as in Figure 30. If ν has no parent then create a new root whose only children are the two nodes ob- tained from ν. $0 $6 $3 $6 $1 Figure 30: The overflowing node is split into two and the parent is treated recursively. Deleting a key is done is a similar fashion, although there we have to battle with nodes ν that have too few children rather than too many. Let ν have only one child. We repair Rule (1) by adopting a child from a sibling or by merging ν with a sibling. In the latter case the parent of ν looses a child and needs to be visited recursively. The two opera- tions are illustrated in Figures 31 and 32. $4 $3 $1 $0 Figure 31: The underflowing node receives one child from a sib- ling. Amortized analysis. The worst case for inserting a new key occurs when all internal nodes are saturated. The in- sertion then triggers logarithmically many splits. Sym- metrically, the worst case for a deletion occurs when all 27
  • 28. $1 $4 $0 $1 $0 Figure 32: The underflowing node is merged with a sibling and the parent is treated recursively. internal nodes have only two children. The deletion then triggers logarithmically many mergers. Nevertheless, we can show that in the amortized sense there are at most a constant number of split and merge operations per inser- tion and deletion. We use the accounting method and store money in the internal nodes. The best internal nodes have three children because then they are flexible in both directions. They require no money, but all other nodes are given a posi- tive amount to pay for future expenses caused by split and merge operations. Specifically, we store $4, $1, $0, $3, $6 in each internal node with 1, 2, 3, 4, 5 children. As il- lustrated in Figures 29 and 31, an adoption moves money only from ν to its sibling. The operation keeps the total amount the same or decreases it, which is even better. As shown in Figure 30, a split frees up $5 from ν and spends at most $3 on the parent. The extra $2 pay for the split operation. Similarly, a merger frees $5 from the two af- fected nodes and spends at most $3 on the parent. This is illustrated in Figure 32. An insertion makes an initial investment of at most $3 to pay for creating a new leaf. Similarly, a deletion makes an initial investment of at most $3 for destroying a leaf. If we charge $2 for each split and each merge operation, the money in the system suffices to cover the expenses. This implies that for n insertions and deletions we get a total of at most 3n 2 split and merge oper- ations. In other words, the amortized number of split and merge operations is at most 3 2 . Recall that there is a one-to-one correspondence be- tween 2-3-4 tree and red-black trees. We can thus trans- late the above update procedure and get an algorithm for red-black trees with an amortized constant restructuring cost per insertion and deletion. We already proved that for red-black trees the number of rotations per insertion and deletion is at most a constant. The above argument im- plies that also the number of promotions and demotions is at most a constant, although in the amortized and not in the worst-case sense as for the rotations. 28
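Returning to the binary counter at the beginning of this section, the aggregate bound is easy to check experimentally; the Python sketch below simply replays the increment loop and counts bit changes.

def count_bit_changes(n):
    # Increment a binary counter from 0 to n and count every bit change,
    # exactly as in the increment loop above.
    A = [0] * (n.bit_length() + 1)
    changes = 0
    for _ in range(n):
        i = 0
        while A[i] == 1:       # clear the trailing 1s
            A[i] = 0
            i += 1
            changes += 1
        A[i] = 1               # set the first 0 encountered
        changes += 1
    return changes

print(count_bit_changes(15))                    # 26, as in Figure 28
for n in (10, 100, 1000):
    print(count_bit_changes(n) < 2 * (n + 1))   # True: total cost is less than 2(n+1)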
  • 29. 9 Splay Trees Splay trees are similar to red-black trees except that they guarantee good shape (small height) only on the average. They are simpler to code than red-black trees and have the additional advantage of giving faster access to items that are more frequently searched. The reason for both is that splay trees are self-adjusting. Self-adjusting binary search trees. Instead of explic- itly maintaining the balance using additional information (such as the color of edges in the red-black tree), splay trees maintain balance implicitly through a self-adjusting mechanism. Good shape is a side-effect of the operations that are applied. These operations are applied while splay- ing a node, which means moving it up to the root of the tree, as illustrated in Figure 33. A detailed analysis will 2 1 3 4 4 3 1 2 2 3 1 4 1 4 3 2 Figure 33: The node storing 1 is splayed using three single rota- tions. reveal that single rotations do not imply good amortized performance but combinations of single rotations in pairs do. Aside from double rotations, we use roller-coaster rotations that compose two single left or two single right rotations, as shown in Figure 35. The sequence of the two single rotations is important, namely first the higher then the lower node. Recall that ZIG(κ) performs a single right rotation and returns the new root of the rotated subtree. The roller-coaster rotation to the right is then Node ∗ ZIGZIG(Node ∗ κ) return ZIG(ZIG(κ)). Function ZAGZAG is symmetric, exchanging left and right, and functions ZIGZAG and ZAGZIG are the two double rotations already used for red-black trees. Splay. A splay operation finds an item and uses rotations to move the corresponding node up to the root position. Whenever possible, a double rotation or a roller-coaster rotation is used. We dispense with special cases and show Function SPLAY for the case the search item x is less than the item in the root. if x ̺ → info then µ = ̺ → ℓ; if x µ → info then µ → ℓ = SPLAY(µ → ℓ, x); return ZIGZIG(̺) elseif x µ → info then µ → r = SPLAY(µ → r, x); return ZIGZAG(̺) else return ZIG(̺) endif. If x is stored in one of the children of ̺ then it is moved to the root by a single rotation. Otherwise, it is splayed recursively to the third level and moved to the root either by a double or a roller-coaster rotation. The number of rotation depends on the length of the path from ̺ to x. Specifically, if the path is i edges long then x is splayed in ⌊i/2⌋ double and roller-coaster rotations and zero or one single rotation. In the worst case, a single splay operation takes almost as many rotations as there are nodes in the tree. We will see shortly that the amortized number of rotations is at most logarithmic in the number of nodes. Amortized cost. Recall that the amortized cost of an op- eration is the actual cost minus the cost for work put into improving the data structure. To analyze the cost, we use a potential function that measures the well-being of the data structure. We need definitions: the size s(ν) is the number of descendents of node ν, in- cluding ν, the balance β(ν) is twice the floor of the binary logarithm of the size, β(ν) = 2⌊log2 s(ν)⌋, the potential Φ of a tree or a collection of trees is the sum of balances over all nodes, Φ = P β(ν), the actual cost ci of the i-th splay operation is 1 plus the number of single rotations (counting a double or roller-coaster rotation as two single rotations). the amortized cost ai of the i-th splay operation is ai = ci + Φi − Φi−1. 
We have Φ_0 = 0 for the empty tree and Φ_i ≥ 0 in general. This implies that the total actual cost does not exceed the total amortized cost, Σ c_i = Σ a_i − Φ_n + Φ_0 ≤ Σ a_i. To get a feeling for the potential, we compute Φ for the two extreme cases. Note first that the integral of the
  • 30. natural logarithm is R ln x = x ln x − x and therefore R log2 x = x log2 x − x/ ln 2. In the extreme unbal- anced case, the balance of the i-th node from the bottom is 2⌊log2 i⌋ and the potential is Φ = 2 n X i=1 ⌊log2 i⌋ = 2n log2 n − O(n). In the balanced case, we bound Φ from above by 2U(n), where U(n) = 2U(n 2 )+log2 n. We prove that U(n) 2n for the case when n = 2k . Consider the perfectly balanced tree with n leaves. The height of the tree is k = log2 n. We encode the term log2 n of the recurrence relation by drawing the hook-like path from the root to the right child and then following left edges until we reach the leaf level. Each internal node encodes one of the recursively surfac- ing log-terms by a hook-like path starting at that node. The paths are pairwise edge-disjoint, which implies that their total length is at most the number of edges in the tree, which is 2n − 2. Investment. The main part of the amortized time analy- sis is a detailed study of the three types of rotations: sin- gle, roller-coaster, and double. We write β(ν) for the bal- ance of a node ν before the rotation and β′ (ν) for the bal- ance after the rotation. Let ν be the lowest node involved in the rotation. The goal is to prove that the amortized cost of a roller-coaster and a double rotation is at most 3[β′ (ν) − β(ν)] each, and that of a single rotation is at most 1 + 3[β′ (ν) − β(ν)]. Summing these terms over the rotations of a splay operation gives a telescoping series in which all terms cancel except the first and the last. To this we add 1 for the at most one single rotation and another 1 for the constant cost in definition of actual cost. INVESTMENT LEMMA. The amortized cost of splaying a node ν in a tree ̺ is at most 2 + 3[β(̺) − β(ν)]. Before looking at the details of the three types of rota- tions, we prove that if two siblings have the same balance then their common parent has a larger balance. Because balances are even integers this means that the balance of the parent exceeds the balance of its children by at least 2. BALANCE LEMMA. If µ has children ν, κ and β(ν) = β(κ) = β then β(µ) ≥ β + 2. PROOF. By definition β(ν) = 2⌊log2 s(ν)⌋ and therefore s(ν) ≥ 2β/2 . We have s(µ) = 1 + s(ν) + s(κ) ≥ 21+β/2 , and therefore β(µ) ≥ β + 2. Single rotation. The amortized cost of a single rotation shown in Figure 34 is 1 for performing the rotation plus the change in the potential: a = 1 + β′ (ν) + β′ (µ) − β(ν) − β(µ) ≤ 1 + 3[β′ (ν) − β(ν)] because β′ (µ) ≤ β(µ) and β(ν) ≤ β′ (ν). µ ν µ ν Figure 34: The size of µ decreases and that of ν increases from before to after the rotation. Roller-coaster rotation. The amortized cost of a roller- coaster rotation shown in Figure 35 is a = 2 + β′ (ν) + β′ (µ) + β′ (κ) − β(ν) − β(µ) − β(κ) ≤ 2 + 2[β′ (ν) − β(ν)] because β′ (κ) ≤ β(κ), β′ (µ) ≤ β′ (ν), and β(ν) ≤ β(µ). We distinguish two cases to prove that a is bounded from above by 3[β′ (ν) − β(ν)]. In both cases, the drop in the µ ν κ ν µ κ µ κ ν Figure 35: If in the middle tree the balance of ν is the same as the balance of µ then by the Balance Lemma the balance of κ is less than that common balance. potential pays for the two single rotations. Case β′ (ν) β(ν). The difference between the balance of ν before and after the roller-coaster rotation is at least 2. Hence a ≤ 3[β′ (ν) − β(ν)]. Case β′ (ν) = β(ν) = β. Then the balances of nodes ν and µ in the middle tree in Figure 35 are also equal to β. The Balance Lemma thus implies that the bal- ance of κ in that middle tree is at most β − 2. 
But since the balance of κ after the roller-coaster rotation is the same as in the middle tree, we have β′(κ) < β. Hence a ≤ 0 = 3[β′(ν) − β(ν)]. 30
  • 31. Double rotation. The amortized cost of a double rota- tion shown in Figure 36 is a = 2 + β′ (ν) + β′ (µ) + β′ (κ) − β(ν) − β(µ) − β(κ) ≤ 2 + [β′ (ν) − β(ν)] because β′ (κ) ≤ β(κ) and β′ (µ) ≤ β(µ). We again dis- tinguish two cases to prove that a is bounded from above by 3[β′ (ν)−β(ν)]. In both cases, the drop in the potential pays for the two single rotations. Case β′ (ν) β(ν). The difference is at least 2, which implies a ≤ 3[β′ (ν) − β(ν)], as before. Case β′ (ν) = β(ν) = β. Then β(µ) = β(κ) = β. We have β′ (µ) β′ (ν) or β′ (κ) β′ (ν) by the Balance Lemma. Hence a ≤ 0 = 3[β′ (ν) − β(ν)]. µ κ ν µ ν κ Figure 36: In a double rotation, the sizes of µ and κ decrease from before to after the operation. Dictionary operations. In summary, we showed that the amortized cost of splaying a node ν in a binary search tree with root ̺ is at most 1+3[β(̺)−β(ν)]. We now use this result to show that splay trees have good amortized perfor- mance for all standard dictionary operations and more. To access an item we first splay it to the root and return the root even if it does not contain x. The amortized cost is O(β(̺)). Given an item x, we can split a splay tree into two, one containing all items smaller than or equal to x and the other all items larger than x, as illustrated in Figure 37. The amortized cost is the amortized cost for splaying plus x x Figure 37: After splaying x to the root, we split the tree by un- linking the right subtree. the increase in the potential, which we denote as Φ′ − Φ. Recall that the potential of a collection of trees is the sum of the balances of all nodes. Splitting the tree decreases the number of descendents and therefore the balance of the root, which implies that Φ′ − Φ 0. It follows that the amortized cost of a split operation is less than that of a splay operation and therefore in O(β(̺)). Two splay trees can be joined into one if all items in one tree are smaller than all items in the other tree, as il- lustrated in Figure 38. The cost for splaying the maximum max max Figure 38: We first splay the maximum in the tree with the smaller items and then link the two trees. in the first tree is O(β(̺1)). The potential increase caused by linking the two trees is Φ′ − Φ ≤ 2⌊log2(s(̺1) + s(̺2))⌋ ≤ 2 log2 s(̺1) + 2 log2 s(̺2). The amortized cost of joining is thus O(β(̺1) + β(̺2)). To insert a new item, x, we split the tree. If x is al- ready in the tree, we undo the split operation by linking the two trees. Otherwise, we make the two trees the left and right subtrees of a new node storing x. The amortized cost for splaying is O(β(̺)). The potential increase caused by linking is Φ′ − Φ ≤ 2⌊log2(s(̺1) + s(̺2) + 1)⌋ = β(̺). The amortized cost of an insertion is thus O(β(̺)). To delete an item, we splay it to the root, remove the root, and join the two subtrees. Removing x decreases the potential, and the amortized cost of joining the two sub- trees is at most O(β(̺)). This implies that the amortized cost of a deletion is at most O(β(̺)). Weighted search. A nice property of splay trees not shared by most other balanced trees is that they automat- ically adapt to biased search probabilities. It is plausible that this would be the case because items that are often accessed tend to live at or near the root of the tree. The analysis is somewhat involved and we only state the re- sult. Each item or node has a positive weight, w(ν) 0, 31
  • 32. and we define W = P ν w(ν). We have the following generalization of the Investment Lemma, which we state without proof. WEIGHTED INVESTMENT LEMMA. The amortized cost of splaying a node ν in a tree with total weight W is at most 2 + 3 log2(W/w(ν)). It can be shown that this result is asymptotically best pos- sible. In other words, the amortized search time in a splay tree is at most a constant times the optimum, which is what we achieve with an optimum weighted binary search tree. In contrast to splay trees, optimum trees are expen- sive to construct and they require explicit knowledge of the weights. 32
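For concreteness, here is a Python sketch of the recursive splay operation from this section, with the symmetric cases written out that the text dispenses with; it assumes the search item is present in the tree and uses the same rotation compositions as before.

class Node:
    def __init__(self, info, l=None, r=None):
        self.info, self.l, self.r = info, l, r

def zig(m):                 # single right rotation
    n = m.l; m.l = n.r; n.r = m; return n

def zag(m):                 # single left rotation
    n = m.r; m.r = n.l; n.l = m; return n

def zigzig(m): return zig(zig(m))               # roller-coaster to the right
def zagzag(m): return zag(zag(m))               # roller-coaster to the left
def zigzag(m): m.l = zag(m.l); return zig(m)    # double right rotation
def zagzig(m): m.r = zig(m.r); return zag(m)    # double left rotation

def splay(root, x):
    # Move the node storing x to the root; x is assumed to be present.
    if x < root.info:
        m = root.l
        if x < m.info:
            m.l = splay(m.l, x); return zigzig(root)
        elif x > m.info:
            m.r = splay(m.r, x); return zigzag(root)
        return zig(root)
    elif x > root.info:
        m = root.r
        if x > m.info:
            m.r = splay(m.r, x); return zagzag(root)
        elif x < m.info:
            m.l = splay(m.l, x); return zagzig(root)
        return zag(root)
    return root

# The chain of Figure 33: splaying 1 moves it to the root.
root = Node(4, Node(3, Node(2, Node(1))))
root = splay(root, 1)
print(root.info, root.r.info)   # 1 3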
  • 33. Second Homework Assignment Write the solution to each problem on a single page. The deadline for handing in solutions is October 02. Problem 1. (20 = 12 + 8 points). Consider an array A[1..n] for which we know that A[1] ≥ A[2] and A[n − 1] ≤ A[n]. We say that i is a local minimum if A[i − 1] ≥ A[i] ≤ A[i + 1]. Note that A has at least one local minimum. (a) We can obviously find a local minimum in time O(n). Describe a more efficient algorithm that does the same. (b) Analyze your algorithm. Problem 2. (20 points). A vertex cover for a tree is a sub- set V of its vertices such that each edge has at least one endpoint in V . It is minimum if there is no other vertex cover with a smaller number of vertices. Given a tree with n vertices, describe an O(n)-time algo- rithm for finding a minimum vertex cover. (Hint: use dynamic programming or the greedy method.) Problem 3. (20 points). Consider a red-black tree formed by the sequential insertion of n 1 items. Argue that the resulting tree has at least one red edge. [Notice that we are talking about a red-black tree formed by insertions. Without this assumption, the tree could of course consist of black edges only.] Problem 4. (20 points). Prove that 2n rotations suffice to transform any binary search tree into any other binary search tree storing the same n items. Problem 5. (20 = 5 + 5 + 5 + 5 points). Consider a collection of items, each consisting of a key and a cost. The keys come from a totally ordered universe and the costs are real numbers. Show how to maintain a collection of items under the following operations: (a) ADD(k, c): assuming no item in the collection has key k yet, add an item with key k and cost c to the collection; (b) REMOVE(k): remove the item with key k from the collection; (c) MAX(k1, k2): assuming k1 ≤ k2, report the maximum cost among all items with keys k ∈ [k1, k2]. (d) COUNT(c1, c2): assuming c1 ≤ c2, report the number of items with cost c ∈ [c1, c2]; Each operation should take at most O(log n) time in the worst case, where n is the number of items in the collection when the operation is performed. 33
  • 34. III PRIORITIZING 10 Heaps and Heapsort 11 Fibonacci Heaps 12 Solving Recurrence Relations Third Homework Assignment 34
  • 35. 10 Heaps and Heapsort A heap is a data structure that stores a set and allows fast access to the item with highest priority. It is the basis of a fast implementation of selection sort. On the average, this algorithm is a little slower than quicksort but it is not sensitive to the input ordering or to random bits and runs about as fast in the worst case as on the average. Priority queues. A data structure implements the prior- ity queue abstract data type if it supports at least the fol- lowing operations: void INSERT (item), item FINDMIN (void), void DELETEMIN (void). The operations are applied to a set of items with priori- ties. The priorities are totally ordered so any two can be compared. To avoid any confusion, we will usually refer to the priorities as ranks. We will always use integers as priorities and follow the convention that smaller ranks rep- resent higher priorities. In many applications, FINDMIN and DELETEMIN are combined: void EXTRACTMIN(void) r = FINDMIN; DELETEMIN; return r. Function EXTRACTMIN removes and returns the item with smallest rank. Heap. A heap is a particularly compact priority queue. We can think of it as a binary tree with items stored in the internal nodes, as in Figure 39. Each level is full, except 13 8 7 9 6 5 8 2 15 12 10 7 Figure 39: Ranks increase or, more precisely, do not decrease from top to bottom. possibly the last, which is filled from left to right until we run out of items. The items are stored in heap-order: every node µ has a rank larger than or equal to the rank of its parent. Symmetrically, µ has a rank less than or equal to the ranks of both its children. As a consequence, the root contains the item with smallest rank. We store the nodes of the tree in a linear array, level by level from top to bottom and each level from left to right, as shown in Figure 40. The embedding saves ex- 6 9 8 15 8 7 7 2 5 10 7 8 9 10 11 12 6 12 13 1 2 3 4 5 Figure 40: The binary tree is layed out in a linear array. The root is placed in A[1], its children follow in A[2] and A[3], etc. plicit pointers otherwise needed to establish parent-child relations. Specifically, we can find the children and par- ent of a node by index computation: the left child of A[i] is A[2i], the right child is A[2i + 1], and the parent is A[⌊i/2⌋]. The item with minimum rank is stored in the first element: item FINDMIN(int n) assert n ≥ 1; return A[1]. Since the index along a path at least doubles each step, paths can have length at most log2 n. Deleting the minimum. We first study the problem of repairing the heap-order if it is violated at the root, as shown in Figure 41. Let n be the length of the array. We 8 10 6 9 2 7 8 7 12 15 13 5 Figure 41: The root is exchanged with the smaller of its two children. The operation is repeated along a single path until the heap-order is repaired. repair the heap-order by a sequence of swaps along a sin- gle path. Each swap is between an item and the smaller of its children: 35
  • 36. void SIFT-DN(int i, n) if 2i ≤ n then k = arg min{A[2i], A[2i + 1]} if A[k] A[i] then SWAP(i, k); SIFT-DN(k, n) endif endif. Here we assume that A[n + 1] is defined and larger than A[n]. Since a path has at most log2 n edges, the time to re- pair the heap-order takes time at most O(log n). To delete the minimum we overwrite the root with the last element, shorten the heap, and repair the heap-order: void DELETEMIN(int ∗ n) A[1] = A[∗n]; ∗n−−; SIFT-DN(1, ∗n). Instead of the variable that stores n, we pass a pointer to that variable, ∗n, in order to use it as input and output parameter. Inserting. Consider repairing the heap-order if it is vio- lated at the last position of the heap. In this case, the item moves up the heap until it reaches a position where its rank is at least as large as that of its parent. void SIFT-UP(int i) if i ≥ 2 then k = ⌊i/2⌋; if A[i] A[k] then SWAP(i, k); SIFT-UP(k) endif endif. An item is added by first expanding the heap by one ele- ment, placing the new item in the position that just opened up, and repairing the heap-order. void INSERT(int ∗ n, item x) ∗n++; A[∗n] = x; SIFT-UP(∗n). A heap supports FINDMIN in constant time and INSERT and DELETEMIN in time O(log n) each. Sorting. Priority queues can be used for sorting. The first step throws all items into the priority queue, and the second step takes them out in order. Assuming the items are already stored in the array, the first step can be done by repeated heap repair: for i = 1 to n do SIFT-UP(i) endfor. In the worst case, the i-th item moves up all the way to the root. The number of exchanges is therefore at most Pn i=1 log2 i ≤ n log2 n. The upper bound is asymptot- ically tight because half the terms in the sum are at least log2 n 2 = log2 n−1. It is also possible to construct the ini- tial heap in time O(n) by building it from bottom to top. We modify the first step accordingly, and we implement the second step to rearrange the items in sorted order: void HEAPSORT(int n) for i = n downto 1 do SIFT-DN(i, n) endfor; for i = n downto 1 do SWAP(i, 1); SIFT-DN(1, i − 1) endfor. At each step of the first for-loop, we consider the sub- tree with root A[i]. At this moment, the items in the left and right subtrees rooted at A[2i] and A[2i + 1] are al- ready heaps. We can therefore use one call to function SIFT-DN to make the subtree with root A[i] a heap. We will prove shortly that this bottom-up construction of the heap takes time only O(n). Figure 42 shows the array after each iteration of the second for-loop. Note how the heap gets smaller by one element each step. A sin- 15 10 12 13 12 10 15 7 8 8 15 15 15 15 13 7 7 8 8 9 2 5 6 12 13 10 13 12 2 5 7 9 2 5 6 7 9 8 2 9 8 6 7 8 6 5 2 7 7 8 9 2 5 6 6 5 6 7 7 13 2 5 6 8 9 12 10 12 8 10 10 12 10 7 8 13 8 13 10 12 13 12 15 15 13 13 15 12 13 9 10 15 15 8 8 8 7 7 6 5 2 9 7 7 8 8 9 10 10 12 7 6 5 2 7 8 7 15 9 8 8 7 10 12 13 7 2 5 7 9 7 5 6 7 2 5 8 6 9 8 8 7 2 8 Figure 42: Each step moves the last heap element to the root and thus shrinks the heap. The circles mark the items involved in the sift-down operation. gle sift-down operation takes time O(log n), and in total HEAPSORT takes time O(n log n). In addition to the in- put array, HEAPSORT uses a constant number of variables 36
  • 37. and memory for the recursion stack used by SIFT-DN. We can save the memory for the stack by writing func- tion SIFT-DN as an iteration. The sort can be changed to non-decreasing order by reversing the order of items in the heap. Analysis of heap construction. We return to proving that the bottom-up approach to constructing a heap takes only O(n) time. Assuming the worst case, in which ev- ery node sifts down all the way to the last level, we draw the swaps as edges in a tree; see Figure 43. To avoid Figure 43: Each node generates a path that shares no edges with the paths of the other nodes. drawing any edge twice, we always first swap to the right and then continue swapping to the left until we arrive at the last level. This introduces only a small inaccuracy in our estimate. The paths cover each edge once, except for the edges on the leftmost path, which are not covered at all. The number of edges in the tree is n − 1, which im- plies that the total number of swaps is less than n. Equiv- alently, the amortized number of swaps per item is less than 1. There is a striking difference in time-complexity to sorting, which takes an amortized number of about log2 n comparisons per item. The difference between 1 and log2 n may be interpreted as a measure of how far from sorted a heap-ordered array still is. 37
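The sift-down repair and the two phases of HEAPSORT translate directly into Python; the sketch below uses 0-indexing, so the children of A[i] sit at positions 2i+1 and 2i+2 rather than at 2i and 2i+1 as in Figure 40.

def sift_down(A, i, n):
    # Repair the heap-order below position i; the heap occupies A[0..n-1].
    while 2 * i + 1 < n:
        k = 2 * i + 1
        if k + 1 < n and A[k + 1] < A[k]:
            k += 1                      # the smaller of the two children
        if A[k] >= A[i]:
            break
        A[i], A[k] = A[k], A[i]
        i = k

def heapsort(A):
    n = len(A)
    # First phase: bottom-up heap construction in O(n).
    for i in range(n - 1, -1, -1):
        sift_down(A, i, n)
    # Second phase: repeatedly move the minimum to the end of the shrinking
    # heap, which leaves the array in non-increasing order.
    for i in range(n - 1, 0, -1):
        A[0], A[i] = A[i], A[0]
        sift_down(A, 0, i)

A = [2, 8, 7, 9, 5, 6, 15, 8, 10, 12, 13, 7]   # the items of Figure 39
heapsort(A)
print(A)   # non-increasing; reverse the comparisons for non-decreasing order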
  • 38. 11 Fibonacci Heaps The Fibonacci heap is a data structure implementing the priority queue abstract data type, just like the ordinary heap but more complicated and asymptotically faster for some operations. We first introduce binomial trees, which are special heap-ordered trees, and then explain Fibonacci heaps as collections of heap-ordered trees. Binomial trees. The binomial tree of height h is a tree obtained from two binomial trees of height h − 1, by link- ing the root of one to the other. The binomial tree of height 0 consists of a single node. Binomial trees of heights up to 4 are shown in Figure 44. Each step in the construc- Figure 44: Binomial trees of heights 0, 1, 2, 3, 4. Each tree is obtained by linking two copies of the previous tree. tion increases the height by one, increases the degree (the number of children) of the root by one, and doubles the size of the tree. It follows that a binomial tree of height h has root degree h and size 2h . The root has the largest de- gree of any node in the binomial tree, which implies that every node in a binomial tree with n nodes has degree at most log2 n. To store any set of items with priorities, we use a small collection of binomial trees. For an integer n, let ni be the i-th bit in the binary notation, so we can write n = P i≥0 ni2i . To store n items, we use a binomial tree of size 2i for each ni = 1. The total number of binomial trees is thus the number of 1’s in the binary notation of n, which is at most log2(n + 1). The collection is referred to as a binomial heap. The items in each binomial tree are stored in heap-order. There is no specific relationship between the items stored in different binomial trees. The item with minimum key is thus stored in one of the logarithmically many roots, but it is not prescribed ahead of time in which one. An example is shown in Figure 45 where 1110 = 10112 items are stored in three binomial trees with sizes 8, 2, and 1. In order to add a new item to the set, we create a new binomial tree of size 1 and we successively link binomial trees as dictated by the rules of adding 1 to the = + 10 4 11 13 12 15 7 15 9 8 9 15 10 11 13 15 12 4 7 9 5 8 5 9 Figure 45: Adding the shaded node to a binomial heap consisting of three binomial trees. binary notation of n. In the example, we get 10112 +12 = 11002. The new collection thus consists of two binomial trees with sizes 8 and 4. The size 8 tree is the old one, and the size 4 tree is obtained by first linking the two size 1 trees and then linking the resulting size 2 tree to the old size 2 tree. All this is illustrated in Figure 45. Fibonacci heaps. A Fibonacci heap is a collection of heap-ordered trees. Ideally, we would like it to be a col- lection of binomial trees, but we need more flexibility. It will be important to understand how exactly the nodes of a Fibonacci heap are connected by pointers. Siblings are or- ganized in doubly-linked cyclic lists, and each node has a pointer to its parent and a pointer to one of its children, as shown in Figure 46. Besides the pointers, each node stores min 12 13 5 15 7 10 9 4 8 11 9 Figure 46: The Fibonacci heap representation of the first collec- tion of heap-ordered trees in Figure 45. a key, its degree, and a bit that can be used to mark or un- mark the node. The roots of the heap-ordered trees are doubly-linked in a cycle, and there is an explicit pointer to the root that stores the item with the minimum key. Figure 47 illustrates a few basic operations we perform on a Fi- bonacci heap. 
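Before turning to the operations, it may help to fix the pointer structure in code. The following C sketch is ours, with hypothetical field names: each node carries a key, its degree, a mark bit, a parent pointer, a pointer to one of its children, and left/right pointers for the doubly-linked cyclic list of its siblings. A small helper merges two sibling cycles; the driver at the end only shows how a tiny root cycle and the minimum pointer are set up.

#include <stdio.h>
#include <stdlib.h>

/* One node of a Fibonacci heap; a minimal sketch of the pointer
   structure in Figure 46 (field names are ours). */
typedef struct FibNode {
    int key;                 /* the priority stored in the node   */
    int degree;              /* number of children                */
    int mark;                /* has the node lost a child yet?    */
    struct FibNode *parent;  /* NULL for roots                    */
    struct FibNode *child;   /* any one child, NULL if none       */
    struct FibNode *left;    /* neighbors in the sibling cycle    */
    struct FibNode *right;
} FibNode;

typedef struct {
    FibNode *min;            /* root storing the minimum key      */
    int      n;              /* number of nodes in the heap       */
} FibHeap;

/* Create a node that forms a one-element cyclic list by itself. */
static FibNode *make_node(int key) {
    FibNode *x = malloc(sizeof *x);
    x->key = key; x->degree = 0; x->mark = 0;
    x->parent = x->child = NULL;
    x->left = x->right = x;
    return x;
}

/* Merge two non-empty cycles: cut each open after a and after b,
   and reconnect the four loose ends. */
static void merge_cycles(FibNode *a, FibNode *b) {
    FibNode *ar = a->right, *br = b->right;
    a->right = br; br->left = a;
    b->right = ar; ar->left = b;
}

int main(void) {
    FibHeap H = { NULL, 0 };
    FibNode *a = make_node(5), *b = make_node(7);
    merge_cycles(a, b);                 /* root cycle now holds 5 and 7 */
    H.min = (a->key < b->key) ? a : b;  /* explicit pointer to the minimum */
    H.n = 2;
    printf("min = %d\n", H.min->key);
    return 0;
}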
Given two heap-ordered trees, we link them by making the root with the bigger key the child of the other root. To unlink a heap-ordered tree or subtree, we remove its root from the doubly-linked cycle. Finally, to merge two cycles, we cut both open and connect them at 38
  • 39. merging linking unlinking Figure 47: Cartoons for linking two trees, unlinking a tree, and merging two cycles. their ends. Any one of these three operations takes only constant time. Potential function. A Fibonacci heap supports a vari- ety of operations, including the standard ones for priority queues. We use a potential function to analyze their amor- tized cost applied to an initially empty Fibonacci heap. Letting ri be the number of roots in the root cycle and mi the number of marked nodes, the potential after the i-th operation is Φi = ri +2mi. When we deal with a col- lection of Fibonacci heaps, we define its potential as the sum of individual potentials. The initial Fibonacci heap is empty, so Φ0 = 0. As usual, we let ci be the actual cost and ai = ci + Φi − Φi−1 the amortized cost of the i-th operation. Since Φ0 = 0 and Φi ≥ 0 for all i, the actual cost is less than the amortized cost: n X i=1 ci ≤ n X i=1 ai = rn + 2mn + n X i=1 ci. For some of the operations, it is fairly easy to compute the amortized cost. We get the minimum by returning the key in the marked root. This operation does not change the po- tential and its amortized and actual cost is ai = ci = 1. We meld two Fibonacci heaps, H1 and H2, by first merg- ing the two root circles and second adjusting the pointer to the minimum key. We have ri(H) = ri−1(H1) + ri−1(H2), mi(H) = mi−1(H1) + mi−1(H2), which implies that there is no change in potential. The amortized and actual cost is therefore ai = ci = 1. We insert a key into a Fibonacci heap by first creating a new Fibonacci heap that stores only the new key and second melding the two heaps. We have one more node in the root cycle so the change in potential is Φi − Φi−1 = 1. The amortized cost is therefore ai = ci + 1 = 2. Deletemin. Next we consider the somewhat more in- volved operation of deleting the minimum key, which is done in four steps: Step 1. Remove the node with minimum key from the root cycle. Step 2. Merge the root cycle with the cycle of children of the removed node. Step 3. As long as there are two roots with the same degree link them. Step 4. Recompute the pointer to the minimum key. For Step 3, we use a pointer array R. Initially, R[i] = NULL for each i. For each root ̺ in the root cycle, we execute the following iteration. i = ̺ → degree; while R[i] 6= NULL do ̺′ = R[i]; R[i] = NULL; ̺ = LINK(̺, ̺′ ); i++ endwhile; R[i] = ̺. To analyze the amortized cost for deleting the minimum, let D(n) be the maximum possible degree of any node in a Fibonacci heap of n nodes. The number of linking operations in Step 3 is the number of roots we start with, which is less than ri−1 +D(n), minus the number of roots we end up with, which is ri. After Step 3, all roots have different degrees, which implies ri ≤ D(n)+1. It follows that the actual cost for the four steps is ci ≤ 1 + 1 + (ri−1 + D(n) − ri) + (D(n) + 1) = 3 + 2D(n) + ri−1 − ri. The potential change is Φi −Φi−1 = ri −ri−1. The amor- tized cost is therefore ai = ci + Φi − Φi−1 ≤ 2D(n) + 3. We will prove next time that the maximum possible de- gree is at most logarithmic in the size of the Fibonacci heap, D(n) 2 log2(n + 1). This implies that deleting the minimum has logarithmic amortized cost. Decreasekey and delete. Besides deletemin, we also have operations that delete an arbitrary item and that de- crease the key of an item. Both change the structure of the heap-ordered trees and are the reason why a Fibonacci heap is not a collection of binomial trees but of more gen- eral heap-ordered trees. 
The decreasekey operation re- places the item with key x stored in the node ν by x − ∆, where ∆ ≥ 0. We will see that this can be done more effi- ciently than to delete x and to insert x − ∆. We decrease the key in four steps. 39
• 40. Step 1. Unlink the tree rooted at ν.
Step 2. Decrease the key in ν by ∆.
Step 3. Add ν to the root cycle and possibly update the pointer to the minimum key.
Step 4. Do cascading cuts.

We will explain cascading cuts shortly, after explaining the four steps we take to delete a node ν. Before we delete a node ν, we check whether ν = min, and if it is then we delete the minimum as explained above. Assume therefore that ν ≠ min.

Step 1. Unlink the tree rooted at ν.
Step 2. Merge the root-cycle with the cycle of ν's children.
Step 3. Dispose of ν.
Step 4. Do cascading cuts.

Figure 48 illustrates the effect of decreasing a key and of deleting a node. Both operations create trees that are not binomial, and we use cascading cuts to make sure that the shapes of these trees are not very different from the shapes of binomial trees.

Figure 48: A Fibonacci heap initially consisting of three binomial trees modified by a decreasekey and a delete operation.

Cascading cuts. Let ν be a node that becomes the child of another node at time t. We mark ν when it loses its first child after time t. Then we unmark ν, unlink it, and add it to the root-cycle when it loses its second child thereafter. We call this operation a cut, and it may cascade because one cut can cause another, and so on. Figure 49 illustrates the effect of cascading in a heap-ordered tree with two marked nodes. The first step decreases key 10 to 7, and the second step cuts first node 5 and then node 4.

Figure 49: The effect of cascading after decreasing 10 to 7. Marked nodes are shaded.

Summary analysis. As mentioned earlier, we will prove D(n) < 2 log₂(n + 1) next time. Assuming this bound, we are able to compute the amortized cost of all operations. The actual cost of Step 4 in decreasekey or in delete is the number of cuts, ci. The potential changes because there are ci new roots and ci fewer marked nodes. Also, the last cut may introduce a new mark. Thus

Φi − Φi−1 = ri − ri−1 + 2mi − 2mi−1
          ≤ ci − 2ci + 2 = −ci + 2.

The amortized cost is therefore ai = ci + Φi − Φi−1 ≤ ci − (ci − 2) = 2. The first three steps of a decreasekey operation take only a constant amount of actual time and increase the potential by at most a constant amount. It follows that the amortized cost of decreasekey, including the cascading cuts in Step 4, is only a constant. Similarly, the actual cost of a delete operation is at most a constant, but Step 2 may increase the potential of the Fibonacci heap by as much as D(n). The rest is bounded from above by a constant, which implies that the amortized cost of the delete operation is O(log n). We summarize the amortized cost of the various operations supported by the Fibonacci heap:

find the minimum            O(1)
meld two heaps              O(1)
insert a new item           O(1)
delete the minimum          O(log n)
decrease the key of a node  O(1)
delete a node               O(log n)

We will later see graph problems for which the difference in the amortized cost of the decreasekey and delete operations implies a significant improvement in the running time. 40
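The marking rule behind cascading cuts is easy to miss, so here is a tiny stand-alone demonstration in C. It is not a Fibonacci heap; it only simulates the Step-4 bookkeeping on a tree stored as parent indices, with an artificially marked chain of nodes, and all names are ours. Cutting the last node of the chain triggers a cascade through every marked ancestor, as in Figure 49.

#include <stdio.h>

/* Node 0 is the root; parent[i] < 0 marks a root. marked[i] records
   whether node i has lost a child since it last became a child.
   cut(i) moves i to the root list and recurses on the parent when the
   parent was already marked. */
#define N 6
int parent[N] = { -1, 0, 1, 2, 3, 4 };   /* a chain 0 <- 1 <- 2 <- ... <- 5 */
int marked[N] = {  0, 1, 1, 1, 1, 0 };   /* nodes 1..4 already marked       */

void cut(int i) {
    int p = parent[i];
    parent[i] = -1;                       /* i joins the root list           */
    marked[i] = 0;                        /* roots are never marked          */
    printf("cut %d\n", i);
    if (p < 0) return;                    /* i was already a root            */
    if (marked[p]) cut(p);                /* p loses its second child: cascade */
    else if (parent[p] >= 0) marked[p] = 1;  /* p loses its first child      */
}

int main(void) {
    cut(5);   /* prints: cut 5, cut 4, cut 3, cut 2, cut 1 */
    return 0;
}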
• 41. 12 Solving Recurrence Relations

Recurrence relations are perhaps the most important tool in the analysis of algorithms. We have encountered several methods that can sometimes be used to solve such relations, such as guessing the solution and proving it by induction, or developing the relation into a sum for which we find a closed form expression. We now describe a new method to solve recurrence relations and use it to settle the remaining open question in the analysis of Fibonacci heaps.

Annihilation of sequences. Suppose we are given an infinite sequence of numbers, A = ⟨a0, a1, a2, . . .⟩. We can multiply with a constant, shift to the left and add another sequence:

kA = ⟨ka0, ka1, ka2, . . .⟩,
LA = ⟨a1, a2, a3, . . .⟩,
A + B = ⟨a0 + b0, a1 + b1, a2 + b2, . . .⟩.

As an example, consider the sequence of powers of two, ai = 2^i. Multiplying with 2 and shifting to the left give the same result. Therefore, LA − 2A = ⟨0, 0, 0, . . .⟩. We write LA − 2A = (L − 2)A and think of L − 2 as an operator that annihilates the sequence of powers of 2. In general, L − k annihilates any sequence of the form ⟨ck^i⟩. What does L − k do to other sequences A = ⟨cℓ^i⟩, when ℓ ≠ k?

(L − k)A = ⟨cℓ, cℓ², cℓ³, . . .⟩ − ⟨ck, ckℓ, ckℓ², . . .⟩
         = (ℓ − k)⟨c, cℓ, cℓ², . . .⟩ = (ℓ − k)A.

We see that the operator L − k annihilates only one type of sequence and multiplies other similar sequences by a constant.

Multiple operators. Instead of just one, we can apply several operators to a sequence. We may multiply with two constants, k(ℓA) = (kℓ)A, multiply and shift, L(kA) = k(LA), and shift twice, L(LA) = L²A. For example, (L − k)(L − ℓ) annihilates all sequences of the form ⟨ck^i + dℓ^i⟩, where we assume k ≠ ℓ. Indeed, L − k annihilates ⟨ck^i⟩ and leaves behind ⟨(ℓ − k)dℓ^i⟩, which is annihilated by L − ℓ. Furthermore, (L − k)(L − ℓ) annihilates no other sequences. More generally, we have

FACT. (L − k₁)(L − k₂) . . . (L − kₙ) annihilates all sequences of the form ⟨c₁k₁^i + c₂k₂^i + . . . + cₙkₙ^i⟩.

What if k = ℓ? To answer this question, we consider

(L − k)²⟨ik^i⟩ = (L − k)⟨(i + 1)k^{i+1} − ik^{i+1}⟩ = (L − k)⟨k^{i+1}⟩ = ⟨0⟩.

More generally, we have

FACT. (L − k)^n annihilates all sequences of the form ⟨p(i)k^i⟩, with p(i) a polynomial of degree n − 1.

Since operators annihilate only certain types of sequences, we can determine the sequence if we know the annihilating operator. The general method works in five steps:

1. Write down the annihilator for the recurrence.
2. Factor the annihilator.
3. Determine what sequence each factor annihilates.
4. Put the sequences together.
5. Solve for the constants of the solution by using initial conditions.

Fibonacci numbers. We put the method to a test by considering the Fibonacci numbers defined recursively as follows:

F0 = 0,
F1 = 1,
Fj = Fj−1 + Fj−2, for j ≥ 2.

Writing a few of the initial numbers, we get the sequence ⟨0, 1, 1, 2, 3, 5, 8, . . .⟩. We notice that L² − L − 1 annihilates the sequence because

(L² − L − 1)⟨Fj⟩ = L²⟨Fj⟩ − L⟨Fj⟩ − ⟨Fj⟩
                 = ⟨Fj+2⟩ − ⟨Fj+1⟩ − ⟨Fj⟩ = ⟨0⟩.

If we factor the operator into its roots, we get L² − L − 1 = (L − ϕ)(L − ϕ̄), where

ϕ = (1 + √5)/2 = 1.618 . . . ,
ϕ̄ = 1 − ϕ = (1 − √5)/2 = −0.618 . . . .
41
• 42. The first root is known as the golden ratio because it represents the aspect ratio of a rectangular piece of paper from which we may remove a square to leave a smaller rectangular piece of the same ratio: ϕ : 1 = 1 : ϕ − 1. Thus we know that (L − ϕ)(L − ϕ̄) annihilates ⟨Fj⟩ and this means that the j-th Fibonacci number is of the form Fj = cϕ^j + c̄ϕ̄^j. We get the constant factors from the initial conditions:

F0 = 0 = c + c̄,
F1 = 1 = cϕ + c̄ϕ̄.

Solving the two linear equations in two unknowns, we get c = 1/√5 and c̄ = −1/√5. This implies that

Fj = (1/√5) ((1 + √5)/2)^j − (1/√5) ((1 − √5)/2)^j.

From this viewpoint, it seems surprising that Fj turns out to be an integer for all j. Note that |ϕ| > 1 and |ϕ̄| < 1. It follows that for growing exponent j, ϕ^j goes to infinity and ϕ̄^j goes to zero. This implies that Fj is approximately ϕ^j/√5, and that this approximation becomes more and more accurate as j grows.

Maximum degree. Recall that D(n) is the maximum possible degree of any one node in a Fibonacci heap of size n. We need two easy facts about the kind of trees that arise in Fibonacci heaps in order to show that D(n) is at most logarithmic in n. Let ν be a node of degree j, and let µ1, µ2, . . . , µj be its children ordered by the time they were linked to ν.

DEGREE LEMMA. The degree of µi is at least i − 2.

PROOF. Recall that nodes are linked only during the deletemin operation. Right before the linking happens, the two nodes are roots and have the same degree. It follows that the degree of µi was at least i − 1 at the time it was linked to ν. The degree of µi might have been even higher because it is possible that ν lost some of the older children after µi had been linked. After being linked, µi may have lost at most one of its children, for else it would have been cut. Its degree is therefore at least i − 2, as claimed.

SIZE LEMMA. The number of descendents of ν (including ν) is at least Fj+2.

PROOF. Let sj be the minimum number of descendents a node of degree j can have. We have s0 = 1 and s1 = 2. For larger j, we get sj from sj−1 by adding the size of a minimum tree with root degree j − 2, which is sj−2. Hence sj = sj−1 + sj−2, which is the same recurrence relation that defines the Fibonacci numbers. The initial values are shifted two positions so we get sj = Fj+2, as claimed.

Consider a Fibonacci heap with n nodes and let ν be a node with maximum degree D = D(n). The Size Lemma implies n ≥ FD+2. The Fibonacci number with index D + 2 is roughly ϕ^{D+2}/√5. Because ϕ̄^{D+2} < √5, we have n ≥ (1/√5) ϕ^{D+2} − 1. After rearranging the terms and taking the logarithm to the base ϕ, we get D ≤ log_ϕ(√5 (n + 1)) − 2. Recall that log_ϕ x = log₂ x / log₂ ϕ and use the calculator to verify that log₂ ϕ = 0.694 . . . > 0.5 and log_ϕ √5 = 1.672 . . . < 2. Hence

D ≤ log₂(n + 1)/log₂ ϕ + log_ϕ √5 − 2 < 2 log₂(n + 1).

Non-homogeneous terms. We now return to the annihilation method for solving recurrence relations and consider

aj = aj−1 + aj−2 + 1.

This is similar to the recurrence that defines Fibonacci numbers and describes the minimum number of nodes in an AVL tree, also known as height-balanced tree. It is defined by the requirement that the height of the two subtrees of a node differ by at most 1. The smallest tree of height j thus consists of the root, a subtree of height j − 1 and another subtree of height j − 2. We refer to the terms involving ai as the homogeneous terms of the relation and the others as the non-homogeneous terms. 
We know that L² − L − 1 annihilates the homogeneous part, aj = aj−1 + aj−2. If we apply it to the entire relation we get

(L² − L − 1)⟨aj⟩ = ⟨aj+2⟩ − ⟨aj+1⟩ − ⟨aj⟩ = ⟨1, 1, . . .⟩.

The remaining sequence of 1s is annihilated by L − 1. In other words, (L − ϕ)(L − ϕ̄)(L − 1) annihilates ⟨aj⟩ implying that aj = cϕ^j + c̄ϕ̄^j + c′1^j. It remains to find 42
• 43. the constants, which we get from the boundary conditions a0 = 1, a1 = 2 and a2 = 4:

c + c̄ + c′ = 1,
ϕc + ϕ̄c̄ + c′ = 2,
ϕ²c + ϕ̄²c̄ + c′ = 4.

Noting that ϕ² = ϕ + 1, ϕ̄² = ϕ̄ + 1, and ϕ − ϕ̄ = √5, we get c = (5 + 2√5)/5, c̄ = (5 − 2√5)/5, and c′ = −1. The minimum number of nodes of a height-j AVL tree is therefore roughly the constant c times ϕ^j. Conversely, the maximum height of an AVL tree with n = cϕ^j nodes is roughly j = log_ϕ(n/c) = 1.440 . . . · log₂ n + O(1). In words, the height-balancing condition implies logarithmic height.

Transformations. We extend the set of recurrences we can solve by employing transformations that produce relations amenable to the annihilation method. We demonstrate this by considering mergesort, which is another divide-and-conquer algorithm that can be used to sort a list of n items:

Step 1. Recursively sort the left half of the list.
Step 2. Recursively sort the right half of the list.
Step 3. Merge the two sorted lists by simultaneously scanning both from beginning to end.

The running time is described by the solution to the recurrence

T(1) = 1,
T(n) = 2T(n/2) + n.

We have no way to work with terms like T(n/2) yet. However, we can transform the recurrence into a more manageable form. Defining n = 2^i and ti = T(2^i) we get

t0 = 1,
ti = 2ti−1 + 2^i.

The homogeneous part is annihilated by L − 2. Similarly, the non-homogeneous part is annihilated by L − 2. Hence, (L − 2)² annihilates the entire relation and we get ti = (ci + c′)2^i. Expressed in the original notation we thus have T(n) = (c log₂ n + c′)n = O(n log n). This result is of course no surprise and reconfirms what we learned earlier about sorting.

The Master Theorem. It is sometimes more convenient to look up the solution to a recurrence relation than playing with different techniques to see whether any one can make it yield. Such a cookbook method for recurrence relations of the form

T(n) = aT(n/b) + f(n)

is provided by the following theorem. Here we assume that a ≥ 1 and b > 1 are constants and that f is a well-behaved positive function.

MASTER THEOREM. Define c = log_b a and let ε be an arbitrarily small positive constant. Then

T(n) = O(n^c)        if f(n) = O(n^{c−ε}),
T(n) = O(n^c log n)  if f(n) = O(n^c),
T(n) = O(f(n))       if f(n) = Ω(n^{c+ε}).

The last of the three cases also requires a usually satisfied technical condition, namely that af(n/b) ≤ δf(n) for some constant δ strictly less than 1. For example, this condition is satisfied in T(n) = 2T(n/2) + n², which implies T(n) = O(n²).

As another example consider the relation T(n) = 2T(n/2) + n that describes the running time of mergesort. We have c = log₂ 2 = 1 and f(n) = n = O(n^c). The middle case of the Master Theorem applies and we get T(n) = O(n log n), as before. 43
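The three steps of mergesort translate directly into code. The following C sketch is ours; the work done outside the two recursive calls is a single scan of the n items, which is precisely the term +n in the recurrence T(n) = 2T(n/2) + n.

#include <stdio.h>
#include <string.h>

/* Step 1: sort the left half; Step 2: sort the right half;
   Step 3: merge by scanning both halves from beginning to end. */
static void merge_sort(int A[], int tmp[], int lo, int hi) {
    if (hi - lo <= 1) return;             /* at most one item: already sorted */
    int mid = lo + (hi - lo) / 2;
    merge_sort(A, tmp, lo, mid);          /* Step 1 */
    merge_sort(A, tmp, mid, hi);          /* Step 2 */
    int i = lo, j = mid, k = lo;          /* Step 3 */
    while (i < mid && j < hi)
        tmp[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];
    while (i < mid) tmp[k++] = A[i++];
    while (j < hi)  tmp[k++] = A[j++];
    memcpy(A + lo, tmp + lo, (size_t)(hi - lo) * sizeof(int));
}

int main(void) {
    int A[] = {5, 2, 9, 1, 7, 3, 8, 4, 6};
    int tmp[9];
    merge_sort(A, tmp, 0, 9);
    for (int i = 0; i < 9; i++) printf("%d ", A[i]);
    printf("\n");
    return 0;
}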
  • 44. Third Homework Assignment Write the solution to each problem on a single page. The deadline for handing in solutions is October 14. Problem 1. (20 = 10 + 10 points). Consider a lazy ver- sion of heapsort in which each item in the heap is either smaller than or equal to every other item in its subtree, or the item is identified as uncertified. To certify an item, we certify its children and then ex- change it with the smaller child provided it is smaller than the item itself. Suppose A[1..n] is a lazy heap with all items uncertified. (a) How much time does it take to certify A[1]? (b) Does certifying A[1] turn A into a proper heap in which every item satisfies the heap property? (Justify your answer.) Problem 2. (20 points). Recall that Fibonacci numbers are defined recursively as F0 = 0, F1 = 1, and Fn = Fn−1 +Fn−2. Prove the square of the n-th Fibonacci number differs from the product of the two adjacent numbers by one: F2 n = Fn−1 · Fn+1 + (−1)n+1 . Problem 3. (20 points). Professor Pinocchio claims that the height of an n-node Fibonacci heap is at most some constant times log2 n. Show that the Profes- sor is mistaken by exhibiting, for any integer n, a sequence of operations that create a Fibonacci heap consisting of just one tree that is a linear chain of n nodes. Problem 4. (20 = 10 + 10 points). To search in a sorted array takes time logarithmic in the size of the array, but to insert a new items takes linear time. We can improve the running time for insertions by storing the items in several instead of just one sorted arrays. Let n be the number of items, let k = ⌈log2(n + 1)⌉, and write n = nk−1nk−2 . . . n0 in binary notation. We use k sorted arrays Ai (some possibly empty), where Ai stores ni2i items. Each item is stored ex- actly once, and the total size of the arrays is indeed Pk i=0 ni2i = n. Although each individual array is sorted, there is no particular relationship between the items in different arrays. (a) Explain how to search in this data structure and analyze your algorithm. (b) Explain how to insert a new item into the data structure and analyze your algorithm, both in worst-case and in amortized time. Problem 5. (20 = 10 + 10 points). Consider a full bi- nary tree with n leaves. The size of a node, s(ν), is the number of leaves in its subtree and the rank is the floor of the binary logarithm of the size, r(ν) = ⌊log2 s(ν)⌋. (a) Is it true that every internal node ν has a child whose rank is strictly less than the rank of ν? (b) Prove that there exists a leaf whose depth (length of path to the root) is at most log2 n. 44
• 45. IV GRAPH ALGORITHMS
13 Graph Search
14 Shortest Paths
15 Minimum Spanning Trees
16 Union-Find
Fourth Homework Assignment
45
  • 46. 13 Graph Search We can think of graphs as generalizations of trees: they consist of nodes and edges connecting nodes. The main difference is that graphs do not in general represent hier- archical organizations. Types of graphs. Different applications require differ- ent types of graphs. The most basic type is the simple undirected graph that consists of a set V of vertices and a set E of edges. Each edge is an unordered pair (a set) of two vertices. We always assume V is finite, and we write 3 4 1 2 0 Figure 50: A simple undirected graph with vertices 0, 1, 2, 3, 4 and edges {0, 1}, {1, 2}, {2, 3}, {3, 0}, {3, 4}. V 2 for the collection of all unordered pairs. Hence E is a subset of V 2 . Note that because E is a set, each edge can occur only once. Similarly, because each edge is a set (of two vertices), it cannot connect to the same vertex twice. Vertices u and v are adjacent if {u, v} ∈ E. In this case u and v are called neighbors. Other types of graphs are directed: E ⊆ V × V . weighted: has a weighting function w : E → R. labeled: has a labeling function ℓ : V → Z. non-simple: there are loops and multi-edges. A loop is like an edge, except that it connects to the same vertex twice. A multi-edge consists of two or more edges connecting the same two vertices. Representation. The two most popular data structures for graphs are direct representations of adjacency. Let V = {0, 1, . . ., n − 1} be the set of vertices. The ad- jacency matrix is the n-by-n matrix A = (aij) with aij = 1 if {i, j} ∈ E, 0 if {i, j} 6∈ E. For undirected graphs, we have aij = aji, so A is sym- metric. For weighted graphs, we encode more informa- tion than just the existence of an edge and define aij as the weight of the edge connecting i and j. The adjacency matrix of the graph in Figure 50 is A =       0 1 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0       , which is symmetric. Irrespective of the number of edges, 0 1 2 3 V 4 0 2 4 3 3 1 3 0 2 1 Figure 51: The adjacency list representation of the graph in Fig- ure 50. Each edge is represented twice, once for each endpoint. the adjacency matrix has n2 elements and thus requires a quadratic amount of space. Often, the number of edges is quite small, maybe not much larger than the number of vertices. In these cases, the adjacency matrix wastes mem- ory, and a better choice is a sparse matrix representation referred to as adjacency lists, which is illustrated in Fig- ure 51. It consists of a linear array V for the vertices and a list of neighbors for each vertex. For most algorithms, we assume that vertices and edges are stored in structures containing a small number of fields: struct Vertex {int d, f, π; Edge ∗adj}; struct Edge {int v; Edge ∗next}. The d, f, π fields will be used to store auxiliary informa- tion used or created by the algorithms. Depth-first search. Since graphs are generally not or- dered, there are many sequences in which the vertices can be visited. In fact, it is not entirely straightforward to make sure that each vertex is visited once and only once. A use- ful method is depth-first search. It uses a global variable, time, which is incremented and used to leave time-stamps behind to avoid repeated visits. 46
  • 47. void VISIT(int i) 1 time++; V [i].d = time; forall outgoing edges ij do 2 if V [j].d = 0 then 3 V [j].π = i; VISIT(j) endif endfor; 4 time++; V [i].f = time. The test in line 2 checks whether the neighbor j of i has already been visited. The assignment in line 3 records that the vertex is visited from vertex i. A vertex is first stamped in line 1 with the time at which it is encountered. A vertex is second stamped in line 4 with the time at which its visit has been completed. To prepare the search, we initialize the global time variable to 0, label all vertices as not yet visited, and call VISIT for all yet unvisited vertices. time = 0; forall vertices i do V [i].d = 0 endfor; forall vertices i do if V [i].d = 0 then V [i].π = 0; VISIT(i) endif endfor. Let n be the number of vertices and m the numberof edges in the graph. Depth-first search visits every vertex once and examines every edge twice, once for each endpoint. The running time is therefore O(n + m), which is propor- tional to the size of the graph and therefore optimal. DFS forest. Figure 52 illustrates depth-first search by showing the time-stamps d and f and the pointers π in- dicating the predecessors in the traversal. We call an edge {i, j} ∈ E a tree edge if i = V [j].π or j = V [i].π and a back edge, otherwise. The tree edges form the DFS forest 12,13 11,14 10,15 1,16 4, 5 7, 8 2, 9 3, 6 Figure 52: The traversal starts at the vertex with time-stamp 1. Each node is stamped twice, once when it is first encountered and another time when its visit is complete. of the graph. The forest is a tree if the graph is connected and a collection of two or more trees if it is not connected. Figure 53 shows the DFS forest of the graph in Figure 52 which, in this case, consists of a single tree. The time- 7, 8 4, 5 12,13 11,14 10,15 2, 9 1,16 3, 6 Figure 53: Tree edges are solid and back edges are dotted. stamps d are consistent with the preorder traversal of the DFS forest. The time-stamps f are consistent with the postorder traversal. The two stamps can be used to decide, in constant time, whether two nodes in the forest live in different subtrees or one is a descendent of the other. NESTING LEMMA. Vertex j is a proper descendent of vertex i in the DFS forest iff V [i].d V [j].d as well as V [j].f V [i].f. Similarly, if you have a tree and the preorder and postorder numbers of the nodes, you can determine the relation be- tween any two nodes in constant time. Directed graphs and relations. As mentioned earlier, we have a directed graph if all edges are directed. A directed graph is a way to think and talk about a mathe- matical relation. A typical problem where relations arise is scheduling. Some tasks are in a definite order while others are unrelated. An example is the scheduling of undergraduate computer science courses, as illustrated in Figure 54. Abstractly, a relation is a pair (V, E), where Comput. Org. and Programm. Operating Distributed 110 214 212 Inform. Syst. and Implementation Program Design and Analysis I Program Design and Analysis II Software Design Comput. Networks and Distr. Syst. Systems 108 006 100 104 Figure 54: A subgraph of the CPS course offering. The courses CPS104 and CPS108 are incomparable, CPS104 is a predecessor of CPS110, and so on. V = {0, 1, . . ., n − 1} is a finite set of elements and E ⊆ V × V is a finite set of ordered pairs. Instead of 47
  • 48. (i, j) ∈ E we write i ≺ j and instead of (V, E) we write (V, ≺). If i ≺ j then i is a predecessor of j and j is a suc- cessor of i. The terms relation, directed graph, digraph, and network are all synonymous. Directed acyclic graphs. A cycle in a relation is a se- quence i0 ≺ i1 ≺ . . . ≺ ik ≺ i0. Even i0 ≺ i0 is a cycle. A linear extension of (V, ≺) is an ordering j0, j1, . . . , jn−1 of the elements that is consistent with the relation. Formally this means that jk ≺ jℓ implies k ℓ. A directed graph without cycle is a directed acyclic graph. EXTENSION LEMMA. (V, ≺) has a linear extension iff it contains no cycle. PROOF. “=⇒” is obvious. We prove “⇐=” by induction. A vertex s ∈ V is called a source if it has no predecessor. Assuming (V, ≺) has no cycle, we can prove that V has a source by following edges against their direction. If we return to a vertex that has already been visited, we have a cycle and thus a contradiction. Otherwise we get stuck at a vertex s, which can only happen because s has no predecessor, which means s is a source. Let U = V −{s} and note that (U, ≺) is a relation that is smaller than (V, ≺). Hence (U, ≺) has a linear extension by induction hypothesis. Call this extension X and note that s, X is a linear extension of (V, ≺). Topological sorting with queue. The problem of con- structing a linear extension is called topological sorting. A natural and fast algorithm follows the idea of the proof: find a source s, print s, remove s, and repeat. To expedite the first step of finding a source, each vertex maintains its number of predecessors and a queue stores all sources. First, we initialize this information. forall vertices j do V [j].d = 0 endfor; forall vertices i do forall successors j of i do V [j].d++ endfor endfor; forall vertices j do if V [j].d = 0 then ENQUEUE(j) endif endfor. Next, we compute the linear extension by repeated dele- tion of a source. while queue is non-empty do s = DEQUEUE; forall successors j of s do V [j].d--; if V [j].d = 0 then ENQUEUE(j) endif endfor endwhile. The running time is linear in the number of vertices and edges, namely O(n+m). What happens if there is a cycle in the digraph? We illustrate the above algorithm for the directed acyclic graph in Figure 55. The sequence of ver- 3, 2, 1, 0 1, 0 3, 2, 1, 0 1, 0 0 1, 0 0 1, 0 a d e c h f b g Figure 55: The numbers next to each vertex count the predeces- sors, which decreases during the algorithm. tices added to the queue is also the linear extension com- puted by the algorithm. If the process starts at vertex a and if the successors of a vertex are ordered by name then we get a, f, d, g, c, h, b, e, which we can check is indeed a linear extension of the relation. Topological sorting with DFS. Another algorithm that can be used for topological sorting is depth-first search. We output a vertex when its visit has been completed, that is, when all its successors and their successors and so on have already been printed. The linear extension is there- fore generated from back to front. Figure 56 shows the 4, 5 6, 7 11, 12 1, 14 2, 9 15, 16 3, 8 10, 13 e g a c b h f d Figure 56: The numbers next to each vertex are the two time stamps applied by the depth-first search algorithm. The first number gives the time the vertex is encountered, and the second when the visit has been completed. same digraph as Figure 55 and labels vertices with time 48
  • 49. stamps. Consider the sequence of vertices in the order of decreasing second time stamp: a(16), f(14), g(13), h(12), d(9), c(8), e(7), b(5). Although this sequence is different from the one computed by the earlier algorithm, it is also a linear extension of the relation. 49
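The queue-based algorithm is short enough to write out in full. The C sketch below is ours and uses a small arbitrary DAG given as an edge list; scanning the edge list for the successors of a vertex is done only for brevity, whereas with adjacency lists the running time is the O(n + m) stated above. If the digraph contains a cycle, the queue runs empty before all vertices are printed, which answers the question raised earlier.

#include <stdio.h>

/* Topological sorting with a queue: count predecessors, enqueue the
   sources, and repeatedly delete a source. */
#define N 6
#define M 7

int edge[M][2] = { {0,1}, {0,2}, {1,3}, {2,3}, {3,4}, {2,5}, {5,4} };

int main(void) {
    int indeg[N] = {0};
    int queue[N], head = 0, tail = 0;

    for (int e = 0; e < M; e++)              /* count predecessors  */
        indeg[edge[e][1]]++;
    for (int v = 0; v < N; v++)              /* enqueue the sources */
        if (indeg[v] == 0) queue[tail++] = v;

    int printed = 0;
    while (head < tail) {                    /* delete a source     */
        int s = queue[head++];
        printf("%d ", s);
        printed++;
        for (int e = 0; e < M; e++)          /* successors of s     */
            if (edge[e][0] == s && --indeg[edge[e][1]] == 0)
                queue[tail++] = edge[e][1];
    }
    printf("\n");
    if (printed < N)                         /* cycle: no linear extension */
        printf("cycle detected\n");
    return 0;
}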
  • 50. 14 Shortest Paths One of the most common operations in graphs is finding shortest paths between vertices. This section discusses three algorithms for this problem: breadth-first search for unweighted graphs, Dijkstra’s algorithm for weighted graphs, and the Floyd-Warshall algorithm for computing distances between all pairs of vertices. Breadth-first search. We call a graph connected if there is a path between every pair of vertices. A (connected) component is a maximal connected subgraph. Breadth- first search, or BFS, is a way to search a graph. It is sim- ilar to depth-first search, but while DFS goes as deep as quickly as possible, BFS is more cautious and explores a broad neighborhood before venturing deeper. The starting point is a vertex s. An example is shown in Figure 57. As e a d f b c g 2 2 1 1 1 0 1 2 s Figure 57: A sample graph with eight vertices and ten edges labeled by breath-first search. The label increases from a vertex to its successors in the search. before, we call and edge a tree edge if it is traversed by the algorithm. The tree edges define the BFS tree, which we can use to redraw the graph in a hierarchical manner, as in Figure 58. In the case of an undirected graph, no non-tree edge can connect a vertex to an ancestor in the BFS tree. Why? We use a queue to turn the idea into an algorithm. 1 1 2 1 2 0 2 1 Figure 58: The tree edges in the redrawing of the graph in Figure 57 are solid, and the non-tree edges are dotted. First, the graph and the queue are initialized. forall vertices i do V [i].d = −1 endfor; V [s].d = 0; MAKEQUEUE; ENQUEUE(s); SEARCH. A vertex is processed by adding its unvisited neighbors to the queue. They will be processed in turn. void SEARCH while queue is non-empty do i = DEQUEUE; forall neighbors j of i do if V [j].d = −1 then V [j].d = V [i].d + 1; V [j].π = i; ENQUEUE(j) endif endfor endwhile. The label V [i].d assigned to vertex i during the traversal is the minimum number of edges of any path from s to i. In other words, V [i].d is the length of the shortest path from s to i. The running time of BFS for a graph with n vertices and m edges is O(n + m). Single-source shortest path. BFS can be used to find shortest paths in unweighted graphs. We now extend the algorithm to weighted graphs. Assume V and E are the sets of vertices and edges of a simple, undirected graph with a positive weighting function w : E → R+. The length or weight of a path is the sum of the weights of its edges. The distance between two vertices is the length of the shortest path connecting them. For a given source s ∈ V , we study the problem of finding the distances and shortest paths to all other vertices. Figure 59 illustrates the problem by showing the shortest paths to the source s. In 5 5 5 4 4 10 4 10 10 f b c g e a s d 6 Figure 59: The bold edges form shortest paths and together the shortest path tree with root s. It differs by one edge from the breadth-first tree shown in Figure 57. the non-degenerate case, in which no two paths have the same length, the union of all shortest paths to s is a tree, referred to as the shortest path tree. In the degenerate case, we can break ties such that the union of paths is a tree. As before, we grow a tree starting from s. Instead of a queue, we use a priority queue to determine the next vertex to be added to the tree. It stores all vertices not yet in the 50
  • 51. tree and uses V [i].d for the priority of vertex i. First, we initialize the graph and the priority queue. V [s].d = 0; V [s].π = −1; INSERT(s); forall vertices i 6= s do V [i].d = ∞; INSERT(i) endfor. After initialization the priority queue stores s with priority 0 and all other vertices with priority ∞. Dijkstra’s algorithm. We mark vertices in the tree to distinguish them from vertices that are not yet in the tree. The priority queue stores all unmarked vertices i with pri- ority equal to the length of the shortest path that goes from i in one edge to a marked vertex and then to s using only marked vertices. while priority queue is non-empty do i = EXTRACTMIN; mark i; forall neighbors j of i do if j is unmarked then V [j].d = min{w(ij) + V [i].d, V [j].d} endif endfor endwhile. Table 3 illustrates the algorithm by showing the informa- tion in the priority queue after each iteration of the while- loop operating on the graph in Figure 59. The mark- s 0 a ∞ 5 5 b ∞ 10 10 9 9 c ∞ 4 d ∞ 5 5 5 e ∞ ∞ ∞ 10 10 10 f ∞ ∞ ∞ 15 15 15 15 g ∞ ∞ ∞ ∞ 15 15 15 15 Table 3: Each column shows the contents of the priority queue. Time progresses from left to right. ing mechanism is not necessary but clarifies the process. The algorithm performs n EXTRACTMIN operations and at most m DECREASEKEY operations. We compare the running time under three different data structures used to represent the priority queue. The first is a linear array, as originally proposed by Dijkstra, the second is a heap, and the third is a Fibonacci heap. The results are shown in Table 4. We get the best result with Fibonacci heaps for which the total running time is O(n log n + m). array heap F-heap EXTRACTMINs n2 n log n n log n DECREASEKEYs m m log m m Table 4: Running time of Dijkstra’s algorithm for three different implementations of the priority queue holding the yet unmarked vertices. Correctness. It is not entirely obvious that Dijkstra’s al- gorithm indeed finds the shortest paths to s. To show that it does, we inductively prove that it maintains the follow- ing two invariants. (A) For every unmarked vertex j, V [j].d is the length of the shortest path from j to s that uses only marked vertices other than j. (B) For every marked vertex i, V [i].d is the length of the shortest path from i to s. PROOF. Invariant (A) is true at the beginning of Dijkstra’s algorithm. To show that it is maintained throughout the process, we need to make sure that shortest paths are com- puted correctly. Specifically, if we assume Invariant (B) for vertex i then the algorithm correctly updates the prior- ities V [j].d of all neighbors j of i, and no other priorities change. i y s Figure 60: The vertex y is the last unmarked vertex on the hypo- thetically shortest, dashed path that connects i to s. At the moment vertex i is marked, it minimizes V [j].d over all unmarked vertices j. Suppose that, at this mo- ment, V [i].d is not the length of the shortest path from i to s. Because of Invariant (A), there is at least one other un- marked vertex on the shortest path. Let the last such vertex be y, as shown in Figure 60. But then V [y].d V [i].d, which is a contradiction to the choice of i. We used (B) to prove (A) and (A) to prove (B). To make sure we did not create a circular argument, we parametrize the two invariants with the number k of vertices that are 51
• 52. marked and thus belong to the currently constructed portion of the shortest path tree. To prove (Ak) we need (Bk) and to prove (Bk) we need (Ak−1). Think of the two invariants as two recursive functions, and for each pair of calls, the parameter decreases by one and thus eventually becomes zero, which is when the argument arrives at the base case.

All-pairs shortest paths. We can run Dijkstra's algorithm n times, once for each vertex as the source, and thus get the distance between every pair of vertices. The running time is O(n² log n + nm) which, for dense graphs, is the same as O(n³). Cubic running time can be achieved with a much simpler algorithm using the adjacency matrix to store distances. The idea is to iterate n times, and after the k-th iteration, the computed distance between vertices i and j is the length of the shortest path from i to j that, other than i and j, contains only vertices of index k or less.

for k = 1 to n do
  for i = 1 to n do
    for j = 1 to n do
      A[i, j] = min{A[i, j], A[i, k] + A[k, j]}
    endfor
  endfor
endfor.

The only information needed to update A[i, j] during the k-th iteration of the outer for-loop are its old value and values in the k-th row and the k-th column of the prior adjacency matrix. This row remains unchanged in this iteration and so does this column. We therefore do not have to use two arrays, writing the new values right into the old matrix. We illustrate the algorithm by showing the adjacency, or distance matrix before the algorithm in Figure 61 and after one iteration in Figure 62.

Figure 61: Adjacency, or distance matrix of the graph in Figure 57. All blank entries store ∞.

Figure 62: Matrix after each iteration. The k-th row and column are shaded and the new, improved distances are highlighted.

The algorithm works for weighted undirected as well as for weighted directed graphs. Its correctness is easily verified inductively. The running time is O(n³). 52
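The triple loop is worth seeing as a complete program. The following C sketch is ours; it runs the algorithm (usually called the Floyd-Warshall algorithm) on a small arbitrary directed graph stored as a distance matrix, with a large constant standing in for ∞.

#include <stdio.h>

#define N 4
#define INF 1000000000

/* A[i][j] starts out as the edge weight (INF if there is no edge,
   0 on the diagonal) and ends up as the length of the shortest
   path from i to j. */
int main(void) {
    int A[N][N] = {
        {0,   3,   INF, 7  },
        {8,   0,   2,   INF},
        {5,   INF, 0,   1  },
        {2,   INF, INF, 0  },
    };
    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (A[i][k] + A[k][j] < A[i][j])   /* improve via vertex k */
                    A[i][j] = A[i][k] + A[k][j];
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) printf("%4d", A[i][j]);
        printf("\n");
    }
    return 0;
}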
  • 53. 15 Minimum Spanning Trees When a graph is connected, we may ask how many edges we can delete before it stops being connected. Depending on the edges we remove, this may happen sooner or later. The slowest strategy is to remove edges until the graph becomes a tree. Here we study the somewhat more dif- ficult problem of removing edges with a maximum total weight. The remaining graph is then a tree with minimum total weight. Applications that motivate this question can be found in life support systems modeled as graphs or net- works, such as telephone, power supply, and sewer sys- tems. Free trees. An undirected graph (U, T ) is a free tree if it is connected and contains no cycle. We could impose a hierarchy by declaring any one vertex as the root and thus obtain a rooted tree. Here, we have no use for a hierarchi- cal organization and exclusively deal with free trees. The g c h f e d a b i Figure 63: Adding the edge dg to the tree creates a single cycle with vertices d, g, h, f, e, a. number of edges of a free tree is always one less than the number of vertices. Whenever we add a new edge (con- necting two old vertices) we create exactly one cycle. This cycle can be destroyed by deleting any one of its edges, and we get a new free tree, as in Figure 63. Let (V, E) be a connected and undirected graph. A subgraph is another graph (U, T ) with U ⊆ V and T ⊆ E. It is a spanning tree if it is a free tree with U = V . Minimum spanning trees. For the remainder of this section, we assume that we also have a weighting func- tion, w : E → R. The weight of subgraph is then the total weight of its edges, w(T ) = P e∈T w(e). A mini- mum spanning tree, or MST of G is a spanning tree that minimizes the weight. The definitions are illustrated in Figure 64 which shows a graph of solid edges with a min- imum spanning tree of bold edges. A generic algorithm for constructing an MST grows a tree by adding more and 1.9 1.1 1.3 1.2 2.5 1.6 1.5 0.9 1.4 2.8 1.6 1.4 a d e f 1.3 b 1.2 c g h i 3.6 Figure 64: The bold edges form a spanning tree of weight 0.9 + 1.2 + 1.3 + 1.4 + 1.1 + 1.2 + 1.6 + 1.9 = 10.6. more edges. Let A ⊆ E be a subset of some MST of a connected graph (V, E). An edge uv ∈ E − A is safe for A if A ∪ {uv} is also subset of some MST. The generic algorithm adds safe edges until it arrives at an MST. A = ∅; while (V, A) is not a spanning tree do find a safe edge uv; A = A ∪ {uv} endwhile. As long as A is a proper subset of an MST there are safe edges. Specifically, if (V, T ) is an MST and A ⊆ T then all edges in T − A are safe for A. The algorithm will therefore succeed in constructing an MST. The only thing that is not yet clear is how to find safe edges quickly. Cuts. To develop a mechanism for identifying safe edges, we define a cut, which is a partition of the vertex set into two complementary sets, V = W ˙ ∪ (V −W). It is crossed by an edge uv ∈ E if u ∈ W and v ∈ V −W, and it respects an edge set A if A contains no crossing edge. The definitions are illustrated in Figure 65. Figure 65: The vertices inside and outside the shaded regions form a cut that respects the collection of solid edges. The dotted edges cross the cut. 53
• 54. CUT LEMMA. Let A be a subset of an MST and consider a cut W ∪̇ (V − W) that respects A. If uv is a crossing edge with minimum weight then uv is safe for A.

PROOF. Consider a minimum spanning tree (V, T) with A ⊆ T. If uv ∈ T then we are done. Otherwise, let T′ = T ∪ {uv}. Because T is a tree, there is a unique path from u to v in T. We have u ∈ W and v ∈ V − W, so the path switches at least once between the two sets. Suppose it switches along xy, as in Figure 66.

Figure 66: Adding uv creates a cycle and deleting xy destroys the cycle.

Edge xy crosses the cut, and since A contains no crossing edges we have xy ∉ A. Because uv has minimum weight among crossing edges we have w(uv) ≤ w(xy). Define T″ = T′ − {xy}. Then (V, T″) is a spanning tree and because w(T″) = w(T) − w(xy) + w(uv) ≤ w(T) it is a minimum spanning tree. The claim follows because A ∪ {uv} ⊆ T″.

A typical application of the Cut Lemma takes a component of (V, A) and defines W as the set of vertices of that component. The complementary set V − W contains all other vertices, and crossing edges connect the component with its complement.

Prim's algorithm. Prim's algorithm chooses safe edges to grow the tree as a single component from an arbitrary first vertex s. Similar to Dijkstra's algorithm, the vertices that do not yet belong to the tree are stored in a priority queue. For each vertex i outside the tree, we define its priority V[i].d equal to the minimum weight of any edge that connects i to a vertex in the tree. If there is no such edge then V[i].d = ∞. In addition to the priority, we store the index of the other endpoint of the minimum weight edge. We first initialize this information.

V[s].d = 0; V[s].π = −1; INSERT(s);
forall vertices i ≠ s do V[i].d = ∞; INSERT(i) endfor.

The main algorithm expands the tree by one edge at a time. It uses marks to distinguish vertices in the tree from vertices outside the tree.

while priority queue is non-empty do
  i = EXTRACTMIN; mark i;
  forall neighbors j of i do
    if j is unmarked and w(ij) < V[j].d then
      V[j].d = w(ij); V[j].π = i
    endif
  endfor
endwhile.

After running the algorithm, the MST can be recovered from the π-fields of the vertices. The algorithm together with its initialization phase performs n = |V| insertions into the priority queue, n extractmin operations, and at most m = |E| decreasekey operations. Using the Fibonacci heap implementation, we get a running time of O(n log n + m), which is the same as for constructing the shortest-path tree with Dijkstra's algorithm.

Kruskal's algorithm. Kruskal's algorithm is another implementation of the generic algorithm. It adds edges in a sequence of non-decreasing weight. At any moment, the chosen edges form a collection of trees. These trees merge to form larger and fewer trees, until they eventually combine into a single tree. The algorithm uses a priority queue for the edges and a set system for the vertices. In this context, the term 'system' is just another word for 'set', but we will use it exclusively for sets whose elements are themselves sets. Implementations of the set system will be discussed in the next lecture. Initially, A = ∅, the priority queue contains all edges, and the system contains a singleton set for each vertex, C = {{u} | u ∈ V}. The algorithm finds an edge with minimum weight that connects two components defined by A. We set W equal to the vertex set of one component and use the Cut Lemma to show that this edge is safe for A. 
The edge is added to A and the process is repeated. The algorithm halts when only one tree is left, which is the case when A contains n − 1 = |V | − 1 edges. A = ∅; while |A| n − 1 do uv = EXTRACTMIN; find P, Q ∈ C with u ∈ P and v ∈ Q; if P 6= Q then A = A ∪ {uv}; merge P and Q endif endwhile. 54
  • 55. The running time is O(m log m) for the priority queue op- erations plus some time for maintaining C. There are two operations for the set system, namely finding the set that contains a given element, and merging two sets into one. An example. We illustrate Kruskal’s algorithm by ap- plying it to the weighted graph in Figure 64. The sequence of edges sorted by weight is cd, fi, fh, ad, ae, hi, de, ef, ac, gh, dg, bf, eg, bi, ab. The evolution of the set system b i g c d e a f h Figure 67: Eight union operations merge the nine singleton sets into one set. is illustrated in Figure 67, and the MST computed with Kruskal’s algorithm and indicated with dotted edges is the same as in Figure 64. The edges cd, fi, fh, ad, ae are all added to the tree. The next two edge, hi and de, are not added because they each have both endpoints in the same component, and adding either edge would create a cycle. Edge ef is added to the tree giving rise to a set in the sys- tem that contains all vertices other than g and b. Edge ac is not added, gh is added, dg is not, and finally bf is added to the tree. At this moment the system consists of a single set that contains all vertices of the graph. As suggested by Figure 67, the evolution of the con- struction can be interpreted as a hierarchical clustering of the vertices. The specific method that corresponds to the evolution created by Kruskal’s algorithm is referred to as single-linkage clustering. 55
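Putting the pieces together, here is a compact sketch of Kruskal's algorithm in C on an arbitrary small graph (not the one in Figure 64). Sorting the edge array plays the role of the priority queue, and the set system is kept as a plain parent array with a naive FIND that walks up to the root; the next section is about making exactly these FIND and UNION operations fast.

#include <stdio.h>
#include <stdlib.h>

typedef struct { int u, v; double w; } Edge;

static int cmp(const void *a, const void *b) {     /* sort by weight */
    double d = ((const Edge *)a)->w - ((const Edge *)b)->w;
    return (d > 0) - (d < 0);
}

static int find(int p[], int i) {                  /* walk up to the root */
    while (p[i] != i) i = p[i];
    return i;
}

int main(void) {
    enum { n = 5, m = 7 };
    Edge e[m] = { {0,1,0.9}, {1,2,1.3}, {0,2,2.5}, {2,3,1.2},
                  {3,4,1.6}, {1,4,1.1}, {0,4,2.8} };
    int p[n];
    for (int i = 0; i < n; i++) p[i] = i;          /* n singleton sets      */
    qsort(e, m, sizeof e[0], cmp);                 /* non-decreasing weight */
    int added = 0;
    for (int k = 0; k < m && added < n - 1; k++) {
        int P = find(p, e[k].u), Q = find(p, e[k].v);
        if (P != Q) {                              /* safe edge: add, merge */
            p[P] = Q;
            printf("add %d-%d (%.1f)\n", e[k].u, e[k].v, e[k].w);
            added++;
        }
    }
    return 0;
}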
• 56. 16 Union-Find

In this lecture, we present two data structures for the disjoint set system problem we encountered in the implementation of Kruskal's algorithm for minimum spanning trees. An interesting feature of the problem is that m operations can be executed in a time that is only ever so slightly more than linear in m.

Abstract data type. A disjoint set system is an abstract data type that represents a partition C of a set [n] = {1, 2, . . . , n}. In other words, C is a set of pairwise disjoint subsets of [n] such that the union of all sets in C is [n]. The data type supports

set FIND(i): return P ∈ C with i ∈ P;
void UNION(P, Q): C = C − {P, Q} ∪ {P ∪ Q}.

In most applications, the sets themselves are irrelevant, and it is only important to know when two elements belong to the same set and when they belong to different sets in the system. For example, Kruskal's algorithm executes the operations only in the following sequence:

P = FIND(i); Q = FIND(j);
if P ≠ Q then UNION(P, Q) endif.

This is similar to many everyday situations where it is usually not important to know what it is as long as we recognize when two are the same and when they are different.

Linked lists. We construct a fairly simple and reasonably efficient first solution using linked lists for the sets. We use a table of length n, and for each i ∈ [n], we store the name of the set that contains i. Furthermore, we link the elements of the same set and use the name of the first element as the name of the set. Figure 68 shows a sample set system and its representation. It is convenient to also store the size of the set with the first element. To perform a UNION operation, we need to change the name for all elements in one of the two sets. To save time, we do this only for the smaller set. To merge the two lists without traversing the longer one, we insert the shorter list between the first two elements of the longer list.

Figure 68: The system consists of three sets, each named by the bold element. Each element stores the name of its set, possibly the size of its set, and possibly a pointer to the next element in the same set.

void UNION(int P, Q)
  if C[P].size < C[Q].size then P ↔ Q endif;
  C[P].size = C[P].size + C[Q].size;
  second = C[P].next; C[P].next = Q; t = Q;
  while t ≠ 0 do C[t].set = P; u = t; t = C[t].next endwhile;
  C[u].next = second.

In the worst case, a single UNION operation takes time Θ(n). The amortized performance is much better because we spend time only on the elements of the smaller set.

WEIGHTED UNION LEMMA. n − 1 UNION operations applied to a system of n singleton sets take time O(n log n).

PROOF. For an element, i, we consider the cardinality of the set that contains it, σ(i) = C[FIND(i)].size. Each time the name of the set that contains i changes, σ(i) at least doubles. After changing the name k times, we have σ(i) ≥ 2^k and therefore k ≤ log₂ n. In other words, i can be in the smaller set of a UNION operation at most log₂ n times. The claim follows because a UNION operation takes time proportional to the cardinality of the smaller set.

Up-trees. Thinking of names as pointers, the above data structure stores each set in a tree of height one. We can use more general trees and get more efficient UNION operations at the expense of slower FIND operations. We consider a class of algorithms with the following commonalities: 56
• 57. • each set is a tree and the name of the set is the index of the root;
• FIND traverses a path from a node to the root;
• UNION links two trees.

It suffices to store only one pointer per node, namely the pointer to the parent. This is why these trees are called up-trees. It is convenient to let the root point to itself.

Figure 69: The UNION operations create a tree by linking the root of the first set to the root of the second set.

Figure 70: The table stores indices which function as pointers as well as names of elements and of sets. The white dot represents a pointer to itself.

Figure 69 shows the up-tree generated by executing the following eleven UNION operations on a system of twelve singleton sets: 2 ∪ 3, 4 ∪ 7, 2 ∪ 4, 1 ∪ 2, 4 ∪ 10, 9 ∪ 12, 12 ∪ 2, 8 ∪ 11, 8 ∪ 2, 5 ∪ 6, 6 ∪ 1. Figure 70 shows the embedding of the tree in a table. UNION takes constant time and FIND takes time proportional to the length of the path, which can be as large as n − 1.

Weighted union. The running time of FIND can be improved by linking smaller to larger trees. This is the idea of weighted union again. Assume a field C[i].p for the index of the parent (C[i].p = i if i is a root), and a field C[i].size for the number of elements in the tree rooted at i. We need the size field only for the roots and we need the index to the parent field everywhere except for the roots. The FIND and UNION operations can now be implemented as follows:

int FIND(int i)
  if C[i].p ≠ i then return FIND(C[i].p) endif;
  return i.

void UNION(int i, j)
  if C[i].size < C[j].size then i ↔ j endif;
  C[i].size = C[i].size + C[j].size; C[j].p = i.

The size of a subtree increases by at least a factor of 2 from a node to its parent. The depth of a node can therefore not exceed log₂ n. It follows that FIND takes at most time O(log n). We formulate the result on the height for later reference.

HEIGHT LEMMA. An up-tree created from n singleton nodes by n − 1 weighted union operations has height at most log₂ n.

Path compression. We can further improve the time for FIND operations by linking traversed nodes directly to the root. This is the idea of path compression. The UNION operation is implemented as before and there is only one modification in the implementation of the FIND operation:

int FIND(int i)
  if C[i].p ≠ i then C[i].p = FIND(C[i].p) endif;
  return C[i].p.

Figure 71: The operations and up-trees develop from top to bottom and within each row from left to right.

If i is not root then the recursion makes it the child of a root, which is then returned. If i is a root, it returns itself 57
because in this case C[i].p = i, by convention. Figure 71 illustrates the algorithm by executing a sequence of eight operations i ∪ j, which is short for finding the sets that contain i and j, and performing a UNION operation if the sets are different. At the beginning, every element forms its own one-node tree. With path compression, it is difficult to imagine that long paths can develop at all.

Iterated logarithm. We will prove shortly that the iterated logarithm is an upper bound on the amortized time for a FIND operation. We begin by defining the function from its inverse. Let F(0) = 1 and F(i + 1) = 2^F(i). We have F(1) = 2, F(2) = 2^2, and F(3) = 2^(2^2). In general, F(i) is the tower of i 2s. Table 5 shows the values of F for the first six arguments. For i ≤ 3, F is very small, but for i = 5 it already exceeds the number of atoms in our universe.

  i   0   1   2   3    4        5
  F   1   2   4   16   65,536   2^65,536

Table 5: Values of F.

Note that the binary logarithm of a tower of i 2s is a tower of i − 1 2s. The iterated logarithm is the number of times we can take the binary logarithm before we drop down to one or less. In other words, the iterated logarithm is the inverse of F,

  log* n = min{i | F(i) ≥ n} = min{i | log2 log2 ... log2 n ≤ 1},

where the binary logarithm is taken i times. As n goes to infinity, log* n goes to infinity, but very slowly.

Levels and groups. The analysis of the path compression algorithm uses two Census Lemmas discussed shortly. Let A1, A2, ..., Am be a sequence of UNION and FIND operations, and let T be the collection of up-trees we get by executing the sequence, but without path compression. In other words, the FIND operations have no influence on the trees. The level λ(µ) of a node µ is the height of its subtree in T plus one.

LEVEL CENSUS LEMMA. There are at most n/2^(ℓ−1) nodes at level ℓ.

PROOF. We use induction to show that a node at level ℓ has a subtree of at least 2^(ℓ−1) nodes. The claim follows because subtrees of nodes on the same level are disjoint.

Note that if µ is a proper descendent of another node ν at some moment during the execution of the operation sequence then µ is a proper descendent of ν in T. In this case λ(µ) < λ(ν).

Figure 72: A schematic drawing of the tree T between the column of level numbers on the left and the column of group numbers on the right. The tree is decomposed into five groups, each a sequence of contiguous levels.

Define the group number of a node µ as the iterated logarithm of the level, g(µ) = log* λ(µ). Because the level does not exceed n, we have g(µ) ≤ log* n, for every node µ in T. The definition of g decomposes an up-tree into at most 1 + log* n groups, as illustrated in Figure 72. The number of levels in group g is F(g) − F(g − 1), which gets large very fast. On the other hand, because levels get smaller at an exponential rate, the number of nodes in a group is not much larger than the number of nodes in the lowest level of that group.

GROUP CENSUS LEMMA. There are at most 2n/F(g) nodes with group number g.

PROOF. Each node with group number g has level between F(g − 1) + 1 and F(g). We use the Level Census Lemma to bound their number:

  sum from ℓ = F(g−1)+1 to F(g) of n/2^(ℓ−1)  ≤  (n/2^F(g−1)) · (1 + 1/2 + 1/4 + ...)  =  2n/F(g),

as claimed.

Analysis. The analysis is based on the interplay between the up-trees obtained with and without path compression. 58

The latter are constructed by the weighted union operations and eventually form a single tree, which we denote as T. The former can be obtained from the latter by the application of path compression. Note that in T, the level strictly increases from a node to its parent. Path compression preserves this property, so levels also increase when we climb a path in the actual up-trees.

We now show that any sequence of m ≥ n UNION and FIND operations on a ground set [n] takes time at most O(m log* n) if weighted union and path compression is used. We can focus on FIND because each UNION operation takes only constant time. For a FIND operation Ai, let Xi be the set of nodes along the traversed path. The total time for executing all FIND operations is proportional to

  x = sum over i of |Xi|.

For µ ∈ Xi, let pi(µ) be the parent during the execution of Ai. We partition Xi into the topmost two nodes, the nodes just below boundaries between groups, and the rest:

  Yi = {µ ∈ Xi | µ is root or child of root},
  Zi = {µ ∈ Xi − Yi | g(µ) < g(pi(µ))},
  Wi = {µ ∈ Xi − Yi | g(µ) = g(pi(µ))}.

Clearly, |Yi| ≤ 2 and |Zi| ≤ log* n. It remains to bound the total size of the Wi, w = sum over i of |Wi|. Instead of counting, for each Ai, the nodes in Wi, we count, for each node µ, the FIND operations Aj for which µ ∈ Wj. In other words, we count how often µ can change parent until its parent has a higher group number than µ. Each time µ changes parent, the new parent has higher level than the old parent. It follows that the number of changes is at most F(g(µ)) − F(g(µ) − 1). The number of nodes with group number g is at most 2n/F(g) by the Group Census Lemma. Hence

  w ≤ sum from g = 0 to log* n of (2n/F(g)) · (F(g) − F(g − 1)) ≤ 2n · (1 + log* n).

This implies that

  x ≤ 2m + m log* n + 2n(1 + log* n) = O(m log* n),

assuming m ≥ n. This is an upper bound on the total time it takes to execute m FIND operations. The amortized cost per FIND operation is therefore at most O(log* n), which for all practical purposes is a constant.

Summary. We proved an upper bound on the time needed for m ≥ n UNION and FIND operations. The bound is more than constant per operation, although for all practical purposes it is constant. The log* n bound can be improved to an even smaller function, usually referred to as α(n) or the inverse of the Ackermann function, that goes to infinity even slower than the iterated logarithm. It can also be proved that (under some mild assumptions) there is no algorithm that can execute general sequences of UNION and FIND operations in amortized time that is asymptotically less than α(n). 59
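To make the two ingredients of this bound concrete, here is a small Python sketch that combines weighted union (union by size) and path compression in one array-based structure. The class and method names are ours and not part of the lecture's pseudocode; the fields parent and size play the roles of C[i].p and C[i].size above.

    class UnionFind:
        def __init__(self, n):
            # elements are 0..n-1; every element starts as its own root
            self.parent = list(range(n))
            self.size = [1] * n

        def find(self, i):
            # path compression: link every traversed node directly to the root
            if self.parent[i] != i:
                self.parent[i] = self.find(self.parent[i])
            return self.parent[i]

        def union(self, i, j):
            i, j = self.find(i), self.find(j)
            if i == j:
                return
            # weighted union: the smaller tree is linked below the larger one
            if self.size[i] < self.size[j]:
                i, j = j, i
            self.size[i] += self.size[j]
            self.parent[j] = i

    uf = UnionFind(12)
    for a, b in [(1, 2), (3, 6), (1, 3)]:
        uf.union(a, b)
    print(uf.find(2) == uf.find(6))   # True: 2 and 6 now belong to the same set

With both heuristics in place, any sequence of m operations runs in O(m log* n) time, matching the analysis above.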
  • 60. Fourth Homework Assignment Write the solution to each problem on a single page. The deadline for handing in solutions is October 30. Problem 1. (20 = 10 + 10 points). Consider a free tree and let d(u, v) be the number of edges in the path connecting u to v. The diameter of the tree is the maximum d(u, v) over all pairs of vertices in the tree. (a) Give an efficient algorithm to compute the di- ameter of a tree. (b) Analyze the running time of your algorithm. Problem 2. (20 points). Design an efficient algorithm to find a spanning tree for a connected, weighted, undi- rected graph such that the weight of the maximum weight edge in the spanning tree is minimized. Prove the correctness of your algorithm. Problem 3. (7 + 6 + 7 points). A weighted graph G = (V, E) is a near-tree if it is connected and has at most n + 8 edges, where n is the number of vertices. Give an O(n)-time algorithm to find a minimum weight spanning tree for G. Problem 4. (10 + 10 points). Given an undirected weighted graph and vertices s, t, design an algorithm that computes the number of shortest paths from s to t in the case: (a) All weights are positive numbers. (b) All weights are real numbers. Analyze your algorithm for both (a) and (b). Problem 5. (20 = 10 + 10 points). The off-line mini- mum problem is about maintaining a subset of [n] = {1, 2, . . ., n} under the operations INSERT and EX- TRACTMIN. Given an interleaved sequence of n in- sertions and m min-extractions, the goal is to deter- mine which key is returned by which min-extraction. We assume that each element i ∈ [n] is inserted ex- actly once. Specifically, we wish to fill in an array E[1..m] such that E[i] is the key returned by the i- th min-extraction. Note that the problem is off-line, in the sense that we are allowed to process the entire sequence of operations before determining any of the returned keys. (a) Describe how to use a union-find data structure to solve the problem efficiently. (b) Give a tight bound on the worst-case running time of your algorithm. 60
  • 61. V TOPOLOGICAL ALGORITHMS 17 Geometric Graphs 18 Surfaces 19 Homology Fifth Homework Assignment 61
  • 62. 17 Geometric Graphs In the abstract notion of a graph, an edge is merely a pair of vertices. The geometric (or topological) notion of a graph is closer to our intuition in which we think of an edge as a curve that connects two vertices. Embeddings. Let G = (V, E) be a simple, undirected graph and write R2 for the two-dimensional real plane. A drawing maps every vertex v ∈ V to a point ε(v) in R2 , and it maps every edge {u, v} ∈ E to a curve with endpoints ε(u) and ε(v). The drawing is an embedding if 1. different vertices map to different points; 2. the curves have no self-intersections; 3. the only points of a curve that are images of vertices are its endpoints; 4. two curves intersect at most in their endpoints. We can always map the vertices to points and the edges to curves in R3 so they form an embedding. On the other hand, not every graph has an embedding in R2 . The graph G is planar if it has an embedding in R2 . As illustrated in Figure 73, a planar graph has many drawings, not all of which are embeddings. A straight-line drawing or embed- Figure 73: Three drawings of K4, the complete graph with four vertices. From left to right: a drawing that is not an embedding, an embedding with one curved edge, a straight-line embedding. ding is one in which each edge is mapped to a straight line segment. It is uniquely determined by the mapping of the vertices, ε : V → R2 . We will see later that every planar graph has a straight-line embedding. Euler’s formula. A face of an embedding ε of G is a component of the thus defined decomposition of R2 . We write n = |V |, m = |E|, and ℓ for the number of faces. Euler’s formula says these numbers satisfy a linear rela- tion. EULER’S FORMULA. If G is connected and ε is an em- bedding of G in R2 then n − m + ℓ = 2. PROOF. Choose a spanning tree (V, T ) of G = (V, E). It has n vertices, |T | = n − 1 edges, and one (unbounded) face. We have n − (n − 1) + 1 = 2, which proves the for- mula if G is a tree. Otherwise, draw the remaining edges, one at a time. Each edge decomposes one face into two. The number of vertices does not change, m increases by one, and ℓ increases by one. Since the graph satisfies the linear relation before drawing the edge, it satisfies the re- lation also after drawing the edge. A planar graph is maximally connected if adding any one new edge violates planarity. Not surprisingly, a planar graph of three or more vertices is maximally connected iff every face in an embedding is bounded by three edges. Indeed, suppose there is a face bounded by four or more edges. Then we can find two vertices in its boundary that are not yet connected and we can connect them by draw- ing a curve that passes through the face; see Figure 74. For obvious reasons, we call an embedding of a maxi- d a b c Figure 74: Drawing the edge from a to c decomposes the quad- rangle into two triangles. Note that we cannot draw the edge from b to d since it already exists outside the quadrangle. mally connected planar graph with n ≥ 3 vertices a tri- angulation. For such graphs, we have an additional linear relation, namely 3ℓ = 2m. We can thus rewrite Euler’s formula and get n − m + 2m 3 = 2 and n − 3ℓ 2 + ℓ = 2 and therefore m = 3n − 6; ℓ = 2n − 4, Every planar graph can be completed to a maximally con- nected planar graph. For n ≥ 3 this implies that the planar graph has at most 3n − 6 edges and at most 2n − 4 faces. Forbidden subgraphs. We can use Euler’s relation to prove that the complete graph of five vertices is not planar. 
It has n = 5 vertices and m = 10 edges, contradicting the upper bound of at most 3n − 6 = 9 edges. Indeed, every drawing of K5 has at least two edges crossing; see Figure 75. Similarly, we can prove that the complete bipartite 62
  • 63. Figure 75: A drawing of K5 on the left and of K3,3 on the right. graph with three plus three vertices is not planar. It has n = 6 vertices and m = 9 edges. Every cycle in a bipartite graph has an even number of edges. Hence, 4ℓ ≤ 2m. Plugging this into Euler’s formula, we get n−m+ m 2 ≥ 2 and therefore m ≤ 2n − 4 = 8, again a contradiction. In a sense, K5 and K3,3 are the quintessential non- planar graphs. To make this concrete, we still need an operation that creates or removes degree-2 vertices. Two graphs are homeomorphic if one can be obtained from the other by a sequence of operations, each deleting a degree-2 vertex and replacing its two edges by the one that connects its two neighbors, or the other way round. KURATOWSKI’S THEOREM. A graph G is planar iff no subgraph of G is homeomorphic to K5 or to K3,3. The proof of this result is a bit lengthy and omitted. Pentagons are star-convex. Euler’s formula can also be used to show that every planar graph has a straight-line embedding. Note that the sum of vertex degrees counts each edge twice, that is, P v∈V deg(v) = 2m. For planar graphs, twice the number of edges is less than 6n which implies that the average degree is less than six. It follows that every planar graph has at least one vertex of degree 5 or less. This can be strengthened by saying that every planar graph with n ≥ 4 vertices has at least four vertices of degree at most 5 each. To see this, assume the planar graph is maximally connected and note that every vertex has degree at least 3. The deficiency from degree 6 is thus at most 3. The total deficiency is 6n − P v∈V deg(v) = 12 which implies that we have at least four vertices with positive deficiency. We need a little bit of geometry to prepare the construc- tion of a straight-line embedding. A region R ⊆ R2 is convex if x, y ∈ R implies that the entire line segment connecting x and y is contained in R. Figure 76 shows regions of either kind. We call R star-convex of there is a point z ∈ R such that for every point x ∈ R the line segment connecting x with z is contained in R. The set of x y z Figure 76: A convex region on the left and a non-convex star- convex region on the right. such points z is the kernel of R. Clearly, every convex re- gion is star-convex but not every star-convex region is con- vex. Similarly, there are regions that are not star-convex, even rather simple ones such as the hexagon in Figure 77. However, every pentagon is star-convex. Indeed, the pen- z Figure 77: A non-star-convex hexagon on the left and a star- convex pentagon on the right. The dark region inside the pen- tagon is its kernel. tagon can be decomposed into three triangles by drawing two diagonals that share an endpoint. Extending the inci- dent sides into the pentagon gives locally the boundary of the kernel. It follows that the kernel is non-empty and has interior points. Fáry’s construction. We construct a straight-line em- bedding of a planar graph G = (V, E) assuming G is maximally connected. Choose three vertices, a, b, c, con- nected by three edges to form the outer triangle. If G has only n = 3 vertices we are done. Else it has at least one vertex u ∈ V = {a, b, c} with deg(u) ≤ 5. Step 1. Remove u together with the k = deg(u) edges incident to u. Add k − 3 edges to make the graph maximally connected again. Step 2. Recursively construct a straight-line embed- ding of the smaller graph. Step 3. Remove the added k − 3 edges and map u to a point ε(u) in the interior of the kernel of the result- ing k-gon. 
Connect ε(u) with line segments to the vertices of the k-gon. 63
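Step 3 needs a point in the interior of the kernel of the k-gon left behind by u. As a small hedged illustration (the function names are ours, not part of the notes), the following Python test checks whether a candidate point lies in the kernel of a simple polygon given by its vertices in counterclockwise order, using the fact that the kernel is the intersection of the half-planes to the left of the directed edges.

    def left_turn(a, b, c):
        # positive iff a, b, c make a left turn, i.e. c lies to the left of the directed line ab
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    def in_kernel(z, polygon):
        # polygon: vertices in counterclockwise order; z lies in the kernel iff it lies
        # on or to the left of every directed edge, i.e. in the intersection of the
        # half-planes bounded by the lines supporting the edges
        k = len(polygon)
        return all(left_turn(polygon[i], polygon[(i + 1) % k], z) >= 0 for i in range(k))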
Figure 78 illustrates the recursive construction. It is straightforward to implement but there are numerical issues in the choice of ε(u) that limit the usefulness of this construction.

Figure 78: We fix the outer triangle, remove the degree-5 vertex, recursively construct a straight-line embedding of the rest, and finally add the vertex back.

Tutte's construction. A more useful construction of a straight-line embedding goes back to the work of Tutte. We begin with a definition. Given a finite set of points, x1, x2, ..., xj, the average is

  x = (1/j) (x1 + x2 + ... + xj).

For j = 2, it is the midpoint of the edge and for j = 3, it is the centroid of the triangle. In general, the average is a point somewhere between the xi. Let G = (V, E) be a maximally connected planar graph and a, b, c three vertices connected by three edges. We now follow Tutte's construction to get a mapping ε : V → R2 so that the straight-line drawing of G is a straight-line embedding.

Step 1. Map a, b, c to points ε(a), ε(b), ε(c) spanning a triangle in R2.
Step 2. For each vertex u ∈ V − {a, b, c}, let Nu be the set of neighbors of u. Map u to the average of the images of its neighbors, that is, ε(u) = (1/|Nu|) sum over v ∈ Nu of ε(v).

The fact that the resulting mapping ε : V → R2 gives a straight-line embedding of G is known as Tutte's Theorem. It holds even if G is not quite maximally connected and if the points are not quite the averages of their neighbors. The proof is a bit involved and omitted.

The points ε(u) can be computed by solving a system of linear equations. We illustrate this for the graph in Figure 78. We set ε(a) = (−1, −1), ε(b) = (1, −1), ε(c) = (0, 1). The other five points are computed by solving the system of linear equations Av = 0, where

  A = [ 0  0  1  −5   1   1   1   1 ]
      [ 0  0  1   1  −3   1   0   0 ]
      [ 1  1  1   1   1  −6   1   0 ]
      [ 0  1  1   1   0   1  −5   1 ]
      [ 0  0  1   1   0   0   1  −3 ]

and v is the column vector of points ε(a) to ε(y). There are really two linear systems, one for the horizontal and the other for the vertical coordinates. In each system, we have n − 3 equations and a total of n − 3 unknowns. This gives a unique solution provided the equations are linearly independent. Proving that they are is part of the proof of Tutte's Theorem. Solving the linear equations is a numerical problem that is studied in detail in courses on numerical analysis. 64
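As a hedged sketch of how such a system can be set up and solved in practice, the following Python function assumes numpy is available and that the graph is given as an edge list; all names are ours. It fixes the three outer vertices, writes one averaging equation per remaining vertex, and solves the two systems (one per coordinate) at once.

    import numpy as np

    def tutte_embedding(n, edges, outer, outer_pos):
        # n vertices 0..n-1, edges as pairs, outer: the three fixed vertices a, b, c,
        # outer_pos: their prescribed positions; every other vertex is mapped to the
        # average of its neighbors, which gives one linear system per coordinate
        nbrs = [[] for _ in range(n)]
        for u, v in edges:
            nbrs[u].append(v)
            nbrs[v].append(u)
        pos = {v: np.array(p, dtype=float) for v, p in zip(outer, outer_pos)}
        inner = [v for v in range(n) if v not in pos]
        idx = {v: i for i, v in enumerate(inner)}
        A = np.zeros((len(inner), len(inner)))
        b = np.zeros((len(inner), 2))
        for v in inner:
            A[idx[v], idx[v]] = -len(nbrs[v])
            for w in nbrs[v]:
                if w in idx:
                    A[idx[v], idx[w]] += 1.0
                else:
                    b[idx[v]] -= pos[w]      # fixed neighbors move to the right-hand side
        sol = np.linalg.solve(A, b)           # nonsingular by (part of) Tutte's Theorem
        for v in inner:
            pos[v] = sol[idx[v]]
        return pos

    # example: in K4 with fixed outer triangle, the inner vertex lands at the centroid
    print(tutte_embedding(4, [(0, 1), (1, 2), (2, 0), (0, 3), (1, 3), (2, 3)],
                          [0, 1, 2], [(-1, -1), (1, -1), (0, 1)]))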
  • 65. 18 Surfaces Graphs may be drawn in two, three, or higher dimen- sions, but they are still intrinsically only 1-dimensional. One step up in dimensions, we find surfaces, which are 2-dimensional. Topological 2-manifolds. The simplest kind of surfaces are the ones that on a small scale look like the real plane. A space M is a 2-manifold if every point x ∈ M is locally homeomorphic to R2 . Specifically, there is an open neighborhood N of x and a continuous bijection h : N → R2 whose inverse is also continuous. Such a bicontinuous map is called a homeomorphism. Examples of 2-manifolds are the open disk and the sphere. The for- mer is not compact because it has covers that do not have finite subcovers. Figure 79 shows examples of compact 2- manifolds. If we add the boundary circle to the open disk Figure 79: Three compact 2-manifolds, the sphere, the torus, and the double torus. we get a closed disk which is compact but not every point is locally homeomorphic to R2 . Specifically, a point on the circle has an open neighborhood homeomorphic to the closed half-plane, H2 = {(x1, x2) ∈ R2 | x1 ≥ 0}. A space whose points have open neighborhoods homeomor- phic to R2 or H2 is called a 2-manifolds with boundary; see Figure 80 for examples. The boundary is the subset Figure 80: Three 2-manifolds with boundary, the closed disk, the cylinder, and the Möbius strip. of points with neighborhoods homeomorphic to H2 . It is a 1-manifold (without boundary), that is, every point is locally homeomorphic to R. There is only one type of compact, connected 1-manifold, namely the closed curve. In topology, we do not distinguish spaces that are home- omorphic to each other. Hence, every closed curve is like every other one and they are all homeomorphic to the unit circle, S1 = {x ∈ R2 | kxk = 1}. Triangulations. A standard representation of a compact 2-manifold uses triangles that are glued to each other along shared edges and vertices. A collection K of tri- angles, edges, and vertices is a triangulation of a compact 2-manifold if I. for every triangle in K, its three edges belong to K, and for every edge in K, its two endpoints are ver- tices in K; II. every edge belongs to exactly two triangles and every vertex belongs to a single ring of triangles. An example is shown in Figure 81. To simplify language, we call each element of K a simplex. If we need to be spe- cific, we add the dimension, calling a vertex a 0-simplex, an edge a 1-simplex, and a triangle a 2-simplex. A face of a simplex τ is a simplex σ ⊆ τ. For example, a trian- gle has seven faces, its three vertices, its two edges, and itself. We can now state Condition I more succinctly: if σ is a face of τ and τ ∈ K then σ ∈ K. To talk about Figure 81: A triangulation of the sphere. The eight triangles are glued to form the boundary of an octahedron which is homeo- morphic to the sphere. the inverse of the face relation, we define the star of a simplex σ as the set of simplices that contain σ as a face, St σ = {τ ∈ K | σ ⊆ τ}. Sometimes we think of the star as a set of simplices and sometimes as a set of points, namely the union of interiors of the simplices in the star. With the latter interpretation, we can now express Condi- tion II more succinctly: the star of every simplex in K is homeomorphic to R2 . Data structure. When we store a 2-manifold, it is use- ful to keep track of which side we are facing and where we are going so that we can move around efficiently. 
The core piece of our data structure is a representation of the symmetry group of a triangle. This group is iso- morphic to the group of permutations of three elements, 65
  • 66. the vertices of the triangle. We call each permutation an ordered triangle and use cyclic shifts and transposi- tions to move between them; see Figure 82. We store ENEXT ENEXT ENEXT ENEXT ENEXT ENEXT SYM SYM SYM c a b a b b c c a b a c b a c b a c Figure 82: The symmetry group of the triangle consists of six ordered versions. Each ordered triangle has a lead vertex and a lead directed edge. the entire symmetry group in a single node of an abstract graph, with arcs between neighboring triangles. Further- more, we store the vertices in a linear array, V [1..n]. For each ordered triangle, we store the index of the lead ver- tex and a pointer to the neighboring triangle that shares the same directed lead edge. A pointer in this context is the address of a node together with a three-bit inte- ger, ι, that identifies the ordered version of the triangle we refer to. Suppose for example that we identify the ordered versions abc, bca, cab, bac, cba, acb of a triangle with ι = 0, 1, 2, 4, 5, 6, in this sequence. Then we can move between different ordered versions of the same tri- angle using the following functions. ordTri ENEXT(µ, ι) if ι ≤ 2 then return (µ, (ι + 1) mod 3) else return (µ, (ι + 1) mod 3 + 4) endif. ordTri SYM(µ, ι) return (µ, (ι + 4) mod 8). To get the index of the lead vertex, we use the integer func- tion ORG(µ, ι) and to get the pointer to the neighboring triangle, we use FNEXT(µ, ι). Orientability. A 2-manifold is orientable if it has two distinct sides, that is, if we move around on one we stay there and never cross over to the other side. The one exam- ple of a non-orientable manifold we have seen so far is the Möbious strip in Figure 80. There are also non-orientable, compact 2-manifolds (without boundary), as we can see in Figure 83. We use the data structure to decide whether or Figure 83: Two non-orientable, compact 2-manifolds, the pro- jective plane on the left and the Klein bottle on the right. not a 2-manifold is orientable. Note that the cyclic shift partitions the set of six ordered triangles into two orien- tations, each consisting of three triangles. We say two neighboring triangles are consistently oriented if they dis- agree on the direction of the shared edge, as in Figure 81. Using depth-first search, we visit all triangles and orient them consistently, if possible. At the first visit, we ori- ent the triangle consistent with the preceding, neighboring triangle. At subsequence visits, we check for consistent orientation. boolean ISORNTBL(µ, ι) if µ is unmarked then mark µ; choose the orientation that contains ι; bx = ISORNTBL(FNEXT(SYM(µ, ι))); by = ISORNTBL(FNEXT(ENEXT(SYM(µ, ι)))); bz = ISORNTBL(FNEXT(ENEXT2 (SYM(µ, ι)))); return bx and by and bz else return [orientation of µ contains ι] endif. There are two places where we return a boolean value. At the second place, it indicates whether or not we have con- sistent orientation in spite of the visited triangle being ori- ented prior to the visit. At the first place, the boolean value indicates whether or not we have found a contradiction to orientablity so far. A value of FALSE anywhere during the computation is propagated to the root of the search tree telling us that the 2-manifold is non-orientable. The run- ning time is proportional to the number of triangles in the triangulation of the 2-manifold. Classification. For the sphere and the torus, it is easy to see how to make them out of a sheet of paper. Twist- ing the paper gives a non-orientable 2-manifold. Perhaps 66
  • 67. most difficult to understand is the projective plane. It is obtained by gluing each point of the sphere to its antipodal point. This way, the entire northern hemisphere is glued to the southern hemisphere. This gives the disk except that we still need to glue points of the bounding circle (the equator) in pairs, as shown in the third paper construction in Figure 84. The Klein bottle is easier to imagine as it is obtained by twisting the paper just once, same as in the construction of the Möbius strip. b a b b b a b a a b b b a a a a Figure 84: From left to right: the sphere, the torus, the projective plane, and the Klein bottle. There is a general method here that can be used to clas- sify the compact 2-manifolds. Given two of them, we con- struct a new one by removing an open disk each and glu- ing the 2-manifolds along the two circles. The result is called the connected sum of the two 2-manifolds, denoted as M#N. For example, the double torus is the connected sum of two tori, T2 #T2 . We can cut up the g-fold torus into a flat sheet of paper, and the canonical way of doing this gives a 4g-gon with edges identified in pairs as shown in Figure 85 on the left. The number g is called the genus of the manifold. Similarly, we can get new non-orientable 1 1 2 2 3 3 4 1 1 1 1 2 2 4 2 2 a a a a b b b a a a a a a a a b Figure 85: The polygonal schema in standard form for the double torus and the double Klein bottle. manifolds from the projective plane, P2 , by forming con- nected sums. Cutting up the g-fold projective plane gives a 2g-gon with edges identified in pairs as shown in Figure 85 on the right. We note that the constructions of the pro- jective plane and the Klein bottle in Figure 84 are both not in standard form. A remarkable result which is now more than a century old is that every compact 2-manifold can be cut up to give a standard polygonal schema. This implies a classification of the possibilities. CLASSIFICATION THEOREM. The members of the fami- lies S2 , T2 , T2 #T2 , . . . and P2 , P2 #P2 , . . . are non- homeomorphic and they exhaust the family of com- pact 2-manifolds. Euler characteristic. Suppose we are given a triangula- tion, K, of a compact 2-manifold, M. We already know how to decide whether or not M is orientable. To deter- mine its type, we just need to find its genus, which we do by counting simplices. The Euler characteristic is χ = #vertices − #edges + #triangles. Let us look at the orientable case first. We have a 4g-gon which we triangulate. This is a planar graph with n − m + ℓ = 2. However, 2g edge are counted double, the 4g vertices of the 4g-gon are all the same, and the outer face is not a triangle in K. Hence, χ = (n − 4g + 1) − (m − 2g) + (ℓ − 1) = (n − m + ℓ) − 2g which is equal to 2 − 2g. The same analysis can be used in the non-orientable case in which we get χ = (n − 2g + 1) − (m − g) + (ℓ − 1) = 2 − g. To decide whether two compact 2-manifolds are homeomorphic it suffices to determine whether they are both orientable or both non- orientable and, if they are, whether they have the same Euler characteristic. This can be done in time linear in the number of simplices in their triangulations. This result is in sharp contrast to the higher-dimensional case. The classification of compact 3-manifolds has been a longstanding open problem in Mathematics. Perhaps the recent proof of the Poincaré conjecture by Perelman brings us close to a resolution. 
Beyond three dimensions, the situation is hopeless, that is, deciding whether or not two triangulated compact manifolds of dimension four or higher are homeomorphic is undecidable. 67
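Back in two dimensions, the classification can be carried out mechanically. The following Python sketch (our own helper functions, assuming the input is a valid triangulation of a compact 2-manifold and that orientability has already been decided, for example with ISORNTBL) counts simplices, computes the Euler characteristic, and reads off the genus from χ = 2 − 2g in the orientable case and χ = 2 − g in the non-orientable case.

    def euler_characteristic(triangles):
        # triangles: list of vertex triples of a triangulation of a compact 2-manifold
        vertices = {v for t in triangles for v in t}
        edges = {frozenset(e) for t in triangles
                 for e in ((t[0], t[1]), (t[1], t[2]), (t[2], t[0]))}
        return len(vertices) - len(edges) + len(triangles)

    def classify(triangles, orientable):
        # chi = 2 - 2g for the orientable family, chi = 2 - g for the non-orientable family
        chi = euler_characteristic(triangles)
        genus = (2 - chi) // 2 if orientable else 2 - chi
        return chi, genus

    # the octahedron of Figure 81 triangulates the sphere: chi = 6 - 12 + 8 = 2, genus 0
    octahedron = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1),
                  (5, 2, 1), (5, 3, 2), (5, 4, 3), (5, 1, 4)]
    print(classify(octahedron, orientable=True))   # (2, 0)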
  • 68. 19 Homology In topology, the main focus is not on geometric size but rather on how a space is connected. The most elementary notion distinguishes whether we can go from one place to another. If not then there is a gap we cannot bridge. Next we would ask whether there is a loop going around an obstacle, or whether there is a void missing in the space. Homology is a formalization of these ideas. It gives a way to define and count holes using algebra. The cyclomatic number of a graph. To motivate the more general concepts, consider a connected graph, G, with n vertices and m edges. A spanning tree has n − 1 edges and every additional edge forms a unique cycle to- gether with edges in this tree; see Figure 86. Every other Figure 86: A tree with three additional edges defining the same number of cycles. cycle in G can be written as a sum of these m − (n − 1) cycles. To make this concrete, we define a cycle as a sub- set of the edges such that every vertex belongs to an even number of these edges. A cycle does not need to be con- nected. The sum of two cycles is the symmetric difference of the two sets such that multiple edges erase each other in pairs. Clearly, the sum of two cycles is again a cy- cle. Every cycle, γ, in G contains some positive number of edges that do not belong to the spanning tree. Call- ing these edges e1, e2, . . . , ek and the cycles they define γ1, γ2, . . . , γk, we claim that γ = γ1 + γ2 + . . . + γk. To see this assume that δ = γ1 + γ2 + . . .+ γk is different from γ. Then γ+δ is again a cycle but it contains no edges that do not belong to the spanning tree. Hence γ + δ = ∅ and therefore γ = δ, as claimed. This implies that the m−n+1 cycles form a basis of the group of cycles which motivates us to call m − n + 1 the cyclomatic number of the graph. Note that the basis depends on the choice of spanning tree while the cyclomatic number is independent of that choice. Simplicial complexes. We begin with a combinatorial representation of a topological space. Using a finite ground set of vertices, V , we call a subset σ ⊆ V an abstract simplex. Its dimension is one less than the car- dinality, dim σ = |σ| − 1. A face is a subset τ ⊆ σ. DEFINITION. An abstract simplicial complex over V is a system K ⊆ 2V such that σ ∈ K and τ ⊆ σ implies τ ∈ K. The dimension of K is the largest dimension of any sim- plex in K. A graph is thus a 1-dimensional abstract sim- plicial complex. Just like for graphs, we sometimes think of K as an abstract structure and at other times as a geo- metric object consisting of geometric simplices. In the lat- ter interpretation, we glue the simplices along shared faces to form a geometric realization of K, denoted as |K|. We say K triangulates a space X if there is a homeomorphism h : X → |K|. We have seen 1- and 2-dimensional exam- ples in the preceding sections. The boundary of a simplex σ is the collection of co-dimension one faces, ∂σ = {τ ⊆ σ | dim τ = dim σ − 1}. If dim σ = p then the boundary consists of p + 1 (p − 1)- simplices. Every (p − 1)-simplex has p (p − 2)-simplices in its own boundary. This way we get (p + 1)p (p − 2)- simplices, counting each of the p+1 p−1 = p+1 2 (p − 2)- dimensional faces of σ twice. Chain complexes. We now generalize the cycles in graphs to cycles of different dimensions in simplicial com- plexes. A p-chain is a set of p-simplices in K. The sum of two p-chains is their symmetric difference. We usually write the sets as formal sums, c = a1σ1 + a2σ2 + . . . + anσn; d = b1σ1 + b2σ2 + . . . 
+ bnσn, where the ai and bi are either 0 or 1. Addition can then be done using modulo 2 arithmetic, c +2 d = (a1 +2 b1)σ1 + . . . + (an +2 bn)σn, where ai +2 bi is the exclusive or operation. We simplify notation by dropping the subscript but note that the two plus signs are different, one modulo two and the other a formal notation separating elements in a set. The p-chains 68
  • 69. form a group, which we denote as (Cp, +) or simply Cp. Note that the boundary of a p-simplex is a (p − 1)-chain, an element of Cp−1. Extending this concept linearly, we define the boundary of a p-chain as the sum of boundaries of its simplices, ∂c = a1∂σ1+. . .+an∂σn. The boundary is thus a map between chain groups and we sometimes write the dimension as index for clarity, ∂p : Cp → Cp−1. It is a homomorphism since ∂p(c + d) = ∂pc + ∂pd. The infinite sequence of chain groups connected by boundary homomorphisms is called the chain complex of K. All groups of dimension smaller than 0 and larger than the di- mension of K are trivial. It is convenient to keep them around to avoid special cases at the ends. A p-cycle is a p-chain whose boundary is zero. The sum of two p-cycles is again a p-cycle so we get a subgroup, Zp ⊆ Cp. A p-boundary is a p-chain that is the boundary of a (p + 1)- chain. The sum of two p-boundaries is again a p-boundary so we get another subgroup, Bp ⊆ Cp, Taking the bound- ary twice in a row gives zero for every simplex and thus for every chain, that is, (∂p(∂p+1d) = 0. It follows that Bp is a subgroup of Zp. We can therefore draw the chain complex as in Figure 87. B Z Cp+1 p+1 p+1 C Z B p p p C Z B p−1 p−1 p−1 0 0 0 p+2 p+1 p p−1 Figure 87: The chain complex consisting of a linear sequence of chain, cycle, and boundary groups connected by homomor- phisms. Homology groups. We would like to talk about cycles but ignore the boundaries since they do not go around a hole. At the same time, we would like to consider two cycles the same if they differ by a boundary. See Figure 88 for a few 1-cycles, some of which are 1-boundaries and some of which are not. This is achieved by taking the quotient of the cycle group and the boundary group. The result is the p-th homology group, Hp = Zp/Bp. Its elements are of the form [c] = c + Bp, where c is a p- cycle. [c] is called a homology class, c is a representative of [c], and any two cycles in [c] are homologous denoted γ δ ε Figure 88: The 1-cycles γ and δ are not 1-boundaries. Adding the 1-boundary ε to δ gives a 1-cycle homologous to δ. as c ∼ c′ . Note that [c] = [c′ ] whenever c ∼ c′ . Also note that [c + d] = [c′ + d′ ] whenever c ∼ c′ and d ∼ d′ . We use this as a definition of addition for homology classes, so we again have a group. For example, the 1-st homology group of the torus consists of four elements, [0] = B1, [γ] = γ +B1, [δ] = δ +B1, and [γ +δ] = γ +δ +B1. We often draw the elements as the corners of a cube of some dimension; see Figure 89. If the dimension is β then it has [0] [γ] [γ+δ] [γ] Figure 89: The four homology classes of H1 are generated by two classes, [γ] and [δ]. 2β corners. The dimension is also the number of classes needed to generate the group, the size of the basis. For the p-th homology group, this number is βp = rank Hp = log2 |Hp|, the p-th Betti number. For the torus we have β0 = 1; β1 = 2; β2 = 1, and βp = 0 for all p 6= 0, 1, 2. Every 0-chain is a 0- cycle. Two 0-cycles are homologous if they are both the sum of an even number or both of an odd number of ver- tices. Hence β0 = log2 2 = 1. We have seen the reason for β1 = 2 before. Finally, there are only two 2-cycles, namely 0 and the set of all triangles. The latter is not a boundary, hence β2 = log2 2 = 1. Boundary matrices. To compute homology groups and Betti numbers, we use a matrix representation of the sim- plicial complex. Specifically, we store the boundary ho- momorphism for each dimension, setting ∂p[i, j] = 1 if 69
  • 70. the i-th (p − 1)-simplex is in the boundary of the j-th p- simplex, and ∂p[i, j] = 0, otherwise. For example, if the complex consists of all faces of the tetrahedron, then the boundary matrices are ∂0 = 0 0 0 0 ; ∂1 =     1 1 1 0 0 0 1 0 0 1 1 0 0 1 0 1 0 1 0 0 1 0 1 1     ; ∂2 =         1 1 0 0 1 0 1 0 0 1 1 0 1 0 0 1 0 1 0 1 0 0 1 1         ; ∂3 =     1 1 1 1     . Given a p-chain as a column vector, v, its boundary is computed by matrix multiplication, ∂pv. The result is a combination of columns in the p-th boundary matrix, as specified by v. Thus, v is a p-cycle iff ∂pv = 0 and v is a p-boundary iff there is u such that ∂p+1u = v. Matrix reduction. Letting np be the number of p- simplices in K, we note that it is also the rank of the p-th chain group, np = rank Cp. The p-th boundary matrix thus has np−1 rows and np columns. To figure the sizes of the cycle and boundary groups, and thus of the homology groups, we reduce the matrix to normal form, as shown in Figure 90. The algorithm of choice uses column and Z rankB C C −1 p p rank rank rank p p −1 Figure 90: The p-th boundary matrix in normal form. The entries in the shaded portion of the diagonal are 1 and all other entries are 0. row operations similar to Gaussian elimination for solv- ing a linear system. We write it recursively, calling it with m = 1. void REDUCE(m) if ∃k, l ≥ m with ∂p[k, l] = 1 then exchange rows m and k and columns m and l; for i = m + 1 to np−1 do if ∂p[i, m] = 1 then add row m to row i endif endfor; for j = m + 1 to np do if ∂p[m, j] = 1 then add column m to column j endif endfor; REDUCE(m + 1) endif. For each recursive call, we have at most a linear number of row and column operations. The total running time is therefore at most cubic in the number of simplices. Figure 90 shows how we interpret the result. Specifically, the number of zero columns is the rank of the cycle group, Zp, and the number of 1s in the diagonal is the rank of the boundary group, Bp−1. The Betti number is the difference, βp = rank Zp − rankBp, taking the rank of the boundary group from the reduced matrix one dimension up. Working on our example, we get the following reduced matrices. ∂0 = 0 0 0 0 ; ∂1 =     1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0     ; ∂2 =         1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0         ; ∂3 =     1 0 0 0     . Writing zp = rank Zp and bp = rank Bp, we get z0 = 4 from the zeroth and b0 = 3 from the first reduced bound- ary matrix. Hence β0 = z0 = b0 = 1. Furthermore, 70
z1 = 3 and b1 = 3 giving β1 = 0, z2 = 1 and b2 = 1 giving β2 = 0, and z3 = 0 giving β3 = 0. These are the Betti numbers of the closed ball.

Euler-Poincaré Theorem. The Euler characteristic of a simplicial complex is the alternating sum of simplex numbers,

  χ = sum over p ≥ 0 of (−1)^p np.

Recalling that np is the rank of the p-th chain group and that it equals the rank of the p-th cycle group plus the rank of the (p − 1)-st boundary group, we get

  χ = sum over p ≥ 0 of (−1)^p (zp + b(p−1)) = sum over p ≥ 0 of (−1)^p (zp − bp),

which is the same as the alternating sum of Betti numbers. To appreciate the beauty of this result, we need to know that the Betti numbers do not depend on the triangulation chosen for the space. The proof of this property is technical and omitted. This now implies that the Euler characteristic is an invariant of the space, same as the Betti numbers.

EULER-POINCARÉ THEOREM. χ = sum over p of (−1)^p βp. 71
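The whole computation can be sketched compactly in Python. The function below uses ordinary Gaussian elimination over Z/2 instead of the recursive REDUCE, but it produces the same ranks; the matrices in the example are the boundary matrices of the tetrahedron discussed above. All names are ours.

    def rank_mod2(matrix):
        # Gaussian elimination over Z/2; the rank equals the number of 1s on the
        # diagonal of the normal form in Figure 90
        M = [row[:] for row in matrix]
        rank, rows, cols = 0, len(M), len(M[0]) if matrix else 0
        for col in range(cols):
            pivot = next((r for r in range(rank, rows) if M[r][col] == 1), None)
            if pivot is None:
                continue
            M[rank], M[pivot] = M[pivot], M[rank]
            for r in range(rows):
                if r != rank and M[r][col] == 1:
                    M[r] = [(a + b) % 2 for a, b in zip(M[r], M[rank])]
            rank += 1
        return rank

    def betti_numbers(boundaries, num_simplices):
        # boundaries[p] is the matrix of the boundary map on p-chains;
        # beta_p = rank Z_p - rank B_p = (n_p - rank d_p) - rank d_(p+1)
        ranks = [rank_mod2(b) if b else 0 for b in boundaries]
        ranks.append(0)              # the boundary map above the top dimension is zero
        return [num_simplices[p] - ranks[p] - ranks[p + 1]
                for p in range(len(num_simplices))]

    # boundary matrices of the tetrahedron with all its faces (a closed ball)
    d0 = [[0, 0, 0, 0]]
    d1 = [[1,1,1,0,0,0], [1,0,0,1,1,0], [0,1,0,1,0,1], [0,0,1,0,1,1]]
    d2 = [[1,1,0,0], [1,0,1,0], [0,1,1,0], [1,0,0,1], [0,1,0,1], [0,0,1,1]]
    d3 = [[1], [1], [1], [1]]
    print(betti_numbers([d0, d1, d2, d3], [4, 6, 4, 1]))   # [1, 0, 0, 0]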
  • 72. Fifth Homework Assignment Write the solution to each problem on a single page. The deadline for handing in solutions is November 13. Problem 1. (20 points). Let G = (V, E) be a maxi- mally connected planar graph and recall that [k] = {1, 2, . . ., k}. A vertex k-coloring is a mapping γ : V → [k] such that γ(u) 6= γ(v) whenever u 6= v are adjacent, and an edge k-coloring is a mapping η : E → [k] such that η(e) 6= η(f) whenever e 6= f bound a common triangle. Prove that if G has a ver- tex 4-coloring then it also has an edge 3-coloring. Problem 2. (20 = 10 + 10 points). Let K be a set of triangles together with their edges and vertices. The vertices are represented by a linear array, as usual, but there is no particular ordering information in the way the edges and triangles are given. In other words, the edges are just a list of index pairs and the triangles are a list of index triplets into the vertex array. (a) Give an algorithm that decides whether or not K is a triangulation of a 2-manifold. (b) Analyze your algorithm and collect credit points if the running time of your algorithm is linear in the number of triangles. Problem 3. (20 = 5+7+8 points). Determine the type of 2-manifold with boundary obtained by the following constructions. (a) Remove a cylinder from a torus in such a way that the rest of the torus remains connected. (b) Remove a disk from the projective plane. (c) Remove a Möbius strip from a Klein bottle. Whenever we remove a piece, we do this like cutting with scissors so that the remainder is still closed, in each case a 2-manifold with boundary. Problem 4. (20 = 5 + 5 + 5 + 5 points). Recall that the sphere is the space of points at unit distance from the origin in three-dimensional Euclidean space, S2 = {x ∈ R3 | kxk = 1}. (a) Give a triangulation of S2 . (b) Give the corresponding boundary matrices. (c) Reduce the boundary matrices. (d) Give the Betti numbers of S2 . Problem 5. (20 = 10 + 10 points). The dunce cap is ob- tained by gluing the three edges of a triangular sheet of paper to each other. [After gluing the first two edges you get a cone, with the glued edges forming a seam connecting the cone point with the rim. In the final step, wrap the seam around the rim, gluing all three edges to each other. To imagine how this work, it might help to think of the final result as similar to the shell of a snale.] (a) Is the dunce cap a 2-manifold? Justify your an- swer. (b) Give a triangulation of the dunce cap, making sure that no two edges connect the same two vertices and no two triangles connect the same three vertices. 72
  • 73. VI GEOMETRIC ALGORITHMS 20 Plane-Sweep 21 Delaunay Triangulations 22 Alpha Shapes Sixth Homework Assignment 73
20 Plane-Sweep

Plane-sweep is an algorithmic paradigm that emerges in the study of two-dimensional geometric problems. The idea is to sweep the plane with a line and perform the computations in the sequence the data is encountered. In this section, we solve three problems with this paradigm: we construct the convex hull of a set of points, we triangulate the convex hull using the points as vertices, and we test a set of line segments for crossings.

Convex hull. Let S be a finite set of points in the plane, each given by its two coordinates. The convex hull of S, denoted by conv S, is the smallest convex set that contains S. Figure 91 illustrates the definition for a set of nine points. Imagine the points as solid nails in a planar board. An intuitive construction stretches a rubber band around the nails. After letting go, the nails prevent the complete relaxation of the rubber band, which will then trace the boundary of the convex hull.

Figure 91: The convex hull of nine points, which we represent by the counterclockwise sequence of boundary vertices: 1, 3, 6, 8, 9, 2.

To construct the counterclockwise cyclic sequence of boundary vertices representing the convex hull, we sweep a vertical line from left to right over the data. At any moment in time, the points in front (to the right) of the line are untouched and the points behind (to the left) of the line have already been processed.

Step 1. Sort the points from left to right and relabel them in this sequence as x1, x2, ..., xn.
Step 2. Construct a counterclockwise triangle from the first three points: x1x2x3 or x1x3x2.
Step 3. For i from 4 to n, add the next point xi to the convex hull of the preceding points by finding the two lines that pass through xi and support the convex hull.

The algorithm is illustrated in Figure 92, which shows the addition of the sixth point in the data set.

Figure 92: The vertical sweep-line passes through point 6. To add 6, we substitute 6 for the sequence of vertices on the boundary between 3 and 5.

Orientation test. A critical test needed to construct the convex hull is to determine the orientation of a sequence of three points. In other words, we need to be able to distinguish whether we make a left-turn or a right-turn as we go from the first to the middle and then the last point in the sequence. A convenient way to determine the orientation evaluates the determinant of a three-by-three matrix. More precisely, the points a = (a1, a2), b = (b1, b2), c = (c1, c2) form a left-turn iff

  det [ 1  a1  a2 ]
      [ 1  b1  b2 ]  >  0.
      [ 1  c1  c2 ]

The three points form a right-turn iff the determinant is negative and they lie on a common line iff the determinant is zero.

  boolean LEFT(Points a, b, c)
    return [a1(b2 − c2) + b1(c2 − a2) + c1(a2 − b2) > 0].

To see that this formula is correct, we may convince ourselves that it is correct for three non-collinear points, e.g. a = (0, 0), b = (1, 0), and c = (0, 1). Remember also that the determinant is twice the signed area of the triangle and is therefore a continuous function that passes through zero only when the three points are collinear. Since we can continuously move every left-turn to every other left-turn without leaving the class of left-turns, it follows that the sign of the determinant is the same for all of them.

Finding support lines. We use a doubly-linked cyclic list of vertices to represent the convex hull boundary. Each 74
  • 75. node in the list contains pointers to the next and the previ- ous nodes. In addition, we have a pointer last to the last vertex added to the list. This vertex is also the rightmost in the list. We add the i-th point by connecting it to the vertices µ → pt and λ → pt identified in a counterclock- wise and a clockwise traversal of the cycle starting at last, as illustrated in Figure 93. We simplify notation by using last µ ν λ Figure 93: The upper support line passes through the first point µ → pt that forms a left-turn from ν → pt to µ → next → pt. nodes in the parameter list of the orientation test instead of the points they store. µ = λ = last; create new node with ν → pt = i; while RIGHT(ν, µ, µ → next) do µ = µ → next endwhile; while LEFT(ν, λ, λ → prev) do λ = λ → prev endwhile; ν → next = µ; ν → prev = λ; µ → prev = λ → next = ν; last = ν. The effort to add the i-th point can be large, but if it is then we remove many previously added vertices from the list. Indeed, each iteration of the for-loop adds only one vertex to the cyclic list. We charge $2 for the addition, one dollar for the cost of adding and the other to pay for the future deletion, if any. The extra dollars pay for all iterations of the while-loops, except for the first and the last. This implies that we spend only constant amortized time per point. After sorting the points from left to right, we can therefore construct the convex hull of n points in time O(n). Triangulation. The same plane-sweep algorithm can be used to decompose the convex hull into triangles. All we need to change is that points and edges are never re- moved and a new point is connected to every point exam- ined during the two while-loops. We define a (geometric) triangulation of a finite set of points S in the plane as a maximally connected straight-line embedding of a planar graph whose vertices are mapped to points in S. Figure 94 shows the triangulation of the nine points in Figure 91 con- structed by the plane-sweep algorithm. A triangulation is 3 8 9 2 1 5 4 7 6 Figure 94: Triangulation constructed with the plane-sweep algo- rithm. not necessarily a maximally connected planar graph since the prescribed placement of the points fixes the boundary of the outer face to be the boundary of the convex hull. Letting k be the number of edges of that boundary, we would have to add k − 3 more edges to get a maximally connected planar graph. It follows that the triangulation has m = 3n − (k + 3) edges and ℓ = 2n − (k + 2) triangles. Line segment intersection. As a third application of the plane-sweep paradigm, we consider the problem of decid- ing whether or not n given line segments have pairwise disjoint interiors. We allow line segments to share end- points but we do not allow them to cross or to overlap. We may interpret this problem as deciding whether or not a straight-line drawing of a graph is an embedding. To sim- plify the description of the algorithm, we assume no three endpoints are collinear, so we only have to worry about crossings and not about other overlaps. How can we decide whether or not a line segment with endpoint u = (u1, u2) and v = (v1, v2) crosses another line segment with endpoints p = (p1, p2) and q = (q1, q2)? Figure 95 illustrates the question by show- ing the four different cases of how two line segments and the lines they span can intersect. The line segments cross iff uv intersects the line of pq and pq intersects the line of uv. This condition can be checked using the orientation test. 
  boolean CROSS(Points u, v, p, q)
    return [(LEFT(u, v, p) xor LEFT(u, v, q)) and (LEFT(p, q, u) xor LEFT(p, q, v))].

We can use the above function to test all n(n − 1)/2 pairs of line segments, which takes time O(n^2). 75
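For concreteness, here is a direct Python transcription of LEFT, CROSS, and the quadratic test; the function names are ours. Under the section's assumption that no three endpoints are collinear, the strict comparisons suffice.

    def left(a, b, c):
        # positive iff a, b, c form a left turn; the expression is the 3-by-3 determinant above
        return a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]) > 0

    def cross(u, v, p, q):
        # uv and pq cross iff p, q lie on opposite sides of the line uv and
        # u, v lie on opposite sides of the line pq
        return (left(u, v, p) != left(u, v, q)) and (left(p, q, u) != left(p, q, v))

    def any_crossing_naive(segments):
        # brute force over all n(n-1)/2 pairs of segments, O(n^2) time
        return any(cross(*s, *t)
                   for i, s in enumerate(segments) for t in segments[i + 1:])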
Figure 95: Three pairs of non-crossing and one pair of crossing line segments.

Plane-sweep algorithm. We obtain a faster algorithm by sweeping the plane with a vertical line from left to right, as before. To avoid special cases, we assume that no two endpoints are the same or lie on a common vertical line. During the sweep, we maintain the subset of line segments that intersect the sweep-line in the order they meet the line, as shown in Figure 96. We store this subset in a dictionary, which is updated at every endpoint.

Figure 96: Five of the line segments intersect the sweep-line at its current position and two of them cross.

Only line segments that are adjacent in the ordering along the sweep-line are tested for crossings. Indeed, two line segments that cross are adjacent right before the sweep-line passes through the crossing, if not earlier.

Step 1. Sort the 2n endpoints from left to right and relabel them in this sequence as x1, x2, ..., x2n. Each point still remembers the index of the other endpoint of its line segment.
Step 2. For i from 1 to 2n, process the i-th endpoint as follows:
  Case 2.1 xi is the left endpoint of the line segment xixj. Therefore, i < j. Insert xixj into the dictionary and let uv and pq be its predecessor and successor. If CROSS(u, v, xi, xj) or CROSS(p, q, xi, xj) then report the crossing and stop.
  Case 2.2 xi is the right endpoint of the line segment xixj. Therefore, i > j. Let uv and pq be the predecessor and the successor of xixj. If CROSS(u, v, p, q) then report the crossing and stop. Delete xixj from the dictionary.

We do an insertion into the dictionary for each left endpoint and a deletion from the dictionary for each right endpoint, both in time O(log n). In addition, we do at most two crossing tests per endpoint, which takes constant time. In total, the algorithm takes time O(n log n) to test whether a set of n line segments contains two that cross. 76
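The following Python sketch follows Step 1 and Step 2 above; it is not the course's implementation, and all names are ours. A plain list kept in sweep-line order stands in for the dictionary, so each insertion costs O(n) rather than O(log n); substituting a balanced search tree recovers the O(n log n) bound. As above, we assume no two endpoints coincide or lie on a common vertical line.

    def left(a, b, c):
        return a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]) > 0

    def cross(u, v, p, q):
        return (left(u, v, p) != left(u, v, q)) and (left(p, q, u) != left(p, q, v))

    def y_at(seg, x):
        # ordinate of the non-vertical segment seg at sweep position x
        (x1, y1), (x2, y2) = seg
        return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

    def has_crossing(segments):
        events = []
        for s in segments:
            a, b = sorted(s)                  # left endpoint first
            events.append((a[0], 'L', (a, b)))
            events.append((b[0], 'R', (a, b)))
        events.sort()                         # Step 1: sweep from left to right
        active = []                           # segments meeting the sweep-line, bottom to top
        for x, kind, seg in events:
            if kind == 'L':                   # Case 2.1: insert, test against both neighbors
                i = 0
                while i < len(active) and y_at(active[i], x) < y_at(seg, x):
                    i += 1
                active.insert(i, seg)
                below = active[i - 1] if i > 0 else None
                above = active[i + 1] if i + 1 < len(active) else None
                if (below and cross(*below, *seg)) or (above and cross(*above, *seg)):
                    return True
            else:                             # Case 2.2: test the two neighbors, then delete
                i = active.index(seg)
                if 0 < i < len(active) - 1 and cross(*active[i - 1], *active[i + 1]):
                    return True
                active.pop(i)
        return False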
  • 77. 21 Delaunay Triangulations The triangulations constructing by plane-sweep are typi- cally of inferior quality, that is, there are many long and skinny triangles and therefore many small and many large angles. We study Delaunay triangulations which distin- guish themselves from all other triangulations by a num- ber of nice properties, including they have fast algorithms and they avoid small angles to the extent possible. Plane-sweep versus Delaunay triangulation. Figures 97 and 98 show two triangulations of the same set of points, one constructed by plane-sweep and the other the Delaunay triangulation. The angles in the Delaunay trian- Figure 97: Triangulation constructed by plane-sweep. Points on the same vertical line are processed from bottom to top. gulation seem consistently larger than those in the plane- sweep triangulation. This is not a coincidence and it can be proved that the Delaunay triangulation maximizes the minimum angle for every input set. Both triangulations Figure 98: Delaunay triangulation of the same twenty-one points triangulated in Figure 97. contain the edges that bound the convex hull of the input set. Voronoi diagram. We introduce the Delaunay triangu- lation indirectly, by first defining a particular decomposi- tion of the plane into regions, one per point in the finite data set S. The region of the point u in S contains all points x in the plane that are at least as close to u as to any other point in S, that is, Vu = {x ∈ R2 | kx − uk ≤ kx − vk, v ∈ S}, where kx − uk = [(x1 − u1)2 + (x2 − u2)2 ]1/2 is the Eu- clidean distance between the points x and u. We refer to Vu as the Voronoi region of u. It is closed and its bound- ary consists of Voronoi edges which Vu shares with neigh- boring Voronoi regions. A Voronoi edge ends in Voronoi vertices which it shares with other Voronoi edges. The Voronoi diagram of S is the collection of Voronoi regions, edges and vertices. Figure 99 illustrates the definitions. Let n be the number of points in S. We list some of the properties that will be important later. Figure 99: The (solid) Voronoi diagram drawn above the (dot- ted) Delaunay triangulation of the same twenty-one points trian- gulated in Figures 97 and 98. Some of the Voronoi edges are too far out to fit into the picture. • Each Voronoi region is a convex polygon constructed as the intersection of n − 1 closed half-planes. • The Voronoi region Vu is bounded (finite) iff u lies in the interior of the convex hull of S. • The Voronoi regions have pairwise disjoint interiors and together cover the entire plane. Delaunay triangulation. We define the Delaunay trian- gulation as the straight-line dual of the Voronoi diagram. Specifically, for every pair of Voronoi regions Vu and Vv that share an edge, we draw the line segment from u to v. By construction, every Voronoi vertex, x, has j ≥ 3 clos- est input points. Usually there are exactly three closest 77
  • 78. points, u, v, w, in which case the triangle they span be- longs to the Delaunay triangulation. Note that x is equally far from u, v, and w and further from all other points in S. This implies the empty circle property of Delaunay tri- angles: all points of S − {u, v, w} lie outside the circum- scribed circle of uvw. Similarly, for each Delaunay edge uv, there is a circle that passes through u and v such that all points of S − {u, v} lie outside the circle. For exam- ple, the circle centered at the midpoint of the Voronoi edge shared by Vu and Vv is empty in this sense. This property can be used to prove that the edge skeleton of the Delau- nay triangulation is a straight-line embedding of a planar graph. Figure 100: A Voronoi vertex of degree 5 and the corresponding pentagon in the Delaunay triangulation. The dotted edges com- plete the triangulation by decomposing the pentagon into three triangles. Now suppose there is a vertex with degree j 3. It cor- responds to a polygon with j 3 edges in the Delaunay triangulation, as illustrated in Figure 100. Strictly speak- ing, the Delaunay triangulation is no longer a triangulation but we can complete it to a triangulation by decompos- ing each j-gon into j − 2 triangles. This corresponds to perturbing the data points every so slightly such that the degree-j Voronoi vertices are resolved into trees in which j − 2 degree-3 vertices are connected by j − 3 tiny edges. Local Delaunayhood. Given a triangulation of a finite point set S, we can test whether or not it is the Delaunay triangulation by testing each edge against the two trian- gles that share the edge. Suppose the edge uv in the tri- angulation T is shared by the triangles uvp and uvq. We call uv locally Delaunay, or lD for short, if q lies on or outside the circle that passes through u, v, p. The condi- tion is symmetric in p and q because the circle that passes through u, v, p intersects the first circle in points u and v. It follows that p lies on or outside the circle of u, v, q iff q lies on or outside the circle of u, v, p. We also call uv lo- cally Delaunay if it bounds the convex hull of S and thus belongs to only one triangle. The local condition on the edges implies a global property. DELAUNAY LEMMA. If every edge in a triangulation K of S is locally Delaunay then K is the Delaunay tri- angulation of S. Although every edge of the Delaunay triangulation is lo- cally Delaunay, the Delaunay Lemma is not trivial. In- deed, K may contain edges that are locally Delaunay but do not belong to the Delaunay triangulation, as shown in Figure 101. We omit the proof of the lemma. u v Figure 101: The edge uv is locally Delaunay but does not belong to the Delaunay triangulation. Edge-flipping. The Delaunay Lemma suggests we con- struct the Delaunay triangulation by first constructing an arbitrary triangulation of the point set S and then modify- ing it locally to make all edges lD. The idea is to look for non-lD edges and to flip them, as illustrated in Figure 102. Indeed, if uv is a non-lD edge shared by triangles uvp and v p u q Figure 102: The edge uv is non-lD and can be flipped to the edge pq, which is lD. uvq then upvq is a convex quadrilateral and flipping uv means substituting one diagonal for the other, namely pq 78
Note that if uv is non-lD then pq is lD. It is important that the algorithm finds non-lD edges quickly. For this purpose, we use a stack of edges. Initially, we push all edges on the stack and mark them.

while stack is non-empty do
  pop edge uv from stack and unmark it;
  if uv is non-lD then
    substitute pq for uv;
    for ab ∈ {up, pv, vq, qu} do
      if ab is unmarked then
        push ab on the stack and mark it
      endif
    endfor
  endif
endwhile.

The marks avoid multiple copies of the same edge on the stack. This implies that at any one moment the size of the stack is less than 3n. Note also that initially the stack contains all non-lD edges and that this property is maintained as an invariant of the algorithm. The Delaunay Lemma implies that when the algorithm halts, which is when the stack is empty, then the triangulation is the Delaunay triangulation. However, it is not yet clear that the algorithm terminates. Indeed, the stack can grow and shrink during the course of the algorithm, which makes it difficult to prove that it ever runs empty.

In-circle test. Before studying the termination of the algorithm, we look into the question of distinguishing lD from non-lD edges. As before we assume that the edge uv is shared by the triangles uvp and uvq in the current triangulation. Recall that uv is lD iff q lies outside the circle that passes through u, v, p. Let f : R² → R be defined by f(x) = x1² + x2². As illustrated in Figure 103, the graph of this function is a paraboloid in three-dimensional space and we write x⁺ = (x1, x2, f(x)) for the vertical projection of the point x onto the paraboloid. Assuming the three points u, v, p do not lie on a common line, the points u⁺, v⁺, p⁺ lie on a non-vertical plane that is the graph of a function h(x) = αx1 + βx2 + γ. The projection of the intersection of the paraboloid and the plane back into R² is given by

0 = f(x) − h(x) = x1² + x2² − αx1 − βx2 − γ,

which is the equation of a circle. This circle passes through u, v, p so it is the circle we have to compare q against.

Figure 103: The plane passing through u⁺, v⁺, p⁺ intersects the paraboloid in an ellipse whose projection into R² passes through the points u, v, p. The point q⁺ lies below the plane iff q lies inside the circle.

We note that q lies inside the circle iff q⁺ lies below the plane. The latter test can be based on the sign of the determinant of the 4-by-4 matrix

        | 1  u1  u2  u1² + u2² |
  ∆  =  | 1  v1  v2  v1² + v2² |
        | 1  p1  p2  p1² + p2² |
        | 1  q1  q2  q1² + q2² | .

Exchanging two rows in the matrix changes the sign. While the in-circle test should be insensitive to the order of the first three points, the sign of the determinant is not. We correct the change using the sign of the determinant of the 3-by-3 matrix that keeps track of the ordering of u, v, p along the circle,

        | 1  u1  u2 |
  Γ  =  | 1  v1  v2 |
        | 1  p1  p2 | .

Now we claim that q is inside the circle of u, v, p iff the two determinants have opposite signs:

boolean INCIRCLE(Points u, v, p, q)
  return det Γ · det ∆ < 0.

We first show that the boolean function is correct for u = (0, 0), v = (1, 0), p = (0, 1), and q = (0, 0.5). The sign of the product of determinants remains unchanged if we continuously move the points and avoid the configurations that make either determinant zero, which are when u, v, p are collinear and when u, v, p, q are cocircular.
We can change any configuration where q is inside the circle of u, v, p continuously into the special configuration without going through zero, which implies the correctness of the function for general input points.
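The INCIRCLE test translates almost directly into code. The following Python sketch (added here for illustration; it is not part of the original notes) evaluates the two determinants with ordinary floating-point arithmetic, so a robust implementation would use exact or adaptive arithmetic instead.

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def det4(m):
    # cofactor expansion along the first row, reusing det3 for the minors
    total = 0.0
    for col in range(4):
        minor = [[row[c] for c in range(4) if c != col] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det3(minor)
    return total

def incircle(u, v, p, q):
    # True iff q lies strictly inside the circle through u, v, p
    gamma = [[1.0, a, b] for (a, b) in (u, v, p)]
    delta = [[1.0, a, b, a * a + b * b] for (a, b) in (u, v, p, q)]
    return det3(gamma) * det4(delta) < 0.0

# The example used in the text: q = (0, 0.5) lies inside the circle through
# (0, 0), (1, 0), (0, 1), whereas (2, 2) does not.
print(incircle((0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 0.5)))   # True
print(incircle((0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)))   # False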
Termination and running time. To prove that the edge-flip algorithm terminates, we imagine the triangulation lifted to R³. We do this by projecting the vertices vertically onto the paraboloid, as before, and connecting them with straight edges and triangles in space. Let uv be an edge shared by triangles uvp and uvq that is flipped to pq by the algorithm. It follows that the line segments uv and pq cross and their endpoints form a convex quadrilateral, as shown in Figure 104.

Figure 104: A flip in the plane lifts to a tetrahedron in space in which the lD edge passes below the non-lD edge.

After lifting the two line segments, we get u⁺v⁺ passing above p⁺q⁺. We may thus think of the flip as gluing the tetrahedron u⁺v⁺p⁺q⁺ underneath the surface obtained by lifting the triangulation. The surface is pushed down by each flip and never pushed back up. The removed edge is now above the new surface and can therefore not be reintroduced by a later flip. It follows that the algorithm performs at most (n choose 2) flips and thus takes at most time O(n²) to construct the Delaunay triangulation of S. There are faster algorithms that work in time O(n log n) but we prefer the suboptimal method because it is simpler and it reveals more about Delaunay triangulations than the other algorithms.

The lifting of the input points to R³ leads to an interesting interpretation of the edge-flip algorithm. Starting with a monotone triangulated surface passing through the lifted points, we glue tetrahedra below the surface until we reach the unique convex surface that passes through the points. The projection of this convex surface is the Delaunay triangulation of the points in the plane. This also gives a reinterpretation of the Delaunay Lemma in terms of convex and concave edges of the surface.
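The empty circle property also suggests a much more naive construction, added here only as an illustration and correctness check for small inputs (it is neither the edge-flip algorithm above nor an efficient method): a triple of input points spans a Delaunay triangle iff no other input point lies inside its circumcircle. The Python sketch below checks exactly this for every triple, in O(n⁴) time, assuming the points are in general position.

from itertools import combinations

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def in_circle(u, v, p, q):
    # q lies strictly inside the circle through u, v, p (any orientation of u, v, p)
    lifted = [(a - q[0], b - q[1], (a - q[0]) ** 2 + (b - q[1]) ** 2) for (a, b) in (u, v, p)]
    orient = det3([(1.0, a, b) for (a, b) in (u, v, p)])
    return orient * det3(lifted) > 0

def delaunay_bruteforce(points):
    # a triple spans a Delaunay triangle iff its circumcircle contains no other point
    triangles = []
    for u, v, p in combinations(points, 3):
        if det3([(1.0, a, b) for (a, b) in (u, v, p)]) == 0:
            continue  # collinear triple has no circumcircle
        if not any(in_circle(u, v, p, q) for q in points if q not in (u, v, p)):
            triangles.append((u, v, p))
    return triangles

print(delaunay_bruteforce([(0, 0), (3, 0), (0, 3), (3, 3), (1, 1)]))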
22 Alpha Shapes

Many practical applications of geometry have to do with the intuitive but vague concept of the shape of a finite point set. To make this idea concrete, we use the distances between the points to identify subcomplexes of the Delaunay triangulation that represent that shape at different levels of resolution.

Union of disks. Let S be a set of n points in R². For each r ≥ 0, we write Bu(r) = {x ∈ R² | ‖x − u‖ ≤ r} for the closed disk with center u and radius r. Let U(r) = ⋃_{u∈S} Bu(r) be the union of the n disks. We decompose this union into convex sets of the form Ru(r) = Bu(r) ∩ Vu. Then

(i) Ru(r) is closed and convex for every point u ∈ S and every radius r ≥ 0;
(ii) Ru(r) and Rv(r) have disjoint interiors whenever the two points, u and v, are different;
(iii) U(r) = ⋃_{u∈S} Ru(r).

We illustrate this decomposition in Figure 105. Each region Ru(r) is the intersection of n − 1 closed half-planes and a closed disk. All these sets are closed and convex, which implies (i). The Voronoi regions have disjoint interiors, which implies (ii). Finally, take a point x ∈ U(r) and let u be a point in S with x ∈ Vu. Then x ∈ Bu(r) and therefore x ∈ Ru(r). This implies (iii).

Figure 105: The Voronoi decomposition of a union of eight disks in the plane and superimposed dual alpha complex.

Nerve. Similar to defining the Delaunay triangulation as the dual of the Voronoi diagram, we define the alpha complex as the dual of the Voronoi decomposition of the union of disks. This time around, we do this more formally. Letting C be a finite collection of sets, the nerve of C is the system of subcollections that have a non-empty common intersection,

Nrv C = {X ⊆ C | ⋂X ≠ ∅}.

This is an abstract simplicial complex since ⋂X ≠ ∅ and Y ⊆ X implies ⋂Y ≠ ∅. For example, if C is the collection of Voronoi regions then Nrv C is an abstract version of the Delaunay triangulation. More specifically, this is true provided the points are in general position and in particular no four points lie on a common circle. We will assume this for the remainder of this section. We say the Delaunay triangulation is a geometric realization of Nrv C, namely the one obtained by mapping each Voronoi region (a vertex in the abstract simplicial complex) to the generating point. All edges and triangles are just convex hulls of their incident vertices. To go from the Delaunay triangulation to the alpha complex, we substitute the regions Ru(r) for the Vu. Specifically,

Alpha(r) = Nrv {Ru(r) | u ∈ S}.

Clearly, this is isomorphic to a subcomplex of the nerve of Voronoi regions. We can therefore draw Alpha(r) as a subcomplex of the Delaunay triangulation; see Figure 105. We call this geometric realization of Alpha(r) the alpha complex for radius r, denoted as A(r). The alpha shape for the same radius is the underlying space of the alpha complex, |A(r)|.

The nerve preserves the way the union is connected. In particular, their Betti numbers are the same, that is, βp(U(r)) = βp(A(r)) for all dimensions p and all radii r. This implies that the union and the alpha shape have the same number of components and the same number of holes. For example, in Figure 105 both have one component and two holes. We omit the proof of this property.

Filtration. We are interested in the sequence of alpha shapes as the radius grows from zero to infinity. Since growing r grows the regions Ru(r), the nerve can only get bigger. In other words, A(r) ⊆ A(s) whenever r ≤ s.
There are only finitely many subcomplexes of the Delaunay triangulation. Hence, we get a finite sequence of alpha complexes. Writing Ai for the i-th alpha complex, we get the following nested sequence,

S = A1 ⊂ A2 ⊂ . . . ⊂ Ak = D,
where D denotes the Delaunay triangulation of S. We call such a sequence of complexes a filtration. We illustrate this construction in Figure 106.

Figure 106: A finite sequence of unions of disks, all decomposed by the same Voronoi diagram.

The sequence of alpha complexes begins with a set of n isolated vertices, the points in S. To go from one complex to the next, we either add an edge, we add a triangle, or we add a pair consisting of a triangle with one of its edges. In Figure 106, we begin with eight vertices and get the following sequence of alpha complexes.

A1 = {a, b, c, d, e, f, g, h};
A2 = A1 ∪ {ah};
A3 = A2 ∪ {bc};
A4 = A3 ∪ {ab, ef};
A5 = A4 ∪ {de};
A6 = A5 ∪ {gh};
A7 = A6 ∪ {cd};
A8 = A7 ∪ {fg};
A9 = A8 ∪ {cg}.

Going from A7 to A8, we get for the first time a 1-cycle, which bounds a hole in the embedding. In A9, this hole is cut into two. This is the alpha complex depicted in Figure 105. We continue.

A10 = A9 ∪ {cf};
A11 = A10 ∪ {abh, bh};
A12 = A11 ∪ {cde, ce};
A13 = A12 ∪ {cfg};
A14 = A13 ∪ {cef};
A15 = A14 ∪ {bch, ch};
A16 = A15 ∪ {cgh}.

At this moment, we have a triangulated disk but not yet the entire Delaunay triangulation since the triangle bcd and the edge bd are still missing. Each step is generic except when we add two equally long edges to A3.

Compatible ordering of simplices. We can represent the entire filtration of alpha complexes compactly by sorting the simplices in the order they join the growing complex. An ordering σ1, σ2, . . . , σm of the Delaunay simplices is compatible with the filtration if

1. the simplices in Ai precede the ones not in Ai for each i;
2. the faces of a simplex precede the simplex.

For example, the sequence

a, b, c, d, e, f, g, h; ah; bc; ab, ef; de; gh; cd; fg; cg; cf; bh, abh; ce, cde; cfg; cef; ch, bch; cgh; bd; bcd

is compatible with the filtration in Figure 106. Every alpha complex is a prefix of the compatible sequence but not necessarily the other way round. Condition 2 guarantees that every prefix is a complex, whether an alpha complex or not. We thus get a finer filtration of complexes

∅ = K0 ⊂ K1 ⊂ . . . ⊂ Km = D,

where Ki is the set of simplices from σ1 to σi. To construct the compatible ordering, we just need to compute for each Delaunay simplex the radius ri = r(σi) such that σi ∈ A(r) iff r ≥ ri. For a vertex, this radius is zero. For a triangle, this is the radius of the circumcircle. For an edge, we have two cases. Let ϕ and ψ be the angles opposite the edge σi inside the two incident triangles. We have ϕ + ψ < 180° because of the empty circle property.

Figure 107: Left: the middle edge belongs to two acute triangles. Right: it belongs to an obtuse and an acute triangle.

CASE 1. ϕ < 90° and ψ < 90°. Then ri = r(σi) is half the length of the edge.
CASE 2. ϕ ≥ 90°. Then ri = rj, where σj is the incident triangle with angle ϕ.

Both cases are illustrated in Figure 107. In Case 2, the edge σi enters the growing alpha complex together with the triangle σj. The total number of simplices in the Delaunay triangulation is m < 6n. The threshold radii can be computed in time O(n). Sorting the simplices into the compatible ordering can therefore be done in time O(n log n).

Betti numbers. In two dimensions, Betti numbers can be computed directly, without resorting to boundary matrices. The only two possibly non-zero Betti numbers are β0, the number of components, and β1, the number of holes. We compute the Betti numbers of Kj by adding the simplices in order.

β0 = β1 = 0;
for i = 1 to j do
  switch dim σi:
    case 0: β0 = β0 + 1;
    case 1: let u, v be the endpoints of σi;
            if FIND(u) = FIND(v) then β1 = β1 + 1
            else β0 = β0 − 1; UNION(u, v)
            endif
    case 2: β1 = β1 − 1
  endswitch
endfor.

All we need is to tell apart the two cases when σi is an edge. This is done using a union-find data structure maintaining the components of the alpha complex in amortized time α(n) per simplex. The total running time of the algorithm for computing Betti numbers is therefore O(nα(n)).
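The incremental computation is short enough to spell out. The following Python sketch is an illustration added to these notes; it assumes simplices are given as tuples of vertex labels in a compatible ordering, and it uses a simple union-find with union by size and path halving rather than the fully optimized structure discussed earlier in the course.

class UnionFind:
    """Union-find with union by size and path halving."""
    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]

def betti_numbers(simplices):
    """Compute (beta_0, beta_1) of a 2-complex given in a compatible ordering.

    Each simplex is a tuple of vertex labels: length 1 = vertex,
    length 2 = edge, length 3 = triangle.
    """
    uf = UnionFind()
    beta0 = beta1 = 0
    for sigma in simplices:
        if len(sigma) == 1:            # vertex: a new component
            uf.find(sigma[0])
            beta0 += 1
        elif len(sigma) == 2:          # edge: merges components or closes a cycle
            u, v = sigma
            if uf.find(u) == uf.find(v):
                beta1 += 1
            else:
                beta0 -= 1
                uf.union(u, v)
        else:                          # triangle: fills a hole
            beta1 -= 1
    return beta0, beta1

# Prefix of the compatible ordering from Figure 106 up to the edge cg (the complex A9).
ordering = [('a',), ('b',), ('c',), ('d',), ('e',), ('f',), ('g',), ('h',),
            ('a', 'h'), ('b', 'c'), ('a', 'b'), ('e', 'f'), ('d', 'e'),
            ('g', 'h'), ('c', 'd'), ('f', 'g'), ('c', 'g')]
print(betti_numbers(ordering))   # (1, 2): one component and two holes, as in the text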
Sixth Homework Assignment

Write the solution to each problem on a single page. The deadline for handing in solutions is November 25.

Problem 1. (20 points). Let S be a set of n unit disks in the Euclidean plane, each given by its center and radius, which is one. Give an algorithm that decides whether any two of the disks in S intersect.

Problem 2. (20 = 10 + 10 points). Let S be a set of n points in the Euclidean plane. The Gabriel graph connects points u, v ∈ S with a straight edge if ‖u − v‖² ≤ ‖u − p‖² + ‖v − p‖² for every point p in S.
(a) Show that the Gabriel graph is a subgraph of the edge skeleton of the Delaunay triangulation.
(b) Is the Gabriel graph necessarily connected? Justify your answer.

Problem 3. (20 = 10 + 10 points). Consider a set of n ≥ 3 closed disks in the Euclidean plane. The disks are allowed to touch but no two of them have an interior point in common.
(a) Show that the number of touching pairs of disks is at most 3n − 6.
(b) Give a construction that achieves the upper bound in (a) for any n ≥ 3.

Problem 4. (20 = 10 + 10 points). Let K be a triangulation of a set of n ≥ 3 points in the plane. Let L be a line that avoids all the points.
(a) Prove that L intersects at most 2n − 4 of the edges in K.
(b) Give a construction for which L achieves the upper bound in (a) for any n ≥ 3.

Problem 5. (20 points). Let S be a set of n points in the Euclidean plane, consider its Delaunay triangulation and the corresponding filtration of alpha complexes, S = A1 ⊂ A2 ⊂ . . . ⊂ Ak. Under what conditions is it true that Ai and Ai+1 differ by a single simplex for every 1 ≤ i ≤ k − 1?
VII NP-COMPLETENESS

23 Easy and Hard Problems
24 NP-Complete Problems
25 Approximation Algorithms
Seventh Homework Assignment
23 Easy and Hard Problems

The theory of NP-completeness is an attempt to draw a line between tractable and intractable problems. The most important question is whether there is indeed a difference between the two, and this question is still unanswered. Typical results are therefore relative statements such as "if problem B has a polynomial-time algorithm then so does problem C" and its equivalent contra-positive "if problem C has no polynomial-time algorithm then neither does problem B". The second formulation suggests we remember hard problems C and for a new problem B we first see whether we can prove the implication. If we can then we may not want to even try to solve problem B efficiently. A good deal of formalism is necessary for a proper description of results of this kind, of which we will introduce only a modest amount.

What is a problem? An abstract decision problem is a function I → {0, 1}, where I is the set of problem instances and 0 and 1 are interpreted to mean FALSE and TRUE, as usual. To completely formalize the notion, we encode the problem instances in strings of zeros and ones: I → {0, 1}∗. A concrete decision problem is then a function Q : {0, 1}∗ → {0, 1}. Following the usual convention, we map bit-strings that do not correspond to meaningful problem instances to 0.

As an example consider the shortest-path problem. A problem instance is a graph and a pair of vertices, u and v, in the graph. A solution is a shortest path from u to v, or the length of such a path. The decision problem version specifies an integer k and asks whether or not there exists a path from u to v whose length is at most k. The theory of NP-completeness really only deals with decision problems. Although this is a loss of generality, the loss is not dramatic. For example, given an algorithm for the decision version of the shortest-path problem, we can determine the length of the shortest path by repeated decisions for different values of k. Decision problems are always easier (or at least not harder) than the corresponding optimization problems. So in order to prove that an optimization problem is hard it suffices to prove that the corresponding decision problem is hard.

Polynomial time. An algorithm solves a concrete decision problem Q in time T(n) if for every instance x ∈ {0, 1}∗ of length n the algorithm produces Q(x) in time at most T(n). Note that this is the worst-case notion of time-complexity. The problem Q is polynomial-time solvable if T(n) = O(n^k) for some constant k independent of n. The first important complexity class of problems is

P = set of concrete decision problems that are polynomial-time solvable.

The problems Q ∈ P are called tractable or easy and the problems Q ∉ P are called intractable or hard. Algorithms that take only polynomial time are called efficient and algorithms that require more than polynomial time are inefficient. In other words, until now in this course we only talked about efficient algorithms and about easy problems. This terminology is adopted because the rather fine-grained classification of algorithms by complexity we practiced until now is not very useful in gaining insights into the rather coarse distinction between polynomial and non-polynomial.

It is convenient to recast the scenario in a formal language framework. A language is a set L ⊆ {0, 1}∗. We can think of it as the set of problem instances, x, that have an affirmative answer, Q(x) = 1.
An algorithm A : {0, 1}∗ → {0, 1} accepts x ∈ {0, 1}∗ if A(x) = 1 and it rejects x if A(x) = 0. The language accepted by A is the set of strings x ∈ {0, 1}∗ with A(x) = 1. There is a subtle difference between accepting and deciding a language L. The latter means that A accepts every x ∈ L and rejects every x ∉ L. For example, there is an algorithm that accepts every program that halts, but there is no algorithm that decides the language of such programs. Within the formal language framework we redefine the class of polynomial-time solvable problems as

P = {L ⊆ {0, 1}∗ | L is accepted by a polynomial-time algorithm}
  = {L ⊆ {0, 1}∗ | L is decided by a polynomial-time algorithm}.

Indeed, a language that can be accepted in polynomial time can also be decided in polynomial time: we keep track of the time and if too much goes by without x being accepted, we turn around and reject x. This is a non-constructive argument since we may not know the constants in the polynomial. However, we know such constants exist, which suffices to show that a simulation as sketched exists.

Hamiltonian cycles. We use a specific graph problem to introduce the notion of verifying a solution to a problem, as opposed to solving it. Let G = (V, E) be an undirected graph. A hamiltonian cycle contains every vertex
v ∈ V exactly once. The graph G is hamiltonian if it has a hamiltonian cycle. Figure 108 shows a hamiltonian cycle of the edge graph of a Platonic solid. How about the edge graphs of the other four Platonic solids? Define L = {G | G is hamiltonian}.

Figure 108: The edge graph of the dodecahedron and one of its hamiltonian cycles.

We can thus ask whether or not L ∈ P, that is, whether or not there is a polynomial-time algorithm that decides whether or not a graph is hamiltonian. The answer to this question is currently not known, but there is evidence that the answer might be negative. On the other hand, suppose y is a hamiltonian cycle of G. The language L′ = {(G, y) | y is a hamiltonian cycle of G} is certainly in P because we just need to make sure that y and G have the same number of vertices and every edge of y is also an edge of G; a short sketch of such a verifier appears below.

Non-deterministic polynomial time. More generally, it seems easier to verify a given solution than to come up with one. In a nutshell, this is what NP-completeness is about, namely finding out whether this is indeed the case and whether the difference between accepting and verifying can be used to separate hard from easy problems. Call y ∈ {0, 1}∗ a certificate. An algorithm A verifies a problem instance x ∈ {0, 1}∗ if there exists a certificate y with A(x, y) = 1. The language verified by A is the set of strings x ∈ {0, 1}∗ verified by A. We now define a new class of problems,

NP = {L ⊆ {0, 1}∗ | L is verified by a polynomial-time algorithm}.

More formally, L is in NP if for every problem instance x ∈ L there is a certificate y whose length is bounded from above by a polynomial in the length of x such that A(x, y) = 1 and A runs in polynomial time. For example, deciding whether or not G is hamiltonian is in NP.

The name NP is an abbreviation for non-deterministic polynomial time, because a non-deterministic computer can guess a certificate and then verify that certificate. In a parallel emulation, the computer would generate all possible certificates and then verify them in parallel. Generating one certificate is easy, because it only has polynomial length, but generating all of them is hard, because there are exponentially many strings of polynomial length.

Figure 109: Four possible relations between the complexity classes P, NP, and co-NP.

Non-deterministic machines are at least as powerful as deterministic machines. It follows that every problem in P is also in NP, P ⊆ NP. Define

co-NP = {L | L̄ ∈ NP}, where L̄ = {x ∈ {0, 1}∗ | x ∉ L},

which is the class of languages whose complement can be verified in non-deterministic polynomial time. It is not known whether or not NP = co-NP. For example, it seems easy to verify that a graph is hamiltonian but it seems hard to verify that a graph is not hamiltonian. Since a polynomial-time algorithm that decides L also decides its complement, L ∈ P implies L̄ ∈ P. Therefore, P ⊆ co-NP. Hence, only the four relationships between the three complexity classes shown in Figure 109 are possible, but at this time we do not know which one is correct.
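To make the difference between solving and verifying concrete, here is a small Python sketch of the hamiltonian-cycle verifier mentioned above. It is an illustration added to these notes; the input conventions (vertices as a set of labels, edges as a set of two-element frozensets, the certificate y as an ordering of the vertices) are assumptions made for the example.

def verify_hamiltonian_cycle(vertices, edges, y):
    """Check whether the vertex sequence y is a hamiltonian cycle of (vertices, edges).

    vertices: set of vertex labels
    edges: set of frozensets {u, v}, one per undirected edge
    y: list of vertices, interpreted cyclically
    """
    # y must visit every vertex exactly once
    if len(y) != len(vertices) or set(y) != set(vertices):
        return False
    # consecutive vertices, including last to first, must be adjacent
    n = len(y)
    return all(frozenset((y[i], y[(i + 1) % n])) in edges for i in range(n))

# Example: the 4-cycle a-b-c-d is a hamiltonian cycle; a-c-b-d is not a cycle of this graph.
V = {'a', 'b', 'c', 'd'}
E = {frozenset(p) for p in [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a')]}
print(verify_hamiltonian_cycle(V, E, ['a', 'b', 'c', 'd']))  # True
print(verify_hamiltonian_cycle(V, E, ['a', 'c', 'b', 'd']))  # False

The check runs in time linear in the size of the input, whereas no polynomial-time algorithm is known that finds such a cycle.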
Problem reduction. We now develop the concept of reducing one problem to another, which is key in the construction of the class of NP-complete problems. The idea is to map or transform an instance of a first problem to an instance of a second problem and to map the solution to the second problem back to a solution to the first problem. For decision problems, the solutions are the same and need no transformation.

Language L1 is polynomial-time reducible to language L2, denoted L1 ≤P L2, if there is a polynomial-time computable function f : {0, 1}∗ → {0, 1}∗ such that x ∈ L1 iff f(x) ∈ L2, for all x ∈ {0, 1}∗.
Now suppose that L1 is polynomial-time reducible to L2 and that L2 has a polynomial-time algorithm A2 that decides L2,

x  →  f(x)  →  {0, 1},

where the first map is f and the second is A2. We can compose the two algorithms and obtain a polynomial-time algorithm A1 = A2 ◦ f that decides L1. In other words, we gained an efficient algorithm for L1 just by reducing it to L2.

REDUCTION LEMMA. If L1 ≤P L2 and L2 ∈ P then L1 ∈ P.

In words, if L1 is polynomial-time reducible to L2 and L2 is easy then L1 is also easy. Conversely, if we know that L1 is hard then we can conclude that L2 is also hard. This motivates the following definition. A language L ⊆ {0, 1}∗ is NP-complete if

(1) L ∈ NP;
(2) L′ ≤P L, for every L′ ∈ NP.

Since every L′ ∈ NP is polynomial-time reducible to L, all L′ have to be easy for L to have a chance to be easy. The L′ thus only provide evidence that L might indeed be hard. We say L is NP-hard if it satisfies (2) but not necessarily (1). The problems that satisfy (1) and (2) form the complexity class

NPC = {L | L is NP-complete}.

All these definitions would not mean much if we could not find any problems in NPC. The first step is the most difficult one. Once we have one problem in NPC we can get others using reductions.

Satisfying boolean formulas. Perhaps surprisingly, a first NP-complete problem has been found, namely the problem of satisfiability for logical expressions. A boolean formula, ϕ, consists of variables, x1, x2, . . ., operators, ¬, ∧, ∨, =⇒, . . ., and parentheses. A truth assignment maps each variable to a boolean value, 0 or 1. The truth assignment satisfies ϕ if the formula evaluates to 1. The formula is satisfiable if there exists a satisfying truth assignment. Define SAT = {ϕ | ϕ is satisfiable}. As an example consider the formula

ψ = (x1 =⇒ x2) ⇐⇒ (x2 ∨ ¬x1).

If we set x1 = x2 = 1 we get (x1 =⇒ x2) = 1, (x2 ∨ ¬x1) = 1 and therefore ψ = 1. It follows that ψ ∈ SAT. In fact, all truth assignments evaluate to 1, which means that ψ is really a tautology. More generally, a boolean formula, ϕ, is satisfiable iff ¬ϕ is not a tautology.

SATISFIABILITY THEOREM. We have SAT ∈ NP and L′ ≤P SAT for every L′ ∈ NP.

That SAT is in the class NP is easy to prove: just guess an assignment and verify that it satisfies. However, to prove that every L′ ∈ NP can be reduced to SAT in polynomial time is quite technical and we omit the proof. The main idea is to use the polynomial-time algorithm that verifies L′ and to construct a boolean formula from this algorithm. To formalize this idea, we would need a formal model of a computer, a Turing machine, which is beyond the scope of this course.
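The contrast between verifying and deciding is easy to see in code. The following Python sketch (added for illustration; the representation of a formula as a Python function over a dictionary of truth values is an assumption made for the example) verifies a single assignment in time proportional to the size of the formula, but decides satisfiability only by trying all 2^n assignments.

from itertools import product

def evaluate(formula, assignment):
    """Verify one certificate: evaluate the formula under a truth assignment."""
    return formula(assignment)

def satisfiable(formula, variables):
    """Decide satisfiability by brute force over all 2^n assignments."""
    return any(evaluate(formula, dict(zip(variables, bits)))
               for bits in product([False, True], repeat=len(variables)))

def tautology(formula, variables):
    """A formula is a tautology iff its negation is not satisfiable."""
    return not satisfiable(lambda a: not formula(a), variables)

# The example psi from the text: (x1 => x2) <=> (x2 or not x1).
psi = lambda a: ((not a['x1']) or a['x2']) == (a['x2'] or (not a['x1']))
print(satisfiable(psi, ['x1', 'x2']))  # True
print(tautology(psi, ['x1', 'x2']))    # True, as observed in the text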
24 NP-Complete Problems

In this section, we discuss a number of NP-complete problems, with the goal of developing a feeling for what hard problems look like. Recognizing hard problems is an important aspect of forming a reliable judgement of the difficulty of a problem and of the most promising approach to a solution. Of course, for NP-complete problems, it seems futile to work toward polynomial-time algorithms and instead we would focus on finding approximations or circumventing the problems altogether. We begin with a result on different ways to write boolean formulas.

Reduction to 3-satisfiability. We call a boolean variable or its negation a literal. The conjunctive normal form is a sequence of clauses connected by ∧s, and each clause is a sequence of literals connected by ∨s. A formula is in 3-CNF if it is in conjunctive normal form and each clause consists of three literals. It turns out that deciding the satisfiability of a boolean formula in 3-CNF is no easier than for a general boolean formula. Define 3-SAT = {ϕ ∈ SAT | ϕ is in 3-CNF}. We prove the above claim by reducing SAT to 3-SAT.

SATISFIABILITY LEMMA. SAT ≤P 3-SAT.

PROOF. We take a boolean formula ϕ and transform it into 3-CNF in three steps.

Step 1. Think of ϕ as an expression and represent it as a binary tree. Each node is an operation that gets the input from its two children and forwards the output to its parent. Introduce a new variable for the output and define a new formula ϕ′ for each node, relating the two input edges with the one output edge. Figure 110 shows the tree representation of the formula ϕ = (x1 =⇒ x2) ⇐⇒ (x2 ∨ ¬x1). The new formula is

ϕ′ = (y2 ⇐⇒ (x1 =⇒ x2))
   ∧ (y3 ⇐⇒ (x2 ∨ ¬x1))
   ∧ (y1 ⇐⇒ (y2 ⇐⇒ y3))
   ∧ y1.

It should be clear that there is a satisfying assignment for ϕ iff there is one for ϕ′.

Figure 110: The tree representation of the formula ϕ. Incidentally, ϕ is a tautology, which means it is satisfied by every truth assignment. Equivalently, ¬ϕ is not satisfiable.

Step 2. Convert each clause into conjunctive normal form. The most mechanical way uses the truth table for each clause, as illustrated in Table 6. Each clause has at most three literals. For example, the negation of y2 ⇐⇒ (x1 =⇒ x2) is equivalent to the disjunction of the conjunctions in the rightmost column. It follows that y2 ⇐⇒ (x1 =⇒ x2) is equivalent to the negation of that disjunction, which by de Morgan's law is (y2 ∨ x1 ∨ x2) ∧ (y2 ∨ x1 ∨ ¬x2) ∧ (y2 ∨ ¬x1 ∨ ¬x2) ∧ (¬y2 ∨ ¬x1 ∨ x2).

y2  x1  x2  |  y2 ⇔ (x1 ⇒ x2)  |  prohibited
 0   0   0  |        0          |  ¬y2 ∧ ¬x1 ∧ ¬x2
 0   0   1  |        0          |  ¬y2 ∧ ¬x1 ∧ x2
 0   1   0  |        1          |
 0   1   1  |        0          |  ¬y2 ∧ x1 ∧ x2
 1   0   0  |        1          |
 1   0   1  |        1          |
 1   1   0  |        0          |  y2 ∧ x1 ∧ ¬x2
 1   1   1  |        1          |

Table 6: Conversion of a clause into a disjunction of conjunctions of at most three literals each.

Step 3. The clauses with fewer than three literals can be expanded by adding new variables. For example a ∨ b is expanded to (a ∨ b ∨ p) ∧ (a ∨ b ∨ ¬p) and (a) is expanded to (a ∨ p ∨ q) ∧ (a ∨ p ∨ ¬q) ∧ (a ∨ ¬p ∨ q) ∧ (a ∨ ¬p ∨ ¬q).

Each step takes only polynomial time. At the end, we get an equivalent formula in 3-conjunctive normal form. We note that clauses of length three are necessary to make the satisfiability problem hard. Indeed, there is a polynomial-time algorithm that decides the satisfiability of a formula in 2-CNF.
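Step 3 is easy to make concrete. The Python sketch below is an illustration added to these notes; the representation of literals as strings with a leading '~' for negation and the source of fresh variable names are assumptions made for the example.

def pad_to_three(clause, fresh):
    """Expand a clause of one or two literals into equivalent 3-literal clauses.

    clause: list of literals, e.g. ['a'] or ['a', 'b']
    fresh:  iterator producing unused variable names
    """
    def neg(lit):
        return lit[1:] if lit.startswith('~') else '~' + lit

    if len(clause) >= 3:
        return [clause]
    if len(clause) == 2:
        p = next(fresh)
        return [clause + [p], clause + [neg(p)]]
    # a single literal needs two fresh variables
    p, q = next(fresh), next(fresh)
    a = clause[0]
    return [[a, p, q], [a, p, neg(q)], [a, neg(p), q], [a, neg(p), neg(q)]]

fresh = iter('pqrs')
print(pad_to_three(['a', 'b'], fresh))   # [['a', 'b', 'p'], ['a', 'b', '~p']]
print(pad_to_three(['a'], fresh))        # four clauses over a, q, r, as in the text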
NP-completeness proofs. Using polynomial-time reductions, we can show fairly mechanically that problems are NP-complete, if they are. A key property is the transitivity of ≤P, that is, if L′ ≤P L1 and L1 ≤P L2 then L′ ≤P L2, as can be seen by composing the two polynomial-time computable functions to get a third one.

REDUCTION LEMMA. Let L1, L2 ⊆ {0, 1}∗ and assume L1 ≤P L2. If L1 is NP-hard and L2 ∈ NP then L2 ∈ NPC.
A generic NP-completeness proof thus follows the steps outlined below.

Step 1. Prove that L2 ∈ NP.
Step 2. Select a known NP-hard problem, L1, and find a polynomial-time computable function, f, with x ∈ L1 iff f(x) ∈ L2.

This is what we did for L2 = 3-SAT and L1 = SAT. Therefore 3-SAT ∈ NPC. Currently, there are thousands of problems known to be NP-complete. This is often considered evidence that P ≠ NP, which can be the case only if P ∩ NPC = ∅, as drawn in Figure 111.

Figure 111: Possible relation between P, NPC, and NP.

Cliques and independent sets. There are many NP-complete problems on graphs. A typical such problem asks for the largest complete subgraph. Define a clique in an undirected graph G = (V, E) as a subgraph (W, F) with F = (W choose 2), that is, all pairs of vertices in W. Given G and an integer k, the CLIQUE problem asks whether or not there is a clique of k or more vertices.

CLAIM. CLIQUE ∈ NPC.

PROOF. Given k vertices in G, we can verify in polynomial time whether or not they form a complete graph. Thus CLIQUE ∈ NP. To prove property (2), we show that 3-SAT ≤P CLIQUE. Let ϕ be a boolean formula in 3-CNF consisting of k clauses. We construct a graph as follows:

(i) each clause is replaced by three vertices;
(ii) two vertices are connected by an edge if they do not belong to the same clause and they are not negations of each other.

A code sketch of this construction is given below. In a satisfying truth assignment, there is at least one true literal in each clause. The true literals form a clique. Conversely, a clique of k or more vertices covers all clauses and thus implies a satisfying truth assignment.

It is easy to decide in time O(k²n^(k+2)) whether or not a graph of n vertices has a clique of size k. If k is a constant, the running time of this algorithm is polynomial in n. For the CLIQUE problem to be NP-complete it is therefore essential that k be a variable that can be arbitrarily large.

We use the NP-completeness of finding large cliques to prove the NP-completeness of large sets of pairwise non-adjacent vertices. Let G = (V, E) be an undirected graph. A subset W ⊆ V is independent if none of the vertices in W are adjacent or, equivalently, if E ∩ (W choose 2) = ∅. Given G and an integer k, the INDEPENDENT SET problem asks whether or not there is an independent set of k or more vertices.

CLAIM. INDEPENDENT SET ∈ NPC.

PROOF. It is easy to verify that there is an independent set of size k: just guess a subset of k vertices and verify that no two are adjacent.

Figure 112: The four shaded vertices form an independent set in the graph on the left and a clique in the complement graph on the right.

We complete the proof by reducing the CLIQUE to the INDEPENDENT SET problem. As illustrated in Figure 112, W ⊆ V is independent iff W defines a clique in the complement graph, Ḡ = (V, (V choose 2) − E). To prove CLIQUE ≤P INDEPENDENT SET, we transform an instance H, k of the CLIQUE problem to the instance G, k of the INDEPENDENT SET problem, where G is the complement graph of H. Then G has an independent set of size k or larger iff H has a clique of size k or larger.
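The graph used in the reduction 3-SAT ≤P CLIQUE is simple to build. The following Python sketch is an illustration added to these notes; the representation of a 3-CNF formula as a list of clauses, each a list of string literals with a leading '~' for negation, is an assumption made for the example.

def clique_instance_from_3cnf(clauses):
    """Build the CLIQUE instance of the reduction from 3-SAT.

    Returns (vertices, edges, k): the formula is satisfiable iff the graph
    has a clique of k = len(clauses) vertices.
    """
    def negates(a, b):
        return a == '~' + b or b == '~' + a

    # one vertex per (clause index, literal) pair
    vertices = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for x in vertices:
        for y in vertices:
            # connect vertices from different clauses whose literals are not negations
            if x[0] < y[0] and not negates(x[1], y[1]):
                edges.add((x, y))
    return vertices, edges, len(clauses)

# (x1 or x2 or ~x3) and (~x1 or x2 or x3)
V, E, k = clique_instance_from_3cnf([['x1', 'x2', '~x3'], ['~x1', 'x2', 'x3']])
print(len(V), len(E), k)   # 6 vertices, 7 edges, k = 2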
Various NP-complete graph problems. We now describe a few NP-complete problems for graphs without proving that they are indeed NP-complete. Let G = (V, E) be an undirected graph with n vertices and k a positive integer, as before. The following problems defined for G and k are NP-complete.

An ℓ-coloring of G is a function χ : V → [ℓ] with χ(u) ≠ χ(v) whenever u and v are adjacent. The CHROMATIC NUMBER problem asks whether or not G has an ℓ-coloring with ℓ ≤ k. The problem remains NP-complete for fixed k ≥ 3. For k = 2, the CHROMATIC NUMBER problem asks whether or not G is bipartite, for which there is a polynomial-time algorithm.

The bandwidth of G is the minimum ℓ such that there is a bijection β : V → [n] with |β(u) − β(v)| ≤ ℓ for all adjacent vertices u and v. The BANDWIDTH problem asks whether or not the bandwidth of G is k or less. The problem arises in linear algebra, where we permute rows and columns of a matrix to move all non-zero elements of a square matrix as close to the diagonal as possible. For example, if the graph is a simple path then the bandwidth is 1, as can be seen in Figure 113. We can transform the adjacency matrix of G such that all non-zero diagonals are at most the bandwidth of G away from the main diagonal.

Figure 113: Simple path and adjacency matrix with rows and columns ordered along the path.

Assume now that the graph G is complete, E = (V choose 2), and that each edge, uv, has a positive integer weight, w(uv). The TRAVELING SALESMAN problem asks whether there is a permutation u0, u1, . . . , un−1 of the vertices such that the sum of the edges connecting contiguous vertices (and the last vertex to the first) is k or less,

Σ_{i=0}^{n−1} w(u_i u_{i+1}) ≤ k,

where indices are taken modulo n. The problem remains NP-complete if w : E → {1, 2} (by reduction from the HAMILTONIAN CYCLE problem), and also if the vertices are points in the plane and the weight of an edge is the Euclidean distance between the two endpoints.

Set systems. Simple graphs are set systems in which the sets contain only two elements. We now list a few NP-complete problems for more general set systems. Letting V be a finite set, C ⊆ 2^V a set system, and k a positive integer, the following problems are NP-complete.

The PACKING problem asks whether or not C has k or more mutually disjoint sets. The problem remains NP-complete if no set in C contains more than three elements, and there is a polynomial-time algorithm if every set contains two elements. In the latter case, the set system is a graph and a maximum packing is a maximum matching.

The COVERING problem asks whether or not C has k or fewer subsets whose union is V. The problem remains NP-complete if no set in C contains more than three elements, and there is a polynomial-time algorithm if every set contains two elements. In the latter case, the set system is a graph and the minimum cover can be constructed in polynomial time from a maximum matching.

Suppose every element v ∈ V has a positive integer weight, w(v). The PARTITION problem asks whether there is a subset U ⊆ V with

Σ_{u∈U} w(u) = Σ_{v∈V−U} w(v).

The problem remains NP-complete if we require that U and V − U have the same number of elements.
25 Approximation Algorithms

Many important problems are NP-hard and just ignoring them is not an option. There are indeed many things one can do. For problems of small size, even exponential-time algorithms can be effective, and special subclasses of hard problems sometimes have polynomial-time algorithms. We consider a third coping strategy appropriate for optimization problems, which is computing almost optimal solutions in polynomial time. In case the aim is to maximize a positive cost, a ̺(n)-approximation algorithm is one that guarantees to find a solution with cost C ≥ C∗/̺(n), where C∗ is the maximum cost. For minimization problems, we would require C ≤ C∗̺(n). Note that ̺(n) ≥ 1 and if ̺(n) = 1 then the algorithm produces optimal solutions. Ideally, ̺ is a constant but sometimes even this is not achievable in polynomial time.

Vertex cover. The first problem we consider is finding the minimum set of vertices in a graph G = (V, E) that covers all edges. Formally, a subset V′ ⊆ V is a vertex cover if every edge has at least one endpoint in V′. Observe that V′ is a vertex cover iff V − V′ is an independent set. Finding a minimum vertex cover is therefore equivalent to finding a maximum independent set. Since the latter problem is NP-complete, we conclude that finding a minimum vertex cover is also NP-complete. Here is a straightforward algorithm that achieves approximation ratio ̺(n) = 2, for all n = |V|.

V′ = ∅; E′ = E;
while E′ ≠ ∅ do
  select an arbitrary edge uv in E′;
  add u and v to V′;
  remove all edges incident to u or v from E′
endwhile.

Clearly, V′ is a vertex cover. Using adjacency lists with links between the two copies of an edge, the running time is O(n + m), where m is the number of edges. Furthermore, we have ̺ = 2 because every cover must pick at least one vertex of each edge uv selected by the algorithm, hence C ≤ 2C∗. Observe that this result does not imply a constant approximation ratio for the maximum independent set problem. We have |V − V′| = n − C ≥ n − 2C∗, which we have to compare with n − C∗, the size of the maximum independent set. For C∗ = n/2, the approximation ratio is unbounded.

Let us contemplate the argument we used to relate C and C∗. The set of edges uv selected by the algorithm is a matching, that is, a subset of the edges so that no two share a vertex. The size of the minimum vertex cover is at least the size of the largest possible matching. The algorithm finds a matching and since it picks two vertices per edge, we are guaranteed at most twice as many vertices as needed. This pattern of bounding C∗ by the size of another quantity (in this case the size of the largest matching) is common in the analysis of approximation algorithms. Incidentally, for bipartite graphs, the size of the largest matching is equal to the size of the smallest vertex cover. Furthermore, there is a polynomial-time algorithm for computing them.
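A direct translation of the 2-approximation into Python follows; it is an illustration added to these notes. For simplicity it rescans the remaining edge set in each iteration, which costs O(nm) time rather than the O(n + m) achievable with the linked adjacency lists mentioned above.

def vertex_cover_2approx(edges):
    """Greedy 2-approximation for vertex cover.

    edges: iterable of pairs (u, v) of a simple graph without self-loops.
    The selected edges form a maximal matching, so the returned cover has
    at most twice the size of an optimal cover.
    """
    remaining = {frozenset(e) for e in edges}
    cover = set()
    while remaining:
        u, v = next(iter(remaining))      # an arbitrary uncovered edge
        cover.update((u, v))
        remaining = {e for e in remaining if u not in e and v not in e}
    return cover

# Example: a path on five vertices; the optimum has size 2, the output size at most 4.
print(vertex_cover_2approx([(1, 2), (2, 3), (3, 4), (4, 5)]))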
Traveling salesman. Second, we consider the traveling salesman problem, which is formulated for a complete graph G = (V, E) with a positive integer cost function c : E → Z+. A tour in this graph is a Hamiltonian cycle and the problem is finding the tour, A, with minimum total cost, c(A) = Σ_{uv∈A} c(uv). Let us first assume that the cost function satisfies the triangle inequality, c(uw) ≤ c(uv) + c(vw) for all u, v, w ∈ V. It can be shown that the problem of finding the shortest tour remains NP-complete even if we restrict it to weighted graphs that satisfy this inequality. We formulate an algorithm based on the observation that the cost of every tour is at least the cost of the minimum spanning tree, C∗ ≥ c(T).

Step 1. Construct the minimum spanning tree T of G.
Step 2. Return the preorder sequence of vertices in T.

Using Prim's algorithm for the minimum spanning tree, the running time is O(n²). Figure 114 illustrates the algorithm.

Figure 114: The solid minimum spanning tree, the dotted traversal using each edge of the tree twice, and the solid tour obtained by taking short-cuts.

The preorder sequence is only defined if we have a root and the neighbors of each vertex are ordered, but we may choose both arbitrarily. The cost of the returned tour is at most twice the cost of the minimum spanning tree. To see this, consider traversing each edge of the minimum spanning tree twice, once in each direction. Whenever a vertex is visited more than once, we take the direct edge connecting the two neighbors of the second copy as a short-cut. By the triangle inequality, this substitution can only decrease the overall cost of the traversal. It follows that C ≤ 2c(T) ≤ 2C∗.
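For points in the plane with Euclidean distances, which satisfy the triangle inequality, the two-step algorithm is short to implement. The following Python sketch is an illustration added to these notes: Prim's algorithm on the dense distance matrix in O(n²), followed by a preorder traversal of the resulting tree.

from math import dist   # Python 3.8+: Euclidean distance between two points

def tsp_mst_2approx(points):
    """2-approximation for metric TSP: preorder walk of a minimum spanning tree.

    points: list of (x, y) coordinates; returns a tour as a list of point indices.
    """
    n = len(points)
    # Prim's algorithm on the complete graph, O(n^2)
    in_tree = [False] * n
    best = [float('inf')] * n          # cheapest connection to the growing tree
    parent = [None] * n
    best[0] = 0.0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    # preorder traversal of the tree rooted at vertex 0
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

pts = [(0, 0), (0, 2), (2, 2), (2, 0), (1, 1)]
print(tsp_mst_2approx(pts))   # one possible tour visiting all five points once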
The triangle inequality is essential in finding a constant approximation. Indeed, without it we can construct instances of the problem for which finding a constant approximation is NP-hard. To see this, transform an unweighted graph G′ = (V′, E′) to the complete weighted graph G = (V, E) with

c(uv) = 1 if uv ∈ E′, and c(uv) = ̺n + 1 otherwise.

Any ̺-approximation algorithm must return the Hamiltonian cycle of G′, if there is one.

Set cover. Third, we consider the problem of covering a set X with sets chosen from a set system F. We assume the set is the union of sets in the system, X = ⋃F. More precisely, we are looking for a smallest subsystem F′ ⊆ F with X = ⋃F′. The cost of this subsystem is the number of sets it contains, |F′|. See Figure 115 for an illustration of the problem.

Figure 115: The set X of twelve dots can be covered with four of the five sets in the system.

The vertex cover problem is a special case: X = E and F contains all subsets of edges incident to a common vertex. It is special because each element (edge) belongs to exactly two sets. Since we no longer have a bound on the number of sets containing a single element, it is not surprising that the algorithm for vertex covers does not extend to a constant-approximation algorithm for set covers. Instead, we consider the following greedy approach that selects, at each step, the set containing the maximum number of yet uncovered elements.

F′ = ∅; X′ = X;
while X′ ≠ ∅ do
  select S ∈ F maximizing |S ∩ X′|;
  F′ = F′ ∪ {S}; X′ = X′ − S
endwhile.

Using a sparse matrix representation of the set system (similar to an adjacency list representation of a graph), we can run the algorithm in time proportional to the total size of the sets in the system, n = Σ_{S∈F} |S|. We omit the details.

Analysis. More interesting than the running time is the analysis of the approximation ratio the greedy algorithm achieves. It is convenient to have short notation for the d-th harmonic number, Hd = Σ_{i=1}^{d} 1/i for d ≥ 0. Recall that Hd ≤ 1 + ln d for d ≥ 1. Let the size of the largest set in the system be m = max{|S| | S ∈ F}.

CLAIM. The greedy method is an Hm-approximation algorithm for the set cover problem.

PROOF. For each set S selected by the algorithm, we distribute $1 over the |S ∩ X′| elements covered for the first time. Let cx be the cost allocated this way to x ∈ X. We have |F′| = Σ_{x∈X} cx. If x is covered the first time by the i-th selected set, Si, then

cx = 1 / |Si − (S1 ∪ . . . ∪ Si−1)|.

We have |F′| ≤ Σ_{S∈F∗} Σ_{x∈S} cx because the optimal cover, F∗, contains each element x at least once. We will prove shortly that Σ_{x∈S} cx ≤ H_{|S|} for every set S ∈ F. It follows that

|F′| ≤ Σ_{S∈F∗} H_{|S|} ≤ Hm |F∗|,

as claimed.

For m = 3, we get ̺ = H3 = 11/6. This implies that for graphs with vertex-degrees at most 3, the greedy algorithm guarantees a vertex cover of size at most 11/6 times the optimum, which is better than the ratio 2 guaranteed by our first algorithm.
We still need to prove that the sum of costs cx over the elements of a set S in the system is bounded from above by H_{|S|}. Let ui be the number of elements in S that are
not covered by the first i selected sets, u_i = |S − (S1 ∪ . . . ∪ Si)|, and observe that the numbers do not increase. Let u_{k−1} be the last non-zero number in the sequence, so |S| = u_0 ≥ . . . ≥ u_{k−1} > u_k = 0. Since u_{i−1} − u_i is the number of elements in S covered the first time by Si, we have

Σ_{x∈S} cx = Σ_{i=1}^{k} (u_{i−1} − u_i) / |Si − (S1 ∪ . . . ∪ Si−1)|.

We also have u_{i−1} ≤ |Si − (S1 ∪ . . . ∪ Si−1)|, for all i ≤ k, because of the greedy choice of Si. If this were not the case, the algorithm would have chosen S instead of Si in the construction of F′. The problem thus reduces to bounding the sum of ratios (u_{i−1} − u_i)/u_{i−1}. It is not difficult to see that this sum can be at least logarithmic in the size of S. Indeed, if we choose u_i about half the size of u_{i−1}, for all i ≥ 1, then we have logarithmically many terms, each roughly 1/2. We use a sequence of simple arithmetic manipulations to prove that this lower bound is asymptotically tight:

Σ_{x∈S} cx ≤ Σ_{i=1}^{k} (u_{i−1} − u_i)/u_{i−1}
           = Σ_{i=1}^{k} Σ_{j=u_i+1}^{u_{i−1}} 1/u_{i−1}.

We now replace the denominator by j ≤ u_{i−1} to form a telescoping series of harmonic numbers and get

Σ_{x∈S} cx ≤ Σ_{i=1}^{k} Σ_{j=u_i+1}^{u_{i−1}} 1/j
           = Σ_{i=1}^{k} ( Σ_{j=1}^{u_{i−1}} 1/j − Σ_{j=1}^{u_i} 1/j )
           = Σ_{i=1}^{k} (H_{u_{i−1}} − H_{u_i}).

This is equal to H_{u_0} − H_{u_k} = H_{|S|}, which fills the gap left in the analysis of the greedy algorithm.
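The greedy method itself is only a few lines of code. The following Python sketch is an illustration added to these notes; it rescans every set in each iteration, which is slower than the sparse-matrix implementation mentioned in the text but suffices for small examples.

def greedy_set_cover(universe, sets):
    """Greedy H_m-approximation for set cover.

    universe: the set X of elements; sets: list of sets whose union is X.
    Repeatedly picks the set covering the most yet-uncovered elements.
    """
    uncovered = set(universe)
    cover = []
    while uncovered:
        best = max(sets, key=lambda s: len(s & uncovered))
        if not best & uncovered:
            raise ValueError("the sets do not cover the universe")
        cover.append(best)
        uncovered -= best
    return cover

X = set(range(1, 13))
F = [set(range(1, 7)), set(range(5, 11)), {1, 2, 7, 8, 11}, {9, 10, 11, 12}, {3, 4, 12}]
print(len(greedy_set_cover(X, F)))   # number of sets chosen by the greedy method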
Seventh Homework Assignment

The purpose of this assignment is to help you prepare for the final exam. Solutions will neither be graded nor even collected.

Problem 1. (20 = 5 + 15 points). Consider the class of satisfiable boolean formulas in conjunctive normal form in which each clause contains two literals, 2-SAT = {ϕ ∈ SAT | ϕ is 2-CNF}.
(a) Is 2-SAT ∈ NP?
(b) Is there a polynomial-time algorithm for deciding whether or not a boolean formula in 2-CNF is satisfiable? If your answer is yes, then describe and analyze your algorithm. If your answer is no, then show that 2-SAT ∈ NPC.

Problem 2. (20 points). Let A be a finite set and f a function that maps every a ∈ A to a positive integer f(a). The PARTITION problem asks whether or not there is a subset B ⊆ A such that Σ_{b∈B} f(b) = Σ_{a∈A−B} f(a). We have learned that the PARTITION problem is NP-complete. Given positive integers j and k, the SUM OF SQUARES problem asks whether or not A can be partitioned into j disjoint subsets, A = B1 ∪ B2 ∪ . . . ∪ Bj, such that

Σ_{i=1}^{j} ( Σ_{a∈Bi} f(a) )² ≤ k.

Prove that the SUM OF SQUARES problem is NP-complete.

Problem 3. (20 = 10 + 10 points). Let G be an undirected graph. A path in G is simple if it contains each vertex at most once. Specifying two vertices u, v and a positive integer k, the LONGEST PATH problem asks whether or not there is a simple path connecting u and v whose length is k or longer.
(a) Give a polynomial-time algorithm for the LONGEST PATH problem or show that it is NP-hard.
(b) Revisit (a) under the assumption that G is directed and acyclic.

Problem 4. (20 = 10 + 10 points). Let A ⊆ 2^V be an abstract simplicial complex over the finite set V and let k be a positive integer.
(a) Is it NP-hard to decide whether A has k or more disjoint simplices?
(b) Is it NP-hard to decide whether A has k or fewer simplices whose union is V?

Problem 5. (20 points). Let G = (V, E) be an undirected, bipartite graph and recall that there is a polynomial-time algorithm for constructing a maximum matching. We are interested in computing a minimum set of matchings such that every edge of the graph is a member of at least one of the selected matchings. Give a polynomial-time algorithm constructing an O(log n) approximation for this problem.