Chapter 1: Introduction
Mathematical Review
Mark Allen Weiss, Florida University
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mathematical Review
 Exponents
 Logarithms
 Recursive Definitions
 Function Growth
 Proofs
Exponents
 X^0 = 1 by definition
 X^a X^b = X^(a+b)
 X^a / X^b = X^(a-b)
 Show that: X^(-n) = 1 / X^n
 (X^a)^b = X^(ab)
Logarithms
 log_a X = Y  ⇔  a^Y = X, a > 0, X > 0
E.g.: log_2 8 = 3, because 2^3 = 8
 log_a 1 = 0 because a^0 = 1
log X means log_2 X
lg X means log_10 X
ln X means log_e X,
where e is the base of the natural logarithm
Logarithms
 log_a(XY) = log_a X + log_a Y
 log_a(X/Y) = log_a X - log_a Y
 log_a(X^n) = n log_a X
 log_a b = (log_2 b) / (log_2 a)
 a^(log_a x) = x
Recursive Definitions
 Basic idea: to define objects, processes and properties in terms of simpler objects, simpler processes, or properties of simpler objects/processes.
Recursive Definitions
 Terminating rule - defines the object explicitly.
 Recursive rules - define the object in terms of a simpler object.
Examples
 Factorials N!
f(n) = n!
f(0) = 1, i.e. 0! = 1
f(n) = n * f(n-1), i.e. n! = n * (n-1)!
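A direct Java rendering of this definition (a sketch; the method name is illustrative, and a long overflows for n > 20):

static long factorial(int n) {
    if (n == 0) return 1;           // terminating rule: 0! = 1
    return n * factorial(n - 1);    // recursive rule: n! = n * (n-1)!
}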
Examples
Fibonacci numbers
F(0) = 1
F(1) = 1
F(k+1) = F(k) + F(k-1)
1, 1, 2, 3, 5, 8, ….
Function Growth
lim ( n ) = ∞, n → ∞
lim ( n^a ) = ∞, n → ∞, a > 0
lim ( 1 / n ) = 0, n → ∞
lim ( 1 / n^a ) = 0, n → ∞, a > 0
lim ( log( n ) ) = ∞, n → ∞
lim ( a^n ) = ∞, n → ∞, a > 1
Function Growth
 lim (f(x) + g(x)) = lim (f(x)) + lim (g(x))
 lim (f(x) * g(x)) = lim (f(x)) * lim (g(x))
 lim (f(x) / g(x)) = lim (f(x)) / lim (g(x)), provided the limits exist and lim (g(x)) ≠ 0
 lim (f(x) / g(x)) = lim (f '(x) / g '(x)) (L'Hopital's rule, for 0/0 and ∞/∞ forms)
Examples
 lim (n / n^2) = 0, n → ∞
 lim (n^2 / n) = ∞, n → ∞
 lim (n^2 / n^3) = 0, n → ∞
 lim (n^3 / n^2) = ∞, n → ∞
 lim (n / ((n+1)/2)) = 2, n → ∞
Some Derivatives
 (log_a n)' = (1/n) log_a e
 (a^n)' = (a^n) ln(a)
Proofs
 Direct proof
 Proof by induction
 Proof by counterexample
 Proof by contradiction
 Proof by contraposition
Direct Proof
Based on the definition of the object / property.
Example:
Prove that if a number is divisible by 6, then it is divisible by 2.
Proof: Let m be divisible by 6.
Then there exists q such that m = 6q.
Since 6 = 2·3,
m = 6q = 2·3·q = 2r, where r = 3q.
Therefore m is divisible by 2.
Proof by Induction
We use proof by induction when our claim concerns a sequence of cases that can be numbered.
Inductive base:
Show that the claim is true for the smallest case, usually k = 0 or k = 1.
Inductive hypothesis:
Assume that the claim is true for some k.
Inductive step: prove that the claim is then true for k+1.
Example of Proof by Induction
Prove by induction that
S(N) = Σ_{i=0 to N} 2^i = 2^(N+1) - 1, for any integer N ≥ 0
1. Inductive base
Let n = 0. S(0) = 2^0 = 1.
On the other hand, by the formula, S(0) = 2^(0+1) - 1 = 1.
Therefore the formula is true for n = 0.
2. Inductive hypothesis
Assume that S(k) = 2^(k+1) - 1.
We have to show that S(k+1) = 2^(k+2) - 1.
By the definition of S(n):
S(k+1) = S(k) + 2^(k+1) = 2^(k+1) - 1 + 2^(k+1) = 2·2^(k+1) - 1 = 2^(k+2) - 1
Proof by Counterexample
Used when we want to prove that a statement is false.
Typical target: a claim that refers to all members of a class.
EXAMPLE: The statement "all odd numbers are prime" is false.
A counterexample is the number 9: it is odd and it is not prime.
Proof by Contradiction
Assume that the statement is false, i.e. that its negation is true.
Show that the assumption implies that some known property is false - this is the contradiction.
Example: Prove that there is no largest prime number.
(Sketch: assume p is the largest prime; then p! + 1 is divisible by no prime up to p, so some prime larger than p must exist - a contradiction.)
Proof by Contraposition
Used when we have to prove a statement of the form P → Q.
Instead of proving P → Q, we prove its equivalent ~Q → ~P.
Example: Prove that if the square of an integer is odd, then the integer is odd.
We can prove the contrapositive by direct proof: if an integer is even, then its square is even.
Chapter 2:
Algorithm Analysis
Big-Oh and Other Notations in
Algorithm Analysis
Big-Oh and Other Notations in
Algorithm Analysis
 Classifying Functions by Their
Asymptotic Growth
 Theta, Little oh, Little omega
 Big Oh, Big Omega
 Rules to manipulate Big-Oh expressions
 Typical Growth Rates
Classifying Functions by
Their Asymptotic Growth
Asymptotic growth: the rate of growth of a function.
Given a particular differentiable function f(n), all other differentiable functions fall into three classes:
 growing with the same rate
 growing faster
 growing slower
Theta
f(n) and g(n) have the same rate of growth if
lim( f(n) / g(n) ) = c, 0 < c < ∞, n → ∞
Notation: f(n) = Θ( g(n) ), pronounced "theta"
Little oh
f(n) grows slower than g(n) (or g(n) grows faster than f(n)) if
lim( f(n) / g(n) ) = 0, n → ∞
Notation: f(n) = o( g(n) ), pronounced "little oh"
Little omega
f(n) grows faster than g(n) (or g(n) grows slower than f(n)) if
lim( f(n) / g(n) ) = ∞, n → ∞
Notation: f(n) = ω( g(n) ), pronounced "little omega"
Little omega and Little oh
If g(n) = o( f(n) ), then f(n) = ω( g(n) ).
Examples: Compare n and n^2.
lim( n / n^2 ) = 0, n → ∞, so n = o(n^2)
lim( n^2 / n ) = ∞, n → ∞, so n^2 = ω(n)
Theta: Relation of Equivalence
The relation R: "having the same rate of growth" is a relation of equivalence: it gives a partition of the set of all differentiable functions into classes of equivalence.
Functions in one and the same class are equivalent with respect to their growth.
Algorithms with Same Complexity
Two algorithms have the same complexity if the functions representing their numbers of operations have the same rate of growth.
Among all functions with the same rate of growth, we choose the simplest one to represent the complexity.
Examples
Compare n and (n+1)/2:
lim( n / ((n+1)/2) ) = 2,
so they have the same rate of growth:
(n+1)/2 = Θ(n)
- the rate of growth of a linear function
Examples
Compare n^2 and n^2 + 6n:
lim( n^2 / (n^2 + 6n) ) = 1,
the same rate of growth:
n^2 + 6n = Θ(n^2)
- the rate of growth of a quadratic function
Examples
Compare log n and log n^2:
lim( log n / log n^2 ) = 1/2,
the same rate of growth:
log n^2 = Θ(log n)
- a logarithmic rate of growth
Examples
Θ(n^3): n^3, 5n^3 + 4n, 105n^3 + 4n^2 + 6n
Θ(n^2): n^2, 5n^2 + 4n + 6, n^2 + 5
Θ(log n): log n, log n^2, log (n + n^3)
Comparing Functions
 same rate of growth: g(n) = Θ(f(n))
 different rate of growth:
   either g(n) = o(f(n))
   (g(n) grows slower than f(n), and hence f(n) = ω(g(n)))
   or g(n) = ω(f(n))
   (g(n) grows faster than f(n), and hence f(n) = o(g(n)))
The Big-Oh Notation
f(n) = O(g(n)) if f(n) grows at the same rate as or slower than g(n), i.e. if
f(n) = Θ(g(n)) or f(n) = o(g(n))
Example
n + 5 = Θ(n) = O(n) = O(n^2) = O(n^3) = O(n^5)
The closest estimate: n + 5 = Θ(n).
The general practice is to use the Big-Oh notation: n + 5 = O(n).
The Big-Omega Notation
The inverse of Big-Oh is Ω:
if g(n) = O(f(n)), then f(n) = Ω(g(n)).
f(n) = Ω(g(n)) means that f(n) grows faster than or at the same rate as g(n).
Rules to manipulate
Big-Oh expressions
Rule 1:
a. If
T1(N) = O(f(N))
and
T2(N) = O(g(N))
then
T1(N) + T2(N) =
max( O( f (N) ), O( g(N) ) )
Rules to manipulate
Big-Oh expressions
b. If
T1(N) = O( f(N) )
and
T2(N) = O( g(N) )
then
T1(N) * T2(N) = O( f(N)* g(N) )
Rules to manipulate
Big-Oh expressions
Rule 2:
If T(N) is a polynomial of degree k,
then
T(N) = Θ( N^k )
Rule 3:
log^k N = O(N) for any constant k.
Examples
n^2 + n = O(n^2): we disregard any lower-order term
n log(n) = O(n log(n))
n^2 + n log(n) = O(n^2)
Typical Growth Rates
c         constant (we write O(1))
log N     logarithmic
log^2 N   log-squared
N         linear
N log N
N^2       quadratic
N^3       cubic
2^N       exponential
N!        factorial
Problems
N^2 = O(N^2)   true
2N = O(N^2)    true
N = O(N^2)     true
N^2 = O(N)     false
2N = O(N)      true
N = O(N)       true
Problems
N^2 = Θ(N^2)   true
2N = Θ(N^2)    false
N = Θ(N^2)     false
N^2 = Θ(N)     false
2N = Θ(N)      true
N = Θ(N)       true
Chapter 2:
Algorithm Analysis
Application of Big-Oh to program
analysis
Running Time Calculations
Background
The work done by an algorithm, i.e. its complexity, is determined by the number of basic operations necessary to solve the problem.
The Task
Determine how the number of operations depends on the size of the input:
N - size of input
F(N) - number of operations
Basic operations in an algorithm
Problem: find x in an array.
Operation: comparison of x with an entry in the array.
Size of input: the number of elements in the array.
Basic operations ….
Problem: multiply two matrices with real entries.
Operation: multiplication of two real numbers.
Size of input: the dimensions of the matrices.
Basic operations ….
Problem: sort an array of numbers.
Operation: comparison of two array entries, plus moving elements in the array.
Size of input: the number of elements in the array.
Counting the number of operations
A. for loops: O(n)
The running time of a for loop is at most the running time of the statements inside the loop times the number of iterations.
for loops
sum = 0;
for( i = 0; i < n; i++ )
    sum = sum + i;
The running time is O(n)
Counting the number of operations
B. Nested loops
The total running time is the running time of the inside statements times the product of the sizes of all the loops.
Nested loops
sum = 0;
for( i = 0; i < n; i++)
    for( j = 0; j < n; j++)
        sum++;
The running time is O(n^2)
Counting the number of operations
C. Consecutive program fragments
Total running time: the maximum of the running times of the individual fragments.
Consecutive program fragments
sum = 0;                              // O(n)
for( i = 0; i < n; i++)
    sum = sum + i;

sum = 0;                              // O(n^2)
for( i = 0; i < n; i++)
    for( j = 0; j < 2*n; j++) sum++;
The maximum is O(n^2)
Counting the number of operations
D. If statement
if C
    S1;
else
    S2;
The running time is the maximum of the running times of S1 and S2.
EXAMPLES
O(n^3):
sum = 0;
for( i = 0 ; i < n; i++)
    for( j = 0 ; j < n*n ; j++ )
        sum++;
EXAMPLES
O(n^2):
sum = 0;
for( i = 0; i < n ; i++)
    for( j = 0; j < i ; j++)
        sum++;
EXAMPLES
O(n^3 logn):
for( j = 0; j < n*n; j++)
    compute_val(j);
The complexity of compute_val(x) is given to be O(n*logn).
Search in an unordered array of elements
for (i = 0; i < n; i++)
    if (a[ i ] == x) return 1;
return -1;
O(n)
Search in a table n x m
for (i = 0; i < n; i++)
    for (j = 0; j < m; j++)
        if (a[ i ][ j ] == x) return 1;
return -1;
O(n*m)
Chapter 2:
Algorithm Analysis
Application of Big-Oh to program
analysis
Logarithms in Running Time
Binary search
Euclid’s algorithm
Exponentials
Rules to count operations
Logarithms in Running Time
Divide-and-conquer algorithms
Successively reducing the problem by a factor of two requires O(logN) operations.
Why logN?
A complete binary tree with N leaves has logN levels.
Each level in the divide-and-conquer algorithms corresponds to an operation.
Hence the number of operations is O(logN).
Example: 8 leaves, 3 levels
Binary Search
Solution 1: O(N) operations.
Scan all elements from left to right, each time comparing with X.
Binary Search
Solution 2: O(logN)
Find the middle element Amid in the list and compare it with X:
 If they are equal, stop.
 If X < Amid, consider the left part.
 If X > Amid, consider the right part.
Repeat until the list is reduced to one element.
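A Java sketch of Solution 2 (assuming a sorted int array; names are illustrative):

static int binarySearch(int[] a, int x) {
    int low = 0, high = a.length - 1;
    while (low <= high) {                    // range not yet empty
        int mid = (low + high) / 2;          // the middle element Amid
        if (a[mid] == x) return mid;         // equal: stop
        else if (x < a[mid]) high = mid - 1; // consider the left part
        else low = mid + 1;                  // consider the right part
    }
    return -1;                               // not found
}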
Euclid's algorithm
Finding the greatest common divisor (GCD):
GCD of M and N, M > N, = GCD of N and M % N
GCD and recursion
Recursion:
If M % N = 0, return N
Else return GCD(N, M % N)
The answer is the last nonzero remainder.
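The recursive rule above as a Java sketch (assumes n != 0):

static long gcd(long m, long n) {
    if (m % n == 0) return n;    // If M % N = 0, return N
    return gcd(n, m % n);        // Else return GCD(N, M % N)
}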
M    N    M % N
24   15   9
15   9    6
9    6    3
6    3    0
3    0    (stop: the answer is 3)
Euclid's Algorithm (non-recursive implementation)
long gcd ( long m, long n)
{
    long rem;
    while (n != 0)
    {
        rem = m % n;
        m = n;
        n = rem;
    }
    return m;
}
Why O(logN)?
M % N <= M / 2 (for M > N).
After the 1st iteration, N appears as the first argument and the remainder is less than N/2.
After the 2nd iteration, the remainder appears as the first argument and is reduced by a factor of two.
Hence O(logN).
Computing X^N
X^N = X * (X^2)^[N/2], if N is odd ([N/2] denotes integer division)
X^N = (X^2)^(N/2), if N is even
long pow (long x, int n)
{
    if (n == 0) return 1;
    if (is_Even(n))
        return pow(x * x, n/2);
    else
        return x * pow(x * x, n/2);
}
Why O(logN)?
If N is odd, there are two multiplications per step.
The number of operations is at most 2logN: O(logN).
Another recursion for X^N
Another recursive definition reduces the power just by 1:
X^N = X * X^(N-1)
Here the number of operations is N-1, i.e. O(N), and the algorithm is less efficient than the divide-and-conquer algorithm.
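For contrast, a Java sketch of this O(N) definition (n >= 0 assumed):

static long powLinear(long x, int n) {
    if (n == 0) return 1;
    return x * powLinear(x, n - 1);   // X^N = X * X^(N-1): one multiplication per step
}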
How to count operations
 single statements (not function calls): constant, O(1) = 1
 sequential fragments: the maximum of the operations of the individual fragments
How to count operations
 single loop running up to N, with single statements in its body: O(N)
 single loop running up to N, with O(f(N)) operations in the body: O( N * f(N) )
How to count operations
 two nested loops each running up to N, with single statements: O(N^2)
 divide-and-conquer algorithms with input size N: O(logN),
   or O(N*logN) if each step requires additional processing of N elements
Example: What is the probability that two numbers are relatively prime?
tot = 0; rel = 0;
for ( i = 0; i <= n; i++)
    for ( j = i+1; j <= n; j++)
    {
        tot++;
        if ( gcd( i, j ) == 1) rel++;
    }
return (rel/tot);
Running time = ?
Chapter 3:
Abstract Data Types
Lists, Stacks
An ADT
• a set of objects
• a set of operations
The same set of objects with different sets of operations => different ADTs.
ADTs are implemented using classes, hiding implementation details: encapsulation.
LIST ABSTRACTION
Definition: a linear configuration of elements, called nodes.
Characteristics
 Insert and delete nodes in any order
 The nodes are connected
 Each node has two components:
   Information (data)
   Link to the next node
 The nodes are accessed through the links between them
[Figure: a list from head to tail, showing a node X with its predecessor and its successor.]
For each node, the node that is in front of it is called its predecessor. The node that is after it is called its successor.
Terminology
 Head (front, first node): the node without a predecessor; the node that starts the list.
 Tail (end, last node): the node that has no successor; the last node in the list.
 Current node: the node being processed. From the current node we can access the next node.
 Empty list: no nodes exist.
Basic operations
 Create/destroy a list
 Expand/shrink the list
 Read/write operations
 Change the current node (move along the list)
 Report the current position in the list
 Report the status of the list
ADT List Notation
L - list
e - item of the same type as the information part of an element (a node) in the list
b - boolean value
Operations in ADT Notation
Insert(L,e)
  Inserts a node with information e before the current position.
Delete(L)
  Deletes the current node in L; the current position indicates the next node.
RetrieveInfo(L) → e
  Returns the information in the current node.
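One possible Java rendering of this notation (a sketch only; a full implementation would also maintain the current position and the operations from the previous slides):

interface ListADT<E> {
    void insert(E e);    // Insert(L,e): insert a node with information e before the current position
    void delete();       // Delete(L): delete the current node
    E retrieveInfo();    // RetrieveInfo(L) -> e: information in the current node
}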
Insertion and Deletion
A. Insertion
To insert a node X between the nodes A and B:
 create a link from X to B,
 create a link from A to X.
[Figure: X linked in between A and B.]
Insertion and Deletion
B. Deletion
To delete a node X between A and B:
 create a link from A to B,
 remove node X.
[Figure: X unlinked from between A and B.]
Node Linking
1. Singly linked lists:
   Each node contains a link only to the next node.
2. Doubly linked lists:
   Each node contains two links - to the previous and to the next node.
3. Circular lists:
   The tail is linked to the head.
List Implementation
Static - using an array
Dynamic - using linked nodes
Array Implementation
Two parallel arrays are used:
 Index array - the number stored in the i-th element shows the index of the "next" node, i.e. the node that follows the i-th node
 Data array - used to store the informational part of the nodes
STACKS
Definition: the last stored element is the first to be accessed (LIFO: last in - first out).
Basic operations
Push: store a data item at the top of the stack.
Pop: retrieve a data item from the top of the stack.
ADT Definition of STACK
Notation:
S - stack
e - item of the same type as the elements of S
b - boolean value
Operations
Init_Stack(S)
  Procedure to initialize S to an empty stack.
Destroy_Stack(S)
  Procedure to delete all elements in S.
Operations
Stack_Empty(S) → b
  Boolean function that returns TRUE if S is empty.
Stack_Full(S) → b
  Boolean function that returns TRUE if S is full.
Operations
Push(S,e)
  Procedure to place an item e into S (if there is room, i.e. S is not full).
Pop(S) → e
  Procedure to take the last item stored in S (also called the top element), if S is not empty.
Example
A procedure to replace the elements of a nonempty stack, assuming they are integers, with their sum.
Pre: a nonempty stack with elements of type integer.
Post: S contains only one element - the sum of the previously stored elements.
Algorithm
1. e1 ← Pop(S)
2. while stack is not empty repeat
   2.1. e2 ← Pop(S)
   2.2. Push(S, e1+e2)
   2.3. e1 ← Pop(S)
3. Push(S, e1)
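A Java sketch of this algorithm, using java.util.ArrayDeque as the stack (an illustrative choice; any stack with Push/Pop works):

import java.util.ArrayDeque;
import java.util.Deque;

static void replaceWithSum(Deque<Integer> s) {   // pre: s is nonempty
    int e1 = s.pop();                 // 1. e1 <- Pop(S)
    while (!s.isEmpty()) {            // 2. while stack is not empty
        int e2 = s.pop();             // 2.1. e2 <- Pop(S)
        s.push(e1 + e2);              // 2.2. Push(S, e1+e2)
        e1 = s.pop();                 // 2.3. e1 <- Pop(S)
    }
    s.push(e1);                       // 3. Push(S, e1)
}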
Chapter 3:
Abstract Data Types
Queues
QUEUES
Definition: a sequence of elements of the same type.
The first stored element is the first accessible.
The structure is also known under the name FIFO - first in, first out.
Basic operations
EnQueue: store a data item at the end of the queue.
DeQueue: retrieve a data item from the beginning of the queue.
ADT Definition of QUEUE
Notation:
Q - queue
e - item of the same type as the elements of Q
b - boolean value
Operations
Init_Queue(Q)
  Initialize Q to an empty queue.
Queue_Empty(Q) → b
  Boolean function that returns TRUE if Q is empty.
Queue_Full(Q) → b
  Boolean function that returns TRUE if Q is full (relevant for array-based implementations).
Operations
EnQueue(Q,e)
  Procedure to place an item e at the end of Q (if Q is not full).
DeQueue(Q) → e
  Procedure to take the first item stored in Q, if Q is not empty.
Problem 1
Append_Queue(Q,P): a procedure to append a queue P onto the end of a queue Q, leaving P empty.
Pre: queue P and queue Q, initialized
Post: Q contains all elements originally in Q, followed by the elements that were in P in the same order. P is empty.
• Design an algorithm to solve the problem.
Problem 2
Reverse_Queue(Q): a procedure to reverse the elements in a queue Q.
Pre: queue Q, initialized
Post: Q contains all elements rewritten in reverse order.
• Design a non-recursive algorithm using a stack
• Design a recursive algorithm
• Find the complexity of the algorithms
Solutions to Problem 2
A. Non-recursive
Init_Stack(S)
while not Queue_Empty(Q)
    e ← DeQueue(Q)
    Push(S,e)
while not Stack_Empty(S)
    e ← Pop(S)
    EnQueue(Q,e)
Complexity: O(N), N - the number of elements in Q.
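The same algorithm as a Java sketch (java.util.ArrayDeque serving as the stack, java.util.Queue as Q):

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Queue;

static void reverseQueue(Queue<Integer> q) {
    Deque<Integer> s = new ArrayDeque<>();   // Init_Stack(S)
    while (!q.isEmpty())
        s.push(q.remove());                  // e <- DeQueue(Q); Push(S,e)
    while (!s.isEmpty())
        q.add(s.pop());                      // e <- Pop(S); EnQueue(Q,e)
}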
Solutions to Problem 2
B. Recursive
Reverse_Queue(Q):
if not Queue_Empty(Q)
    e ← DeQueue(Q)
    Reverse_Queue(Q)
    EnQueue(Q,e)
return
Complexity: O(N), N - the number of elements in Q.
Problem 3
Append_Reverse_Queue(Q,P): append a queue P in reverse order onto the end of a queue Q, leaving P empty.
Pre: P and Q initialized (possibly empty)
Post: Q contains all elements originally in Q, followed by the elements that were in P in reverse order. P is empty.
• Design a recursive algorithm.
Chapter 4: Trees
General Tree Concepts
Binary Trees
Trees
Definitions
Representation
Binary trees
Traversals
Expression trees
Definitions
tree - a non-empty collection of vertices & edges
vertex (node) - can have a name and carry other associated information
path - a list of distinct vertices in which successive vertices are connected by edges
Definitions
 any two vertices must have one
and only one path between them
else its not a tree
 a tree with N nodes has N-1 edges
123
Definitions
 root - starting point (top) of the
tree
 parent (ancestor) - the vertex
“above” this vertex
 child (descendent) - the vertices
“below” this vertex
124
Definitions
 leaves (terminal nodes) - have
no children
 level - the number of edges
between this node and the root
 ordered tree - where children’s
order is significant
125
Definitions
 Depth of a node - the length of the path
from the root to that node
• root: depth 0
 Height of a node - the length of the longest
path from that node to a leaf
• any leaf: height 0
 Height of a tree: The length of the longest
path from the root to a leaf
126
Balanced Trees
For each node, the difference between the height of its left sub-tree and the height of its right sub-tree is not more than 1.
Trees - Example
[Figure: an example tree with the root at level 0, a child of the root at level 1, and leaves (terminal nodes) over levels 0-3; the marked node T has depth 2 and height 1.]
Tree Representation
class TreeNode
{
    Object element;
    TreeNode firstChild;
    TreeNode nextSibling;
}
Example
[Figure: a general tree with nodes a through g, shown both directly and in its first-child/next-sibling representation.]
Binary Tree
[Figure: a binary tree with its internal nodes and external (leaf) nodes marked.]
Height of a Complete Binary Tree
[Figure: a complete binary tree with levels L0 through L3.]
At each level the number of nodes is doubled.
Total number of nodes:
1 + 2 + 2^2 + 2^3 = 2^4 - 1 = 15
Nodes and Levels in a Complete Binary Tree
Number of nodes in a tree with M levels:
1 + 2 + 2^2 + … + 2^M = 2^(M+1) - 1 = 2·2^M - 1
Let N be the number of nodes:
N = 2·2^M - 1, so 2·2^M = N + 1,
2^M = (N+1)/2,
M = log( (N+1)/2 ).
N nodes: log( (N+1)/2 ) = O(log(N)) levels
M levels: 2^(M+1) - 1 = O(2^M) nodes
Binary Tree Node
class BinaryNode
{
    Object element;      // the data in the node
    BinaryNode left;     // left child
    BinaryNode right;    // right child
}
Binary Tree - Preorder Traversal
Order: Root, Left, Right.
[Figure: a binary tree whose letters, read in preorder, spell a phrase.]
First letter - at the root.
Last letter - at the rightmost node.
Preorder Algorithm
preorderVisit(tree)
{
    if (current != null)
    {
        process(current);
        preorderVisit(left_tree);
        preorderVisit(right_tree);
    }
}
Binary Tree - Inorder Traversal
Order: Left, Root, Right.
[Figure: a binary tree whose letters, read in inorder, spell a phrase.]
First letter - at the leftmost node.
Last letter - at the rightmost node.
Inorder Algorithm
inorderVisit(tree)
{
    if (current != null)
    {
        inorderVisit(left_tree);
        process(current);
        inorderVisit(right_tree);
    }
}
Binary Tree - Postorder Traversal
Order: Left, Right, Root.
[Figure: a binary tree whose letters, read in postorder, spell a phrase.]
First letter - at the leftmost node.
Last letter - at the root.
Postorder Algorithm
postorderVisit(tree)
{
    if (current != null)
    {
        postorderVisit(left_tree);
        postorderVisit(right_tree);
        process(current);
    }
}
Expression Trees
[Figure: building the expression tree for (1+2)*3 with a stack of references to tree nodes (bottom of the stack to the left): leaves 1 and 2 are pushed, then combined under +, then leaf 3 is pushed, and finally + and 3 are combined under *.]
Post-fix notation: 1 2 + 3 *
Expression Trees
In-order traversal: (1 + 2) * (3)
Post-order traversal: 1 2 + 3 *
[Figure: the finished tree - * at the root, with + (children 1 and 2) on the left and 3 on the right.]
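A Java sketch of the stack-based construction (single-character tokens assumed for brevity; the node class mirrors the BinaryNode shown earlier):

import java.util.ArrayDeque;
import java.util.Deque;

class ExprNode {
    char token;
    ExprNode left, right;
    ExprNode(char token, ExprNode left, ExprNode right) {
        this.token = token; this.left = left; this.right = right;
    }
}

static ExprNode buildTree(String postfix) {            // e.g. "12+3*" for 1 2 + 3 *
    Deque<ExprNode> stack = new ArrayDeque<>();
    for (char c : postfix.toCharArray()) {
        if (Character.isDigit(c)) {
            stack.push(new ExprNode(c, null, null));   // operand: push a leaf
        } else {
            ExprNode right = stack.pop();              // right operand popped first
            ExprNode left = stack.pop();
            stack.push(new ExprNode(c, left, right));  // operator node over two subtrees
        }
    }
    return stack.pop();                                // root of the expression tree
}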
Chapter 4: Trees
AVL Trees
AVL Trees
 Keep the tree balanced.
 Use node rotations.
 Balance condition: the left and the right subtrees of each node should differ by at most one level.
The method was proposed by two Russian scientists, Adelson-Velskii and Landis, in 1962.
Single Rotation
[Figure: a tree rooted at 5, with left child 3 and right child 7; 7 has children 6 and 8, and the new node 9 is inserted below 8. The links to be changed are those between 5, 7 and 6.]
The sub-tree containing the inserted node (rooted at 8) is on the same side as the 'heavier' sub-tree of node 5 (rooted at 7).
After the Rotation
[Figure: 7 is now the root of the subtree, with left child 5 (children 3 and 6) and right child 8 (child 9).]
The middle node 6 is switched to the other subtree.
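A Java sketch of this rotation on the BinaryNode structure from the earlier slides (height bookkeeping omitted; the method name follows common usage, not the slide):

static BinaryNode rotateWithRightChild(BinaryNode k1) {   // k1 = 5 in the figure
    BinaryNode k2 = k1.right;    // k2 = 7
    k1.right = k2.left;          // the middle subtree (rooted at 6) moves under 5
    k2.left = k1;                // 5 becomes the left child of 7
    return k2;                   // 7 is the new root of this subtree
}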
Double Rotation
[Figure: first 7 and 8 are rotated so that the 'heavy' subtrees end up on the same side, then a single rotation finishes the rebalancing.]
The new node 6 is in the left subtree of node 8, while node 8 is in the right subtree of 5 -> rotate 7 and 8 to obtain same-side trees.
Splay Trees
Move each accessed node to the top with rotations, decreasing the depth of the tree.
Multi-Way Search
Nodes contain more than one key.
Internal nodes are used only to contain the keys; the actual records are stored at the terminal nodes at the bottom of the tree.
[Figure: a multi-way search tree with keys E, I and R in the internal nodes and records A, C, G, H, N, S at the leaves.]
Complexity Issues
 AVL trees:
   Search, insertion, and deletion: O(logN)
   Creating the tree: O(N)
 Trees with multiple keys: O(log_m N), m - the branching factor
Chapter 4: Trees
Binary Search Trees
Binary Search Trees
Definitions
Operations and complexity
Advantages and disadvantages
AVL Trees
Single rotation
Double rotation
Splay Trees
Multi-Way Search
Definitions
Each node holds:
 a record with a key and a value
 a left link
 a right link
All records with smaller keys are in the left subtree.
All records with larger keys are in the right subtree.
Example
[Figure: an example binary search tree.]
Operations
 Search - compare the values and proceed either to the left or to the right.
 Insertion - an unsuccessful search: insert the new node at the bottom, where the search has stopped.
 Deletion - replace the value in the node with the smallest value in the right subtree or the largest value in the left subtree.
 Retrieval in sorted order - inorder traversal.
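A Java sketch of search and insertion (assuming a simplified node with an int key; deletion and duplicate handling omitted):

class BSTNode {
    int key;
    BSTNode left, right;
    BSTNode(int key) { this.key = key; }
}

static boolean contains(BSTNode t, int x) {
    if (t == null) return false;                 // bottom reached: unsuccessful
    if (x < t.key) return contains(t.left, x);   // proceed to the left
    if (x > t.key) return contains(t.right, x);  // proceed to the right
    return true;                                 // keys are equal: found
}

static BSTNode insert(BSTNode t, int x) {
    if (t == null) return new BSTNode(x);        // insert where the search stopped
    if (x < t.key) t.left = insert(t.left, x);
    else if (x > t.key) t.right = insert(t.right, x);
    return t;                                    // duplicates are ignored
}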
Complexity
Logarithmic; it depends on the shape of the tree.
In the worst case: O(N) comparisons.
Advantages of BST
 Simple
 Efficient
 Dynamic
 One of the most fundamental algorithms in CS
 The method of choice in many applications
Disadvantages of BST
 The shape of the tree depends on the order of insertions, and it can be degenerate.
 When inserting or searching for an element, the key of each visited node has to be compared with the key of the element to be inserted/found. Keys may be long, and the run time may increase considerably.
Improvements of BST
Keeping the tree balanced: AVL trees (Adelson-Velskii and Landis).
Balance condition: the left and right subtrees of each node can differ by at most one level.
It can be proved that if this condition is observed, the depth of the tree is O(logN).
Reducing the time for key comparison: radix trees - comparing only the leading bits of the keys (not discussed here).
Chapter 4: Trees
Radix Search Trees
Radix Search Trees
Radix Searching
Digital Search Trees
Radix Search Trees
Multi-Way Radix Trees
Radix Searching
Idea: examine the search keys one bit at a time.
Advantages:
 reasonable worst-case performance
 an easy way to handle variable-length keys
 some savings in space by storing part of the key within the search structure
 competitive with both binary search trees and hashing
Radix Searching
 Disadvantages:
  biased data can lead to degenerate trees with bad performance
  for some methods, use of space is inefficient
  dependent on the computer's architecture - difficult to do efficient implementations in some high-level languages
Radix Searching
• Methods:
 Digital search trees
 Radix search tries
 Multiway radix searching
Digital Search Trees
Similar to binary tree search.
Difference: branch in the tree by comparing the keys' bits, not the keys as a whole.
Example
Keys and their 5-bit codes:
A 00001, S 10011, E 00101, R 10010, C 00011, H 01000, I 01001,
N 01110, G 00111, X 11000, M 01101, P 10000, L 01100
[Figure: the digital search tree built from these keys; branching is by bit value - 0 to the left, 1 to the right.]
Example
Inserting Z = 11010:
 go right twice,
 go left - an external node is reached,
 attach Z to the left of X.
[Figure: the tree after Z is attached below X.]
Digital Search Trees
Things to remember about digital search trees:
 Equal keys are anathema - they must be kept in separate data structures, linked to the nodes.
 Worst case - better than for binary search trees: the length of the longest path is equal to the longest match in the leading bits between any two keys.
Digital Search Trees
 Search or insertion requires about log(N) comparisons on average, and b comparisons in the worst case, in a tree built from N random b-bit keys.
 No path will ever be longer than the number of bits in the keys.
Radix Search Trees
 If the keys are long, digital search trees have low efficiency.
 Radix search trees do not store keys in the tree at all; the keys are in the external nodes of the tree.
 Called tries (pronounced "try-ee"), from "retrieval".
Radix Search Trees
Two types of nodes:
 Internal: contain only links to other nodes
 External: contain keys and no links
Radix Search Trees
To insert a key:
1. Go along the path described by the leading bit pattern of the key until an external node is reached.
2. If the external node is empty, store the new key there.
   If the external node contains a key, replace it by an internal node linked to the new key and the old key. If the keys have several leading bits equal, more internal nodes are necessary.
NOTE: insertion does not depend on the order of the keys.
Radix Search Trees
To search for a key:
1. Branch according to its bits.
2. Do not compare it to anything until an external node is reached.
3. One full key comparison there completes the search.
Example
Keys: A 00001, S 10011, E 00101, R 10010, C 00011
[Figure: the radix search trie for these keys - internal nodes carry only 0/1 links; the keys sit in the external nodes.]
Example - insertion
Insert H 01000 into the trie for A, S, E, R, C.
[Figure: following H's bits leads to an empty external node, where H is stored.]
Example - insertion
Insert I 01001 into the trie for A, S, E, R, C, H.
[Figure: the external node reached is occupied (by H), so internal nodes are added until the bits of H and I differ, and both keys are stored in external nodes.]
Radix Search Trees - summary
• Program implementation:
 the necessity to maintain two types of nodes
 low-level implementation
• Complexity: about logN bit comparisons in the average case, and b bit comparisons in the worst case, in a tree built from N random b-bit keys.
Annoying feature: one-way branching for keys with a large number of common leading bits:
 The number of nodes may exceed the number of keys.
 On average - N/ln 2 ≈ 1.44N nodes.
Multi-Way Radix Trees
 The height of the tree is limited by the number of bits in the keys.
 Larger keys make the tree higher. One way to overcome this deficiency is multi-way radix tree searching.
 The branching is not according to 1 bit, but rather according to several bits (most often 2).
 If m bits are examined at a time, the search is sped up by a factor of 2^m.
 Problem: with m bits at a time, the nodes have 2^m links each, which may result in a considerable amount of wasted space due to unused links.
Multi-Way Radix Trees - example
[Figure: a multi-way radix tree whose nodes have 4 links (00, 01, 10, 11); the keys A, C, E, G, H, I, L, M, N, P, R, S, X sit at the leaves.]
Search: take the left, middle, or right links according to the first two bits.
Insert: replace an external node by the key (e.g. insert T 10100).
Multi-Way Radix Trees
 Wasted space - due to the large number of unused links.
 Worse if M, the number of bits considered, gets higher.
 The running time: log_M N - very efficient.
 Hybrid method: large M at the top, small M at the bottom.
Chapter 5: Hashing
Collision Resolution:
Separate Chaining
Collision Resolution
 Collision Resolution
 Separate Chaining
 Open Addressing
 Linear Probing
 Quadratic Probing
 Double Hashing
 Rehashing
 Extendible Hashing
Hash Tables - Collision
Problem: a many-to-one mapping - a potentially huge set of strings mapped to a small set of integers.
Collision: a second key hashes into a previously used slot.
Collision resolution: deals with keys that are mapped to the same slots.
Separate Chaining
Invented by H. P. Luhn, an IBM engineer, in January 1953.
Idea: keys hashing to the same slot are kept in linked lists attached to that slot.
Useful for highly dynamic situations, where the number of search keys cannot be predicted in advance.
Example
Key:  A S E A R C H I N G E X A M P L E
Hash: 1 8 5 1 7 3 8 9 3 7 5 2 1 2 5 1 5   (M = 11)
Separate chaining (each slot holds the list of the keys hashed to it, most recently inserted first):
slot 1: L A A A
slot 2: M X
slot 3: N C
slot 5: E P E E
slot 7: G R
slot 8: H S
slot 9: I
slots 0, 4, 6, 10: empty
Separate Chaining – Length of
Lists
N - number of keys
M - size of table
N/M - average length of the lists
Separate Chaining - Search
 Unsuccessful searches go to the end of some list.
 Successful searches are expected to go halfway down some list.
Separate Chaining - Choosing Table Size M
M should be relatively small, so as not to use up a large area of contiguous memory, but large enough that the lists are short, for more efficient sequential search.
Separate Chaining - other chaining options
 Keep the lists ordered: useful if there are many more searches than inserts, and if most of the searches are unsuccessful.
 Represent the chains as binary search trees: extra effort needed - not efficient.
Separate Chaining - advantages and disadvantages
Advantages
 used when memory space is a concern
 easily implemented
Disadvantages
 unevenly distributed keys lead to long lists: search time increases, and there are many empty spaces in the table
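A minimal separate-chaining sketch in Java (int keys; class and method names are illustrative):

import java.util.LinkedList;

class ChainedHashTable {
    private final LinkedList<Integer>[] lists;   // one list per slot

    @SuppressWarnings("unchecked")
    ChainedHashTable(int m) {
        lists = new LinkedList[m];
        for (int i = 0; i < m; i++) lists[i] = new LinkedList<>();
    }

    private int hash(int key) { return Math.abs(key % lists.length); }

    void insert(int key)      { lists[hash(key)].addFirst(key); }   // new key at the list head
    boolean contains(int key) { return lists[hash(key)].contains(key); }
}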
Chapter 5: Hashing
Hash Tables
Hashing
What is Hashing?
Direct Access Tables
Hash Tables
Hashing - basic idea
A mapping between the search keys and indices allows efficient searching in an array: each element is found with one operation only.
Hashing example
Example:
 1000 students,
 identification numbers between 0 and 999:
 use an array of 1000 elements.
If instead the SSN of each student (a 9-digit number) were used as the index, the array would need many more elements than the number of students - a great waste of space.
The Approach
 Directly referencing records in a table, using arithmetic operations on the keys to map them onto table addresses.
 Hash function: a function that transforms the search key into a table address.
Direct-Address Tables
 The most elementary form of hashing.
 Assumption: a direct one-to-one correspondence between the keys and the numbers 0, 1, …, m-1; m is not very large.
 Array A[m]: each position (slot) in the array contains either a reference to a record, or NULL.
 Cost: the size of the array we need is determined by the largest key - not very useful if there are only a few keys.
Hash Functions
Transform the keys into numbers within a predetermined interval, to be used as indices in an array (table, hash table) that stores the records.
Hash Functions - Numerical Keys
Keys: numbers.
If M is the size of the array, then h(key) = key % M.
This maps all keys into numbers within the interval [0 .. M-1].
Hash Functions - Character Keys
Keys: strings of characters.
Treat the binary representation of the key as a number, and then apply the hash function.
How keys are treated as numbers
If each character is represented with m bits, then the string can be treated as a number in base 2^m.
Example
A K E Y:
00001 01011 00101 11001 =
1·32^3 + 11·32^2 + 5·32^1 + 25·32^0 = 44217
Each letter is represented by its position in the alphabet, e.g. K is the 11th letter, and its representation is 01011 (11 in decimal).
Long Keys
If the keys are very long, an overflow may occur.
A solution is to apply Horner's method in computing the hash function.
Horner's Method
a_n·x^n + a_(n-1)·x^(n-1) + a_(n-2)·x^(n-2) + … + a_1·x + a_0 =
x(x(… x(x(a_n·x + a_(n-1)) + a_(n-2)) + …) + a_1) + a_0
Example:
4x^5 + 2x^4 + 3x^3 + x^2 + 7x + 9 =
x(x(x(x(4x + 2) + 3) + 1) + 7) + 9
The polynomial can be computed by alternating the multiplication and addition operations.
Example
V E R Y L O N G K E Y
10110 00101 10010 11001 01100 01111 01110 00111 01011 00101 11001
As numbers: 22 5 18 25 12 15 14 7 11 5 25
22·32^10 + 5·32^9 + 18·32^8 + 25·32^7 + 12·32^6 + 15·32^5 + 14·32^4 + 7·32^3 + 11·32^2 + 5·32^1 + 25·32^0
Example (continued)
22·32^10 + 5·32^9 + 18·32^8 + 25·32^7 + 12·32^6 + 15·32^5 + 14·32^4 + 7·32^3 + 11·32^2 + 5·32^1 + 25·32^0 =
(((((((((22·32 + 5)·32 + 18)·32 + 25)·32 + 12)·32 + 15)·32 + 14)·32 + 7)·32 + 11)·32 + 5)·32 + 25
Compute the hash function by applying the mod operation at each step, thus avoiding overflow:
h0 = (22·32 + 5) % M
h1 = (32·h0 + 18) % M
h2 = (32·h1 + 25) % M
. . .
Code
int hash32 (char[] name, int tbl_size)
{
    int key_length = name.length;
    int h = 0;
    for (int i = 0; i < key_length; i++)
        h = (32 * h + name[i]) % tbl_size;
    return h;
}
Hash Tables
 Index: an integer between 0 and M-1, generated by a hash function.
 Initially: blank slots, marked by a sentinel value or a special field in each slot.
Hash Tables
Insert: the hash function generates an address for the key.
Search: the same hash function is used to locate the key in the table.
Size of the Table
The table size M is different from the number of records N.
Load factor: N/M.
M should be prime, to help ensure an even distribution of the keys.
Chapter 5: Hashing
Collision Resolution: Open Addressing
Extendible Hashing
Collision Resolution
 Collision Resolution
 Separate Chaining
 Open Addressing
 Linear Probing
 Quadratic Probing
 Double Hashing
 Rehashing
 Extendible Hashing
Open Addressing
 Invented by A. P. Ershov and W. W. Peterson in 1957, independently.
 Idea: store collisions in the hash table itself.
 Table size: must be at least twice the number of records.
Open Addressing
If a collision occurs, the next probes are performed following the formula:
h_i(x) = ( hash(x) + f(i) ) mod Table_Size
where:
 h_i(x) is the index in the table probed on the i-th attempt to insert x
 hash(x) is the hash function
 f(i) is the collision resolution function
Open Addressing
Problem with delete: a special flag is needed to distinguish deleted from empty positions.
This is necessary for the search function: if we come to a "deleted" position, the search has to continue, because the deletion might have been done after the insertion of the sought key - the sought key might be further along in the table.
Linear Probing
f(i) = i
Insert: If a collision occurs, probe the next slot.
If it is unoccupied, store the key there.
If it is occupied, continue probing at the next slot.
Search: a) match - successful search
b) empty position - unsuccessful search
c) occupied and no match - continue probing.
At the end of the table, continue from the beginning.
Example
Key:  A S E A R  C H I N  G E X A  M  P  L E
Hash: 1 0 5 1 18 3 8 9 14 7 5 5 1 13 16 12 5
(table size 19)
[Table: the successive states of the 19-slot table as the keys are inserted; * marks the slots probed unsuccessfully before a key finds a free slot.]
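A Java sketch of insertion with linear probing (assumes the table is not full; EMPTY is an illustrative sentinel):

static final int EMPTY = Integer.MIN_VALUE;

static void insertLinearProbing(int[] table, int key) {
    int m = table.length;
    int i = Math.abs(key % m);       // hash(key)
    while (table[i] != EMPTY)        // collision: probe the next slot
        i = (i + 1) % m;             // wrap around at the end of the table
    table[i] = key;                  // unoccupied: store the key here
}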
Linear Probing
Disadvantage: "primary clustering" - large clusters tend to build up.
Probability to fill a slot:
[Figure: a run of i filled slots; slot a immediately follows the run, slot b stands apart from it.]
 slot a (right after a cluster of i filled slots): (i+1)/M
 slot b (with an empty slot before it): 1/M
Quadratic Probing
Use a quadratic function to compute the next index in the table to be probed:
f(i) = i^2
The idea is to skip regions in the table with possible clusters.
Quadratic Probing
In linear probing, if the i-th position is occupied we check the (i+1)-st position, then the (i+2)-nd, etc.
In quadratic probing, if the i-th position is occupied we check the (i+1)-st, then the (i+4)-th, then the (i+9)-th, etc.
Double Hashing
Purpose: to overcome the disadvantage of clustering.
A second hash function gives a fixed increment for the probe sequence:
hash2(x) = R - (x mod R)
R: a prime, smaller than the table size.
f(i) = i * hash2(x)
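The probe sequence as a Java sketch (x nonnegative; m is the table size, r the prime R < m):

static int probe(int x, int i, int m, int r) {
    int step = r - (x % r);          // hash2(x): fixed increment, never 0
    return (x % m + i * step) % m;   // h_i(x) = (hash(x) + i*hash2(x)) mod m
}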
Rehashing
Table size: M > N.
For a small load factor the performance is much better than for N/M close to one.
Best choice: N/M = 0.5.
When N/M > 0.75: rehash.
Rehashing
Build a second table twice as large as the original and rehash all the keys of the original table into it.
An expensive operation, with running time O(N).
However, once done, the new hash table will have good performance.
Extendible Hashing
 external storage
 N records in total to store
 M records fit in one disk block
No more than two disk blocks are examined per search.
Extendible Hashing
Idea:
• Keys are grouped according to the first m bits in their code.
• Each group is stored in one disk block.
• If some block becomes full, its group is split into two, and m+1 bits are considered to determine the location of a record.
Example
4 disk blocks, each can contain 3 records.
4 groups of keys according to the first two bits:
directory: 00       01       10       11
           00010    01001    10001    11000
           00100    01010    10100    11010
                    01100
Example (cont.)
New key to be inserted: 01011. Block 2 is full, so we start considering 3 bits:
directory: 000/001    010      011      100/101    110/111
           00010      01001    01100    10001      11000
           00100      01010             10100      11010
                      01011
(entries joined by "/" still point to the same block)
Extendible Hashing
Size of the directory: 2^D
2^D = O( N^(1 + 1/M) / M )
D - the number of bits considered
N - the number of records
M - the number of records in a disk block
Conclusion 1
Hashing is a search method,
used when
 sorting is not needed
 access time is the primary
concern
Conclusion 2
Time-space trade-off:
 With no memory limitations, use the key as a memory address: minimum amount of time.
 With no time limitations, use sequential search: minimum amount of memory.
Hashing gives a balance between these two extremes - a way to use a reasonable amount of both memory and time.
Conclusion 3
To choose a good hash function is a
“black art”.
The choice depends on the
nature of keys and the
distribution of the numbers
corresponding to the keys.
Conclusion 4
Best course of action:
• separate chaining: if the number of
records is not known in advance
• open addressing: if the number of the
records can be predicted and there is
enough memory available
Chapter 6:
Priority Queues
Priority Queues
Binary Heaps
Priority Queues
The model
Implementations
Binary heaps
Basic operations
The Model
A priority queue is a queue where:
 Requests are inserted in the order of arrival
 The request with highest priority is processed first (deleted from the queue)
 The priority is indicated by a number; the lower the number, the higher the priority.
Implementations
Linked list:
 Insert at the beginning: O(1)
 Find the minimum: O(N)
Binary search tree (BST):
 Insert: O(logN)
 Find minimum: O(logN)
Implementations
Binary heap:
Better than a BST because it does not need links (array implementation).
 Insert: O(logN)
 Find minimum: O(1)
Finding the minimal element takes constant time; however, after deleting it the heap structure has to be adjusted, and this requires O(logN) time.
Binary Heap
Heap-structure property:
A complete binary tree - each node has two children, except for the nodes on the last two levels; the nodes at the last level do not have children.
New nodes are inserted at the last level from left to right.
Heap-order property:
Each node has a higher priority than its children.
Binary Heap
[Figure: the heap 6; 10, 12; 15, 17, 18, 23; 20, 19, 34. The next node will be inserted as the right child of 17.]
Binary heap implementation with an array
Root: A(1)
Left child of A(i): A(2i)
Right child of A(i): A(2i+1)
Parent of A(i): A([i/2])
The smallest element is always at the root, so the access time to the element with the highest priority is constant: O(1).
Example
6 10 12 15 17 18 23 20 19 34
Consider 17:
 position in the array: 5
 parent 10 is at position [5/2] = 2
 left child is at position 5*2 = 10 (this is 34)
 right child would be at position 2*5 + 1 = 11 (empty)
Problems
Problem 1: Reconstruct the binary heap from the array
2 8 10 16 17 18 23 20 21 30
Problems
Problem 2: Give the array representation for the heap
[Figure: root 3; next level 10, 16; then 13, 12, 18, 23; last level 15, 19, 30, 14.]
Basic Operations
 Insert a node - percolate up
 Delete a node - percolate down
 Decrease key, increase key, remove key
 Build the heap
Percolate Up - Insert a Node
A hole is created at the bottom of the tree, in the next available position.
[Figure: the heap 6; 10, 12; 15, 17, 18, 23; 20, 19, 34 with a new hole at the next position.]
Percolate Up
Insert 20
[Figure: the heap after 20 is inserted in the next available position.]
Percolate Up
Insert 16
[Figure: a hole is made at the next position; the parent 17 is larger, so it slides down into the hole.]
Percolate Up
[Figure: 16 settles into its final position.]
Complexity of insertion: O(logN)
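A Java sketch of insert with percolate-up, using the 1-based array layout described earlier (fields are illustrative; capacity checks omitted):

int[] heap;        // heap[1] is the root; the parent of i is i/2
int currentSize;

void insert(int x) {
    int hole = ++currentSize;                 // hole at the next available position
    while (hole > 1 && x < heap[hole / 2]) {  // parent has lower priority
        heap[hole] = heap[hole / 2];          // slide the parent down into the hole
        hole /= 2;                            // move the hole up
    }
    heap[hole] = x;                           // at most logN steps: O(logN)
}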
Percolate Down - Delete a Node
[Figure: a min-heap whose root (the element with the highest priority) is about to be deleted, leaving a hole at the root.]
Percolate Down - the wrong way
[Figure: repeatedly promoting the smaller child moves the hole down, but the hole ends up at an internal position.]
The empty hole violates the heap-structure property.
Percolate Down
The last element is 20; the hole is at the root.
[Figure: we try to insert 20 into the hole by percolating the hole down.]
Percolate Down
[Figure: the hole moves down level by level - at each step the smaller child slides up - until 20 can be placed without violating the heap order.]
Complexity of deletion: O(logN)
Other Heap Operations
1. DecreaseKey(p,d)
   Increase the priority of element p in the heap by a positive value d:
   percolate up.
2. IncreaseKey(p,d)
   Decrease the priority of element p in the heap by a positive value d:
   percolate down.
Other Heap Operations
3. Remove(p)
   a. Assign the highest priority to p: percolate p up to the root.
   b. Delete the element in the root and fill the hole by percolating down, trying to insert the last element in the queue.
4. BuildHeap
   Input: N elements.
   Place them into an empty heap through successive inserts; the worst-case running time is O(NlogN).
Build Heap - O(N)
Given an array of elements to be inserted in the heap, treat the array as a heap with the order property violated, and then do operations to fix the order property.
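A Java sketch of this procedure (min-heap on the same 1-based array layout; names illustrative):

void percolateDown(int hole) {
    int tmp = heap[hole];
    while (hole * 2 <= currentSize) {
        int child = hole * 2;                            // left child
        if (child != currentSize && heap[child + 1] < heap[child])
            child++;                                     // take the smaller child
        if (heap[child] < tmp) heap[hole] = heap[child]; // slide the child up
        else break;
        hole = child;
    }
    heap[hole] = tmp;
}

void buildHeap() {
    for (int i = currentSize / 2; i > 0; i--)
        percolateDown(i);    // fix the order property bottom-up: O(N) total
}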
Example:
150 80 40 30 10 70 110 100 20 90 60 50 120 140 130
[Figure: the array viewed as a complete binary tree with root 150.]
Example (cont)
[Figure: after processing height 1 -
150; 80, 40; 20, 10, 50, 110; 100, 30, 90, 60, 70, 120, 140, 130]
Example (cont)
[Figure: after processing height 2 -
150; 10, 40; 20, 60, 50, 110; 100, 30, 90, 80, 70, 120, 140, 130]
Example (cont)
[Figure: after processing height 3 -
10; 20, 40; 30, 60, 50, 110; 100, 150, 90, 80, 70, 120, 140, 130]
Theorem
For a perfect binary tree of height h containing N = 2^(h+1) - 1 nodes, the sum S of the heights of the nodes is
S = 2^(h+1) - 1 - (h + 1) = O(N)
Proof
The tree has 1 node at height h, 2 nodes at height h-1, 4 nodes at height h-2, etc.:
1 (2^0) node  at height h
2 (2^1) nodes at height h-1
4 (2^2) nodes at height h-2
8 (2^3) nodes at height h-3
…
2^h nodes at height 0
S = Σ 2^i (h - i), for i = 0 to h
S  = h + 2(h-1) + 4(h-2) + 8(h-3) + … + 2^(h-2)·2 + 2^(h-1)·1      (1)
2S = 2h + 4(h-1) + 8(h-2) + 16(h-3) + … + 2^(h-1)·2 + 2^h·1        (2)
Subtract (1) from (2):
S = -h + 2 + 4 + 8 + … + 2^h
  = -(h + 1) + (2^(h+1) - 1)
  = (2^(h+1) - 1) - (h + 1)
Proof (cont.)
Note:
1 + 2 + 4 + 8 + … + 2^h = 2^(h+1) - 1
Hence S = (2^(h+1) - 1) - (h + 1),
and the complexity of building a heap with N nodes is linear: O(N).
Chapter 7:
Sorting Algorithms
Insertion Sort
Sorting Algorithms
Insertion Sort
Shell Sort
Heap Sort
Merge Sort
Quick Sort
Assumptions
 Elements to sort are placed in arrays of length N
 Elements can be compared
 Sorting can be performed in main memory
Complexity
Simple sorting algorithms: O(N^2)
Shellsort: o(N^2)
Advanced sorting algorithms: O(NlogN)
In general, sorting by comparison requires Ω(NlogN).
Insertion Sort
PRE: array of N elements (from 0 to N-1)
POST: array sorted
1. An array of one element is sorted.
2. Assume that the first p elements are sorted.
   For j = p to N-1: take the j-th element and find a place for it among the first j sorted elements.
Insertion Sort
int j, p;
Comparable tmp;
for ( p = 1; p < N; p++ )
{
    tmp = a[p];
    for ( j = p; j > 0 && tmp < a[j-1]; j-- )
        a[j] = a[j-1];
    a[j] = tmp;
}
Analysis of the Insertion Sort
Inserting the N-th element: at most N-1 comparisons and N-1 movements.
Inserting the (N-1)-st element: at most N-2 comparisons and N-2 movements.
…
Inserting the 2nd element: 1 comparison and 1 movement.
Total: 2(1 + 2 + … + (N-1)) = 2·N(N-1)/2 = N(N-1) = Θ(N^2)
Almost sorted array: O(N).
Average complexity: Θ(N^2).
A lower bound for simple sorting algorithms
• An inversion: an ordered pair (Ai, Aj) such that i < j but Ai > Aj.
Example: 10, 6, 7, 15, 3, 1
Inversions are:
(10,6), (10,7), (10,3), (10,1), (6,3), (6,1), (7,3), (7,1), (15,3), (15,1), (3,1)
Swapping
Swapping adjacent elements that are out of order removes one inversion.
A sorted array has no inversions.
Sorting an array that contains i inversions requires at least i swaps of adjacent elements.
How many inversions are there in an average unsorted array?
Theorems
Theorem 1: The average number of inversions in an array of N distinct elements is N(N-1)/4.
Theorem 2: Any algorithm that sorts by exchanging adjacent elements requires Ω(N^2) time on average.
For a sorting algorithm to run in less than quadratic time, it must do something other than swap adjacent elements.
Chapter 7:
Sorting Algorithms
Shell Sort
Shell Sort
Improves on insertion sort.
Compares elements far apart, then less far apart, and finally compares adjacent elements (as an insertion sort).
Idea
 Arrange the data sequence in a two-dimensional array
 Sort the columns of the array
 Repeat the process, each time with a smaller number of columns
Example
3 7 9 0 5 1 6 8 4 2 0 6 1 5 7 3 4 9 8 2
The sequence is arranged in an array with 7 columns (left); then the columns are sorted (right):
3 7 9 0 5 1 6      3 3 2 0 5 1 5
8 4 2 0 6 1 5  =>  7 4 4 0 6 1 6
7 3 4 9 8 2        8 7 9 9 8 2
Example (cont)
Then 3 columns:
3 3 2      0 0 1
0 5 1      1 2 2
5 7 4      3 3 4
4 0 6  =>  4 5 6
1 6 8      5 6 8
7 9 9      7 7 9
8 2        8 9
One column in the last step - it is only a 6, an 8 and a 9 that have to move a little bit to their correct positions.
Implementation
• a one-dimensional array, indexed appropriately
• an increment sequence to determine how far apart the elements to be sorted are
Increment sequence
Determines how far apart the elements to be sorted are:
h1, h2, ..., ht with h1 = 1.
An hk-sorted array: all elements spaced a distance hk apart are sorted relative to each other.
Correctness of the algorithm
Shellsort only works because an array that is hk-sorted remains hk-sorted when it is h(k-1)-sorted: subsequent sorts do not undo the work done by previous phases.
The last step (with h = 1) is an insertion sort on the whole array.
Java code for Shell sort
int j, p, gap;
Comparable tmp;
for ( gap = N/2; gap > 0; gap = gap/2 )
    for ( p = gap; p < N; p++ )
    {
        tmp = a[p];
        for ( j = p; j >= gap && tmp < a[j-gap]; j = j - gap )
            a[j] = a[j-gap];
        a[j] = tmp;
    }
Increment sequences
1. Shell's original sequence: N/2, N/4, ..., 1 (repeatedly divide by 2)
2. Hibbard's increments: 1, 3, 7, ..., 2^k - 1
3. Knuth's increments: 1, 4, 13, ..., (3^k - 1) / 2
4. Sedgewick's increments: 1, 5, 19, 41, 109, ...
   (9·4^k - 9·2^k + 1 or 4^k - 3·2^k + 1)
Analysis
Shellsort's worst-case performance using Hibbard's increments is Θ(N^(3/2)).
The average performance is thought to be about O(N^(5/4)).
The exact complexity of this algorithm is still being debated.
For mid-sized data it does nearly as well as, if not better than, the faster O(NlogN) sorts.
Chapter 7:
Sorting Algorithms
Merge Sort
Merge Sort
Basic Idea
Example
Analysis
Idea
Two sorted arrays can be merged in linear time, with at most N comparisons.
Given an array to be sorted, consider separately its left half and its right half, sort them, and then merge them.
Characteristics
 A recursive algorithm.
 Runs in O(NlogN) worst-case time.
Where is the recursion?
 Each half is an array that can be sorted using the same algorithm: divide into two, sort separately the left and the right halves, and then merge them.
Example
[Figure: an example array repeatedly split into halves, the halves sorted and then merged.]
Merge Sort Code
void merge_sort ( int [ ] a, int left, int right )
{
    if (left < right)
    {
        int center = (left + right) / 2;
        merge_sort(a, left, center);
        merge_sort(a, center + 1, right);
        merge(a, left, center + 1, right);
    }
}
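The merge routine is not shown on the slide; here is one possible implementation matching the call merge(a, left, center + 1, right):

static void merge(int[] a, int left, int rightStart, int rightEnd) {
    int leftEnd = rightStart - 1;
    int[] tmp = new int[rightEnd - left + 1];
    int i = left, j = rightStart, k = 0;
    while (i <= leftEnd && j <= rightEnd)             // main comparison loop
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= leftEnd)  tmp[k++] = a[i++];          // copy the rest of the left half
    while (j <= rightEnd) tmp[k++] = a[j++];          // copy the rest of the right half
    for (k = 0; k < tmp.length; k++)                  // copy back into a
        a[left + k] = tmp[k];
}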
Analysis of Merge Sort
Assumption: N is a power of two.
For N = 1 the time is constant (denoted by 1).
Otherwise:
time to mergesort N elements =
2 × (time to mergesort N/2 elements) + time to merge two arrays of N/2 elements each
Recurrence Relation
The time to merge two arrays of N/2 elements each is linear, i.e. O(N).
Thus we have:
(a) T(1) = 1
(b) T(N) = 2T(N/2) + N
Solving the Recurrence Relation
T(N) = 2T(N/2) + N; divide by N:
(1) T(N)/N = T(N/2)/(N/2) + 1
Telescoping: N is a power of two, so we can write
(2) T(N/2)/(N/2) = T(N/4)/(N/4) + 1
(3) T(N/4)/(N/4) = T(N/8)/(N/8) + 1
…
T(2)/2 = T(1)/1 + 1
Adding the Equations
The sum of the left-hand sides equals the sum of the right-hand sides:
T(N)/N + T(N/2)/(N/2) + T(N/4)/(N/4) + … + T(2)/2 =
T(N/2)/(N/2) + T(N/4)/(N/4) + … + T(2)/2 + T(1)/1 + logN
(logN is the sum of the 1's on the right-hand sides)
Crossing Equal Terms, Final Formula
After crossing out the equal terms we get:
T(N)/N = T(1)/1 + logN
T(1) is 1, hence we obtain
T(N) = N + NlogN = Θ(NlogN)
Hence the complexity of the Merge Sort algorithm is Θ(NlogN).
Chapter 7:
Sorting Algorithms
Quick Sort
Quick Sort
Basic Idea
Code
Analysis
Advantages and Disadvantages
Applications
Comparison with Heap sort and Merge sort
Basic Idea
 Pick one element in the array, which will be the pivot.
 Make one pass through the array, called a partition step, re-arranging the entries so that:
   • entries smaller than the pivot are to the left of the pivot
   • entries larger than the pivot are to the right
Basic Idea
 Recursively apply quicksort to the part of the array that is to the left of the pivot, and to the part on its right.
 No merge step is needed: at the end all the elements are in the proper order.
Choosing the Pivot
Some fixed element: e.g. the first, the last, the one in the middle.
A bad choice - it may turn out to be the smallest or the largest element; then one of the partitions will be empty.
Randomly chosen (by a random generator) - still a bad choice.
Choosing the Pivot
The median of the array
(if the array has N numbers, the
median is the ⌈N/2⌉-th largest number).
This is expensive to compute - it increases
the complexity.
327
Choosing the Pivot
The median-of-three choice:
take the first, the last and the middle
element.
Choose the median of these three
elements.
328
Find the Pivot – Java Code
int median3(int[] a, int left, int right)
{
    int center = (left + right) / 2;
    if (a[left] > a[center]) swap(a, left, center);
    if (a[center] > a[right]) swap(a, center, right);
    if (a[left] > a[center]) swap(a, left, center);
    swap(a, center, right - 1);   // hide the pivot at position right-1
    return a[right - 1];
}
329
Quick Sort – Java Code
if (left + 10 <= right)
{ /* do quick sort (next slide) */ }
else
    insertionSort(a, left, right);
330
Quick Sort – Java Code
{
    int pivot = median3(a, left, right);   // pivot, parked at a[right-1]
    int i = left, j = right - 1;
    for ( ; ; ) {
        while (a[++i] < pivot) { }
        while (pivot < a[--j]) { }
        if (i < j) swap(a, i, j);
        else break;
    }
    swap(a, i, right - 1);                 // restore the pivot
    quickSort(a, left, i - 1);
    quickSort(a, i + 1, right);
}
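The code above assumes helpers that the slides leave out; minimal sketches (names illustrative), with insertionSort being an ordinary insertion sort over a[left..right]:
// Java cannot swap primitive array elements passed by value,
// so the helper takes the array and two indices.
void swap(int[] a, int i, int j) {
    int tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
}

// Top-level driver: sort the whole array.
void quickSort(int[] a) {
    quickSort(a, 0, a.length - 1);
}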
331
Implementation Notes
Compare the two versions:
A. while (a[++i] < pivot) { }
while (pivot < a[--j]) { }
if (i < j) swap(a, i, j);
else break;
B. while (a[i] < pivot) { i++; }
while (pivot < a[j]) { j--; }
if (i < j) swap(a, i, j);
else break;
332
Implementation Notes
If we have an array of equal
elements, the second version will
never increment i or decrement j,
and will swap the same pair forever:
i and j will never cross.
333
Complexity of Quick Sort
Average-case O(NlogN)
Worst Case: O(N^2)
This happens when the pivot is the
smallest (or the largest) element.
Then one of the partitions is empty,
and we repeat recursively the
procedure for N-1 elements.
334
Complexity of Quick Sort
Best-case O(NlogN)
The pivot is the median of the array,
the left and the right parts have same
size.
There are logN levels of partitioning, and
at each level we do N
comparisons (and not more than N/2
swaps). Hence the complexity is
O(NlogN)
335
Analysis
T(N) = T(i) + T(N - i -1) + cN
The time to sort the file is equal to
 the time to sort the left partition
with i elements, plus
 the time to sort the right partition with
N-i-1 elements, plus
 the time to build the partitions.
336
Worst-Case Analysis
The pivot is the smallest (or the largest) element
T(N) = T(N-1) + cN, N > 1
Telescoping:
T(N-1) = T(N-2) + c(N-1)
T(N-2) = T(N-3) + c(N-2)
T(N-3) = T(N-4) + c(N-3)
…………...
T(2) = T(1) + c.2
337
Worst-Case Analysis
T(N) + T(N-1) + T(N-2) + … + T(2) =
= T(N-1) + T(N-2) + … + T(2) + T(1) +
c·N + c·(N-1) + c·(N-2) + … + c·2
T(N) = T(1) +
c × (the sum of 2 through N)
= T(1) + c(N(N+1)/2 - 1) = O(N^2)
338
Best-Case Analysis
The pivot is in the middle
T(N) = 2T(N/2) + cN
Divide by N:
T(N) / N = T(N/2) / (N/2) + c
339
Best-Case Analysis
Telescoping:
T(N) / N = T(N/2) / (N/2) + c
T(N/2) / (N/2) = T(N/4) / (N/4) + c
T(N/4) / (N/4) = T(N/8) / (N/8) + c
……
T(2) / 2 = T(1) / (1) + c
340
Best-Case Analysis
Add all equations:
T(N) / N + T(N/2) / (N/2) + T(N/4) / (N/4)
+ …. + T(2) / 2 =
= T(N/2) / (N/2) + T(N/4) / (N/4) + … +
T(1) / 1 + c·logN
After crossing the equal terms:
T(N)/N = T(1)/1 + c·logN
T(N) = N + N·c·logN = O(NlogN)
341
Advantages and Disadvantages
 Advantages:
 One of the fastest algorithms on average
 Does not need additional memory (the
sorting takes place in the array - this is
called in-place processing )
 Disadvantages:
 The worst-case complexity is O(N^2)
342
Applications
Commercial applications
 QuickSort generally runs fast
 No additional memory
 The above advantages compensate for
the rare occasions when it runs in O(N^2)
343
Applications
Warning: never use quicksort in applications
which require a guaranteed response time,
unless you plan for the worst-case response time:
 Life-critical
(medical monitoring,
life support in aircraft, spacecraft)
 Mission-critical
(industrial monitoring and control in handling
dangerous materials, control for aircraft, defense,
etc.)
344
Comparison with Heap Sort
 O(NlogN) complexity
 quicksort usually runs faster (it does
not maintain a heap tree)
 but the speed of quicksort is not
guaranteed
345
Comparison with Merge Sort
 Merge sort guarantees O(NlogN) time
 Merge sort requires additional memory
with size N
 Usually Merge sort is not used for main
memory sorting, only for external memory
sorting
Chapter 9: Graphs
Topological Sort
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
347
Topological Sort
Problem
Definitions
Basic Algorithm
Example
Complexity
Improved Algorithm
Complexity
348
Problem
P: Assemble pool table
F: Carpet the floor
W: Panel the walls
C: Install ceiling
How would you order these
activities?
349
Problem Graph
[Figure: activity graph with start node S, end node E, and task nodes F, P, W, C]
S – start F – floor P – pool table
E – end W – walls C - ceiling
Some possible
orderings:
C W F P
W C F P
F P W C
C F P W
Important: P must be after F
350
Topological Sort
RULE:
If there is a path from u to v,
then v appears after u in the ordering.
351
Graphs’ Characteristics
directed
otherwise an edge (u,v) would mean a path both from u to v
and from v to u, and the two could not be ordered.
acyclic
otherwise u and v on a cycle could not be ordered:
u would precede v, and v would precede u.
352
The ordering may not be unique
[Figure: V1 → V2, V1 → V3, V2 → V4, V3 → V4]
legal orderings
V1, V2, V3, V4
V1, V3, V2, V4
353
Some Definitions
Degree of a vertex U:
the number of edges (U,V) - outgoing edges
Indegree of a vertex U:
the number of edges (V,U) - incoming edges
The algorithm for topological sort uses
"indegrees" of vertices.
354
Basic Algorithm
1. Compute the indegrees of all vertices
2. Find a vertex U with indegree 0 and print it
(store it in the ordering)
If there is no such vertex then there is a
cycle and the vertices cannot be ordered. Stop.
3. Remove U and all its edges (U,V) from the graph.
4. Update the indegrees of the remaining vertices.
Repeat steps 2 through 4 while there are vertices
to be processed.
355
Example
V1 V2
V3 V4 V5
Indegrees
V1 0
V2 1
V3 2
V4 2
V5 1
First to be sorted: V1
New indegrees:
V2 0
V3 1
V4 1
V5 2
Possible sorts: V1, V2, V4, V3, V5; V1, V2, V4, V5, V3
356
Complexity
O(|V|^2), where |V| is the number of vertices.
To find a vertex of indegree 0 we scan all the
vertices -
|V| operations.
We do this for all |V| vertices: |V|^2 operations
357
Improved Algorithm
 Find a vertex of indegree 0,
 then scan only those vertices whose
indegrees have been updated to 0.
358
Improved Algorithm
1. Compute the indegrees of all vertices
2. Store all vertices with indegree 0 in a queue.
3. Get a vertex U from the queue and print it (store it in the ordering).
4. For all edges (U,V) update the indegree of V, and
put V in the queue if the updated indegree is 0.
Repeat steps 3 and 4 while the queue is not empty.
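A sketch of the improved algorithm in Java, assuming java.util collections and an adjacency-list representation List<Integer>[] adj, where adj[u] lists the vertices v with an edge (u,v); the names are illustrative:
// Returns the topological ordering, or null if the graph has a cycle.
int[] topSort(List<Integer>[] adj) {
    int n = adj.length;
    int[] indegree = new int[n];
    for (int u = 0; u < n; u++)                 // 1. compute indegrees
        for (int v : adj[u]) indegree[v]++;
    Queue<Integer> q = new LinkedList<>();
    for (int v = 0; v < n; v++)                 // 2. enqueue indegree-0 vertices
        if (indegree[v] == 0) q.add(v);
    int[] order = new int[n];
    int count = 0;
    while (!q.isEmpty()) {
        int u = q.remove();                     // 3. get a vertex
        order[count++] = u;
        for (int v : adj[u])                    // 4. update indegrees
            if (--indegree[v] == 0) q.add(v);
    }
    return (count == n) ? order : null;         // null: a cycle was found
}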
359
Complexity
Operations needed to compute the indegrees:
Adjacency lists: O(|E|)
Matrix: O(|V|^2)
Complexity of the improved algorithm:
Adjacency lists: O(|E| + |V|),
Matrix representation: O(|V|^2)
Note that if the graph is complete, |E| = O(|V|^2).
|V| - number of vertices,
|E| - number of edges.
Chapter 9: Graphs
Shortest Paths
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
361
Shortest Paths
Shortest Paths in Unweighted
Graphs
Shortest Paths in Weighted
Graphs - Dijkstra’s Algorithm
Animation
362
Unweighted Directed Graphs
V1 V2
V3 V4
V5
V6 V7
What is the shortest path from V3 to V5?
363
Problem Data
The problem: Given a source vertex s, find the
shortest path to all other vertices.
Data structures needed:
Graph representation:
Adjacency lists / adjacency matrix
Distance table:
distances from source vertex
paths from source vertex
364
Example
Adjacency lists :
V1: V2, V4
V2: V4, V5
V3: V1, V6
V4: V3, V5, V6, V7
V5: V7
V6: -
V7: V6
Let s = V3, stored in a queue
Initialized distance table:
distance parent
V1 -1 -1
V2 -1 -1
V3 0 0
V4 -1 -1
V5 -1 -1
V6 -1 -1
V7 -1 -1
365
Breadth-first search in graphs
Take a vertex and examine all
adjacent vertices.
Do the same with each of the
adjacent vertices .
366
Algorithm
1. Store s in a queue, and initialize
distance = 0 in the Distance Table
2. While there are vertices in the queue:
Read a vertex v from the queue
For all adjacent vertices w :
If distance = -1 (not computed)
Distance = (distance to v) + 1
Parent = v
Append w to the queue
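A sketch of this algorithm in Java, assuming java.util and the same adjacency-list representation as above; dist and parent together play the role of the Distance Table (names illustrative):
// BFS shortest paths from source s in an unweighted graph.
void bfs(List<Integer>[] adj, int s, int[] dist, int[] parent) {
    Arrays.fill(dist, -1);        // -1 means "not computed"
    Arrays.fill(parent, -1);
    dist[s] = 0;
    Queue<Integer> q = new LinkedList<>();
    q.add(s);
    while (!q.isEmpty()) {
        int v = q.remove();
        for (int w : adj[v]) {
            if (dist[w] == -1) {              // not computed yet
                dist[w] = dist[v] + 1;
                parent[w] = v;                // shortest-path tree edge
                q.add(w);
            }
        }
    }
}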
367
Complexity
Matrix representation: O(|V|^2)
Adjacency lists: O(|E| + |V|)
We examine all edges (O(|E|)),
and we store in the queue each
vertex only once (O(|V|)).
368
Weighted Directed Graphs
V1 V2
V3 V4
V5
V6 V7
What is the shortest distance from V3 to V7?
369
Comparison
Similar to the algorithm for unweighted graphs
Differences:
- weights are included in the graph representation
- priority queue : the node with the smallest
distance is chosen for processing
- distance is no longer the number of edges;
instead it is the sum of weights
- the Distance table is updated if the newly computed
distance is smaller.
370
Algorithm
1. Store s in a priority queue with distance = 0
2. While there are vertices in the queue
DeleteMin a vertex v from the queue
For all adjacent vertices w:
Compute new distance
Store in / update Distance table
Insert/update in priority queue
371
Processing Adjacent Nodes
For all adjacent vertices w:
Compute new distance = (distance to v) + (d(v,w))
If distance = -1 (not computed)
store new distance in table
path = v
Insert w in priority queue
If old distance > new distance
Update old_distance = new_distance
Update path = v
Update priority in priority queue
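A sketch in Java, assuming java.util and adjacency lists of {w, weight} pairs. java.util.PriorityQueue has no decrease-key, so the "update priority" step is approximated by re-inserting the vertex and skipping stale entries, a common substitute for the scheme on the slide (all names illustrative):
// Dijkstra from source s; adj[v] holds int[]{w, weight} entries.
int[] dijkstra(List<int[]>[] adj, int s) {
    int n = adj.length;
    int[] dist = new int[n];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[s] = 0;
    PriorityQueue<int[]> pq = new PriorityQueue<>((x, y) -> x[1] - y[1]);
    pq.add(new int[]{s, 0});
    while (!pq.isEmpty()) {
        int[] top = pq.remove();              // DeleteMin
        int v = top[0];
        if (top[1] > dist[v]) continue;       // stale entry, skip
        for (int[] e : adj[v]) {
            int w = e[0], d = dist[v] + e[1]; // new distance
            if (d < dist[w]) {                // smaller: update table and queue
                dist[w] = d;
                pq.add(new int[]{w, d});
            }
        }
    }
    return dist;
}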
372
Complexity
O(E logV + V logV) = O((E + V) logV)
Each vertex is stored only once in the queue – O(V)
The DeleteMin operations take O(V logV) in total
Updating the priority queue –
search and insert: O(logV),
performed at most once for each edge: O(E logV)
Chapter 9: Graphs
Spanning Trees
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
374
Spanning Trees
Spanning Trees in Unweighted
Graphs
Minimal Spanning Trees in Weighted
Graphs - Prim’s Algorithm
Minimal Spanning Trees in Weighted
Graphs - Kruskal’s Algorithm
Animation
375
Definitions
Spanning tree: a tree that contains all vertices
in the graph.
Number of nodes: |V|
Number of edges: |V|-1
376
Spanning trees for unweighted
graphs - data structures
A table (an array) T
size = number of vertices,
Tv= parent of vertex v
Adjacency lists
A queue of vertices to be processed
377
Algorithm - initialization
1. Choose a vertex S and store it in a queue,
set
a counter = 0 (counts the processed nodes),
and Ts = 0 (to indicate the root),
Ti = -1, i ≠ s (to indicate vertex not
processed)
378
Algorithm – basic loop
2. While queue not empty and counter < |V| - 1:
Read a vertex V from the queue
For all adjacent vertices U:
If Tu = -1 (not processed)
Tu ← V
counter ← counter + 1
store U in the queue
379
Algorithm
– results and complexity
Result:
Root: S, such that Ts = 0
Edges in the tree: (Tv , V)
Complexity: O(|E| + |V|) - we process
all edges and all nodes
380
Example
1
2 3
4
5
Table
0 // 1 is the root
1 // edge (1,2)
2 // edge (2,3)
1 // edge (1,4)
2 // edge (2,5)
Edge format: (Tv,V)
381
Minimum Spanning Tree -
Prim's algorithm
Given: Weighted graph.
Find a spanning tree with the minimal sum
of the weights.
Similar to shortest paths in weighted graphs.
Difference: we record the weight of the
current edge, not the length of the path.
382
Data Structures
A table : rows = number of vertices,
three columns:
T(v,1) = weight of the edge from
parent to vertex V
T(v,2) = parent of vertex V
T(v,3) = True, if vertex V is fixed in the tree,
False otherwise
383
Data Structures (cont)
Adjacency lists
A priority queue of vertices to be processed.
Priority of each vertex is determined by
the weight of edge that links the vertex
to its parent.
The priority may change if we change
the parent, provided the vertex is not yet
fixed in the tree.
384
Prim’s Algorithm
Initialization
For all V, set
first column to –1 (not included in the tree)
second column to 0 (no parent)
third column to False (not fixed in the tree)
Select a vertex S, set T(s,1) = 0 (root: no parent)
Store S in a priority queue with priority 0.
385
While (queue not empty) loop
1. DeleteMin V from the queue, set T(v,3) = True
2. For all vertices W adjacent to V:
If T(w,3) = True, do nothing – the vertex is fixed
If T(w,1) = -1 (i.e. vertex not included)
T(w,1) ← weight of edge (v,w)
T(w,2) ← v (the parent of w)
insert w in the queue with
priority = weight of (v,w)
386
While (queue not empty) loop
(cont)
If T(w,1) > weight of (v,w)
T(w,1) ← weight of edge (v,w)
T(w,2) ← v (the parent of w)
update the priority of w in the queue:
remove, and insert with new priority
= weight of (v,w)
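The same queue pattern gives a sketch of Prim's algorithm in Java; the arrays weight, parent and fixed mirror the three table columns, and re-insertion again stands in for the priority update (assumptions as in the Dijkstra sketch earlier):
// Prim's MST from start vertex s; adj[v] holds int[]{w, weight} entries.
int[] prim(List<int[]>[] adj, int s) {
    int n = adj.length;
    int[] weight = new int[n], parent = new int[n];
    boolean[] fixed = new boolean[n];
    Arrays.fill(weight, Integer.MAX_VALUE);    // "not included in the tree"
    weight[s] = 0;
    PriorityQueue<int[]> pq = new PriorityQueue<>((x, y) -> x[1] - y[1]);
    pq.add(new int[]{s, 0});
    while (!pq.isEmpty()) {
        int v = pq.remove()[0];                // DeleteMin
        if (fixed[v]) continue;                // already fixed in the tree
        fixed[v] = true;
        for (int[] e : adj[v]) {
            int w = e[0], wt = e[1];
            if (!fixed[w] && wt < weight[w]) { // cheaper edge into w
                weight[w] = wt;
                parent[w] = v;
                pq.add(new int[]{w, wt});
            }
        }
    }
    return parent;                             // tree edges: (parent[v], v)
}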
387
Results and Complexity
At the end of the algorithm, the tree
would be represented in the table with its
edges
{(v, T(v,2) ) | v = 1, 2, …|V|}
Complexity: O(|E|log(|V|))
388
Kruskal's Algorithm
The algorithm works with :
set of edges,
tree forests,
each vertex belongs to only one
tree in the forest.
389
The Algorithm
1. Build |V| trees of one vertex only - each vertex
is a tree of its own. Store edges in priority queue
2. Choose an edge (u,v) with minimum weight
if u and v belong to one and the same tree,
do nothing
if u and v belong to different trees,
link the trees by the edge (u,v)
3. Perform step 2 until all trees are combined into
one tree only
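A sketch in Java with a simple disjoint-set forest standing in for the union/find structure; sorting the edge list replaces the priority queue, and all names and representations are illustrative:
// Kruskal's MST; each edge is int[]{u, v, weight}.
List<int[]> kruskal(int n, List<int[]> edges) {
    int[] parent = new int[n];
    for (int i = 0; i < n; i++) parent[i] = i; // |V| one-vertex trees
    edges.sort((x, y) -> x[2] - y[2]);         // cheapest edges first
    List<int[]> mst = new ArrayList<>();
    for (int[] e : edges) {
        int ru = find(parent, e[0]), rv = find(parent, e[1]);
        if (ru != rv) {                        // different trees: link them
            parent[ru] = rv;                   // union
            mst.add(e);
        }                                      // same tree: skip the edge
    }
    return mst;
}

// Find with path halving.
int find(int[] parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}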
390
Example
[Figure: weighted graph on vertices 1-5; edge weights as listed below]
Edges in Priority Queue:
(1,2) 1
(3,5) 2
(2,4) 3
(1,3) 4
(4,5) 4
(2,5) 5
(1,5) 6
391
Example (cont)
[Figure: forest of 5 single-vertex trees: 1 2 3 4 5]
Edges in Priority Queue:
(1,2) 1
(3,5) 2
(2,4) 3
(1,3) 4
(4,5) 4
(2,5) 5
(1,5) 6
Process edge (1,2): vertices 1 and 2 are linked.
Process edge (3,5): vertices 3 and 5 are linked.
392
Example (cont)
Edges in Priority Queue:
(2,4) 3
(1,3) 4
(4,5) 4
(2,5) 5
(1,5) 6
Process edge (2,4): vertex 4 joins the tree {1,2}.
Process edge (1,3): the trees {1,2,4} and {3,5} are linked.
All trees are combined in one, the algorithm stops.
393
Operations on Disjoint Sets
Comparison (find): are the two vertices in the same tree?
Union: combine two disjoint sets to form one set -
the new tree
Implementation
Union/find operations: the sets are represented
as trees; a sequence of n operations costs O(n log n).
The forest of trees is implemented with an array.
394
Complexity of Kruskal’s
Algorithm
O(|E| log|V|)
Explanation:
The complexity is determined by the heap operations and the
union/find operations
Union/find operations on disjoint sets represented as trees:
tree search operations, with complexity O(log|V|).
DeleteMin in a binary heap with N elements is O(logN);
when performed for all N elements, it is O(NlogN).
395
Complexity (cont)
Kruskal’s algorithm works with a binary heap that
contains all edges: O(|E|log(|E|))
However, a complete graph has
|E| = |V|*(|V|-1)/2, i.e. |E| = O(|V|^2).
Thus for the binary heap we have
O(|E| log|E|) = O(|E| log|V|^2) = O(2|E| log|V|)
= O(|E| log|V|)
396
Complexity (cont)
Since for each edge we perform DeleteMin
operation and union/find operation, the overall
complexity is:
O(|E| (log|V| + log|V|)) =
O(|E| log|V|)
Chapter 9: Graphs
Breadth-First and
Depth-First Search
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
398
Breadth-First and Depth-First
Search
BFS Basic Algorithm
BFS Complexity
DFS Algorithm
DFS Implementation
Relation between BFS and DFS
399
BFS – Basic Idea
Given a graph with N vertices and a selected
vertex A:
for (i = 1;
there are unvisited vertices ; i++)
Visit all unvisited vertices at distance i
(i is the length of the shortest path between A
and currently processed vertices)
Queue-based implementation
400
BFS – Algorithm
BFS algorithm
1. Store source vertex S in a queue and mark as
processed
2. while queue is not empty
Read vertex v from the queue
for all neighbors w:
If w is not processed
Mark w as processed
Append w to the queue
Record the parent of w to be v (necessary only
if we need the shortest-path tree)
401
Example
1 4
2 3 5
6
Adjacency lists
1: 2, 3, 4
2: 1, 3, 6
3: 1, 2, 4, 5, 6
4: 1, 3, 5
5: 3, 4
6: 2, 3
Breadth-first traversal: 1, 2, 3, 4, 6, 5
1: starting node
2, 3, 4 : adjacent to 1
(at distance 1 from node 1)
6 : unvisited adjacent to node 2.
5 : unvisited, adjacent to node 3
The order depends on the order of
the nodes in the adjacency lists
402
Shortest Path Tree
Consider the distance table:
Nodes
A B C D E
Distance to A 0 1 1 2 1
Parent 0 A A C A
The table defines the shortest
path tree, rooted at A.
403
BFS – Complexity
Step 1: read a node from the queue - O(V) times.
Step 2: examine all neighbors, i.e. we examine all edges
of the currently read node.
Undirected graph: 2·E edges to examine.
Hence the complexity of BFS is O(V + 2E) = O(V + E).
404
Depth-First Search
Procedure dfs(s)
mark all vertices in the graph as not reached
invoke scan(s)
Procedure scan(s)
mark and visit s
for each neighbor w of s
if the neighbor is not reached
invoke scan(w)
405
Depth-First Search with Stack
Initialization:
mark all vertices as unvisited,
visit(s)
while the stack is not empty:
pop (v,w)
if w is not visited
add (v,w) to tree T
visit(w)
Procedure visit(v)
mark v as visited
for each edge (v,w)
push (v,w) in the stack
406
Recursive DFS
DepthFirst(Vertex v)
visit(v);
for each neighbor w of v
if (w is not visited)
add edge (v,w) to tree T
DepthFirst(w)
Visit(v)
mark v as visited
407
Example
1 4
2 3 5
6
Adjacency lists
1: 2, 3, 4
2: 6, 3, 1
3: 1, 2, 6, 5, 4
4: 1, 3, 5
5: 3, 4
6: 2, 3
Depth first traversal: 1, 2, 6, 3, 5, 4
the particular order is dependent
on the order of nodes in the
adjacency lists
408
BFS and DFS
bfs(G)
list L = empty
tree T = empty
choose a starting vertex x
visit(x)
while(L nonempty)
remove edge (v,w) from
beginning of L
if w not visited
add (v,w) to T
visit(w)
dfs(G)
list L = empty
tree T = empty
choose a starting vertex x
visit(x)
while(L nonempty)
remove edge (v,w) from
end of L
if w not visited
add (v,w) to T
visit(w)
Visit ( vertex v)
mark v as visited
for each edge (v,w)
add edge (v,w) to end of L
409
Applications of DFS
 Trees: preorder traversal
 Graphs:
 Connectivity
 Biconnectivity – articulation points
 Euler graphs
Chapter 9: Graphs
Applications of
Depth-First Search
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
411
Graph Connectivity
Connectivity
Biconnectivity
Articulation Points and Bridges
Connectivity in Directed Graphs
412
Connectivity
Definition:
An undirected graph is said to be connected
if for any pair of nodes of the graph, the two
nodes are reachable from one another (i.e. there
is a path between them).
If starting from any vertex we can visit all other
vertices, then the graph is connected
413
Biconnectivity
A graph is biconnected, if there are no
vertices whose removal will disconnect the
graph.
A B
C
D
E
A B
C
D
E
biconnected not biconnected
414
Articulation Points
Definition: A vertex whose removal
makes the graph disconnected is called
an articulation point or cut-vertex
A B
C
D
E
C is an articulation point
We can compute articulation points
using depth-first search and a
special numbering of the vertices in
the order of their visiting.
415
Bridges
Definition:
An edge in a graph is called a bridge, if its
removal disconnects the graph.
Any edge in a graph that does not lie on a cycle is a bridge.
Obviously a bridge has at least one articulation point at one of its ends;
however, an articulation point is not necessarily an endpoint of a bridge.
A B
C
D
E
(C,D) and (E,D) are bridges
416
Example 1
A B
C
E
F
D
C is an articulation point, there are no bridges
417
Example 2
A B
C
E
F
D
C is an articulation point, CB is a bridge
418
Example 3
A
B
C
E
F
G
D
B and C are articulation points, BC is a bridge
419
Example 4
A
B C D
E
B and C are articulation points. All edges are bridges
420
Example 5
A
B
C
D E
F
G
Biconnected graph - no articulation points and no bridges
421
Connectivity in Directed
Graphs (I)
Definition: A directed graph is said to be
strongly connected
if for any pair of nodes there is a path
from each one to the other
A B
C
D
E
422
Connectivity in Directed
Graphs (II)
Definition: A directed graph is said to be
unilaterally connected if for any pair of
nodes at least one of the nodes is reachable
from the other
A B
C
D
E
Each strongly connected
graph is also unilaterally
connected.
423
Connectivity in Directed
Graphs (III)
Definition: A directed graph is said to be
weakly connected if the underlying
undirected graph is connected
A B
C
D
E
There is no path
between B and D
Each unilaterally
connected graph is also
weakly connected
Chapter 9: Graphs
Summary
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
425
Summary of Graph Algorithms
Basic Concepts
Topological Sort
Shortest Paths
Spanning Trees
Scheduling Networks
Breadth-First and Depth First Search
Connectivity
426
Basic Concepts
 Basic definitions: vertices and edges
 More definitions: paths, simple paths,
cycles, loops
 Connected and disconnected graphs
 Spanning trees
 Complete graphs
 Weighted graphs and networks
 Graph representation
 Adjacency matrix
 Adjacency lists
427
Topological Sort
RULE: If there is a path from u to v,
then v appears after u in the ordering.
Graphs: directed, acyclic
Degree of a vertex U: the number of outgoing edges
Indegree of a vertex U: the number of incoming edges
The algorithm for topological sort uses "indegrees" of vertices.
Implementation: with a queue
Complexity: Adjacency lists: O(|E| + |V|),
Matrix representation: O(|V|^2)
428
Shortest Path in Unweighted
Graphs
Data Structures needed:
Distance Table
Queue
Implementation: Breadth-First search using a queue
Complexity:
Matrix representation: O(|V|^2)
Adjacency lists: O(|E| + |V|)
429
Algorithm
1. Store s in a queue, and initialize
distance = 0 in the Distance Table
2. While there are vertices in the queue:
Read a vertex v from the queue
For all adjacent vertices w :
If distance = -1 (not computed)
Distance = (distance to v) + 1
Parent = v
Append w to the queue
430
Shortest Path in Weighted
Graphs – Dijkstra’s Algorithm
1. Store s in a priority queue with distance = 0
2. While there are vertices in the queue
DeleteMin a vertex v from the queue
For all adjacent vertices w:
Compute new distance
Store in / update Distance table
Insert/update in priority queue
431
Complexity
O(E logV + V logV) = O((E + V) logV)
Each vertex is stored only once in the queue – O(V)
The DeleteMin operations take O(V logV) in total
Updating the priority queue –
search and insert: O(logV),
performed at most once for each edge: O(E logV)
432
Spanning Trees
Spanning tree: a tree that contains all vertices
in the graph.
Number of nodes: |V|
Number of edges: |V|-1
433
Spanning Trees of Unweighted
Graphs
Breadth-First Search using a queue
Complexity: O(|E| + |V|) - we process all
edges and all nodes
434
Min Spanning Trees of Weighted
Graphs – Prim’s Algorithm
Find a spanning tree with the minimal sum
of the weights.
Similar to shortest paths in weighted graphs.
Difference: we record the weight of the
current edge, not the length of the path.
Implementation: with a Priority Queue
Complexity: O(|E| log (|V|) )
435
Min Spanning Trees of Weighted
Graphs – Kruskal’s Algorithm
The algorithm works with the set of
edges, stored in Priority queue.
The tree is built incrementally starting
with a forest of nodes only and adding
edges as they come out of the priority
queue
Complexity: O(|E| log (|V|))
436
Scheduling Networks
Directed simple acyclic graph
Nodes: events, Source, Sink
Edges: activities (name of the task, duration)
 Find the earliest occurrence time (EOT) of an
event
 Find the earliest completion time (ECT) of an
activity
 Find the slack of an activity: how much the
activity can be delayed without delaying the
project
 Find the critical activities: must be completed on
time in order not to delay the project
437
EOT and ECT
Earliest occurrence time of an event
EOT(source) = 0
EOT( j ) = max ( ECTs of all activities entering the event )
Earliest completion time of an activity
ECT( j, k ) = EOT( j ) + duration( j, k)
1. Sort the nodes topologically
2. For each event j : initialize EOT(j) = 0
3. For each event i in topological order
For each event j adjacent from i :
ECT(i,j) = EOT(i) + duration (i,j)
EOT(j) = max(EOT(j) , ECT(i,j))
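A sketch of steps 2 and 3 in Java, assuming events are numbered 0..n-1, adj[i] holds int[]{j, duration(i,j)} entries, and a topological order has already been computed; all names are illustrative:
// Earliest occurrence times over a topologically sorted event list.
int[] earliestOccurrence(List<int[]>[] adj, int[] topOrder) {
    int n = adj.length;
    int[] eot = new int[n];                   // step 2: initialize EOT(j) = 0
    for (int i : topOrder)                    // step 3: events in topological order
        for (int[] e : adj[i]) {
            int j = e[0];
            int ect = eot[i] + e[1];          // ECT(i,j) = EOT(i) + duration(i,j)
            eot[j] = Math.max(eot[j], ect);   // EOT(j) = max over incoming activities
        }
    return eot;
}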
438
Slack of activities
Compute latest occurrence time LOT(j) , slack(i, j)
1. Initialize LOT(sink) = EOT(sink)
2. For each non-sink event i in the reverse topological order:
LOT(i) = min(LOT( j ) – duration( i, j))
3. For each activity (i,j)
slack( i, j ) = LOT( j ) – ECT( i, j)
Critical activities: slack(i, j) = 0
Critical path in the project:
all edges are activities with slack = 0
The critical path is found using breadth-first (or depth-first) search on the
edges
Complexity:
O(V + E) with adjacency lists,
O(V^2) with an adjacency matrix
439
BFS and DFS
bfs(G)
list L = empty
tree T = empty
choose a starting vertex x
visit(x)
while(L nonempty)
remove edge (v,w) from
beginning of L
if w not visited
add (v,w) to T
visit(w)
dfs(G)
list L = empty
tree T = empty
choose a starting vertex x
visit(x)
while(L nonempty)
remove edge (v,w) from
end of L
if w not visited
add (v,w) to T
visit(w)
Visit ( vertex v)
mark v as visited
for each edge (v,w)
add edge (v,w) to end of L
Complexity: O( V + E)
440
Graph Connectivity
Connectivity
Biconnectivity – no cut vertices
Articulation Points and Bridges
Connectivity in Directed Graphs
Strongly connected
Unilaterally connected
Weakly connected
Chapter 10: Algorithm
Design Techniques
Backtracking
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
Backtracking
How to solve a problem by search
Generate and Test
What is Backtracking
Example
The Idea
Problem solving:
search in a space of possible states, moving
from an initial state toward one or more
target states.
Generate and Test Method
• Take a state
• Generate next states by applying relevant rules
(not all rules are applicable for a given state!)
• Test to see if the goal state is reached.
If yes - stop
If no - add the generated states into a
stack/queue (do not add if the state has already
been generated) and repeat the procedure.
To get the solution: Record the paths
Search Methods
Breadth-first vs depth-first search
Exhaustive search (brute force) vs
informative (constraint-based,
heuristic) search
Backtracking
Backtracking
• rejecting the currently reached state
as a dead-end state,
• going back to some previous level in
the state-space tree/graph
• proceeding with the exploration of
other paths toward the target state
Backtracking occurs in depth-first search
Example
Decide on:
State representation
Operators
Constraints
Initial state
Target state
LOAN
LOAN
LOAN
+ LOAN
-------
DEBT
Replace the letters with
distinct digits such that the
addition is correct
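A brute-force generate-and-test sketch for this puzzle in Java; a real backtracking solver would prune partial assignments column by column, but here exhaustive search over the four digits of LOAN is already tiny (names are illustrative):
// LOAN + LOAN + LOAN + LOAN = DEBT, i.e. 4 * LOAN = DEBT,
// with L, O, A, N, D, E, B, T all distinct digits and L, D nonzero.
void solveLoanDebt() {
    for (int l = 1; l <= 9; l++)
        for (int o = 0; o <= 9; o++)
            for (int a = 0; a <= 9; a++)
                for (int n = 0; n <= 9; n++) {
                    int loan = 1000 * l + 100 * o + 10 * a + n;
                    int debt = 4 * loan;
                    if (debt > 9999) continue;          // DEBT must have 4 digits
                    int d = debt / 1000, e = (debt / 100) % 10;
                    int b = (debt / 10) % 10, t = debt % 10;
                    if (allDistinct(l, o, a, n, d, e, b, t))
                        System.out.println(loan + " * 4 = " + debt);
                }
}

boolean allDistinct(int... x) {
    boolean[] seen = new boolean[10];
    for (int v : x) {
        if (seen[v]) return false;
        seen[v] = true;
    }
    return true;
}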
Chapter 10: Algorithm
Design Techniques
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
449
Algorithm Design Techniques
Brute Force
Greedy Algorithms
Divide and Conquer
Dynamic Programming
Transform and Conquer
Backtracking
Genetic Algorithms
450
Brute Force
 Based on the problem’s statement and
definitions of the concepts involved.
 Examples:
 Sequential search
 Exhaustive search: TSP, knapsack
 Simple sorts: selection sort, bubble sort
 Computing n!
451
Greedy Algorithms
"take what you can get now" strategy
Work in phases.
In each phase the currently best
decision is made
452
Greedy Algorithms -
Examples
• Dijkstra's algorithm
(shortest paths in weighted graphs)
• Prim's algorithm, Kruskal's
algorithm
(minimal spanning tree in weighted graphs)
• Coin exchange problem
• Huffman Trees
453
Divide and Conquer
• Reduce the problem to smaller
problems (by a factor of at least 2)
solved recursively and then
combine the solutions
Examples: Binary Search
Mergesort
Quick sort
Tree traversal
In general, problems that can be defined recursively
454
Decrease and Conquer
• Reduce the problem to a smaller
problem, solve it recursively, and
then extend the solution
Examples of decrease-and-conquer algorithms:
Insertion sort
Topological sorting
Binary Tree traversals: inorder, preorder and postorder
(recursion)
Computing the length of the longest path in a binary tree
(recursion)
Computing Fibonacci numbers (recursion)
Reversing a queue (recursion)
455
Dynamic Programming
Bottom-Up Technique in which the
smallest sub-instances are explicitly
solved first and the results of these used to
construct solutions to progressively larger
sub-instances.
Example:
Fibonacci numbers computed by iteration.
456
Transform and Conquer
In the first stage the problem is modified to be more amenable to
solution; in the second stage it is solved.
• Problem simplification e.g. presorting
Example: finding the two closest numbers in an array of
numbers.
Brute force solution: O(n2)
Transform and conquer with presorting: O(nlogn)
• Change in the representation
Example: AVL trees guarantee O(log n) search time
• Problem reduction
Example: least common multiple: lcm(m,n) = (m*n)/ gcd(m,n)
457
Backtracking
Generate-and-Test methods
Based on exhaustive search in
multiple choice problems
Typically used with depth-first state
space search problems.
Example: Puzzles
458
Backtracking –
State Space Search
• initial state
• goal state(s)
• a set of intermediate states
• a set of operators that transform one state into
another. Each operator has preconditions and
postconditions.
• a cost function – evaluates the cost of the
operations (optional)
• a utility function – evaluates how close is a given
state to the goal state (optional)
459
Genetic Algorithms
 Search for good solutions among possible solutions
 The best possible solution may be missed
 A solution is coded by a string, also called a
chromosome. The words string and chromosome are
used interchangeably
 A string's fitness is a measure of how good a
solution it codes. Fitness is calculated by a fitness
function
 Selection: The procedure to choose parents
 Crossover is the procedure by which two
chromosomes mate to create a new offspring
chromosome
 Mutation : with a certain probability flip a bit in the
offspring
460
Basic Genetic Algorithm
1. Start: Generate random population of n
chromosomes (suitable solutions for the problem)
2. Fitness: Evaluate the fitness f(x) of each
chromosome x in the population
3. New population: Create a new population by
repeating the selection, crossover and mutation
steps (next slide) until the new population is complete
4. Test: If the end condition is satisfied, stop, and
return the best solution in current population
461
New Population
 Selection: Select two parent chromosomes
from a population according to their
fitness
 Crossover: With a crossover probability
cross over the parents to form a new
offspring (children). If no crossover was
performed, offspring is an exact copy of parents.
 Mutation: With a mutation probability
mutate new offspring at each locus
(position in chromosome).
462
Conclusion
How to choose the approach?
First, by understanding the problem, and second, by
knowing various problems and how they are solved using
different approaches.
Strongly recommended reading to learn more about how to
design algorithms:
The Algorithm Design Manual, Steven S. Skiena
Department of Computer Science, State University of New York
http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK/BOOK.HTM
Algorithm Design Paradigms, Paul E. Dunne
University of Liverpool, Department of Computer Science
http://www.csc.liv.ac.uk/~ped/teachadmin/algor/algor.html
Complexity Issues
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
464
Complexity Issues
 Class P problems
 Class NP problems
 NP-complete problems
 Non-NP problems
 Decidability
465
Class P Problems
 Polynomial run time in the worst-case
 "tractable" problems
 Examples
 Simple sorting algorithms
 Advanced sorting algorithms
 Topological Sort
 Shortest Paths
 Min Spanning Trees
466
Class NP Problems
NP-Problems: Problems for which there is no known
polynomial-time algorithm - "intractable" problems
NP means non-deterministically polynomial. If we
have a candidate for a solution, we can verify it in
polynomial time. The difficult part is to generate all
possible candidates.
Example: (the satisfiability problem) Given a boolean expression of
n variables, find a set of variable assignments that satisfy the
expression
Obviously P ⊆ NP, but whether P ≠ NP (or P = NP) is an open question
467
NP-Complete Problems
An NP-Complete problem is an NP problem with the
property that any problem in NP can be reduced to it in
polynomial time
In 1971 Stephen Cook (currently working at the
University of Toronto, Ontario) showed that the
satisfiability problem is NP-complete by proving that
every other NP problem could be transformed to it.
http://www.cs.toronto.edu/DCS/People/Faculty/sacook.html
468
NP-Complete Problems
Bin packing problem: How to pack or store objects of various sizes and
shapes with a minimum of wasted space?
Clique problem: A clique is a subset K of vertices in an undirected graph
G such that every pair is joined by an edge in G. The problems are:
Given a graph G, and an integer k, does G have a clique consisting of
k vertices?
Given a graph G, find a clique with as many vertices as possible.
Knapsack problem: You have a knapsack with a given size and several
objects with various sizes and values. The problem is how to fill your
knapsack with these objects so that you have a maximum total value.
Hamiltonian Cycle: a cycle that contains all nodes of the graph. Deciding
whether a graph has a Hamiltonian cycle is an NP-complete problem.
469
NP-Complete Problems
NP problems and non-NP problems:
For NP problems, given a candidate for a solution, we can verify it in
polynomial time.
For non-NP problems, we cannot verify in polynomial time whether
a candidate for a solution is a solution or not.
Example: the problem to prove that a graph does not have a
Hamiltonian cycle is a non-NP problem.
NP-hard problems: NP-Complete, but not necessarily in NP class
470
Decidability
Yes/no problems (decision problems)
A decision problem is a question that has two
possible answers yes or no. The question is
about some input.
471
Examples of Yes/No
Problems
Given a graph G and a set of vertices K, is K a
clique?
Given a graph G and a set of edges M, is M a
spanning tree?
Given a set of axioms (boolean expressions)
and an expression, is the expression provable
under the axioms?
472
Decidable, semi-decidable, and
undecidable problems
A problem is decidable if there is an algorithm
(P or NP) that says yes if the answer is yes,
and no otherwise.
A problem is semi-decidable if there is an
algorithm that says yes if the answer is yes,
however it may loop infinitely if the answer is
no.
A problem is undecidable if we can prove that
there is no algorithm that will deliver an
answer.
473
Example of semi-decidable
problem
Given a set of axioms, prove that an expression is true.
Problem 1: Let the axioms be:
A v B
A v C
~B
Prove A.
To prove A we add ~A to the axioms.
If A is true then ~A will be false and this will cause a
contradiction - the conjunction of all axioms plus ~A will result in
False
(A v B) ∧ ~A = B
B ∧ (A v C) = (B ∧ A) v (B ∧ C)
B ∧ ~B = False
474
Example of semi-decidable
problem
Problem 2: Let the axioms be:
A v B
A v C
~B
Prove ~A.
We add A and obtain:
(A v C) ∧ A = A
(A v B) ∧ A = A
A ∧ ~B = A ∧ ~B
(A ∧ ~B) ∧ (A v B) = A ∧ ~B
…..
This process will never stop, because the expressions we
obtain will always be different from False
475
Example of undecidable
problem
The halting problem
Let LOOP be a program that checks other
programs for infinite loops:
LOOP(P) stops and prints "yes" if P loops infinitely
LOOP(P) enters an infinite loop if P stops
What about LOOP(LOOP)?
http://www.cgl.uwaterloo.ca/~csk/washington/halt.html

More Related Content

PDF
Design an Analysis of Algorithms II-SECS-1021-03
PPT
Asymptotic Notation and Complexity
PPT
lecture 1
PPT
Lecture 4 - Growth of Functions.ppt
PPT
Lecture 4 - Growth of Functions (1).ppt
PPT
Algorithum Analysis
PPTX
Time complexity.pptxghhhhhhhhhhhhhhhjjjjjjjjjjjjjjjjjjjjjjjjjj
PPTX
CMSC 56 | Lecture 8: Growth of Functions
Design an Analysis of Algorithms II-SECS-1021-03
Asymptotic Notation and Complexity
lecture 1
Lecture 4 - Growth of Functions.ppt
Lecture 4 - Growth of Functions (1).ppt
Algorithum Analysis
Time complexity.pptxghhhhhhhhhhhhhhhjjjjjjjjjjjjjjjjjjjjjjjjjj
CMSC 56 | Lecture 8: Growth of Functions

Similar to Complete Book Lectures maths theory helpful for kids.ppt (20)

PDF
02 CS316_algorithms_Asymptotic_Notations(2).pdf
PPT
how to calclute time complexity of algortihm
PPT
How to calculate complexity in Data Structure
PPT
Time complexity.pptr56435 erfgegr t 45t 35
PPT
Time complexity.ppt
PPT
Clase3 Notacion
PPT
AlgorithmAnalysis2.ppt
PPT
recurrence relation is explained in this
PPT
Design and analysis of algorithm ppt ppt
PPT
Lec03 04-time complexity
PPT
Asymptotic analysis
PPTX
Asymptotic Notation
PPT
02 asymp
PDF
Unit-1 DAA_Notes.pdf
PDF
6_12_13Asymptotic analysiskfnhlkjsbfbkjs.pdf
PPT
analysis.ppt
PDF
Lecture3(b).pdf
PPTX
Dr hasany 2467_16649_1_lec-2-zabist
PDF
Lecture 3(a) Asymptotic-analysis.pdf
02 CS316_algorithms_Asymptotic_Notations(2).pdf
how to calclute time complexity of algortihm
How to calculate complexity in Data Structure
Time complexity.pptr56435 erfgegr t 45t 35
Time complexity.ppt
Clase3 Notacion
AlgorithmAnalysis2.ppt
recurrence relation is explained in this
Design and analysis of algorithm ppt ppt
Lec03 04-time complexity
Asymptotic analysis
Asymptotic Notation
02 asymp
Unit-1 DAA_Notes.pdf
6_12_13Asymptotic analysiskfnhlkjsbfbkjs.pdf
analysis.ppt
Lecture3(b).pdf
Dr hasany 2467_16649_1_lec-2-zabist
Lecture 3(a) Asymptotic-analysis.pdf
Ad

More from AishaAnwar16 (7)

PPTX
python topic 2 string concatenations.pptx
PPTX
conditional statement in python day 5.pptx
PPTX
Shapes maths details on it presentation.pptx
PPTX
python slides operators and types of it.pptx
PPTX
python programming codes detailed slide.pptx
PPTX
python programming slides helpfullday two.pptx
DOCX
TSS - CQ Environment (2).docx
python topic 2 string concatenations.pptx
conditional statement in python day 5.pptx
Shapes maths details on it presentation.pptx
python slides operators and types of it.pptx
python programming codes detailed slide.pptx
python programming slides helpfullday two.pptx
TSS - CQ Environment (2).docx
Ad

Recently uploaded (20)

PPTX
2 - Self & Personality 587689213yiuedhwejbmansbeakjrk
PPTX
Astra-Investor- business Presentation (1).pptx
PDF
Charisse Litchman: A Maverick Making Neurological Care More Accessible
PDF
Family Law: The Role of Communication in Mediation (www.kiu.ac.ug)
PDF
PMB 401-Identification-of-Potential-Biotechnological-Products.pdf
PPTX
TRAINNING, DEVELOPMENT AND APPRAISAL.pptx
PDF
Satish NS: Fostering Innovation and Sustainability: Haier India’s Customer-Ce...
PPTX
Project Management_ SMART Projects Class.pptx
PPTX
BUSINESS CYCLE_INFLATION AND UNEMPLOYMENT.pptx
PDF
NEW - FEES STRUCTURES (01-july-2024).pdf
PDF
Kishore Vora - Best CFO in India to watch in 2025.pdf
DOCX
80 DE ÔN VÀO 10 NĂM 2023vhkkkjjhhhhjjjj
DOCX
Center Enamel A Strategic Partner for the Modernization of Georgia's Chemical...
PDF
Tortilla Mexican Grill 发射点犯得上发射点发生发射点犯得上发生
PDF
Susan Semmelmann: Enriching the Lives of others through her Talents and Bless...
PDF
533158074-Saudi-Arabia-Companies-List-Contact.pdf
PPTX
svnfcksanfskjcsnvvjknsnvsdscnsncxasxa saccacxsax
PDF
ANALYZING THE OPPORTUNITIES OF DIGITAL MARKETING IN BANGLADESH TO PROVIDE AN ...
PDF
Introduction to Generative Engine Optimization (GEO)
PPTX
chapter 2 entrepreneurship full lecture ppt
2 - Self & Personality 587689213yiuedhwejbmansbeakjrk
Astra-Investor- business Presentation (1).pptx
Charisse Litchman: A Maverick Making Neurological Care More Accessible
Family Law: The Role of Communication in Mediation (www.kiu.ac.ug)
PMB 401-Identification-of-Potential-Biotechnological-Products.pdf
TRAINNING, DEVELOPMENT AND APPRAISAL.pptx
Satish NS: Fostering Innovation and Sustainability: Haier India’s Customer-Ce...
Project Management_ SMART Projects Class.pptx
BUSINESS CYCLE_INFLATION AND UNEMPLOYMENT.pptx
NEW - FEES STRUCTURES (01-july-2024).pdf
Kishore Vora - Best CFO in India to watch in 2025.pdf
80 DE ÔN VÀO 10 NĂM 2023vhkkkjjhhhhjjjj
Center Enamel A Strategic Partner for the Modernization of Georgia's Chemical...
Tortilla Mexican Grill 发射点犯得上发射点发生发射点犯得上发生
Susan Semmelmann: Enriching the Lives of others through her Talents and Bless...
533158074-Saudi-Arabia-Companies-List-Contact.pdf
svnfcksanfskjcsnvvjknsnvsdscnsncxasxa saccacxsax
ANALYZING THE OPPORTUNITIES OF DIGITAL MARKETING IN BANGLADESH TO PROVIDE AN ...
Introduction to Generative Engine Optimization (GEO)
chapter 2 entrepreneurship full lecture ppt

Complete Book Lectures maths theory helpful for kids.ppt

  • 1. Chapter 1: Introduction Mathematical Review Mark Allen Weiss, Florida University Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
  • 2. Mathematical Review  Exponents  Logarithms  Recursive Definitions  Function Growth  Proofs
  • 3. Exponents X0 = 1 by definition XaXb = X (a+b) Xa / Xb = X (a-b) Show that: X-n = 1 / Xn (Xa )b = Xab
  • 4. Logarithms  logaX = Y  aY = X , a > 0, X > 0 E.G: log28 = 3; 23 = 8  loga1 = 0 because a0 = 1 logX means log2X lgX means log10X lnX means logeX, where ‘e’ is the natural number
  • 5. Logarithms  loga(XY) = logaX + logaY  loga(X/Y) = logaX – logaY  loga(Xn) = nlogaX  loga b = (log2 b)/ (log2a)  a loga x = x
  • 6. Recursive Definitions  Basic idea: To define objects, processes and properties in terms of simpler objects, simpler processes or properties of simpler objects/processes.
  • 7. Recursive Definitions  Terminating rule - defining the object explicitly.  Recursive rules - defining the object in terms of a simpler object.
  • 8. Examples  Factorials N! f(n) = n! f(0) = 1 i.e. 0! = 1 f(n) = n * f(n-1) i.e. n! = n * (n-1)!
  • 9. Examples Fibonacci numbers F(0) = 1 F(1) = 1 F(k+1) = F(k) + F(k-1) 1, 1, 2, 3, 5, 8, ….
  • 10. Function Growth lim ( n ) = ∞, n → ∞ lim ( na ) = ∞, n → ∞, a > 0 lim ( 1 / n ) = 0, n → ∞ lim ( 1 / (na) ) = 0, n → ∞, a > 0 lim ( log( n )) = ∞, n → ∞ lim ( an ) = ∞, n → ∞, a > 0
  • 11. Function Growth  lim (f(x) + g(x)) = lim (f(x)) + lim (g(x))  lim (f(x) * g(x)) = lim (f(x)) * lim (g(x))  lim (f(x) / g(x)) = lim (f(x)) / lim (g(x))  lim (f(x) / g(x)) = lim (f '(x) / g '(x))
  • 12. Examples  lim (n/ n2 ) = 0, n → ∞  lim (n2 / n) = ∞, n → ∞  lim (n2 / n3 ) = 0, n → ∞  lim (n3 / n2 ) = ∞, n → ∞  lim (n / ((n+1)/2) = 2, n → ∞.
  • 13. Some Derivatives  (logan)' = (1/n) logae  (an)' = (an) ln(a)
  • 14. Proofs  Direct proof  Proof by induction  Proof by counterexample  Proof by contradiction  Proof by contraposition
  • 15. Direct Proof Based on the definition of the object / property Example: Prove that if a number is divisible by 6 then it is divisible by 2 Proof: Let m divisible by 6. Therefore, there exists q such that m = 6q 6 = 2 . 3 m = 6q = 2.3.q = 2r, where r = 3q Therefore m is divisible by 2
  • 16. Proof by Induction We use proof by induction when our claim concerns a sequence of cases, which can be numbered Inductive base: Show that the claim is true for the smallest case, usually k = 0 or k = 1. Inductive hypothesis: Assume that the claim is true for some k Prove that the claim is true for k+1
  • 17. Example of Proof by Induction Prove by induction that S(N) = Σ 2i = 2 (N+1) - 1, for any integer N ≥ 0 i=0 to N 1. Inductive base Let n = 0. S(0) = 20 = 1 On the other hand, by the formula S(0) = 2 (0+1) – 1 = 1. Therefore the formula is true for n = 0 2. Inductive hypothesis Assume that S(k) = 2 (k+1) – 1 We have to show that S(k+1) = 2(k + 2) -1 By the definition of S(n): S(k+1) = S(k) + 2(k+1) = 2 (k+1) – 1 + 2(k+1) = 2. 2(k+1) – 1 = 2(k+2) – 1
  • 18. Proof by Counterexample Used when we want to prove that a statement is false. Types of statements: a claim that refers to all members of a class. EXAMPLE: The statement "all odd numbers are prime" is false. A counterexample is the number 9: it is odd and it is not prime.
  • 19. Proof by Contradiction Assume that the statement is false, i.e. its negation is true. Show that the assumption implies that some known property is false - this would be the contradiction Example: Prove that there is no largest prime number
  • 20. Proof by Contraposition Used when we have to prove a statement of the form P  Q. Instead of proving P  Q, we prove its equivalent ~Q  ~P Example: Prove that if the square of an integer is odd then the integer is odd We can prove using direct proof the statement: If an integer is even then its square is even.
  • 21. Chapter 2: Algorithm Analysis Big-Oh and Other Notations in Algorithm Analysis Mark Allen Weiss, Florida University Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
  • 22. Big-Oh and Other Notations in Algorithm Analysis  Classifying Functions by Their Asymptotic Growth  Theta, Little oh, Little omega  Big Oh, Big Omega  Rules to manipulate Big-Oh expressions  Typical Growth Rates
  • 23. Classifying Functions by Their Asymptotic Growth Asymptotic growth : The rate of growth of a function Given a particular differentiable function f(n), all other differentiable functions fall into three classes: .growing with the same rate .growing faster .growing slower
  • 24. f(n) and g(n) have same rate of growth, if lim( f(n) / g(n) ) = c, 0 < c < ∞, n -> ∞ Notation: f(n) = Θ( g(n) ) pronounced "theta" Theta
  • 25. f(n) grows slower than g(n) (or g(n) grows faster than f(n)) if lim( f(n) / g(n) ) = 0, n → ∞ Notation: f(n) = o( g(n) ) pronounced "little oh" Little oh
  • 26. f(n) grows faster than g(n) (or g(n) grows slower than f(n)) if lim( f(n) / g(n) ) = ∞, n -> ∞ Notation: f(n) = ω (g(n)) pronounced "little omega" Little omega
  • 27. if g(n) = o( f(n) ) then f(n) = ω( g(n) ) Examples: Compare n and n2 lim( n/n2 ) = 0, n → ∞, n = o(n2) lim( n2/n ) = ∞, n → ∞, n2 = ω(n) Little omega and Little oh
  • 28. Theta: Relation of Equivalence R: "having the same rate of growth": relation of equivalence, gives a partition over the set of all differentiable functions - classes of equivalence. Functions in one and the same class are equivalent with respect to their growth.
  • 29. Algorithms with Same Complexity Two algorithms have same complexity, if the functions representing the number of operations have same rate of growth. Among all functions with same rate of growth we choose the simplest one to represent the complexity.
  • 30. Examples Compare n and (n+1)/2 lim( n / ((n+1)/2 )) = 2, same rate of growth (n+1)/2 = Θ(n) - rate of growth of a linear function
  • 31. Examples Compare n2 and n2+ 6n lim( n2 / (n2+ 6n ) )= 1 same rate of growth. n2+6n = Θ(n2) rate of growth of a quadratic function
  • 32. Examples Compare log n and log n2 lim( log n / log n2 ) = 1/2 same rate of growth. log n2 = Θ(log n) logarithmic rate of growth
  • 33. Θ(n3): n3 5n3+ 4n 105n3+ 4n2 + 6n Θ(n2): n2 5n2+ 4n + 6 n2 + 5 Θ(log n): log n log n2 log (n + n3) Examples
  • 34. Comparing Functions  same rate of growth: g(n) = Θ(f(n))  different rate of growth: either g(n) = o (f(n)) g(n) grows slower than f(n), and hence f(n) = ω(g(n)) or g(n) = ω (f(n)) g(n) grows faster than f(n), and hence f(n) = o(g(n))
  • 35. f(n) = O(g(n)) if f(n) grows with same rate or slower than g(n). f(n) = Θ(g(n)) or f(n) = o(g(n)) The Big-Oh Notation
  • 36. n+5 = Θ(n) = O(n) = O(n2) = O(n3) = O(n5) the closest estimation: n+5 = Θ(n) the general practice is to use the Big-Oh notation: n+5 = O(n) Example
  • 37. The inverse of Big-Oh is Ω If g(n) = O(f(n)), then f(n) = Ω (g(n)) f(n) grows faster or with the same rate as g(n): f(n) = Ω (g(n)) The Big-Omega Notation
  • 38. Rules to manipulate Big-Oh expressions Rule 1: a. If T1(N) = O(f(N)) and T2(N) = O(g(N)) then T1(N) + T2(N) = max( O( f (N) ), O( g(N) ) )
  • 39. Rules to manipulate Big-Oh expressions b. If T1(N) = O( f(N) ) and T2(N) = O( g(N) ) then T1(N) * T2(N) = O( f(N)* g(N) )
  • 40. Rules to manipulate Big-Oh expressions Rule 2: If T(N) is a polynomial of degree k, then T(N) = Θ( Nk ) Rule 3: log k N = O(N) for any constant k.
  • 41. Examples n2 + n = O(n2) we disregard any lower-order term nlog(n) = O(nlog(n)) n2 +nlog(n) = O(n2)
  • 42. C constant, we write O(1) logN logarithmic log2N log-squared N linear NlogN N2 quadratic N3 cubic 2N exponential N! factorial Typical Growth Rates
  • 43. Problems N2 = O(N2) true 2N = O(N2) true N = O(N2) true N2 = O(N) false 2N = O(N) true N = O(N) true
  • 44. Problems N2 = Θ (N2) true 2N = Θ (N2) false N = Θ (N2) false N2 = Θ (N) false 2N = Θ (N) true N = Θ (N) true
  • 45. Chapter 2: Algorithm Analysis Application of Big-Oh to program analysis Running Time Calculations Mark Allen Weiss, Florida University Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
  • 46. 46 Background The work done by an algorithm, i.e. its complexity, is determined by the number of the basic operations necessary to solve the problem.
  • 47. 47 The Task Determine how the number of operations depend on the size of input : N - size of input F(N) - number of operations
  • 48. 48 Basic operations in an algorithm Problem: Find x in an array Operation: Comparison of x with an entry in the array Size of input: The number of the elements in the array
  • 49. 49 Basic operations …. Problem: Multiplying two matrices with real entries Operation: Multiplication of two real numbers Size of input: The dimensions of the matrices
  • 50. 50 Basic operations …. Problem: Sort an array of numbers Operation: Comparison of two array entries plus moving elements in the array Size of input: The number of elements in the array
  • 51. 51 Counting the number of operations A. for loops O(n) The running time of a for loop is at most the running time of the statements inside the loop times the number of iterations.
  • 52. 52 for loops sum = 0; for( i = 0; i < n; i++ ) sum = sum + i; The running time is O(n)
  • 53. 53 B. Nested loops The total running time is the running time of the inside statements times the product of the sizes of all the loops Counting the number of operations
  • 54. 54 Nested loops sum = 0; for( i = 0; i < n; i++) for( j = 0; j < n; j++) sum++; The running time is O(n2)
  • 55. 55 C. Consecutive program fragments Total running time : the maximum of the running time of the individual fragments Counting the number of operations
  • 56. 56 Consecutive program fragments sum = 0; O(n) for( i = 0; i < n; i++) sum = sum + i; sum = 0; O(n2) for( i = 0; i < n; i++) for( j = 0; j < 2n; j++) sum++; The maximum is O(n2)
  • 57. 57 D: If statement if C S1; else S2; The running time is the maximum of the running times of S1 and S2. Counting the number of operations
  • 58. 58 EXAMPLES O(n3): sum = 0; for( i = 0 ; i < n; i++) for( j = 0 ; j < n*n ; j++ ) sum++;
  • 59. 59 EXAMPLES O(n2): sum = 0; for( i = 0; i < n ; i++) for( j = 0; j < i ; j++) sum++;
  • 60. 60 EXAMPLES O(n3logn) for(j = 0; j < n*n; j++) compute_val(j); The complexity of compute_val(x) is given to be O(n*logn)
  • 61. 61 Search in an unordered array of elements for (i = 0; i < n; i++) if (a[ i ] == x) return 1; return -1; O(n)
  • 62. 62 Search in a table n x m for (i = 0; i < n; i++) for (j = 0; j < m; j++) if (a[ i ][ j ] == x) return 1 ; return -1; O(n*m)
  • 63. Chapter 2: Algorithm Analysis Application of Big-Oh to program analysis Logarithms in Running Time Mark Allen Weiss, Florida University Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
  • 64. 64 Binary search Euclid’s algorithm Exponentials Rules to count operations Logarithms in Running Time
  • 65. 65 Divide-and-conquer algorithms Subsequently reducing the problem by a factor of two require O(logN) operations
  • 66. 66 Why logN? A complete binary tree with N leaves has logN levels. Each level in the divide-and- conquer algorithms corresponds to an operation Hence the number of operations is O(logN)
  • 68. 68 Binary Search Solution 1: Scan all elements from left to right, each time comparing with X. O(N) operations.
  • 69. 69 Binary Search Solution 2: O(logN) Find the middle element Amid in the list and compare it with X If they are equal, stop If X < Amid consider the left part If X > Amid consider the right part Do until the list is reduced to one element
  • 70. 70 Euclid's algorithm Finding the greatest common divisor (GCD) GCD of M and N, M > N, = GCD of N and M % N
  • 71. 71 GCD and recursion Recursion: If M%N = 0 return N Else return GCD(N, M%N) The answer is the last nonzero remainder.
  • 72. 72 M N rem 24 15 9 15 9 6 9 6 3 6 3 0 3 0
  • 73. 73 long gcd ( long m, long n) { long rem; while (n != 0) { rem = m % n; m = n; n = rem; } return m; } Euclid’s Algorithm (non-recursive implementation)
  • 74. 74 Why O(logN) M % N <= M / 2 After 1st iteration N appears as first argument, the remainder is less than N/2 After 2nd iteration the remainder appears as first argument and will be reduced by a factor of two Hence O(logN)
  • 75. 75 Computing XN XN = X*(X2) N / 2 ,N is odd XN = (X2) N / 2 ,N is even
  • 76. 76 long pow (long x, int n) { if ( n == 0) return 1; if (is_Even( n )) return pow(x * x, n/2); else return x * pow ( x * x, n/2); }
  • 77. 77 Why O(LogN) If N is odd : two multiplications The operations are at most 2logN: O(logN)
  • 78. 78 Another recursion for XN Another recursive definition that reduces the power just by 1: XN = X*XN -1 Here the operations are N-1, i.e. O(N) and the algorithm is less efficient than the divide-and-conquer algorithm.
  • 79. 79 How to count operations  single statements (not function calls) : constant O(1) = 1.  sequential fragments: the maximum of the operations of each fragment
  • 80. 80  single loop running up to N, with single statements in its body: O(N)  single loop running up to N, with the number of operations in the body O(f(N)): O( N * f(N) ) How to count operations
  • 81. 81  two nested loops each running up to N, with single statements: O(N2)  divide-and-conquer algorithms with input size N: O(logN) Or O(N*logN) if each step requires additional processing of N elements How to count operations
  • 82. 82 Example: What is the probability two numbers to be relatively prime? tot = 0; rel = 0; for ( i = 0; i <= n; i++) for (j = i+1; j <= n; j++) { tot++; if ( gcd( i, j ) ==1) rel++; } return (rel/tot); Running time = ?
  • 83. Chapter 3: Abstract Data Types Lists, Stacks Mark Allen Weiss, Florida University Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
  • 84. 84 An ADT • a set of objects • a set of operations Same set of objects , different sets of operations => different ADTs ADTs are implemented using classes, hiding implementation details: encapsulation
  • 85. 85 LIST ABSTRACTION Definition: A linear configuration of elements, called nodes.
  • 86. 86 Characteristics  Insert and delete nodes in any order  The nodes are connected  Each node has two components Information (data) Link to the next node  The nodes are accessed through the links between them
  • 87. 87 Head Predecessor of X Node X Success- or of X tail For each node the node that is in front of it is called predecessor. The node that is after it is called successor.
  • 88. 88 Terminology  Head (front, first node):  The node without predecessor, the node that starts the lists.  Tail (end, last node):  The node that has no successor, the last node in the list.  Current node: The node being processed.  From the current node we can access the next node.  Empty list: No nodes exist
  • 89. 89 Basic operations  To create/destroy a list  To expand/shrink the list  Read/Write operations  Changing the current node (moving along the list)  To report current position in the list  To report status of the list
  • 90. 90 ADT List Notation L - list e - item of the same type as the information part of an element (a node) in the list b - boolean value
  • 91. 91 Operations in ADT Notation Insert(L,e) Inserts a node with information e before the current position Delete(L) Deletes the current node in L , the current position indicates the next node. RetrieveInfo(L)  e Returns the information in the current node.
  • 92. 92 Insertion and Deletion A. Insertion To insert a node X between the nodes A and B: .Create a link from X to B. .Create a link from A to X,
  • 94. 94 Insertion and Deletion B. Deletion To delete a node X between A and B: • Create a link from A to B, • Remove node X
  • 96. 96 Node Linking 1. Single linked lists : Each node contains two links - to the previous and to the next node 2. Double linked lists : Each node contains a link only to the next node 3. Circular lists: The tail is linked to the head.
  • 97. 97 List Implementation Static – using an array Dynamic – using linear nodes
  • 98. 98 Array Implementation Two parallel arrays are used: Index array - the number stored in the i-th element shows the index of the "next" node , i.e. node that follows the i-th node Data array - used to store the informational part of the nodes.
  • 100. 100 STACKS Definition: The last stored element is the first to be accessed (LIFO: last in - first out)
  • 101. 101 Basic operations Push: store a data item at the top of the stack Pop: retrieve a data item from the top of the stack
  • 102. 102 ADT Definition of STACK Notation: S stack e item of same type as the elements of S b boolean value
  • 103. 103 Operations Init_Stack(S) Procedure to initialize S to an empty stack Destroy_Stack(S) Procedure to delete all elements in S
  • 104. 104 Operations Stack_Empty(S)  b Boolean function that returns TRUE if S is empty. Stack_Full(S)  b Boolean function that returns TRUE if S is full.
  • 105. 105 Operations Push(S,e) Procedure to place an item e into S (if there is room, i.e. S is not full) Pop(S)  e Procedure to take the last item stored in S (this item is called also - top element) if S is not empty
  • 106. 106 Example A procedure to replace the elements of a nonempty stack, assuming they are of type integers, with their sum Pre: A nonempty stack with elements of type integers. Post: S contains only one element – the sum of previously stored elements.
• 107. 107 Algorithm 1. e1 ← Pop(S) 2. while stack is not empty repeat 2.1. e2 ← Pop(S) 2.2. Push(S, e1+e2) 2.3. e1 ← Pop(S) 3. Push(S, e1)
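A minimal Java sketch of this algorithm, using java.util.ArrayDeque as the stack; the class and method names are ours.

import java.util.ArrayDeque;
import java.util.Deque;

public class StackSum {
    // Replaces the contents of a nonempty stack of integers with their sum,
    // following the numbered steps on the slide above.
    static void sumStack(Deque<Integer> s) {
        int e1 = s.pop();                 // step 1
        while (!s.isEmpty()) {            // step 2
            int e2 = s.pop();             // 2.1
            s.push(e1 + e2);              // 2.2
            e1 = s.pop();                 // 2.3
        }
        s.push(e1);                       // step 3
    }

    public static void main(String[] args) {
        Deque<Integer> s = new ArrayDeque<>();
        s.push(1); s.push(2); s.push(3);
        sumStack(s);
        System.out.println(s.peek());     // prints 6
    }
}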
  • 108. Chapter 3: Abstract Data Types Queues Mark Allen Weiss, Florida University Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
  • 109. 109 QUEUES Definition: A sequence of elements of the same type. The first stored element is first accessible. The structure is known also under the name FIFO - first in first out
  • 110. 110 Basic operations EnQueue : store a data item at the end of the queue DeQueue : retrieve a data item from the beginning of the queue
  • 111. 111 ADT Definition of QUEUE Notation: Q queue e item of same type as the elements of Q b boolean value
• 112. 112 Operations Init_Queue(Q) Initialize Q to an empty queue Queue_Empty(Q) → b Boolean function that returns TRUE if Q is empty Queue_Full(Q) → b Boolean function that returns TRUE if Q is full (meaningful for array-based implementations)
  • 113. 113 Operations EnQueue(Q,e) Procedure to place an item e into Q at the end (if Q is not full) DeQueue(Q)  e Procedure to take the first item stored in Q if Q is not empty
• 114. 114 Problem 1 Append_Queue(Q,P): A procedure to append a queue P onto the end of a queue Q, leaving P empty. Pre: queue P and queue Q, initialized Post: Q contains all elements originally in Q, followed by the elements that were in P in the same order. P is empty. • Design an algorithm to solve the problem
  • 115. 115 Problem 2 Reverse_Queue(Q): A procedure to reverse the elements in a queue Q Pre: queue Q, initialized Post: Q contains all elements re-written in reverse order • Design a non-recursive algorithm using a stack • Design a recursive algorithm • Find the complexity of the algorithms
• 116. 116 Solutions to Problem 2 A. Non-recursive Init_Stack(S) While not Queue_Empty(Q) e ← DeQueue(Q) Push(S,e) While not Stack_Empty(S) e ← Pop(S) EnQueue(Q,e) Complexity O(N), N - the number of elements in Q.
• 117. 117 Solutions to Problem 2 B. Recursive Reverse_Queue(Q): If not Queue_Empty(Q) e ← DeQueue(Q) Reverse_Queue(Q) EnQueue(Q,e) return Complexity O(N), N - the number of elements in Q.
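A hedged Java sketch of the recursive solution, using java.util.ArrayDeque as the queue; the names are ours. The recursion depth is O(N), matching the O(N) running time stated above.

import java.util.ArrayDeque;
import java.util.Queue;

public class ReverseQueue {
    // Dequeue one element, reverse the rest, then enqueue the saved element
    static <E> void reverseQueue(Queue<E> q) {
        if (!q.isEmpty()) {
            E e = q.remove();
            reverseQueue(q);
            q.add(e);
        }
    }

    public static void main(String[] args) {
        Queue<Integer> q = new ArrayDeque<>();
        for (int i = 1; i <= 5; i++) q.add(i);
        reverseQueue(q);
        System.out.println(q);   // [5, 4, 3, 2, 1]
    }
}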
  • 118. 118 Problem 3 Append_Reverse_Queue(Q,P): Append a queue P in reverse order onto the end of a queue Q, leaving P empty. Pre: P and Q initialized (possibly empty) Post: Q contains all elements originally in Q, followed by the elements that were in P in reverse order. P is empty •Design a recursive algorithm
  • 119. Chapter 4: Trees General Tree Concepts Binary Trees Mark Allen Weiss, Florida University Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
  • 121. 121 Definitions tree - a non-empty collection of vertices & edges vertex (node) - can have a name and carry other associated information path - list of distinct vertices in which successive vertices are connected by edges
• 122. 122 Definitions  any two vertices must have one and only one path between them, otherwise it is not a tree  a tree with N nodes has N-1 edges
  • 123. 123 Definitions  root - starting point (top) of the tree  parent (ancestor) - the vertex “above” this vertex  child (descendent) - the vertices “below” this vertex
  • 124. 124 Definitions  leaves (terminal nodes) - have no children  level - the number of edges between this node and the root  ordered tree - where children’s order is significant
  • 125. 125 Definitions  Depth of a node - the length of the path from the root to that node • root: depth 0  Height of a node - the length of the longest path from that node to a leaf • any leaf: height 0  Height of a tree: The length of the longest path from the root to a leaf
  • 126. 126 Balanced Trees the difference between the height of the left sub-tree and the height of the right sub-tree is not more than 1.
• 127. 127 Trees - Example [diagram: a sample tree with levels 0-3, marking the root, the leaves (terminal nodes), and a child of the root] Depth of T: 2 Height of T: 1
  • 128. 128 Tree Representation Class TreeNode { Object element; TreeNode firstChild; TreeNode nextSibling; }
• 129. 129 Example [diagram: a tree with nodes a-g and its firstChild/nextSibling representation]
• 131. 131 Height of a Complete Binary Tree [diagram: levels L0 - L3] At each level the number of the nodes doubles. Total number of nodes: 1 + 2 + 2^2 + 2^3 = 2^4 - 1 = 15
• 132. 132 Nodes and Levels in a Complete Binary Tree Number of the nodes in a tree with M levels: 1 + 2 + 2^2 + … + 2^M = 2^(M+1) - 1 = 2*2^M - 1 Let N be the number of the nodes. N = 2*2^M - 1, 2*2^M = N + 1, 2^M = (N+1)/2, M = log( (N+1)/2 ) N nodes : log( (N+1)/2 ) = O(logN) levels M levels: 2^(M+1) - 1 = O(2^M) nodes
  • 133. 133 Binary Tree Node Class BinaryNode { Object Element; // the data in the node BinaryNode left; // Left child BinaryNode right; // Right child }
• 134. 134 Binary Tree – Preorder Traversal [diagram: a tree whose preorder traversal spells a word] Order: Root, Left, Right First letter - at the root Last letter – at the rightmost node
  • 135. 135 Preorder Algorithm preorderVisit(tree) { if (current != null) { process (current); preorderVisit (left_tree); preorderVisit (right_tree); } }
• 136. 136 Binary Tree – Inorder Traversal [diagram: a tree whose inorder traversal spells a word] Order: Left, Root, Right First letter - at the leftmost node Last letter – at the rightmost node
  • 137. 137 Inorder Algorithm inorderVisit(tree) { if (current != null) { inorderVisit (left_tree); process (current); inorderVisit (right_tree); } }
• 138. 138 Binary Tree – Postorder Traversal [diagram: a tree whose postorder traversal spells a word] Order: Left, Right, Root First letter - at the leftmost node Last letter – at the root
  • 139. 139 Postorder Algorithm postorderVisit(tree) { if (current != null) { postorderVisit (left_tree); postorderVisit (right_tree); process (current); } }
• 140. 140 Expression Trees [diagram: building the expression tree for (1+2)*3 from its post-fix notation 1 2 + 3 *] The stack contains references to tree nodes (bottom is to the left)
• 141. 141 Expression Trees [diagram: the expression tree for (1+2)*3] In-order traversal: (1 + 2) * 3 Post-order traversal: 1 2 + 3 *
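A minimal Java sketch tying the three traversal routines above to the BinaryNode class from the earlier slide; the convenience constructor and the demo expression tree for (1+2)*3 are ours, and process() is simply printing.

class BinaryNode {
    Object element;
    BinaryNode left, right;
    BinaryNode(Object e, BinaryNode l, BinaryNode r) { element = e; left = l; right = r; }
}

public class Traversals {
    static void preorder(BinaryNode t)  { if (t != null) { System.out.print(t.element); preorder(t.left); preorder(t.right); } }
    static void inorder(BinaryNode t)   { if (t != null) { inorder(t.left); System.out.print(t.element); inorder(t.right); } }
    static void postorder(BinaryNode t) { if (t != null) { postorder(t.left); postorder(t.right); System.out.print(t.element); } }

    public static void main(String[] args) {
        // expression tree for (1 + 2) * 3
        BinaryNode t = new BinaryNode("*",
                new BinaryNode("+", new BinaryNode("1", null, null), new BinaryNode("2", null, null)),
                new BinaryNode("3", null, null));
        preorder(t);  System.out.println();   // *+123
        inorder(t);   System.out.println();   // 1+2*3 (parentheses are lost)
        postorder(t); System.out.println();   // 12+3*
    }
}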
  • 165. Chapter 4: Trees AVL Trees Mark Allen Weiss, Florida University Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
  • 166. 166 AVL Trees  Keep the tree balanced  Use node rotation  Balanced condition: The left and the right subtrees of each node should differ by at most one level. The method was proposed by two Russian scientists Adelson-Velskii and Landis in 1962
  • 167. 167 Single Rotation 5 3 7 6 8 9 The sub-tree containing the inserted node (rooted at 8) is at the same side as the ‘heavier’ sub-tree of node 5 (rooted at 7) Links to be changed New node
  • 168. 168 After the Rotation 7 5 8 9 6 3 The middle node 6 is switched to the other subtree
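A hedged Java sketch of the single rotation pictured above, in the style of the book's AVL code; the AvlNode class and the height bookkeeping are assumptions. It handles the case on these slides: the inserted node is in the right subtree of the right child, so node 7 is rotated up over node 5 and the middle node 6 moves across.

class AvlNode {
    int element, height;
    AvlNode left, right;
    AvlNode(int e) { element = e; }
}

class Rotations {
    static int height(AvlNode t) { return t == null ? -1 : t.height; }

    // Single rotation for a right-right imbalance; returns the new subtree root.
    static AvlNode rotateWithRightChild(AvlNode k1) {     // k1 = node 5
        AvlNode k2 = k1.right;                            // k2 = node 7
        k1.right = k2.left;          // the middle subtree (node 6) moves to k1
        k2.left = k1;                // k2 becomes the new subtree root
        k1.height = Math.max(height(k1.left), height(k1.right)) + 1;
        k2.height = Math.max(height(k2.right), k1.height) + 1;
        return k2;
    }
}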
• 169. 169 Double Rotation [diagram: double rotation after inserting node 6] The new node 6 is in the left subtree of node 8, while node 8 is in the right subtree of node 5 -> first rotate 7 and 8 to obtain same-side trees, then rotate around 5. Animation
  • 170. 170 Splay Trees Move to the top each accessed node with rotations, decreasing the depth of the tree.
• 171. 171 Multi-Way Search Nodes contain more than one key; the nodes are used only to contain the keys, the actual records are stored at the terminal nodes at the bottom of the tree [diagram: a multi-way search tree]
• 172. 172 Complexity Issues  AVL trees: Search, Insertion, and Deletion: O(logN) Creating the tree – O(N)  Trees with multiple keys: O(log_m N), m – branching factor
  • 173. Chapter 4: Trees Binary Search Trees Mark Allen Weiss, Florida University Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
  • 174. 174 Binary Search Trees Definitions Operations and complexity Advantages and disadvantages AVL Trees Single rotation Double rotation Splay Trees Multi-Way Search
  • 175. 175 Definitions Each node – a record with a key and a value a left link a right link All records with smaller keys – left subtree All records with larger keys – right subtree
  • 177. 177 Operations  Search - compare the values and proceed either to the left or to the right  Insertion - unsuccessful search - insert the new node at the bottom where the search has stopped  Deletion - replace the value in the node with the smallest value in the right subtree or the largest value in the left subtree.  Retrieval in sorted order – inorder traversal
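A minimal Java sketch of BST search and insertion as described above, assuming integer keys; the class names are ours. Insert follows the path of an unsuccessful search and attaches the new node at the bottom where the search stopped.

class BstNode {
    int key;
    BstNode left, right;
    BstNode(int k) { key = k; }
}

class Bst {
    static BstNode insert(BstNode t, int key) {
        if (t == null) return new BstNode(key);      // bottom reached: attach here
        if (key < t.key)      t.left  = insert(t.left, key);
        else if (key > t.key) t.right = insert(t.right, key);
        return t;                                    // duplicate key: do nothing
    }

    static boolean contains(BstNode t, int key) {
        while (t != null) {                          // compare and go left or right
            if (key < t.key)      t = t.left;
            else if (key > t.key) t = t.right;
            else return true;
        }
        return false;
    }
}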
  • 178. 178 Complexity Logarithmic, depends on the shape of the tree In the worst case – O(N) comparisons
  • 179. 179 Advantages of BST  Simple  Efficient  Dynamic  One of the most fundamental algorithms in CS  The method of choice in many applications
• 180. 180 Disadvantages of BST  The shape of the tree depends on the order of insertions, and it can be degenerated.  When inserting or searching for an element, the key of each visited node has to be compared with the key of the element to be inserted/found. Keys may be long and the run time may increase considerably. Animation
  • 181. 181 Improvements of BST Keeping the tree balanced: AVL trees (Adelson - Velskii and Landis) Balance condition: left and right subtrees of each node can differ by at most one level. It can be proved that if this condition is observed the depth of the tree is O(logN). Reducing the time for key comparison: Radix trees - comparing only the leading bits of the keys (not discussed here)
  • 182. Chapter 4: Trees Radix Search Trees Mark Allen Weiss, Florida University Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
  • 183. 183 Radix Search Trees Radix Searching Digital Search Trees Radix Search Trees Multi-Way Radix Trees
• 184. 184 Radix Searching Idea: Examine the search keys one bit at a time Advantages: reasonable worst-case performance easy way to handle variable length keys some savings in space by storing part of the key within the search structure competitive with binary search trees
  • 185. 185 Radix Searching  Disadvantages:  biased data can lead to degenerate trees with bad performance  for some methods use of space is inefficient  dependent on computer’s architecture – difficult to do efficient implementations in some high-level languages
  • 186. 186 Radix Searching • Methods Digital Search Trees Radix Search Tries Multiway Radix Searching
  • 187. 187 Digital Search Trees Similar to binary tree search Difference: Branch in the tree by comparing the key’s bits, not the keys as a whole
• 188. 188 Example Keys: A 00001, S 10011, E 00101, R 10010, C 00011, H 01000, I 01001, N 01110, G 00111, X 11000, M 01101, P 10000, L 01100 [diagram: the digital search tree built from these keys, branching 0 = left, 1 = right]
• 189. 189 Example [diagram: the same tree] Inserting Z = 11010: go right twice, then go left to an external node, and attach Z to the left of X.
  • 190. 190 Digital Search Trees Things to remember about digital search trees:  Equal keys are anathema – must be kept in separate data structures, linked to the nodes.  Worst case – better than for binary search trees – the length of the longest path is equal to the longest match in the leading bits between any two keys.
  • 191. 191 Digital Search Trees  Search or insertion requires about log(N) comparisons on the average and b comparisons in the worst case in a tree built from N random b-bit keys.  No path will ever be longer than the number of bits in the keys
  • 192. 192 Radix Search Trees  If the keys are long digital search trees have low efficiency.  Radix search trees : do not store keys in the tree at all, the keys are in the external nodes of the tree.  Called tries (try-ee) from “retrieval”
  • 193. 193 Radix Search Trees Two types of nodes  Internal: contain only links to other nodes  External: contain keys and no links
• 194. 194 Radix Search Trees To insert a key – 1. Go along the path described by the leading bit pattern of the key until an external node is reached. 2. If the external node is empty, store there the new key. If the external node contains a key, replace it by an internal node linked to the new key and the old key. If the keys share several leading bits, more internal nodes are necessary. NOTE: insertion does not depend on the order of the keys.
  • 195. 195 Radix Search Trees To search for a key – 1. Branch according to its bits, 2. Don’t compare it to anything, until we get to an external node. 3. One full key comparison there completes the search.
• 196. 196 Example Keys: A 00001, S 10011, E 00101, R 10010, C 00011 [diagram: the radix search trie built from these keys]
• 197. 197 Example - insertion Keys: A 00001, S 10011, E 00101, R 10010, C 00011; inserting H 01000 [diagram: H is placed in an empty external node]
• 198. 198 Example - insertion Keys: A 00001, S 10011, E 00101, R 10010, C 00011, H 01000; inserting I 01001 [diagram: the external node holding H is occupied, so it is replaced by internal nodes until the bits of H and I differ]
• 199. 199 Radix Search Trees - summary • Program implementation -  Necessity to maintain two types of nodes  Low-level implementation • Complexity: about logN bit comparisons in the average case and b bit comparisons in the worst case in a tree built from N random b-bit keys. Annoying feature: one-way branching for keys with a large number of common leading bits:  The number of the nodes may exceed the number of the keys.  On average – N/ln 2 ≈ 1.44N nodes
• 200. 200 Multi-Way Radix Trees  The height of the tree is limited by the number of the bits in the keys  If we have larger keys – the height increases. One way to overcome this deficiency is to use multi-way radix searching.  The branching is not according to 1 bit, but rather according to several bits (most often 2)  If m bits are examined at a time – the search is sped up by a factor of 2^m  Problem: if m bits are examined at a time, the nodes will have 2^m links, which may result in a considerable amount of wasted space due to unused links.
• 201. 201 Multi-Way Radix Trees - example [diagram: a multi-way radix tree with 4-link nodes labeled 00, 01, 10, 11] Search – take the left, right or middle links according to the first two bits. Insert – replace an external node by the key (e.g. insert T 10100).
  • 202. 202 Multi-Way Radix Trees  Wasted space – due to the large number of unused links.  Worse if M - the number of bits considered, gets higher.  The running time: logMN – very efficient.  Hybrid method: Large M at the top, Small M at the bottom
  • 203. Chapter 5: Hashing Collision Resolution: Separate Chaining Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Mark Allen Weiss, Florida University
  • 204. Collision Resolution  Collision Resolution  Separate Chaining  Open Addressing  Linear Probing  Quadratic Probing  Double Hashing  Rehashing  Extendible Hashing
• 205. Hash Tables – Collision Problem: many-to-one mapping a potentially huge set of strings → a small set of integers Collision: a second key hashing to a previously used slot. Collision resolution: deals with keys that are mapped to same slots.
  • 206. Separate Chaining Invented by H. P. Luhn, an IBM engineer, in January 1953. Idea: Keys hashing to same slot are kept in linked lists attached to that slot Useful for highly dynamic situations, where the number of the search keys cannot be predicted in advance.
• 207. Example Key: A S E A R C H I N G E X A M P L E Hash: 1 8 5 1 7 3 8 9 3 7 5 2 1 2 5 1 5 (M = 11) [diagram: separate chaining – 11 slots, each holding a linked list of the keys that hash to it]
  • 208. Separate Chaining – Length of Lists N - number of keys M - size of table N/M - average length of the lists
  • 209. Separate Chaining – Search  Unsuccessful searches go to the end of some list.  Successful searches are expected to go half the way down some list.
• 210. Separate Chaining – Choosing Table Size M relatively small so as not to use up a large area of contiguous memory, but large enough so that the lists are short for more efficient sequential search
• 211. Separate Chaining – other chaining options  Keep the lists ordered: useful if there are many more searches than inserts, and if most of the searches are unsuccessful  Represent the chains as binary search trees. Extra effort needed – not efficient
  • 212. Separate Chaining – advantages and disadvantages Advantages  used when memory space is a concern  easily implemented Disadvantages  unevenly distributed keys – long lists : search time increases, many empty spaces in the table.
  • 213. Chapter 5: Hashing Hash Tables Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Mark Allen Weiss, Florida University
  • 214. 214 Hashing What is Hashing? Direct Access Tables Hash Tables
  • 215. 215 Hashing - basic idea A mapping between the search keys and indices - efficient searching into an array. Each element is found with one operation only.
• 216. 216 Hashing example Example:  1000 students,  identification number between 0 and 999,  use an array of 1000 elements.  If the 9-digit SSN of each student were used as the index instead, the array would need far more elements than there are students - a great waste of space.
  • 217. 217 The Approach  Directly referencing records in a table using arithmetic operations on keys to map them onto table addresses.  Hash function: function that transforms the search key into a table address.
  • 219. 219 Direct-Address Tables  The most elementary form of hashing  Assumption – direct one-to-one correspondence between the keys and numbers 0, 1, …, m-1., m – not very large.  Array A[m]. Each position (slot) in the array contains either a reference to a record, or NULL.  Cost – the size of the array we need is determined the largest key. Not very useful if there are only a few keys
  • 220. 220 Hash Functions Transform the keys into numbers within a predetermined interval to be used as indices in an array (table, hash table) to store the records
  • 221. 221 Hash Functions – Numerical Keys Keys – numbers If M is the size of the array, then h(key) = key % M. This will map all the keys into numbers within the interval [0 .. (M-1)].
  • 222. 222 Hash Functions – Character Keys Keys – strings of characters Treat the binary representation of a key as a number, and then apply the hash function
  • 223. 223 How keys are treated as numbers If each character is represented with m bits, then the string can be treated as base-m number.
• 224. 224 Example A K E Y : 00001 01011 00101 11001 = 1*32^3 + 11*32^2 + 5*32^1 + 25*32^0 = 44217 Each letter is represented by its position in the alphabet, e.g. K is the 11-th letter, and its representation is 01011 (11 in decimal)
  • 225. 225 Long Keys If the keys are very long, an overflow may occur. A solution to this is to apply the Horner’s method in computing the hash function.
• 226. 226 Horner's Method a_n x^n + a_(n-1) x^(n-1) + a_(n-2) x^(n-2) + … + a_1 x^1 + a_0 x^0 = x(x(…x(x(a_n x + a_(n-1)) + a_(n-2)) + …) + a_1) + a_0 Example: 4x^5 + 2x^4 + 3x^3 + x^2 + 7x^1 + 9x^0 = x(x(x(x(4x + 2) + 3) + 1) + 7) + 9 The polynomial can be computed by alternating the multiplication and addition operations
• 227. 227 Example V E R Y L O N G K E Y 10110 00101 10010 11001 01100 01111 01110 00111 01011 00101 11001 V E R Y L O N G K E Y 22 5 18 25 12 15 14 7 11 5 25 22*32^10 + 5*32^9 + 18*32^8 + 25*32^7 + 12*32^6 + 15*32^5 + 14*32^4 + 7*32^3 + 11*32^2 + 5*32^1 + 25*32^0
• 228. 228 Example (continued) V E R Y L O N G K E Y 22*32^10 + 5*32^9 + 18*32^8 + 25*32^7 + 12*32^6 + 15*32^5 + 14*32^4 + 7*32^3 + 11*32^2 + 5*32^1 + 25*32^0 = (((((((((22*32 + 5)*32 + 18)*32 + 25)*32 + 12)*32 + 15)*32 + 14)*32 + 7)*32 + 11)*32 + 5)*32 + 25 Compute the hash function by applying the mod operation at each step, thus avoiding overflow: h0 = (22*32 + 5) % M h1 = (32*h0 + 18) % M h2 = (32*h1 + 25) % M . . .
• 229. 229 Code int hash32 (char[] name, int tbl_size) { int key_length = name.length; int h = 0; for (int i = 0; i < key_length; i++) h = (32 * h + name[i]) % tbl_size; return h; }
• 230. 230 Hash Tables  Index - integer, generated by a hash function between 0 and M-1  Initially the table contains blank slots, marked by a sentinel value or a special field in each slot.
  • 231. 231 Hash Tables Insert - hash function to generate an address Search for a key in the table - the same hash function is used.
  • 232. 232 Size of the Table Table size M - different from the number of records N Load factor: N/M M must be prime to ensure even distribution of keys
  • 233. Chapter 5: Hashing Collision Resolution: Open Addressing Extendible Hashing Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Mark Allen Weiss, Florida University
  • 234. Collision Resolution  Collision Resolution  Separate Chaining  Open Addressing  Linear Probing  Quadratic Probing  Double Hashing  Rehashing  Extendible Hashing
  • 235. Open Addressing  Invented by A. P. Ershov and W. W. Peterson in 1957 independently.  Idea: Store collisions in the hash table.  Table size - must be at least twice the number of the records
  • 236. Open Addressing If collision occurs, next probes are performed following the formula: hi(x) = ( hash(x) + f(i) ) mod Table_Size where: hi(x) is an index in the table to insert x hash(x) is the hash function f(i) is the collision resolution function. i - the current attempt to insert an element
  • 237. Open Addressing Problems with delete: a special flag is needed to distinguish deleted from empty positions. Necessary for the search function – if we come to a “deleted” position, the search has to continue as the deletion might have been done after the insertion of the sought key – the sought key might be further in the table.
  • 238. Linear Probing f(i) = i Insert: If collision - probe the next slot . If unoccupied – store the key there. If occupied – continue probing next slot. Search: a) match – successful search b) empty position – unsuccessful search c) occupied and no match – continue probing. If end of the table - continue from the beginning
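A hedged Java sketch of linear-probing insert and search as described above, with Integer keys; the class name and hash function are assumptions, deletion is not handled, and the table is assumed never to fill up completely.

class LinearProbing {
    private final Integer[] table;   // null marks an empty slot
    private final int M;

    LinearProbing(int m) { M = m; table = new Integer[m]; }

    private int hash(int key) { return (key % M + M) % M; }

    void insert(int key) {           // assumes the table is not full
        int i = hash(key);
        while (table[i] != null) {
            if (table[i] == key) return;   // already present
            i = (i + 1) % M;               // probe the next slot, wrap around
        }
        table[i] = key;
    }

    boolean contains(int key) {
        int i = hash(key);
        while (table[i] != null) {
            if (table[i] == key) return true;   // match: successful search
            i = (i + 1) % M;                    // occupied, no match: continue
        }
        return false;                           // empty slot: unsuccessful
    }
}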
• 239. Example Key: A S E A R C H I N G E X A M P L E Hash: 1 0 5 1 18 3 8 9 14 7 5 5 1 13 16 12 5 [diagram: the table of 19 slots (indices 0-18) after all insertions; * marks unsuccessful attempts]
• 240. Linear Probing Disadvantage: “Primary clustering” Large clusters tend to build up: the probability that the next insertion fills the slot immediately after a cluster of i filled slots is (i+1)/M, while for a slot preceded by an empty slot it is only 1/M.
  • 241. Quadratic Probing Use a quadratic function to compute the next index in the table to be probed. The idea here is to skip regions in the table with possible clusters. f(i) = i2
• 242. Quadratic Probing In linear probing we check the i-th position. If it is occupied, we check the (i+1)-st position, next the (i+2)-nd, etc. In quadratic probing, if the i-th position is occupied we check the (i+1)-st, next the (i+4)-th, next the (i+9)-th, etc.
  • 243. Double Hashing Purpose – to overcome the disadvantage of clustering. A second hash function to get a fixed increment for the “probe” sequence. hash2(x) = R - (x mod R) R: prime, smaller than table size. f(i) = i*hash2(x)
  • 244. Rehashing Table size : M > N For small load factor the performance is much better, than for N/M close to one. Best choice: N/M = 0.5 When N/M > 0.75 - rehashing
  • 245. Rehashing Build a second table twice as large as the original and rehash there all the keys of the original table. Expensive operation, running time O(N) However, once done, the new hash table will have good performance.
  • 246. Extendible Hashing  external storage  N records in total to store,  M records in one disk block No more than two blocks are examined.
  • 247. Extendible Hashing Idea: • Keys are grouped according to the first m bits in their code. • Each group is stored in one disk block. • If some block becomes full, each group is split into two , and m+1 bits are considered to determine the location of a record.
• 248. Example 4 disk blocks, each can contain 3 records; 4 groups of keys according to the first two bits [diagram: directory 00 01 10 11 pointing to blocks 00010 00100 / 01001 01010 01100 / 10001 10100 / 11000 11010]
• 249. Example (cont.) New key to be inserted: 01011. Block 2 (the 01 group) is full, so we start considering 3 bits [diagram: directory entries 000/001, 010, 011, 100/101, 110/111; the 01 group splits into 010 (01001, 01010, 01011) and 011 (01100), while 000/001, 100/101, 110/111 still share their original blocks]
• 250. Extendible Hashing Size of the directory: 2^D 2^D = O(N^(1+1/M) / M) D - the number of bits considered. N - number of records M - number of records in a disk block
  • 251. Conclusion 1 Hashing is a search method, used when  sorting is not needed  access time is the primary concern
  • 252. Conclusion 2 Time-space trade-off: No memory limitations – use the key as a memory address (minimum amount of time). No time limitations – use sequential search (minimum amount of memory) Hashing – gives a balance between these two extremes – a way to use a reasonable amount of both memory and time.
  • 253. Conclusion 3 To choose a good hash function is a “black art”. The choice depends on the nature of keys and the distribution of the numbers corresponding to the keys.
  • 254. Conclusion 4 Best course of action: • separate chaining: if the number of records is not known in advance • open addressing: if the number of the records can be predicted and there is enough memory available
  • 255. Chapter 6: Priority Queues Priority Queues Binary Heaps Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Mark Allen Weiss, Florida University
  • 256. 256 Priority Queues The model Implementations Binary heaps Basic operations Animation
  • 257. 257 The Model A priority queue is a queue where:  Requests are inserted in the order of arrival  The request with highest priority is processed first (deleted from the queue)  The priority is indicated by a number, the lower the number - the higher the priority.
  • 258. 258 Implementations Linked list: - Insert at the beginning - O(1) - Find the minimum - O(N) Binary search tree (BST): - Insert O(logN) - Find minimum - O(logN)
• 259. 259 Implementations Binary heap Better than BST because it does not require links - Insert O(logN) - Find minimum O(1) Finding the minimal element takes constant time; after deleting it, however, the heap structure has to be adjusted, which requires O(logN) time.
• 260. 260 Binary Heap Heap-Structure Property: Complete Binary Tree - every level is full except possibly the last. New nodes are inserted at the last level from left to right. Heap-Order Property: Each node has a higher priority than its children
• 261. 261 Binary Heap [diagram: heap containing 6 10 12 15 17 18 23 20 19 34] Next node to be inserted - right child of the node holding 17
• 262. 262 Binary heap implementation with an array Root - A(1) Left child of A(i) - A(2i) Right child of A(i) - A(2i+1) Parent of A(i) - A([i/2]) The smallest element is always at the root, so the access time to the element with highest priority is constant, O(1).
  • 263. 263 Example 6 10 12 15 17 18 23 20 19 34 Consider 17: position in the array - 5. parent 10 is at position [5/2] = 2 left child is at position 5*2 = 10 (this is 34) right child - position 2*5 + 1 = 11 (empty.)
  • 264. 264 Problems Problem 1: 2 8 10 16 17 18 23 20 21 30 Reconstruct the binary heap
  • 265. 265 Problems Problem 2: Give the array representation for 3 10 16 13 12 18 23 15 19 30 14
  • 266. 266 Basic Operations  Insert a node – Percolate Up  Delete a node – Percolate Down  Decrease key, Increase key, Remove key  Build the heap
  • 267. 267 Percolate Up – Insert a Node A hole is created at the bottom of the tree, in the next available position. 6 10 12 15 17 18 23 20 19 34
  • 268. 268 Percolate Up Insert 20 6 10 12 15 17 18 23 21 19 34 20
  • 270. 270 Percolate Up 6 10 12 15 18 23 21 19 34 17 16 Complexity of insertion: O(logN)
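A hedged Java sketch of insertion with percolate up, using the 1-based array layout from the earlier slide (parent of i is i/2); the class and field names are ours.

class BinaryHeap {
    private int[] a = new int[16];   // a[1..size] holds the heap; a[0] unused
    private int size = 0;

    void insert(int x) {
        if (size + 1 == a.length) a = java.util.Arrays.copyOf(a, a.length * 2);
        int hole = ++size;                       // hole at the next free slot
        while (hole > 1 && x < a[hole / 2]) {    // percolate the hole up
            a[hole] = a[hole / 2];               // parent slides down
            hole /= 2;
        }
        a[hole] = x;                             // O(logN) in the worst case
    }
}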
  • 271. 271 Percolate Down – Delete a Node 10 16 13 12 18 23 15 19 30 34
• 272. 272 Percolate Down – the wrong way 10 16 13 12 18 23 15 19 30 34
  • 273. 273 Percolate Down – the wrong way 16 13 18 23 15 19 34 10 12 30 The empty hole violates the heap-structure property
  • 274. 274 Percolate Down Last element - 20. The hole at the root. 10 16 12 13 18 23 15 19 30 20 We try to insert 20 in the hole by percolating the hole down
  • 275. 275 Percolate Down 16 12 13 18 23 15 19 30 20 10
  • 276. 276 Percolate Down 16 13 18 23 15 19 30 20 10 12
  • 277. 277 Percolate Down 16 13 18 23 20 19 30 10 12 15 Complexity of deletion: O(logN)
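A hedged Java sketch of deleteMin with percolate down on the same 1-based array layout; the class and field names are ours, and a nonempty heap is assumed.

class MinHeap {
    int[] a;      // a[1..size] holds the heap
    int size;

    int deleteMin() {                // assumes size >= 1
        int min = a[1];
        a[1] = a[size--];            // last element fills the hole at the root
        percolateDown(1);
        return min;
    }

    void percolateDown(int hole) {
        int tmp = a[hole];
        for (int child; hole * 2 <= size; hole = child) {
            child = hole * 2;
            if (child != size && a[child + 1] < a[child]) child++; // smaller child
            if (a[child] < tmp) a[hole] = a[child];                // child moves up
            else break;                                            // correct spot found
        }
        a[hole] = tmp;                                             // O(logN)
    }
}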
  • 278. 278 Other Heap Operations 1. DecreaseKey(p,d) increase the priority of element p in the heap with a positive value d. percolate up. 2. IncreaseKey(p,d) decrease the priority of element p in the heap with a positive value d. percolate down.
  • 279. 279 Other Heap Operations 3. Remove(p) a. Assigning the highest priority to p - percolate p up to the root. b. Deleting the element in the root and filling the hole by percolating down and trying to insert the last element in the queue. 4. BuildHeap input N elements  place them into an empty heap through successive inserts. The worst case running time is O(NlogN).
  • 280. 280 Build Heap - O(N) Given an array of elements to be inserted in the heap, treat the array as a heap with order property violated, and then do operations to fix the order property.
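A hedged build-heap sketch on a 1-based array a[1..size]: percolate down every internal node, bottom-up. Although a single call can cost O(logN), the total over all nodes is O(N), as the theorem on the following slides proves. The helper repeats the percolateDown logic from the sketch above so the snippet is self-contained.

class BuildHeap {
    static void buildHeap(int[] a, int size) {
        for (int i = size / 2; i >= 1; i--)   // last internal node down to the root
            percolateDown(a, size, i);        // fix the order property at i
    }

    static void percolateDown(int[] a, int size, int hole) {
        int tmp = a[hole];
        for (int child; hole * 2 <= size; hole = child) {
            child = hole * 2;
            if (child != size && a[child + 1] < a[child]) child++;
            if (a[child] < tmp) a[hole] = a[child]; else break;
        }
        a[hole] = tmp;
    }
}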
• 281. 281 Example: 150 80 40 30 10 70 110 100 20 90 60 50 120 140 130 [diagram: the array drawn as a binary tree, with the order property violated]
  • 282. 282 Example (cont) 150 80 40 20 10 50 110 100 30 90 60 70 120 140 130 After processing height 1
• 283. 283 Example (cont) 150 10 40 20 60 50 110 100 30 90 80 70 120 140 130 After processing height 2
  • 284. 284 Example (cont) 10 20 40 30 60 50 110 100 150 90 80 70 120 140 130 After processing height 3
• 285. 285 Theorem For a perfect binary tree of height h containing N = 2^(h+1) - 1 nodes, the sum S of the heights of the nodes is S = 2^(h+1) - 1 - (h + 1) = O(N)
• 286. 286 Proof The tree has 1 node at height h, 2 nodes at height h-1, 4 nodes at height h-2, etc: 1 (2^0) node at height h, 2 (2^1) nodes at height h-1, 4 (2^2) nodes at height h-2, 8 (2^3) nodes at height h-3, …, 2^h nodes at height 0. S = Σ 2^i (h - i), i = 0 to h
• 287. 287 S = h + 2(h-1) + 4(h-2) + 8(h-3) + … + 2^(h-2)*2 + 2^(h-1)*1 (1) 2S = 2h + 4(h-1) + 8(h-2) + 16(h-3) + … + 2^(h-1)*2 + 2^h*1 (2) Subtract (1) from (2): S = -h + 2 + 4 + 8 + … + 2^h = -h - 1 + (2^(h+1) - 1) = (2^(h+1) - 1) - (h + 1)
• 288. 288 Proof (cont.) Note: 1 + 2 + 4 + 8 + … + 2^h = 2^(h+1) - 1 Hence S = (2^(h+1) - 1) - (h + 1) Hence the complexity of building a heap with N nodes is linear, O(N)
  • 289. Chapter 7: Sorting Algorithms Insertion Sort Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Mark Allen Weiss, Florida University
  • 290. 290 Sorting Algorithms Insertion Sort Shell Sort Heap Sort Merge Sort Quick Sort
  • 291. 291 Assumptions Elements to sort are placed in arrays of length N Can be compared Sorting can be performed in main memory
  • 292. 292 Complexity Simple sorting algorithms : O(N2) Shellsort: o(N2) Advanced sorting algorithms: O(NlogN) In general: Ω(NlogN)
  • 293. 293 Insertion Sort PRE: array of N elements (from 0 to N-1) POST: array sorted 1. An array of one element is sorted 2. Assume that the first p elements are sorted. for j = p to N-1 Take the j-th element and find a place for it among the first j sorted elements
• 294. 294 Insertion Sort int j, p; int tmp; for (p = 1; p < N; p++) { tmp = a[p]; for (j = p; j > 0 && tmp < a[j-1]; j--) a[j] = a[j-1]; a[j] = tmp; } Animation
• 295. 295 Analysis of the Insertion Sort Insert the N-th el.: at most N-1 comparisons, N-1 movements Insert the (N-1)-st el.: at most N-2 comparisons, N-2 movements Insert the 2nd el.: 1 comparison, 1 movement 2*(1 + 2 + … + N-1) = 2*N*(N-1)/2 = N(N-1) = Θ(N^2) Almost sorted array: O(N) Average complexity: Θ(N^2)
  • 296. 296 A lower bound for simple sorting algorithms • An inversion : an ordered pair (Ai, Aj) such that i < j but Ai > Aj Example: 10, 6, 7, 15, 3,1 Inversions are: (10,6), (10,7), (10,3), (10,1), (6,3), (6,1) (7,3), (7,1) (15,3), (15,1), (3,1)
  • 297. 297 Swapping Swapping adjacent elements that are out of order removes one inversion. A sorted array has no inversions. Sorting an array that contains i inversions requires at least i swaps of adjacent elements How many inversions are there in an average unsorted array?
  • 298. 298 Theorems Theorem 1: The average number of inversions in an array of N distinct elements is N (N - 1) / 4 Theorem 2: Any algorithm that sorts by exchanging adjacent elements requires Ω (N2) time on average. For a sorting algorithm to run in less than quadratic time it must do something other than swap adjacent elements
  • 299. Chapter 7: Sorting Algorithms Shell Sort Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Mark Allen Weiss, Florida University
  • 300. 300 Shell Sort Improves on insertion sort Compares elements far apart, then less far apart, finally compares adjacent elements (as an insertion sort).
  • 301. 301 Idea arrange the data sequence in a two-dimensional array sort the columns of the array repeat the process each time with smaller number of columns
• 302. 302 Example Data: 3 7 9 0 5 1 6 8 4 2 0 6 1 5 7 3 4 9 8 2 It is arranged in an array with 7 columns (left), then the columns are sorted (right):
3 7 9 0 5 1 6        3 3 2 0 5 1 5
8 4 2 0 6 1 5   ->   7 4 4 0 6 1 6
7 3 4 9 8 2          8 7 9 9 8 2
• 303. 303 Example (cont) The data is then arranged in 3 columns and the columns are sorted again:
3 3 2        0 0 1
0 5 1        1 2 2
5 7 4        3 3 4
4 0 6   ->   4 5 6
1 6 8        5 6 8
7 9 9        7 7 9
8 2          8 9
In the last step there is only one column – it is only a 6, an 8 and a 9 that have to move a little bit to their correct positions.
  • 304. 304 Implementation • one-dimensional array that is indexed appropriately. • an increment sequence to determine how far apart elements to be sorted are
  • 305. 305 Increment sequence Determines how far apart elements to be sorted are: h1, h2, ..., ht with h1 = 1 hk-sorted array - all elements spaced a distance hk apart are sorted relative to each other.
• 306. 306 Correctness of the algorithm Shellsort only works because an array that is h_k-sorted remains h_k-sorted after being h_(k-1)-sorted: subsequent sorts do not undo the work done by previous phases. The last step (with h = 1) is an Insertion Sort on the whole array.
  • 307. 307 Java code for Shell sort int j, p, gap; comparable tmp; for (gap = N/2; gap > 0; gap = gap/2) for ( p = gap; p < N ; p++) { tmp = a[p]; for ( j = p ; j >= gap && tmp < a[ j- gap ]; j = j - gap) a[ j ] = a[ j - gap ]; a[j] = tmp; }
• 308. 308 Increment sequences 1. Shell's original sequence: N/2, N/4, ..., 1 (repeatedly divide by 2). 2. Hibbard's increments: 1, 3, 7, ..., 2^k - 1 3. Knuth's increments: 1, 4, 13, ..., (3^k - 1)/2 4. Sedgewick's increments: 1, 5, 19, 41, 109, ..., 9*4^k - 9*2^k + 1 or 4^k - 3*2^k + 1
• 309. 309 Analysis Shellsort's worst-case performance using Hibbard's increments is Θ(n^(3/2)). The average performance is thought to be about O(n^(5/4)). The exact complexity of this algorithm is still being debated. For mid-sized data it performs nearly as well as, if not better than, the faster O(n log n) sorts.
  • 310. Chapter 7: Sorting Algorithms Merge Sort Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Mark Allen Weiss, Florida University
  • 312. 312 Idea Two sorted arrays can be merged in linear time with N comparisons only. Given an array to be sorted, consider separately its left half and its right half, sort them and then merge them.
  • 313. 313 Characteristics  Recursive algorithm.  Runs in O(NlogN) worst-case running time. Where is the recursion?  Each half is an array that can be sorted using the same algorithm - divide into two, sort separately the left and the right halves, and then merge them.
  • 315. 315 Merge Sort Code void merge_sort ( int [ ] a, int left, int right) { if(left < right) { int center = (left + right) / 2; merge_sort (a,left, center); merge_sort(a,center + 1, right); merge(a, left, center + 1, right); } }
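The merge routine is not shown on the slides; below is a hedged version matching the call merge(a, left, center + 1, right) above, assuming an int array. It merges two adjacent sorted ranges in linear time using a temporary array.

static void merge(int[] a, int leftPos, int rightPos, int rightEnd) {
    int leftEnd = rightPos - 1;
    int[] tmp = new int[rightEnd - leftPos + 1];
    int left = leftPos, right = rightPos, k = 0;

    while (left <= leftEnd && right <= rightEnd)         // main merge loop
        tmp[k++] = (a[left] <= a[right]) ? a[left++] : a[right++];
    while (left <= leftEnd)   tmp[k++] = a[left++];      // copy left leftovers
    while (right <= rightEnd) tmp[k++] = a[right++];     // copy right leftovers

    for (k = 0; k < tmp.length; k++)                     // copy back into a
        a[leftPos + k] = tmp[k];
}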
  • 316. 316 Analysis of Merge Sort Assumption: N is a power of two. For N = 1 time is constant (denoted by 1) Otherwise: time to mergesort N elements = time to mergesort N/2 elements + time to merge two arrays each N/2 el.
  • 317. 317 Recurrence Relation Time to merge two arrays each N/2 elements is linear, i.e. O(N) Thus we have: (a) T(1) = 1 (b) T(N) = 2T(N/2) + N
  • 318. 318 Solving the Recurrence Relation T(N) = 2T(N/2) + N divide by N: (1) T(N) / N = T(N/2) / (N/2) + 1 Telescoping: N is a power of two, so we can write (2) T(N/2) / (N/2) = T(N/4) / (N/4) +1 (3) T(N/4) / (N/4) = T(N/8) / (N/8) +1 ……. T(2) / 2 = T(1) / 1 + 1
  • 319. 319 Adding the Equations The sum of the left-hand sides will be equal to the sum of the right-hand sides: T(N) / N + T(N/2) / (N/2) + T(N/4) / (N/4) + … + T(2)/2 = T(N/2) / (N/2) + T(N/4) / (N/4) + …. + T(2) / 2 + T(1) / 1 + LogN (LogN is the sum of 1’s in the right-hand sides)
• 320. 320 Crossing Equal Terms, Final Formula After crossing the equal terms, we get T(N)/N = T(1)/1 + logN T(1) is 1, hence we obtain T(N) = N + NlogN = Θ(NlogN) Hence the complexity of the Merge Sort algorithm is Θ(NlogN).
  • 321. Chapter 7: Sorting Algorithms Quick Sort Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Mark Allen Weiss, Florida University
  • 322. 322 Quick Sort Basic Idea Code Analysis Advantages and Disadvantages Applications Comparison with Heap sort and Merge sort Animation
  • 323. 323 Basic Idea  Pick one element in the array, which will be the pivot.  Make one pass through the array, called a partition step, re-arranging the entries so that: • entries smaller than the pivot are to the left of the pivot. • entries larger than the pivot are to the right
  • 324. 324 Basic Idea  Recursively apply quicksort to the part of the array that is to the left of the pivot, and to the part on its right.  No merge step, at the end all the elements are in the proper order
  • 325. 325 Choosing the Pivot Some fixed element: e.g. the first, the last, the one in the middle. Bad choice - may turn to be the smallest or the largest element, then one of the partitions will be empty Randomly chosen (by random generator) - still a bad choice
  • 326. 326 Choosing the Pivot The median of the array (if the array has N numbers, the median is the [N/2] largest number). This is difficult to compute - increases the complexity.
  • 327. 327 Choosing the Pivot The median-of-three choice: take the first, the last and the middle element. Choose the median of these three elements.
  • 328. 328 Find the Pivot – Java Code int median3 ( int [ ]a, int left, int right) { int center = (left + right)/2; if (a [left] > a [ center]) swap (a [ left ], a [center]); if (a [center] > a [right]) swap (a[center], a[ right ]); if (a [left] > a [ center]) swap (a [ left ], a [center]); swap(a [ center ], a [ right-1 ]); return a[ right-1 ]; }
• 329. 329 Quick Sort – Java Code if (left + 10 <= right) { // do quick sort } else insertionSort(a, left, right);
• 330. 330 Quick Sort – Java Code { int i = left, j = right - 1; for ( ; ; ) { while (a[++i] < pivot) { } while (pivot < a[--j]) { } if (i < j) swap(a[i], a[j]); else break; } swap(a[i], a[right-1]); quickSort(a, left, i-1); quickSort(a, i+1, right); }
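A hedged assembly of the fragments on the last few slides into one self-contained method: median-of-three pivot selection, the partition loop above, and a cutoff of 10 to insertion sort. The swap and insertionSort helpers are ours.

static void quickSort(int[] a, int left, int right) {
    if (left + 10 <= right) {
        int pivot = median3(a, left, right);       // pivot parked at right-1
        int i = left, j = right - 1;
        for ( ; ; ) {
            while (a[++i] < pivot) { }
            while (pivot < a[--j]) { }
            if (i < j) swap(a, i, j);
            else break;
        }
        swap(a, i, right - 1);                     // restore the pivot
        quickSort(a, left, i - 1);
        quickSort(a, i + 1, right);
    } else {
        insertionSort(a, left, right);             // small ranges
    }
}

static int median3(int[] a, int left, int right) {
    int center = (left + right) / 2;
    if (a[left] > a[center])  swap(a, left, center);
    if (a[center] > a[right]) swap(a, center, right);
    if (a[left] > a[center])  swap(a, left, center);
    swap(a, center, right - 1);                    // hide the pivot at right-1
    return a[right - 1];
}

static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

static void insertionSort(int[] a, int left, int right) {
    for (int p = left + 1; p <= right; p++) {
        int tmp = a[p], j = p;
        for ( ; j > left && tmp < a[j - 1]; j--) a[j] = a[j - 1];
        a[j] = tmp;
    }
}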
• 331. 331 Implementation Notes Compare the two versions: A. while (a[++i] < pivot) { } while (pivot < a[--j]) { } if (i < j) swap(a[i], a[j]); else break; B. while (a[i] < pivot) { i++; } while (pivot < a[j]) { j--; } if (i < j) swap(a[i], a[j]); else break;
  • 332. 332 Implementation Notes If we have an array of equal elements, the second code will never increment i or decrement j, and will do infinite swaps. i and j will never cross.
  • 333. 333 Complexity of Quick Sort Average-case O(NlogN) Worst Case: O(N2) This happens when the pivot is the smallest (or the largest) element. Then one of the partitions is empty, and we repeat recursively the procedure for N-1 elements.
  • 334. 334 Complexity of Quick Sort Best-case O(NlogN) The pivot is the median of the array, the left and the right parts have same size. There are logN partitions, and to obtain each partitions we do N comparisons (and not more than N/2 swaps). Hence the complexity is O(NlogN)
  • 335. 335 Analysis T(N) = T(i) + T(N - i -1) + cN The time to sort the file is equal to  the time to sort the left partition with i elements, plus  the time to sort the right partition with N-i-1 elements, plus  the time to build the partitions.
  • 336. 336 Worst-Case Analysis The pivot is the smallest (or the largest) element T(N) = T(N-1) + cN, N > 1 Telescoping: T(N-1) = T(N-2) + c(N-1) T(N-2) = T(N-3) + c(N-2) T(N-3) = T(N-4) + c(N-3) …………... T(2) = T(1) + c.2
• 337. 337 Worst-Case Analysis T(N) + T(N-1) + T(N-2) + … + T(2) = T(N-1) + T(N-2) + … + T(2) + T(1) + c(N) + c(N-1) + c(N-2) + … + c*2 T(N) = T(1) + c*(the sum of 2 thru N) = T(1) + c(N(N+1)/2 - 1) = O(N^2)
  • 338. 338 Best-Case Analysis The pivot is in the middle T(N) = 2T(N/2) + cN Divide by N: T(N) / N = T(N/2) / (N/2) + c
  • 339. 339 Best-Case Analysis Telescoping: T(N) / N = T(N/2) / (N/2) + c T(N/2) / (N/2) = T(N/4) / (N/4) + c T(N/4) / (N/4) = T(N/8) / (N/8) + c …… T(2) / 2 = T(1) / (1) + c
• 340. 340 Best-Case Analysis Add all equations: T(N)/N + T(N/2)/(N/2) + T(N/4)/(N/4) + … + T(2)/2 = T(N/2)/(N/2) + T(N/4)/(N/4) + … + T(1)/1 + c*logN After crossing the equal terms: T(N)/N = T(1)/1 + c*logN T(N) = N + N*c*logN = O(NlogN)
  • 341. 341 Advantages and Disadvantages  Advantages:  One of the fastest algorithms on average  Does not need additional memory (the sorting takes place in the array - this is called in-place processing )  Disadvantages:  The worst-case complexity is O(N2)
  • 342. 342 Applications Commercial applications  QuickSort generally runs fast  No additional memory  The above advantages compensate for the rare occasions when it runs with O(N2)
  • 343. 343 Applications Never use in applications which require a guarantee of response time:  Life-critical (medical monitoring, life support in aircraft, space craft)  Mission-critical (industrial monitoring and control in handling dangerous materials, control for aircraft, defense, etc) unless you assume the worst-case response time Warning:
• 344. 344 Comparison with Heap Sort  Both have O(NlogN) complexity  quick sort runs faster (no heap tree to maintain)  the speed of quick sort is not guaranteed
  • 345. 345 Comparison with Merge Sort  Merge sort guarantees O(NlogN) time  Merge sort requires additional memory with size N  Usually Merge sort is not used for main memory sorting, only for external memory sorting
  • 346. Chapter 9: Graphs Topological Sort Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Mark Allen Weiss, Florida University
  • 348. 348 Problem P: Assemble pool table F: Carpet the floor W: Panel the walls C: Install ceiling How would you order these activities?
  • 349. 349 Problem Graph S F P W C E S – start F – floor P – pool table E – end W – walls C - ceiling Some possible orderings: C W F P W C F P F P W C C F P W Important: P must be after F
  • 350. 350 Topological Sort RULE: If there is a path from u to v, then v appears after u in the ordering.
• 351. 351 Graphs’ Characteristics The graph must be: directed - otherwise (u,v) would mean a path both from u to v and from v to u, and the two vertices could not be ordered acyclic - otherwise for u and v on a cycle, u would precede v and v would precede u
  • 352. 352 The ordering may not be unique V1 V2 V3 V4 legal orderings V1, V2, V3, V4 V1, V3, V2, V4
• 353. 353 Some Definitions Degree (out-degree) of a vertex U: the number of outgoing edges (U,V) Indegree of a vertex U: the number of incoming edges (V,U) The algorithm for topological sort uses "indegrees" of vertices.
  • 354. 354 Basic Algorithm 1. Compute the indegrees of all vertices 2. Find a vertex U with indegree 0 and print it (store it in the ordering) If there is no such vertex then there is a cycle and the vertices cannot be ordered. Stop. 3. Remove U and all its edges (U,V) from the graph. 4. Update the indegrees of the remaining vertices. Repeat steps 2 through 4 while there are vertices to be processed.
  • 355. 355 Example V1 V2 V3 V4 V5 Indegrees V1 0 V2 1 V3 2 V4 2 V5 1 First to be sorted: V1 New indegrees: V2 0 V3 1 V4 1 V5 2 Possible sorts: V1, V2, V4, V3, V5; V1, V2, V4, V5, V3
  • 356. 356 Complexity O(|V|2), |V| - the number of vertices. To find a vertex of indegree 0 we scan all the vertices - |V| operations. We do this for all vertices: |V|2 operations
  • 357. 357 Improved Algorithm  Find a vertex of degree 0,  Scan only those vertices whose indegrees have been updated to 0.
  • 358. 358 Improved Algorithm 1. Compute the indegrees of all vertices 2. Store all vertices with indegree 0 in a queue. 3. Get a vertex U. 4. For all edges (U,V) update the indegree of V, and put V in the queue if the updated indegree is 0. Repeat steps 3 and 4 while the queue is not empty.
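A hedged Java sketch of the improved algorithm; adj.get(u) is assumed to hold the adjacency list of vertex u (0-based), and the returned array holds a legal ordering.

import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

public class TopoSort {
    static int[] topoSort(List<List<Integer>> adj) {
        int n = adj.size();
        int[] indegree = new int[n];
        for (List<Integer> edges : adj)
            for (int v : edges) indegree[v]++;         // step 1: all indegrees

        Queue<Integer> q = new ArrayDeque<>();
        for (int v = 0; v < n; v++)
            if (indegree[v] == 0) q.add(v);            // step 2: indegree-0 vertices

        int[] order = new int[n];
        int count = 0;
        while (!q.isEmpty()) {
            int u = q.remove();                        // step 3: get a vertex
            order[count++] = u;
            for (int v : adj.get(u))                   // step 4: update indegrees
                if (--indegree[v] == 0) q.add(v);
        }
        if (count != n) throw new IllegalStateException("graph has a cycle");
        return order;
    }
}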
  • 359. 359 Complexity Operations needed to compute the indegrees: Adjacency lists: O(|E|) Matrix: O(|V|2) Complexity of the improved algorithm Adjacency lists: O(|E| + |V|), Matrix representation: O(|V|2) Note that if the graph is complete |E| = O(|V|2) |V| - number of vertices, |E| - number of edges.
  • 360. Chapter 9: Graphs Shortest Paths Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Mark Allen Weiss, Florida University
  • 361. 361 Shortest Paths Shortest Paths in Unweighted Graphs Shortest Paths in Weighted Graphs - Dijkstra’s Algorithm Animation
  • 362. 362 Unweighted Directed Graphs V1 V2 V3 V4 V5 V6 V7 What is the shortest path from V3 to V5?
  • 363. 363 Problem Data The problem: Given a source vertex s, find the shortest path to all other vertices. Data structures needed: Graph representation: Adjacency lists / adjacency matrix Distance table: distances from source vertex paths from source vertex
  • 364. 364 Example Adjacency lists : V1: V2, V4 V2: V4, V5 V3: V1, V6 V4: V3, V5, V6, V7 V5: V7 V6: - V7: V6 Let s = V3, stored in a queue Initialized distance table: distance parent V1 -1 -1 V2 -1 -1 V3 0 0 V4 -1 -1 V5 -1 -1 V6 -1 -1 V7 -1 -1
  • 365. 365 Breadth-first search in graphs Take a vertex and examine all adjacent vertices. Do the same with each of the adjacent vertices .
  • 366. 366 Algorithm 1. Store s in a queue, and initialize distance = 0 in the Distance Table 2. While there are vertices in the queue: Read a vertex v from the queue For all adjacent vertices w : If distance = -1 (not computed) Distance = (distance to v) + 1 Parent = v Append w to the queue
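A hedged Java sketch of this BFS; distance -1 marks "not computed", as in the distance table on the earlier slide, and parent[] records the shortest-path tree. The class and method names are ours.

import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class UnweightedPaths {
    static int[] distances(List<List<Integer>> adj, int s) {
        int n = adj.size();
        int[] dist = new int[n];
        int[] parent = new int[n];
        Arrays.fill(dist, -1);
        Arrays.fill(parent, -1);

        Queue<Integer> q = new ArrayDeque<>();
        dist[s] = 0;                           // step 1
        q.add(s);
        while (!q.isEmpty()) {                 // step 2
            int v = q.remove();
            for (int w : adj.get(v))
                if (dist[w] == -1) {           // not computed yet
                    dist[w] = dist[v] + 1;
                    parent[w] = v;             // shortest-path tree edge
                    q.add(w);
                }
        }
        return dist;
    }
}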
  • 367. 367 Complexity Matrix representation: O(|V|2) Adjacency lists - O(|E| + |V|) We examine all edges (O(|E|)), and we store in the queue each vertex only once (O(|V|)).
  • 368. 368 Weighted Directed Graphs V1 V2 V3 V4 V5 V6 V7 What is the shortest distance from V3 to V7?
  • 369. 369 Comparison Similar to the algorithm for unweighted graphs Differences: - weights are included in the graph representation - priority queue : the node with the smallest distance is chosen for processing - distance is not any more the number of edges, instead it is the sum of weights - Distance table is updated if newly computed distance is smaller.
  • 370. 370 Algorithm 1. Store s in a priority queue with distance = 0 2. While there are vertices in the queue DeleteMin a vertex v from the queue For all adjacent vertices w: Compute new distance Store in / update Distance table Insert/update in priority queue
  • 371. 371 Processing Adjacent Nodes For all adjacent vertices w: Compute new distance = (distance to v) + (d(v,w)) If distance = -1 (not computed) store new distance in table path = v Insert w in priority queue If old distance > new distance Update old_distance = new_distance Update path = v Update priority in priority queue
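A hedged Java sketch of this algorithm using java.util.PriorityQueue. Since that class has no efficient decrease-key, a new entry is inserted on every improvement and stale entries are skipped when removed, instead of being updated in place as the slide describes; the Edge class is an assumption.

import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

public class Dijkstra {
    static class Edge { int to, weight; Edge(int t, int w) { to = t; weight = w; } }

    static int[] shortest(List<List<Edge>> adj, int s) {
        int n = adj.size();
        int[] dist = new int[n];
        int[] parent = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);   // "not computed"
        Arrays.fill(parent, -1);
        dist[s] = 0;

        // queue entries: {vertex, distance at insertion time}
        PriorityQueue<int[]> pq = new PriorityQueue<>((x, y) -> Integer.compare(x[1], y[1]));
        pq.add(new int[]{s, 0});
        while (!pq.isEmpty()) {
            int[] top = pq.remove();            // DeleteMin
            int v = top[0];
            if (top[1] > dist[v]) continue;     // stale entry: skip
            for (Edge e : adj.get(v)) {
                int nd = dist[v] + e.weight;    // new distance via v
                if (nd < dist[e.to]) {          // update the Distance table
                    dist[e.to] = nd;
                    parent[e.to] = v;
                    pq.add(new int[]{e.to, nd});
                }
            }
        }
        return dist;
    }
}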
• 372. 372 Complexity O(E logV + V logV) = O((E + V) logV) Each vertex is stored only once in the queue – O(V) DeleteMin operations: O(V logV) Updating the priority queue – search and insert: O(logV), performed at most once for each edge: O(E logV)
  • 373. Chapter 9: Graphs Spanning Trees Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Mark Allen Weiss, Florida University
  • 374. 374 Spanning Trees Spanning Trees in Unweighted Graphs Minimal Spanning Trees in Weighted Graphs - Prim’s Algorithm Minimal Spanning Trees in Weighted Graphs - Kruskal’s Algorithm Animation
  • 375. 375 Definitions Spanning tree: a tree that contains all vertices in the graph. Number of nodes: |V| Number of edges: |V|-1
  • 376. 376 Spanning trees for unweighted graphs - data structures A table (an array) T size = number of vertices, Tv= parent of vertex v Adjacency lists A queue of vertices to be processed
• 377. 377 Algorithm - initialization 1. Choose a vertex S and store it in a queue, set a counter = 0 (counts the processed nodes), and Ts = 0 (to indicate the root), Ti = -1, i ≠ s (to indicate vertex not processed)
• 378. 378 Algorithm – basic loop 2. While queue not empty and counter < |V| - 1: Read a vertex V from the queue For all adjacent vertices U: If Tu = -1 (not processed) Tu ← V counter ← counter + 1 store U in the queue
  • 379. 379 Algorithm – results and complexity Result: Root: S, such that Ts = 0 Edges in the tree: (Tv , V) Complexity: O(|E| + |V|) - we process all edges and all nodes
  • 380. 380 Example 1 2 3 4 5 Table 0 // 1 is the root 1 // edge (1,2) 2 // edge (2,3) 1 // edge (1,4) 2 // edge (2,5) Edge format: (Tv,V)
  • 381. 381 Minimum Spanning Tree - Prim's algorithm Given: Weighted graph. Find a spanning tree with the minimal sum of the weights. Similar to shortest paths in a weighted graphs. Difference: we record the weight of the current edge, not the length of the path .
  • 382. 382 Data Structures A table : rows = number of vertices, three columns: T(v,1) = weight of the edge from parent to vertex V T(v,2) = parent of vertex V T(v,3) = True, if vertex V is fixed in the tree, False otherwise
  • 383. 383 Data Structures (cont) Adjacency lists A priority queue of vertices to be processed. Priority of each vertex is determined by the weight of edge that links the vertex to its parent. The priority may change if we change the parent, provided the vertex is not yet fixed in the tree.
  • 384. 384 Prim’s Algorithm Initialization For all V, set first column to –1 (not included in the tree) second column to 0 (no parent) third column to False (not fixed in the tree) Select a vertex S, set T(s,1) = 0 (root: no parent) Store S in a priority queue with priority 0.
• 385. 385 While (queue not empty) loop 1. DeleteMin V from the queue, set T(v,3) = True 2. For all adjacent to V vertices W: If T(w,3) = True do nothing – the vertex is fixed If T(w,1) = -1 (i.e. vertex not included) T(w,1) ← weight of edge (v,w) T(w,2) ← v (the parent of w) insert w in the queue with priority = weight of (v,w)
• 386. 386 While (queue not empty) loop (cont) If T(w,1) > weight of (v,w) T(w,1) ← weight of edge (v,w) T(w,2) ← v (the parent of w) update priority of w in the queue: remove, and insert with new priority = weight of (v,w)
  • 387. 387 Results and Complexity At the end of the algorithm, the tree would be represented in the table with its edges {(v, T(v,2) ) | v = 1, 2, …|V|} Complexity: O(|E|log(|V|))
  • 388. 388 Kruskal's Algorithm The algorithm works with : set of edges, tree forests, each vertex belongs to only one tree in the forest.
  • 389. 389 The Algorithm 1. Build |V| trees of one vertex only - each vertex is a tree of its own. Store edges in priority queue 2. Choose an edge (u,v) with minimum weight if u and v belong to one and the same tree, do nothing if u and v belong to different trees, link the trees by the edge (u,v) 3. Perform step 2 until all trees are combined into one tree only
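A hedged Java sketch of this algorithm; edges[i] = {u, v, weight}. Sorting the edge array stands in for the priority queue, and a simple union-find with parent pointers tracks the forest of trees; all names are ours.

import java.util.Arrays;

public class Kruskal {
    static int[] parent;

    static int find(int x) {                         // walk up to the tree root
        while (parent[x] != x) x = parent[x];
        return x;
    }

    static int mstWeight(int n, int[][] edges) {
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;   // step 1: |V| one-vertex trees

        Arrays.sort(edges, (a, b) -> Integer.compare(a[2], b[2]));  // cheapest first
        int total = 0, used = 0;
        for (int[] e : edges) {                      // step 2
            int ru = find(e[0]), rv = find(e[1]);
            if (ru != rv) {                          // different trees: link them
                parent[ru] = rv;
                total += e[2];
                if (++used == n - 1) break;          // step 3: one tree remains
            }
        }
        return total;
    }
}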
• 390. 390 Example [diagram: a weighted graph on vertices 1-5] Edges in Priority Queue: (1,2) 1 (3,5) 2 (2,4) 3 (1,3) 4 (4,5) 4 (2,5) 5 (1,5) 6
• 391. 391 Example (cont) Forest of 5 trees Edges in Priority Queue: (1,2) 1 (3,5) 2 (2,4) 3 (1,3) 4 (4,5) 4 (2,5) 5 (1,5) 6 [diagram: process edge (1,2), then edge (3,5) – each links two one-vertex trees]
• 392. 392 Example (cont) Edges in Priority Queue: (2,4) 3 (1,3) 4 (4,5) 4 (2,5) 5 (1,5) 6 [diagram: process edge (2,4), then edge (1,3)] All trees are combined in one, the algorithm stops
Operations on Disjoint Sets
Comparison: are two vertices in the same tree?
Union: combine two disjoint sets to form one set - the new tree.
Implementation: union/find operations. The sets are represented as trees, so each operation takes O(log n) time. The forest of trees is implemented by an array.
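A minimal sketch of such an array-based disjoint-sets structure, assuming union by size (one of several standard variants); keeping the trees balanced this way bounds every find path by O(log n):

public class DisjointSets {
    private final int[] parent;   // parent[i] < 0 : i is a root, -parent[i] = set size

    public DisjointSets(int n) {
        parent = new int[n];
        java.util.Arrays.fill(parent, -1);   // n singleton sets
    }

    public int find(int x) {                 // walk up to the root
        while (parent[x] >= 0) x = parent[x];
        return x;
    }

    public boolean union(int a, int b) {     // returns false if already joined
        int ra = find(a), rb = find(b);
        if (ra == rb) return false;
        if (parent[ra] > parent[rb]) { int t = ra; ra = rb; rb = t; } // ra = larger set
        parent[ra] += parent[rb];            // merge the sizes
        parent[rb] = ra;                     // attach smaller tree under larger
        return true;
    }
}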
Complexity of Kruskal's Algorithm
O(|E| log(|V|))
Explanation: the complexity is determined by the heap operations and the union/find operations.
Union/find operations on disjoint sets represented as trees: tree search operations, with complexity O(log|V|).
DeleteMin in a binary heap with N elements is O(logN); performed for N elements, it is O(NlogN).

Complexity (cont)
Kruskal's algorithm works with a binary heap that contains all edges: O(|E| log(|E|)).
However, a complete graph has |E| = |V|(|V|-1)/2, i.e. |E| = O(|V|²).
Thus for the binary heap we have:
O(|E| log(|E|)) = O(|E| log(|V|²)) = O(2|E| log(|V|)) = O(|E| log(|V|))

Complexity (cont)
Since for each edge we perform a DeleteMin operation and a union/find operation, the overall complexity is:
O(|E| (log(|V|) + log(|V|))) = O(|E| log(|V|))
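Putting the pieces together, a sketch of Kruskal's algorithm that uses the DisjointSets class from the sketch above; the {u, v, weight} edge encoding is an illustrative choice:

import java.util.*;

public class Kruskal {
    public static List<int[]> mst(int n, List<int[]> edges) { // edge = {u, v, weight}
        PriorityQueue<int[]> pq = new PriorityQueue<>(Comparator.comparingInt(e -> e[2]));
        pq.addAll(edges);
        DisjointSets sets = new DisjointSets(n);
        List<int[]> tree = new ArrayList<>();
        while (!pq.isEmpty() && tree.size() < n - 1) {
            int[] e = pq.remove();            // minimum-weight remaining edge
            if (sets.union(e[0], e[1]))       // different trees: accept the edge
                tree.add(e);
        }
        return tree;                          // the |V| - 1 edges of the MST
    }

    public static void main(String[] args) {
        // The 5-vertex example above, 0-indexed.
        List<int[]> edges = List.of(
            new int[]{0,1,1}, new int[]{2,4,2}, new int[]{1,3,3}, new int[]{0,2,4},
            new int[]{3,4,4}, new int[]{1,4,5}, new int[]{0,4,6});
        for (int[] e : mst(5, edges))
            System.out.println("(" + (e[0]+1) + "," + (e[1]+1) + ") " + e[2]);
    }
}

On the example it prints the accepted edges (1,2) 1, (3,5) 2, (2,4) 3, (1,3) 4, matching the trace above.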
Chapter 9: Graphs
Breadth-First and Depth-First Search
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University

Breadth-First and Depth-First Search
 BFS Basic Algorithm
 BFS Complexity
 DFS Algorithm
 DFS Implementation
 Relation between BFS and DFS
BFS – Basic Idea
Given a graph with N vertices and a selected vertex A:
for (i = 1; there are unvisited vertices; i++)
    visit all unvisited vertices at distance i
(i is the length of the shortest path between A and the currently processed vertices)
Queue-based implementation.
BFS – Algorithm
1. Store the source vertex S in a queue and mark it as processed.
2. While the queue is not empty:
   Read a vertex v from the queue
   For all neighbors w:
     If w is not processed:
       mark it as processed
       append it to the queue
       record the parent of w to be v (necessary only if we need the shortest path tree)
Example
[Figure: graph on vertices 1-6]
Adjacency lists:
1: 2, 3, 4
2: 1, 3, 6
3: 1, 2, 4, 5, 6
4: 1, 3, 5
5: 3, 4
6: 2, 3
Breadth-first traversal: 1, 2, 3, 4, 6, 5
1: starting node
2, 3, 4: adjacent to 1 (at distance 1 from node 1)
6: unvisited, adjacent to node 2
5: unvisited, adjacent to node 3
The order depends on the order of the nodes in the adjacency lists.
Shortest Path Tree
Consider the distance table:

Node          A  B  C  D  E
Distance to A 0  1  1  2  1
Parent        0  A  A  C  A

The table defines the shortest path tree, rooted at A.
BFS – Complexity
Step 1: a node is read from the queue, O(|V|) times in total.
Step 2: we examine all neighbors, i.e. all edges of the currently read node. In an undirected graph each edge is examined twice, 2|E| in total.
Hence the complexity of BFS is O(|V| + 2|E|) = O(|V| + |E|).
Depth-First Search
Procedure dfs(s):
  mark all vertices in the graph as not reached
  invoke scan(s)

Procedure scan(s):
  mark and visit s
  for each neighbor w of s:
    if the neighbor is not reached, invoke scan(w)

Depth-First Search with a Stack
Initialization: mark all vertices as unvisited, then visit(s)
while the stack is not empty:
  pop (v,w)
  if w is not visited:
    add (v,w) to tree T
    visit(w)

Procedure visit(v):
  mark v as visited
  for each edge (v,w): push (v,w) onto the stack
Recursive DFS
DepthFirst(Vertex v):
  visit(v)
  for each neighbor w of v:
    if w is not visited:
      add edge (v,w) to tree T
      DepthFirst(w)

visit(v): mark v as visited
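The same recursive DFS as a small Java class; the names are illustrative, and the DFS tree edges are collected in the order in which vertices are first reached:

import java.util.*;

public class DepthFirst {
    private final List<List<Integer>> adj;
    private final boolean[] visited;
    private final List<int[]> tree = new ArrayList<>();   // DFS tree edges

    public DepthFirst(List<List<Integer>> adj) {
        this.adj = adj;
        this.visited = new boolean[adj.size()];
    }

    public void search(int v) {
        visited[v] = true;                  // visit(v)
        for (int w : adj.get(v)) {
            if (!visited[w]) {
                tree.add(new int[]{v, w});  // tree edge (v, w)
                search(w);                  // recurse into the unvisited neighbor
            }
        }
    }
}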
Example
[Figure: graph on vertices 1-6]
Adjacency lists:
1: 2, 3, 4
2: 6, 3, 1
3: 1, 2, 6, 5, 4
4: 1, 3, 5
5: 3, 4
6: 2, 3
Depth-first traversal: 1, 2, 6, 3, 5, 4
The particular order depends on the order of the nodes in the adjacency lists.
BFS and DFS
bfs(G):
  list L = empty; tree T = empty
  choose a starting vertex x; visit(x)
  while L nonempty:
    remove edge (v,w) from the beginning of L
    if w not visited: add (v,w) to T; visit(w)

dfs(G):
  list L = empty; tree T = empty
  choose a starting vertex x; visit(x)
  while L nonempty:
    remove edge (v,w) from the end of L
    if w not visited: add (v,w) to T; visit(w)

visit(vertex v):
  mark v as visited
  for each edge (v,w): add edge (v,w) to the end of L

The only difference: BFS removes edges from the front of L (queue behavior), DFS from the end (stack behavior).
Applications of DFS
 Trees: preorder traversal
 Graphs:
   Connectivity
   Biconnectivity – articulation points
   Euler graphs

Chapter 9: Graphs
Applications of Depth-First Search
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
Connectivity
Definition: An undirected graph is said to be connected if any pair of nodes of the graph are reachable from one another (i.e. there is a path between them).
If, starting from any vertex, we can visit all other vertices, then the graph is connected.
Biconnectivity
A graph is biconnected if there are no vertices whose removal disconnects the graph.
[Figures: two graphs on vertices A-E, one biconnected and one not biconnected]
Articulation Points
Definition: A vertex whose removal makes the graph disconnected is called an articulation point (or cut vertex).
[Figure: graph on vertices A-E in which C is an articulation point]
We can compute articulation points using depth-first search and a special numbering of the vertices in the order of their visiting.
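A sketch of that numbering technique, assuming a connected simple graph: num[v] records the visiting order and low[v] the smallest number reachable from v's subtree using at most one back edge. A non-root vertex v is an articulation point exactly when some DFS child w has low[w] >= num[v]; the root is one when it has more than one DFS child.

import java.util.*;

public class ArticulationPoints {
    private final List<List<Integer>> adj;
    private final int[] num, low;
    private int counter = 1;
    private final Set<Integer> cutVertices = new TreeSet<>();

    public ArticulationPoints(List<List<Integer>> adj) {
        this.adj = adj;
        num = new int[adj.size()];
        low = new int[adj.size()];
        dfs(0, -1);                               // assumes a connected graph
    }

    private void dfs(int v, int parent) {
        num[v] = low[v] = counter++;              // visiting-order number
        int children = 0;
        for (int w : adj.get(v)) {
            if (num[w] == 0) {                    // tree edge (v, w)
                children++;
                dfs(w, v);
                low[v] = Math.min(low[v], low[w]);
                if (parent != -1 && low[w] >= num[v])
                    cutVertices.add(v);           // w's subtree cannot bypass v
            } else if (w != parent) {             // back edge (v, w)
                low[v] = Math.min(low[v], num[w]);
            }
        }
        if (parent == -1 && children > 1)
            cutVertices.add(v);                   // special rule for the root
    }
}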
Bridges
Definition: An edge in a graph is called a bridge if its removal disconnects the graph.
Any edge in a graph that does not lie on a cycle is a bridge.
In any graph with more than two vertices, a bridge has an articulation point at at least one of its ends; an articulation point, however, is not necessarily the endpoint of a bridge.
[Figure: graph on vertices A-E in which (C,D) and (E,D) are bridges]
Example 1
[Figure: graph on vertices A-F] C is an articulation point; there are no bridges.

Example 2
[Figure: graph on vertices A-F] C is an articulation point; CB is a bridge.

Example 3
[Figure: graph on vertices A-G] B and C are articulation points; BC is a bridge.

Example 4
[Figure: graph on vertices A-E] B and C are articulation points; all edges are bridges.

Example 5
[Figure: graph on vertices A-G] A biconnected graph - no articulation points and no bridges.
Connectivity in Directed Graphs (I)
Definition: A directed graph is said to be strongly connected if for any pair of nodes there is a path from each one to the other.
[Figure: a strongly connected digraph on vertices A-E]

Connectivity in Directed Graphs (II)
Definition: A directed graph is said to be unilaterally connected if for any pair of nodes at least one of the nodes is reachable from the other.
[Figure: a unilaterally connected digraph on vertices A-E]
Each strongly connected graph is also unilaterally connected.

Connectivity in Directed Graphs (III)
Definition: A directed graph is said to be weakly connected if the underlying undirected graph is connected.
[Figure: a weakly connected digraph on vertices A-E; there is no path between B and D]
Each unilaterally connected graph is also weakly connected.
Chapter 9: Graphs
Summary
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University

Summary of Graph Algorithms
Basic Concepts
Topological Sort
Shortest Paths
Spanning Trees
Scheduling Networks
Breadth-First and Depth-First Search
Connectivity

Basic Concepts
 Basic definitions: vertices and edges
 More definitions: paths, simple paths, cycles, loops
 Connected and disconnected graphs
 Spanning trees
 Complete graphs
 Weighted graphs and networks
 Graph representation:
   Adjacency matrix
   Adjacency lists
Topological Sort
RULE: If there is a path from u to v, then v appears after u in the ordering.
Graphs: directed and acyclic.
Outdegree of a vertex U: the number of outgoing edges.
Indegree of a vertex U: the number of incoming edges.
The algorithm for topological sort uses the indegrees of the vertices.
Implementation: with a queue (see the sketch below).
Complexity: adjacency lists O(|E| + |V|); matrix representation O(|V|²).
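A sketch of the indegree/queue implementation on adjacency lists; the cycle check at the end is an added safeguard, not part of the slides:

import java.util.*;

public class TopologicalSort {
    public static int[] sort(List<List<Integer>> adj) {
        int n = adj.size();
        int[] indegree = new int[n];
        for (List<Integer> out : adj)              // count incoming edges
            for (int w : out) indegree[w]++;
        Queue<Integer> queue = new ArrayDeque<>();
        for (int v = 0; v < n; v++)
            if (indegree[v] == 0) queue.add(v);    // sources go first
        int[] order = new int[n];
        int filled = 0;
        while (!queue.isEmpty()) {
            int v = queue.remove();
            order[filled++] = v;
            for (int w : adj.get(v))
                if (--indegree[w] == 0)            // w's last predecessor is placed
                    queue.add(w);
        }
        if (filled < n) throw new IllegalStateException("graph has a cycle");
        return order;
    }
}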
Shortest Path in Unweighted Graphs
Data structures needed: distance table, queue.
Implementation: breadth-first search using a queue.
Complexity: matrix representation O(|V|²); adjacency lists O(|E| + |V|).

Algorithm
1. Store s in a queue, and initialize distance = 0 in the distance table.
2. While there are vertices in the queue:
   Read a vertex v from the queue
   For all adjacent vertices w:
     If distance = -1 (not computed):
       distance(w) = distance(v) + 1
       parent(w) = v
       append w to the queue

Shortest Path in Weighted Graphs – Dijkstra's Algorithm
1. Store s in a priority queue with distance = 0.
2. While there are vertices in the queue:
   DeleteMin a vertex v from the queue
   For all adjacent vertices w:
     compute the new distance
     store in / update the distance table
     insert / update in the priority queue
Complexity
O(|E| log|V| + |V| log|V|) = O((|E| + |V|) log|V|)
Each vertex is stored only once in the queue: O(|V|).
DeleteMin operations: O(|V| log|V|).
Updating the priority queue - search and insert: O(log|V|), performed at most once per edge: O(|E| log|V|).
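A Dijkstra sketch with the same lazy priority-queue handling as in the Prim sketch earlier; the adjacency lists hold {neighbor, weight} pairs, which is an illustrative encoding:

import java.util.*;

public class Dijkstra {
    public static int[] distances(List<List<int[]>> adj, int s) {
        int n = adj.size();
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);      // "infinite" until computed
        dist[s] = 0;
        PriorityQueue<int[]> pq = new PriorityQueue<>(Comparator.comparingInt(e -> e[0]));
        pq.add(new int[]{0, s});                   // entries are {distance, vertex}
        while (!pq.isEmpty()) {
            int[] top = pq.remove();
            int d = top[0], v = top[1];
            if (d > dist[v]) continue;             // stale entry, already improved
            for (int[] edge : adj.get(v)) {
                int w = edge[0], newDist = d + edge[1];
                if (newDist < dist[w]) {           // shorter path to w found
                    dist[w] = newDist;
                    pq.add(new int[]{newDist, w});
                }
            }
        }
        return dist;
    }
}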
Spanning Trees
A spanning tree is a tree that contains all vertices in the graph.
Number of nodes: |V|; number of edges: |V| - 1.

Spanning Trees of Unweighted Graphs
Breadth-first search using a queue.
Complexity: O(|E| + |V|) - we process all edges and all nodes.

Min Spanning Trees of Weighted Graphs – Prim's Algorithm
Find a spanning tree with the minimal sum of the weights.
Similar to shortest paths in a weighted graph. Difference: we record the weight of the current edge, not the length of the path.
Implementation: with a priority queue.
Complexity: O(|E| log(|V|))

Min Spanning Trees of Weighted Graphs – Kruskal's Algorithm
The algorithm works with the set of edges, stored in a priority queue.
The tree is built incrementally, starting with a forest of single nodes and adding edges as they come out of the priority queue.
Complexity: O(|E| log(|V|))
Scheduling Networks
A directed, simple, acyclic graph.
Nodes: events, Source, Sink
Edges: activities (name of the task, duration)
 Find the earliest occurrence time (EOT) of an event
 Find the earliest completion time (ECT) of an activity
 Find the slack of an activity: how much the activity can be delayed without delaying the project
 Find the critical activities: they must be completed on time in order not to delay the project

EOT and ECT
Earliest occurrence time of an event:
EOT(source) = 0
EOT(j) = max(ECTs of all activities entering event j)
Earliest completion time of an activity:
ECT(i, j) = EOT(i) + duration(i, j)
1. Sort the nodes topologically.
2. For each event j: initialize EOT(j) = 0.
3. For each event i in topological order:
   For each event j adjacent from i:
     ECT(i,j) = EOT(i) + duration(i,j)
     EOT(j) = max(EOT(j), ECT(i,j))

Slack of Activities
Compute the latest occurrence time LOT(j) and slack(i, j); a sketch follows below.
1. Initialize LOT(sink) = EOT(sink).
2. For each non-sink event i in reverse topological order:
   LOT(i) = min over successors j of (LOT(j) - duration(i, j))
3. For each activity (i,j):
   slack(i, j) = LOT(j) - ECT(i, j)
Critical activities: slack = 0.
Critical path in the project: a path on which all edges are activities with slack = 0.
The critical path is found using breadth-first (or depth-first) search on the edges.
Complexity: O(|V| + |E|) with adjacency lists, O(|V|²) with an adjacency matrix.
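A compact sketch of both passes, assuming the events are already numbered in topological order (0 = source, n-1 = sink) and the network is given as a matrix in which duration[i][j] > 0 marks activity (i, j); these representation choices are illustrative:

import java.util.*;

public class Scheduling {
    public static void analyze(int[][] duration) {
        int n = duration.length;
        int[] eot = new int[n];                     // earliest occurrence times
        for (int i = 0; i < n; i++)                 // forward pass, topological order
            for (int j = 0; j < n; j++)
                if (duration[i][j] > 0)
                    eot[j] = Math.max(eot[j], eot[i] + duration[i][j]);
        int[] lot = new int[n];                     // latest occurrence times
        Arrays.fill(lot, eot[n - 1]);               // LOT(sink) = EOT(sink)
        for (int i = n - 2; i >= 0; i--)            // backward pass, reverse order
            for (int j = 0; j < n; j++)
                if (duration[i][j] > 0)
                    lot[i] = Math.min(lot[i], lot[j] - duration[i][j]);
        for (int i = 0; i < n; i++)                 // slack of every activity
            for (int j = 0; j < n; j++)
                if (duration[i][j] > 0) {
                    int slack = lot[j] - (eot[i] + duration[i][j]);
                    System.out.println("(" + i + "," + j + ") slack " + slack
                            + (slack == 0 ? "  <- critical" : ""));
                }
    }
}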
BFS and DFS
The same scheme serves both (see the pseudocode given earlier): edges are taken from the beginning of the list L in BFS (queue behavior) and from the end of L in DFS (stack behavior).
Complexity: O(|V| + |E|)
Graph Connectivity
Connectivity
Biconnectivity - no cut vertices
Articulation points and bridges
Connectivity in directed graphs:
  strongly connected
  unilaterally connected
  weakly connected
Chapter 10: Algorithm Design Techniques
Backtracking
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University

Backtracking
How to solve a problem by search
Generate and Test
What is Backtracking
Example
The Idea
Problem solving: search in a space of possible states.
[Figure: a state space with an initial state and two target states]
Generate and Test Method
• Take a state.
• Generate the next states by applying the relevant rules (not all rules are applicable for a given state!).
• Test whether a goal state is reached.
  If yes - stop.
  If no - add the generated states to a stack/queue (do not add a state that has already been generated) and repeat the procedure.
To obtain the solution: record the paths.
Search Methods
Breadth-first vs. depth-first search.
Exhaustive search (brute force) vs. informative (constraint-based, heuristic) search.
Backtracking.

Backtracking
• rejecting the currently reached state as a dead end,
• going back to some previous level in the state-space tree/graph,
• proceeding with the exploration of other paths toward the target state.
Backtracking occurs in depth-first search.
Example
Decide on: state representation, operators, constraints, initial state, target state.

  LOAN
  LOAN
  LOAN
+ LOAN
------
  DEBT

Replace the letters with distinct digits such that the addition is correct.
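Even a plain generate-and-test sketch solves this particular puzzle: since DEBT = 4 * LOAN, it is enough to try every value of LOAN whose quadruple still has four digits and test that the eight letters receive distinct digits. A real backtracking solver would instead assign digits column by column and prune dead ends much earlier.

public class LoanDebt {
    public static void main(String[] args) {
        for (int loan = 1000; loan <= 2499; loan++) {  // 4*LOAN must stay 4-digit
            int debt = 4 * loan;
            String s = "" + loan + debt;               // digits of L,O,A,N,D,E,B,T
            int[] letterOf = new int[10];
            java.util.Arrays.fill(letterOf, -1);
            boolean ok = true;
            for (int i = 0; i < 8 && ok; i++) {
                int d = s.charAt(i) - '0';
                if (letterOf[d] != -1) ok = false;     // digit reused by two letters
                letterOf[d] = i;
            }
            if (ok) System.out.println("4 x " + loan + " = " + debt);
        }
    }
}

The eight letters L, O, A, N, D, E, B, T each occur exactly once, so distinctness of the eight digits is the only constraint to test; the program prints all assignments that satisfy it.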
Chapter 10: Algorithm Design Techniques
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University

Algorithm Design Techniques
Brute Force
Greedy Algorithms
Divide and Conquer
Dynamic Programming
Transform and Conquer
Backtracking
Genetic Algorithms

Brute Force
 Based on the problem's statement and the definitions of the concepts involved.
 Examples:
   Sequential search
   Exhaustive search: TSP, knapsack
   Simple sorts: selection sort, bubble sort
   Computing n!
Greedy Algorithms
A "take what you can get now" strategy.
Work in phases; in each phase the currently best decision is made.

Greedy Algorithms - Examples
• Dijkstra's algorithm (shortest paths in weighted graphs)
• Prim's algorithm, Kruskal's algorithm (minimal spanning tree in weighted graphs)
• Coin exchange problem
• Huffman trees
Divide and Conquer
• Reduce the problem to smaller problems (by a factor of at least 2) solved recursively, then combine the solutions.
Examples:
  Binary search
  Mergesort
  Quicksort
  Tree traversal
In general: problems that can be defined recursively.
Decrease and Conquer
• Reduce the problem to a single smaller problem, solve it recursively, then extend the solution.
Examples of decrease-and-conquer algorithms:
  Insertion sort
  Topological sorting
  Binary tree traversals: inorder, preorder and postorder (recursion)
  Computing the length of the longest path in a binary tree (recursion)
  Computing Fibonacci numbers (recursion)
  Reversing a queue (recursion)
Dynamic Programming
A bottom-up technique in which the smallest sub-instances are explicitly solved first, and their results are used to construct solutions to progressively larger sub-instances.
Example: Fibonacci numbers computed by iteration (a sketch follows below).
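A minimal sketch of the iterative computation, using the deck's convention F(0) = F(1) = 1: each value is built from the two already-computed smaller sub-instances, so the exponential recursion tree collapses into one O(n) loop.

public class Fibonacci {
    public static long fib(int n) {        // F(0) = F(1) = 1
        long prev = 1, curr = 1;
        for (int k = 2; k <= n; k++) {
            long next = prev + curr;       // F(k) = F(k-1) + F(k-2)
            prev = curr;
            curr = next;
        }
        return curr;
    }
}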
Transform and Conquer
The problem is first modified to be more amenable to solution; in the second stage it is solved.
• Problem simplification, e.g. presorting.
  Example: finding the two closest numbers in an array of numbers.
  Brute-force solution: O(n²); transform and conquer with presorting: O(nlogn).
• Change in the representation.
  Example: AVL trees guarantee O(logn) search time.
• Problem reduction.
  Example: least common multiple, lcm(m,n) = (m*n) / gcd(m,n) - see the sketch below.
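The problem-reduction example in code: lcm is reduced to gcd, which Euclid's algorithm solves in O(log min(m,n)) steps.

public class Lcm {
    static long gcd(long m, long n) {      // Euclid's algorithm
        while (n != 0) { long r = m % n; m = n; n = r; }
        return m;
    }
    static long lcm(long m, long n) {
        return (m / gcd(m, n)) * n;        // divide first to avoid overflow
    }
}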
Backtracking
Generate-and-test methods, based on exhaustive search in multiple-choice problems.
Typically used with depth-first state-space search.
Example: puzzles.

Backtracking – State Space Search
• initial state
• goal state(s)
• a set of intermediate states
• a set of operators that transform one state into another; each operator has preconditions and postconditions
• a cost function - evaluates the cost of the operations (optional)
• a utility function - evaluates how close a given state is to the goal state (optional)
Genetic Algorithms
 Search for good solutions among possible solutions.
 The best possible solution may be missed.
 A solution is coded by a string, also called a chromosome; the words string and chromosome are used interchangeably.
 A string's fitness is a measure of how good a solution it codes; fitness is calculated by a fitness function.
 Selection: the procedure used to choose parents.
 Crossover: the procedure by which two chromosomes mate to create a new offspring chromosome.
 Mutation: with a certain probability, flip a bit in the offspring.
Basic Genetic Algorithm
1. Start: generate a random population of n chromosomes (suitable solutions for the problem).
2. Fitness: evaluate the fitness f(x) of each chromosome x in the population.
3. New population: create a new population by repeating the steps on the next slide until the new population is complete.
4. Test: if the end condition is satisfied, stop and return the best solution in the current population.

New Population
 Selection: select two parent chromosomes from the population according to their fitness.
 Crossover: with a crossover probability, cross over the parents to form a new offspring (children); if no crossover is performed, the offspring is an exact copy of the parents.
 Mutation: with a mutation probability, mutate the new offspring at each locus (position in the chromosome).
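A toy end-to-end run of steps 1-4, where every concrete choice (bit-string chromosomes, counting 1-bits as fitness, tournament-of-two selection, the rates and sizes) is an illustrative assumption rather than part of the slides:

import java.util.*;

public class SimpleGA {
    static final Random RND = new Random();
    static final int LEN = 20, POP = 30, GENS = 100;
    static final double CROSS = 0.7, MUTATE = 0.01;

    static int fitness(boolean[] c) {                  // fitness = number of 1-bits
        int f = 0;
        for (boolean b : c) if (b) f++;
        return f;
    }

    static boolean[] select(boolean[][] pop) {         // better of two random picks
        boolean[] a = pop[RND.nextInt(POP)], b = pop[RND.nextInt(POP)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    public static void main(String[] args) {
        boolean[][] pop = new boolean[POP][LEN];       // 1. random initial population
        for (boolean[] c : pop)
            for (int i = 0; i < LEN; i++) c[i] = RND.nextBoolean();
        for (int g = 0; g < GENS; g++) {               // 3. build new populations
            boolean[][] next = new boolean[POP][LEN];
            for (int k = 0; k < POP; k++) {
                boolean[] p1 = select(pop), p2 = select(pop);
                int cut = RND.nextDouble() < CROSS ? RND.nextInt(LEN) : 0;
                for (int i = 0; i < LEN; i++) {        // one-point crossover
                    next[k][i] = i < cut ? p1[i] : p2[i];  // cut = 0: copy of p2
                    if (RND.nextDouble() < MUTATE) next[k][i] ^= true; // mutation
                }
            }
            pop = next;
        }
        boolean[] best = pop[0];                       // 4. report the best found
        for (boolean[] c : pop) if (fitness(c) > fitness(best)) best = c;
        System.out.println("best fitness: " + fitness(best) + " / " + LEN);
    }
}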
Conclusion
How to choose the approach? First, by understanding the problem; second, by knowing various problems and how they are solved using different approaches.
Strongly recommended reading to learn more about how to design algorithms:
The Algorithm Design Manual, Steven S. Skiena, Department of Computer Science, State University of New York
http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK/BOOK.HTM
Algorithm Design Paradigms, Paul E. Dunne, University of Liverpool, Department of Computer Science
http://www.csc.liv.ac.uk/~ped/teachadmin/algor/algor.html
Complexity Issues
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University

Complexity Issues
 Class P problems
 Class NP problems
 NP-complete problems
 Non-NP problems
 Decidability

Class P Problems
 Polynomial running time in the worst case.
 "Tractable" problems.
 Examples:
   Simple sorting algorithms
   Advanced sorting algorithms
   Topological sort
   Shortest paths
   Min spanning trees
Class NP Problems
NP problems: problems for which no polynomial-time algorithm is known - "intractable" problems.
NP means nondeterministically polynomial: if we have a candidate for a solution, we can verify it in polynomial time. The difficult part is to generate all possible candidates.
Example (the satisfiability problem): given a boolean expression of n variables, find an assignment of the variables that satisfies the expression. A sketch of the polynomial-time verification step follows below.
Obviously P ⊆ NP, but whether P ≠ NP (or P = NP) is an open question.
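The "verify in polynomial time" half in code: a sketch that checks a candidate assignment against a formula in conjunctive normal form in time linear in the formula's size. The literal encoding (+i for variable i, -i for its negation) is an illustrative choice:

public class SatVerifier {
    public static boolean satisfies(int[][] cnf, boolean[] value) {
        for (int[] clause : cnf) {
            boolean clauseTrue = false;
            for (int lit : clause) {
                boolean v = value[Math.abs(lit)];
                if (lit > 0 ? v : !v) { clauseTrue = true; break; }
            }
            if (!clauseTrue) return false;   // one false clause kills the formula
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] cnf = {{1, 2}, {-1, 3}};     // (x1 OR x2) AND (NOT x1 OR x3)
        boolean[] value = {false, true, false, true};  // index 0 unused
        System.out.println(satisfies(cnf, value));     // prints true
    }
}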
NP-Complete Problems
An NP-complete problem is an NP problem with the property that any problem in NP can be reduced to it in polynomial time.
In 1971 Stephen Cook (currently working at the University of Toronto, Ontario) showed that the satisfiability problem is NP-complete by proving that every other NP problem can be transformed to it.
http://www.cs.toronto.edu/DCS/People/Faculty/sacook.html
NP-Complete Problems
Bin packing problem: how to pack or store objects of various sizes and shapes with a minimum of wasted space?
Clique problem: a clique is a subset K of vertices in an undirected graph G such that every pair of vertices in K is joined by an edge of G. The problems are: given a graph G and an integer k, does G have a clique of k vertices? Given a graph G, find a clique with as many vertices as possible.
Knapsack problem: you have a knapsack of a given size and several objects of various sizes and values. The problem is how to fill the knapsack with these objects so that you obtain a maximum total value.
Hamiltonian cycle: a cycle that contains all nodes of the graph. Deciding whether a graph has a Hamiltonian cycle is an NP-complete problem.
NP Problems and Non-NP Problems
For NP problems, given a candidate for a solution, we can verify it in polynomial time.
For non-NP problems, we cannot verify in polynomial time whether a candidate for a solution is a solution or not.
Example: the problem of proving that a graph does not have a Hamiltonian cycle is a non-NP problem.
NP-hard problems: at least as hard as the NP-complete problems (every NP problem reduces to them), but not necessarily in the class NP themselves.
Decidability
Yes/no problems (decision problems): a decision problem is a question about some input that has two possible answers, yes or no.

Examples of Yes/No Problems
Given a graph G and a set of vertices K, is K a clique?
Given a graph G and a set of edges M, is M a spanning tree?
Given a set of axioms (boolean expressions) and an expression, is the expression provable from the axioms?

Decidable, Semi-Decidable, and Undecidable Problems
A problem is decidable if there is an algorithm (P or NP) that says yes if the answer is yes, and no otherwise.
A problem is semi-decidable if there is an algorithm that says yes if the answer is yes, but may loop infinitely if the answer is no.
A problem is undecidable if we can prove that there is no algorithm that will deliver an answer.
Example of a Semi-Decidable Problem
Given a set of axioms, prove that an expression is true.
Problem 1: let the axioms be A ∨ B, A ∨ C, ¬B. Prove A.
To prove A we add ¬A to the axioms. If A is true, then ¬A is false, and this causes a contradiction: the conjunction of all axioms plus ¬A results in False.
(A ∨ B) ∧ ¬A = B
B ∧ (A ∨ C) = (B ∧ A) ∨ (B ∧ C)
B ∧ ¬B = False

Example of a Semi-Decidable Problem
Problem 2: let the axioms again be A ∨ B, A ∨ C, ¬B. Prove ¬A.
We add A and obtain:
(A ∨ C) ∧ A = A
(A ∨ B) ∧ A = A
A ∧ ¬B = A ∧ ¬B
(A ∧ ¬B) ∧ (A ∨ B) = A ∧ ¬B
...
This process never stops, because the expressions we obtain are always different from False.
Example of an Undecidable Problem
The halting problem: let LOOP be a program that checks other programs for infinite loops:
  LOOP(P) stops and prints "yes" if P loops infinitely
  LOOP(P) enters an infinite loop if P stops
What about LOOP(LOOP)?
http://www.cgl.uwaterloo.ca/~csk/washington/halt.html