2. Textbooks/References
Data Structures and Algorithm Analysis in C/C++ – Mark Allen Weiss
Introduction to Algorithms – Cormen, Leiserson, Rivest, Stein
Fundamentals of Computer Algorithms – Horowitz, Sahni, Rajasekaran
Algorithm Design – Kleinberg and Tardos
4. What We Cover
Analysis of algorithms: worst-case time and space complexity, lower and upper bounds.
Data structures: stack, queue, linked list, tree, priority queue, heap, hash table, …
Searching algorithms, binary search trees, AVL trees, B-trees.
Sorting algorithms: insertion sort, merge sort, quick sort, bucket sort, and radix sort.
Graphs: data structures, DFS, BFS, MST, shortest paths.
Design techniques: divide-and-conquer, dynamic programming, greedy method, …
5. Data Structures
A systematic way of organizing and accessing data.
No single data structure works well for ALL purposes.
[Diagram: Input → Algorithm → Output]
An algorithm is a step-by-step procedure for solving a problem in a finite amount of time.
6. Introduction
What is an algorithm?
A clearly specified set of simple instructions to be followed to solve a problem
Takes a set of values as input and produces a value, or a set of values, as output
May be specified:
in English
as a computer program
as pseudo-code
Data structures
Methods of organizing data
Program = algorithms + data structures
7. Algorithm Descriptions
Natural languages: Hindi, English, etc.
Pseudo-code: notation close to a programming language such as C.
Programs: C programs, C++ programs, Java programs.
Goals:
Allow a well-trained programmer to implement the algorithm.
Allow an expert to analyze its running time.
8. Introduction
Why do we need algorithm analysis?
Writing a working program is not good enough
The program may be inefficient!
If the program is run on a large data set, the running time becomes an issue
9. An Example
• A city has n stops. A bus driver wishes to follow the shortest path from one stop to another. If a road exists between two stops, its travel time may differ from that of other roads. Also, roads are one-way: the road from stop 1 to stop 2 is different from the road from stop 2 to stop 1.
• How do we find the shortest path between any pair of stops?
• A naïve approach:
List all the paths between the given pair of stops
Compute the travel time for each
Choose the shortest one
• How many paths are there?
10. # of paths can be as large as n! ≈ (n/e)ⁿ (by Stirling's approximation)
It would be impossible to run your algorithm for n = 30
We need a way to compare two algorithms
11. Example: Selection Problem
Given a list of N numbers, determine the k-th largest, where k ≤ N
Algorithm 1:
(1) Read N numbers into an array
(2) Sort the array in decreasing order by some
simple algorithm
(3) Return the element in position k
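Not on the slide: a C sketch of Algorithm 1, using the standard library's qsort (the function name is hypothetical; assumes 1 ≤ k ≤ N):

#include <stdlib.h>

/* Comparator for sorting in decreasing order. */
static int cmp_desc(const void *p, const void *q) {
    int x = *(const int *)p, y = *(const int *)q;
    return (x < y) - (x > y);
}

/* k-th largest of a[0..N-1], found by sorting first. */
int kth_largest(int a[], int N, int k) {
    qsort(a, N, sizeof a[0], cmp_desc);  /* step (2): sort in decreasing order */
    return a[k - 1];                     /* step (3): position k, 1-based */
}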
12. Example: Selection Problem…
Algorithm 2:
(1) Read the first k elements into an array and
sort them in decreasing order
(2) The remaining elements are read one by one
If smaller than the k-th element, it is ignored
Otherwise, it is placed in its correct spot in the array, bumping one element out of the array
(3) The element in the k-th position is returned as the answer
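A possible C rendering of Algorithm 2 (a sketch; the function name is hypothetical, and 1 ≤ k ≤ n is assumed):

#include <stdlib.h>

/* k-th largest of a[0..n-1], keeping only the k largest seen so far. */
int kth_largest_partial(const int a[], int n, int k) {
    int *top = malloc(k * sizeof *top);   /* top[0..k-1]: k largest so far, decreasing */
    for (int i = 0; i < k; i++) {         /* step (1): insert the first k elements */
        int key = a[i], j = i;
        while (j > 0 && top[j - 1] < key) { top[j] = top[j - 1]; j--; }
        top[j] = key;
    }
    for (int i = k; i < n; i++) {         /* step (2): read the rest one by one */
        if (a[i] <= top[k - 1]) continue; /* not larger than the k-th largest: ignore */
        int j = k - 1;                    /* else insert in place, bumping the smallest out */
        while (j > 0 && top[j - 1] < a[i]) { top[j] = top[j - 1]; j--; }
        top[j] = a[i];
    }
    int answer = top[k - 1];              /* step (3): the k-th largest */
    free(top);
    return answer;
}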
13. Example: Selection Problem…
Which algorithm is better when
N = 100 and k = 100?
N = 100 and k = 1?
What happens when N = 1,000,000 and k = 500,000?
There exist better algorithms
14. Algorithm Analysis
We only analyze correct algorithms
An algorithm is correct
If, for every input instance, it halts with the correct output
Incorrect algorithms
Might not halt at all on some input instances
Might halt with other than the desired answer
Analyzing an algorithm
Predicting the resources that the algorithm requires
Resources include
Memory
Communication bandwidth
Computational time (usually most important)
15. Algorithm Analysis…
Factors affecting the running time
computer
compiler
algorithm used
input to the algorithm
The content of the input affects the running time
typically, the input size (number of items in the input) is the
main consideration
E.g., sorting problem → the number of items to be sorted
E.g., multiplying two matrices together → the total number of elements in the two matrices
Machine model assumed:
Instructions are executed one after another, with no concurrent operations → not parallel computers
16. Analysis of Algorithms
Analysis is performed with respect to a
computational model
We will usually use a generic uniprocessor
random-access machine (RAM)
All memory equally expensive to access
No concurrent operations
All reasonable instructions take unit time
Except, of course, function calls
Constant word size
Unless we are explicitly manipulating bits
17. Asymptotic Performance
In this course, we care most about asymptotic
performance
How does the algorithm behave as the
problem size gets very large?
Running time
Memory/storage requirements
Bandwidth/power requirements/logic gates/etc.
18. Input Size
Time and space complexity
This is generally a function of the input size
e.g., sorting, multiplication
How we characterize input size depends on the problem:
Sorting: number of input items
Multiplication: total number of bits
Graph algorithms: number of nodes & edges
etc. …
19. Worst- / average- / best-case
Worst-case running time of an algorithm
The longest running time for any input of size n
An upper bound on the running time for any input
A guarantee that the algorithm will never take longer
Example: Sort a set of numbers in increasing order; and the
data is in decreasing order
The worst case can occur fairly often
e.g. in searching a database for a particular piece of information
Best-case running time
sort a set of numbers in increasing order; and the data is
already in increasing order
Average-case running time
May be difficult to define what “average” means
20. Worst case / Average case …
Worst case
Provides an upper bound on running time
An absolute guarantee
Average case
Provides the expected running time
Very useful, but treat with care: what is
“average”?
Random (equally likely) inputs
Real-life inputs
21. Running Time
Number of primitive steps that are executed
Except for the time to execute a function call, most statements require roughly the same amount of time:
y = m * x + b
c = 5 / 9 * (t - 32 )
z = f(x) + g(y)
We can be more exact if need be
22. Machine Independent Analysis
We assume that every basic operation takes constant time:
Example Basic Operations:
Addition, Subtraction, Multiplication, Memory Access
Non-basic Operations:
Sorting, Searching
Efficiency of an algorithm is the number of basic
operations it performs
We do not distinguish between the basic operations.
23. In fact, we will not worry about the exact values, but will look at "broad classes" of values, i.e., the growth rates
Let there be n inputs.
If one algorithm needs n basic operations and another needs 2n basic operations, we consider them to be in the same efficiency category.
However, we distinguish between exp(n), n, and log(n)
24. Running Time
Most algorithms transform input objects into output objects.
The running time of an algorithm typically grows with the input size.
Average-case time is often difficult to determine.
We focus on the worst-case running time.
[Figure: running time vs. input size (1000–4000), showing best-case, average-case, and worst-case curves]
25. Experimental Studies
Write a program implementing the algorithm
Run the program with inputs of varying size and composition
Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time
Plot the results
[Figure: measured running time in ms (0–9000) vs. input size (0–100)]
26. Limitations of Experiments
It is necessary to implement the algorithm,
which may be difficult
Results may not be indicative of the running
time on other inputs not included in the
experiment.
In order to compare two algorithms, the same
hardware and software environments must be
used
27. Theoretical Analysis
Uses a high-level description of the algorithm
instead of an implementation
Characterizes running time as a function of the
input size, n.
Takes into account all possible inputs
Allows us to evaluate the speed of an algorithm
independent of the hardware/software
environment
28. Pseudocode
High-level description of an algorithm
More structured than English prose
Less detailed than a program
Preferred notation for describing algorithms
Hides program design issues
Example: find the max element of an array

Algorithm arrayMax(A, n)
  Input: array A of n integers
  Output: maximum element of A
  currentMax ← A[0]
  for i ← 1 to n − 1 do
    if A[i] > currentMax then
      currentMax ← A[i]
  return currentMax
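Not part of the slide: as a cross-check, the pseudocode translates directly to C (a sketch; assumes n ≥ 1 so that A[0] exists):

int arrayMax(const int A[], int n) {
    int currentMax = A[0];            /* currentMax <- A[0] */
    for (int i = 1; i <= n - 1; i++)  /* for i <- 1 to n-1 do */
        if (A[i] > currentMax)        /* if A[i] > currentMax then */
            currentMax = A[i];        /*   currentMax <- A[i] */
    return currentMax;
}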
29. Primitive Operations
Basic computations performed by an algorithm
Identifiable in pseudocode
Largely independent of the programming language
Exact definition not important (we will see why later)
Assumed to take a constant amount of time in the RAM model
Examples:
  Evaluating an expression
  Assigning a value to a variable
  Indexing into an array
  Calling a method
  Returning from a method
30. Counting Primitive Operations
By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size

Algorithm arrayMax(A, n)                        # operations
  currentMax ← A[0]                             2
  for (i = 1; i < n; i++)                       2n
    (i = 1 once; i < n tested n times; i++ done n − 1 times)
    if A[i] > currentMax then                   2(n − 1)
      currentMax ← A[i]                         2(n − 1)
  return currentMax                             1
Total                                           6n − 1
31. Estimating Running Time
Algorithm arrayMax executes 6n − 1 primitive operations in the worst case.
Define:
  a = time taken by the fastest primitive operation
  b = time taken by the slowest primitive operation
Let T(n) be the worst-case time of arrayMax. Then
  a(5n − 2) ≤ T(n) ≤ b(8n − 1)
Hence, the running time T(n) is bounded by two linear functions
32. Growth Rate of Running Time
Changing the hardware/software environment
Affects T(n) by a constant factor, but
Does not alter the growth rate of T(n)
The linear growth rate of the running time
T(n) is an intrinsic property of algorithm
arrayMax
34. Computational Complexity
Compares growth of two functions
Independent of constant multipliers and lower-order effects
Metrics:
"Big O" notation: O()
"Big Omega" notation: Ω()
"Big Theta" notation: Θ()
35. Big-Oh Notation
To simplify the running time estimation, for a function f(n), we ignore the constants and lower-order terms.
Example: 10n³ + 4n² − 4n + 5 is O(n³).
36. Growth Rate
The idea is to establish a relative order among functions for large n
∃ c, n₀ > 0 such that 0 ≤ f(n) ≤ c·g(n) when n ≥ n₀
f(n) grows no faster than g(n) for "large" n
37. Big-O Notation
Definition: Given a function g(n), we denote O(g(n)) to be the set of functions
{ f(n) | there exist positive constants c and n₀ such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀ }
Rough meaning: O(g(n)) includes all functions that are upper bounded by g(n)
38. Big-O Notation (examples)
• 4n ∈ O(5n)         [proof: c = 1, n₀ = 1]
• 4n ∈ O(n)          [proof: c = 4, n₀ = 1]
• 4n + 3 ∈ O(n)      [proof: c = 5, n₀ = 3]
• n ∈ O(0.001n²)     [proof: c = 1, n₀ = 1000]
• logₑ n ∈ O(log n)  [proof: c = 1, n₀ = 1]
• log n ∈ O(logₑ n)  [proof: c = log e, n₀ = 1]
Remark: Usually, we will slightly abuse the notation and write f(n) = O(g(n)) to mean f(n) ∈ O(g(n))
39. Big-Oh Notation
Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n₀ such that f(n) ≤ c·g(n) for n ≥ n₀
Example: 2n + 10 is O(n)
  2n + 10 ≤ cn
  (c − 2)n ≥ 10
  n ≥ 10/(c − 2)
  Pick c = 3 and n₀ = 10
[Figure: log-log plot of n, 2n + 10, and 3n for n = 1 to 1,000]
40. Big-Oh: example
Let f(n) = 2n². Then
  f(n) = O(n⁴)
  f(n) = O(n³)
  f(n) = O(n²) (best answer, asymptotically tight)
41. Big-Oh Example
Example: the function n² is not O(n)
  n² ≤ cn implies n ≤ c
  The above inequality cannot be satisfied for all n, since c must be a constant
n² is O(n²).
[Figure: log-log plot of n, 10n, 100n, and n² for n = 1 to 1,000]
42. Big-Oh Examples
7n − 2 is O(n)
  need c > 0 and n₀ ≥ 1 such that 7n − 2 ≤ c·n for n ≥ n₀
  true for c = 7 and n₀ = 1
3n³ + 20n² + 5 is O(n³)
  need c > 0 and n₀ ≥ 1 such that 3n³ + 20n² + 5 ≤ c·n³ for n ≥ n₀
  true for c = 4 and n₀ = 21
3 log n + 5 is O(log n)
  need c > 0 and n₀ ≥ 1 such that 3 log n + 5 ≤ c·log n for n ≥ n₀
  true for c = 8 and n₀ = 2
43. Big-Oh: more examples
n²/2 − 3n = O(n²)
1 + 4n = O(n)
7n² + 10n + 3 = O(n²) = O(n³)
log₁₀ n = log₂ n / log₂ 10 = O(log₂ n) = O(log n)
sin n = O(1); 10 = O(1); 10¹⁰ = O(1)
log n + n = O(n)
n = O(2ⁿ), but 2ⁿ is not O(n)
2^(10n) is not O(2ⁿ)
Σᵢ₌₁ᴺ i ≤ N·N = O(N²)
Σᵢ₌₁ᴺ i² ≤ N·N² = O(N³)
44. Big-Omega Notation
Definition: Given a function g(n), we denote Ω(g(n)) to be the set of functions
{ f(n) | there exist positive constants c and n₀ such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n₀ }
Rough meaning: Ω(g(n)) includes all functions that are lower bounded by g(n)
45. Big-O and Big-Omega
Similar to Big-O, we will slightly abuse the notation and write f(n) = Ω(g(n)) to mean f(n) ∈ Ω(g(n))
Relationship between Big-O and Big-Omega: f(n) = O(g(n)) ⟺ g(n) = Ω(f(n))
46. Big-Ω Notation (examples)
• 5n = Ω(4n)         [proof: c = 1, n₀ = 1]
• n = Ω(4n)          [proof: c = 1/4, n₀ = 1]
• 4n + 3 = Ω(n)      [proof: c = 1, n₀ = 1]
• 0.001n² = Ω(n)     [proof: c = 1, n₀ = 1000]
• logₑ n = Ω(log n)  [proof: c = 1/log e, n₀ = 1]
• log n = Ω(logₑ n)  [proof: c = 1, n₀ = 1]
48. Θ-notation
Definition: Given a function g(n), we denote Θ(g(n)) to be the set of functions
{ f(n) | there exist positive constants c₁, c₂, and n₀ such that 0 ≤ c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀ }
Meaning: those functions that can be both upper bounded and lower bounded by g(n)
49. Big-O, Big-Ω, and Θ
• Similarly, we write f(n) = Θ(g(n)) to mean f(n) ∈ Θ(g(n))
Relationship between Big-O, Big-Ω, and Θ:
f(n) = Θ(g(n)) ⟺ f(n) = O(g(n)) and f(n) = Ω(g(n))
50. Θ Notation (examples)
• 4n = Θ(n)          [c₁ = 1, c₂ = 4, n₀ = 1]
• 4n + 3 = Θ(n)      [c₁ = 1, c₂ = 5, n₀ = 3]
• logₑ n = Θ(log n)  [c₁ = 1/log e, c₂ = 1, n₀ = 1]
• Running time of Insertion Sort = Θ(n²)
  If not specified, running time refers to the worst-case running time
• Running time of Merge Sort = Θ(n log n)
51. f(N) = Θ(g(N)) means the growth rate of f(N) is the same as the growth rate of g(N)
52. Big "Theta" Notation
f(n) = Θ(g(n)) iff ∃ c₁, c₂, n₀ > 0 s.t. 0 ≤ c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀
f(n) has the same long-term rate of growth as g(n)
[Figure: f(n) sandwiched between c₁·g(n) and c₂·g(n) for n ≥ n₀]
54. Big-Theta
f(n) = Θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n))
The growth rate of f(n) equals the growth rate of g(n)
Example: let f(n) = n², g(n) = 2n². We have f(n) = O(g(n)) and f(n) = Ω(g(n)), thus f(n) = Θ(g(n)).
55. Little-o Notation
Definition: Given a function g(n), we denote o(g(n)) to be the set of functions
{ f(n) | for any positive constant c, there exists a positive constant n₀ such that 0 ≤ f(n) < c·g(n) for all n ≥ n₀ }
Note the similarities and differences with Big-O
56. Little-o (equivalent definition)
Definition: Given a function g(n), o(g(n)) is the set of functions
{ f(n) | lim_{n→∞} f(n)/g(n) = 0 }
Examples:
• 4n = o(n²)
• n log n = o(n^1.000001)
• n log n = o(n log² n)
57. Little-omega Notation
Definition: Given a function g(n), we denote ω(g(n)) to be the set of functions
{ f(n) | for any positive constant c, there exists a positive constant n₀ such that 0 ≤ c·g(n) < f(n) for all n ≥ n₀ }
Note the similarities and differences with the Big-Omega definition
58. Little-omega (equivalent definition)
Definition: Given a function g(n), ω(g(n)) is the set of functions
{ f(n) | lim_{n→∞} g(n)/f(n) = 0 }
Relationship between little-o and little-omega: f(n) = o(g(n)) ⟺ g(n) = ω(f(n))
59. To remember the notation:
O is like ≤ : f(n) = O(g(n)) means f(n) ≤ c·g(n)
Ω is like ≥ : f(n) = Ω(g(n)) means f(n) ≥ c·g(n)
Θ is like = : f(n) = Θ(g(n)) ⟺ g(n) = Θ(f(n))
o is like < : f(n) = o(g(n)) means f(n) < c·g(n)
ω is like > : f(n) = ω(g(n)) means f(n) > c·g(n)
Note: not every pair of functions can be compared asymptotically (e.g., sin x vs. cos x)
60. Some rules
When considering the growth rate of a function using Big-Oh:
Ignore the lower-order terms and the coefficient of the highest-order term
No need to specify the base of a logarithm
  Changing the base from one constant to another changes the value of the logarithm by only a constant factor
If T1(n) = O(f(n)) and T2(n) = O(g(n)), then (see the derivation below)
  T1(n) + T2(n) = O(max(f(n), g(n))),
  T1(n) * T2(n) = O(f(n) * g(n))
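Why the sum rule holds (a short added justification, using the constants c₁, n₁ and c₂, n₂ from the two assumed bounds): for all n ≥ max(n₁, n₂),

  T1(n) + T2(n) ≤ c₁·f(n) + c₂·g(n) ≤ (c₁ + c₂)·max(f(n), g(n)),

so T1(n) + T2(n) = O(max(f(n), g(n))). The product rule follows the same way, by multiplying the two bounds.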
61. Big-Oh and Growth Rate
The big-Oh notation gives an upper bound on the
growth rate of a function
The statement “f(n) is O(g(n))” means that the growth
rate of f(n) is no more than the growth rate of g(n)
We can use the big-Oh notation to rank functions
according to their growth rate
62. Big-Oh Rules
If f(n) is a polynomial of degree d, then f(n) is O(nᵈ), i.e.,
1. Drop lower-order terms
2. Drop constant factors
Use the smallest possible class of functions
  Say "2n is O(n)" instead of "2n is O(n²)"
Use the simplest expression of the class
  Say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"
63. Growth Rates

log n   n    n log n   n²      n³       2ⁿ
0       1    0         1       1        2
1       2    2         4       8        4
2       4    8         16      64       16
3       8    24        64      512      256
4       16   64        256     4,096    65,536
5       32   160       1,024   32,768   4,294,967,296

[Figure: log-log plot of log n, n, n log n, n², n³, and 2ⁿ]
64. Complexity and Tractability
Assume the computer does 1 billion operations per second.

T(n):
n      n        n log n    n²       n³        n⁴           n¹⁰           2ⁿ
10     0.01μs   0.03μs     0.1μs    1μs       10μs         10s           1μs
20     0.02μs   0.09μs     0.4μs    8μs       160μs        2.84h         1ms
30     0.03μs   0.15μs     0.9μs    27μs      810μs        6.83d         1s
40     0.04μs   0.21μs     1.6μs    64μs      2.56ms       121d          18m
50     0.05μs   0.28μs     2.5μs    125μs     6.25ms       3.1y          13d
100    0.1μs    0.66μs     10μs     1ms       100ms        3171y         4×10¹³y
10³    1μs      9.96μs     1ms      1s        16.67m       3.17×10¹³y    32×10²⁸³y
10⁴    10μs     130μs      100ms    16.67m    115.7d       3.17×10²³y
10⁵    100μs    1.66ms     10s      11.57d    3171y        3.17×10³³y
10⁶    1ms      19.92ms    16.67m   31.71y    3.17×10⁷y    3.17×10⁴³y
65. Some rules
If T(n) is a polynomial of degree k, then T(n) = Θ(nᵏ).
For logarithmic functions, T(log_m n) = Θ(log n).
67–76. Properties of the O notation
Constant factors may be ignored
  ∀ k > 0, kf is O(f)
Higher powers grow faster
  nʳ is O(nˢ) if 0 ≤ r ≤ s
Fastest growing term dominates a sum
  If f is O(g), then f + g is O(g)
  e.g., an⁴ + bn³ is O(n⁴)
A polynomial's growth rate is determined by its leading term
  If f is a polynomial of degree d, then f is O(nᵈ)
"f is O(g)" is transitive
  If f is O(g) and g is O(h), then f is O(h)
Product of upper bounds is an upper bound for the product
  If f is O(g) and h is O(r), then fh is O(gr)
Exponential functions grow faster than powers
  nᵏ is O(bⁿ) for all b > 1 and k ≥ 0
  e.g., n²⁰ is O(1.05ⁿ)
Logarithms grow more slowly than powers (important!)
  log_b n is O(nᵏ) for all b > 1 and k > 0
  e.g., log₂ n is O(n⁰·⁵)
All logarithms grow at the same rate
  log_b n is O(log_d n) for all b, d > 1
77. Polynomial and Intractable Algorithms
Polynomial time complexity
  An algorithm is said to be polynomial if it runs in O(nᵈ) time for some integer d
  Polynomial algorithms are said to be efficient: they solve problems in reasonable time!
Intractable problems
  Problems for which no polynomial-time algorithm is known
  We will come back to this important class later in the course
78. Analysing an Algorithm
Simple statement sequence
  s1; s2; …; sk
  O(1) as long as k is constant
Simple loops
  for (i = 0; i < n; i++) { s; }   where s is O(1)
  Time complexity is n · O(1), i.e., O(n)
Nested loops
  for (i = 0; i < n; i++)
    for (j = 0; j < n; j++) { s; }   /* the inner loop is O(n) */
  Complexity is n · O(n), i.e., O(n²)
79. Analysing an Algorithm
Loop index doesn't vary linearly
  h = 1;
  while ( h <= n ) {
    s;
    h = 2 * h;
  }
  h takes values 1, 2, 4, … until it exceeds n
  There are 1 + ⌊log₂ n⌋ iterations
  Complexity: O(log n)
80. Analysing an Algorithm
Loop index depends on outer loop index
  for (j = 0; j < n; j++)
    for (k = 0; k < j; k++) {
      s;
    }
  Inner loop executed 0, 1, 2, …, n − 1 times; since Σᵢ₌₁ⁿ i = n(n+1)/2,
  Complexity: O(n²)
81. Complexity Analysis
Estimate n = the size of the input
Isolate each atomic activity to be counted
Find f(n) = the number of atomic activities performed for an input of size n
Complexity of the algorithm = complexity of f(n)
88. Sequential Search
Given an unsorted vector a[], determine whether the element X occurs in a[]:
  for (i = 0; i < n; i++) {
    if (a[i] == X) return true;
  }
  return false;
Input size: n = a.size()
Complexity: O(n)
89. Recursion
long factorial( int n )
{
    if ( n <= 1 )
        return 1;
    else
        return n * factorial( n - 1 );
}
This is really a simple loop disguised as recursion. Complexity = O(n)

Fibonacci series:
long fib( int n )
{
    if ( n <= 1 )
        return 1;
    else
        return fib( n - 1 ) + fib( n - 2 );
}
A terrible way to implement recursion. Complexity = Ω( (3/2)ᴺ ): that's exponential!!
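Not on the slide: for contrast, computing the same series bottom-up avoids the repeated work and runs in O(n) time and O(1) space (a sketch; the function name is hypothetical):

long fib_iter( int n )
{
    long prev = 1, curr = 1;      /* fib(0) = fib(1) = 1, as in the code above */
    for ( int i = 2; i <= n; i++ ) {
        long next = prev + curr;  /* fib(i) = fib(i-1) + fib(i-2) */
        prev = curr;
        curr = next;
    }
    return curr;
}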
90. Euclid's Algorithm
Find the greatest common divisor (gcd) of m and n, given that m ≥ n
Complexity = O(log N)
Exercise: why is it O(log N)?
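The slide's code did not survive the export; a standard C version of Euclid's algorithm is short enough to restate here (a sketch, assuming m ≥ n ≥ 0):

long gcd( long m, long n )
{
    while ( n != 0 ) {      /* invariant: gcd(m, n) never changes */
        long rem = m % n;   /* replace (m, n) by (n, m mod n) */
        m = n;
        n = rem;
    }
    return m;
}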
92. The Sorting Problem
Input: a list of n numbers
Output: the numbers arranged in increasing order
Remark: sorting has many applications. E.g., if the list is already sorted, we can search for a number in the list faster
93. Insertion Sort
• Operates in n rounds
• At the k-th round, the k-th item is swapped toward the left, stopping when an item with a smaller value is seen
  [Diagram: the k-th item moving left into the sorted prefix]
Question: why is this algorithm correct? (Hint: at the start of the k-th round, the first k − 1 items are already in sorted order.)
94–123. An Example: Insertion Sort

InsertionSort(A, n) {
  for i = 2 to n {
    key = A[i]
    j = i - 1;
    while (j > 0) and (A[j] > key) {
      A[j+1] = A[j]
      j = j - 1
    }
    A[j+1] = key
  }
}

The slides step through this pseudocode on A = [30, 10, 40, 20] (1-based indices):

i = 2, key = 10: A[1] = 30 > 10, shift → [30, 30, 40, 20]; j reaches 0, place key → [10, 30, 40, 20]
i = 3, key = 40: A[2] = 30 ≤ 40, the while loop body never runs → [10, 30, 40, 20]
i = 4, key = 20: A[3] = 40 > 20, shift → [10, 30, 40, 40]; A[2] = 30 > 20, shift → [10, 30, 30, 40]; A[1] = 10 ≤ 20, place key → [10, 20, 30, 40]

Done!
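For readers who want to run the example, here is a direct C translation of the pseudocode (a sketch; C arrays are 0-indexed, so the bounds shift by one):

void insertion_sort( int A[], int n )
{
    for ( int i = 1; i < n; i++ ) {        /* pseudocode: for i = 2 to n */
        int key = A[i];
        int j = i - 1;
        while ( j >= 0 && A[j] > key ) {   /* pseudocode: (j > 0) and (A[j] > key) */
            A[j + 1] = A[j];               /* shift the larger element right */
            j = j - 1;
        }
        A[j + 1] = key;                    /* place key in its correct spot */
    }
}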
124. Insertion Sort (Running Time)
The following is pseudo-code for insertion sort, annotated with per-line costs. Each line requires a constant number of RAM operations per execution.
tⱼ = number of times key is compared at round j
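The annotated listing did not survive the export; the reconstruction below follows the CLRS presentation (one of the course textbooks), so treat the exact cost labels as an assumption: they are chosen to match the constants c₁, c₂, c₄, …, c₈ in the T(n) formula on the next slide (in CLRS, c₃ belongs to a comment line and costs 0):

InsertionSort(A, n)                      cost   times
  for i = 2 to n                         c₁     n
    key = A[i]                           c₂     n − 1
    j = i - 1                            c₄     n − 1
    while (j > 0) and (A[j] > key)       c₅     Σⱼ₌₂ⁿ tⱼ
      A[j+1] = A[j]                      c₆     Σⱼ₌₂ⁿ (tⱼ − 1)
      j = j - 1                          c₇     Σⱼ₌₂ⁿ (tⱼ − 1)
    A[j+1] = key                         c₈     n − 1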
125. Insertion Sort (Running Time)
• Let T(n) denote the running time of insertion sort on an input of size n
• By combining terms, we have
  T(n) = c₁n + (c₂ + c₄ + c₈)(n − 1) + c₅·Σⱼ₌₂ⁿ tⱼ + (c₆ + c₇)·Σⱼ₌₂ⁿ (tⱼ − 1)
• The values of tⱼ depend on the input itself (not just the input size)
126. Insertion Sort (Running Time)
• Best case: the input list is already sorted, so all tⱼ = 1
  Then T(n) = c₁n + (c₂ + c₄ + c₅ + c₈)(n − 1) = Kn + c, a linear function of n
• Worst case: the input list is sorted in decreasing order, so all tⱼ = j − 1
  Then T(n) = K₁n² + K₂n + K₃, a quadratic function of n
127. Worst-Case Running Time
• In this course (and in most CS research), we concentrate on worst-case running time
• Some reasons for this:
  1. It gives an upper bound on the running time
  2. The worst case occurs fairly often
Remark: some people also study average-case running time (they assume the input is drawn at random)
Editor's Notes
#34:Computational Complexity
The notion of computational complexity was introduced in the prerequisite math courses, so you should be somewhat familiar with it. We will re-introduce the ideas in this chapter. A more thorough and rigorous treatment of computational complexity is a central theme of COP 4531, the course that follows COP 4530 in our curriculum.
Computational complexity provides a language for comparing the growth of functions that are defined on (all but finitely many) non-negative integers and which have non-negative real number values. The important class of examples that we will use are functions whose values are run time or run space of an algorithm, and whose input is a number representing the size of the data set on which the algorithm operates.
Complexity of a function is dependent only on its eventual behavior, that is, complexity is independent of any finite number of initial values. The practical effect of this is to ignore initialization phases of algorithms.
Complexity of a function is independent of any constant multiplier. The practical effect of this is to ignore differences in such things as processor speed when comparing performance of two algorithms.
The properties just stated come directly from the definition of computational complexity. A more subtle property, deducible from the definition, is that complexity is independent of so-called "lower order" effects. We will not attempt to make this last statement precise, but we will give some examples to illustrate the concept.
The key measures of computational complexity are known as "Big O" notation, "Big Omega" notation, and "Big Theta" notation. You should be somewhat familiar with at least some of these. The definitions appear in the next slide.
#81:Algorithm Complexity
To apply the notation and theory of computational complexity to algorithms is a four step process. First discover a measure of size of input to the algorithm. Second, decide on a notion of atomic computational activity that captures the work of the algorithm. Third, find the function f(n) = the number of atomic computations performed on input of size n. Finally, the complexity of the algorithm is the complexity of f(n). We illustrate this process in several examples as we conclude this chapter, as well as in various places throughout the remainder of the course.
#82:Algorithm Complexity - Loops
Despite the facetious remarks we made earlier about how "obvious" it is that a simple fixed-bound loop terminates, it actually is true that simple loops are straightforward to analyze. Usually it is clear when and why they terminate and what computational work is accomplished in each iteration of the loop body. If, for example, we define an atomic computation as a call to a comparison operator, there might be three such calls in the loop body. That situation is depicted in the slide. The complexity of the loop is defined to be the complexity of the function
f(n) = (no. of atomics in loop body) × (no. of iterations of loop) = (3) × (n) = 3n
Thus, the complexity of this loop is equal to Θ(f(n)) = Θ(3n) = Θ(n).
The situation is often not quite this simple, however.
#83:Algorithm Complexity - Loops with Break
The case of a loop with a conditional breakout is shown in this slide. In the cases where the loop runs to normal termination, the run time of the loop is correctly modelled by the same function as for the simple loop above. But in other cases, the loop may terminate sooner. These cases are data dependent; that is, the runtime of the loop varies from an upper bound of 3n = O(n) to a lower bound of 3 = Ω(1), depending on the specific input to the loop. We cannot conclude that the algorithm has complexity Θ(n) because the lower bound condition >= Ω(n) does not hold. Therefore, the best we can conclude is that the loop has complexity <= O(n).
#84:Algorithm Complexity - Loops in Sequence
This slide shows two loops one following the other in the source code. These are sometimes referred to as concatenated loops. Concatenated program blocks execute in sequence, one after another. Therefore the runtime of two concatenated blocks is the sum of the runtimes of the individual blocks. For the situation depicted on the slide, the runtime is bounded above by O(3n + 5n) <= O(n).
#85:Algorithm Complexity - Loops Nested
This slide shows two loops one inside the other in the source code. These are sometimes referred to as composed loops. The runtime of two composed blocks is the product of the runtimes of the individual blocks. Thus, the runtime of the composed loops depicted in the slide is bounded above by O((2 + 3n)·n) <= O(2n + 3n²) <= O(n²).
#88:Algorithm Complexity - Sequential Search
This slide shows the usual (more efficient) version of sequential search in which there is early breakout of the loop immediately upon successful search. This is modelled by the simple loop with break, and hence has complexity <= O(n).