Analysis of Algorithms
CS 477/677
Sorting – Part B
Instructor: George Bebis
(Chapter 7)
Sorting
• Insertion sort
– Design approach: incremental
– Sorts in place: Yes
– Best case: Θ(n)
– Worst case: Θ(n²)
• Bubble Sort
– Design approach: incremental
– Sorts in place: Yes
– Running time: Θ(n²)
Sorting
• Selection sort
– Design approach: incremental
– Sorts in place: Yes
– Running time: Θ(n²)
• Merge Sort
– Design approach: divide and conquer
– Sorts in place: No
– Running time: Let’s see!!
Divide-and-Conquer
• Divide the problem into a number of sub-problems
– Similar sub-problems of smaller size
• Conquer the sub-problems
– Solve the sub-problems recursively
– If a sub-problem is small enough, solve it directly in a
straightforward manner
• Combine the solutions of the sub-problems
– Obtain the solution for the original problem
Merge Sort Approach
• To sort an array A[p . . r]:
• Divide
– Divide the n-element sequence to be sorted into two
subsequences of n/2 elements each
• Conquer
– Sort the subsequences recursively using merge sort
– When the size of the sequences is 1 there is nothing
more to do
• Combine
– Merge the two sorted subsequences
Merge Sort
Alg.: MERGE-SORT(A, p, r)
    if p < r                              ▷ Check for base case
        then q ← ⌊(p + r)/2⌋              ▷ Divide
             MERGE-SORT(A, p, q)          ▷ Conquer
             MERGE-SORT(A, q + 1, r)      ▷ Conquer
             MERGE(A, p, q, r)            ▷ Combine
• Initial call: MERGE-SORT(A, 1, n)
Index:  1  2  3  4  5  6  7  8
A:      5  2  4  7  1  3  2  6
        p = 1, q = 4, r = 8
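The recursion above can be sketched in Python as follows. This is a minimal illustration rather than the course's reference code: it assumes 0-based inclusive indices instead of the 1-based indices on the slide, and it borrows the standard library's `heapq.merge` as a stand-in for the MERGE step.

```python
from heapq import merge as merge_sorted  # stand-in for MERGE (an assumption)


def merge_sort(a, p=0, r=None):
    """Sort a[p..r] in place (0-based, inclusive bounds)."""
    if r is None:
        r = len(a) - 1
    if p < r:                      # more than one element: not the base case
        q = (p + r) // 2           # divide: floor of the midpoint
        merge_sort(a, p, q)        # conquer the left half
        merge_sort(a, q + 1, r)    # conquer the right half
        # combine: merge the two sorted halves back into a[p..r]
        a[p:r + 1] = list(merge_sorted(a[p:q + 1], a[q + 1:r + 1]))
```

Calling `merge_sort(data)` on a list sorts it in place; the default arguments play the role of the initial call MERGE-SORT(A, 1, n).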
Example – n Power of 2 (Divide)
Index:    1  2  3  4  5  6  7  8
Level 0:  5  2  4  7  1  3  2  6        (q = 4)
Level 1:  5  2  4  7 | 1  3  2  6
Level 2:  5  2 | 4  7 | 1  3 | 2  6
Level 3:  5 | 2 | 4 | 7 | 1 | 3 | 2 | 6
Example – n Power of 2 (Conquer and Merge)
Index:    1  2  3  4  5  6  7  8
Level 3:  5 | 2 | 4 | 7 | 1 | 3 | 2 | 6
Level 2:  2  5 | 4  7 | 1  3 | 2  6
Level 1:  2  4  5  7 | 1  2  3  6
Level 0:  1  2  2  3  4  5  6  7
Example – n Not a Power of 2 (Divide)
Index:    1  2  3  4  5  6  7  8  9 10 11
Level 0:  4  7  2  6  1  4  7  3  5  2  6        (q = 6)
Level 1:  4  7  2  6  1  4 | 7  3  5  2  6        (q = 3 on the left, q = 9 on the right)
Level 2:  4  7  2 | 6  1  4 | 7  3  5 | 2  6
Level 3:  4  7 | 2 | 6  1 | 4 | 7  3 | 5 | 2 | 6
Level 4:  4 | 7 | 2 | 6 | 1 | 4 | 7 | 3 | 5 | 2 | 6
Example – n Not a Power of 2 (Conquer and Merge)
Index:    1  2  3  4  5  6  7  8  9 10 11
Level 4:  4 | 7 | 2 | 6 | 1 | 4 | 7 | 3 | 5 | 2 | 6
Level 3:  4  7 | 2 | 1  6 | 4 | 3  7 | 5 | 2 | 6
Level 2:  2  4  7 | 1  4  6 | 3  5  7 | 2  6
Level 1:  1  2  4  4  6  7 | 2  3  5  6  7
Level 0:  1  2  2  3  4  4  5  6  6  7  7
Merging
• Input: Array A and indices p, q, r such that
p ≤ q < r
– Subarrays A[p . . q] and A[q + 1 . . r] are sorted
• Output: One single sorted subarray A[p . . r]
Index:  1  2  3  4  5  6  7  8
A:      2  4  5  7  1  2  3  6
        p = 1, q = 4, r = 8
Merging
• Idea for merging:
– Two piles of sorted cards
• Choose the smaller of the two top cards
• Remove it and place it in the output pile
– Repeat the process until one pile is empty
– Take the remaining input pile and place it face-down
onto the output pile
Index:  1  2  3  4  5  6  7  8
A:      2  4  5  7  1  2  3  6
        p = 1, q = 4, r = 8
A1 ← A[p . . q], A2 ← A[q + 1 . . r]; the merged output is written to A[p . . r]
Example: MERGE(A, 9, 12, 16)
[Figure: step-by-step trace of the merge with p = 9, q = 12, r = 16, continuing until the output is complete ("Done!")]
Merge - Pseudocode
Alg.: MERGE(A, p, q, r)
Compute n1 ← q – p + 1 and n2 ← r – q
Copy A[p . . q] into L[1 . . n1] and A[q + 1 . . r] into R[1 . . n2]
(L and R have n1 + 1 and n2 + 1 slots, leaving room for a sentinel)
L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
i ← 1; j ← 1
for k ← p to r
    do if L[ i ] ≤ R[ j ]
           then A[k] ← L[ i ]
                i ← i + 1
           else A[k] ← R[ j ]
                j ← j + 1
Index:  1  2  3  4  5  6  7  8
A:      2  4  5  7  1  2  3  6      (p = 1, q = 4, r = 8; n1 = 4, n2 = 4)
L:      2  4  5  7  ∞               (copy of A[p . . q])
R:      1  2  3  6  ∞               (copy of A[q + 1 . . r])
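The sentinel version of MERGE can be sketched in Python as below. The sketch assumes 0-based inclusive indices and numeric keys, so that `math.inf` can serve as the ∞ sentinel; for other key types a different sentinel or explicit bounds checks would be needed.

```python
import math


def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] (0-based, inclusive) in place."""
    # Copy each run and append a sentinel so neither run is ever exhausted.
    left = a[p:q + 1] + [math.inf]
    right = a[q + 1:r + 1] + [math.inf]
    i = j = 0
    for k in range(p, r + 1):      # exactly n = r - p + 1 iterations
        if left[i] <= right[j]:    # <= keeps equal keys in their original order
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
```

Because of the sentinels, the loop never has to test whether a run is empty, which is why each iteration costs constant time.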
Running Time of Merge
(assuming the last for loop of MERGE)
• Initialization (copying into temporary arrays):
– Θ(n1 + n2) = Θ(n)
• Adding the elements to the final array:
– n iterations, each taking constant time ⇒ Θ(n)
• Total time for Merge:
– Θ(n)
Analyzing Divide-and-Conquer Algorithms
• The recurrence is based on the three steps of
the paradigm:
– T(n) – running time on a problem of size n
– Divide the problem into a subproblems, each of size
n/b: takes D(n)
– Conquer (solve) the subproblems: takes aT(n/b)
– Combine the solutions: takes C(n)
\[
T(n) =
\begin{cases}
\Theta(1) & \text{if } n \le c \\
a\,T(n/b) + D(n) + C(n) & \text{otherwise}
\end{cases}
\]
MERGE-SORT Running Time
• Divide:
– compute q as the average of p and r: D(n) = Θ(1)
• Conquer:
– recursively solve 2 subproblems, each of size n/2
⇒ 2T(n/2)
• Combine:
– MERGE on an n-element subarray takes Θ(n) time
⇒ C(n) = Θ(n)
\[
T(n) =
\begin{cases}
\Theta(1) & \text{if } n = 1 \\
2\,T(n/2) + \Theta(n) & \text{if } n > 1
\end{cases}
\]
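Solve the Recurrence
\[
T(n) =
\begin{cases}
c & \text{if } n = 1 \\
2\,T(n/2) + cn & \text{if } n > 1
\end{cases}
\]
Use the Master Theorem with a = 2, b = 2:
Compare n^(log_b a) = n^(log_2 2) = n with f(n) = cn
Case 2: f(n) = Θ(n), so T(n) = Θ(n lg n)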
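As a quick sanity check of this result, the recurrence can be evaluated directly and compared with the exact solution cn lg n + cn, which holds when n is a power of 2. The function names and the default c = 1 below are illustrative assumptions, not part of the slides.

```python
def t_recurrence(n, c=1):
    """T(1) = c, T(n) = 2*T(n/2) + c*n, for n a power of 2."""
    if n == 1:
        return c
    return 2 * t_recurrence(n // 2, c) + c * n


def t_closed_form(n, c=1):
    """c*n*lg(n) + c*n, the exact solution when n is a power of 2."""
    lg_n = n.bit_length() - 1      # exact lg n for powers of 2
    return c * n * lg_n + c * n
```

The two functions agree for every power of 2, and the dominant term cn lg n is what the Master Theorem reports as Θ(n lg n).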
Merge Sort - Discussion
• Running time is insensitive to the input
• Advantage:
– Guaranteed to run in Θ(n lg n)
• Disadvantage:
– Requires extra space proportional to N
Sorting Challenge 1
Problem: Sort a file of huge records with tiny
keys
Example application: Reorganize your MP-3 files
Which method to use?
A. merge sort, guaranteed to run in time NlgN
B. selection sort
C. bubble sort
D. a custom algorithm for huge records/tiny keys
E. insertion sort
Sorting Files with Huge Records and
Small Keys
• Insertion sort or bubble sort?
– NO, too many exchanges
• Selection sort?
– YES, it performs only a linear number of exchanges
• Merge sort or custom method?
– Probably not: selection sort is simpler and does fewer swaps
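The claim about exchanges can be illustrated with a small selection sort that counts its swaps. This is a sketch with an assumed `key` parameter for extracting the tiny key from a huge record; the point is that each pass moves at most one record, so at most n – 1 records are exchanged in total.

```python
def selection_sort(records, key=lambda rec: rec):
    """Sort records in place by key; return the number of exchanges made."""
    swaps = 0
    n = len(records)
    for i in range(n - 1):
        # Find the record with the smallest key in records[i..n-1].
        smallest = i
        for j in range(i + 1, n):
            if key(records[j]) < key(records[smallest]):
                smallest = j
        # At most one exchange per pass, so at most n - 1 overall.
        if smallest != i:
            records[i], records[smallest] = records[smallest], records[i]
            swaps += 1
    return swaps
```

The number of key comparisons is still quadratic; only the number of record moves is linear, which is what matters when records are huge and keys are tiny.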
Sorting Challenge 2
Problem: Sort a huge randomly-ordered file of
small records
Application: Process transaction records for a
phone company
Which sorting method to use?
A. Bubble sort
B. Selection sort
C. Mergesort guaranteed to run in time NlgN
D. Insertion sort
Sorting Huge, Randomly-Ordered Files
• Selection sort?
– NO, always takes quadratic time
• Bubble sort?
– NO, quadratic time for randomly-ordered keys
• Insertion sort?
– NO, quadratic time for randomly-ordered keys
• Mergesort?
– YES, it is designed for this problem
Sorting Challenge 3
Problem: sort a file that is already almost in
order
Applications:
– Re-sort a huge database after a few changes
– Double-check that someone else sorted a file
Which sorting method to use?
A. Mergesort, guaranteed to run in time NlgN
B. Selection sort
C. Bubble sort
D. A custom algorithm for almost in-order files
E. Insertion sort
Sorting Files That are Almost in Order
• Selection sort?
– NO, always takes quadratic time
• Bubble sort?
– NO, bad for some definitions of “almost in order”
– Ex: B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
• Insertion sort?
– YES, takes linear time for most definitions of “almost
in order”
• Mergesort or custom method?
– Probably not: insertion sort simpler and faster
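To make the "almost in order" claim concrete, the sketch below counts how many element shifts insertion sort performs. The shift count equals the number of inversions in the input, so the total work is proportional to n plus the number of inversions. The function name and return value are illustrative choices.

```python
def insertion_sort(a):
    """Sort list a in place; return the number of element shifts."""
    shifts = 0
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        # Shift larger elements one slot right to open a gap for x.
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = x
    return shifts
```

On the slide's example B C D … Z A, only the final A is out of place, so insertion sort needs just 25 shifts, whereas a bubble sort that sweeps left to right moves A only one position per pass.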
Quicksort
• Sort an array A[p…r]
• Divide
– Partition the array A into 2 subarrays A[p..q] and A[q+1..r], such
that each element of A[p..q] is smaller than or equal to each
element in A[q+1..r]
– Need to find index q to partition the array
A[p…q] ≤ A[q+1…r]
Quicksort
• Conquer
– Recursively sort A[p..q] and A[q+1..r] using Quicksort
• Combine
– Trivial: the arrays are sorted in place
– No additional work is required to combine them
– The entire array is now sorted
A[p…q] ≤ A[q+1…r]
QUICKSORT
Alg.: QUICKSORT(A, p, r)
    if p < r
        then q ← PARTITION(A, p, r)
             QUICKSORT(A, p, q)
             QUICKSORT(A, q + 1, r)
Initially: p = 1, r = n
Recurrence:
T(n) = T(q) + T(n – q) + f(n), where f(n) is the cost of PARTITION()
Partitioning the Array
• Choosing PARTITION()
– There are different ways to do this
– Each has its own advantages/disadvantages
• Hoare partition (see prob. 7-1, page 159)
– Select a pivot element x around which to partition
– Grows two regions (i advances from the left, j from the right):
A[p…i] ≤ x
x ≤ A[j…r]
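Example
Hoare partition of A[p…r] with pivot x = 5 (positions numbered 1..8):
A[p…r]:                  5  3  2  6  4  1  3  7
j stops at 7, i at 1:    exchange A[1] ↔ A[7]
                         3  3  2  6  4  1  5  7
j stops at 6, i at 4:    exchange A[4] ↔ A[6]
                         3  3  2  1  4  6  5  7
j stops at 5, i at 6:    i > j, so return q = j = 5
A[p…q] = 3  3  2  1  4        A[q+1…r] = 6  5  7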
Partitioning the Array
Alg. PARTITION (A, p, r)
x ← A[p]
i ← p – 1
j ← r + 1
while TRUE
    do repeat j ← j – 1
           until A[j] ≤ x
       repeat i ← i + 1
           until A[i] ≥ x
       if i < j
           then exchange A[i] ↔ A[j]
           else return j
Running time: Θ(n), where n = r – p + 1
Each element is visited once!
Input A[p…r]:  5  3  2  6  4  1  3  7
On return, j = q and A[p…q] ≤ A[q+1…r]
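The PARTITION procedure and the QUICKSORT driver can be sketched together in Python. The sketch assumes 0-based inclusive indices, so the returned index is one less than on the slides, and each repeat…until loop is written as a decrement or increment followed by a while loop.

```python
def hoare_partition(a, p, r):
    """Partition a[p..r] around x = a[p]; return q with a[p..q] <= a[q+1..r]."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1                     # repeat j <- j - 1 until a[j] <= x
        while a[j] > x:
            j -= 1
        i += 1                     # repeat i <- i + 1 until a[i] >= x
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place using Hoare partitioning."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = hoare_partition(a, p, r)
        quicksort(a, p, q)         # note: q, not q - 1, with Hoare's scheme
        quicksort(a, q + 1, r)
```

The first recursive call includes position q because Hoare's scheme does not leave the pivot in its final position; it only guarantees that everything in a[p..q] is at most everything in a[q+1..r].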
Recurrence
Alg.: QUICKSORT(A, p, r)
    if p < r
        then q ← PARTITION(A, p, r)
             QUICKSORT(A, p, q)
             QUICKSORT(A, q + 1, r)
Initially: p = 1, r = n
Recurrence:
T(n) = T(q) + T(n – q) + n
Worst Case Partitioning
• Worst-case partitioning
– One region has one element and the other has n – 1 elements
– Maximally unbalanced
• Recurrence: q = 1
T(n) = T(1) + T(n – 1) + n,
T(1) = Θ(1)
\[
T(n) = T(n-1) + n = n + \left( \sum_{k=1}^{n} k \right) - 1 = \Theta(n) + \Theta(n^2) = \Theta(n^2)
\]
[Figure: recursion tree. Subproblem sizes n, n – 1, n – 2, n – 3, …, 2, 1 run down one side, each level also splitting off a leaf of size 1; the per-level costs n, n, n – 1, n – 2, …, 3, 2 sum to Θ(n²)]
When does the worst case happen?
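As a worked check of the bound, the simplified recurrence T(n) = T(n – 1) + n can be unrolled step by step (a sketch that treats T(1) as a constant):

\[
\begin{aligned}
T(n) &= T(n-1) + n \\
     &= T(n-2) + (n-1) + n \\
     &= \dots \\
     &= T(1) + \sum_{k=2}^{n} k \\
     &= \Theta(1) + \frac{n(n+1)}{2} - 1 = \Theta(n^2)
\end{aligned}
\]

One input that produces this behavior with the PARTITION procedure above (pivot x = A[p]) is an already sorted array of distinct keys: the scan from the right stops only at A[p], so every call returns q = p and splits off a single element.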
Best Case Partitioning
• Best-case partitioning
– Partitioning produces two regions of size n/2
• Recurrence: q = n/2
T(n) = 2T(n/2) + Θ(n)
T(n) = Θ(n lg n) (Master theorem)
Case Between Worst and Best
• 9-to-1 proportional split
Q(n) = Q(9n/10) + Q(n/10) + n
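A short recursion-tree argument, sketched here under the usual simplification of ignoring floors and ceilings, suggests why even a 9-to-1 split stays at n lg n. Every level of the tree costs at most n, the shallowest leaf sits at depth log_10 n, and the deepest leaf sits at depth log_{10/9} n, so:

\[
Q(n) = \Omega\!\left(n \log_{10} n\right)
\quad\text{and}\quad
Q(n) = O\!\left(n \log_{10/9} n\right),
\quad\text{hence}\quad
Q(n) = \Theta(n \lg n).
\]

Both logarithms differ from lg n only by a constant factor, so any split with a constant proportion gives the same asymptotic running time.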
How does partition affect performance?
Performance of Quicksort
• Average case
– All permutations of the input numbers are equally likely
– On a random input array, we will have a mix of well-balanced
and unbalanced splits
– Good and bad splits are randomly distributed throughout
the tree
• Alternating a bad and a good split: n → (1, n – 1), then n – 1 → ((n – 1)/2, (n – 1)/2)
– combined partitioning cost: 2n – 1 = Θ(n)
• Nearly well-balanced split: n → ((n – 1)/2, (n – 1)/2 + 1)
– partitioning cost: n = Θ(n)
• Running time of Quicksort when levels alternate
between good and bad splits is O(n lg n)
