Lecture 7:
  Heapsort / Priority Queues
         Steven Skiena

Department of Computer Science
 State University of New York
 Stony Brook, NY 11794–4400

http://www.cs.sunysb.edu/~skiena
Problem of the Day
Take as input a sequence of 2n real numbers. Design an
O(n log n) algorithm that partitions the numbers into n pairs,
with the property that the partition minimizes the maximum
sum of a pair.
For example, say we are given the numbers (1,3,5,9).
The possible partitions are ((1,3),(5,9)), ((1,5),(3,9)), and
((1,9),(3,5)). The pair sums for these partitions are (4,14),
(6,12), and (10,8). Thus the third partition has 10 as
its maximum sum, which is the minimum over the three
partitions.
Solution
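The solution slide is not reproduced here. The standard approach is to sort the 2n numbers and then pair the smallest remaining number with the largest remaining one; the largest of these pair sums is the answer. A minimal C sketch under that assumption (cmp_double and min_max_pair_sum are my names, not from the slides):

#include <stdio.h>
#include <stdlib.h>

/* Illustrative sketch, not lecture code.  Comparison function for
   qsort: ascending order of doubles. */
static int cmp_double(const void *a, const void *b)
{
      double x = *(const double *) a, y = *(const double *) b;
      return (x > y) - (x < y);
}

/* Sort the 2n numbers in O(n log n), then pair the smallest remaining
   with the largest remaining and report the maximum pair sum. */
double min_max_pair_sum(double a[], int n)        /* a holds 2n numbers */
{
      double best = 0.0;
      int i;

      qsort(a, 2 * n, sizeof(double), cmp_double);
      for (i = 0; i < n; i++) {
             double s = a[i] + a[2 * n - 1 - i];
             if (i == 0 || s > best) best = s;
      }
      return best;
}

int main(void)
{
      double a[] = {1, 3, 5, 9};                  /* example from the slide */
      printf("%g\n", min_max_pair_sum(a, 2));     /* prints 10 */
      return 0;
}

On the example (1,3,5,9) this pairs (1,9) and (3,5), giving maximum pair sum 10, matching the slide.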
Importance of Sorting
Why don’t CS profs ever stop talking about sorting?
1. Computers spend more time sorting than anything else,
   historically 25% on mainframes.
2. Sorting is the best studied problem in computer science,
   with a variety of different algorithms known.
3. Most of the interesting ideas we will encounter in the
   course can be taught in the context of sorting, such as
   divide-and-conquer, randomized algorithms, and lower
   bounds.
You should have seen most of these algorithms before; we will
concentrate on the analysis.
Efficiency of Sorting
Sorting is important because once a set of items is sorted, many
other problems become easy.
Further, using O(n log n) sorting algorithms leads naturally to
sub-quadratic algorithms for these problems.
                         n           n²/4        n lg n
                        10             25            33
                       100          2,500           664
                     1,000        250,000         9,965
                    10,000     25,000,000       132,877
                   100,000  2,500,000,000     1,660,960
Large-scale data processing would be impossible if sorting
took Ω(n²) time.
Application of Sorting: Searching
Binary search lets you test whether an item is in a dictionary
in O(lg n) time.
Search preprocessing is perhaps the single most important
application of sorting.
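A minimal sketch of the binary search itself, assuming the dictionary is a sorted array of ints (the function name is mine):

/* Illustrative sketch: binary search for key in sorted a[0..n-1].
   Returns an index holding key, or -1 if absent; O(lg n) probes. */
int binary_search(const int a[], int n, int key)
{
      int lo = 0, hi = n - 1;

      while (lo <= hi) {
             int mid = lo + (hi - lo) / 2;      /* avoids overflow of lo + hi */
             if (a[mid] == key) return mid;
             if (a[mid] < key)  lo = mid + 1;
             else               hi = mid - 1;
      }
      return -1;
}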
Application of Sorting: Closest pair
Given n numbers, find the pair which are closest to each other.
Once the numbers are sorted, the closest pair will be next to
each other in sorted order, so an O(n) linear scan completes
the job.
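A sketch of the linear scan, assuming the numbers have already been sorted (closest_pair_gap is my name for it):

/* Illustrative sketch: given n >= 2 numbers in sorted order, the
   closest pair must be adjacent, so one O(n) scan finds the gap. */
double closest_pair_gap(const double a[], int n)
{
      double best = a[1] - a[0];
      int i;

      for (i = 2; i < n; i++)
             if (a[i] - a[i - 1] < best)
                    best = a[i] - a[i - 1];
      return best;
}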
Application of Sorting: Element Uniqueness
Given a set of n items, are they all unique or are there any
duplicates?
Sort them and do a linear scan to check all adjacent pairs.
This is a special case of closest pair above.
Application of Sorting: Mode
Given a set of n items, which element occurs the largest
number of times? More generally, compute the frequency
distribution.
Sort them and do a linear scan to measure the length of all
adjacent runs.
The number of instances of k in a sorted array can be found in
O(log n) time by using binary search to look for the positions
of both k − ε and k + ε, i.e., just before and just after the run of k's.
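A sketch of the two boundary searches on a sorted array of ints (lower_bound, upper_bound, and count_occurrences are my names; they return the first position ≥ k and the first position > k):

/* Illustrative sketch: first index with a[i] >= key in sorted a[0..n-1]. */
static int lower_bound(const int a[], int n, int key)
{
      int lo = 0, hi = n;

      while (lo < hi) {
             int mid = lo + (hi - lo) / 2;
             if (a[mid] < key) lo = mid + 1;
             else hi = mid;
      }
      return lo;
}

/* First index with a[i] > key. */
static int upper_bound(const int a[], int n, int key)
{
      int lo = 0, hi = n;

      while (lo < hi) {
             int mid = lo + (hi - lo) / 2;
             if (a[mid] <= key) lo = mid + 1;
             else hi = mid;
      }
      return lo;
}

/* Number of copies of key, via two O(log n) binary searches. */
int count_occurrences(const int a[], int n, int key)
{
      return upper_bound(a, n, key) - lower_bound(a, n, key);
}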
Application of Sorting: Median and Selection
What is the kth largest item in the set?
Once the keys are placed in sorted order in an array, the kth
largest can be found in constant time by simply looking in the
kth position of the array.
There is a linear time algorithm for this problem, but the idea
comes from partial sorting.
Application of Sorting: Convex hulls
Given n points in two dimensions, find the smallest area
polygon which contains them all.




The convex hull is like a rubber band stretched over the
points.
Convex hulls are the most important building block for more
sophisticated geometric algorithms.
Finding Convex Hulls
Once you have the points sorted by x-coordinate, they can be
inserted from left to right into the hull, since the rightmost
point is always on the boundary.




Sorting eliminates the need to check whether points are inside
the current hull.
Adding a new point might cause others to be deleted.
Pragmatics of Sorting: Comparison Functions
Alphabetizing is the sorting of text strings.
Libraries have very complete and complicated rules con-
cerning the relative collating sequence of characters and
punctuation.
Is Skiena the same key as skiena?
Is Brown-Williams before or after Brown, America? Before or
after Brown, John?
Explicitly controlling the order of keys is the job of the
comparison function we apply to each pair of elements.
This is how you resolve the question of increasing or
decreasing order.
Pragmatics of Sorting: Equal Elements
Elements with equal key values will all bunch together in any
total order, but sometimes the relative order among these keys
matters.
Sorting algorithms that always leave equal items in the same
relative order as in the original permutation are called stable.
Unfortunately, very few fast algorithms are stable, but
stability can be achieved by adding the initial position as a
secondary key.
Pragmatics of Sorting: Library Functions
Any reasonable programming language has a built-in sort
routine as a library function.
You are almost always better off using the system sort than
writing your own routine.
For example, the standard library for C contains the function
qsort for sorting:
void qsort(void *base, size_t nel, size_t width,
      int (*compare)(const void *, const void *));
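A small usage sketch: sorting an array of strings with qsort, where the comparison function encodes the collating policy (here plain strcmp; swap its arguments for decreasing order). The names cmp_strings and names are mine:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative sketch.  qsort hands the comparison function pointers
   to the array elements, which here are themselves char pointers. */
static int cmp_strings(const void *a, const void *b)
{
      const char * const *x = a;
      const char * const *y = b;
      return strcmp(*x, *y);
}

int main(void)
{
      const char *names[] = {"Skiena", "Brown", "Aho"};
      int i;

      qsort(names, 3, sizeof(names[0]), cmp_strings);
      for (i = 0; i < 3; i++)
             printf("%s\n", names[i]);
      return 0;
}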
Selection Sort
Selection sort scans through the entire array, repeatedly
finding the smallest remaining element.
For i = 1 to n
A: Find the smallest of the first n − i + 1 items.
B:    Pull it out of the array and put it first.


Selection sort takes O(n · (T(A) + T(B))) time.
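A sketch of the quadratic array version described above (selection_sort is my name for it):

/* Illustrative sketch of O(n^2) selection sort on an int array:
   step A (scan for the minimum of the unsorted suffix) costs O(n),
   step B (swap it into place) costs O(1), repeated n times. */
void selection_sort(int a[], int n)
{
      int i, j, min;

      for (i = 0; i < n; i++) {
             min = i;
             for (j = i + 1; j < n; j++)        /* step A */
                    if (a[j] < a[min]) min = j;
             if (min != i) {                    /* step B */
                    int tmp = a[i];
                    a[i] = a[min];
                    a[min] = tmp;
             }
      }
}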
The Data Structure Matters
Using arrays or unsorted linked lists as the data structure,
operation A takes O(n) time and operation B takes O(1), for
an O(n²) selection sort.
Using balanced search trees or heaps, both of these operations
can be done within O(lg n) time, for an O(n log n) selection
sort, balancing the work and achieving a better tradeoff.
Key question: “Can we use a different data structure?”
Heap Definition
A binary heap is defined to be a binary tree with a key in each
node such that:
1. All leaves are on, at most, two adjacent levels.
2. All leaves on the lowest level occur to the left, and all
   levels except the lowest one are completely filled.
3. The key in the root is ≤ the keys of its children, and the left
   and right subtrees are again binary heaps.
Conditions 1 and 2 specify the shape of the tree, and condition 3
the labeling of the tree.
Binary Heaps
Heaps maintain a partial order on the set of elements which
is weaker than the sorted order (so it can be efficient to
maintain) yet stronger than random order (so the minimum
element can be quickly identified).
[Figure: a heap-labeled tree of important years (left), with the
corresponding implicit heap representation (right).]

      position:   1     2     3     4     5     6     7     8     9    10
      key:      1492  1783  1776  1804  1865  1945  1963  1918  2001  1941
Array-Based Heaps
The most natural representation of this binary tree would
involve storing each key in a node with pointers to its two
children.
However, we can store a tree as an array of keys, using
the position of the keys to implicitly satisfy the role of the
pointers.
The left child of k sits in position 2k and the right child in
2k + 1.
The parent of k is in position ⌊k/2⌋.
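The code on the later slides relies on a priority_queue type and a few index helpers that the slides do not show. The following is a sketch of what they plausibly look like; PQ_SIZE, item_type, and the helper bodies are my reconstruction, not the lecture's exact declarations:

#define PQ_SIZE 1024                       /* arbitrary fixed capacity (illustrative) */

typedef int item_type;                     /* heap keys */

typedef struct {
      item_type q[PQ_SIZE + 1];            /* body of the queue; q[0] unused */
      int n;                               /* number of elements stored */
} priority_queue;

/* 1-based index arithmetic: children of k are 2k and 2k+1,
   the parent of k is floor(k/2); the root (k = 1) has no parent. */
int pq_parent(int k)      { return (k == 1) ? -1 : k / 2; }
int pq_young_child(int k) { return 2 * k; }

void pq_swap(priority_queue *q, int i, int j)
{
      item_type tmp = q->q[i];
      q->q[i] = q->q[j];
      q->q[j] = tmp;
}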
Can we Implicitly Represent Any Binary Tree?
The implicit representation is space-efficient only if the tree is
not sparse, where sparse means the number of nodes n < 2^h for a
tree of height h.
All missing internal nodes still take up space in our structure.
This is why we insist on heaps as being as balanced/full at
each level as possible.
The array-based representation is also not as flexible to
arbitrary modifications as a pointer-based tree.
Constructing Heaps
Heaps can be constructed incrementally, by inserting new
elements into the left-most open spot in the array.
If the new element is less than its parent, swap their
positions and recur.
Since all but the last level is always filled, the height h of an
n-element heap is bounded because:

      Σ_{i=0}^{h} 2^i = 2^(h+1) − 1 ≥ n

so h = ⌈lg n⌉.
Doing n such insertions takes Θ(n log n), since the last n/2
insertions require O(log n) time each.
Heap Insertion

void pq_insert(priority_queue *q, item_type x)
{
      if (q->n >= PQ_SIZE)
             printf("Warning: overflow insert\n");
      else {
             q->n = (q->n) + 1;
             q->q[ q->n ] = x;        /* place x in the leftmost open slot */
             bubble_up(q, q->n);      /* restore the heap property */
      }
}
Bubble Up

void bubble_up(priority_queue *q, int p)
{
      if (pq_parent(p) == -1) return;        /* at the root; no parent */

      if (q->q[pq_parent(p)] > q->q[p]) {    /* parent larger: heap order violated */
             pq_swap(q, p, pq_parent(p));
             bubble_up(q, pq_parent(p));
      }
}
Bubble Down or Heapify
Robert Floyd found a better way to build a heap, using a
merge procedure called heapify.
Given two heaps and a fresh element, they can be merged into
one by making the new element the root and bubbling it down.
Build-heap(A)
      n = |A|
      For i = ⌊n/2⌋ down to 1 do
            Heapify(A,i)
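In C, with the declarations sketched earlier and the bubble_down routine on the next slide standing in for Heapify, Floyd's construction is a short loop (make_heap is my name for it):

/* Illustrative sketch of Build-heap: copy the keys into positions
   1..n, then bubble down every non-leaf position from n/2 to 1.
   Assumes n <= PQ_SIZE and that bubble_down (next slide) is defined. */
void make_heap(priority_queue *q, item_type s[], int n)
{
      int i;

      q->n = n;
      for (i = 0; i < n; i++)
             q->q[i + 1] = s[i];

      for (i = q->n / 2; i >= 1; i--)
             bubble_down(q, i);
}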
Bubble Down Implementation

void bubble_down(priority_queue *q, int p)
{
      int c;            /* child index */
      int i;            /* counter */
      int min_index;    /* index of lightest child */

      c = pq_young_child(p);
      min_index = p;

      for (i = 0; i <= 1; i++)
             if ((c + i) <= q->n) {
                    if (q->q[min_index] > q->q[c + i]) min_index = c + i;
             }

      if (min_index != p) {
             pq_swap(q, p, min_index);
             bubble_down(q, min_index);
      }
}
Exact Analysis of Heapify
In fact, heapify performs better than O(n log n), because most
of the heaps we merge are extremely small.
It follows exactly the same analysis as with dynamic arrays
(Chapter 3).
In a full binary tree on n nodes, there are at most ⌈n/2^(h+1)⌉
nodes of height h, so the cost of building a heap is:

      Σ_{h=0}^{⌊lg n⌋} ⌈n/2^(h+1)⌉ · O(h) = O(n · Σ_{h=0}^{⌊lg n⌋} h/2^h)
Since this sum is not quite a geometric series, we can’t apply
the usual identity to get the sum. But it should be clear that
the series converges.
Proof of Convergence (*)
The identity for the sum of an infinite geometric series is

      Σ_{k=0}^{∞} x^k = 1/(1 − x)

If we take the derivative of both sides, we get:

      Σ_{k=0}^{∞} k·x^(k−1) = 1/(1 − x)²

Multiplying both sides of the equation by x gives:

      Σ_{k=0}^{∞} k·x^k = x/(1 − x)²
Substituting x = 1/2 gives a sum of 2, so Build-heap uses at
most 2n comparisons and thus linear time.
Is our Analysis Tight?
“Are we doing a careful analysis? Might our algorithm be
faster than it seems?”
Doing at most x operations of at most y time each takes total
time O(xy).
However, if we overestimate too much, our bound may not be
as tight as it should be!
Heapsort
Heapify can be used to construct a heap, using the observation
that an isolated element forms a heap of size 1.
Heapsort(A)
     Build-heap(A)
     for i = n to 1 do
            swap(A[1],A[i])
            n=n−1
            Heapify(A,1)

Exchanging the maximum element with the last element
and calling heapify repeatedly gives an O(n lg n) sorting
algorithm. Why is it not O(n)?
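The pseudocode above swaps the root to the end of the array in place. With the min-heap routines sketched on the earlier slides, an equivalent way to express heapsort is to build the heap and then repeatedly extract the minimum into the output array; a sketch (extract_min and heap_sort are my names):

/* Illustrative sketch of delete-minimum: the root q->q[1] is the
   smallest key.  Replace it with the last element, shrink the heap,
   and bubble down to restore the heap property in O(lg n) time. */
item_type extract_min(priority_queue *q)
{
      item_type min = q->q[1];          /* assumes q->n >= 1 */

      q->q[1] = q->q[q->n];
      q->n = q->n - 1;
      bubble_down(q, 1);

      return min;
}

/* Heapsort: O(n) construction plus n extractions of O(lg n) each,
   for O(n lg n) overall. */
void heap_sort(item_type s[], int n)
{
      int i;
      priority_queue q;

      make_heap(&q, s, n);
      for (i = 0; i < n; i++)
             s[i] = extract_min(&q);
}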
Priority Queues
Priority queues are data structures which provide extra
flexibility over sorting.
This is important because jobs often enter a system at
arbitrary intervals. It is more cost-effective to insert a new
job into a priority queue than to re-sort everything on each
new arrival.
Priority Queue Operations
The basic priority queue supports three primary operations:
 • Insert(Q,x): Given an item x with key k, insert it into the
   priority queue Q.
 • Find-Minimum(Q) or Find-Maximum(Q): Return a
   pointer to the item whose key value is smaller (larger)
   than any other key in the priority queue Q.
 • Delete-Minimum(Q) or Delete-Maximum(Q): Remove
   the item from the priority queue Q whose key is minimum
   (maximum).
Each of these operations can be easily supported using heaps
or balanced binary trees in O(log n).
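A small usage sketch of the three operations against the routines sketched earlier: pq_insert for Insert, reading q.q[1] for Find-Minimum (the root of a min-heap), and extract_min for Delete-Minimum. The test values are the years from the earlier heap figure:

#include <stdio.h>

int main(void)
{
      priority_queue q;
      item_type years[] = {1804, 1945, 1492, 1918, 1776};   /* illustrative input */
      int i;

      q.n = 0;
      for (i = 0; i < 5; i++)
             pq_insert(&q, years[i]);            /* Insert(Q,x)       */

      printf("min = %d\n", q.q[1]);              /* Find-Minimum(Q)   */

      while (q.n > 0)
             printf("%d\n", extract_min(&q));    /* Delete-Minimum(Q) */

      return 0;
}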
Applications of Priority Queues: Dating
What data structure should be used to suggest who to ask out
next for a date?
It needs to support retrieval by desirability, not name.
Desirability changes (up or down), so you can re-insert the
max with the new score after each date.
New people you meet get inserted with your observed
desirability level.
There is never a reason to delete anyone until they rise to the
top.
Applications of Priority Queues: Discrete Event
Simulations
In simulations of airports, parking lots, and jai-alai, priority
queues can be used to maintain who goes next.
The stack and queue orders are just special cases of orderings.
In real life, certain people cut in line.
Applications of Priority Queues: Greedy
Algorithms
In greedy algorithms, we always pick the next thing which
locally maximizes our score. By placing all the things in a
priority queue and pulling them off in order, we can improve
performance over linear search or sorting, particularly if the
weights change.
War Story: sequential strips in triangulations
