Like merge sort, but unlike insertion sort, heapsort’s running time is O.n lg n/. Like insertion sort, but unlike merge sort, heapsort sorts in place: only a constant number of array elements are stored outside the input array at any time. Heapsort also introduces another algorithm design technique: using a data structure, in this case one we call a “heap,” to manage information. Not only is the heap data structure useful for heapsort, but it also makes an efficient priority queue. The term “heap” was originally coined in the context of heapsort, but it has since come to refer to “garbage-collected storage,” such as the programming languages Java and Lisp provide. Our heap data structure is not garbage-collected storage, and whenever we refer to heaps, we shall mean a data structure rather than an aspect of garbage collection.
The (binary) heap data structure is an array object that we can view as a nearly complete binary tree. Each node of the tree corresponds to an element of the array. An array A that represents a heap is an object with two attributes:
- A:length, which (as usual) gives the number of elements in the array.
- A:heap-size, which represents how many elements in the heap are stored within array A.
That is, although A[1..A.length] may contain numbers, only the elements in A[1..A.heap-size]. The root of the tree is A[1], and given the index i of a node, we can easily compute the indices of its parent, left child, and right child:
A max-heap viewed as (a) a binary tree and (b) an array. The number within the circle at each node in the tree is the value stored at that node. The number above a node is the corresponding index in the array. Above and below the array are lines showing parent-child relationships; parents are always to the left of their children. The tree has height three; the node at index 4 (with value 8) has height one.
PARENT (i)
//Parent of i is at the position floor value of i/2
return floor(i/2)
LEFT(i)
// Left child of i is at the position 2 times i
return 2i
RIGHT(i)
// Right child of i is at the position 2 times i plus 1
return 2i + 1
There are two kinds of binary heaps: max-heaps and min-heaps. In both kinds, the values in the nodes satisfy a heap property, the specifics of which depend on the kind of heap. In a max-heap, value at each root must be greater than its children. A min-heap is organized in the opposite way the min-heap propery is that value at each root must be smaller than its children. Since a heap of n elements is based on a complete binary tree, its height is lg n.
- The MAX -HEAPIFY procedure, which runs in (lg n) time, is the key to maintaining the max-heap property.
- The BUILD -MAX -HEAP procedure, which runs in linear time, produces a max-heap from an unordered input array.
- The HEAPSORT procedure, which runs in (n lg n) time, sorts an array in place.
// Pseudo Code
MAX-HEAPIFY(A,i)
l = LEFT(i)
r = RIGHT(i)
if l <= A.heap-size AND A[l] > A[i]
largest = l
else
largest = i
if r <= A.heap-size AND A[r] > A[largest]
largest = r
if largest != i
exchange A[i] with A[largest]
MAX-HEAPIFY(A,largest)
// C++ Code
void MAX_HEAPIFY(int A[],int n,int i){
int largest = i;
int l = node_left(i);
int r = node_right(i);
if (l < n && A[l] > A[largest]){
largest = l;
}
if (r < n && A[r] > A[largest]){
largest = r;
}
if (largest != i){
swap_el(A,i,largest);
MAX_HEAPIFY(A,n,largest);
}
}
// Pseudo Code
BUILD-MAX-HEAP(A)
A.heap-size = A.length
for i = floor(A.length/2) downto 1
MAX-HEAPIFY(A,i)
// C++ Code
void BUILD_MAX_HEAP(int A[],int n){
for(int i = floor(n/2) - 1; i >= 0;i--){
MAX_HEAPIFY(A,n,i);
}
}
// Pseudo Code
HEAPSORT(A)
BUILD-MAX-HEAP(A)
for i = A.length downto 2
exchange A[1] with A[i]
A.heap-size = A.heap-size - 1
MAX-HEAPIFY(A,1)
// C++ Code
void HEAPSORT(int A[],int n){
BUILD_MAX_HEAP(A,n);
for(int i = n - 1; i >= 0;i--){
swap_el(A,i,0);
MAX_HEAPIFY(A,i,0);
}
}