Skip to content

Instantly share code, notes, and snippets.

@ccj5351
Last active September 29, 2024 00:24
Show Gist options
  • Save ccj5351/ec1d2012a91eee858ac0c0b1c2021522 to your computer and use it in GitHub Desktop.
Save ccj5351/ec1d2012a91eee858ac0c0b1c2021522 to your computer and use it in GitHub Desktop.
build-a-heap-O(n)-time

How can building a heap be O(n) time complexity?

Several highly voted answers have given excellent explanations. I will show several figures to show the proof clearly and hope it is easy to understand.

Height of a Heap

The height of heap h = O(log(N)), where N is the number of (internal) nodes.

Note: ◯ for internal nodes, and □ for external nodes or leaf nodes; values are only saved at internal nodes.

depth #node
0, 1                         ----   ◯ (Root) --------
				   / \ 
1, 2                      ________◯   ◯ ________________
...                           ...   ...  ...
			        / \        / \ 
h-2, 2^{h-2}          _______ ◯    ◯      ◯     ◯ ____________
		     ______  / \  / \    / \   / \ ____________
h-1, 1             ________ ◯  □  □  □   □  □  □  □ __________
                           / \
             ___________  □  □  ______________________________

We can prove that h = O(log(N)) as below.

  • Let h be the height of a heap storing n internal nodes.
  • For level i = 0, 1, ..., h-2, there are $2^i$ nodes.
  • For level i=h-1, there are at least 1 node.
  • Thus, we have the following equation

$$ \begin{aligned} n &\geq 1 + \sum_{i=0}^{h-2} 2^{i} = 1 + ( 1 + 2 + 4 + \dots + 2^{h-2}) \\ & = 2^{h-1} + 1 \text{ ( via geometric progression sum)} \\ & \Rightarrow 2^{h-1} \leq n-1 < n \\ & \Rightarrow h-1 < \log(n) \\ & \Rightarrow h < \log(n) + 1 \\ & \Rightarrow h = O(\log{n}) \text{ \qquad \qquad (1)} \end{aligned} $$

Heapify a list of "n" numbers

Let us prove that the total time complexity is O(n) to heapify a list of n numbers.

  • At the bottom level h-1, there is at least one node. No need to adjust it.
  • At the bottom level h-2, there are $2^{h-2}$ nodes, but each node might move down by 1 level.
  • At third level from bottom, there are $2^{h-3}$ nodes, but each node might move down by 2 levels.
  • So on and so forth, at level i, there are $2^{i}$ nodes, but each node might move down by h-1-i levels.
  • Therefore, time complexity T is

$$ \begin{aligned} T &= \sum_{i=0}^{h-2} \underbrace{2^i}{\text{# of nodes}} \cdot \underbrace{(h-1-i)}{\text{# of down-move}} \ &= 1 \cdot (h-1) + 2 \cdot (h-2) + 4 \cdot (h-3) + \dots + 2^{h-3} \cdot 2 + 2^{h-2} \cdot 1 \text{\quad (2)} \end{aligned} $$

Then multiply Eq (2) with 2, to get

$$ 2T = 2 \cdot (h-1) + 4 \cdot (h-2) + 8 \cdot (h-3) + \dots + 2^{h-2} \cdot 2 + 2^{h-1} \cdot 1 \text{\quad (3)} $$

Let Eq (3) - Eq (2) to get:

Left hand side is 2T -T = T = Right hand side. Therefore, we have

$$ \begin{aligned} T &= -1 \cdot (h-1) + \underbrace{[2(h-1) - 2(h-2)]}{\text{diff}=(h-1)-(h-2)=1} + \underbrace{[4(h-2) - 4(h-3)]}{\text{diff}=(h-2)-(h-3)=1} + \ & \dots + \underbrace{ [2^{h-3} \cdot 3 - 2^{h-3} \cdot 2}{\text{diff}=3-2=1}] + \underbrace{[2^{h-2} \cdot 2 - 2^{h-2} \cdot 1]}{\text{diff}=2-1=1} + 2^{h-1} \cdot 1 \ &= -h + 1 + [2\cdot 1 + 4\cdot 1 + \dots + 2^{h-3} \cdot 1 + 2^{h-2} \cdot 1] + 2^{h-1} \cdot 1 \ &= [1 + 2 + 4 + \dots + 2^{h-3} + 2^{h-2} + 2^{h-1} ] - h \ &= \frac{1 - 2^{h-1}\cdot 2}{1-2} - h \ T &= 2^{h} - 1 -h \text{\qquad \qquad (4)} \end{aligned} $$

To further relax the right hand side, we have

$$ \begin{aligned} T &= 2^{h} - 1 -h \\ &< 2^{h} - 1 - 0 \\ &< 2^{h} \quad (\because 2^{h-1} < n \text{ from Eq (1), and } h > 0) \\ & <2n \qquad \text{(5)} \end{aligned} $$

Therefore, Eq (5) proves $T = O(n)$. Proof done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment