Heaps

A heap is a type of tree data structure that satisfies the heap property: every node's key is greater than or equal to the keys of its children.

Here are some example heaps:

Three max-at-top heaps of various shapes.

Note that with the heap property, it necessarily means that the maximum value in the heap is at the root of the tree. This makes it a very natural data structure to use when implementing a priority queue, since the highest priority element is readily available!

The most common way of implementing a heap is by using a complete binary tree.

Complete binary tree

A complete binary tree (note: not a binary search tree) is a binary tree of height h where:

  • levels 0 through h - 1 are fully occupied
  • in level h, there are no "gaps" to the left of any node

In essence, a complete tree is one where the tree is filled with nodes from top to bottom, left to right, and each node is full of nodes except for possibly the last layer. Here are some examples of complete trees:

Three complete trees of various shapes.

And here are some examples of trees that are not complete (where the node that's missing to prevent it from being complete is highlighted):

Three trees of various shapes that are close to being complete trees, but are missing a node that prevents it from being complete.

Representing a complete binary tree

In our previous data structures course, we represented trees a linked data structure: every node in a tree had some data element, as well as pointers to its left child, right child, and parent. This works great in the general case for representing trees.

However, when using complete trees (as we are for heaps), we can actually have a simpler representation that just uses an array. This makes it a lot simpler to manage and manipulate the heap.

To represent a complete tree as an array, the trees's nodes are stored in the order given by a level-order traversal; top to bottom, left to right:

A complete tree showing how it can be interpreted using an array. The root node shows a[0] and its left and right children show a[1] and a[2], respectively. The node with a[1] has a[3] and a[4] as its left and right child, etc.
Side-by-side comparison of a complete tree and its array representation.

Once the tree is represented as an array (or in our case, a list in Python), you can use the following formulas to navigate the tree:

  • The root node is in lst[0]
  • Given the node in lst[i]:
    • its left child is in lst[2*i + 1]
    • its right child is in lst[2*i + 2]
    • its parent is in lst[(i - 1)/2] (using integer division)

For example, consider this complete tree:

Another complete tree where each node is marked with the corresponding array indexes from the array representation.
  • the left child of the node in lst[1] is in lst[2*1 + 1] = lst[3]
  • the left child of the node in lst[2] is in lst[2*2 + 1] = lst[5]
  • the right child of the node in lst[3] is in lst[2*3 + 2] = lst[8]
  • the parent of the node in lst[4] is in lst[(4 - 1)/2] = lst[1]

Summary

To summarize, we are now looking to implement the priority queue ADT using a max-at-top heap. Most often, heaps are complete trees, which are implemented as arrays due to the simplicity of the representation.

Watch the following video to see a summary of heaps and complete trees: