二叉堆(大优先)

1. Introduction

A Binary (Max) Heap is a complete binary tree that maintains the Max Heap property.


Binary Heap is one possible data structure to model an efficient Priority Queue (PQ) Abstract Data Type (ADT). In a PQ, each element has a "priority" and an element with higher priority is served before an element with lower priority (ties are broken with standard First-In First-Out (FIFO) rule as with normal Queue). Try clicking ExtractMax() for a sample animation on extracting the max value of random Binary Heap above.


To focus the discussion scope, this visualization show a Binary Max Heap of integers where duplicates are allowed. See this for an easy conversion to Binary Min Heap.

1-1. 定义

完整二叉树: 每一层都是二叉树 都被填满 ,除了最低/最下面的一层,并且所有的最底层顶点都尽力向左靠拢。


最大二叉堆特性: 每个顶点的父元素 - 除了根元素 - 都比当前元素的值要大。这种定义比下面这种更容易验证:一个顶点的值 - 除了叶顶点 - 必须必它的一个或者两个子元素要大。

1-2. 优先队 ADT(抽象数据类型)

Priority Queue (PQ) Abstract Data Type (ADT) is similar to normal Queue ADT, but with these two major operations:


  1. Enqueue(x): Put a new element (key) x into the PQ (in some order),
  2. y = Dequeue(): Return an existing element y that has the highest priority (key) in the PQ and if ties, return the one that is inserted first, i.e. back to First-In First-Out (FIFO) behavior of a normal Queue

1-3. 例子

想象:你是一个在机场控制塔工作的 空中交通管理员 (ATC)

你已经安排了飞机 X/Y 在接下来的 3/6 分钟分别降落。两辆飞机都有足够飞行15分钟的油并且都在距机场2分钟的位置。据你观察机场跑道现在空旷。

如果你不知道的话,飞机可以听令在机场附近飞 等待航线 直到指定的降落时间。

1-4. NUS 课堂独享

You have to attend the live lecture to figure out what happens next...


There will be two options presented to you and you will have to decide:


  1. Raise AND wave your hand if you choose option A,
  2. Raise your hand but do NOT wave it if you choose option B,

If none of the two options is reasonable for you, simply do nothing.

1-5. 例子 - 继续上页

[This is a hidden slide]

1-6. 优先队例子

除了你刚才看到的几个例子外(NUS课堂独享) 现实生活中还有很多可以用到优先队ADT的地方。
讨论:你能提出一些其他的在生活中能用到优先队的地方吗?

1-7. 潜在答案

[This is a hidden slide]

1-8. 线性的数据结构来做优先队?

We are able to implement this PQ ADT using (circular) array or Linked List but we will have slow (i.e., in O(N)) Enqueue or Dequeue operation.


Discussion: Why?

1-9. 答案 - 第一部分

[This is a hidden slide]

1-10. 答案 - 第二部分

[This is a hidden slide]

2. 动画 + 最大堆特性复习

Now, let's view the visualisation of a (random) Binary (Max) Heap above. You should see a complete binary tree and all vertices except the root satisfy the Max Heap property (A[parent(i)] > A[i] — which is still fine even with the presence of duplicate integers).


You can Toggle the Visualization Mode between the visually more intuitive complete binary tree form or the underlying compact array based implementation of a Binary (Max) Heap.


Quiz: Based on this Binary (Max) Heap property, where will the largest integer be located?

At one of the leaf
Can be anywhere
At the root

2-1. 二叉堆的高度是 O(log N)

要记住的重要知识点:如果我们有一个含有N个元素的二叉堆,因为我们要把它储存成一个完整的二叉树,所以他的高度不会高过 O(log N)
简单分析:一个满的(不止是完整的)含有N个元素的二叉树的高度 h 总是 N = 2(h+1)-1,所以 h = log2(N+1)-1 ~= log2 N。
看上面的例子 N = 7 = 2(2+1)-1 或者 h = log2(7+1)-1 = 2.
对于有关二叉堆的操作 以上知识点很重要。

2-2. 以1为首元素的紧凑数组

A complete binary tree can be stored efficiently as a compact array A as there is no gap between vertices of a complete binary tree/elements of a compact array. To simplify the navigation operations below, we use 1-based array. VisuAlgo displays the index of each vertex as a red label below each vertex. Read those indices in sorted order from 1 to N, then you will see the vertices of the complete binary tree from top to down, left to right. To help you understand this, Toggle the Visualization Mode several times.


Pro tip: Try opening two copies of VisuAlgo on two browser windows. Try to visualize the same Binary Max Heap in two different modes and compare them.


This way, we can implement basic binary tree traversal operations with simple index manipulations (with help of bit shift manipulation):

  1. parent(i) = i>>1, index i divided by 2 (integer division),
  2. left(i) = i<<1, index i multiplied by 2,
  3. right(i) = (i<<1)+1, index i multiplied by 2 and added by 1.

3. Binary (Max) Heap Operations

In this visualization, you can perform several Binary (Max) Heap operations:


  1. Create(A) - O(N log N) version (N calls of Insert(v) below)
  2. Create(A) - O(N) version
  3. Insert(v) in O(log N) — you are allowed to insert duplicates
  4. 3 versions of ExtractMax():
    1. Once, in O(log N)
    2. K times, i.e., PartialSort(), in O(K log N), or
    3. N times, i.e., HeapSort(), in O(N log N)
  5. UpdateKey(i, newv) in O(log N if i is known)
  6. Delete(i) in O(log N if i is known)

There are a few other possible Binary (Max) Heap operations, but currently we do not elaborate them for pedagogical reason in a certain NUS module.

3-1. 那些额外的操作是什么?

[This is a hidden slide]

4. Insert(v)   //插入v

Insert(v): 当我们在往最大二叉堆插入元素 v 的时候 我们只能插入到最后一个元素 N + 1 来维持一个紧凑的数组 = 完整二叉树特性。然而 最大堆的特性仍然可能没被满足。这时我们从插入点往上去修复最大堆特性(如果需要的话)直到整个堆符合最大堆特性。现在点 Insert(v) 几次来插入几个随机的 v 进去现在显示的最大二叉堆。


这个向上走并且修复最大堆特性的操作没有公认的名字。我们叫它 ShiftUp (上移)但是还有人叫它 BubbleUp 或者 IncreaseKey 操作.

4-1. 为什么这样是正确的?

你明白为什么当你插入的新元素违反了最大堆的特性的时候 从插入点(第N+1个元素)往上(最多上到根元素)走,把当前顶点和它的父顶点换位 永远是正确的方法?

4-2. 答案

[This is a hidden slide]

4-3. 时间复杂度分析

The time complexity of this Insert(v) operation is O(log N).


Discussion: Do you understand the derivation?

4-4. 答案

[This is a hidden slide]

5. ExtractMax() - Once

ExtractMax(): 提取并删除最大二叉堆中的最大元素(根)的操作需要一个在堆中的元素来替换根,不然最大堆(一个完整的二叉树,在中文中的'林‘)会变成两个断开的子树(中文中两个'木’)。这同样也是这个元素必须是最后一个元素N的原因:维持紧凑数组 = 完整二叉树性质。

因为我们要把最大二叉堆中的一个叶顶点晋升成根顶点,它很可能是会违背最大堆特性。ExtractMax() 这时从根往下比较当前值和它的两个子元素的值(如果必要的话)来修复最大二叉堆特性。现在在当前显示的最大二叉堆中试试 ExtractMax()

这个向下走并修复最大堆特性的操作没有公认的名字。我们叫它 ShiftDown (下移)但是还有人叫它 BubbleDown 或者 Heapify 操作。.

5-1. 为什么和比较大的子项比较?

在向下运动修复最大堆特性的时候,为什么当一个顶点有两个子元素的时候,我们必须检查(甚至交换)它两个子元素中较大的那个?
为什么我们不能只比较左(或者右,如果存在的话)顶点?

5-2. 答案

[This is a hidden slide]

5-3. 时间复杂度分析

The time complexity of this ExtractMax() operation is O(log N).


Discussion: Do you understand the derivation?

5-4. 答案

[This is a hidden slide]

6. 用二叉堆来实现高效的优先队

Up to here, we have a data structure that can implement the two major operations of Priority Queue (PQ) ADT efficiently:


  1. For Enqueue(x), we can use Insert(x) in O(log N) time, and
  2. For y = Dequeue(), we can use y = ExtractMax() in O(log N) time.

However, we can do a few more operations with Binary Heap.

7. Create(A) - 两种版本

Create(A): 用含有 N 个逗号分隔的数组 A 从一个空的最大二叉堆开始创建一个合法的最大二叉堆


有两种方法完成创建操作一个简单但是慢 -> O(N log N) 另一个用更复杂的技术但是快 -> O(N).


教你一招:在两个浏览器里面同时打开 VisuAlgo。用最差情况 '已排序例子' 去执行不同版本的 Create(A),你会看到可怕的差距。

7-1. Create(A) - O(N log N)

Create(A) - O(N log N): 单纯的一个接一个的插入 (用 Insert(v) 操作的意思) 所有 N 个在输入数组中的元素进入到一个本来是空的最大二叉堆中。

分析: 这个操作显然是 O(N log N) 我们用了 O(log N) Insert(v) 操作 N 次. 让我们来研究 ’排好序的例子‘ 它是这个操作中比较复杂的例子之一(现在试试 Hard Case - O(N log N) 我们显示了这个例子 A=[1,2,3,4,5,6,7] -- 请耐心等待 这个例子会比较耗时). 如果我们用递增的顺序往一个空的最大二叉堆中插入,每一次插入都会触发一次从插入点(新叶元素)到根的上移操作。

7-2. Create(A) - O(N)

Create(A) - O(N): This faster version of Create(A) operation was invented by Robert W. Floyd in 1964. It takes advantage of the fact that a compact array = complete binary tree and all leaves (i.e., half of the vertices — see the next slide) are Binary Max Heap by default. This operation then fixes Binary Max Heap property (if necessary) only from the last internal vertex back to the root.


Analysis: A loose analysis gives another O(N/2 log N) = O(N log N) complexity but it is actually just O(2*N) = O(N) — details in the next few slides. Now try the Hard Case - O(N) on the same input array A=[1,2,3,4,5,6,7] and see that on the same hard case as with the previous slide (but not the one that generates maximum number of swaps), this operation is far superior than the O(N log N) version.

7-3. 很多叶元素(叶顶点)

Simple explanation on why half of Binary (Max) Heap of N (without loss of generality, let's assume that N is even) elements are leaves are as follows:


Suppose that the last leaf is at index N, then the parent of that last leaf is at index i = N/2 (remember this slide). The left child of vertex i+1, if exists (it actually does not exist), will be 2*(i+1) = 2*(N/2+1) = N+2, which exceeds index N (the last leaf) so index i+1 must also be a leaf vertex that has no child. As Binary Heap indexing is consecutive, basically indices [i+1 = N/2+1, i+2 = N/2+2, ..., N], or half of the vertices, are leaves.

7-4. 为什么是 O(N)? - 第一部分

首先,我们回忆含有N个元素的满二叉树的高度是log2 N.
其次,要明白运行shiftDown(i)操作所需要的时间不是最高值O(log N), 而是 O(h),h是在以顶点 i 为跟的从树的高度。
最后,在一个高度为 h 的满二叉树中 有ceil(N/2h+1)个顶点。
在上面这个满二叉树的例子中 N = 7, h = 2。 顶点数量为 ceil(7/20+1) = 4 分别是在h = 0  {44,35,26,17} 。在高度h = 1ceil(7/21+1) = 2 个顶点 分别是: {62,53} 在高度h = 2  ceil(7/22+1) = 1 个顶点 它是 {71}。

7-5. 为什么是 O(N)? - 第二部分

Create(A) 用 O(N) 的版本是这样:


analysis


如果以上公式太复杂的话,现代学生会想到用 WolframAlpha 来解决。

8. HeapSort()   //堆排序

堆排序函数HeapSort() John William Joseph Williams 在1964年发明了堆排序算法和这个二叉堆数据结构. 堆排序的操作 (假定最大二叉堆已经在O(n)的时间复杂度内建立)非常简单。只需要调用提取堆顶函数ExtractMax() n次即可,每次的时间复杂度为O(log N). 现在你可以试一下屏幕上的这个最大二叉堆 HeapSort()


分析:堆排序显然是O(N log N) - 一个最优的通过比较来排序的算法 

Quiz: In worst case scenario, HeapSort() is asymptotically faster than...

Selection Sort
Insertion Sort
Merge Sort
Bubble Sort

8-1. 讨论

Although HeapSort() runs in θ(N log N) time for all (best/average/worst) cases, is it really the best comparison-based sorting algorithm?


Discussion: How about caching performance of HeapSort()?

8-2. 答案

[This is a hidden slide]

8-3. PartialSort()

We can actually just call the O(log N) ExtractMax() operation K times if we are only interested in the top K largest elements in the Binary (Max) Heap. Now try [tobeadded] on the currently displayed Binary (Max) Heap. This operation is called PartialSort().


Simple Analysis: PartialSort() clearly runs in O(K log N) — an output-sensitive algorithm where the time complexity depends on the output size K.

9. 一些额外内容

You have reached the end of the basic stuffs of this Binary (Max) Heap data structure and we encourage you to explore further in the Exploration Mode.


However, we still have a few more interesting Binary (Max) Heap challenges for you that are outlined in this section.


When you have cleared them all, we invite you to study more advanced algorithms that use Priority Queue as (one of) its underlying data structure, like Prim's MST algorithm, Dijkstra's SSSP algorithm, A* search algorithm (not in VisuAlgo yet), a few other greedy-based algorithms, etc.

9-1. 简单的最大堆 -> 最小堆 转换

If we only deal with numbers (including this visualization that is restricted to integers only), it is easy to convert a Binary Max Heap into a Binary Min Heap without changing anything, or vice versa.


We can re-create a Binary Heap with the negation of every integer in the original Binary Heap. If we start with a Binary Max Heap, the resulting Binary Heap is a Binary Min Heap (if we ignore the negative symbols — see the picture above), and vice versa.

9-2. UpdateKey(i, newv)

For some Priority Queue applications (e.g., HeapDecreaseKey in Dijkstra's algorithm), we may have to modify (increase or decrease) the priority of an existing value that is already inserted into a Binary (Max) Heap. If the index i of the value is known, we can do the following simple strategy: Simply update A[i] = newv and then call both shiftUp(i) and shiftDown(i). Only at most one of this Max Heap property restoration operation will be successfully, i.e., shiftUp(i)/shiftDown(i) will be triggered if newv >/< old value of A[parent(i)]/A[larger of the two children of i], respectively.


Thus, UpdateKey(i, newv) can be done in O(log N), provided we know index i.

9-3. Delete(i)

For some Priority Queue applications, we may have to delete an existing value that is already inserted into a Binary (Max) Heap (and this value happens to be not the root). Again, if the index i of the value is known, we can do the following simple strategy: Simply update A[i] = A[1]+1 (a larger number greater than the current root), call shiftUp(i) (technically, UpdateKey(i, A[1]+1)). This will floats index i to be the new root, and from there, we can easily call ExtractMax() once to remove it.


Thus, Delete(i) can be done in O(log N), provided we know index i.


Discussion: Now for UpdateKey(i, newv) and Delete(i), what if we are given oldv instead and thus we have to search for its location in the Binary (Max) Heap? Can we do this faster than O(N)?

9-4. The Answer

[This is a hidden slide]

9-5. 源代码

If you are looking for an implementation of Binary (Max) Heap to actually model a Priority Queue, then there is a good news.


C++ and Java already have built-in Priority Queue implementations that very likely use this data structure. They are C++ STL priority_queue (the default is a Max Priority Queue) and Java PriorityQueue (the default is a Min Priority Queue). However, the built-in implementation may not be suitable to do some PQ extended operations efficiently (details omitted for pedagogical reason in a certain NUS module).


Python heapq exists but its performance is rather slow. OCaml doesn't have built-in Priority Queue but we can use something else that is going to be mentioned in the other modules in VisuAlgo (the reason on why the details are omitted is the same as above).


PS: Heap Sort is likely used in C++ STL algorithm partial_sort.


Nevertheless, here is our implementation of BinaryHeapDemo.cpp.

9-6. 在线测试

For a few more interesting questions about this data structure, please practice on Binary Heap training module (no login is required).


However, for NUS students, you should login using your official class account, officially clear this module, and such achievement will be recorded in your user account.

9-7. 在线评审练习

We also have a few programming problems that somewhat requires the usage of this Binary Heap data structure: UVa 01203 - Argus and Kattis - numbertree.


Try them to consolidate and improve your understanding about this data structure. You are allowed to use C++ STL priority_queue, Python heapq, or Java PriorityQueue if that simplifies your implementation.

9-8. 讨论

[This is a hidden slide]

9-9. 令人震惊的东西

After spending one long lecture on Binary (Max) Heap, here is a jaw-dropping moment...


Binary (Max) Heap data structure is probably not the best data structure to implement (certain operations of) ADT Priority Queue...


Discussion: So what is the alternative data structure?

9-10. The Answer

[This is a hidden slide]