# 并查集 (Union-Find Disjoint Sets, 简称UFDS)

## 2. 可视化

### 2-2. 储存数据 - 第一部分

We can simply record this forest of trees with an array p of size N items where p[i] records the parent of item i and if p[i] = i, then i is the root of this tree and also the representative item of the set that contains item i.

Once again, look at the visualization above and determine the values inside this array p.

Discuss: If i is the root of the tree that contains it, can we set p[i] = -1 instead of p[i] = i? What are the implications?

### 2-3. The Implications

[This is a hidden slide]

## 4. Initialize(N, M)

Initialize(N, M): Create N items and form M disjoint sets with these N items. We randomly pick two disjoint sets and merge them until we have M random disjoint sets. Currently this setup is not random enough, i.e., it cannot create tall trees for example.

The default form is Initialize(N, N), i.e., M = N, all with p[i] = i and rank[i] = 0 (all these rank values are initially not shown). The time complexity of this operation is clearly O(N).

Due to the limitation of screen size, we set 1 ≤ N ≤ 16.

## 5. FindSet（i）

FindSet（i）：从顶点i，递归地在树上往上移动。 也就是说，从顶点i，我们转到顶点p [i]），直到我们找到该树的根，这是该不相交集的的代表项（代表项具有p [i] = i的性质）。

### 5-1. 实践案例

If we execute FindSet(12), we will immediately get vertex 12.
If we execute FindSet(9) we will get vertex 6 after 1 step and no other change.

Now try executing FindSet(0). If this is your first call on this default UFDS example, it will return vertex 3 after 2 steps and then modify the underlying UFDS structure due to path-compression in action (that is, vertex 0 points to vertex 3 directly). Notice that rank value of rank[1] = 1 is now wrong as vertex 1 becomes a new leaf. However, we will not bother to update its value.

Notice that the next time you execute FindSet(0) again, it will be (much) faster as the path has been compressed. For now, we assume that FindSet(i) runs in O(1).

## 6. IsSameSet(i, j)

IsSameSet（i，j）：只需检查是否 FindSet（i） == FindSet（j）。 该函数经常出现在Kruskal的MST算法中。 由于它只调用FindSet操作两次，我们假设它的时间复杂度为O（1）。

## 7. UnionSet(i, j)

UnionSet(i, j): If item i and j come from two disjoint sets initially, we link the representative item of the shorter tree/disjoint set to the representative item of the taller tree/disjoint set (otherwise, we do nothing). This is also done in O(1).

This is union-by-rank heuristic in action and will cause the resulting tree to be relatively short. Only if the two trees are equally tall before union (by comparing their rank values heuristically — note that we are not comparing their actual — the current — heights), then the rank of the resulting tree will increase by one unit.

### 7-2. 实践案例

On the same default example, try UnionSet(9, 12). As the tree that represents disjoint set {6, 9} is currently taller (according to the value of rank[6] = 1), then the shorter tree that represents disjoint set {12} will be slotted under vertex 6, without increasing the height of the combined tree at all.

On the same default example, try UnionSet(0, 11). Notice that the ranks of vertex 3 and vertex 5 are the same rank[3] = rank[5] = 2. Thus, we can either put vertex 3 under vertex 5 (our implementation) or vertex 5 under vertex 3 (both will increase the resulting height of the combined tree by 1). Notice the indirect path-compression heuristic in action.

### 7-3. 小测验

Quiz: Starting with N=8 disjoint sets, how tall (heuristically) can the resulting final tree if we call 7 UnionSet(i, j) operations strategically?

rank:5
rank:3
rank:4
rank:1
rank:2

Quiz: Starting with N=8 disjoint sets, how short (heuristically) can the resulting final tree if we call 7 UnionSet(i, j) operations strategically?

rank:1
rank:2
rank:4
rank:5
rank:3

### 7-4. 答案

[This is a hidden slide]

## 8. 实际时间复杂度

So far, we say that FindSet(i), IsSameSet(i, j), and UnionSet(i, j) runs in O(1). Actually they run in O(α(N)) if the UFDS is implemented with both path-compression and union-by-rank heuristics. The analysis is quite involved and is skipped in this visualization.

This α(N) is called the inverse Ackermann function that grows extremely slowly. For practical usage of this UFDS data structure (assuming N ≤ 1M), we have α(1M) ≈ 1.

## 9. 附加功能

### 9-4. 提示

[This is a hidden slide]

### 9-6. 答案

[This is a hidden slide]