

The Union-Find Disjoint Sets (UFDS) data structure is used to model a collection of disjoint sets, which is able to efficiently (i.e., in nearly constant time) determine which set an item belongs to, test if two items belong to the same set, and union two disjoint sets into one when needed. It can be used to find Connected Components (CCs) in an undirected graph, and can hence be used as part of Kruskal's algorithm for the Minimum Spanning Tree (MST) problem.
Note that this data structure has another alternative name: Disjoint Sets Union (DSU).
Remarks: By default, we show e-Lecture Mode for first time (or non logged-in) visitor.
If you are an NUS student and a repeat visitor, please login.
每棵树代表一个不相交的集合(因此多个不相交集合能形成一个森林),树的根是这个不相交集合的代表项目。
现在停下来看看当前可视化中的树。 总共有多少项(N)? 有多少个不相交的集合? 每个不相交集的成员是什么? 每个不相交集的代表项是什么?
Pro-tip 1: Since you are not logged-in, you may be a first time visitor (or not an NUS student) who are not aware of the following keyboard shortcuts to navigate this e-Lecture mode: [PageDown]/[PageUp] to go to the next/previous slide, respectively, (and if the drop-down box is highlighted, you can also use [→ or ↓/← or ↑] to do the same),and [Esc] to toggle between this e-Lecture mode and exploration mode.
As we fixed the default example for this e-Lecture, your answers should be: N = 13 and there are 4 disjoint sets: {0, 1, 2, 3, 4, 10}, {5, 7, 8, 11}, {6, 9}, {12} with the underlined members be the representative items (of their own disjoint set).
Pro-tip 2: We designed this visualization and this e-Lecture mode to look good on 1366x768 resolution or larger (typical modern laptop resolution in 2021). We recommend using Google Chrome to access VisuAlgo. Go to full screen mode (F11) to enjoy this setup. However, you can use zoom-in (Ctrl +) or zoom-out (Ctrl -) to calibrate this.
我们可以简单地用一个数组 p 记录这个树的森林,数组的大小为 N 个项目,其中 p[i] 记录了项目 i 的父节点,如果 p[i] = i,那么 i 就是这棵树的根,也是包含项目 i 的集合的代表项目。
再次看看上面的可视化,确定这个数组 p 中的值。
讨论:如果 i 是包含它的树的根,我们可以设置 p[i] = -1 而不是 p[i] = i 吗?这有什么影响?
Pro-tip 3: Other than using the typical media UI at the bottom of the page, you can also control the animation playback using keyboard shortcuts (in Exploration Mode): Spacebar to play/pause/replay the animation, ←/→ to step the animation backwards/forwards, respectively, and -/+ to decrease/increase the animation speed, respectively.
The content of this interesting slide (the answer of the usually intriguing discussion point from the earlier slide) is hidden and only available for legitimate CS lecturer worldwide. This mechanism is used in the various flipped classrooms in NUS.
If you are really a CS lecturer (or an IT teacher) (outside of NUS) and are interested to know the answers, please drop an email to stevenhalim at gmail dot com (show your University staff profile/relevant proof to Steven) for Steven to manually activate this CS lecturer-only feature for you.
FAQ: This feature will NOT be given to anyone else who is not a CS lecturer.
On the same fixed example, your answers should be p = [1, 3, 3, 3, 3, 5, 6, 5, 5, 6, 4, 8, 12] of size N = 13 ranging from p[0] to p[12].
You can check that p[3] = 3, p[5] = 5, p[6] = 6, and p[12] = 12, which are consistent with the fact that {3, 5, 6, 12} are the representative items (of their own disjoint set).
由于有很多项的等级为0,我们对可视化进行了如下设置,以减少杂乱:只有当顶点i的等级大于0时,VisuAlgo才会在顶点i下面以红色文字显示rank[i]的值(简写为一个字符r)。
On the same fixed example, verify that {1, 4, 6, 8} have rank 1 and {3, 5} have rank 2, with the rest having rank 0 (not shown).
At this point of time, all rank values are correct, i.e., they really describe the height of the subtree rooted at that vertex. We will soon see that they will not be always correct in the next few slides.
示例,Initialize(N)(初始化),FindSet(i)(查找),IsSameSet(i,j)(在同一集),和UnionSet(i,j)(合并)。
第一个操作(示例)并不重要:具有各种特殊特征的合并集结构实例列表,供您参考。 此e-Lecture模式始终使用“四个不相交集(Four disjoint sets)”示例作为起点。
另请注意,没有一个例子包含 "非常高 "的树。 在我们描述了所使用的两种启发式方法之后,你很快就会明白其中的原因。
Initialize(N, M):创建 N 个项目并用这些 N 个项目形成 M 个不相交的集合。我们随机选择两个不相交的集合并合并它们,直到我们有 M 个随机不相交的集合。由于使用了按秩合并的启发式方法和随机性,初始化过程很不可能创建一个高树。
默认形式是 Initialize(N, N),即 M = N,所有的 p[i] = i 和 rank[i] = 0(所有这些秩值最初都不显示)。这个操作的时间复杂度显然是 O(N)。
由于屏幕大小的限制,我们设置 1 ≤ N ≤ 32。显然 M ≤ N。
FindSet(i):从顶点i,递归地在树上往上移动。 也就是说,从顶点i,我们转到顶点p [i]),直到我们找到该树的根,这是该不相交集的的代表项(代表项具有p [i] = i的性质)。
在这个FindSet(i)操作中,我们在每次调用FindSet(i)之后使用路径压缩,因为现在沿着从顶点i到根的路径的每个顶点都知道根是它们的代表项,并且可以用O(1)时间直接指向它 。
If we execute FindSet(12), we will immediately get vertex 12.
If we execute FindSet(9) we will get vertex 6 after 1 step and no other change.
Now try executing . If this is your first call on this default UFDS example, it will return vertex 3 after 2 steps and then modify the underlying UFDS structure due to path-compression in action (that is, vertex 0 points to vertex 3 directly). Notice that rank value of rank[1] = 1 is now wrong as vertex 1 becomes a new leaf. However, we will not bother to update its value for efficiency.
Notice that the next time you execute again, it will be (much) faster as the path has been compressed. For now, we assume that FindSet(i) runs in O(1).
请注意,FindSet函数在IsSameSet函数内部被调用,因此也间接使用了路径压缩。
If we call IsSameSet(3, 5), we should get false as vertex 3 and vertex 5 are representative items of their respective disjoint sets and they are different.
Now try on the same default example to see indirect path-compression on vertex 0 and vertex 11. We should get false as the two representative items: vertex 3 and vertex 5, are different. Notice that the rank values at vertex {1, 5, 8} are now wrong. But we will not fix them, again — for efficiency.
UnionSet(i, j):如果项目 i 和 j 最初来自两个不相交的集合,我们将较短树/不相交集合的代表项目链接到较高树/不相交集合的代表项目(否则,我们什么也不做)。这也是在 O(1) 中完成的。
这是按秩合并启发式在起作用,将导致结果树相对较短。只有当两棵树在联合之前等高(通过启发式比较他们的秩值 - 注意我们并不是比较他们实际的 - 当前的 - 高度),那么结果树的秩将增加一个单位。
在同样的默认示例中,尝试 。由于代表不相交集合 {6, 9} 的树当前较高(根据 rank[6] = 1 的值),因此代表不相交集合 {12} 的较矮的树将被插入到顶点6下,而不会增加合并树的高度。
在同样的默认示例中,尝试 。注意,顶点3和顶点5的等级是相同的 rank[3] = rank[5] = 2。因此,我们可以将顶点3放在顶点5下(我们的实现),或者将顶点5放在顶点3下(两者都会使合并树的高度增加1)。注意间接的路径压缩启发式在起作用。
Quiz: Starting with N = 8 disjoint sets, how tall (heuristically) can the resulting final tree if we call 7 UnionSet(i, j) operations strategically?
Quiz: Starting with N = 8 disjoint sets, how short (heuristically) can the resulting final tree if we call 7 UnionSet(i, j) operations strategically?
讨论:为什么?
The content of this interesting slide (the answer of the usually intriguing discussion point from the earlier slide) is hidden and only available for legitimate CS lecturer worldwide. This mechanism is used in the various flipped classrooms in NUS.
If you are really a CS lecturer (or an IT teacher) (outside of NUS) and are interested to know the answers, please drop an email to stevenhalim at gmail dot com (show your University staff profile/relevant proof to Steven) for Steven to manually activate this CS lecturer-only feature for you.
FAQ: This feature will NOT be given to anyone else who is not a CS lecturer.
到目前为止,我们说FindSet(i),IsSameSet(i, j)和UnionSet(i, j)的运行时间为O(1)。实际上,如果UFDS同时实现了路径压缩和按秩合并启发式,它们的运行时间为O(α(N))。这个分析相当复杂,在这个可视化中被跳过。
这个α(N)被称为阿克曼函数的逆函数,它的增长速度极慢。对于这个UFDS数据结构的实际使用(假设N ≤ 1M),我们有α(1M) ≈ 1。
然而,我们还有一些更有趣的并查集挑战给你。
Even after clearing the Online Quiz of this UFDS module, do you think that you have really mastered this data structure?
Let us challenge you by asking you to solve three programming problems that somewhat requires the usage of this Union-Find Disjoint Sets data structure: LeetCode - number-of-provinces, UVa 01329 - Corporative Network, and Kattis - control.
Beware that two of the three problems are actual International Collegiate Programming Contest (ICPC) problems, i.e., they are "not trivial".
The content of this interesting slide (the answer of the usually intriguing discussion point from the earlier slide) is hidden and only available for legitimate CS lecturer worldwide. This mechanism is used in the various flipped classrooms in NUS.
If you are really a CS lecturer (or an IT teacher) (outside of NUS) and are interested to know the answers, please drop an email to stevenhalim at gmail dot com (show your University staff profile/relevant proof to Steven) for Steven to manually activate this CS lecturer-only feature for you.
FAQ: This feature will NOT be given to anyone else who is not a CS lecturer.
讨论:那么,如果我们需要这种 "拆分 "或 "分割 "或 "切割 "的操作,该怎么做呢?
The content of this interesting slide (the answer of the usually intriguing discussion point from the earlier slide) is hidden and only available for legitimate CS lecturer worldwide. This mechanism is used in the various flipped classrooms in NUS.
If you are really a CS lecturer (or an IT teacher) (outside of NUS) and are interested to know the answers, please drop an email to stevenhalim at gmail dot com (show your University staff profile/relevant proof to Steven) for Steven to manually activate this CS lecturer-only feature for you.
FAQ: This feature will NOT be given to anyone else who is not a CS lecturer.
You have reached the last slide. Return to 'Exploration Mode' to start exploring!
Note that if you notice any bug in this visualization or if you want to request for a new visualization feature, do not hesitate to drop an email to the project leader: Dr Steven Halim via his email address: stevenhalim at gmail dot com.
例子
Initialise(初始化)
FindSet(查找)
IsSameSet(在同一集)
UnionSet