DFS & BFS

1. Introduction

Given a graph, we can use the O(V+E) DFS (Depth-First Search) or BFS (Breadth-First Search) algorithm to traverse the graph and explore the features/properties of the graph. Each algorithm has its own characteristics, features, and side-effects that we will explore in this visualization.


This visualization is rich with a lot of DFS and BFS variants (all run in O(V+E)) such as:

  1. Topological Sort algorithm (both DFS and BFS/Kahn's algorithm version),
  2. Bipartite Graph Checker algorithm (both DFS and BFS version),
  3. Cut Vertex & Bridge finding algorithm,
  4. Strongly Connected Components (SCC) finding algorithms
    (both Kosaraju's and Tarjan's version), and
  5. 2-SAT Checker algorithm.

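2. Visualization

When the chosen graph traversal algorithm is running, the animation will be shown here.


We use vertex + edge colors (the color scheme will be elaborated soon) and occasionally extra text under a vertex (in red font) to highlight the changes.

All graph traversal algorithms work on directed graphs (this is the default setting, where each edge has an arrowhead indicating its direction), but the Bipartite Graph Check algorithm and the Cut Vertex & Bridge finding algorithm require undirected graphs (this visualization does the conversion automatically).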

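3. Specifying an Input Graph

There are two different ways to specify an input graph:

  1. Draw Graph: You can draw any unweighted directed graph as the input graph (to draw a bidirectional edge (u, v), you can draw two directed edges u → v and v → u).
  2. Example Graphs: You can pick from our list of selected example graphs to get you started.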

4. Recap

If you arrive at this e-Lecture without having first explored/mastered the concepts of Binary Heap and especially Binary Search Tree, we suggest that you explore them first, as traversing a (Binary) Tree structure is much simpler than traversing a general graph.


Quiz: Mini pre-requisite check. What are the Pre-/In-/Post-order traversals of the binary tree shown (root = vertex 0, left and right children as drawn)?

In = 1, 0, 3, 2, 4
Post = 4, 3, 2, 1, 0
In = 4, 2, 3, 0, 1
Pre = 0, 1, 2, 3, 4
Pre = 0, 2, 4, 3, 1
Post = 1, 3, 4, 2, 0

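4-1. Binary Tree Traversal - Source = Root

We normally start from the most important vertex of a (binary) tree: the root vertex.
If the given tree is not 'rooted' (see the example picture), we can pick any one vertex (for example, vertex 0 in the example picture) and designate it as the root. If we imagine that all edges are strings of similar length, then after "virtually pulling the designated root upwards" and letting gravity pull the rest downwards, we have a rooted (downwards) tree; see the next slide.
PS: Technically, this transformation is done by running DFS(0), which we will explore soon.

4-2. Binary Tree Traversal - Pre-/In-/Post-order

In a binary tree, we have at most two neighboring choices: from the current vertex, we can go to the left subtree first or go to the right subtree first. We can also choose to visit the current vertex before or after visiting one (or both) of the subtrees.
This gives rise to the three classics: Pre-order (visit the current vertex, visit its left subtree, visit its right subtree), In-order (left, current, right), and Post-order (left, right, current) traversals.
Discussion: Do you notice that there are three other possible binary tree traversal combinations? What are they?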
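
The three classic orders differ only in where the current vertex is output relative to the two recursive calls. The following is a minimal sketch, assuming a small hypothetical tree (not the one in the quiz) stored in a dictionary named children; the names children, preorder, inorder, and postorder are ours for illustration.

```python
# Hypothetical binary tree (an assumption for illustration, not the quiz tree):
#         0
#        / \
#       1   2
#      / \
#     3   4
# children[u] = (left, right); None marks a missing child.
children = {
    0: (1, 2),
    1: (3, 4),
    2: (None, None),
    3: (None, None),
    4: (None, None),
}

def preorder(u, out):
    if u is None:
        return out
    out.append(u)                    # current vertex first
    preorder(children[u][0], out)    # then the left subtree
    preorder(children[u][1], out)    # then the right subtree
    return out

def inorder(u, out):
    if u is None:
        return out
    inorder(children[u][0], out)     # left subtree first
    out.append(u)                    # then the current vertex
    inorder(children[u][1], out)     # then the right subtree
    return out

def postorder(u, out):
    if u is None:
        return out
    postorder(children[u][0], out)   # left subtree first
    postorder(children[u][1], out)   # then the right subtree
    out.append(u)                    # current vertex last
    return out

print(preorder(0, []))   # [0, 1, 3, 4, 2]
print(inorder(0, []))    # [3, 1, 4, 0, 2]
print(postorder(0, []))  # [3, 4, 1, 2, 0]
```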

4-3. Answer

[This is a hidden slide]

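4-4. Binary Tree Traversal - Acyclic

In a binary tree, or in a tree structure in general, there is no (non-trivial) cycle involving 3 or more distinct vertices to worry about (we do not count the trivial cycles formed by a bidirectional edge between two vertices, which are easy to handle; see three slides earlier).

4-5. The Problems in General Graphs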

In a general graph, we do not have the notion of a root vertex. Instead, we need to pick one distinguished vertex to be the starting point of the traversal, i.e., the source vertex s.


We also have 0, 1, ..., k neighbors of a vertex instead of just ≤ 2.


We may (or actually very likely) have cycle(s) in our general graph instead of acyclic tree,
be it the trivial one like u → v → u or the non-trivial one like a → b → c → a.


But fret not, graph traversal is an easy problem with two classic algorithms: DFS and BFS.

5. DFS

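One of the most basic graph traversal algorithms is the O(V+E) Depth-First Search (DFS).
DFS takes one input parameter: the source vertex s.
DFS is one of the most fundamental graph algorithms, so please take the time to understand its key steps.

5-1. The Analogy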

The closest analogy of the behavior of DFS is to imagine a maze with only one entrance and one exit. You are at the entrance and want to explore the maze to reach the exit. Obviously you cannot split yourself into more than one.


Ask these reflective questions before continuing: What will you do if there are branching options in front of you? How do you avoid going in a cycle? How do you mark your own path? Hint: You need chalk, stones (or any other marker) and a (long) string.

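5-2. Trying All Options

As its name implies, DFS starts from a distinguished source vertex s and uses recursion (an implicit stack) to order the visitation sequence: go as deep as possible before backtracking.

If DFS is at a vertex u that has X neighbors, it picks the first neighbor V1 (usually the vertex with the lowest vertex number), uses recursion to explore all vertices reachable from V1, and eventually backtracks to vertex u. DFS then does the same for the other neighbors until it finishes exploring the last neighbor VX and its reachable vertices.

This wordy explanation will become clearer with the DFS animation later.

5-3. Avoiding Cycle

If a graph is cyclic, the previous 'try-all' strategy may lead DFS to run in a cycle. So the basic form of DFS uses an array status[u] of size V vertices to distinguish two conditions: vertex u has been visited, or it is still unvisited. DFS may visit vertex u only if u is still unvisited.


When DFS runs out of options, it backtracks out of the current recursive call to the previous vertex (p[u], see the next slide).

5-4. Remembering the Path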

DFS uses another array p[u] of size V vertices to remember the parent/predecessor/previous of each vertex u along the DFS traversal path.


The predecessor of the source vertex, i.e., p[s] is set to -1 to say that the source vertex has no predecessor (as the lowest vertex number is vertex 0).


The sequence of vertices from a vertex u that is reachable from the source vertex s back to s forms the DFS spanning tree. We color these tree edges with red color.
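
The three ideas so far (try all neighbors, avoid cycles with status[u], remember the path with p[u]) fit in a few lines. The following is a minimal sketch, not the visualization's own code: the function name dfs, the constants, and the example adjacency list are our assumptions (the edges are chosen so that vertices 0..4 are reachable from vertex 0 and the others are not).

```python
UNVISITED, VISITED = 0, 1

def dfs(adj, s):
    """Run DFS from source s on adjacency list adj; return (status, p)."""
    V = len(adj)
    status = [UNVISITED] * V
    p = [-1] * V                 # p[s] stays -1; unreached vertices also keep -1

    def visit(u):
        status[u] = VISITED
        for v in adj[u]:         # neighbors are listed in increasing vertex number
            if status[v] == UNVISITED:
                p[v] = u         # u -> v becomes a tree edge of the DFS spanning tree
                visit(v)         # go deeper first, backtrack when the call returns

    visit(s)
    return status, p

# Hypothetical undirected graph (each edge stored in both directions):
# 0-1, 1-2, 1-3, 2-3, 3-4, 6-7, 6-8; vertex 5 is isolated.
adj = [[1], [0, 2, 3], [1, 3], [1, 2, 4], [3], [], [7, 8], [6], [6]]

status, p = dfs(adj, 0)
print(status)  # [1, 1, 1, 1, 1, 0, 0, 0, 0]
print(p)       # [-1, 0, 1, 2, 3, -1, -1, -1, -1]
```

For very deep graphs, a recursive Python sketch like this one can hit the interpreter's recursion limit; an explicit stack avoids that.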

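5-5. Hands-on Example

For now, ignore the extra status[u] = explored in the displayed pseudo-code and the presence of blue and grey edges in the visualization (to be explained soon).

Without further ado, let's execute DFS(0) on the default example graph for this e-Lecture (CP3 Figure 4.1). Recap DFS Example
The basic version of DFS presented so far is already enough for most simple cases.

5-6. O(V+E) Time Complexity

The time complexity of DFS is O(V+E) because:

  1. Each vertex is visited only once, as DFS recursively explores a vertex u only if status[u] = unvisited. This is O(V).
  2. Each time a vertex is visited, all its k neighbors are explored, so after all vertices are visited, we have examined all E edges. This is O(E), as the total number of neighbors over all vertices equals E.

5-7. Always O(V+E) ?

The O(V+E) time complexity of DFS is only achievable if we can visit all k neighboring vertices of a vertex in O(k) time.

Quiz: Which underlying graph data structure supports that operation?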

Adjacency List
Adjacency Matrix
Edge List

Discussion: Why?

5-8. Answer

[This is a hidden slide]

6. BFS

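Another basic graph traversal algorithm is the O(V+E) Breadth-First Search (BFS).
As with DFS, BFS also takes one input parameter: the source vertex s.
Both DFS and BFS have their own strengths and weaknesses. It is important to learn both and apply the correct graph traversal algorithm in the correct situation.

6-1. The Analogy

Imagine still water into which you throw a stone. The first location where the stone hits the water surface is the position of the source vertex, and the subsequent ripple effect across the water surface is like the BFS traversal pattern.

6-2. Trying All Options, Avoiding Cycle, Remembering the Path

BFS is very similar to DFS as discussed earlier, but with some differences.

BFS starts from a source vertex s, but it uses a queue to order the visitation sequence as broadly as possible before going deeper.


BFS also uses a Boolean array of size V vertices to distinguish between two states: visited and unvisited vertices (we will not use BFS to detect back edge(s) as we do with DFS).

In this visualization, we also show that, starting from the same source vertex s in an unweighted graph, the BFS spanning tree of the graph equals its SSSP spanning tree.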
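
The queue-driven order and the link to shortest paths can be seen in a short sketch. It assumes an adjacency list and a hypothetical example graph; the names bfs, order, and dist are ours, and dist[v] is the number of edges on the BFS tree path from s to v (which is the unweighted shortest-path distance).

```python
from collections import deque

def bfs(adj, s):
    """Run BFS from source s; return (visit order, p, dist in number of edges)."""
    V = len(adj)
    visited = [False] * V
    p = [-1] * V
    dist = [-1] * V              # -1 means "not reachable from s"
    visited[s] = True
    dist[s] = 0
    q = deque([s])               # FIFO queue
    order = []
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:
            if not visited[v]:   # each vertex enters the queue at most once
                visited[v] = True
                p[v] = u         # tree edge u -> v of the BFS spanning tree
                dist[v] = dist[u] + 1
                q.append(v)
    return order, p, dist

# Hypothetical undirected graph (each edge stored in both directions):
# 0-1, 1-2, 1-3, 2-3, 3-4, 6-7, 6-8; vertex 5 is isolated.
adj = [[1], [0, 2, 3], [1, 3], [1, 2, 4], [3], [], [7, 8], [6], [6]]

order, p, dist = bfs(adj, 0)
print(order)  # [0, 1, 2, 3, 4]
print(p)      # [-1, 0, 1, 1, 3, -1, -1, -1, -1]
print(dist)   # [0, 1, 2, 2, 3, -1, -1, -1, -1]
```

Note that vertex 3 gets parent 1 here (distance 2), whereas a DFS from vertex 0 on the same graph would reach vertex 3 through vertex 2.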

6-3. Hands-on Example

Without further ado, let's execute BFS(5) on the default example graph for this e-Lecture (CP3 Figure 4.3). Recap BFS Example.


Notice the Breadth-first exploration due to the usage of FIFO data structure: Queue?

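6-4. O(V+E) Time Complexity

The time complexity of BFS is O(V+E) because:

  1. Each vertex is visited only once, as it can enter the queue only once. This is O(V).
  2. Each time a vertex is dequeued, all its k neighbors are explored, so after all vertices are visited, we have examined all E edges. This is O(E), as the total number of neighbors over all vertices equals E.

As with DFS, this O(V+E) time complexity is only possible if we use the Adjacency List graph data structure, for the same reason as in the DFS analysis.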
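
To see why the data structure matters, compare how each representation enumerates the neighbors of one vertex. This is a small sketch on a hypothetical 3-vertex directed graph; the three helper names are ours.

```python
def neighbors_from_list(adj_list, u):
    # Touches only the k neighbors of u: O(k).
    return list(adj_list[u])

def neighbors_from_matrix(adj_matrix, u):
    # Must scan the whole row of u, even if k is tiny: O(V).
    return [v for v, has_edge in enumerate(adj_matrix[u]) if has_edge]

def neighbors_from_edge_list(edges, u):
    # Must scan every edge in the graph: O(E).
    return [b for a, b in edges if a == u]

# Hypothetical directed graph with edges 0->1, 0->2, 1->2.
adj_list = [[1, 2], [2], []]
adj_matrix = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
edges = [(0, 1), (0, 2), (1, 2)]

print(neighbors_from_list(adj_list, 0))       # [1, 2]
print(neighbors_from_matrix(adj_matrix, 0))   # [1, 2]
print(neighbors_from_edge_list(edges, 0))     # [1, 2]
```

Repeating the O(V) row scan for every vertex gives O(V^2) in total, and repeating the O(E) edge scan gives O(V*E), so only the first helper keeps the traversal at O(V+E).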

7. Simple DFS/BFS Applications

So far, we can use DFS/BFS to solve a few graph traversal problem variants:

  1. Reachability test,
  2. Actually printing the traversal path,
  3. Identifying/Counting/Labeling Connected Components (CCs) of undirected graphs,
  4. Detecting if a graph is cyclic,
  5. Topological Sort (only on DAGs).

For most data structures and algorithms courses, the applications of DFS/BFS are up to these few basic ones only, although DFS/BFS can do much more...

7-1. Reachability Test

If you are asked to test whether a vertex s and a (different) vertex t in a graph are reachable, i.e., connected directly (via a direct edge) or indirectly (via a simple, non cyclic, path), you can call the O(V+E) DFS(s) (or BFS(s)) and check if status[t] = visited.


Example 1: s = 0 and t = 4, run DFS(0) and notice that status[4] = visited.
Example 2: s = 0 and t = 7, run DFS(0) and notice that status[7] = unvisited.
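
A reachability test is just a traversal followed by one array lookup. The sketch below uses DFS with an explicit stack instead of recursion (our choice, to avoid recursion limits); the function name reachable and the example edges are assumptions, chosen so that the outcome mirrors the two examples above.

```python
def reachable(adj, s, t):
    """Return True if t can be reached from s (DFS with an explicit stack)."""
    visited = [False] * len(adj)
    visited[s] = True
    stack = [s]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if not visited[v]:
                visited[v] = True
                stack.append(v)
    return visited[t]            # same check as status[t] = visited

# Hypothetical undirected graph (each edge stored in both directions):
# 0-1, 1-2, 1-3, 2-3, 3-4, 6-7, 6-8; vertex 5 is isolated.
adj = [[1], [0, 2, 3], [1, 3], [1, 2, 4], [3], [], [7, 8], [6], [6]]

print(reachable(adj, 0, 4))  # True
print(reachable(adj, 0, 7))  # False
```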

7-2. Printing the Traversal Path

Remember that we set p[v] = u every time we manage to extend the DFS/BFS traversal from vertex u to vertex v, i.e., a tree edge in the DFS/BFS spanning tree. Thus, we can use the following simple recursive function to print out the path stored in array p. Possible follow-up discussion: Can you write this in iterative form? (trivial)

method backtrack(u)
  if (u == -1) stop
  backtrack(p[u])
  output vertex u

To print out the path from a source vertex s to a target vertex t in a graph, you can call O(V+E) DFS(s) (or BFS(s)) and then O(V) backtrack(t). Example: s = 0 and t = 4, you can call DFS(0) and then backtrack(4).
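
Here is one possible rendering of backtrack in both recursive and iterative form. It is a sketch that assumes p was already filled in by a traversal; the array below is a hypothetical result of a DFS from vertex 0, and the name backtrack_iterative is ours.

```python
def backtrack(p, u):
    """Recursive form: path from the source to u, following p[] backwards."""
    if u == -1:
        return []
    return backtrack(p, p[u]) + [u]

def backtrack_iterative(p, u):
    """Iterative form: collect u, p[u], p[p[u]], ... and reverse at the end."""
    path = []
    while u != -1:
        path.append(u)
        u = p[u]
    path.reverse()
    return path

# Hypothetical predecessor array, as a DFS from source 0 might produce it.
p = [-1, 0, 1, 2, 3, -1, -1, -1, -1]

print(backtrack(p, 4))            # [0, 1, 2, 3, 4]
print(backtrack_iterative(p, 4))  # [0, 1, 2, 3, 4]
```

In this sketch an unreached vertex also has p[t] = -1, so backtrack(p, 7) would return just [7]; check that t was visited before trusting the path.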

7-3. Identifying a Connected Component (CC)

We can enumerate all vertices that are reachable from a vertex s in an undirected graph (as the example graph shown above) by simply calling O(V+E) DFS(s) (or BFS(s)) and enumerating all vertices v that have status[v] = visited.


Example: s = 0, run DFS(0) and notice that status[{0,1,2,3,4}] = visited so they are all reachable vertices from vertex 0, i.e., they form one Connected Component (CC).

7-4. Counting the Number of CCs / Labeling CCs

We can use the following pseudo-code to count the number of CCs:

CC = 0
for all u in V, set status[u] = unvisited
for all u in V
  if (status[u] == unvisited)
    DFS(u) // or BFS(u), that will flag its members as visited
           // (the current CC counter value can be used as the CC label)
    ++CC   // incremented after the traversal so that labels start from 0
output CC // the answer is 3 for the example graph above, i.e.
          // CC 0 = {0,1,2,3,4}, CC 1 = {5}, CC 2 = {6,7,8}

You can modify the DFS(u)/BFS(u) code a bit if you want to use it to label each CC with the identifier of that CC.
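
One way to do that labeling is to let the label array double as the status array. This is a sketch under our own assumptions: the name label_components is hypothetical, label -1 plays the role of unvisited, and the edge list is chosen so that it yields the three components listed in the pseudo-code comment.

```python
def label_components(adj):
    """Return (number of CCs, label[]) for an undirected graph."""
    V = len(adj)
    label = [-1] * V             # -1 doubles as "unvisited"
    cc = 0
    for s in range(V):
        if label[s] != -1:
            continue             # s already belongs to an earlier component
        label[s] = cc
        stack = [s]              # DFS with an explicit stack from s
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if label[v] == -1:
                    label[v] = cc
                    stack.append(v)
        cc += 1
    return cc, label

# Hypothetical undirected graph (each edge stored in both directions):
# 0-1, 1-2, 1-3, 2-3, 3-4, 6-7, 6-8; vertex 5 is isolated.
adj = [[1], [0, 2, 3], [1, 3], [1, 2, 4], [3], [], [7, 8], [6], [6]]

count, label = label_components(adj)
print(count)  # 3
print(label)  # [0, 0, 0, 0, 0, 1, 2, 2, 2]
```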

7-5. Wait, What is the Time Complexity?

Quiz: What is the time complexity of Counting the Number of CCs algorithm?

Calling O(V+E) DFS/BFS V times, so O(V*(V+E)) = O(V^2 + VE)
Trick question, the answer is none of the above, it is O(_____)
It is still O(V+E)

Discussion: Why?

7-6. Answer

[This is a hidden slide]

7-7. Cycle Detection - Part 1

We can actually augment the basic DFS further to give more insights about the underlying graph.


In this visualization, we use blue color to highlight back edge(s) of the DFS spanning tree. The presence of at least one back edge shows that the traversed graph (component) is cyclic while its absence shows that at least the component connected to the source vertex of the traversed graph is acyclic.

7-8. Cycle Detection - Part 2

Back edge can be detected by modifying array status[u] to record three different states:

  1. unvisited: same as earlier, DFS has not reached vertex u before,
  2. explored: DFS has visited vertex u, but at least one neighbor of vertex u has not been visited yet (DFS will go depth-first to that neighbor first),
  3. visited: now stronger definition: all neighbors of vertex u have also been visited and DFS is about to backtrack from vertex u to vertex p[u].

If DFS is now at vertex x, explores edge x → y, and encounters status[y] = explored, we can declare that x → y is a back edge. A cycle is found: we were previously at vertex y (hence status[y] = explored), went deep to a neighbor of y and so on, and are now at vertex x, which is reachable from y but leads back to vertex y.
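
The three-state idea translates into code almost literally. The sketch below assumes a directed graph given as an adjacency list (in an undirected graph stored as two directed edges, the edge back to p[u] would need to be skipped); the function name find_back_edges and both small example graphs are hypothetical.

```python
UNVISITED, EXPLORED, VISITED = 0, 1, 2

def find_back_edges(adj, s):
    """Return the back edges found by DFS from s in a directed graph."""
    status = [UNVISITED] * len(adj)
    back_edges = []

    def visit(u):
        status[u] = EXPLORED             # u is on the current recursion path
        for v in adj[u]:
            if status[v] == UNVISITED:
                visit(v)                 # tree edge, go deeper
            elif status[v] == EXPLORED:
                back_edges.append((u, v))  # v is an ancestor of u: cycle found
            # status[v] == VISITED: forward or cross edge, not a cycle
        status[u] = VISITED              # all neighbors done, about to backtrack

    visit(s)
    return back_edges

# Hypothetical cyclic graph: 0->1, 1->3, 3->2, 2->1 (cycle 1 -> 3 -> 2 -> 1).
cyclic = [[1], [3], [1], [2]]
# Hypothetical acyclic graph: 0->1, 1->2, 1->3, 3->2.
acyclic = [[1], [2, 3], [], [2]]

print(find_back_edges(cyclic, 0))   # [(2, 1)]
print(find_back_edges(acyclic, 0))  # []
```

In the acyclic example, edge 3 → 2 reaches a vertex that is already visited, so a two-state DFS could not tell it apart from the back edge 2 → 1 of the cyclic example.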

7-9. Hands-on Example (Details)

The edges in the graph that are neither tree edge(s) nor back edge(s) are colored grey. They are called forward or cross edge(s) and currently have limited use (not elaborated).


Now try DFS(0) on the example graph above with this new understanding, especially about the 3 possible statuses of a vertex (unvisited/normal black circle, explored/blue circle, visited/orange circle) and back edges. Edge 2 → 1 will be discovered as a back edge as it is part of cycle 1 → 3 → 2 → 1 (vertex 2 is 'explored' and the edge leads to vertex 1, which is also currently 'explored'). The same holds for edge 6 → 4 as part of cycle 4 → 5 → 7 → 6 → 4.


Note that if edges 2 → 1 and 6 → 4 are reversed to 1 → 2 and 4 → 6, then the graph is correctly classified as acyclic, as edges 3 → 2 and 4 → 6 go from 'explored' to 'fully visited'. If we only use binary states, 'unvisited' vs 'visited', we cannot distinguish these two cases.

7-10. Topological Sort - Definition

There is another DFS (and also BFS) application that can be treated as 'simple': Performing Topological Sort(ing) of a Directed Acyclic Graph (DAG) — see example above.


Topological sort of a DAG is a linear ordering of the DAG's vertices in which each vertex comes before all vertices to which it has outbound edges.


Every DAG (can be checked with DFS earlier) has at least one but possibly more topological sorts/ordering.


One of the main purposes of (at least one) topological sort of a DAG is for the Dynamic Programming (DP) technique. For example, this topological sorting process is used internally in the DP solution for SSSP on DAG.

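7-11. Topological Sort

We can use either the O(V+E) DFS or BFS to perform Topological Sort of a Directed Acyclic Graph (DAG).

Compared to the normal DFS, the DFS version requires just one additional line and is basically the post-order traversal of the graph. Try Toposort (DFS) on the example DAG.
The BFS version is based on the idea of vertices without incoming edges and is also known as Kahn's algorithm. Try Toposort (BFS/Kahn's) on the example DAG.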
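
Both versions are sketched below under stated assumptions: the input is a DAG given as an adjacency list, the function names toposort_dfs and toposort_kahn are ours, and the 5-vertex example DAG is hypothetical. The DFS version appends a vertex in post-order and reverses the list at the end; Kahn's version repeatedly dequeues a vertex whose in-degree has dropped to 0.

```python
from collections import deque

def toposort_dfs(adj):
    """DFS version: reversed post-order of the DAG."""
    V = len(adj)
    visited = [False] * V
    post = []

    def visit(u):
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                visit(v)
        post.append(u)           # the one extra line: record u in post-order

    for u in range(V):           # cover vertices not reachable from vertex 0
        if not visited[u]:
            visit(u)
    return post[::-1]

def toposort_kahn(adj):
    """BFS version (Kahn's algorithm): repeatedly remove in-degree-0 vertices."""
    V = len(adj)
    indeg = [0] * V
    for u in range(V):
        for v in adj[u]:
            indeg[v] += 1
    q = deque(u for u in range(V) if indeg[u] == 0)
    order = []
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1        # "remove" edge u -> v
            if indeg[v] == 0:
                q.append(v)
    return order                 # shorter than V if the graph has a cycle

# Hypothetical DAG: 0->1, 0->2, 1->3, 2->3, 3->4.
dag = [[1, 2], [3], [3], [4], []]

print(toposort_dfs(dag))   # [0, 2, 1, 3, 4]
print(toposort_kahn(dag))  # [0, 1, 2, 3, 4]
```

The two outputs differ, and both are valid: vertices 1 and 2 have no edge between them, so either may come first.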

8. More Advanced DFS/BFS Applications

As of now, you have seen DFS/BFS and what it can solve (with just minor tweaks). There are a few more advanced applications that require more tweaks and we will let advanced students explore them on their own:

  1. Bipartite Graph Checker (DFS and BFS variants),
  2. Finding Articulation Points (Cut Vertices) and Bridges of an Undirected Graph (DFS only),
  3. Finding Strongly Connected Components (SCCs) of a Directed Graph (Tarjan's and Kosaraju's algorithms), and
  4. 2-SAT(isfiability) Checker algorithms.

Advertisement: The details are written in Competitive Programming book.

9. Bipartite Graph Checker

We can use the O(V+E) DFS or BFS (they work similarly) to check if a given graph is a Bipartite Graph by giving alternating colors (orange versus blue in this visualization) to neighboring vertices, and report 'non bipartite' if we end up assigning the same color to two adjacent vertices or 'bipartite' if it is possible to complete such a '2-coloring' process. Try DFS_Checker or BFS_Checker on the example Bipartite Graph.
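
The 2-coloring process can be sketched with BFS as follows. This is a minimal illustration, not the visualization's code: the function name is_bipartite is ours, colors are 0 and 1 instead of orange and blue, and we restart the BFS from every uncolored vertex so that disconnected graphs are also handled (our choice).

```python
from collections import deque

def is_bipartite(adj):
    """Try to 2-color an undirected graph; return True if it succeeds."""
    V = len(adj)
    color = [-1] * V                         # -1 means "not colored yet"
    for s in range(V):
        if color[s] != -1:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if color[v] == -1:
                    color[v] = 1 - color[u]  # give the neighbor the other color
                    q.append(v)
                elif color[v] == color[u]:
                    return False             # two adjacent vertices share a color
    return True

# Hypothetical examples: a 4-cycle (even cycle) and a triangle (odd cycle).
square = [[1, 3], [0, 2], [1, 3], [0, 2]]
triangle = [[1, 2], [0, 2], [0, 1]]

print(is_bipartite(square))    # True
print(is_bipartite(triangle))  # False
```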


Bipartite Graphs have useful applications in (Bipartite) Graph Matching problem.


Note that Bipartite Graphs are usually only defined for undirected graphs, so this visualization will automatically convert a directed input graph into its undirected version before continuing. This action is irreversible and you may have to redraw the directed input graph again for other purposes.

10. Finding Cut Vertices & Bridges

We can modify (but unfortunately, not trivially) the O(V+E) DFS algorithm into an algorithm to find Cut Vertices & Bridges of an Undirected Graph.


A Cut Vertex, or an Articulation Point, is a vertex of an undirected graph whose removal disconnects the graph. Similarly, a bridge is an edge of an undirected graph whose removal disconnects the graph.


Note that this algorithm for finding Cut Vertices & Bridges only works for undirected graphs, so this visualization will automatically convert a directed input graph into its undirected version before continuing. This action is irreversible and you may have to redraw the directed input graph again for other purposes. You can try to Find Cut Vertices & Bridges on the example graph above.
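
This e-Lecture does not spell out the modification, so the sketch below uses the standard technique of DFS visitation numbers plus 'low' values (the lowest visitation number reachable from a subtree); treat it as one common way to do it rather than as the visualization's exact code. The names num, low, and cut_vertices_and_bridges, as well as the example graph, are our assumptions, and the graph is assumed to be simple (no parallel edges).

```python
def cut_vertices_and_bridges(adj):
    """Return (sorted cut vertices, sorted bridges) of a simple undirected graph."""
    V = len(adj)
    num = [-1] * V        # DFS visitation number, -1 = unvisited
    low = [0] * V         # lowest num reachable from the DFS subtree of a vertex
    parent = [-1] * V
    cut = set()
    bridges = []
    counter = 0

    def visit(u):
        nonlocal counter
        num[u] = low[u] = counter
        counter += 1
        children = 0
        for v in adj[u]:
            if num[v] == -1:                     # tree edge u -> v
                parent[v] = u
                children += 1
                visit(v)
                if parent[u] != -1 and low[v] >= num[u]:
                    cut.add(u)                   # subtree of v cannot bypass u
                if low[v] > num[u]:
                    bridges.append((min(u, v), max(u, v)))
                low[u] = min(low[u], low[v])
            elif v != parent[u]:                 # back edge (skip the parent edge)
                low[u] = min(low[u], num[v])
        if parent[u] == -1 and children > 1:
            cut.add(u)                           # special case: the DFS root

    for s in range(V):
        if num[s] == -1:
            visit(s)
    return sorted(cut), sorted(bridges)

# Hypothetical undirected graph: 0-1, 1-2, 1-3, 2-3, 3-4.
adj = [[1], [0, 2, 3], [1, 3], [1, 2, 4], [3]]

print(cut_vertices_and_bridges(adj))  # ([1, 3], [(0, 1), (3, 4)])
```

Removing vertex 1 isolates vertex 0 and removing vertex 3 isolates vertex 4, while the triangle 1-2-3 contains no bridge.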

11. Finding Strongly Connected Components

We can modify (but unfortunately, not trivially) the O(V+E) DFS algorithm into an algorithm to find Strongly Connected Components (SCCs) of a Directed Graph G.


An SCC of a directed graph G is defined as a subgraph S of G such that for any two vertices u and v in S, vertex u can reach vertex v directly or via a path, and vertex v can also reach vertex u back directly or via a path.


There are two known algorithms for finding SCCs of a Directed Graph: Kosaraju's and Tarjan's. Both of them are available in this visualization. Try Kosaraju's Algorithm and/or Tarjan's Algorithm on the example directed graph above.
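
As an illustration of the first of the two, here is a compact sketch of Kosaraju's algorithm: one DFS pass on G to record the finishing (post-order) sequence, then a second DFS pass on the transposed graph in decreasing finishing order. The function name kosaraju_scc, the component numbering, and the example graph are our assumptions.

```python
def kosaraju_scc(adj):
    """Return (number of SCCs, comp[]) for a directed graph."""
    V = len(adj)

    # Pass 1: DFS on G, recording vertices in post-order (finishing order).
    visited = [False] * V
    order = []

    def visit(u):
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                visit(v)
        order.append(u)

    for u in range(V):
        if not visited[u]:
            visit(u)

    # Build the transposed graph (every edge u -> v becomes v -> u).
    radj = [[] for _ in range(V)]
    for u in range(V):
        for v in adj[u]:
            radj[v].append(u)

    # Pass 2: DFS on the transpose in decreasing finishing order.
    comp = [-1] * V

    def assign(u, c):
        comp[u] = c
        for v in radj[u]:
            if comp[v] == -1:
                assign(v, c)

    count = 0
    for u in reversed(order):
        if comp[u] == -1:
            assign(u, count)     # everything reached here is one SCC
            count += 1
    return count, comp

# Hypothetical directed graph: 0->1, 1->2, 2->0, 2->3, 3->4, 4->3.
adj = [[1], [2], [0, 3], [4], [3]]

print(kosaraju_scc(adj))  # (2, [0, 0, 0, 1, 1])
```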

12. 2-SAT Checker Algorithm

We also have the 2-SAT Checker algorithm. Given a 2-Satisfiability (2-SAT) instance in the form of a conjunction of clauses: (clause_1) ^ (clause_2) ^ ... ^ (clause_n), where each clause is a disjunction of up to two variables (var_a v var_b), determine if we can assign True/False values to these variables so that the entire 2-SAT instance evaluates to true, i.e., is satisfiable.


It turns out that each clause (a v b) can be turned into four vertices a, not a, b, and not b with two edges: (not a → b) and (not b → a). Thus we have a Directed Graph. If there is at least one variable and its negation inside an SCC of such graph, we know that it is impossible to satisfy the 2-SAT instance.


After such directed graph modeling, we can run an SCC finding algorithm (Kosaraju's or Tarjan's algorithm) to determine the satisfiability of the 2-SAT instance.
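
The modeling and the SCC check can be put together as in the sketch below. The encoding is our assumption: variable i maps to vertex 2i and its negation to vertex 2i+1, a literal is a pair (variable, is_positive), and a one-literal clause (a) can be passed as (a v a). The SCC step uses Kosaraju's algorithm; Tarjan's would work equally well.

```python
def two_sat_satisfiable(n, clauses):
    """n variables 0..n-1; clauses is a list of (literal, literal) pairs."""
    def node(lit):
        var, positive = lit
        return 2 * var + (0 if positive else 1)

    def negate(lit):
        return (lit[0], not lit[1])

    # Clause (a v b) gives the edges (not a -> b) and (not b -> a).
    N = 2 * n
    adj = [[] for _ in range(N)]
    radj = [[] for _ in range(N)]
    for a, b in clauses:
        for u, v in ((node(negate(a)), node(b)), (node(negate(b)), node(a))):
            adj[u].append(v)
            radj[v].append(u)

    # Kosaraju pass 1: finishing order on the implication graph.
    visited = [False] * N
    order = []

    def visit(u):
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                visit(v)
        order.append(u)

    for u in range(N):
        if not visited[u]:
            visit(u)

    # Kosaraju pass 2: label SCCs on the transposed graph.
    comp = [-1] * N

    def assign(u, c):
        comp[u] = c
        for v in radj[u]:
            if comp[v] == -1:
                assign(v, c)

    c = 0
    for u in reversed(order):
        if comp[u] == -1:
            assign(u, c)
            c += 1

    # Unsatisfiable exactly when some variable shares an SCC with its negation.
    return all(comp[2 * i] != comp[2 * i + 1] for i in range(n))

X0, NOT_X0 = (0, True), (0, False)
X1, NOT_X1 = (1, True), (1, False)

# (x0 v x1) ^ (not x0 v x1): satisfiable, e.g. with x1 = True.
print(two_sat_satisfiable(2, [(X0, X1), (NOT_X0, X1)]))  # True
# All four clauses over x0 and x1 together: no assignment works.
print(two_sat_satisfiable(2, [(X0, X1), (NOT_X0, X1), (X0, NOT_X1), (NOT_X0, NOT_X1)]))  # False
```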

13. Which One is Better?

Quiz: Which Graph Traversal Algorithm is Better?

Always DFS
Always BFS
Both are Equally Good
It Depends on the Situation

Discussion: Why?

13-1. Answer

[This is a hidden slide]

14. Extras

There are still many things that we can do with just DFS and/or BFS...

14-1. Online Quiz

There are interesting questions about these two graph traversal algorithms: DFS+BFS and variants of graph traversal problems, please practice on Graph Traversal training module (no login is required, but short and of medium difficulty setting only).


However, registered users should log in and then go to the Main Training Page to officially clear this module, so that the achievement is recorded in their user account.

14-2. Online Judge Exercises

We also have a few programming problems that somewhat require the usage of DFS and/or BFS: Kattis - reachableroads and Kattis - breakingbad.


Try to solve them and then try the many more interesting twists/variants of this simple graph traversal problem and/or algorithm.


You are allowed to use/modify our implementation code for DFS/BFS Algorithms:
dfs_cc.cpp/bfs.cpp
dfs_cc.java/bfs.java
dfs_cc.py/bfs.py
dfs_cc.ml/bfs.ml

14-3. Discussion

[This is a hidden slide]