# DFS & BFS

## 1. Introduction

Given a graph, we can use the O(V+E) DFS (Depth-First Search) or BFS (Breadth-First Search) algorithm to traverse the graph and explore the features/properties of the graph. Each algorithm has its own characteristics, features, and side-effects that we will explore in this visualization.

This visualization is rich with a lot of DFS and BFS variants (all run in O(V+E)) such as:

1. Topological Sort algorithm (both DFS and BFS/Kahn's algorithm version),
2. Bipartite Graph Checker algorithm (both DFS and BFS version),
3. Cut Vertex & Bridge finding algorithm,
4. Strongly Connected Components (SCC) finding algorithms (both Kosaraju's and Tarjan's version), and
5. 2-SAT Checker algorithm.

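## 3. Specifying an Input Graph

1. Draw Graph: You can draw any unweighted directed graph as the input graph (to draw a bidirectional edge (u, v), draw two directed edges u → v and v → u).
2. Example Graphs: You can pick from our list of selected example graphs to get you started.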

## 4. Recap

If you arrive at this e-Lecture without having first explored/mastered the concepts of Binary Heap and especially Binary Search Tree, we suggest that you explore them first, as traversing a (Binary) Tree structure is much simpler than traversing a general graph.

Quiz: Mini pre-requisite check. What are the Pre-/In-/Post-order traversals of the binary tree shown (root = vertex 0, left and right children are as drawn)?

In = 1, 0, 3, 2, 4
Post = 4, 3, 2, 1, 0
In = 4, 2, 3, 0, 1
Pre = 0, 1, 2, 3, 4
Pre = 0, 2, 4, 3, 1
Post = 1, 3, 4, 2, 0

### 4-1. Binary Tree Traversal - Source = Root

PS: Technically, this transformation is done by running DFS(0), which we will explore soon.

### 4-3. Answer

[This is a hidden slide]

### 4-5. Issues with General Graphs

In a general graph, we do not have the notion of a root vertex. Instead, we need to pick one distinguished vertex to be the starting point of the traversal, i.e., the source vertex s.

We also have 0, 1, ..., k neighbors of a vertex instead of just ≤ 2.

We may (or actually very likely will) have cycle(s) in our general graph instead of an acyclic tree, be it a trivial one like u → v → u or a non-trivial one like a → b → c → a.

But fret not, graph traversal is an easy problem with two classic algorithms: DFS and BFS.

## 5. DFS

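DFS takes one input parameter: the source vertex s.
DFS is one of the most fundamental graph algorithms, so please spend time to understand its key steps.

### 5-1. Analogy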

The closest analogy of the behavior of DFS is to imagine a maze with only one entrance and one exit. You are at the entrance and want to explore the maze to reach the exit. Obviously you cannot split yourself into more than one.

Ask these reflective questions before continuing: What will you do if there are branching options in front of you? How do you avoid going in cycles? How do you mark your own path? Hint: You need chalk, stones (or any other marker), and a (long) string.

### 5-4. Remembering the Path

DFS uses another array p[u] of size V vertices to remember the parent/predecessor/previous of each vertex u along the DFS traversal path.

The predecessor of the source vertex, i.e., p[s] is set to -1 to say that the source vertex has no predecessor (as the lowest vertex number is vertex 0).

The sequence of vertices from a vertex u that is reachable from the source vertex s back to s forms the DFS spanning tree. We color these tree edges with red color.
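The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the lecture's own implementation: the adjacency-list input `adj`, the helper name `visit`, and the small undirected example graph are all assumptions made for this sketch.

```python
UNVISITED, VISITED = 0, 1

def dfs(adj, s):
    """Run DFS from source s on adjacency list adj; return (status, p)."""
    n = len(adj)
    status = [UNVISITED] * n
    p = [-1] * n  # p[s] stays -1: the source has no predecessor

    def visit(u):
        status[u] = VISITED
        for v in adj[u]:
            if status[v] == UNVISITED:
                p[v] = u  # u -> v becomes a tree edge of the DFS spanning tree
                visit(v)

    visit(s)
    return status, p

# Hypothetical undirected graph (each edge stored in both directions).
adj = [[1], [0, 2, 3], [1, 3], [1, 2, 4], [3], [], [7, 8], [6], [6]]
status, p = dfs(adj, 0)
print(p)  # [-1, 0, 1, 2, 3, -1, -1, -1, -1]
```

The recursive form mirrors the description closely; for very deep graphs in Python an explicit stack may be needed to stay under the recursion limit.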

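### 5-6. O(V+E) Time Complexity

The time complexity of DFS is O(V+E) because:

1. Each vertex is visited only once, since DFS recursively explores a vertex u only if status[u] = unvisited, which is O(V).
2. Every time a vertex is visited, all of its k neighbors are explored, so after all vertices have been visited we have examined all E edges, which is O(E) because the total number of neighbors over all vertices equals E.

### 5-7. Always O(V+E)?

The O(V+E) time complexity of DFS is only achievable if we can visit all k neighboring vertices of a vertex in O(k) time.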

Quiz: Which underlying graph data structure supports that operation?

Edge List

### 5-8. Answer

[This is a hidden slide]

## 6. BFS

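DFS and BFS each have their own strengths and weaknesses. It is important to learn both and apply the right graph traversal algorithm to the right situation.

### 6-2. Try All, Avoid Cycles, Remember the Path

BFS is very similar to DFS, which was discussed earlier, but with a few differences.

BFS starts from a source vertex s, but it uses a queue to order the visitation sequence as broadly as possible before going deeper.

BFS also uses a Boolean array of size V vertices to distinguish between two states: visited and unvisited vertices (we will not use BFS to detect back edges as we did with DFS).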
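A minimal sketch of BFS with a FIFO queue and a Boolean visited array is shown here. The adjacency-list input `adj`, the returned visitation `order`, and the small example graph are assumptions made for illustration only.

```python
from collections import deque

def bfs(adj, s):
    """Run BFS from source s; return (visitation order, predecessor array p)."""
    n = len(adj)
    visited = [False] * n
    p = [-1] * n
    order = []
    visited[s] = True
    q = deque([s])
    while q:
        u = q.popleft()  # FIFO: the oldest discovered vertex is processed first
        order.append(u)
        for v in adj[u]:
            if not visited[v]:
                visited[v] = True  # mark on enqueue so v enters the queue once
                p[v] = u
                q.append(v)
    return order, p

# Hypothetical undirected graph (each edge stored in both directions).
adj = [[1, 2], [0, 3], [0, 3], [1, 2, 4], [3]]
order, p = bfs(adj, 0)
print(order)  # [0, 1, 2, 3, 4]
print(p)      # [-1, 0, 0, 1, 3]
```

Marking a vertex as visited when it is enqueued, rather than when it is dequeued, is what keeps each vertex in the queue at most once.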

### 6-3. Hands-on Example

Without further ado, let's execute BFS(5) on the default example graph for this e-Lecture (CP3 Figure 4.3). Recap BFS Example.

Notice the breadth-first exploration that results from using a FIFO data structure, the queue.

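### 6-4. O(V+E) Time Complexity

The time complexity of BFS is O(V+E) because:

1. Each vertex is visited once, as it can only enter the queue once, which is O(V).
2. Every time a vertex is dequeued from the queue, all of its k neighbors are explored, so after all vertices have been visited we have examined all E edges, which is O(E) because the total number of neighbors over all vertices equals E.

## 7. Simple DFS/BFS Applications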

So far, we can use DFS/BFS to solve a few graph traversal problem variants:

1. Reachability test,
2. Actually printing the traversal path,
3. Identifying/Counting/Labeling Connected Components (CCs) of undirected graphs,
4. Detecting if a graph is cyclic,
5. Topological Sort (only on DAGs).

For most data structures and algorithms courses, the applications of DFS/BFS are up to these few basic ones only, although DFS/BFS can do much more...

### 7-1. Reachability Test

If you are asked to test whether a vertex t is reachable from a (different) vertex s in a graph, i.e., connected directly (via a direct edge) or indirectly (via a simple, non-cyclic path), you can call the O(V+E) DFS(s) (or BFS(s)) and check if status[t] = visited.

Example 1: s = 0 and t = 4, run DFS(0) and notice that status[4] = visited.
Example 2: s = 0 and t = 7, run DFS(0) and notice that status[7] = unvisited.
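The two examples can be reproduced with a short sketch. The edge set below is an assumption chosen so that vertex 4 is reachable from vertex 0 and vertex 7 is not; the function name `reachable` and the explicit-stack DFS are likewise illustrative choices.

```python
def reachable(adj, s, t):
    """Return True if t is visited by a DFS that starts at s."""
    visited = [False] * len(adj)
    visited[s] = True
    stack = [s]  # explicit stack in place of recursion
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if not visited[v]:
                visited[v] = True
                stack.append(v)
    return visited[t]

# Hypothetical undirected graph (each edge stored in both directions).
adj = [[1], [0, 2, 3], [1, 3], [1, 2, 4], [3], [], [7, 8], [6], [6]]
print(reachable(adj, 0, 4))  # True
print(reachable(adj, 0, 7))  # False
```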

### 7-2. Printing the Traversal Path

Remember that we set p[v] = u every time we manage to extend the DFS/BFS traversal from vertex u to vertex v, which makes u → v a tree edge in the DFS/BFS spanning tree. Thus, we can use the following simple recursive function to print out the path stored in array p. Possible follow-up discussion: Can you write this in iterative form? (trivial)

    method backtrack(u)
      if (u == -1) stop
      backtrack(p[u]);
      output vertex u

To print out the path from a source vertex s to a target vertex t in a graph, you can call O(V+E) DFS(s) (or BFS(s)) and then O(V) backtrack(t). Example: s = 0 and t = 4, you can call DFS(0) and then backtrack(4).
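A runnable version of the recursive backtrack idea might look like this. It returns the path as a list instead of printing each vertex, and the predecessor array `p` below is a hypothetical result of running DFS(0) on some graph, not data taken from the lecture.

```python
def backtrack(p, u):
    """Return the path from the source to u by following predecessors."""
    if u == -1:  # walked past the source, whose predecessor is -1
        return []
    return backtrack(p, p[u]) + [u]

# Hypothetical predecessor array produced by DFS(0).
p = [-1, 0, 1, 2, 3, -1, -1, -1, -1]
print(backtrack(p, 4))  # [0, 1, 2, 3, 4]
```

Because the recursive call happens before the current vertex is appended, the path comes out in source-to-target order.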

### 7-3. Identifying a Connected Component (CC)

We can enumerate all vertices that are reachable from a vertex s in an undirected graph (such as the example graph shown above) by simply calling O(V+E) DFS(s) (or BFS(s)) and enumerating every vertex v that has status[v] = visited.

Example: s = 0, run DFS(0) and notice that status[{0,1,2,3,4}] = visited so they are all reachable vertices from vertex 0, i.e., they form one Connected Component (CC).

### 7-4. Counting the Number of CCs / Labeling CCs

We can use the following pseudo-code to count the number of CCs:

    CC = 0
    for all u in V, set status[u] = unvisited
    for all u in V
      if (status[u] == unvisited)
        ++CC       // we can use CC counter number as the CC label
        DFS(u)     // or BFS(u), that will flag its members as visited
    output CC      // the answer is 3 for the example graph above, i.e.
                   // CC 0 = {0,1,2,3,4}, CC 1 = {5}, CC 2 = {6,7,8}

You can modify the DFS(u)/BFS(u) code a bit if you want to use it to label each CC with the identifier of that CC.
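One way to combine counting and labeling is sketched below. The `label` array doubles as the visited flag (-1 means unvisited), and the edge set is an assumption chosen to match the three components listed in the pseudo-code comments; neither is taken from the lecture's own code.

```python
def label_components(adj):
    """Return (number of CCs, label[]) for an undirected graph."""
    n = len(adj)
    label = [-1] * n  # -1 plays the role of 'unvisited'
    cc = 0
    for s in range(n):
        if label[s] != -1:
            continue  # s already belongs to an earlier component
        label[s] = cc
        stack = [s]
        while stack:  # iterative DFS that flags every member of this CC
            u = stack.pop()
            for v in adj[u]:
                if label[v] == -1:
                    label[v] = cc
                    stack.append(v)
        cc += 1
    return cc, label

# Hypothetical undirected graph (each edge stored in both directions).
adj = [[1], [0, 2, 3], [1, 3], [1, 2, 4], [3], [], [7, 8], [6], [6]]
print(label_components(adj))  # (3, [0, 0, 0, 0, 0, 1, 2, 2, 2])
```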

### 7-5. Wait, What Is the Time Complexity?

Quiz: What is the time complexity of Counting the Number of CCs algorithm?

Calling O(V+E) DFS/BFS V times, so O(V*(V+E)) = O(V^2 + VE)
Trick question, the answer is none of the above, it is O(_____)
It is still O(V+E)

### 7-6. Answer

[This is a hidden slide]

### 7-7. Cycle Detection - Part 1

We can actually augment the basic DFS further to give more insights about the underlying graph.

In this visualization, we use blue color to highlight back edge(s) of the DFS spanning tree. The presence of at least one back edge shows that the traversed graph (component) is cyclic while its absence shows that at least the component connected to the source vertex of the traversed graph is acyclic.

### 7-8. Cycle Detection - Part 2

A back edge can be detected by modifying array status[u] to record three different states:

1. unvisited: same as earlier, DFS has not reached vertex u before,
2. explored: DFS has visited vertex u, but at least one neighbor of vertex u has not been visited yet (DFS will go depth-first to that neighbor first),
3. visited: now a stronger definition: all neighbors of vertex u have also been visited and DFS is about to backtrack from vertex u to vertex p[u].

If DFS is now at vertex x, explores edge x → y, and encounters status[y] = explored, we can declare that x → y is a back edge. A cycle is found because we were previously at vertex y (hence status[y] = explored) and went deeper through a neighbor of y and so on until we reached vertex x, so x is reachable from y, and the edge x → y now leads back to vertex y.
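The three-state rule can be sketched as below, assuming a directed graph given as an adjacency list. For an undirected graph stored as two directed edges, the edge back to the parent would need to be skipped, which this sketch does not do. The function name `find_back_edges` and both example graphs are hypothetical.

```python
UNVISITED, EXPLORED, VISITED = 0, 1, 2

def find_back_edges(adj):
    """Return all back edges (x, y) found by DFS over a directed graph."""
    n = len(adj)
    status = [UNVISITED] * n
    back_edges = []

    def visit(x):
        status[x] = EXPLORED  # x is on the current DFS path
        for y in adj[x]:
            if status[y] == UNVISITED:
                visit(y)
            elif status[y] == EXPLORED:
                back_edges.append((x, y))  # y is an ancestor of x: cycle
        status[x] = VISITED  # all neighbors done, about to backtrack

    for s in range(n):
        if status[s] == UNVISITED:
            visit(s)
    return back_edges

cyclic = [[1], [3], [1], [2]]      # 0→1, 1→3, 3→2, 2→1
acyclic = [[1], [3, 2], [], [2]]   # same graph with 2→1 reversed to 1→2
print(find_back_edges(cyclic))     # [(2, 1)]
print(find_back_edges(acyclic))    # []
```

In the acyclic variant, edge 1 → 2 is examined after vertex 2 is already fully visited, so it is not reported. With only two states that edge would be indistinguishable from a back edge.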

### 7-9. Hands-on Example (Details)

The edges in the graph that are neither tree edges nor back edges are colored grey. They are called forward or cross edges and currently have limited use (not elaborated).

Now try DFS(0) on the example graph above with this new understanding, especially about the 3 possible statuses of a vertex (unvisited/normal black circle, explored/blue circle, visited/orange circle) and back edges. Edge 2 → 1 will be discovered as a back edge as it is part of cycle 1 → 3 → 2 → 1 (DFS is at vertex 2, which is 'explored', and the edge leads to vertex 1, which is also currently 'explored'). The same holds for edge 6 → 4 as part of cycle 4 → 5 → 7 → 6 → 4.

Note that if edges 2 → 1 and 6 → 4 are reversed to 1 → 2 and 4 → 6, then the graph is correctly classified as acyclic, as edges 3 → 2 and 4 → 6 go from 'explored' to 'fully visited'. If we only use binary states, 'unvisited' vs 'visited', we cannot distinguish these two cases.

### 7-10. Topological Sort - Definition

There is another DFS (and also BFS) application that can be treated as 'simple': Performing Topological Sort(ing) of a Directed Acyclic Graph (DAG) — see example above.

Topological sort of a DAG is a linear ordering of the DAG's vertices in which each vertex comes before all vertices to which it has outbound edges.

Every DAG (which can be checked with the DFS described earlier) has at least one, but possibly more, topological orderings.

One of the main uses of a topological sort of a DAG is in the Dynamic Programming (DP) technique. For example, this topological sorting process is used internally in the DP solution for SSSP on DAG.

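### 7-11. Topological Sort

The BFS version is based on the idea of vertices without incoming edges and is also called Kahn's algorithm. Try Toposort (BFS/Kahn's) on the example DAG.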
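A minimal sketch of Kahn's algorithm follows. It assumes a directed graph given as an adjacency list and returns `None` when a cycle prevents a full ordering; the function name and the example graphs are hypothetical.

```python
from collections import deque

def kahn_toposort(adj):
    """Return one topological order of a DAG, or None if the graph is cyclic."""
    n = len(adj)
    indeg = [0] * n
    for u in range(n):
        for v in adj[u]:
            indeg[v] += 1
    # Start with every vertex that has no incoming edge.
    q = deque(u for u in range(n) if indeg[u] == 0)
    order = []
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1  # conceptually remove edge u -> v
            if indeg[v] == 0:
                q.append(v)
    return order if len(order) == n else None

dag = [[1, 2], [3], [3], []]  # 0→1, 0→2, 1→3, 2→3
print(kahn_toposort(dag))     # [0, 1, 2, 3]
print(kahn_toposort([[1], [0]]))  # None, because 0→1→0 is a cycle
```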

## 8. More Advanced DFS/BFS Applications

As of now, you have seen DFS/BFS and what they can solve (with just minor tweaks). There are a few more advanced applications that require more tweaks, and we will let advanced students explore them on their own:

1. Bipartite Graph Checker (DFS and BFS variants),
2. Finding Articulation Points (Cut Vertices) and Bridges of an Undirected Graph (DFS only),
3. Finding Strongly Connected Components (SCCs) of a Directed Graph (Tarjan's and Kosaraju's algorithms), and
4. 2-SAT(isfiability) Checker algorithms.

## 9. Bipartite Graph Checker

We can use the O(V+E) DFS or BFS (they work similarly) to check if a given graph is a Bipartite Graph by giving alternating colors (orange versus blue in this visualization) to neighboring vertices and reporting 'non bipartite' if we end up assigning the same color to two adjacent vertices, or 'bipartite' if such a '2-coloring' process is possible. Try DFS_Checker or BFS_Checker on the example Bipartite Graph.
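The 2-coloring check can be sketched with BFS as follows. Colors are represented as 0 and 1 instead of orange and blue, and the function name and both example graphs are assumptions made for illustration.

```python
from collections import deque

def is_bipartite(adj):
    """Return True if the undirected graph can be 2-colored."""
    n = len(adj)
    color = [-1] * n  # -1 means not colored yet
    for s in range(n):  # cover every connected component
        if color[s] != -1:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if color[v] == -1:
                    color[v] = 1 - color[u]  # give the neighbor the other color
                    q.append(v)
                elif color[v] == color[u]:
                    return False  # two adjacent vertices share a color
    return True

square = [[1, 3], [0, 2], [1, 3], [0, 2]]  # even cycle 0-1-2-3-0
triangle = [[1, 2], [0, 2], [0, 1]]        # odd cycle 0-1-2-0
print(is_bipartite(square))    # True
print(is_bipartite(triangle))  # False
```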

Bipartite Graphs have useful applications in (Bipartite) Graph Matching problems.

Note that Bipartite Graphs are usually only defined for undirected graphs, so this visualization will automatically convert a directed input graph into its undirected version before continuing. This action is irreversible and you may have to redraw the directed input graph for other purposes.

## 10. Finding Cut Vertices & Bridges

We can modify (but unfortunately, not trivially) the O(V+E) DFS algorithm into an algorithm to find Cut Vertices & Bridges of an Undirected Graph.

A Cut Vertex, or an Articulation Point, is a vertex of an undirected graph whose removal disconnects the graph. Similarly, a bridge is an edge of an undirected graph whose removal disconnects the graph.

Note that this algorithm for finding Cut Vertices & Bridges only works for undirected graphs, so this visualization will automatically convert a directed input graph into its undirected version before continuing. This action is irreversible and you may have to redraw the directed input graph for other purposes. You can try to Find Cut Vertices & Bridges on the example graph above.
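The text does not spell out the DFS modification, so the following is only a sketch of one common formulation, which tracks a discovery number `num[u]` and a low value `low[u]` (the smallest discovery number reachable from the subtree of u using at most one non-tree edge). It assumes a simple undirected graph without parallel edges; all names and the example graph are hypothetical.

```python
def cut_vertices_and_bridges(adj):
    """Return (sorted cut vertices, bridges) of a simple undirected graph."""
    n = len(adj)
    num = [-1] * n  # discovery number, -1 means unvisited
    low = [0] * n
    cut, bridges = set(), []
    counter = 0

    def visit(u, parent):
        nonlocal counter
        num[u] = low[u] = counter
        counter += 1
        children = 0
        for v in adj[u]:
            if num[v] == -1:  # tree edge u-v
                children += 1
                visit(v, u)
                low[u] = min(low[u], low[v])
                if parent != -1 and low[v] >= num[u]:
                    cut.add(u)  # subtree of v cannot climb above u
                if low[v] > num[u]:
                    bridges.append((u, v))  # subtree of v cannot even reach u
            elif v != parent:  # non-tree edge to an already discovered vertex
                low[u] = min(low[u], num[v])
        if parent == -1 and children > 1:
            cut.add(u)  # a DFS root is a cut vertex iff it has 2+ tree children

    for s in range(n):
        if num[s] == -1:
            visit(s, -1)
    return sorted(cut), bridges

# Hypothetical undirected graph (each edge stored in both directions).
adj = [[1], [0, 2, 3], [1, 3], [1, 2, 4], [3], [], [7, 8], [6], [6]]
cuts, bridges = cut_vertices_and_bridges(adj)
print(cuts)             # [1, 3, 6]
print(sorted(bridges))  # [(0, 1), (3, 4), (6, 7), (6, 8)]
```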

## 11. Finding Strongly Connected Components

We can modify (but unfortunately, not trivially) the O(V+E) DFS algorithm into an algorithm to find Strongly Connected Components (SCCs) of a Directed Graph G.

An SCC of a directed graph G is defined as a subgraph S of G such that for any two vertices u and v in S, vertex u can reach vertex v directly or via a path, and vertex v can also reach vertex u back directly or via a path.

There are two known algorithms for finding SCCs of a Directed Graph: Kosaraju's and Tarjan's. Both of them are available in this visualization. Try Kosaraju's Algorithm and/or Tarjan's Algorithm on the example directed graph above.
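As a rough sketch of Kosaraju's approach: run DFS on G to record the order in which vertices finish, then run DFS on the transposed graph in reverse finishing order; each tree found in the second pass is one SCC. The function and variable names below, as well as the example graph, are assumptions for illustration.

```python
def kosaraju_scc(adj):
    """Return (number of SCCs, comp[]) for a directed graph."""
    n = len(adj)
    visited = [False] * n
    finish_order = []

    def visit(u):  # pass 1: DFS on G, record finishing order
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                visit(v)
        finish_order.append(u)

    for u in range(n):
        if not visited[u]:
            visit(u)

    radj = [[] for _ in range(n)]  # transpose of G: every edge reversed
    for u in range(n):
        for v in adj[u]:
            radj[v].append(u)

    comp = [-1] * n

    def assign(u, c):  # pass 2: DFS on the transpose
        comp[u] = c
        for v in radj[u]:
            if comp[v] == -1:
                assign(v, c)

    count = 0
    for u in reversed(finish_order):
        if comp[u] == -1:
            assign(u, count)
            count += 1
    return count, comp

g = [[1], [2], [0, 3], [4], [3]]  # 0→1→2→0, 2→3, 3→4, 4→3
print(kosaraju_scc(g))  # (2, [0, 0, 0, 1, 1])
```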

## 12. 2-SAT Checker Algorithm

We also have the 2-SAT Checker algorithm. Given a 2-Satisfiability (2-SAT) instance in the form of a conjunction of clauses: (clause_1) ∧ (clause_2) ∧ ... ∧ (clause_n), where each clause is a disjunction of up to two variables (var_a ∨ var_b), determine if we can assign True/False values to these variables so that the entire 2-SAT instance evaluates to true, i.e., is satisfiable.

It turns out that each clause (a ∨ b) can be turned into four vertices a, not a, b, and not b with two edges: (not a → b) and (not b → a). Thus we have a Directed Graph. If some variable and its negation lie inside the same SCC of this graph, we know that it is impossible to satisfy the 2-SAT instance.

After such directed graph modeling, we can run an SCC finding algorithm (Kosaraju's or Tarjan's algorithm) to determine the satisfiability of the 2-SAT instance.
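The modeling step can be sketched as below. To keep the example short, it does not run Kosaraju's or Tarjan's algorithm. Instead it uses a simpler but slower substitute: a variable x and its negation are in the same SCC exactly when each can reach the other, so two plain DFS reachability checks per variable suffice. The literal encoding (2*i for x_i, 2*i+1 for not x_i) and all names are assumptions of this sketch.

```python
def two_sat_satisfiable(num_vars, clauses):
    """clauses: list of (a, b) literals; 2*i encodes x_i, 2*i+1 encodes not x_i."""
    n = 2 * num_vars
    adj = [[] for _ in range(n)]
    for a, b in clauses:
        adj[a ^ 1].append(b)  # not a → b
        adj[b ^ 1].append(a)  # not b → a

    def reaches(s, t):
        visited = [False] * n
        visited[s] = True
        stack = [s]
        while stack:
            u = stack.pop()
            if u == t:
                return True
            for v in adj[u]:
                if not visited[v]:
                    visited[v] = True
                    stack.append(v)
        return False

    for i in range(num_vars):
        x, not_x = 2 * i, 2 * i + 1
        if reaches(x, not_x) and reaches(not_x, x):
            return False  # x and not x share an SCC: unsatisfiable
    return True

# (x0 ∨ x1) ∧ (not x0 ∨ x1) ∧ (not x1 ∨ x0)
sat = [(0, 2), (1, 2), (3, 0)]
# the same clauses plus (not x0 ∨ not x1)
unsat = sat + [(1, 3)]
print(two_sat_satisfiable(2, sat))    # True
print(two_sat_satisfiable(2, unsat))  # False
```

This check costs O(n × (V+E)) for n variables. A single SCC pass, as the text describes, brings it down to O(V+E).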

## 13. Which One Is Better?

Quiz: Which Graph Traversal Algorithm is Better?

Always DFS
Always BFS
Both are Equally Good
It Depends on the Situation

### 13-1. Answer

[This is a hidden slide]

## 14. Extras

### 14-1. Online Quiz

There are interesting questions about these two graph traversal algorithms, DFS and BFS, and about variants of graph traversal problems. Please practice on the Graph Traversal training module (no login is required, but only a short, medium-difficulty setting is available).

However, if you are a registered user, you should log in and then go to the Main Training Page to officially clear this module; the achievement will then be recorded in your user account.

### 14-2. Online Judge Exercises

We also have a few programming problems that require the usage of DFS and/or BFS to some degree: Kattis - reachableroads and Kattis - breakingbad.

Try to solve them and then try the many more interesting twists/variants of this simple graph traversal problem and/or algorithm.

You are allowed to use/modify our implementation code for DFS/BFS Algorithms:
dfs_cc.cpp/bfs.cpp
dfs_cc.java/bfs.java
dfs_cc.py/bfs.py
dfs_cc.ml/bfs.ml

### 14-3. Discussion

[This is a hidden slide]