Given a graph, we can use the O(**V**+**E**) DFS (Depth-First Search) or BFS (Breadth-First Search) algorithm to traverse the graph and explore the features/properties of the graph. Each algorithm has its own characteristics, features, and side-effects that we will explore in this visualization.

This visualization is rich with a lot of DFS and BFS variants (all run in O(**V**+**E**)) such as:

- Topological Sort algorithm (both DFS and BFS/Kahn's algorithm version),
- Bipartite Graph Checker algorithm (both DFS and BFS version),
- Cut Vertex & Bridge finding algorithm,
- Strongly Connected Components (SCC) finding algorithms

(both Kosaraju's and Tarjan's version), and - 2-SAT Checker algorithm.

When the chosen graph traversal algorithm is running, the animation will be shown here.

We use vertex+edge color (the color scheme will be elaborated soon) and occasionally the extra text under the vertex (in red font) to highlight the changes.

All graph traversal algorithms work on directed graphs (this is the default setting, where each edge has an arrowtip to indicate its direction) but the **Bipartite Graph Check** algorithm and the **Cut Vertex & Bridge** finding algorithm requires the undirected graphs (the conversion is done automatically by this visualization).

There are two different sources for specifying an input graph:

**Draw Graph**: You can draw**any**unweighted directed graph as the input graph (to draw bidirectional edge (u, v), you can draw two directed edges u → v and v → u).**Example Graphs**: You can select from the list of our selected example graphs to get you started.

If you arrive at this e-Lecture **without** having first explore/master the concept of __Binary Heap__ and especially __Binary Search Tree__, we suggest that you explore them first, as traversing a (Binary) Tree structure is much simpler than traversing a general graph.

Quiz: **Mini pre-requisite check. What are the Pre-/In-/Post-order traversal of the binary tree shown (root = vertex 0), left and right child are as drawn?**

Pre = 0, 1, 2, 3, 4

Pre = 0, 2, 4, 3, 1

Post = 1, 3, 4, 2, 0

Post = 4, 3, 2, 1, 0

In = 4, 2, 3, 0, 1

We normally start from the most important vertex of a (binary) tree: The **root** vertex.

If the given tree is not 'rooted' (see the example picture), we can pick any one vertex (for example, vertex 0 in the example picture) and designate it as the root. If we imagine that all edges are strings of similar length, then after "virtually pulling the designated root upwards" and let gravity pulls the rest downwards, we have a rooted directed (downwards) tree — see the next slide.

PS: Technically, this transformation is done by running `DFS(0)`

that we will explore soon.

In a **binary** tree, we only have **up to two** neighboring choices: From the current vertex, we can go to the left subtree first or go to the right subtree first. We also have option to visit the current vertex before or after visiting one of the (or both) subtree(s).

This gives rise to the classics: pre-order (visit current vertex, visit its left subtree, visit its right subtree), in-order (left, current, right), and post-order (left, right, current) traversals.

Discussion: Do you notice that there are three other possible binary tree traversal combinations? What are they?

In a binary tree, or in a tree structure in general, there is no (non-trivial) cycle involving 3 or more distinct vertices to worry about (we do not consider the trivial cycle involving bi-directional edges which can be taken care of easily — see three slides earlier).

In general graph, we do not have the notion of root vertex. Instead, we need to pick one distinguished vertex to be the starting point of the traversal, i.e. the source vertex **s**.

We also have 0, 1, ..., **k** neighbors of a vertex instead of just ≤ 2.

We **may (or actually very likely)** have cycle(s) in our general graph instead of acyclic tree,

be it the trivial one like u → v → u or the non-trivial one like a → b → c → a.

But fret not, graph traversal is an easy problem with two classic algorithms: DFS and BFS.

One of the most basic graph traversal algorithm is the O(**V**+**E**) Depth-First Search (DFS).

DFS takes one input parameter: The source vertex **s**.

DFS is one of the most fundamental graph algorithm, so please spend time to understand the key steps of this algorithm.

The closest analogy of the behavior of DFS is to imagine a maze with only one entrance and one exit. You are at the entrance and want to explore the maze to reach the exit. Obviously you cannot split yourself into more than one.

Ask these reflective questions before continuing: What will you do if there are branching options in front of you? How to avoid going in cycle? How to mark your own path? Hint: You need a chalk, stones (or any other marker) and a (long) string.

As it name implies, DFS starts from a distinguished source vertex **s** and uses recursion (an implicit stack) to order the visitation sequence as deep as possible before backtracking.

If DFS is at a vertex **u** and it has **X** neighbors, it will pick the first neighbor **V _{1}** (usually the vertex with the lowest vertex number), recursively explore all reachable vertices from vertex

This wordy explanation will be clearer with DFS animation later.

If the graph is **cyclic**, the previous 'try-all' strategy may lead DFS to run in cycle.

So *the basic form of DFS* uses an array **status[u]** of size **V** vertices to decide between *binary conditions*: Whether vertex **u** has been visited or unvisited. Only if vertex **u** is still unvisited, then DFS can visit vertex **u**.

When DFS runs out of option, it **backtrack** to previous vertex (**p[u]**, see the next slide) as the recursion unwinds.

DFS uses another array **p[u]** of size **V** vertices to remember the **parent/predecessor/previous** of each vertex **u** along the DFS traversal path.

The predecessor of the source vertex, i.e., **p[s]** is set to -1 to say that the source vertex has no predecessor (as the lowest vertex number is vertex 0).

The sequence of vertices from a vertex **u** that is reachable from the source vertex **s** back to **s** forms the **DFS spanning tree**. We color these **tree edges** with red color.

For now, ignore the extra **status[u] = explored** in the displayed pseudocode and the presence of blue and grey edges in the visualization (to be explained soon).

Without further ado, let's execute

on the default example graph for this e-Lecture (CP3 Figure 4.1).The *basic version* of DFS presented so far is already enough for most simple cases.

The time complexity of DFS is O(**V**+**E**) because:

- Each vertex is only visited once due to the fact that DFS will only recursively explore a vertex
**u**if**status[u] = unvisited**— O(**V**) - Every time a vertex is visited, all its
**k**neighbors are explored and therefore after all vertices are visited, we have examined all**E**edges — (O(**E**) as the total number of neighbors of each vertex equals to**E**).

The O(**V**+**E**) time complexity of DFS only achievable if we can visit all **k** neighboring vertices of a vertex in O(**k**) time.

Quiz: **Which underlying graph data structure support that operation?**

Discussion: Why?

Another basic graph traversal algorithm is the O(**V**+**E**) Breadth-First Search (BFS).

As with DFS, BFS also takes one input parameter: The source vertex **s**.

Both DFS and BFS have their own strengths and weaknesses. It is important to learn both and apply the correct graph traversal algorithm for the correct situation.

Imagine a still body of water and then you throw a stone into it. The first location where the stone hits the water surface is the position of the source vertex and the subsequent **ripple effect** across the water surface is like the BFS traversal pattern.

BFS is very similar with DFS that have been discussed earlier, but with some differences.

BFS starts from a source vertex **s** but it uses a __queue__ to order the visitation sequence *as breadth as possible before going deeper*.

BFS also uses a Boolean array of size **V** vertices to distinguish between two states: visited and unvisited vertices (we will not use BFS to detect back edge(s) as with DFS).

In this visualization, we also show that starting from the same source vertex **s** in an **unweighted graph**, BFS spanning tree of the graph equals to its __SSSP spanning tree__.

Without further ado, let's execute .

on the default example graph for this e-Lecture (CP3 Figure 4.3).Notice the *Breadth-first* exploration due to the usage of FIFO data structure: Queue?

The time complexity of BFS is O(**V**+**E**) because:

- Each vertex is only visited once as it can only enter the queue once — O(
**V**) - Every time a vertex is dequeued from the queue, all its
**k**neighbors are explored and therefore after all vertices are visited, we have examined all**E**edges — (O(**E**) as the total number of neighbors of each vertex equals to**E**).

As with DFS, this O(**V**+**E**) time complexity is only possible if we use __Adjacency List__ graph data structure — same reason as with DFS analysis.

So far, we can use DFS/BFS to solve a few graph traversal problem variants:

- Reachability test,
- Actually printing the traversal path,
- Identifying/Counting/Labeling Connected Components (CCs) of undirected graphs,
- Detecting if a graph is cyclic,
- Topological Sort (only on DAGs),

For most data structures and algorithms courses, the applications of DFS/BFS are up to these few basic ones only, although DFS/BFS can do much more...

If you are asked to test whether a vertex **s** and a (different) vertex **t** in a graph are reachable, i.e., connected directly (via a direct edge) or indirectly (via a simple, non cyclic, path), you can call the O(**V**+**E**) `DFS(s)` (or `BFS(s)`) and check if `status[t] = visited`.

Example 1: **s = 0** and **t = 4**, run and notice that `status[4] = visited`.

Example 2: **s = 0** and **t = 7**, run and notice that `status[7] = unvisited`.

__Remember__ that we set **p[v] = u** every time we manage to extend DFS/BFS traversal from vertex **u** to vertex **v** — a tree edge in the DFS/BFS spanning tree. Thus, we can use following simple recursive function to print out the path stored in array **p**. Possible follow-up discussion: Can you write this in **iterative** form? (trivial)

method backtrack(u)

if (u == -1) stop

backtrack(p[u]);

output vertex u

To print out the path from a source vertex **s** to a target vertex **t** in a graph, you can call O(**V**+**E**) `DFS(s)` (or `BFS(s)`) and then O(**V**) `backtrack(t)`. Example: **s = 0** and **t = 4**, you can call and then `backtrack(4)`.

We can enumerate **all** vertices that are reachable from a vertex **s** in an **undirected graph** (as the example graph shown above) by simply calling O(**V**+**E**) `DFS(s)` (or `BFS(s)`) and enumerate all vertex **v** that has `status[v] = visited`.

Example: **s = 0**, run and notice that `status[{0,1,2,3,4}] = visited` so they are all reachable vertices from vertex 0, i.e., they form one **Connected Component (CC)**.

We can use the following pseudo-code to count the number of CCs:

CC = 0

for all u in V, set status[u] = unvisited

for all u in V

if (status[u] == unvisited)

++CC // we can use CC counter number as the CC label

DFS(u) // or BFS(u), that will flag its members as visited

output CC // the answer is 3 for the example graph above, i.e.

// CC 0 = {0,1,2,3,4}, CC 1 = {5}, CC 2 = {6,7,8}

You can modify the DFS(u)/BFS(u) code a bit if you want to use it to label each CC with the identifier of that CC.

Quiz: **What is the time complexity of Counting the Number of CCs algorithm?**

Discussion: Why?

We can actually *augment* the basic DFS further to give more insights about the underlying graph.

In this visualization, we use blue color to highlight **back** edge(s) of the DFS spanning tree. The presence of at least one back edge shows that the traversed graph (component) is **cyclic** while its absence shows that at least the component connected to the source vertex of the traversed graph is **acyclic**.

Back edge can be detected by modifying array **status[u]** to record **three** different states:

**unvisited**: same as earlier, DFS has not reach vertex**u**before,**explored**: DFS has visited vertex**u**, but at least one neighbor of vertex**u**has not been visited yet (DFS will go depth-first to that neighbor first),**visited**: now stronger definition: all neighbors of vertex**u**have also been visited and DFS is about to backtrack from vertex**u**to vertex**p[u]**.

If DFS is now at vertex **x** and explore edge **x → y** and encounter **status[y] = explored**, we can declare **x → y** is a **back edge** (a cycle is found as we were previously at vertex **y** (hence **status[y] = explored**), go deep to neighbor of **y** and so on, but we are now at vertex **x** that is reachable from **y** but vertex **x** leads back to vertex **y**).

The edges in the graph that are not tree edge(s) nor back edge(s) are colored grey. They are called **forward or cross edge(s)** and currently have limited use (not elaborated).

Now try normal black circle, explored/blue circle, visited/orange circle) and back edge. Edge 2 → 1 will be discovered as a back edge as it is part of cycle 1 → 3 → 2 → 1 (as vertex 2 is `explored' to vertex 1 which is currently `explored') (similarly with Edge 6 → 4 as part of cycle 4 → 5 → 7 → 6 → 4).

on the example graph above with this new understanding, especially about the 3 possible status of a vertex (unvisited/Note that if edges 2 → 1 and 6 → 4 are reversed to 1 → 2 and 4 → 6, then the graph is correctly classified as acyclic as edge 3 → 2 and 4 → 6 go from `explored' to `fully visited'. If we only use binary states: `unvisited' vs `visited', we cannot distinguish these two cases.

There is another DFS (and also BFS) application that can be treated as 'simple': Performing Topological Sort(ing) of a Directed Acyclic Graph (DAG) — see example above.

Topological sort of a DAG is a linear ordering of the DAG's vertices in which each vertex comes before all vertices to which it has outbound edges.

Every DAG (can be checked with __DFS earlier__) has at least one but possibly more topological sorts/ordering.

One of the main purpose of (at least one) topological sort of a DAG is for __Dynamic Programming (DP)__ technique. For example, this topological sorting process is used internally in __DP solution for SSSP on DAG__.

We can use either the O(**V**+**E**) DFS or BFS to perform Topological Sort of a Directed Acyclic Graph (DAG).

The DFS version requires just one additional line compared to the normal DFS and is basically the post-order traversal of the graph. Try

on the example DAG.The BFS version is based on the idea of vertices without incoming edge and is also called as Kahn's algorithm. Try

on the example DAG.As of now, you have seen DFS/BFS and what it can solve (with just minor tweaks). There are a few more advanced applications that require more tweaks and we will let advanced students to explore them on their own:

- Bipartite Graph Checker (DFS and BFS variants),
- Finding Articulation Points (Cut Vertices) and Bridges of an Undirected Graph (DFS only),
- Finding Strongly Connected Components (SCCs) of a Directed Graph (Tarjan's and Kosaraju's algorithms), and
- 2-SAT(isfiability) Checker algorithms.

Advertisement: The details are written in __Competitive Programming book__.

We can use the O(**V**+**E**) DFS or BFS (they work similarly) to check if a given graph is a Bipartite Graph by giving alternating color (orange versus blue in this visualization) between neighboring vertices and report 'non bipartite' if we ends up assigning same color to two adjacent vertices or 'bipartite' if it is possible to do such '2-coloring' process. Try or on the example Bipartite Graph.

Bipartite Graphs have useful applications in __(Bipartite) Graph Matching problem__.

Note that Bipartite Graphs are usually only defined for undirected graphs so this visualization will convert directed input graphs into its undirected version automatically before continuing. This action is irreversible and you may have to redraw the directed input graph again for other purposes.

We can modify (but unfortunately, not trivially) the O(**V**+**E**) DFS algorithm into an algorithm to find Cut Vertices & Bridges of an Undirected Graph.

A Cut Vertex, or an Articulation Point, is a vertex of an undirected graph which removal disconnects the graph. Similarly, a bridge is an edge of an undirected graph which removal disconnects the graph.

Note that this algorithm for finding Cut Vertices & Bridges only works for undirected graphs so this visualization will convert directed input graphs into its undirected version automatically before continuing. This action is irreversible and you may have to redraw the directed input graph again for other purposes. You can try to

on the example graph above.We can modify (but unfortunately, not trivially) the O(**V**+**E**) DFS algorithm into an algorithm to find Strongly Connected Components (SCCs) of a Directed Graph G.

An SCC of a directed graph G a is defined as a subgraph S of G such that for any two vertices u and v in S, vertex u can reach vertex v directly or via a path, and vertex v can also reach vertex u back directly or via a path.

There are two known algorithms for finding SCCs of a Directed Graph: Kosaraju's and Tarjan's. Both of them are available in this visualization. Try

and/or on the example directed graph above.We also have the 2-SAT Checker algorithm. Given a 2-Satisfiability (2-SAT) instance in the form of conjuction of clauses: (clause_{1}) ^ (clause_{2}) ^ ... ^ (clause_{n}) and each clause is in form of disjunction of up to two variables (var_{a} v var_{b}), determine if we can assign True/False values to these variables so that the entire 2-SAT instance is evaluated to be true, i.e. satisfiable.

It turns out that each clause (a v b) can be turned into four vertices a, not a, b, and not b with two edges: (not a → b) and (not b → a). Thus we have a Directed Graph. If there is at least one variable and its negation inside an SCC of such graph, we know that it is impossible to satisfy the 2-SAT instance.

After such directed graph modeling, we can run an SCC finding algorithm (Kosaraju's or Tarjan's algorithm) to determine the satisfiability of the 2-SAT instance.

Quiz: **Which Graph Traversal Algorithm is Better?**

Discussion: Why?