Graph Matching

1. Introduction

A Matching in a graph G = (V, E) is a subset of edges M ⊆ E such that no two edges in M share a common vertex.
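The definition above can be checked mechanically. The following is a minimal sketch, assuming the candidate matching is given as a list of vertex pairs; the function name `isMatching` and its parameters are illustrative and not part of the visualization.

```cpp
#include <utility>
#include <vector>
using namespace std;

// Returns true if no two edges of M share an endpoint.
// numVertices and the edge-list representation are assumptions made for this sketch.
bool isMatching(int numVertices, const vector<pair<int, int>>& M) {
  vector<bool> used(numVertices, false);   // used[v] = v is already covered by an edge of M
  for (const auto& e : M) {
    int u = e.first, v = e.second;
    if (u == v || used[u] || used[v]) return false;  // self-loop or shared endpoint
    used[u] = used[v] = true;
  }
  return true;
}
```

For example, on 4 vertices the edge set {(0,1), (2,3)} passes this check, while {(0,1), (1,2)} fails because vertex 1 is used twice.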


The Maximum Cardinality Matching (MCM) problem is a Graph Matching problem where we seek a matching M that contains the largest possible number of edges. A desirable but rarely possible result is a Perfect Matching, where all |V| vertices are matched (which requires |V| to be even), i.e., the cardinality of M is |V|/2.


A Bipartite Graph is a graph whose vertices can be partitioned into two disjoint sets U and V such that every edge can only connect a vertex in U to a vertex in V.
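Whether a graph admits such a partition can be tested by trying to 2-colour it. Below is a small sketch using BFS, assuming an adjacency-list input; the name `isBipartite` and the `colour` output vector are hypothetical helpers for illustration only.

```cpp
#include <queue>
#include <vector>
using namespace std;

// Tries to 2-colour the graph given as adjacency list AL.
// colour[v] == 0 can be read as "v is in set U", colour[v] == 1 as "v is in set V".
// Returns false as soon as an edge joins two vertices of the same colour,
// which happens exactly when the graph contains an odd-length cycle.
bool isBipartite(const vector<vector<int>>& AL, vector<int>& colour) {
  int n = (int)AL.size();
  colour.assign(n, -1);                      // -1 = not coloured yet
  for (int s = 0; s < n; ++s) {              // handle every connected component
    if (colour[s] != -1) continue;
    colour[s] = 0;
    queue<int> q;
    q.push(s);
    while (!q.empty()) {
      int u = q.front(); q.pop();
      for (int v : AL[u]) {
        if (colour[v] == -1) { colour[v] = 1 - colour[u]; q.push(v); }
        else if (colour[v] == colour[u]) return false;
      }
    }
  }
  return true;
}
```

A 4-cycle is accepted by this check, whereas a triangle (the smallest odd-length cycle) is rejected.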


The Maximum Cardinality Bipartite Matching (MCBM) problem is the MCM problem in a Bipartite Graph, which is a lot easier than the MCM problem in a General Graph.

1-1. Motivation-Application

Graph Matching problems (and their variants) arise in various applications, e.g.,

  1. Matching job openings (one disjoint set) to job applicants (the other disjoint set)
  2. The weighted version of #1 is called the Assignment problem
  3. Special cases of some NP-hard optimization problems
    (e.g., MVC, MIS, MPC on DAG, etc.)
  4. Deterministic 2-approximation algorithm for MVC
  5. Sub-routine of Christofides's 1.5-approximation algorithm for TSP, etc.

1-2. Current Limitation: No Weighted MCM

In some applications, the weights of edges are not uniform (1 unit) but vary, and we may then want to find an MCBM or MCM with minimum (or even maximum) total weight.


This visualization supports both unweighted and weighted MCBM, but only works for unweighted MCM.


We do not have an immediate plan to add support for weighted MCM; for now we only rely on Dynamic Programming with Bitmask as a solution for small graphs.
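To give a feel for what Dynamic Programming with Bitmask can look like on a small general graph, here is a hedged sketch. It assumes that we want a minimum-weight perfect matching, that the weights are given as a symmetric matrix `w`, and that a missing edge is marked with `INF`; the function name and these conventions are assumptions for illustration, not the code used by the visualization.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

const int INF = 1000000000;  // marks "no edge" and "no perfect matching" (assumed convention)

// w[i][j] = weight of edge (i, j), or INF if the edge is absent (assumed symmetric).
// Returns the minimum total weight of a perfect matching, or INF if none exists.
// Runs in O(2^n × n) time, so it is only suitable for small n (roughly n ≤ 20).
int minWeightPerfectMatching(const vector<vector<int>>& w) {
  int n = (int)w.size();
  vector<int> dp(1 << n, INF);   // dp[mask] = min cost to perfectly match the vertices in mask
  dp[0] = 0;
  for (int mask = 1; mask < (1 << n); ++mask) {
    int i = 0;
    while (!(mask & (1 << i))) ++i;          // lowest-numbered vertex in mask must get a partner
    for (int j = i + 1; j < n; ++j) {
      if (!(mask & (1 << j)) || w[i][j] >= INF) continue;
      int rest = mask ^ (1 << i) ^ (1 << j); // remove the pair (i, j) from the set
      if (dp[rest] < INF) dp[mask] = min(dp[mask], dp[rest] + w[i][j]);
    }
  }
  return dp[(1 << n) - 1];
}
```

Always pairing the lowest-numbered unmatched vertex first avoids counting the same set of pairs in several orders, which keeps the state space at 2^n subsets.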

1-3. Switching Modes

To switch between the unweighted MCBM (default, as it is much more popular), weighted MCBM, and unweighted MCM modes, click the respective header.


Here is an example of MCM mode. In MCM mode, one can draw a General graph, which is not necessarily Bipartite. However, the graph is unweighted (all edges have uniform weight 1).


The available algorithms (and example graphs) are different in each mode.

2. Visualization

You can view the visualisation here!


For Bipartite Graph visualization, we will mostly lay out the vertices of the graph so that the two disjoint sets (U and V) are clearly visible as the Left (U) and Right (V) sets. When you draw your input bipartite graph, you can choose to re-layout it into this easier-to-visualize form. However, you do not have to visualize a Bipartite Graph in this form, e.g., you can click Grid Graph to load an example grid graph and notice that vertices {0,1,2,3} can form set U and vertices {4,5,6,7,8} can form set V. There is no odd-length cycle in this grid graph.


For General Graph, we do not (and usually cannot) re-layout the vertices into this Left Set and Right Set form.


Initially, edges have grey color. Matched edges will have black color. Free/Matched edges along an augmenting path will have Orange/Light Blue colors, respectively.

3. Input Graph

There are four different sources for specifying an input graph:

  1. Edit Graph: You can draw any undirected unweighted graph as the input graph.
    However, due to the way we visualize our MCBM algorithms, we need to impose one additional graph drawing constraint that does not exist in the actual MCBM problems. That constraint is that vertices on the left set are numbered from [0, n), and vertices on the right set are numbered from [n, n+m). You do not have to visually draw them in left-right sets form, as shown in this Grid Graph example.
  2. Input Graph: This is a new (not fully tested) feature.
  3. Modeling: Several graph problems can be reduced into an MCBM problem. In this visualization, we have the modeling examples for the famous Rook Attack problem and standard MCBM problem (also valid in MCM mode).
  4. Example Graphs: You can select from the list of our example graphs to get you started. The list of examples is slightly different between the MCBM and MCM modes.

3-1. Rook Attack Modeling

This slide is a stub and will be expanded with an explanation of this problem and how to interpret the generated bipartite graph.

3-2. Unweighted Bipartite Graph Modeling

You can create any (small) bipartite graph with n/m vertices on the left/right set and set the density of its edges, where 100% gives the complete bipartite graph K_{n,m} and 0% gives a bipartite graph with no edges.
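One possible reading of the density setting is that each of the n×m possible edges is kept independently with the given probability. The sketch below follows that reading; how the visualization actually samples edges is not stated here, and the function name, the `seed` parameter, and the edge-list output are all assumptions.

```cpp
#include <random>
#include <utility>
#include <vector>
using namespace std;

// Left vertices are numbered 0..n-1 and right vertices n..n+m-1,
// which is the numbering constraint used by this visualization.
// densityPercent is in [0, 100]; each possible edge is kept with that probability (assumed model).
vector<pair<int, int>> randomBipartiteGraph(int n, int m, int densityPercent,
                                            unsigned seed = 12345) {
  mt19937 rng(seed);
  uniform_int_distribution<int> pct(0, 99);  // uniform integer in 0..99
  vector<pair<int, int>> edges;
  for (int u = 0; u < n; ++u)
    for (int v = n; v < n + m; ++v)
      if (pct(rng) < densityPercent) edges.push_back({u, v});
  return edges;
}
```

With this model, 100% always yields all n×m edges of K_{n,m} and 0% always yields no edges, matching the two extremes described above.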

4. MCBM Algorithms

There are several Max Cardinality Bipartite Matching (MCBM) algorithms in this visualization, plus one more in Max Flow visualization:

  1. By reducing MCBM problem into a Max-Flow problem in polynomial time,
    we can actually use any Max Flow algorithm to solve MCBM.
  2. O(V×E) Augmenting Path Algorithm (without greedy pre-processing),
  3. O(√(V)×E) Dinic's or Hopcroft-Karp Algorithm,
  4. O(k×E) Augmenting Path Algorithm (with randomized greedy pre-processing).

PS1: Although possible, we will likely not use the O(V^3) Edmonds' Matching Algorithm if the input is guaranteed to be a Bipartite Graph (as it is much slower).

PS2: Although possible, we will also likely not use the O(V^3) Kuhn-Munkres Algorithm if the input is guaranteed to be an unweighted Bipartite Graph (again, as it is much slower).

4-1. MCBM ≤p Max-Flow

The MCBM problem can be modeled as (or reduced to) a Max Flow problem in polynomial time.


Go to the Max Flow visualization page and see the flow graph modeling of the MCBM problem (select Modeling → Bipartite Matching → all 1). Basically, create a super source vertex s that connects to all vertices in the left set and also create a super sink vertex t that all vertices in the right set connect to. Keep all edges in the flow graph directed from source to sink and with unit weight (capacity) 1.


If we use one of the earliest Max Flow algorithms, i.e., a simple Ford-Fulkerson algorithm, its general O(M×E) time bound becomes tighter here: all edge weights in the flow graph are unit weight, so the max flow value M ≤ V, which gives O(V×E) overall.
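The reduction and the simple Ford-Fulkerson approach can be sketched as follows. This is a minimal illustration under stated assumptions: left vertices are numbered 0..nLeft-1, right vertices nLeft..nLeft+nRight-1, the residual graph is stored as an adjacency matrix (so it only suits small graphs), and the names `FlowGraph` and `mcbmViaMaxFlow` are hypothetical.

```cpp
#include <algorithm>
#include <utility>
#include <vector>
using namespace std;

struct FlowGraph {
  int n;
  vector<vector<int>> cap;   // cap[u][v] = residual capacity of edge u -> v
  vector<int> vis;
  FlowGraph(int size) : n(size), cap(size, vector<int>(size, 0)), vis(size, 0) {}
  void addEdge(int u, int v) { cap[u][v] = 1; }   // directed edge with unit capacity

  bool dfs(int u, int t) {   // look for one s-t path in the residual graph
    if (u == t) return true;
    vis[u] = 1;
    for (int v = 0; v < n; ++v)
      if (!vis[v] && cap[u][v] > 0 && dfs(v, t)) {
        --cap[u][v]; ++cap[v][u];   // push 1 unit of flow and open the back edge
        return true;
      }
    return false;
  }
  int maxFlow(int s, int t) {      // Ford-Fulkerson: augment by 1 unit until no path remains
    int flow = 0;
    while (true) {
      fill(vis.begin(), vis.end(), 0);
      if (!dfs(s, t)) break;
      ++flow;
    }
    return flow;
  }
};

// edges holds (left, right) pairs using the numbering described above.
int mcbmViaMaxFlow(int nLeft, int nRight, const vector<pair<int, int>>& edges) {
  int s = nLeft + nRight, t = s + 1;              // super source and super sink
  FlowGraph g(nLeft + nRight + 2);
  for (int u = 0; u < nLeft; ++u) g.addEdge(s, u);
  for (int v = nLeft; v < nLeft + nRight; ++v) g.addEdge(v, t);
  for (const auto& e : edges) g.addEdge(e.first, e.second);  // left -> right only
  return g.maxFlow(s, t);
}
```

For instance, with left set {0,1}, right set {2,3}, and edges 0-2, 0-3, 1-2, the second augmentation has to use the residual back edge 2 → 0 to reroute vertex 0 to vertex 3, and the returned max flow (the MCBM) is 2.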


If we use one of the fastest Max Flow algorithms, i.e., Dinic's algorithm, on this flow graph, we can find Max Flow = MCBM in O(√(V)×E) time (the analysis is omitted for now). This allows us to solve the MCBM problem with V ∈ [1000..1500] in a typical 1s allowed runtime in many programming competitions.


Discussion: Must the edges in the flow graph be directed or can they be undirected? Explain.

4-2. The Answer

[This is a hidden slide]

4-3. Do We Stop Here?

Actually, we could stop here: when given an MCBM (or related) problem, we can simply reduce it to a Max Flow problem and use (the fastest) Max Flow algorithm.


However, there is a much simpler Graph Matching algorithm that we will see in the next few slides. It is based on an important theorem and can be implemented as an easy variation of the standard DFS algorithm.

4-4. Augmenting Path Theorem (Berge)

An Augmenting Path is a path that starts from a free (unmatched) vertex u in graph G (note that G does not have to be a bipartite graph, although augmenting paths, if any, are much easier to find in a bipartite graph), alternates through unmatched (or free/'f'), matched (or 'm'), ..., unmatched ('f') edges in G, until it ends at another free vertex v. The pattern of any augmenting path is fmf...fmf and it has odd length.


If we flip the status of the edges along that augmenting path, i.e., from fmf...fmf into mfm...mfm, we increase the number of edges in the matching set M by 1 unit and eliminate this augmenting path.
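The flip operation can be illustrated with a short sketch. It assumes the matching is stored as a set of edges and the augmenting path is given as a sequence of vertices; `Edge`, `makeEdge`, and `flipAlongPath` are hypothetical names introduced only for this example.

```cpp
#include <algorithm>
#include <set>
#include <utility>
#include <vector>
using namespace std;

typedef pair<int, int> Edge;   // undirected edge, stored with the smaller endpoint first

Edge makeEdge(int a, int b) { return Edge(min(a, b), max(a, b)); }

// Flips the status of every edge along the path: matched edges leave M, free edges join M.
// If the path is an augmenting path (pattern fmf...fmf), the result has one more edge than M.
set<Edge> flipAlongPath(set<Edge> M, const vector<int>& path) {
  for (size_t i = 0; i + 1 < path.size(); ++i) {
    Edge e = makeEdge(path[i], path[i + 1]);
    if (M.count(e)) M.erase(e);   // 'm' becomes 'f'
    else M.insert(e);             // 'f' becomes 'm'
  }
  return M;
}
```

For example, with M = {0-2} and the augmenting path 1-2-0-3, the flip removes 0-2 and adds 1-2 and 0-3, so the matching grows from 1 edge to 2 edges.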


In 1957, Claude Berge proposed the following theorem:
A matching M in graph G is maximum if and only if there is no more augmenting path in G.


Discussion: In class, prove the correctness of Berge's theorem!
In practice, we can use it as it is.

4-5. Proof of Berge's Theorem

The theorem claims if and only if, so the proof has two parts:
the forward direction and the backward direction.

The forward direction is easier to prove:
M in G is maximum → there is no augmenting path in G w.r.t. M.

The backward direction is slightly harder to prove:
M in G is maximum ← there is no augmenting path in G w.r.t. M.

4-6. M in G is max → no AP in G w.r.t. M

Proof by contradiction:
Suppose M in G is a maximum matching but G still has an augmenting path w.r.t. matching M.

Now, this augmenting path fmf...fmf (which has odd length) can be flipped to obtain another matching M' that removes the previously matched edges (the 'm' ones) and takes the free edges (the 'f' ones) along the augmenting path. Thus, |M'| = |M|+1.

This contradicts the statement that M is a maximum matching.

So, if M in G is maximum → there is no more augmenting path w.r.t. matching M in G.

4-7. M in G is max ← no AP in G w.r.t. M

This part is usually hard to grasp in one read. Please read it carefully.

We use proof by contradiction again:
Suppose there is no augmenting path in G w.r.t. M but M in G is not maximum,
i.e., there is an M' that is larger than M.

First, we take the symmetric difference of M' and M to produce a new graph G' that has the same vertices as G, but only the edges that are in either M' or M (but not both).

Let us observe this new graph G'. Every vertex is incident to at most one edge of M and at most one edge of M', so G' only consists of vertices with degree 0 (isolated vertices, which we ignore), degree 1 (endpoints of an alternating path), or degree 2 (in the middle of a path or on a cycle, i.e., a vertex that joins one edge in M and another edge in M'). A graph in which every degree is at most 2 can only consist of paths or cycles.
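The construction of G' can be reproduced on small examples. The following sketch assumes both matchings are stored as sets of edges with the smaller endpoint first; the helper names `symmetricDifference` and `maxDegree` are illustrative only.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <utility>
#include <vector>
using namespace std;

typedef pair<int, int> Edge;   // undirected edge, stored with first < second (assumed convention)

// Edges that are in exactly one of M and Mprime, i.e., the edge set of G'.
set<Edge> symmetricDifference(const set<Edge>& M, const set<Edge>& Mprime) {
  set<Edge> result;
  set_symmetric_difference(M.begin(), M.end(), Mprime.begin(), Mprime.end(),
                           inserter(result, result.begin()));
  return result;
}

// Largest vertex degree in the graph formed by the given edges (0 if there are no vertices).
int maxDegree(int numVertices, const set<Edge>& edges) {
  vector<int> deg(numVertices, 0);
  for (const Edge& e : edges) { ++deg[e.first]; ++deg[e.second]; }
  return deg.empty() ? 0 : *max_element(deg.begin(), deg.end());
}
```

For M = {0-2, 4-5} and M' = {1-2, 0-3, 4-5}, the shared edge 4-5 disappears, G' keeps the three edges 0-2, 1-2, 0-3, and no vertex has degree above 2, in line with the observation above.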


For paths and cycles, we have two sub-possibilities: odd or even length.

If G' contains an even-length path (as currently shown), it does not help this proof (such a path has equally many edges from M and from M', so it cannot be the reason why M' is larger than M).

4-8. Proof, continued (1)

We can have an even-length cycle (as currently shown in the background), but this does not help this proof either (such a cycle has equally many edges from M and from M', so it cannot be the reason why M' is larger than M).

We will not have an odd-length cycle because the edges in G' only come from M and M' (draw a triangle, which is the smallest odd-length cycle, and convince yourself that after assigning one edge to M and another edge to M', we cannot assign the third edge of the triangle to either M or M'; the same situation applies to any longer odd-length cycle).

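4-9. Proof, continued (2)

Lastly, we can have an odd-length path. Because |M'| > |M|, G' has more edges from M' than from M, and even-length paths and cycles contribute equally from both, so at least one odd-length path must start and end with edges from the 'larger' M', with the edges from M in between, i.e., the fmf...fmf pattern w.r.t. M. Its two endpoints are free w.r.t. M (otherwise the M-edge at that endpoint would also be in G', giving it degree 2). Now what is this? This is an augmenting path w.r.t. M. We assumed earlier that there is no augmenting path in G w.r.t. M, so again we arrive at a contradiction.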


Overall conclusion: Berge's theorem is not only the core mechanism behind the Augmenting Path algorithm, but it also lays the groundwork for algorithms that will be discussed later on, like Kuhn-Munkres (Hungarian) and Edmonds' Matching.

4-10. O(V×E) Augmenting Path Algorithm

Recall: Berge's theorem states:
A matching M in graph G is maximum iff there is no more augmenting path in G.


The Augmenting Path Algorithm (on a Bipartite Graph) is a simple O(V×(V+E)) = O(V^2 + V×E) = O(V×E) implementation (a modification of DFS) of that theorem: find and then eliminate augmenting paths in Bipartite Graph G.


Click Augmenting Path Algorithm Demo to visualize this algorithm on a special test case called X̄ (X-bar).


Basically, this Augmenting Path Algorithm scans through all vertices on the left set (which are initially free vertices) one by one. Suppose L on the left set is a free vertex. This algorithm will recursively (via a modification of DFS) go to a vertex R on the right set:

  1. If R is another free vertex, we have found one augmenting path (e.g., Augmenting Path 0-2 initially), and
  2. If R is already matched (this information is stored at match[R]), we immediately return to the left set via the matched edge and recurse (e.g., path 1-2, immediately return to 0, then 0-3, to find the second Augmenting Path 1-2-0-3).

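4-11. C++ Code Example - Part 1

#include <cstdio>
#include <vector>
using namespace std;

typedef vector<int> vi;  // assumed shorthand (not shown on the original slide)

vector<vi> AL;           // assumed adjacency list: AL[L] = right-set neighbours of left vertex L
vi match, vis;           // global variables

int Aug(int L) {                               // similar to the DFS algorithm
  if (vis[L]) return 0;                        // L visited, return 0
  vis[L] = 1;
  for (auto& R : AL[L])
    if ((match[R] == -1) || Aug(match[R])) {   // the key part
      match[R] = L;                            // flip status
      return 1;                                // found 1 matching
    }
  return 0;                                    // Augmenting Path is not found
}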

4-12. C++ Code Example - Part 2

// in int main(), build the bipartite graph
// use directed edges from the left set (of size VLeft) to the right set
int MCBM = 0;
match.assign(V, -1);
for (int L = 0; L < VLeft; ++L) {              // try all left vertices
  vis.assign(VLeft, 0);
  MCBM += Aug(L);                              // find an augmenting path starting from L
}
printf("Found %d matchings\n", MCBM);

You can see the full implementation on the Competitive Programming book companion site: mcbm.cpp | py | java | ml.

4-13. Extreme Test Case

If we are given a Complete Bipartite Graph K_{N/2,N/2}, i.e.,
V = N/2+N/2 = N and E = N/2×N/2 = N^2/4 = O(N^2), then
the Augmenting Path Algorithm discussed earlier (which processes neighbouring vertices in increasing vertex number) will run in O(V×E) = O(N×N^2) = O(N^3).


This is only OK for V ∈ [400..500] in a typical 1s allowed runtime in many programming competitions.


Try executing the standard Augmenting Path Algorithm on this Extreme Test Case, which is an almost complete K_{5,5} Bipartite Graph.


It performs poorly, especially in the later iterations...
So, should we avoid using this simple Augmenting Path algorithm?

4-14. O(√(V)×E) Hopcroft-Karp Algorithm

The key idea of the Hopcroft-Karp (HK) Algorithm (invented in 1973) is identical to that of Dinic's Max Flow Algorithm, i.e., prioritize the shortest augmenting paths (in terms of number of edges used) first. That is, augmenting paths with 1 edge are processed before longer augmenting paths with 3 edges, 5 edges, 7 edges, etc. (the length always increases by 2 due to the nature of an augmenting path in a Bipartite Graph).
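The "shortest augmenting paths first" idea can be sketched in code as phases of one BFS (to layer the graph) followed by DFS calls restricted to those layers. This is an illustrative sketch, not the implementation behind the visualization: the struct name and members are assumptions, and right vertices are indexed 0..nRight-1 separately from left vertices, unlike the numbering used elsewhere on this page.

```cpp
#include <queue>
#include <vector>
using namespace std;

struct HopcroftKarp {
  int nLeft, nRight;
  vector<vector<int>> AL;            // AL[u] = right neighbours of left vertex u
  vector<int> matchL, matchR, dist;  // partners (-1 = free) and BFS layer of each left vertex
  int limit;                         // layer at which the shortest augmenting paths end
  HopcroftKarp(int left, int right)
      : nLeft(left), nRight(right), AL(left), matchL(left, -1),
        matchR(right, -1), dist(left, -1), limit(-1) {}
  void addEdge(int u, int v) { AL[u].push_back(v); }

  // BFS from all free left vertices along alternating edges; finds the shortest layer
  // from which a free right vertex is reachable.
  bool bfs() {
    queue<int> q;
    limit = -1;
    for (int u = 0; u < nLeft; ++u) {
      if (matchL[u] == -1) { dist[u] = 0; q.push(u); }
      else dist[u] = -1;
    }
    while (!q.empty()) {
      int u = q.front(); q.pop();
      if (limit != -1 && dist[u] >= limit) break;   // deeper layers are not needed
      for (int v : AL[u]) {
        int w = matchR[v];
        if (w == -1) limit = dist[u];               // free right vertex found
        else if (dist[w] == -1) { dist[w] = dist[u] + 1; q.push(w); }
      }
    }
    return limit != -1;
  }

  // DFS that only follows the BFS layers, so only shortest augmenting paths are used.
  bool dfs(int u) {
    for (int v : AL[u]) {
      int w = matchR[v];
      bool ok = (w == -1) ? (dist[u] == limit)
                          : (dist[u] < limit && dist[w] == dist[u] + 1 && dfs(w));
      if (ok) { matchL[u] = v; matchR[v] = u; return true; }   // flip along the path
    }
    dist[u] = -1;   // dead end: do not visit u again in this phase
    return false;
  }

  int maxMatching() {
    int result = 0;
    while (bfs())                      // one phase per distinct shortest path length
      for (int u = 0; u < nLeft; ++u)
        if (matchL[u] == -1 && dfs(u)) ++result;
    return result;
  }
};
```

A typical use is to construct `HopcroftKarp hk(nLeft, nRight)`, call `hk.addEdge(u, v)` for every edge, and then read the answer from `hk.maxMatching()`.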


The Hopcroft-Karp Algorithm has a time complexity of O(√(V)×E) (analysis omitted for now). This allows us to solve the MCBM problem with V ∈ [1000..1500] in a typical 1s allowed runtime in many programming competitions, a similar range to running Dinic's algorithm on the Bipartite Matching flow graph.


Try the HK Algorithm on the same Extreme Test Case as earlier. You will notice that the HK Algorithm finds the MCBM much faster than the previous standard O(V×E) Augmenting Path Algorithm.


Since Hopcroft-Karp algorithm is essentially also Dinic's algorithm, we treat both as 'approximately equal'.

4-15. O(k×E) Augmenting Path Algorithm Plus

However, we can make the easy-to-code Augmenting Path Algorithm discussed earlier avoid its worst-case O(V×E) behavior by doing an O(V+E) randomized (to avoid adversarial test cases) greedy pre-processing step (which is more than just randomizing the list of neighbors of each vertex) before running the actual algorithm.


This O(V+E) additional pre-processing step is simple: For every vertex on the left set, match it with a randomly chosen unmatched neighbouring vertex on the right set. This way, we eliminate many trivial (one-edge) Augmenting Paths that consist of a free vertex u, an unmatched edge (u, v), and a free vertex v.
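The pre-processing step can be sketched on top of the Aug routine from the earlier code example. This is a hedged illustration rather than the book's implementation: the function name `mcbmPlus`, the `seed` parameter, and the `stillFree` list are assumptions, and the globals mirror the ones used on the earlier slides.

```cpp
#include <random>
#include <vector>
using namespace std;

typedef vector<int> vi;

vector<vi> AL;   // AL[L] = right-set neighbours of left vertex L (right vertices are VLeft..V-1)
vi match, vis;   // match[R] = left partner of right vertex R, or -1 if R is free

int Aug(int L) {                               // same DFS-like routine as in the earlier slides
  if (vis[L]) return 0;
  vis[L] = 1;
  for (auto& R : AL[L])
    if ((match[R] == -1) || Aug(match[R])) {
      match[R] = L;
      return 1;
    }
  return 0;
}

int mcbmPlus(int V, int VLeft, unsigned seed = 2025) {
  mt19937 rng(seed);
  match.assign(V, -1);
  int MCBM = 0;
  vi stillFree;                                // left vertices the greedy step could not match
  for (int L = 0; L < VLeft; ++L) {            // randomized greedy pre-processing
    vi candidates;                             // currently unmatched right neighbours of L
    for (int R : AL[L])
      if (match[R] == -1) candidates.push_back(R);
    if (candidates.empty()) { stillFree.push_back(L); continue; }
    uniform_int_distribution<int> pick(0, (int)candidates.size() - 1);
    match[candidates[pick(rng)]] = L;          // eliminates one trivial 1-edge augmenting path
    ++MCBM;
  }
  for (int L : stillFree) {                    // only the remaining free left vertices need Aug
    vis.assign(VLeft, 0);
    MCBM += Aug(L);
  }
  return MCBM;
}
```

Running Aug only from the left vertices that are still free is sufficient in this sketch because a left vertex that has been matched stays matched when other augmenting paths are flipped.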


Try the Augmenting Path Algorithm Plus on the same Extreme Test Case as earlier. Notice that the pre-processing step already eliminates many trivial 1-edge augmenting paths, so the actual Augmenting Path Algorithm only needs to do a small amount of additional work.

4-16. Another Hard Test Case

Quite often, on a randomly generated Bipartite Graph, the randomized greedy pre-processing step already finds most of the matchings.


However, we can construct a test case like Example Graphs, Corner Case, Rand Greedy AP Killer to make randomization as ineffective as possible. For every group of 4 vertices, there are 2 matchings. Random greedy pre-processing has a 50% chance of making a mistake per group (but since each group has only short Augmenting Paths, the fixes are not 'long'). Try this Test Case with Multiple Components to see for yourself.


The worst case time complexity is no longer O(V×E) but O(k×E), where k is a small integer, much smaller than V: k can be as small as 0 and is at most V/2 (any maximal matching, as in this case, has a size of at least half of the maximum matching). In our empirical experiments, we estimate k to be "about √(V)" too on randomly generated bipartite graphs (not the special case that is currently shown). This version of the Augmenting Path Algorithm Plus also allows us to solve the MCBM problem with V ∈ [1000..1500] in a typical 1s allowed runtime in many programming competitions.

4-17. So, Max Flow or AP?

So, when presented with an MCBM problem, which route should we take?

  1. Reduce the MCBM problem into Max-Flow and use Dinic's algorithm (essentially the Hopcroft-Karp algorithm) to get an O(√(V)×E) theoretical performance guarantee, but with a much longer implementation?
  2. Use the Augmenting Path algorithm with Randomized Greedy Pre-processing, with O(k×E) performance, good empirical results, and a much shorter implementation?

Discussion: Discuss these two routes!

4-18. Our Response (1)

[This is a hidden slide]

4-19. Our Response (2)

[This is a hidden slide]

4-20. Our Response (3)

[This is a hidden slide]

4-21. Hall's Matching Theorem (1)

[This is a hidden slide]

4-22. Hall's Matching Theorem (2)

[This is a hidden slide]

4-23. Steven's Matching Theorems (1)

[This is a hidden slide]

4-24. Steven's Matching Theorems (2)

[This is a hidden slide]

4-25. Steven's Matching Theorems (3)

[This is a hidden slide]

4-26. Steven's Matching Theorems (4)

[This is a hidden slide]

5. Weighted MCBM Algorithms

NEW FOR 2025. We have just added Min-Cost-Max-Flow (mcmf) in maxflow visualization and Hungarian/Kuhn-Munkres visualization in this VisuAlgo page.


However, these features are still experimental and may be different from the way these algorithms were written back in July 2020 for CP4.


Do report to Prof Halim if you encounter technical issue(s).

5-1. Current Limitation

[This is a hidden slide]

6. MCM Algorithms

When Graph Matching is applied to a general graph (the MCM problem), finding an Augmenting Path becomes much harder. In fact, before Jack Edmonds published his famous paper titled "Paths, Trees, and Flowers" in 1965, this MCM problem was thought to be an (NP-)hard optimization problem.


There are two Maximum Cardinality Matching (MCM) algorithms in this visualization:

  1. O(V^3) Edmonds' Matching Algorithm (without greedy pre-processing),
  2. O(V^3) Edmonds' Matching Algorithm (with greedy pre-processing).

6-1. O(V^3) Edmonds' Matching Algorithm

In a General Graph (like the graph shown in the background that has |MCM| = 4), we may have odd-length cycles. The simple alternating search for an Augmenting Path does not work reliably in such a graph, hence we cannot easily implement Claude Berge's theorem like what we did with a Bipartite Graph.


Jack Edmonds called a path that starts from a free vertex u, alternates between free, matched, ..., free edges, and returns to the same free vertex u a Blossom. This situation is only possible if we have an odd-length cycle, i.e., in a non-Bipartite Graph. For example, assume edge 1-2 has been matched in the graph shown in the background, then path 3-1=2-3 is a blossom.


Edmonds then proposed the Blossom shrinking/contraction and expansion Algorithm to solve this issue. For details on how this algorithm works, read CP4 Section 9.28, as the current visualization of Edmonds' Matching algorithm in VisuAlgo is still 'a bit too hard to understand' for beginners; try Edmonds' Matching. In a live class at NUS, these steps will be explained verbally.


This algorithm can be implemented in O(V^3).

6-2. O(V^3) Edmonds' Matching Algorithm Plus

Just like the Augmenting Path Algorithm Plus for the MCBM problem, we can also do a randomized greedy pre-processing step to eliminate as many 'trivial matchings' as possible up front. This reduces the amount of work of Edmonds' Matching Algorithm, thus resulting in a faster running time (analysis to come).

7. Closing Remarks

We have not added the visualization(s) for weighted variant of MCM problem. They are for future work.


The Hungarian (Kuhn-Munkres) algorithm visualization for weighted MCBM is very new and still needs user testing, so do report if you encounter technical issue(s).

7-1. Programming Challenges

To strengthen your understanding of this Graph Matching problem, its variations, and the many possible solutions, please try solving as many of the programming competition problems listed below as possible:

  1. Standard MCBM (but needs a fast algorithm): Kattis - flippingcards
  2. Greedy Bipartite Matching: Kattis - froshweek2
    (you do not need a specific MCBM algorithm for this problem;
    in fact, it will be too slow if you use any of the algorithms discussed here.)
  3. Special case of an NP-hard optimization problem: Kattis - bilateral
  4. Rather obvious weighted MCBM: Kattis - engaging

7-2. Implementation

To solve those programming contest problems, you can use and/or modify our implementation of the Augmenting Path Algorithm (with Randomized Greedy Pre-processing): mcbm.cpp | py | java | ml