Kruskal's Algorithm: Analysis

Exercise Based on your understanding of the Kruskal's algorithm, how can we efficiently implement the step which involves finding the next min-weight edge in $G$?

Solution

Keep a sorted array of edges, keep a pointer to the next position (edge).
Keep edges in a (min-heap) priority queue.

With an optimal sorting algorithm (to sort edges of the input graph by increasing weight), both approaches are $O(M \lg M)$ runtime.

We would spend $O(M \lg M)$ to sort the edges and then get the next edge in $O(1)$ time. Whereas, we can build the PriorityQueue in $O(M)$ time and remove the next "best" edge in $O(\lg M)$. We would have to do the "remove" $O(M)$ times because some edges may have to be disregarded (they cause cycle).

Exercise Once the next min-weight edge $(v, w)$ is found, how can we efficiently check if adding it to the MST would create a cycle?

Solution

We cannot check for cycle by simply checking if the endpoints are already in $T$ (why?). We can run BFS/DFS on $T$, start at $v$ and check if $w$ is reachable.

Exercise Based on your answers to the previous questions, analyze the asymptotic complexity of Kruskal's algorithm.

Solution

Operation	Frequency	Cost per operation
build PQ	$1$	$O(M)$
extract min	$M$	$O(\lg M)$
run BFS/DFS	$M$	$O(N+M)$

From the table, it can be seen that Kruskal's algorithm is quadratic. It turns out we can use another data structure called Union-Find for checking/preventing cycle much efficiently. We will explore Union-Find in the next chapter!

CS226 Notes

Kruskal's Algorithm: Analysis