Kruskal's Algorithm: Analysis
Exercise Based on your understanding of the Kruskal's algorithm, how can we efficiently implement the step which involves finding the next min-weight edge in $G$?
Solution
- Keep a sorted array of edges, keep a pointer to the next position (edge).
- Keep edges in a (min-heap) priority queue.
With an optimal sorting algorithm (to sort edges of the input graph by increasing weight), both approaches are $O(M \lg M)$ runtime.
We would spend $O(M \lg M)$ to sort the edges and then get the next edge in $O(1)$ time. Whereas, we can build the PriorityQueue in $O(M)$ time and remove the next "best" edge in $O(\lg M)$. We would have to do the "remove" $O(M)$ times because some edges may have to be disregarded (they cause cycle).
Exercise Once the next min-weight edge $(v, w)$ is found, how can we efficiently check if adding it to the MST would create a cycle?
Solution
We cannot check for cycle by simply checking if the endpoints are already in $T$ (why?). We can run BFS/DFS on $T$, start at $v$ and check if $w$ is reachable.
Exercise Based on your answers to the previous questions, analyze the asymptotic complexity of Kruskal's algorithm.
Solution
Operation | Frequency | Cost per operation |
---|---|---|
build PQ | $1$ | $O(M)$ |
extract min | $M$ | $O(\lg M)$ |
run BFS/DFS | $M$ | $O(N+M)$ |
From the table, it can be seen that Kruskal's algorithm is quadratic. It turns out we can use another data structure called Union-Find for checking/preventing cycle much efficiently. We will explore Union-Find in the next chapter!