Weighted quick union with path compression
The last implementation is called weighted quick union with path compression which is the result of studying for looking for an algorithm that has guaranteed constant time. The idea of the algorithm is to have every node to link directly to the root of its tree without paying the cost of changing a large number of links as we did for the quick find algorithm.
The approach is to do it just for the nodes that we do examine. To do so we just add another loop to the find function that sets the ids entry corresponding to each node encountered along the way to link directly to the root, resulting on almost flattening the trees completely.
impl UnionFind for WeightedQuickUnionWithPathCompression {
/// Merges two components if the two elements are in different components.
fn union(&mut self, p: usize, q: usize) {
let root_p = self.find(p);
let root_q = self.find(q);
if root_p != root_q {
if self.szs[root_p] < self.szs[root_q] {
self.ids[root_p] = root_q;
self.szs[root_q] += self.szs[root_p];
} else {
self.ids[root_q] = root_p;
self.szs[root_p] += self.szs[root_q];
}
self.count -= 1;
}
}
/// Returns the component identifier for a given element.
fn find(&mut self, mut i: usize) -> usize {
let mut root = i;
while root != self.ids[root] {
root = self.ids[root];
}
while i != root {
let newi = self.ids[i];
self.ids[i] = root;
i = newi;
}
root
}
/// Returns the number of components.
fn components(&self) -> usize {
self.count
}
}
Weighted quick union with path compression is optimal but not quite constant time per operation, no algorithm can guarantee to perform each union find operation in amortized constant time. A detailed explanation of the process can be seen in the unit test but here we just show the section where the path compression takes place:
// Union 6 and 1, their roots are 6 and 2 and their root sizes are the same so the
// root of q(1) which is 2 is connected to the root of p(6) which is 6 and its size is
// incremented after summing up the size of the root of q(1) which is 3. The total components
// are 2.
// 0 1 2 3 4 5 6 7 8 9
expected_find_pos = [6, 2, 6, 4, 4, 6, 6, 2, 4, 4];
// ^ ^ ^
// sizes: [1, 1, 3, 1, 4, 1, 6, 1, 1, 1]
// ^
alg.union(6, 1);
assert_eq!(alg.components(), 2);
for (i, expected_find) in expected_find_pos.into_iter().enumerate() {
assert_eq!(alg.ids[i], expected_find, "i={}", i);
}
// Union 1 and 0, their roots are the same 6, but notice how the path compression logic of find
// when evaluating p(1) changed its root to the root of the tree it belongs, which is 6. As the
// roots are the same nothing else is needed.
// 0 1 2 3 4 5 6 7 8 9
expected_find_pos = [6, 6, 6, 4, 4, 6, 6, 2, 4, 4];
// ^ ^
// sizes: [1, 1, 3, 1, 4, 1, 6, 1, 1, 1]
//
alg.union(1, 0);
assert_eq!(alg.components(), 2);
for (i, expected_find) in expected_find_pos.into_iter().enumerate() {
assert_eq!(alg.ids[i], expected_find, "i={}", i);
}
// Union 6 and 7, like in the last iteration their roots are the same 6 so nothing else is
// needed but the find logic updates the root of q to 6.
// 0 1 2 3 4 5 6 7 8 9
expected_find_pos = [6, 6, 6, 4, 4, 6, 6, 6, 4, 4];
// ^ ^
// sizes: [1, 1, 3, 1, 4, 1, 6, 1, 1, 1]
alg.union(6, 7);
assert_eq!(alg.components(), 2);
for (i, expected_find) in expected_find_pos.into_iter().enumerate() {
assert_eq!(alg.ids[i], expected_find, "i={}", i);
}
The full source code can be found here.