CLRS21 用于不相交集合的数据结构

2017-05-26 | 阅读：次

21.1 不相交集合的操作

一个不相交集合数据结构维护了一个不相交动态集的集合 {S1, S2, …, Sk}, 我们用一个代表来标识每个集合，它是这个集合的某个成员。
动态集合支持：MAKE-SET(x), UNION(x, y), FIND-SET(x)
用不相交集合数据结构确定无向图的连通分量

CONNECTED-COMPONENTS(G)
    for each vextex v in G.V
        MAKE-SET(v)
    for each edge(u, v) in G.E
        if FIND-SET(u) != FIND-SET(v)
            UNION(u, v)

SAME-COMPONENT(u, v)
    if FIND-SET(u) == FIND-SET(v)
        return TRUE
    else return FALSE

练习21.1-1 Suppose that CONNECTED-COMPONENTS is run on the undirected graph G = (V, E), where V = {a, b, c, d, e, f, g, h, i, j, k} and the edges of E are processed in the following order: (d, i), (f, k), (g, i), (b, g), (a, h), (i, j), (d, k), (b, j), (d, f), (g, j), (a, e), (i, d). List the vertices in each connected component after each iteration of lines 3-5.

这和图21-1(b)是一样的，照做就好。

(d,i) => {a} {b} {c} {d i} {e} {f} {g} {h} {j} {k}
(f,k) => {a} {b} {c} {d i} {e} {f k} {g} {h} {j}
(g,i) => {a} {b} {c} {d g i} {e} {f k} {h} {j}
(b,g) => {a} {b d g i} {c} {e} {f k} {h} {j}
(a,h) => {a h} {b d g i} {c} {e} {f k} {j}
(i,j) => {a h} {b d g i j} {c} {e} {f k}
(d,k) => {a h} {b d g i j f k} {c} {e}
(b,j) => {a h} {b d g i j f k} {c} {e}
(d,f) => {a h} {b d g i j f k} {c} {e}
(g,j) => {a h} {b d g i j f k} {c} {e}
(a,e) => {a e h} {b d g i j f k} {c}

练习21.1-2 Show that after all edges are processed by CONNECTED-COMPONENTS, two vertices are in the same connected component if and only if they are in the same set.

If two vertices are not in the same set, meaning that no edge connecting the set of 
one vertice with the other. So two vertices are in the same connected component if 
and only if they are in the same set.

练习21.1-3 During the execution of CONNECTED-COMPONENTS on an undirected graph G = (V, E) with k connected components, how many times is FIND-SET called? How many times is UNION called? Express your answers in terms of

, and k.

FIND-SET调用 2|E| 次
UNION调用 |V| - k 次

21.2 不相交集合的链表表示

每个集合用一个自己的链表来表示。每个集合的对象包含head属性和tail属性。链表中每个对象包含一个集合成员，一个指向下一个对象的指针和一个指回到集合对象的指针。
这种实现MAKE-SET和FIND-SET都是O(1)时间的，UNION花费时间较多，因为要更新指回集合对象的指针。
最坏情况下，操作的摊还时间为O(n)
假设表中包含了表的长度，可以总把短的接到长的上，叫做加权合并启发式策略。一个具有m个MAKE-SET, UNION, FIND-SET的操作序列需要耗费O(m+nlgn)时间。(可以证明UNION每个对象最多被更新lgn次)
练习21.2-1 Write pseudocode for MAKE-SET, FIND-SET, and UNION using the linked-list representation and the weighted-union heuristic. Assume that each object x has an attribute rep[x] pointing to the representative of the set containing x and that each set S has attributes head[S], tail[S], and sizeS.

It's staright forward.

练习21.2-2 Show the data structure that results and the answers returned by the FIND-SET operations in the following program. Use the linked-list representation with the weighted-union heuristic.

for i <- 1 to 16
    do MAKE-SET(x_i)
for i <- 1 to 15 by 2
    do UNION(x_i,x_i+1)
for i <- 1 to 13 by 4
    do UNION(x_i, x_i+2)
UNION(x1,x5)
UNION(x11,x13)
UNION(x1,x_10)
FIND-SET(x2)
FIND-SET(x9)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1;2 3;4 5;6 7;8 9;10 11;12 13;14 15;16
1;2;3;4 5;6;7;8 9;10;11;12 13;14;15;16
1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16
现在都在一个集合里，两个查找都返回1

练习21.2-3 Adapt the aggregate proof of Theorem 21.1 to obtain amortized time bounds of O(1) for MAKE-SET and FIND-SET and O(lg n) for UNION using the linked-list representation and the weighted-union heuristic.

前两个操作都是O(1)
因为n个操作可以在O(nlgn)时间完成，所以摊还O(lgn)

练习21.2-4 Give a tight asymptotic bound on the running time of the sequence of operations in Figure 21.3 assuming the linked-list representation and the weighted-union heuristic.

O(n) + O(n) = O(n)
每次合并只改变一个对象。

练习21.2-5 在每个集合对象只用一个指针，而不是两个，请说明是否有道理，并说明如何进行三个操作。

可以只保留tail, 然后令tail作为集合的代表，代码类似两个指针的。

练习21.2-6 Suggest a simple change to the UNION procedure for the linked-list representation that removes the need to keep the tail pointer to the last object in each list. Whether or not the weighted-union heuristic is used, your change should not change the asymptotic running time of the UNION procedure. (Hint: Rather than appending one list to another, splice them together.)

反正这个链表是可以循环的，怎么表示都可以。

21.3 不相交集合森林

在一个disjoint-set forest中，每棵树表示一个集合，每个成员仅指向父结点。根包含集合的代表，并且是自己的父结点。
通过引入两种启发策略(按秩合并和路径压缩), 我么能得到一个渐进最优的不相交集合数据结构。
按秩合并: 使具有较少结点的树指向具有较多结点的树的根。对每个结点，维护一个秩，表示该结点高度的一个上界。
路径压缩: 使查找路径的每个结点直接指向根。路径压缩并不改变任何结点的秩。

MAKE-SET(x)
    x.p = x
    x.rank = 0

UNION(x, y)
    LINK(FIND-SET(x), FIND-SET(y))

LINK(x, y)
    if x.rank > y.rank
        y.p = x
    else x.p = y
        if x.rank == y.rank
            y.rank += 1

FIND-SET(x)
    if x != x.p
        x.p = FIND-SET(x.p)
    return x.p

FIND-SET是一种两趟方法(two-pass method): 当它递归时，第一趟沿查找路径向上直到找到根。当递归回溯时，第二趟沿搜索树向下更新每个结点，使其指向根。
一起使用两种启发式策略能带来运行时间的更大改善。最坏情况运行时间为$O(\alpha(n)m)$, $\alpha(n) \le 4$, 严格来说它是超线性的。

Sasuke

Jing

CLRS21 用于不相交集合的数据结构

21.1 不相交集合的操作

21.2 不相交集合的链表表示

21.3 不相交集合森林