CLRS22 基本的图算法

2017-05-26 | 阅读：次

介绍

本章介绍图的表示和图的搜索
图的搜索技巧是整个图算法领域的核心

22.1 图的表示

对于图G = (V, E)，有两种标准表示方法: 一种是邻接链表，另一种是邻接矩阵。分别适合稀疏图和稠密图。当图规模较小时，我们更倾向于使用邻接矩阵表示法。
对于有向图，所有邻接链表的长度和为E. 无向图为2E.
邻接链表表示法的鲁棒性很高，可以对其进行简单修改来支持许多其它的图变种。
无向图只需存储对角线及以上部分的邻接矩阵。
练习22.1-1 Given an adjacency-list representation of a directed graph, how long does it take to compute the out-degree of every vertex? How long does it take to compute the in-degrees?

O(E + V)
O(E + V)

练习22.1-2 Give an adjacency-list representation for a complete binary tree on 7 vertices. Give an equivalent adjacency-matrix representation. Assume that vertices are numbered from 1 to 7 as in a binary heap.

~   1   2   3   4   5   6   7
1   0   1   1   0   0   0   0
2   1   0   0   1   1   0   0
3   1   0   0   0   0   1   1
4   0   1   0   0   0   0   0
5   0   1   0   0   0   0   0
6   0   0   1   0   0   0   0
7   0   0   1   0   0   0   0

练习22.1-3 The transpose of a directed graph G = (V, E) is the graph GT = (V, ET), where ET = {(v, u) in V × V : (u, v) in E}. Thus, GT is G with all its edges reversed. Describe efficient algorithms for computing GT from G, for both the adjacency-list and adjacency-matrix representations of G. Analyze the running times of your algorithms.

需要遍历所有链表，O(V + E)
转置矩阵即可，O(V * V)

练习22.1-4 Given an adjacency-list representation of a multigraph G = (V, E), describe an O(V + E)-time algorithm to compute the adjacency-list representation of the “equivalent” undirected graph G′ = (V, E′), where E′ consists of the edges in E with all multiple edges between two vertices replaced by a single edge and with all self-loops removed.

1. We iterate through all the vertices in the multigraph.
2. For each of the vertices we iterate through their multigraph adjacency list.
3. While iterating through the multigraph adjacency list of a vertex u, we add the neighbor to the new adjacency list of u 
     and u to the new adjacency list of the neighbor, if the neighbor is not vertex u itself and if the neighbor is not already 
     present in the new adjacency list of u.
4. The new adjacency list array is a representation of the required undirected graph.

练习22.1-5 The square of a directed graph G=(V,E) is the graph G2 =(V,E2) such that (u,w) in E2 if and only if for some v in V, both(u,v) in E and(v,w) in E.That is,G2 contains an edge between u and w whenever G contains a path with exactly two edges between u and w. Describe efficient algorithms for computing G2 from G for both the adjacency-list and adjacency-matrix representations of G. Analyze the running times of your algorithms.

邻接矩阵：O(V^3)
for i = 1 to V
    for j = 1 to V
        G2[i][j] = g[i][j]
        for k = 1 to V
            if g[i][k] == 1 and g[k][j] == 1
                G2[i][j] = 1

邻接链表: O(V^3)
将本来链表中的结点能到的结点也加入链表中

练习22.1-6 When an adjacency-matrix representation is used, most graph algorithms require time Ω(V2), but there are some exceptions. Show that determining whether a directed graph G contains a universal sink-a vertex with in-degree

- 1 and out-degree 0-can be determined in time O(V), given an adjacency matrix for G.

If vertex i is a universal sink according to the definition, the i-th row of the adjacency-matrix 
will be all “0”, and the i-th column will be all “1” except the aii entry, and clearly there is 
only one such vertex. We then describe an algorithm to find out if a universal sink really exist.

Starts from a11. If current entry aij = 0 then j = j + 1 (take one step right); if aij = 1 then 
i = i +1 (take one step down). In this way, it will stop at an entry akn of the last row or ank 
of the last column (n = |V|, 1 ≦ k ≦|V|). Check if vertex k satisfies the definition of universal 
sink (check for kth row to contain V zeros and kth column to contain k-1 1s, because the column 
in adajacency matrix defines in degree of a vertex), if yes then we found it, if no then there is 
no universal sink. Since we always make a step right or down, and checking if a vertex is a 
universal sink can be done in O(V), the total running time is O(V).

If there is no universal sink, this algorithm won’t return any vertex. If there is a universal 
sink u, the path starts from a11 will definitely meet u-th column or u-th row at some entry. 
Once it’s on track, it can’t get out of the track and will finally stop at the right entry.

练习22.1-7 The incidence matrix of a directed graph G = (V, E) is a

matrix B = (bij) such that. Describe what the entries of the matrix product B BT represent, where BT is the transpose of B.

indegree + outdegree                        if u = v
−(number of edges connecting u and v)       if u != v

练习22.1-8 Suppose that instead of a linked list, each array entry Adj[u] is a hash table containing the vertices v for which (u, v) in E. If all edge lookups are equally likely, what is the expected time to determine whether an edge is in the graph? What disadvantages does this scheme have? Suggest an alternate data structure for each edge list that solves these problems. Does your alternative have disadvantages compared to the hash table?

期望时间为O(1)
会使用更多空间
可以是一个二叉搜索树

22.2 广度优先搜索

广度优先搜索能生成一棵广度优先搜索树，在广度优先搜索树中从结点s到结点v的简单路径对应的就是图G中从结点s到结点v的最短路径。
算法在发现所有距离源节点s为k的结点后，才会发现距离为K+1的结点。

BFS(G, s)
    for each vetex u in G.V - {s}
        u.color = WHITE
        u.d = INFINITE
        u.parent = NIL
    s.color = GREY
    s.d = 0
    s.parent = NIL
    Q = queue()
    ENQUEUE(Q, s)
    while Q
        u = DEQUEUE(Q)
        for each v in G.Adj[i]
            if v.color = WHILTE
                v.color = GREY
                v.d = u.d + 1
                v.parent = u
                ENQUEUE(Q, v)
        u.color = BLACK

广度优先搜索的结果可能依赖于对每个结点的邻接结点的访问顺序。广度优先树可能会不一样，但计算出来的距离都是一样的。
运行时间为 O(V + E)
广度优先树：VFS过程所建造出来的parent属性使得前驱子图成为一棵广度优先树

PRINT-PATH(G, s.v)
    if v == s
        print s
    elif v.pai == NIL
        print "no path from" s "to" v "exists"
    else PRINT-PATH(G, s, v.pai)
        print v

练习22.2-1 Show the d and π values that result from running breadth-first search on the directed graph of Figure 22.2(a), using vertex 3 as the source.

结点          d            pai
1             INFINITE     NIL
2             3            4
3             0            NIL             
4             2            5
5             1            3
6             1            3

练习22.2-2 Show the d and π values that result from running breadth-first search on the undirected graph of Figure 22.3, using vertex u as the source.

和上题一样。

练习22.2-3 证明用单个位存放结点颜色即可。

不将结点置为黑色，算法性能不受影响。

练习22.2-4 What is the running time of BFS if its input graph is represented by an adjacency matrix and the algorithm is modified to handle this form of input?

O(V + V^2)

练习22.2-5 Argue that in a breadth-first search, the value d[u] assigned to a vertex u is independent of the order in which the vertices in each adjacency list are given. Using Figure 22.3 as an example, show that the breadth-first tree computed by BFS can depend on the ordering within adjacency lists.

因为d是最短路径，所以不受影响。
但是如果邻接链表出现的顺序不同可能有的结点pai值会变。

练习22.2-6 Give an example of a directed graph G = (V, E), a source vertex s in V , and a set of tree edges Eπ in E such that for each vertex v in V,the unique path in the graph(V,Eπ) from s to v is a shortest path in G, yet the set of edges Eπ cannot be produced by running BFS on G, no matter how the vertices are ordered in each adjacency list.

是因为BFS只能找出一条最短路径，但实际可以有很多条吗？

练习22.2-7 There are two types of professional wrestlers: “good guys” and “bad guys.” Between any pair of professional wrestlers, there may or may not be a rivalry. Suppose we have n professional wrestlers and we have a list of r pairs of wrestlers for which there are rivalries. Give an O(n + r)-time algorithm that determines whether it is possible to designate some of the wrestlers as good guys and the remainder as bad guys such that each rivalry is between a good guy and a bad guy. If is it possible to perform such a designation, your algorithm should produce it.

运行BFS, 用两种颜色染色，每次遇到没染色的结点，都把它染成和当前结点不一样的颜色。如果染色了，
而且和当前结点是一样的颜色，则不存在划分。否则，染好的图就是划分。

练习22.2-8 The diameter is the largest of all shortest-path distances in the tree. Give an efficient algorithm to compute the diameter of a tree, and analyze the running time of your algorithm.

对所有结点运行一遍BFS, 找到最大的d值。O(V * (V + E))

https://github.com/gzc/CLRS/blob/master/C22-Elementary-Graph-Algorithms/22.2.md
这里给出了很多解法。

练习22.2-9UNSOLVED Let G = (V, E) be a connected, undirected graph. Give an O(V + E)-time algorithm to compute a path in G that traverses each edge in E exactly once in each direction. Describe how you can find your way out of a maze if you are given a large supply of pennies.

22.3 深度优先搜索

广度优先搜索通常用来寻找从特定源节点出发的最短路径距离(及其相关的前驱子图)，而深度优先搜索常常作为另一个算法的一个子程序。
深度优先搜索的前驱子图可能形成多棵树(由多棵深度优先树构成的深度优先森林)。

DFS(G)
    for each vertex u in G.V
        u.color = WHITE
        u.pai = NIL
    time = 0
    for each vertex u in G.V
        if u.color == WHITE
            DFS-VISIT(G, u)

DFS-VISIT(G, u)
    time += 1
    u.d = time
    u.color = GREY
    for each v in G.Adj[u]
        if v.color == WHITE
            v.pai = u
            DFS-VISIT(G, v)
    u.color = BLACK
    time += 1
    u.f = time

运行时间为Theta(V + E)

深度优先搜索的结果可能依赖于算法DFS中第5行对结点进行检查的顺序和DFS-VISIT的第四行对一个结点邻接结点访问的顺序。通常这在实际中并不会导致问题。
深度优先森林中，结点v是结点u的后代当且仅当发现结点u的时间u.d，存在一条从结点u到结点v的全部由白色结点构成的路径。
边分为4种：树边、后向边、前向边和横向边。无环图没有横向边。
当第一次探索到(u, v)时，v为白色：树边， v为灰色: 后向边， v为黑色：前向边或横向边。在u.d<v.d时为前向边，在u.d>v.d时为横向边。
无向图没有前向边和横向边。
练习22.3-1 设计表格说明已知两结点颜色情况下每条边可能的类型。

Directed

(i,j)   White   Gray    Black
White   TBFC    BC      C
Gray    TF      TFB     TFC
Black           B       TFBC

Undirected

(i,j)   White   Gray    Black
White   TB      TB  
Gray    TB      TB      TB
Black           TB      TB

练习22.3-2 Show how depth-first search works on the graph of Figure 22.6. Assume that the for loop of lines 5-7 of the DFS procedure considers the vertices in alphabetical order, and assume that each adjacency list is ordered alphabetically. Show the discovery and finishing times for each vertex, and show the classification of each edge.

Follow the algorithm.

Tree edges: (q, s), (s, v), (v, w), (q, t), (t, x), (x, z), (t, y), (r, u)

Back edges: (w, s), (z, x), (y, q)

Forward edges: (q, w)

Cross edges: (r, y), (u, y)

练习22.3-3 Show the parenthesis structure of the depth-first search shown in Figure 22.4.

1   2   3   4   5   6   7   8   9   10  11  12
(u  (v  (y  (x  x)  y)  v)  u)  (w  (z  z)  w)

练习22.3-4 证明使用单个位存储颜色即可。

最后并不需要将结点置为黑色。

练习22.3-5 Show that edge (u, v) is

a tree edge or forward edge if and only if d[u] < d[v] < f[v] < f[u],
a back edge if and only if d[v] < d[u] < f[u] < f[v], and
a cross edge if and only if d[v] < f[v] < d[u] < f[u].

tree edge or forward edge说明v在u开始后开始，在u完成前完成。u是v的祖先结点。
back edge说明v在u开始前开始，结束后结束。v是u的祖先结点。
cross edge说明在探索(u, v)时v已经完成。

练习22.3-6 Show that in an undirected graph, classifying an edge (u, v) as a tree edge or a back edge according to whether (u, v) or (v, u) is encountered first during the depth-first search is equivalent to classifying it according to the priority of types in the classification scheme.

先(u, v): (u, v) is tree edge.
先(v, u): (v, u) is back edge.

练习22.3-7 Rewrite the procedure DFS, using a stack to eliminate recursion.

DFS(G)
    let S be a new stack
    for each vetex u in G.V
        u.color = WHITE
        u.pai = NIL
    time = 0
    for each vetex u in G.v
        S.push_back(u)
        while S is not empty
            u = S.top()
            if u.color == GRAY
                u.color = BLACK
                time += 1
                u.f = time
                S.pop()
                continue
            if u.color == WHITE
                u.color = GRAY
                time += 1
                u.d = time
            for each v in G.Adj[u]
                if v.color == WHITE
                    v.pai = u
                    S.push_back(v)

练习22.3-8 Give a counterexample to the conjecture that if there is a path from u to v in a directed graph G, and if d[u] < d[v] in a depth-first search of G, then v is a descendant of u in the depth-first forest produced.

cross edge 也符合。

练习22.3-9 Give a counterexample to the conjecture that if there is a path from u to v in a directed graph G, then any depth-first search must result in d[v] < f[u].

    s   u   v
s   0   1   0
u   1   0   0
v   1   0   0

练习22.3-10 Modify the pseudocode for depth-first search so that it prints out every edge in the directed graph G, together with its type. Show what modifications, if any, must be made if G is undirected.

根据22.3-5和22.3-6的结论，在伪代码中打印即可。

练习22.3-11 Explain how a vertex u of a directed graph can end up in a depth-first tree containing only u, even though u has both incoming and outgoing edges in G。

If the DFS first searches the vertices in the other ends of the outgoing edges, 
then searches u, and then searches the vertices in the other ends of the 
incoming edges.

练习22.3-12 Show that a depth-first search of an undirected graph G can be used to identify the connected components of G, and that the depth-first forest contains as many trees as G has connected components. More precisely, show how to modify depth-first search so that each vertex v is assigned an integer label cc[v] between 1 and k, where k is the number of connected components of G, such that cc[u] = cc[v] if and only if u and v are in the same connected component.

 Since the connectivity implies reaching a vertex from any other vertex in the 
 graph, using this idea we can follow the process in below pseudo code to number 
 the vertices to the connected componenet they belong: 

 numberingConnectedComponenets(G): 
    connected_component = 1 
    for every u vertex in G 
        markVertex(G.u, connected_component) 
        connected_component += 1 
      
markVertex(G.u , connected_component) 
    similar to DFS for every vertex v in G.adj[u]   
        v.number = connected_component 
        markVertex(G.v, connected_component)

练习22.3-13 A directed graph G = (V, E) is singly connected if implies that there is at most one simple path from u to v for all vertices u, v in V. Give an efficient algorithm to determine whether or not a directed graph is singly connected.

For each vertex u ∈ V , perform a DFS on the given graph G. Check if there 
are any foward edges or cross edges (in the same component) in any one of the
searches. If no such edges exist, then the graph is singly connected, else not.

Time complexity: O(|V |(|V | + |E|)).

The graph is singly connected even with back edges existed. Since back edges 
implies there is a path u v and v u, which is consistent with the definition 
of single connectedness.

22.4 拓扑排序

本节介绍如何使用深度优先搜索来对有向无环图进行拓扑排序。
拓扑排序是一种线性次序，次序满足: 如果G包含边(u, v)，则拓扑排序中u在v的前面。

TOPOLOGICAL-SORT(G)
    call DFS(G) to compute finishing times v.f for each vertex v
    as each vertex is finished, insert it onto the front of a linked list
    return the linked list of vertices

运行时间: Theta(V + E)

一个有向图是无环的当且仅当对其进行深度搜索不产生后向边。
练习22.4-1 Show the ordering of vertices produced by TOPOLOGICAL-SORT when it is run on the dag of Figure 22.8, under the assumption of Exercise 22.3-2.

p n o s m r y v x w z u q t

练习22.4-2 Give a linear-time algorithm that takes as input a directed acyclic graph G = (V, E) and two vertices s and t, and returns the number of paths from s to t in G. For example, in the directed acyclic graph of Figure 22.8, there are exactly four paths from vertex p to vertex v: pov, por yv, posr yv, and psr yv. (Your algorithm only needs to count the paths, not list them.)

以s为源结点运行广度优先搜索，看会遇到t多少次。

练习22.4-3 Give an algorithm that determines whether or not a given undirected graph G = (V, E) contains a cycle. Your algorithm should run in O(V) time, independent of

An undirected graph is acyclic (i.e., a forest) iff a DFS yields no back edges. Since back edges are those 
edges (u, v) connecting a vertex u to an ancestor v in a depth-first tree, so no back edges means there are 
only tree edges, so there is no cycle.

So we can simply run DFS. If find a back edge, there is a cycle. The complexity is O(V ) instead of O(E + V ). 
Since if there is a back edge, it must be found before seeing |V | distinct edges. This is because in a acyclic
 (undirected ) forest, |E| ≤ |V | - 1

练习22.4-4 Prove or disprove: If a directed graph G contains cycles, then TOPOLOGICAL-SORT (G) produces a vertex ordering that minimizes the number of “bad” edges that are inconsistent with the ordering produced.

该结论不成立。选择不同的起始点, 拓扑排序(按书上的深度优先遍历实现)会给出不同的序列, 不能保证坏边的数目最少.

练习22.4-5 Another way to perform topological sorting on a directed acyclic graph G = (V, E) is to repeatedly find a vertex of in-degree 0, output it, and remove it and all of its outgoing edges from the graph. Explain how to implement this idea so that it runs in time O(V + E). What happens to this algorithm if G has cycles?

首先在O(V+E)时间里统计出每个点的入度和出度，以后删除边时维护这些信息。
每次输出入度为0的点，并删除出边。要进行O(V)次输出，O(E)次删除，总运行时间为
O(V+E)

如果图有回路，可能没有入度为0的点

22.5 强连通分量

本节将阐述如何使用深度优先搜索将有向图分解为强连通分量。
图G的转置为将G中的边进行反向。给定图G的邻接链表，创建转置的时间为O(V+E).图G与其转置的强连通分量完全相同。
线性时间算法计算有向图的强连通分量:

STRONGLY-CONNECTED-COMPONENTS(G)
    call DFS(G) = compute finishing times u.f for each vertex u
    compute GT
    call DFS(GT), but in the main loop of DFS, consider the vertices in order of decreasing u.f
    output the vertices of each tree in the depth-first forest formed in line 3 as a separate strongly
            connected component

通过收缩所有相邻结点都在同一个强连通分量中的边，剩下的图就是分量图。分量图是一个有向无环图。
定理22.16 算法 STRONGLY-CONNECTED-COMPONENTS 能够正确计算出有向图G的强连通分量。
练习22.5-1 How can the number of strongly connected components of a graph change if a new edge is added?

如果增加的是一个强连通分量里面的边，那么不会影响
如果是不同强连通分量之间的边，可能不会减少也可能会减少。

练习22.5-2 Show how the procedure STRONGLY-CONNECTED-COMPONENTS works on the graph of Figure 22.6. Specifically, show the finishing times computed in line 1 and the forest produced in line 3. Assume that the loop of lines 5-7 of DFS considers vertices in alphabetical order and that the adjacency lists are in alphabetical order.

按算法做，最后的结果是 {r} -> {u} -> {q, y, t} -> {x, z} -> {s, w, v}

练习22.5-3 Professor Deaver claims that the algorithm for strongly connected components can be simplified by using the original (instead of the transpose) graph in the second depth-first search and scanning the vertices in order of increasing finishing times. Is the professor correct?

不能。
Consider vertex1 to vertex0 vertex0 to vertex2 and vertex0 to vertex1 For this algorithm, the first DFS 
will give a list 1 2 0 for the second DFS. All vertices will be incorrectly reported to be in the same 
SCC. For the CLRS algorithm, the first DFS will give a list 0 2 1 for the second DFS. After reversing 
edges, the correct SCCs {0, 1} and {2} will be reported.

练习22.5-4 Prove that for any directed graph G, we have ((GT)SCC)T = GSCC. That is, the transpose of the component graph of GT is the same as the component graph of G.

转置不改变连通关系。

练习22.5-5 Give an O(V + E)-time algorithm to compute the component graph of a directed graph G = (V, E). Make sure that there is at most one edge between two vertices in the component graph your algorithm produces.

先用STRONGLY-CONNECTED-COMPONENTS计算出强联通分量
将每个强连通分量视为一个结点，如果两个分量之间有边且之前没有加过，就加上这个边，

练习22.5-6 Given a directed graph G = (V, E), explain how to create another graph G′ = (V, E′) such that (a) G′ has the same strongly connected components as G, (b) G′ has the same component graph as G, and (c) E′ is as small as possible. Describe a fast algorithm to compute G′.

只加上必要的边，即
强连通分量内部如果有n个结点，那么用n条边构成环路
两个不同强连通分量之间至多用一个边相连

练习22.5-7 A directed graph G = (V, E) is semiconnected if, for all pairs of vertices u, v ∈ V, we have u ~> v or v ~> u. Give an efficient algorithm to determine whether or not G is semiconnected. Prove that your algorithm is correct, and analyze its running time.

先使用本节算法计算出强连通分量
然后看每个强连通分量是否有指向下一个强连通分量的边

思考题

22.1 以广度优先搜索来对图的边进行分类

因为广度优先搜索的性质。

22.2 衔接点、桥和双连通分量

to be solved

22.3 欧拉回路

只有in-degree = out-degree，才能进出各一次。

22.4 可到达性

to be solved

Sasuke

Jing