CLRS16 贪心算法

2017-05-23 | 阅读：次

介绍

贪心算法每一步都做出局部最优的选择，寄希望于这样的选择能得到全局最优解。贪心算法并不保证得到最优解，但对很多问题确实可以求得最优解。

16.1 活动选择问题

虽然可以用动态规划方法求解活动选择问题，但并不需要这样做。相反，可以反复选择最早结束的活动，保留兼容活动，重复过程，直到不再有剩余。
贪心算法通常做出一个选择，然后求解剩下的那个子问题。
递归贪心算法：添加一个虚拟活动a0,其结束时间为f0=0方便初始化。

RECURSIVE-ACTIVITY-SELECTOR(s, f, k, n)
    m = k + 1
    while m <= n and s[m] < f[k]            // find the first activity in Sk to finish
        m += 1
    if m <= n
        return {am} U RECURSIVE-ACTIVITY-SELECTOR(s, f, m, n)
    else return Fai


RECURSIVE-ACTIVITY-SELECTOR(s, f, 0, n)

迭代贪心算法(将一个尾递归过程改为迭代形式通常是很直接的，某些语言的编译器可以自动完成这一工作)

GREEDY-ACTIVITY-SELECTOR(s, f)
    n = s.length
    A = {a1}
    k = 1
    for m = 2 to n
        if s[m] >= f[k]
            A = A U {am}
            k = m
    return A

练习16.1-1 Give a dynamic-programming algorithm for the activity-selection problem, based on the recurrence (16.3). Have your algorithm compute the sizes c[i, j] as defined above and also produce the maximum-size subset A of activities. Assume that the inputs have been sorted as in equation (16.1). Compare the running time of your solution to the running time of GREEDY-ACTIVITY-SELECTOR.

DP-ACTIVITY-SELECTOR(s, f)
    let dp[0..n][0..n] be a new table, all elements initiatized to -1
    return DP-ACTIVITY-SELECTOR-AUX(s, f, 0, n)

DP-ACTIVITY-SELECTOR-AUX(s, f, i, j)
    if dp[i][j] >= 0
        return dp[i][j]
    elif i >= j
        dp[i][j] = 0
        return 0
    else
        for t = i to j
            p <- the first activity in i..j that ends before t starts or i
            q <- the first activity in i..j that starts before t ends or j
            if DP-ACTIVITY-SELECTOR-AUX(s, f, i, p) + 1 + DP-ACTIVITY-SELECTOR-AUX(s, f, q, j) > dp[i][j]
                dp[i][j] = DP-ACTIVITY-SELECTOR-AUX(s, f, i, p) + 1 + DP-ACTIVITY-SELECTOR-AUX(s, f, q, j)
        return dp[i][j]

n^2个子问题，每个字问题最多n种选择，所以时间Theta(n^3)
与之相比贪心算法只有Theta(n)

练习16.1-2 Suppose that instead of always selecting the first activity to finish, we instead select the last activity to start that is compatible with all previously selected activities. Describe how this approach is a greedy algorithm, and prove that it yields an optimal solution.

GREEDY-ACTIVITY-SELECTOR(s, f)
    n = s.length
    A = {an}
    k = 1
    for m = n-1 to 1
        if f[m] <= s[k]
            A = A U {am}
            k = m
    return A

代码并没有什么实质性的区别。

练习16.1-3 说明选择持续时间最短、与其他活动重叠最少或最早开始者均不能得到最优解。

用反例来说明。

练习16.1-4 Suppose that we have a set of activities to schedule among a large number of lecture halls. We wish to schedule all the activities using as few lecture halls as possible. Give an efficient greedy algorithm to determine which activity should use which lecture hall.

Find the smallest number of lectures halls to schedule a set of activities S in.To do this efficiently move throught the 
activities according to starting and finishing times. Maintain two lists of lecture halls: Halls that are busy at time t 
and halls that are free at time t. When t is the starting time for some activity schedule this activity to a free lecture 
hall and move the hall to the busy list. Similarly, move the hall to the free list when the activity stops. Initially 
start with zero halls. If there are no halls in the free list create a new hall.

练习16.1-5 每个活动有一个值，求值之和最大的兼容活动子集

用16.1-1设计的动态规划算法做就好啦~

贪心算法原理

第一个关键要素是贪心选择性质：可以通过贪心选择来构造全局最优解。
最优子结构性质：论证将子问题的最优解与贪心选择组合在一起就能生成原问题的最优解。
贪心算法可以解决分数背包问题，不能解决0-1背包问题。
练习16.2-1 Prove that the fractional knapsack problem has the greedy-choice property.

If every time we pick out the commodity with the greatest value per pound, we will finally 
obtain the optimal value of all commodities under limit W.

练习16.2-2 Give a dynamic-programming solution to the 0–1 knapsack problem that runs in O(n W) time, where n is number of items and W is the maximum weight of items that the thief can put in his knapsack.

# V, W是两个数组，表示每件商品的价值和重量，limit是背包的最大承重

def knapsack_problem(V, M, limit):
    n = len(V)
    dp = [[-1] * (limit+1)] * (n+1)
    return knapsack_aux(V, M, limit, dp)

def knapsack_aux(V, M, limit, dp):
    if limit < 0: return -2**64
    if len(V) == 0 or limit == 0:
        dp[len(V)][limit] = 0
        return 0
    dp[len(V)][limit] = max(knapsack_aux(V[1:], M[1:], limit, dp),
                    V[0] + knapsack_aux(V[1:], M[1:], limit-M[0], dp))
    return dp[len(V)][limit]

练习16.2-3 Suppose that in a 0–1 knapsack problem, the order of the items when sorted by increasing weight is the same as their order when sorted by decreasing value. Give an efficient algorithm to find an optimal solution to this variant of the knapsack problem, and argue that your algorithm is correct.

排序，然后先装最轻的，正确性显而易见。

练习16.2-4 Professor Midas drives an automobile from Newark to Reno along Interstate 80. His car’s gas tank, when full, holds enough gas to travel n miles, and his map gives the distances between gas stations on his route. The professor wishes to make as few gas stops as possible along the way. Give an efficient method by which Professor Midas can determine at which gas stations he should stop, and prove that your strategy yields an optimal solution.

先到能到达的最远补水站，然后补水
再到能到达的最远补水站，然后补水
依此类推，直到到达终点。

练习16.2-5 Describe an efficient algorithm that, given a set {x1, x2, …,xn} of points on the real line, determines the smallest set of unit-length closed intervals that contains all of the given points. Argue that your algorithm is correct.

先对所有点进行排序
将[x1, x1+1]并入集合S
for i = 2 to n
    if xi 不在集合中的任何一个间隔上
        S U [xi, xi+1]
S即为最优解

练习16.2-6 Show how to solve the fractional knapsack problem in O(n) time.

TO BE SOLVED

练习16.2-7 Suppose you are given two sets A and B, each containing n positive integers. You can choose to reorder each set however you like. After reordering, let ai be the ith element of set A, and let bi be the ith element of set B. You then receive a payoff of . Give an algorithm that will maximize your payoff. Prove that your algorithm maximizes the payoff, and state its running time.

将A,B都排序为单调递减。
待证明。

Huffman编码

Q用最小二叉堆实现，运行时间是O(nlgn). 若用van Emde Boas树，则运行时间是O(nlglgn).

HUFFMAN(C)
    n = len(C)
    Q = C                 // Q为一最小优先队列，以属性freq为关键字。
    for i = 1 to n - 1
        allocate a new node z
        z.left = x = EXCTRACT-MIN(Q)
        z.right = y = EXTRACT-MIN(Q)
        z.freq = x.freq + y.freq
        INSERT(Q, z)
    return EXTRACT-MIN(Q)

定理16.4：过程HUFFMAN会生成一个最优前缀码。
练习16.3-1 为什么若x.freq=b.freq,则有a,b,x,y的freq都想等

因为x.freq是所有的中最小的。
且a.freq <= b.freq

练习16.3-2 Prove that a binary tree that is not full cannot correspond to an optimal prefix code.

因为最优前缀编码必须考虑到每一种前缀情况。

练习16.3-3 What is an optimal Huffman code for the following set of frequencies, based on the first 8 Fibonacci numbers? a:1 b:1 c:2 d:3 e:5 f:8 g:13 h:21. Can you generalize your answer to find the optimal code when the frequencies are the first n Fibonacci numbers?

a: 1111111
b: 1111110
c: 111110
d: 11110
e: 1110
f: 110
g: 10
h: 0

往前面加1就行了

练习16.3-4 Prove that the total cost of a tree for a code can also be computed as the sum, over all internal nodes, of the combined frequencies of the two children of the node.

这样算下来和正常算法实质上是一样的。

练习16.3-5 证明: 如果我们将字母表中的字符按频率单调递减排序，那么存在一个最优编码，其码字长度是单调递增的。

因为如果不是单调递增，移动两个结点位置可以得到一个更优的编码树。

练习16.3-6 Suppose we have an optimal prefix code on a set C = {0, 1, …, n - 1} of characters and we wish to transmit this code using as few bits as possible. Show how to represent any optimal prefix code on C using only 2n - 1 + n ceil(lg n) bits. (Hint: Use 2n - 1 bits to specify the structure of the tree, as discovered by a walk of the tree.)

内部结点用1表示，叶节点用0表示。，需要2n-1位。
表示字母序列需要nlgn位。

练习16.3-7 Generalize Huffman’s algorithm to ternary codewords (i.e., codewords using the symbols 0, 1, and 2), and prove that it yields optimal ternary codes.

每次都用3个频率最小的子树合成一棵子树~

练习16.3-8 Suppose a data file contains a sequence of 8-bit characters such that all 256 characters are about as common: the maximum character frequency is less than twice the minimum character frequency. Prove that Huffman coding in this case is no more efficient than using an ordinary 8-bit fixed-length code.

此时生成的Huffman树是一颗满二叉树,跟固定长度编码一致.

练习16.3-9 Show that no compression scheme can expect to compress a file of randomly chosen 8-bit characters by even a single bit. (Hint: Compare the number of files with the number of possible encoded files.)

随机生成的n位文件，所有可能情况也需要n位编码文件表示。

Sasuke

Jing

CLRS16 贪心算法

介绍

16.1 活动选择问题

贪心算法原理

Huffman编码

思考题(看完图算法回来做)