介绍
经常需要通过存储额外信息的方法来扩张一种标准的数据结构,然后对这种数据结构,编写新的操作来支持所需要的应用。对数据结构的扩张并不总是简单直接的。
14.1 动态顺序统计
-
本节介绍如何修改红黑树,使得可以在O(lgn)时间内确定任何顺序统计量。还将看到如何在O(lgn)时间内计算一个元素的秩。
-
红黑树每个结点饱含x.size = x.left.size + x.right.size + 1
OS-SELECT(x, i)
r = x.left.size + 1
if i == r
return x
elif i < r
return OS-SELECT(x.left, i)
else return OS-SELECT(x.right, i-r)
OS-RANK(T, x)
r = x.left.size + 1
y = x
while y != T.root
if y == y.p.right
r += y.p.left.size +
y = y.p
return r
-
除非能用红黑树上经过修改的基本操作对size属性加以有效的维护,否则,我们的工作将变得没有意义。
-
练习14.1-1 Show how OS-SELECT(T, 10) operates on the red-black tree T of Figure 14.1.
It's straight forward.
- 练习14.1-2 Show how OS-RANK(T, x) operates on the red-black tree T of Figure 14.1 and the node x with key[x] = 35.
It's also straight forward.
- 练习14.1-3 Write a nonrecursive version of OS-SELECT.
ITERATIVE-OS-SELECT(x, i)
while True:
r = x.left.size + 1
if i == r
return x
elif i < r
x = x.left
else
x = x.right
i -= r
- 练习14.1-4 Write a recursive procedure OS-KEY-RANK(T, k) that takes as input an order-statistic tree T and a key k and returns the rank of k in the dynamic set represented by T . Assume that the keys of T are distinct.
OS-KEY-RANK(T, k)
r = T.left.size + 1
if k == T.key
return r
elif k < T.key
return OS-KEY-RANK(T.left, k)
else
return r + OS-KEY-RANK(T.right, k)
- 练习14.1-5 Given an element x in an n-node order-statistic tree and a natural number i, how can the ith successor of x in the linear order of the tree be determined in O(lg n) time?
这节的两个函数,rank和select一起用就行了。
- 练习14.1-6 Observe that whenever the size field of a node is referenced in either OS-SELECT or OS- RANK, it is used only to compute the rank of the node in the subtree rooted at that node. Accordingly, suppose we store in each node its rank in the subtree of which it is the root. Show how this information can be maintained during insertion and deletion. (Remember that these two operations can cause rotations.)
插入和删除都要遍历所有结点,加一或者减一。
- 练习14.1-7 Show how to use an order-statistic tree to count the number of inversions (see Problem 2-4) in an array of size n in time O(n lg n).
红黑树的建树时间是O(nlgn),所以一边插入一边计算插入结点的rank,加起来就行了。
- 练习14.1-8 Consider n chords on a circle, each defined by its endpoints. Describe an O(n lg n)-time algorithm for determining the number of pairs of chords that intersect inside the circle. (For example, if the n chords are all diameters that meet at the center, then the correct answer is .) Assume that no two chords share an endpoint.
先按角度对2n个端点排序
按照从小到大的顺序遍历2n个端点
如果端点是某条弦的起点
将x插入顺序统计树中
如果端点是某条弦的终点
统计目前树中有多少弦的起点角度比x的起点角度大,就是相交弦数量
从顺序统计树中删除x
14.2 如何扩张数据结构
-
定理14.1 红黑树的扩张:设f是n个结点的红黑树T扩张的属性,且假设对任一结点x,f的值仅依赖于结点x,x.left和x.right的信息,还可能包括x.left.f和x.right.f。那么,我们可以在插入和删除操作期间对T的所有结点的f值进行维护,并且不影响这两个操作的O(lgn)渐进时间性能。证明:对结点x的f属性的变化只会影响到祖先。
-
练习14.2-1 Show how the dynamic-set queries MINIMUM, MAXIMUM, SUCCESSOR, and PREDECESSOR can each be supported in O(1) worst-case time on an augmented order- statistic tree. The asymptotic performance of other operations on order-statistic trees should not be affected. (Hint: Add pointers to nodes.)
MINIMUM: 用一个指针维持当前最小结点,插入时看是否需要更新,删除时如果删除的是最小元素,那么MINIMUM更新为原来的SUCCESSOR.
MAXIMUM: 与上类似。
SUCCESSOR: 给每个节点增加一个successor指针。
PREDECESSOR: 与上类似。
- 练习14.2-2a Can the black-heights of nodes in a red-black tree be maintained as fields in the nodes of the tree without affecting the asymptotic performance of any of the red-black tree operations? Show how, or argue why not.
可以,定理14.1
- 练习14.2-2b Can the depths of nodes in a red-black tree be efficiently maintained as fields in the nodes of the tree? Show how, or argue why not.
不能。深度取决于parent结点。可能会在更新操作中影响到超过O(lgn)的结点。
- 练习14.2-3 Let X be an associative binary operator, and let a be a field maintained in each node of a red- black tree. Suppose that we want to include in each node x an additional field f such that f[x] = a[x1] X a[x2] X ··· X a[xm], where x1, x2,…, xm is the inorder listing of nodes in the subtree rooted at x. Show that the f fields can be properly updated in O(1) time after a rotation. Modify your argument slightly to show that the size fields in order-statistic trees can be maintained in O(1) time per rotation.
每个结点的f属性可以通过left-child和right-child的f计算出来。
令运算为加法,每个结点的a field都等于1,f就是每个结点的size.
- 练习14.2-4 We wish to augment red-black trees with an operation RB-ENUMERATE(x, a, b) that outputs all the keys k such that a ≤ k ≤ b in a red-black tree rooted at x. Describe how RB- ENUMERATE can be implemented in Θ(m +lg n) time, where m is the number of keys that are output and n is the number of internal nodes in the tree. (Hint: There is no need to add new fields to the red-black tree.)
根据练习12.2-8,查找m个下继的时间为O(m+lgn)
14.3 区间树
INTERVAL-SEARCH(T, i)
x = T.root
while x != T.nil and i does not overlap x.int
if x.left != T.nil and x.left.max >= i.low
x = x.left
else x = x.right
return x
- 练习14.3-1 Write pseudocode for LEFT-ROTATE that operates on nodes in an interval tree and updates the max fields in O(1) time.
y.max = x.max
x.max = max(x.high, x.left.max, x.right.max)
- 练习14.3-2 Rewrite the code for INTERVAL-SEARCH so that it works properly when all intervals are assumed to be open.
INTERVAL-SEARCH(T, i)
x = T.root
while x != T.nil and i does not overlap x.int
if x.left != T.nil and x.left.max > i.low
x = x.left
else x = x.right
return x
- 练习14.3-3 Describe an efficient algorithm that, given an interval i, returns an interval overlapping i that has the minimum low endpoint, or nil[T] if no such interval exists.
MIN-INTERVAL-SEARCH(T, i)
x = T.root
res = INT_MAX
while x != T.nil
if i overlap x.int
res = min(res, x.low)
if x.left != T.nil and x.left.max >= i.low
x = x.left
else x = x.right
return x
- 练习14.3-4 Given an interval tree T and an interval i, describe how all intervals in T that overlap i can be listed in O(min(n, k lg n)) time, where k is the number of intervals in the output list. (Optional: Find a solution that does not modify the tree.)
INTERVAL-OVERLAP-LIST(T, x, i)
if i overlap x.int
print x
if x.left != T.nil and x.left.max >= i.low
INTERVAL-OVERLAP-LIST(T, x.left, i)
if x.right != T.nil and x.int.low <= i.high and x.right.max >= i.low
INTERVAL-OVERLAP-LIST(T, x.right, i)
return x
- 练习14.3-5 Suggest modifications to the interval-tree procedures to support the new operation INTERVAL-SEARCH-EXACTLY(T, i), which returns a pointer to a node x in interval tree T such that low[int[x]] = low[i] and high[int[x]] = high[i], or nil[T] if T contains no such node. All operations, including INTERVAL-SEARCH-EXACTLY, should run in O(lg n) time on an n-node tree.
INTERVAL-SEARCH-EXACTLY(T, i)
x = SEARCH(T, i.low)
if x.high == i.high
return x
return T.nil
- 练习14.3-6 Show how to maintain a dynamic set Q of numbers that supports the operation MIN-GAP, which gives the magnitude of the difference of the two closest numbers in Q. For example, if Q = {1, 5, 9, 15, 18, 22}, then MIN-GAP(Q) returns 18 - 15 = 3, since 15 and 18 are the two closest numbers in Q. Make the operations INSERT, DELETE, SEARCH, and MIN-GAP as efficient as possible, and analyze their running times.
给节点新增三个属性:
mingap : 以当前节点为根,其子树的最小gap.叶子结点为∞
maxval : 以当前节点为根,其子树的最大关键字
minval : 以当前节点为根,其子树的最小关键字
x.minval = min(x.key, x.left.minval)
x.maxval = max(x.key, x.right.maxval)
x.mingap = min(x.left.mingap, x.right.mingap, x.key-x.left.maxval, x.right.minval-x.key)
都可以在O(lgn)时间更新这些信息。
- 练习14.3-7 VLSI databases commonly represent an integrated circuit as a list of rectangles. Assume that each rectangle is rectilinearly oriented (sides parallel to the x- and y-axis), so that a representation of a rectangle consists of its minimum and maximum x- and y-coordinates. Give an O(n lg n)-time algorithm to decide whether or not a set of rectangles so represented contains two rectangles that overlap. Your algorithm need not report all intersecting pairs, but it must report that an overlap exists if one rectangle entirely covers another, even if the boundary lines do not intersect. (Hint: Move a “sweep” line across the set of rectangles.)
先根据x坐标排序所有2n个点,然后将一扫描线扫过整个矩形,我们让扫描线和y轴平行,从最左开始扫描.如果碰到某矩形平行y轴的左边那条边条边,就把这条边插入到区间树中,如果碰到某矩形平行y轴的右边那条边,就把这条边从区间树中删除,如果碰到在插入边时,和前面插入的边有重叠,那么就是矩形重叠了.
O(nlgn)
思考题
- 14-1 最大重叠点
a.
端点不会比其它地方少。
b.
- 14-2 Josephus排列
a.
用一个循环链表
b.