Interval, Segment, Range, and Priority Search Trees:Range Trees.
Range Trees
Consider a set S of points in k-dimensional space 3ik . A range tree for this set S of points is a data structure that supports general range queries of the form where each range denotes an interval in 3i. The cartesian product of these k ranges is referred to as a kD range. In 2-dimensional space, a 2D range is simply an axes-parallel rectangle in 3i2. The range search problem is to find all the points in S that satisfy any range query. In 1-dimension, the range search problem can be easily solved in logarithmic time using a sorted array or a balanced binary search tree. The 1D range is simply an interval [xe, xr ]. We first do a binary search using xe as searched key to find the first node v whose key is no less than xe . Once v is located, the rest is simply to retrieve the nodes, one at a time, until the node u whose key is greater than xr . We shall in this section describe an augmented binary search tree which is easily generalized to higher dimensions.
Construction of Range Trees
A range tree is primarily a binary search tree augmented with an auxiliary data structure. The root node v, denoted Range Tree root(S), of a kD-range tree[5, 18, 22, 24] for a set S of points in k-dimensional space 3ik , i.e., S = {pi = (x1, x2,..., xk ),i = 1, 2,... , n}, where pi.xj = xj ∈3i is the jth-coordinate value of point pi, for j = 1, 2,... , k, is associated with the entire set S. The key stored in v.key is to partition S into two approximately equal subsets Se and Sr , such that all the points in Se and in Sr lie to the left and to the right, respectively of the hyperplane Hk : xk = v.key. That is, we will store the median of the kth coordinate values of all the points in S in v.key of the root node v, i.e., v.key = pj .xk for some point pj such that Se contains points pe, pe.xk ≤ v.key, and Sr contains points pr , pr .xk > v.key. Each node v in the kD-range tree, as before, has two tree pointers, v.left and v.right , to the roots of its left and right subtrees respectively. The node pointed to by v. left will be associated with the set Se and the node pointed to by v.right will be associated with the set Sr . The auxiliary pointer v.aux will point to an augmented data structure, in our case a (k − 1)D-range tree.
A 1D-range tree is a sorted array of all the points pi ∈ S such that the entries are drawn from the set {x1|i = 1, 2,... , n} sorted in nondecreasing order. This 1D-range tree supports the 1D range search in logarithmic time.
The following is a pseudo code for a kD-range tree for a set S of n points in k-dimensional space. See Fig. 18.3(a) and (b) for an illustration. Fig. 18.4(c) is a schematic representation of a kD-range tree.
As this is a recursive algorithm with two parameters, k and |S|, that determine the recursion depth, it is not immediately obvious how much time and how much space are needed to construct a kD-range tree for a set of n points in k-dimensional space.
Let T (n, k) denote the time taken and S(n, k) denote the space required to build a kD range tree of a set of n points in 3ik . The following are recurrence relations for T (n, k) and S(n, k) respectively.
scheme, and solutions to the recurrence relation, please refer to Bentley[2] and Monier[20] respectively.
We conclude that
THEOREM 18.4 The kD-range tree for a set of n points in k-dimensional space can be constructed in O(n logk−1 n + n log n) time and O(n logk−1 n) space for k ≥ 1.
Fig. 18.3(b) illustrates an example of a range tree for a set of points in 2-dimensions shown in Fig. 18.3(a). This list of integers under each node represents the indexes of points in ascending x-coordinates.
Fig. 18.4 illustrates a general schematic representation of a kD- range tree, which is a layered structure[5, 22].
We now discuss how we make use of a range tree to solve the range search problem. We shall use 2D-range tree as an example for illustration purposes. It is rather obvious
to generalize it to higher dimensions. Recall we have a set S of n points in the plane 3i and 2D range query Q = [xe, xr ] × [ye, yr ]. Let us assume that a 2D-range tree rooted at 2D Range Tree root(S) is available. Recall also that associated with each node v in the range tree there is a standard range for the set Sv of points represented in the subtree rooted at node v, in this case [v.B, v.E] where v.B = min{pi.y} and v.E = max{pi.y} for all pi ∈ Sv . v.key will split the standard range into two standard subranges [v.B, v.key] and [v.key, v.E] each associated with the root nodes v.left and v.right of the left and right subtrees of v respectively.
The following pseudo code reports in F the set of points in S that lie in the range Q = [xe, xr ] × [ye, yr ]. It is invoked by 2D Range Search(v, xe, xr , ye, yr ,F ), where v is the root, 2D Range Tree root(S), and F , initially empty will return all the points in S that lie in the range Q = [xe , xr ] × [ye, yr ].
procedure 1D Range Search(v, xe, xr ,F ) is very straightforward. v is a pointer to a sorted array SA. We first do a binary search in SA looking for the first element no less than xe and then start to report in F those elements no greater than xr . It is obvious that procedure 2D Range Search finds all the points in Q in O(log2 n) time. Note that there are O(log n) nodes for which we need to invoke 1D Range Search in their auxiliary sorted arrays. These nodes v are in the canonical covering‡ of the y-range [ye, yr ], since its associated standard range [v.B, v.E] is totally contained in [ye, yr ], and the 2D-range search problem is now reduced to the 1D-range search problem.
This is not difficult to see that the 2D-range search problem can be answered in time O(log2 n) plus time for output, as there are O(log n) nodes in the canonical covering of a given y-range and for each node in the canonical covering we spend O(log n) time for dealing with the 1D-range search problem.
However, with a modification to the auxiliary data structure, one can achieve an optimal query time of O(log n), instead of O(log2 n)[6, 7, 24]. This is based on the observation that in each of the 1D-range search subproblem associated with each node in the canonical covering, we perform the same query, reporting points whose x-coordinates lie in the x-range [xe, xr ]. More specifically we are searching for the smallest element no less than xe .
The modification is performed on the sorted array associated with each of the node in the 2D Range Tree(S).
Consider the root node v. As it is associated with the entire set of points, v.aux points to the sorted array containing the x-coordinates of all the points in S. Let this sorted array be denoted SA(v) and the entries, SA(v)i,i = 1, 2,... , |S|, are sorted in nondecreasing order of the x-coordinate values. In addition to the x-coordinate value, each entry also contains the index of the corresponding point. That is, SA(v)i .key and SA(v)i .index contain the x-coordinate of pj respectively, where SA(v)i .index = j and SA(v)i .key = pj .x.
We shall augment each entry SA(v)i with two pointers, SA(v)i.left and SA(v)i .right .
They are defined as follows. Let ve and vr denote the roots of the left and right subtrees of v, i.e., v.left = ve and v.right = vr . SA(v)i.left points to the entry SA(ve)j such that entry SA(ve )j .key is the smallest among all key values SA(ve)j .key ≥ SA(v)i .key. Similarly, SA(v)i.right points to the entry SA(vr )k such that entry SA(vr )k .key is the smallest among all key values SA(vr )k .key ≥ SA(v)i.key.
These two augmented pointers, SA(v)i .left and SA(v)i .right , possess the following property: If SA(v)i.key is the smallest key such that SA(v)i.key ≥ xe , then SA(ve)j .key is also the smallest key such that SA(ve)j .key ≥ xe. Similarly SA(vr )k .key is the smallest key such that SA(vr )k .key ≥ xe .
Thus if we have performed a binary search in the auxiliary sorted array SA(v) associated with node v locating the entry SA(v)i whose key SA(v)i.key is the smallest key such that SA(v)i.key ≥ xe , then following the left (respectively right) pointer SA(v)i.left (respectively SA(v)i.right ) to SA(ve)j (respectively SA(vr )k ), the entry SA(ve)j .key (respectively SA(vr )k .key) is also the smallest key such that SA(ve )j .key ≥ xe (respectively SA(vr )k .key ≥ xe). Thus there is no need to perform an additional binary search in the auxiliary sorted array SA(v.left ) (respectively SA(v.right )).
With this additional modification, we obtain an augmented 2D-range tree and the following theorem.
THEOREM 18.5 The 2D-range search problem for a set of n points in the 2-dimensional space can be solved in time O(log n) plus time for output, using an augmented 2D-range tree that requires O(n log n) space.
The following procedure is generalized from procedure 2D Range Search(v, xe, xr , ye, yr ,F ) discussed in Section 18.4.2 taken into account the augmented auxiliary data structure. It is invoked by kD Range Search(k, v, Q, F ), where v is the root kD Range Tree root(S) of the range tree, Q is the k-range represented by a two dimensional array, such that Qi.f = xi and Qi.r = xi , and F , initially empty, will contain all the points that lie in Q.
The following recurrence relation for the query time Q(n, k) of the kD-range search problem, can be easily obtained:
THEOREM 18.6 The kD-range search problem for a set of n points in the k-dimensional space can be solved in time O(logk−1 n) plus time for output, using an augmented kD-range tree that requires O(n logk−1 n) space for k ≥ 1.
Comments
Post a Comment