Approximate Geometric Query Structures:BBD Trees
BBD Trees
Arya et al. [1] introduced the first BSP tree structure to guarantee both balanced aspect ratio and O(log n) depth. In addition, the aspect ratio achieved allowed them to prove a packing constraint. From this, one can verify that the BBD tree has a packing function of ρ(n) = O(1) where the constant factor depends on the dimension d. In the following section, we describe the basic construction of the BBD tree using terminology from [3].
Every region Ru associated with a node u in a BBD tree is either an outer rectangular box or the set theoretic difference between an outer rectangular box and an inner rectangular box. The size of a box is the length of its longest side and the size of Ru is the size of the outer box. In order to guarantee balanced aspect ratio for these cells, Arya et al. [1] introduced a stickiness restriction on the inner box. Briefly described, an inner box is sticky if the distance between the inner box and every face on the outer box is either 0 or not less than the size of the inner box. Although not necessary to the structure, we shall assume that the aspect ratio of the outer box is no more than two.
The construction of the BBD tree is done by a sequence of alternating splitting and shrinking operations. In the (midpoint) split operation, a region is bisected by a hyperplane cut orthogonal to one of the longest sides. This is essentially the standard type of cut used in a quadtree or octree. Its simplicity, speed of computation, and effectiveness are major reasons for preferring these operations.
The shrink operation partitions a region by a box lying inside the region, creating an inner region. The shrink operation is actually part of a sequence of up to three operations called a centroid shrink. The centroid shrink attempts to partition the region into a small
number of subregions Ri such that |Ri|≤ 2|Ru|/3.
When Ru is simply an outer box, with no inner box, a centroid operation is performed with one shrink operation. The inner partitioning box is found by conceptually applying midpoint split operations recursively on the subregion with the larger number of points. The process stops when the subregion contains no more than 2|Ru|/3 points. The outer box of this subregion is the inner partitioning box for the shrink operation. The other merely conceptual midpoint splits are simply ignored. Choosing this inner box guarantees that both subregions produced by the split have no more than 2|Ru|/3 points. This can be seen by observing that the inner box has no more than 2|Ru|/3 points and also must contain at least |Ru|/3 points. The technique as stated is not theoretically ideal because the number of midpoint split operations computed cannot be bounded. Each midpoint split
FIGURE 26.7: Examples of (a) multiple iterations of the midpoint split rule, (b) centroid shrinking, with dashed lines representing the conceptual midpoint splits and the highlighted inner box being the actual partition cut, (c) centroid shrinking with an inner box. In the final example, the original inner box is solid, the final midpoint split is shown with slightly extended dotted lines, and the new inner box partition cut is shown shaded in gray.
may not partition even a single point. Arya et al. [1, 3] describe a simple solution to this problem, which repeatedly computes the smallest bounding midpoint box using a technique due to Clarkson [8].
When Ru has an inner box associated with it, we cannot simply find another inner box as this would violate the restriction on having only one inner box. Let bi represent the original inner box. The solution is to proceed as in the previous centroid shrink operation, repeatedly applying midpoint split operations on the subregion with the larger number of points. However, we now stop in one of two situations; when the size of the larger subregion either has no more than 2|Ru|/3 points or no longer contains bi. In the former case, let b be the outer box of this subregion. In the latter case, or in the event both cases happen, let b represent the outer box of the subregion prior to this final split. We perform a shrink operation using b as the partitioning box. Since b clearly contains bi, the subregion associated with the original outer box continues to have one inner box, albeit a slightly larger one than its parent. The subregion R1, whose outer box is b, also has one inner box, bi. If |R1| > 2|Ru|/3, we perform a midpoint split on this subregion. Let R2 be the larger subregion formed by this last split, which we know does not contain bi. Since R2 does not contain an inner box, if R2 contains more than 2|Ru|/3 points, we simply perform the basic shrink operation thus dividing R2 into two smaller subregions as well. Clearly, all the subregions produced by this centroid shrink have no more than 2|Ru|/3 points. Figure 26.7 shows the three main operations, splitting, shrinking, and the three-step shrinking process. In addition to this simple version of the BBD tree, there are more flexible variations on this approach.
The reader should refer to [1, 3] for details on an efficient O(dn log n) construction algorithm and for discussions on some of the BBD variations. To highlight a few options, at any stage in the construction, rather than alternate between shrinking and splitting operations, it is preferable to perform split operations whenever possible, so long as the point set is divided evenly after every few levels, and to use the more costly shrinking operations only when necessary. Another approach is to use a more flexible split operation, a fair split, which attempts to partition the points in the region more evenly. In this case, more care has to be taken to avoid producing skinny regions and to avoid violating the stickiness property; however, as was shown experimentally, the flexibility provides for better experimental performance.
The following theorem summarizes the basic result [1, 3]:
THEOREM 26.3 Given a set S of n data points in IRd, in O(dn log n) time it is possible to construct a BBD tree such that
1. the tree has O(n) nodes and depth O(log n),
2. the regions have outer boxes with balanced aspect ratio and inner boxes that are sticky to the outer box,
3. the sizes of the regions are halved after every 2d levels in the tree.
The above conditions imply that the BBD tree is an O(1)-quasi-BAR tree.
The size reduction constraint above helps guarantee a slightly better performance for ge- ometric queries than given for general quasi-BAR trees. In particular, Arya and Mount [3] show that the size reduction allows range queries on BBD trees to be solved in O(2d log n + d(3√ E)d) time, or O(2d log n + d3( 3√E)d−1) for convex queries plus output size. Duncan [11] later extended the separation of the n and E dependencies to nearest and farthest neighbor queries showing that the running time for both is O(log n + E1−d log(1/E)) for fixed dimension d.
Comments
Post a Comment