Multidimensional Spatial Data Structures:Line Data and Boundaries of Regions

Line Data and Boundaries of Regions

Section 16.4 was devoted to variations on hierarchical decompositions of regions into blocks, an approach to region representation that is based on a description of the region’s interior. In this section, we focus on representations that enable the specification of the boundaries of regions, as well as curvilinear data and collections of line segments. The representations are usually based on a series of approximations which provide successively closer fits to the data, often with the aid of bounding rectangles. When the boundaries or line segments have a constant slope (i.e., linear and termed line segments in the rest of this discussion), then an exact representation is possible.

There are several ways of approximating a curvilinear line segment. The first is by digitizing it and then marking the unit-sized cells (i.e., pixels) through which it passes. The second is to approximate it by a set of straight line segments termed a polyline. Assuming a boundary consisting of straight lines (or polylines after the first stage of approximation), the simplest representation of the boundary of a region is the polygon. It consists of vectors which are usually specified in the form of lists of pairs of x and y coordinate values corresponding to their start and end points. The vectors are usually ordered according to their connectivity. One of the most common representations is the chain code [26] which is an approximation of a polygon’s boundary by use of a sequence of unit vectors in the four (and sometimes eight) principal directions.

Chain codes, and other polygon representations, break down for data in three dimensions and higher. This is primarily due to the difficulty in ordering their boundaries by connectivity. The problem is that in two dimensions connectivity is determined by ordering the boundary elements ei,j of boundary bi of object o so that the end vertex of the vector vj corresponding to ei,j is the start vertex of the vector vj+1 corresponding to ei,j+1. Unfortunately, such an implicit ordering does not exist in higher dimensions as the relationship between the boundary elements associated with a particular object are more complex.

Instead, we must make use of data structures which capture the topology of the object in terms of its faces, edges, and vertices. The winged-edge data structure is one such representation which serves as the basis of the boundary model (also known as BRep [5]). For more details about these data structures, see Chapter 17.

Polygon representations are very local. In particular, if we are at one position on the boundary, we don’t know anything about the rest of the boundary without traversing it element-by-element. Thus, using such representations, given a random point in space, it is very difficult to find the nearest line to it as the lines are not sorted. This is in contrast to hierarchical representations which are global in nature. They are primarily based on rectangular approximations to the data as well as on a regular decomposition in two dimensions. In the rest of this section, we discuss a number of such representations.

In Section 16.3 we already examined two hierarchical representations (i.e., the R-tree and the R+-tree) that propagate object approximations in the form of bounding rectangles. In this case, the sides of the bounding rectangles had to be parallel to the coordinate axes of the space from which the objects are drawn. In contrast, the strip tree [4] is a hierarchical representation of a single curve that successively approximates segments of it with bounding rectangles that does not require that the sides be parallel to the coordinate axes. The only requirement is that the curve be continuous; it need not be differentiable.

The strip tree data structure consists of a binary tree whose root represents the bounding rectangle of the entire curve. The rectangle associated with the root corresponds to a rectangular strip, that encloses the curve, whose sides are parallel to the line joining the endpoints of the curve. The curve is then partitioned in two at one of the locations where it touches the bounding rectangle (these are not tangent points as the curve only needs to be

image

FIGURE 16.12: (a) MX quadtree and (b) edge quadtree for the collection of line segments of Figure 16.5.

continuous; it need not be differentiable). Each subcurve is then surrounded by a bounding rectangle and the partitioning process is applied recursively. This process stops when the width of each strip is less than a predetermined value.

In order to be able to cope with more complex curves such as those that arise in the case of object boundaries, the notion of a strip tree must be extended. In particular, closed curves and curves that extend past their endpoints require some special treatment. The general idea is that these curves are enclosed by rectangles which are split into two rectangular strips, and from now on the strip tree is used as before.

The strip tree is similar to the point quad tree in the sense that the points at which the curve is decomposed depend on the data. In contrast, a representation based on the region quad tree has fixed decomposition points. Similarly, strip tree methods approximate curvilinear data with rectangles of arbitrary orientation, while methods based on the region quad tree achieve analogous results by use of a collection of disjoint squares having sides of length power of two. In the following we discuss a number of adaptations of the region quad tree for representing curvilinear data.

The simplest adaptation of the region quad tree is the MX quad tree [39, 40]. It is built by digitizing the line segments and labeling each unit-sized cell (i.e., pixel) through which it passes as of type boundary. The remaining pixels are marked WHITE and are merged, if possible, into larger and larger quad tree blocks. Figure 16.12a is the MX quad tree for the collection of line segment objects in Figure 16.5.

A drawback of the MX quad tree is that it associates a thickness with a line. Also, it is difficult to detect the presence of a vertex whenever five or more line segments meet.

The edge quad tree [68, 74] is a refinement of the MX quad tree based on the observation that the number of squares in the decomposition can be reduced by terminating the subdivision whenever the square contains a single curve that can be approximated by a single straight line. For example, Figure 16.12b is the edge quad tree for the collection of line segment objects in Figure 16.5. Applying this process leads to quad trees in which long edges are represented by large blocks or a sequence of large blocks. However, small blocks are required in the vicinity of corners or intersecting edges. Of course, many blocks will contain no edge information at all.

The PM quad tree family [54, 66] (see also edge-EXCELL [71]) represents an attempt

image

FIGURE 16.13: (a) PM1 quad tree and (b) PMR quad tree for the collection of line segments of Figure 16.5.

to overcome some of the problems associated with the edge quad tree in the representation of collections of polygons (termed polygonal maps). In particular, the edge quad tree is an approximation because vertices are represented by pixels. There are a number of variants of the PM quad tree. These variants are either vertex-based or edge-based. They are all built by applying the principle of repeatedly breaking up the collection of vertices and edges (forming the polygonal map) until obtaining a subset that is sufficiently simple so that it can be organized by some other data structure.

The PM1 quad tree [66] is an example of a vertex-based PM quad tree. Its decomposition rule stipulates that partitioning occurs as long as a block contains more than one line segment unless the line segments are all incident at the same vertex which is also in the same block (e.g., Figure 16.13a). Given a polygonal map whose vertices are drawn from a grid (say 2m × 2m), and where edges are not permitted to intersect at points other than the grid points (i.e., vertices), it can be shown that the maximum depth of any leaf node in the PM1 quad tree is bounded from above by 4m + 1 [64]. This enables a determination of the maximum amount of storage that will be necessary for each node.

A similar representation has been devised for three-dimensional images (e.g., [3] and the references cited in [63]). The decomposition criteria are such that no node contains more than one face, edge, or vertex unless the faces all meet at the same vertex or are adjacent to the same edge. This representation is quite useful since its space requirements for polyhedral objects are significantly smaller than those of a region octree.

The PMR quad tree [54] is an edge-based variant of the PM quad tree. It makes use of a probabilistic splitting rule. A node is permitted to contain a variable number of line segments. A line segment is stored in a PMR quad tree by inserting it into the nodes corresponding to all the blocks that it intersects. During this process, the occupancy of each node that is intersected by the line segment is checked to see if the insertion causes it to exceed a predetermined splitting threshold. If the splitting threshold is exceeded, then the node’s block is split once, and only once, into four equal quadrants.

For example, Figure 16.13b is the PMR quad tree for the collection of line segment objects in Figure 16.5 with a splitting threshold value of 2. The line segments are inserted in alphabetic order (i.e., a–i). It should be clear that the shape of the PMR quad tree depends on the order in which the line segments are inserted. Note the difference from the PM1 quadtree in Figure 16.13a – that is, the NE block of the SW quadrant is decomposed in the PM1 quad tree while the SE block of the SW quadrant is not decomposed in the PM1 quad tree.

On the other hand, a line segment is deleted from a PMR quad tree by removing it from the nodes corresponding to all the blocks that it intersects. During this process, the occupancy of the node and its siblings is checked to see if the deletion causes the total number of line segments in them to be less than the predetermined splitting threshold. If the splitting threshold exceeds the occupancy of the node and its siblings, then they are merged and the merging process is reapplied to the resulting node and its siblings. Notice the asymmetry between the splitting and merging rules.

The PMR quad tree is very good for answering queries such as finding the nearest line to a given point [34–37] (see [38] for an empirical comparison with hierarchical object representations such as the R-tree and R+-tree). It is preferred over the PM1 quadtree (as well as the MX and edge quadtrees) as it results in far fewer subdivisions. In particular, in the PMR quadtree there is no need to subdivide in order to separate line segments that are very “close” or whose vertices are very “close,” which is the case for the PM1 quad tree.

This is important since four blocks are created at each subdivision step. Thus when many subdivision steps that occur in the PM1 quad tree result in creating many empty blocks, the storage requirements of the PM1 quad tree will be considerably higher than those of the

PMR quad tree. Generally, as the splitting threshold is increased, the storage requirements of the PMR quad tree decrease while the time necessary to perform operations on it will increase.

Using a random image model and geometric probability, it has been shown [48], theoretically and empirically using both random and real map data, that for sufficiently high values of the splitting threshold (i.e., 4), the number of nodes in a PMR quad tree is asymptotically proportional to the number of line segments and is independent of the maximum depth of the tree. In contrast, using the same model, the number of nodes in the PM1 quad tree is a product of the number of lines and the maximal depth of the tree (i.e., n for a 2n × 2n image). The same experiments and analysis for the MX quad tree confirmed the results predicted by the Quad tree Complexity Theorem (see Section 16.4) which is that the number of nodes is proportional to the total length of the line segments.

Observe that although a bucket in the PMR quad tree can contain more line segments than the splitting threshold, this is not a problem. In fact, it can be shown [63] that the maximum number of line segments in a bucket is bounded by the sum of the splitting threshold and the depth of the block (i.e., the number of times the original space has been decomposed to yield this block).

Comments

Popular posts from this blog

Binary Space Partitioning Trees:BSP Tree as a Hierarchy of Regions.

Data Structure Visualization:Introduction and Value of Data Structure Rendering

0/1 Knapsack Problem Memory function.