Managing Spatiotemporal Data:Objects
Indexing Structures for Continuously Moving Objects
Continuously moving objects pose new challenges for indexing technology in large databases. Sources of such data include GPS systems, wireless networks, and air-traffic control. The previous sections outlined indexing schemas that can efficiently index spatio-temporal data types, but other schemas have been developed specifically to answer predictive queries over a database of continuously moving objects. This section discusses an assortment of such indexing schemas.
Databases of continuously moving objects face two kinds of indexing issues: storing the historical movements of objects over time, and predicting the movement of objects based on previous positional and directional information. Such predictions are more reliable for a future time Tf that is not far from the current timestamp Tc. As Tf moves further from Tc, the predictions become less and less reliable, because any change of trajectory by a moving object makes the prediction inaccurate. Traditional indexing schemas such as R*-trees are successful in storing multidimensional data points but are not directly useful for storing moving objects.
An example [21] of such a system is shown in Figure 22.14.
For simplicity, a two-dimensional space is illustrated, but practical systems can have more dimensions. Figure 22.14a shows multiple objects moving in different directions. If a traditional R*-tree is used to index these data, the minimum bounding boxes (MBBs) at the leaf level of the tree are as shown in Figure 22.14b. But these objects may follow different trajectories (as shown by the arrows in Figure 22.14a), and at subsequent timestamps the leaf-level MBBs may change in size and position, as shown in Figure 22.14c and Figure 22.14d.
Since traditional indexing methods are not designed for such applications, novel indexing schemas are described in [21-24] to handle these data types and the queries posed on them.
Saltenis et al. [21] proposed the TPR-tree (Time-Parameterized R-tree), which is based on the underlying principles of the R*-tree. The TPR-tree indexes the current and anticipated future positions of moving objects in one, two, and three dimensions. It employs the basic algorithms of the R*-tree, with the modification that the leaf and non-leaf minimum bounding rectangles are augmented with velocity vectors. The velocity vector for each edge of a rectangle is chosen so that the enclosed objects remain inside the moving rectangle. The TPR-tree can typically handle the following three types of queries:
• Timeslice Query: A query Q specified by a hyper-rectangle R located at time point t.
• Window Query: A query Q specified by a hyper-rectangle R covering the time interval [Ta, Tb].
• Moving Query: A query Q specified by hyper-rectangles Ra and Rb at different times Ta and Tb, forming a trapezoid.
Figure 22.15 shows objects o1, o2, o3 and o4 moving in time.
The trajectories of these objects change over time, as shown in the figure. The figure also illustrates the three query types described above: Q0 and Q1 are timeslice queries, Q2 and Q3 are window queries, and Q4 is a moving query.
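The three query types can be sketched for a single object with linear motion. This is a minimal illustration, not the TPR-tree search algorithm from [21]: it tests one moving point against a query instead of traversing a tree, and the names `MovingPoint`, `clip`, `timeslice_query`, `window_query`, and `moving_query` are hypothetical. It assumes the edges of a moving query are interpolated linearly between Ra at Ta and Rb at Tb.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Rect = Sequence[Tuple[float, float]]  # one (low, high) pair per dimension


@dataclass
class MovingPoint:
    pos: Sequence[float]  # position at reference time t_ref
    vel: Sequence[float]  # constant velocity per dimension
    t_ref: float = 0.0

    def at(self, t: float) -> Tuple[float, ...]:
        """Predicted position at time t under linear motion."""
        return tuple(p + v * (t - self.t_ref) for p, v in zip(self.pos, self.vel))


def clip(c0: float, c1: float, lo: float, hi: float) -> Optional[Tuple[float, float]]:
    """Return the part of [lo, hi] where c0 + c1 * t <= 0, or None if empty."""
    if c1 == 0:
        return (lo, hi) if c0 <= 0 else None
    root = -c0 / c1
    if c1 > 0:          # inequality holds for t <= root
        hi = min(hi, root)
    else:               # inequality holds for t >= root
        lo = max(lo, root)
    return (lo, hi) if lo <= hi else None


def timeslice_query(obj: MovingPoint, r: Rect, t: float) -> bool:
    """Is the object inside rectangle r at time point t?"""
    return all(lo <= x <= hi for x, (lo, hi) in zip(obj.at(t), r))


def moving_query(obj: MovingPoint, ra: Rect, rb: Rect, ta: float, tb: float) -> bool:
    """Is the object inside the query at some time in [ta, tb], where the
    query rectangle moves linearly from ra (at ta) to rb (at tb)?"""
    if ta == tb:
        return timeslice_query(obj, ra, ta)
    lo, hi = ta, tb
    for d in range(len(obj.pos)):
        xv = obj.vel[d]
        x0 = obj.pos[d] - xv * obj.t_ref          # object: x(t) = x0 + xv * t
        for side, sign in ((0, -1.0), (1, 1.0)):
            ev = (rb[d][side] - ra[d][side]) / (tb - ta)
            e0 = ra[d][side] - ev * ta            # edge: e(t) = e0 + ev * t
            # lower edge: e(t) - x(t) <= 0; upper edge: x(t) - e(t) <= 0
            span = clip(sign * (x0 - e0), sign * (xv - ev), lo, hi)
            if span is None:
                return False
            lo, hi = span
    return True


def window_query(obj: MovingPoint, r: Rect, ta: float, tb: float) -> bool:
    """A window query is a moving query whose rectangle does not move."""
    return moving_query(obj, r, r, ta, tb)
```

Because both the object and the query edges are linear in time, every containment condition is a linear inequality in t, so the qualifying times form a single interval that can be narrowed one constraint at a time.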
The structure of the TPR-tree is very similar to that of the R*-tree. Leaf entries consist of the position of a moving object and a pointer to it. Internal nodes consist of pointers to subtrees and bounding rectangles for the entries in each subtree. The TPR-tree stores moving objects as linear functions of time, with time-parameterized bounding rectangles. The index does not hold points or rectangles for timestamps older than the current time. The TPR-tree differs from the R*-tree in how its insertion algorithms group points into nodes. While the R*-tree uses heuristics that minimize the area, overlap, and margin of bounding rectangles to assign points to nodes, the TPR-tree replaces these heuristics with their respective time integrals, which capture their temporal component. Given an objective function F(t), the following integral is to be minimized [21]:

$$\int_{t_c}^{t_c + H} F(t)\, dt$$

where tc is the current time and H is the time horizon.
The objective function can be the area or perimeter of the bounding rectangles, or the overlap between these rectangles. Figure 22.16 shows a bounding interval and a query in the TPR-tree. The area of the shaded region in Figure 22.16 is the time integral of the length of the bounding interval.
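For the one-dimensional case, the time integral of the bounding-interval length has a simple closed form. The sketch below is an illustration under stated assumptions, not the algorithm of [21]: it uses one simple conservative choice of edge velocities (the minimum and maximum object velocities), and the function names `bounding_interval`, `interval_at`, and `length_integral` are hypothetical.

```python
from typing import Iterable, Tuple

Bound = Tuple[float, float, float, float]  # (lo, hi, v_lo, v_hi) at time tc


def bounding_interval(points: Iterable[Tuple[float, float, float]], tc: float) -> Bound:
    """Conservative time-parameterized bounding interval computed at time tc.

    Each point is an (x, v, t_ref) triple: position x at time t_ref, velocity v.
    The lower edge moves with the smallest velocity and the upper edge with the
    largest, so every point stays inside the interval for all t >= tc.
    """
    pts = list(points)
    xs = [x + v * (tc - t_ref) for x, v, t_ref in pts]
    vs = [v for _, v, _ in pts]
    return min(xs), max(xs), min(vs), max(vs)


def interval_at(bound: Bound, tc: float, t: float) -> Tuple[float, float]:
    """Position of the two interval edges at a time t >= tc."""
    lo, hi, v_lo, v_hi = bound
    dt = t - tc
    return lo + v_lo * dt, hi + v_hi * dt


def length_integral(bound: Bound, horizon: float) -> float:
    """Integral of the interval length over [tc, tc + horizon].

    length(t) = (hi - lo) + (v_hi - v_lo) * (t - tc) is linear in t, so the
    integral is the area of a trapezoid.
    """
    lo, hi, v_lo, v_hi = bound
    return (hi - lo) * horizon + 0.5 * (v_hi - v_lo) * horizon ** 2
```

For two points at positions 0 and 2 with velocities 1 and -1, the interval starts with length 2 and grows at rate 2, so over a horizon of 4 the integral is 2 · 4 + 0.5 · 2 · 16 = 24. An insertion heuristic of the kind described above would prefer the grouping that keeps this value small.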
Saltenis et al. [21] compared the performance of the TPR-tree with load-time bounding rectangles, the TPR-tree with update-time bounding rectangles, and the R-tree in a set of experiments with varying workloads. The results showed that the TPR-tree outperforms the other approaches by a considerable margin. They also showed that the tree does not degrade severely in performance as time passes, and that it can be tuned to take advantage of a specific update rate.
Saltenis and Jensen [22] proposed the REXP-tree, a balanced, multi-way tree with the structure of an R*-tree. The REXP-tree improves on the TPR-tree by assuming that some of the indexed objects expire after a certain period. It can therefore handle realistic scenarios in which certain objects are no longer needed, that is, when they expire. By removing expired entries and recomputing bounding rectangles, the index reorganizes itself to handle subsequent queries efficiently. This tree structure is useful in applications where objects that have not reported their position for a certain period are presumably no longer interested in the service.
The index structure of the REXP-tree differs from that of the TPR-tree in its insertion and deletion algorithms, which dispose of expired entries. The REXP-tree uses a ‘lazy strategy’ for deleting expired entries. Another possible strategy is the scheduled deletion of entries in TPR-trees. During search, insertion, and deletion operations, only live entries are considered, and expired entries are physically removed when the contents of a node are modified and the node is written to disk. Whenever an entry in an internal node is deleted, the entire subtree is reallocated. The performance results in [22] show that choosing the right bounding rectangles and the corresponding algorithms for grouping entries is not straightforward and depends on the characteristics of the workload.
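The lazy strategy can be sketched on a single node. This is a simplified illustration, not the REXP-tree algorithms of [22]: the `Entry` and `Node` classes and their fields are hypothetical, bounding rectangles are omitted, and an in-memory list stands in for a disk page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    obj_id: int
    t_exp: float  # time after which the entry is considered expired


@dataclass
class Node:
    entries: List[Entry] = field(default_factory=list)

    def live(self, now: float) -> List[Entry]:
        """Entries that have not expired at time `now`."""
        return [e for e in self.entries if e.t_exp > now]

    def search(self, now: float) -> List[int]:
        """Read-only operation: expired entries are skipped but stay stored."""
        return [e.obj_id for e in self.live(now)]

    def insert(self, entry: Entry, now: float) -> None:
        """The node is modified and rewritten anyway, so expired entries are
        physically dropped at the same time."""
        self.entries = self.live(now)
        self.entries.append(entry)
```

A search at time 7 over entries expiring at times 5 and 10 reports only the second one, yet both remain stored until the next modification of the node. Under this scheme no separate pass is needed just to purge expired entries.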
STAR-tree
Procopiuc, Agarwal and Har-Peled [23] proposed the Spatiotemporal Self-Adjusting R-tree, or STAR-tree. The STAR-tree indexing schema is similar to the TPR-tree, with a few differences. Specifically, the STAR-tree groups points according to their current locations, which may result in points moving with different velocities being placed in the same rectangle. Scheduled events are used to regroup points and so control the growth of such bounding rectangles. The STAR-tree improves on the structure of the TPR-tree by adjusting the index itself whenever index performance degrades. No user intervention is needed to adjust the index, and query time is kept low even without continuously updating the index with the positions of the objects. The STAR-tree needs neither periodic rebuilding of the index nor an estimate of the time horizon. It provides tradeoffs between storage and query performance, and between the time spent updating the index and the time spent answering queries. The STAR-tree can handle not only the timeslice and range queries handled by TPR-trees, but also nearest-neighbor queries for continuously moving objects.
The TPR*-tree, proposed by Tao, Papadias and Sun [24], is an optimized spatio-temporal indexing method for predictive queries. The work on the TPR-tree, described above, does not provide an analytical model for cost estimation, query optimization, or quantification of its performance. The TPR*-tree is built on a probabilistic model that accurately estimates the number of disk accesses needed to answer a window query in a spatio-temporal index. Using this model, the authors of [24] investigate the optimal performance of any data-partition index.
The TPR*-tree improves on the performance of the TPR-tree by employing a new set of insertion and deletion algorithms that minimize the average number of node accesses for answering a window query whose MBB is uniformly distributed in the data space. The static point interval query with the following constraints is the one optimized in [24] using the TPR*-tree:
• MBB has a length |QR| = 0 on each axis.
• Velocity bounding rectangle is {0,0,0,0}.
• Query interval QI = [0, H], where H is the horizon parameter.
It is demonstrated in [24] that this choice of parameters leads to nearly optimal performance independently of the query parameters. The experiments also show that the TPR*-tree significantly outperforms the conventional TPR-tree under all conditions.