Managing Spatiotemporal Data:3D R-tree
3D R-tree
In the previous section a space driven, multi-version linear Quad tree based spatiotemporal data structure was presented. In this section we discuss a data driven data structure that partitions the set of objects for efficient spatiotemporal representation.
Theidoridis, Vazirgiannis and Sellis in [9] have proposed a data structure termed 3D R- tree for indexing spatiotemporal information for large multimedia applications. The data structure is a derivation of R-trees (Chapter 21) whose variants have been demonstrated to be an efficient indexing schema for high-dimensional spatial objects.
In [9] Theidoridis et al. have adopted a set of operators defined in [10] to represent possible topological-directional spatial relationships between two 2-dimensional objects. An anthology of 169 relationships Ri j (i ∈ [1..13],j ∈ [1..13]) can represent a complete set of spatial operators.
Figure 22.4 represents these relations.
An interested reader can find a complete illustration of these topographical relations relationships in [10]. To represent the temporal relationships, a set of temporal operators defined in [11] are employed. Any spatiotemporal relationships among objects can be found using these operators. For Example, object Q to appear 7 seconds after the object P , 14cm to the right and 2cm down the right bottom vertex of object P can be represented as the following composition tuple:
where r13 13is the corresponding spatial relationship, (−7− >) is the temporal relationship between the objects, v3 and v4 are the named vertices of the objects while (14, 2) are their spatial distances on the two axes.
Theidoridis et al. have employed the following typical spatiotemporal relationships to illustrate their indexing schema [9]. These relationships can be defined as spatiotemporal operators.
• overlap during(a,b): returns the catalog of objects a that spatially overlap object b during its execution.
• overlap before(a,b): returns the catalogue of objects a that spatially overlap object b and their execution terminates before the start of execution of b.
• above during(a,b): returns the catalogue of objects a that spatially lie above object b during the course of its execution.
• above before(a,b): returns the catalogue of objects a that spatially lie above object b and their execution terminates before the start of execution of b.
Spatial and Temporal features of objects are typically identified by a six dimensions (each spatiotemporal object can be perceived as a point in a 6-dimensional space):
In a naıve approach, these object descriptors coupled by the object id (unique for the object that they represent) can be stored sequentially in a database. An illustration of such an organization is demonstrated in Figure 22.5.
Such a sequential schema has obvious demerits. Answering of spatial temporal queries, such as one described above, would require a full scan of the data organization, at least once. As indicated before, most spatiotemporal databases suffer from the curse of dimensionality
and sequential organization of data exhibits this curse through depreciated performance, rather than reducing it.
In another schema, two indices can be maintained to store spatial and temporal components separately. Specifically, they can be organized as follows.
1. Spatial index: An index to store the size and coordinates of the objects in two dimensions
2. Temporal index: A one-dimensional index for storing the duration and start/stop time for objects in one dimension.
R-trees and their variants have been demonstrated to be efficient for storing n-dimensional data. Generally speaking, they could be used to store the spatial space components in a 2D R-tree and temporal components in a 1D R-tree. This schema is better than sequential search, since a tree structure provides a hierarchical organization of data leading to a logarithmic time performance. More details on R-trees and its variants can be found in Chapter 21.
Although this schema is better than sequential searching, it still suffers from a limitation [9]. Consider the query overlap during, which would require that both the indices (spatial 2D R-tree and temporal 1D R-tree) are searched individually (in the first phase) and then the intersection of the recovered answer sets from each of the indices is reported as the index’s response to the query. Access to both the indices individually and then post- intersection can cumulatively be a computationally expensive procedure, especially when each of these indices are dense. Spatial-joins [9,12] have been proposed to handle queries on two indexes, provided these indexing schemas adopt the same spatial data structure. It is not straightforward to extend these to handle joins in two varieties of spatial data structures. Additionally, there might be possible inherent relationships between the spatial descriptors and the temporal descriptors resident in two different indexes which can be learned and exploited for enhanced performance. Arranging and searching these descriptors separately may not be able to exploit these relationships. Hence, a unified framework is needed to present spatiotemporal components preferably in a same indexing schema.
Before we proceed further, let us briefly discuss the similarity search procedure in R-trees. In R-trees Minimum bounding boxes (MBB); or minimum bounding rectangles in two- dimensions) are used to assign geometric descriptors to objects for similarity applications, especially in data mining [14]. The idea behind usage of MBB in R-trees is the following. If two objects are disjoint, then their MBBs should be disjoint and if two objects overlap, then their MBB should definitely overlap. Typically, a spatial query on a MBB based index involves the following steps.
1. Searching the index: This step is used to select the answer and some possible false alarms from a given data set, ignoring those records that cannot possibly satisfy the query criterion.
2. Dismissals of false alarms: The input to this step is the resultant of the index searching step. In this step the false alarms are eliminated to identify correct answers, which are reported as the query response.
Designing an indexing schema for a multimedia application requires design of a spatiotemporal indexing structure to support spatiotemporal queries. Consider the following scenario.
Example 22.1
An application starts with a video clip A located at point (1,7) relative to the application origin Θ. After 2 minutes an image B appears inside A with 1 unit above its lower horizontal edge and 3 units after its left vertical edge. B disappears after 2 minutes of its presence, while A continues. After 2 minutes of B’s disappearance, an text window C appears 3 units below the lower edge of A and 4 units to the right of left edge of it. A then disappears after 1 minute of C’s appearance. The moment A disappears, a small image D appears 2 units to the right of right edge of C and 3 units above the top edge of C. C disappears after 2 minutes of D’s appearance. As soon as C disappears, a text box E appears 2 units below the lower edge of D and left aligned with it. E lasts for 4 minutes after which it disappears. D disappears 1 minute after E’s disappearance. The application ends with D’s disappearance.
The spatial layout of the above scenario is presented in Figure 22.6a and the temporal layout is presented in Figure 22.6b.
FIGURE 22.6: (a) Spatial layout of the multimedia database (b) Temporal layout of the multimedia database.
A typical query that can be posed on such a database of objects described above is “Which objects overlap the object A during its presentation?”. In [9], the authors have proposed a unified schema to handle queries on a spatiotemporal database, such as the one stated above. The schema amalgamates the spatial and temporal components in a single data structure such as R-trees to exhibit advantages and performance gains over the other schemas described above. This schema amalgamates need of spatial joins on two spatial data structures besides aggregating both attributes under a unified framework. The idea for representation of spatio-temporal attributes is as follows. If an object which initially lies at point (xa , ya)during time [ta, tb)and at (xb ,yb ) during [tb, tc), it can be modeled by two lines [(xa , ya, ta), (xa , ya, tb)) and [(xb, yb, tb), (xb, yb, tc)). These lines can be presented in a hierarchical R-tree index in three dimensions. Figure 22.7 presents the unified spatial- temporal schema for the spatial and temporal layouts of the example presented above.
Answering Spatio-Temporal Queries Using the Unified Schema
Answering queries on the above presented unified schema is similar to handling similarity queries using R-trees. Consider the following queries [9]:
Query-1: “Find all the objects on the screen at time T2” (Spatial layout query). This query can be answered by considering a rectangle Q1 (Figure 22.7) intersecting the time-axis at exactly one point, T2.
Query-2: “Find all the objects and their corresponding temporal duration between the time interval (T0,T1)” (temporal layout query). This query can be answered by considering a box Q2(Figure 22.7) intersecting the time-axis between the intervals (T0, T1).
After we have obtained the objects enclosed by the rectangle Q1 and box Q2, these objects are filtered (in main memory) to obtain the answer set.
Query-3: “Find all the objects and their corresponding spatial layout at time T =3 minutes”. This query can be answered by looking at the screenshot of spatial layout of objects that were present at the given instant of time. The resultant would be the list of objects and their corresponding spatial descriptors. The response to this query is given in Figure 22.8a.
FIGURE 22.7: A unified R-tree based schema for storing spatial and temporal components.
FIGURE 22.8: (a) Response to Query-3 (b) Response to Query-4.
Query-4: Find the temporal layout of all the objects between the time interval (T1 = 2, T2 = 9). This query can be answered by drawing a rectangle on the unified index with dimensions (Xmax − 0) × (Ymax − 0) × (T2 − T1). The response of the query is shown in
Figure 22.8b.
Performance Analysis of 3D R-trees
Theodoridis et al. in [9] analyzed the performance of the proposed R-trees using the expected retrieval cost metric that they presented in [15]. An interested reader is referred to [15,9] for details on this metric. Based on the analytical model of this metric, it is asserted that one can estimate the retrieval cost of an overlap query based on the information attainable from the query window and data set only. In the performance analysis it was demonstrated that since the expected retrieval cost metric expresses the expected performance of R-trees on overlapping queries, the retrieval of spatio-temporal operators using R-trees is cost-equivalent to the cost of retrieval of an overlap query using an appropriate query window Q. Rigorous analysis on 10,000 objects asserted the following conclusions in [9]:
1. For the operators with high selectivity (overlap, during, overlap during), the proposed 3D R-trees outperformed sequential search at a level of one to two orders of magnitude.
2. For operators with low selectivity (above, before, above before), the proposed 3D R-trees outperformed sequential search by factors ranging between 0.25-0.50 fraction of the sequential cost.
Handling Queries with Open Transaction-Times
In the previous section although 3D R-trees were demonstrated to be very efficient compared to sequential search, it suffers from a limitation. The transaction times presented as creation and termination time of an object are expected to be known a priori before they can be stored and queries from the index. However, in most practical circumstances the duration of existence of an object is not known. In other words, when the object is created, all we know is that it would remain valid until changed. The concept of until changes is a well- discussed issue (see [16,17]). Data structures like R-trees and also its modified form of 3D R-tree are typically not capable of handling such queries having open transaction times. In the next and the following section, we discuss two spatiotemporal data structures capable of handling queries with open transaction times.
Comments
Post a Comment