Cache-Oblivious Data Structures:Dynamic B-Trees
The van Emde Boas layout of a binary tree provides a static cache-oblivious version of B-trees. The first dynamic solution was given Bender et al. [11], and later several simplified structures were developed [10, 12, 18, 25]. In this section, we describe two of these structures [10, 18].
In this section we describe the dynamic cache-oblivious search tree structure of Brodal et al. [18]. A similar proposal was given independently by Bender et al. [12].
The basic idea in the structure is to embed a dynamic binary tree of height log N + O(1) into a static complete binary tree, that is, in a tree with 2h − 1 nodes and height h, which in turn is embedded into an array using the van Emde Boas layout. Refer to Figure 34.8.
To maintain the dynamic tree we use techniques for maintaining small height in a binary tree developed by Andersson and Lai [3]; in a different setting, similar techniques has also been given by Itai et al. [21]. These techniques give an algorithm for maintaining height log N + O(1) using amortized O(log2 N ) time per update. If the height bound is violated after performing an update in a leaf l, this algorithm performs rebalancing by rebuilding the subtree rooted at a specific node v on the search path from the root to l. The subtree is rebuilt to perfect balance in time linear in the size of the subtree. In a binary tree of perfect balance the element in any node v is the median of all the elements stored in the subtree Tv rooted in v. This implies that only the lowest level in Tv is not completely filled and the empty positions appearing at this level are evenly distributed across the level. Hence, the net effect of the rebuilding is to redistribute the empty positions in Tv . Note that this can lower the cost of future insertions in Tv , and consequently it may in the long run be better to rebuild a subtree larger than strictly necessary for reestablishment of the height bound. The criterion for choosing how large a subtree to rebuild, i.e. for choosing the node v, is the crucial part of the algorithms by Andersson and Lai [3] and Itai et al. [21]. Below we give the details of how they can be used in the cache-oblivious setting.
Structure
As mentioned, the data structure consists of a dynamic binary tree T embedded into a static complete binary tree T ! of height H, which in turn is embedded into an array using
the van Emde Boas layout.
In order to present the update and query algorithms, we define the density ρ(u) of a node u as |Tu|/|T ! |, where |Tu| and |T ! | are the number of nodes in the trees rooted in u in T and T !, respectively. In Figure 34.8, the node containing the element 4 has balance 4/7. We also define two density thresholds τi and γi for the nodes on each level i = 1, 2,... ,H (where the root is at level 1). The upper density thresholds τi are evenly space values between 3/4 and 1, and the lower density thresholds γi are evenly spaced values between 1/4 and 1/8. More precisely, τi = 3/4 + (i − 1)/(4(H − 1)) and γi = 1/4 − (i − 1)/(8(H − 1)).
Updates
To insert a new element into the structure we first locate the position in T of the new node w. If the insertion of w violates the height bound H, we rebalance T as follows: First we find the lowest ancestor v of w satisfying γi ≤ ρ(v) ≤ τi, where i is the level of v. If no ancestor v satisfies the requirement, we rebuild the entire structure, that is, T , T ! and the layout of T !: For k the integer such that 2k ≤ N < 2k+1 we choose the new height H of the tree T ! as k + 1 if N ≤ 5/4 · 2k ; otherwise we choose H = k + 2. On the other hand, if the ancestor v exists we rebuild Tv : We first create a sorted list of all elements in Tv by an in-order traversal of Tv . The l|Tv |/2lth element becomes the element stored at v, the smallest l(|Tv |− 1)/2J elements are recursively distributed in the left subtree of v, and the largest l(|Tv |− 1)/2l elements are recursively distributed in the right subtree of v.
We can delete an element from the structure in a similar way: We first locate the node w in T containing the element e to be deleted. If w is not a leaf and has a right subtree, we then locate the node w! containing the immediate successor of e (the node reached by following left children in the right subtree of w), swap the elements in w and w!, and let w = w!. We repeat this until w is a leaf. If on the other hand w is not a leaf but only has a left subtree, we instead repeatedly swap w with the node containing the predecessor of e. Finally, we delete the leaf w from T , and rebalance the tree by rebuilding the subtree rooted at the lowest ancestor v of w satisfying satisfying γi ≤ ρ(v) ≤ τi, where i is the level of v; if no such node exists we rebuild the entire structure completely.
Similar to the proof of Andersson and Lai [3] and Itai et al. [21] that updates are performed in O(log2 N ) time, Brodal et al. [18] showed that using the above algorithms, updates can be performed in amortized O(logB N + (log2 N )/B) memory transfers.
In Section 34.2, we discussed how a range query can be answered in O(logB N + K ) memory transfers on a complete tree T ! laid out using the van Emde Boas layout. Since it can be shown that the above update algorithm maintains a lower density threshold of 1/8 for all nodes, we can also perform range queries in T efficiently: To answer a range query [x1, x2] we traverse the two paths to x1 and x2 in T , as well as O(log N ) subtrees rooted in children of nodes on these paths. Traversing one subtree Tv in T incurs at most the number of memory transfers needed to traverse the corresponding (full) subtree T ! in T !. By the lower density threshold of 1/8 we know that the size of Tv is at most a factor of eight larger than the size of Tv Thus a range query is answered in O(logB N + K ) memory transfers.
THEOREM 34.4 There exists a linear size cache-oblivious data structure for storing N elements, such that updates can be performed in amortized O(logB N +(log2 N )/B) memory transfers and range queries in O(logB N + K ) memory transfers.
Using the method for moving between nodes in a van Emde Boas layout using arithmetic on the node indices rather than pointers, the above data structure can be implemented as a single size O(N ) array of data elements. The amortized complexity of updates can also be lowered to O(logB N ) by changing leaves into pointers to buckets containing Θ(log N ) elements each. With this modification a search can still be performed in O(logB N ) memory transfers. However, then range queries cannot be answered efficiently, since the O( K ) buckets can reside in arbitrary positions in memory.
The second dynamic cache-oblivious search tree we consider is based on the so-called exponential layout of Bender et al. [10]. For simplicity, we here describe the structure slightly differently than in [10].
Structure
Updates
To perform an insertion in T we first search for the leaf l where we want to perform the insertion; inserting the new element below l will increase the number of elements stored below each of the log log N + O(1) components on the path to the root, and may thus result in several components needing rebalancing (an X-component with 2X elements stored below it). We perform the insertion and rebalance the tree in a simple way as follows: We find the topmost X-component Cj on the path to the root with 2X elements below it. Then we divide these elements into two groups of X elements and store them separately in the exponential layout (effectively we split Cj with 2X elements below it into two X-components with X elements each). This can easily be done in O(X) memory transfers. Finally, we update a leaf and insert a new leaf in the X 2-component above Cj (corresponding to the two new X-components); we can easily do so in O(X) memory transfers by rebuilding it. Thus overall we have performed the insertion and rebalancing in O(X) memory transfers. The rebuilding guarantees that after rebuilding an X-component, X inserts have to be performed below it before it needs rebalancing again. Therefore we can charge the O(X) cost to the X insertions that occurred below Cj since it was last rebuilt, and argue that each insertion is charged O(1) memory accesses on each of the log log N + O(1) levels. In fact, using the same argument as above for the searching cost, we can argue that we only need to charge an insertion O(1) transfers on the last log log B − O(1) levels of T , since rebalancing on any of these levels can always be performed in O(1) memory transfers. Thus overall we perform an insertion in O(logB N ) memory transfers amortized.
Deletions can easily be handled in O(logB N ) memory transfers using global rebuilding: To delete the element in a leaf l of T we simply mark l as deleted. If l’s sibling is also marked as deleted, we mark their parent deleted too; we continue this process along one path to the root of T . This way we can still perform searches in O(logB N ) memory transfers, as long as we have only deleted a fraction of the elements in the tree. After N deletes we therefore rebuild the entire structure in O(N logB N ) memory accesses, or O(logB N ) accesses per delete operation.
Bender et al. [10] showed how to modify the update algorithms to perform updates “lazily” and obtain worst case O(logB N ) bounds.
Reducing space usage
To reduce the space of the layout of a tree T to linear we simply make room for 2 log N elements in each leaf, and maintain that a leaf contains between log N and 2 log N elements.
This does not increase the O(logB N ) search and update costs since the O(log N ) elements in a leaf can be scanned in O((log N )/B) = O(logB N ) memory accesses. However, it reduces the number of elements stored in the exponential layout to O(N/ log N ).
THEOREM 34.5 The exponential layout of a search tree T on N elements uses lin- ear space and supports updates in O(logB N ) memory accesses and searches in O(logB N ) memory accesses.
Note that the analogue of Theorem 34.1 does not hold for the exponential layout, i.e.
it does not support efficient range queries. The reason is partly that the √ –components below an X-component are not located in (sorted) order in memory because components are rebalanced by splitting, and partly because of the leaves containing Θ(log N ) elements.
However, Bender et al [10] showed how the exponential layout can be used to obtain a number of other important results: The structure as described above can easily be extended such that if two subsequent searched are separated by d elements, then the second search can be performed in O(log∗ d + logB d) memory transfers. It can also be extended such that R queries (batched searching) can be answered simultaneously in O(R logB N +SortM,B (R)) memory transfers. The exponential layout can also be used to develop a persistent B-tree, where updates can be performed in the current version of the structure and queries can be performed in the current as well as all previous versions, with both operations incurring O(logB N ) memory transfers. It can also be used as a basic building block in a linear space planar point location structure that answers queries in O(logB N ) memory transfers.
Comments
Post a Comment