B-Trees: Further Discussions

Further Discussions

In this section we discuss several further issues related to the B-tree and the B+-tree.

Efficiency Analysis

Theorem: In the B-tree or the B+-tree, the I/O cost of insertion, deletion, and exact-match query is O(log_B n). In the B+-tree, the I/O cost of a range search is O(log_B n + t/B). Here B is the minimum page capacity in number of records, n is the total number of objects in the tree, and t is the number of objects in the range query result.

The correctness of the theorem follows from the discussion of the algorithms. For both the B-tree and the B+-tree, the insertion, deletion, and exact-match query algorithms each examine a single path from root to leaf. At each node, the algorithm might examine up to two other nodes, but asymptotically the complexity of these algorithms is equal to the height of the tree. Since there are n objects and the minimum fan-out of the tree is B, the height of the tree is O(log_B n). So the complexity of the algorithms is O(log_B n) as well.

For the range query in the B+-tree, O(log_B n) nodes are examined to find the leaf node that contains the low endpoint of the query range. By following the sibling pointers in the leaf nodes, the other leaf nodes that contain objects in the query range are also found. Among all the leaf nodes examined, every node except the first and the last contains at least B objects in the query result. Thus if there are t objects in the query result, the range query complexity is O(log_B n + t/B).
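To make these bounds concrete, here is a small Python sketch that evaluates the two cost formulas; it is illustrative only, and the sample values of n, B, and t are assumptions chosen for the example.

import math

def exact_match_io(n, B):
    # Height of the tree, and hence the I/O cost of insert/delete/exact-match: O(log_B n).
    return math.ceil(math.log(n, B))

def range_query_io(n, B, t):
    # Range search: walk down the tree, then scan about t/B consecutive leaf pages.
    return exact_match_io(n, B) + math.ceil(t / B)

# Example (assumed values): 300 million objects, B = 100,
# and a range query whose result contains t = 10,000 objects.
print(exact_match_io(300_000_000, 100))          # 5 page accesses
print(range_query_io(300_000_000, 100, 10_000))  # 5 + 100 = 105 page accesses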

Why is the B+-tree Widely Accepted?

One can safely claim that the B+-tree has been included in at least 99%, if not all, of database management systems (DBMSs). No other index structure has received so much attention. Why is that?

Let’s do some calculation. First, we point out that a practical value for the minimum occupancy of a B+-tree is B = 100, so the fan-out of the tree is between 100 and 200. Analysis has shown that in a real-world B+-tree a page is on average about 66.7% full; that is, a page typically contains 200 * 66.7% ≈ 133 entries. Here is the relationship between the height of the tree and the number of objects a typical B+-tree can hold:

1. height=0: B+-tree holds 133 objects on average. There is a single node, which is 66.7% full.

2. height=1: B+-tree holds 133^2 = 17,689 objects. There are 133 leaf nodes, each holding 133 objects.

3. height=2: B+-tree holds 133^3 = 2,352,637 objects.

4. height=3: B+-tree holds 133^4 = 312,900,721 (over 0.3 billion) objects.

The first two levels of the B+-tree contain 1 + 133 = 134 disk pages. This is very small. If a disk page is 4KB, these 134 pages occupy 134 * 4KB = 536KB of disk space. It is quite reasonable to assume that the first two levels of the B+-tree always stay in memory.

The calculations lead to this discovery: in a large database with 0.3 billion objects, to find one object we need to access only two disk pages! This is unbelievably good.
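The arithmetic above is easy to reproduce. The short Python sketch below uses the same assumed numbers (pages holding up to 200 entries and about 66.7% full, 4KB pages):

# Pages hold up to 200 entries and are on average about 66.7% full.
fanout = round(200 * 0.667)                # about 133 entries per page

for height in range(4):
    print(f"height={height}: holds {fanout ** (height + 1):,} objects")
# height=0: 133   height=1: 17,689   height=2: 2,352,637   height=3: 312,900,721

# The top two levels (root plus its children) are small enough to cache in memory.
top_two_levels = 1 + fanout                # 134 pages
print(f"top two levels: {top_two_levels} pages, {top_two_levels * 4} KB")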

Bulk-Loading a B+-tree

In some cases, we are given a large set of records and are asked to build a B+-tree index on them. Of course, we can start with an empty B+-tree and insert one record at a time using the Insert algorithm. However, this approach is not efficient, as the I/O cost is O(n · log_B n).

Many systems have implemented a bulk-loading utility. The idea is as follows. First, sort the objects. Use the objects to fill in leaf nodes in sequential order. For instance, if a leaf node holds up to 2B objects, the 2B smallest objects are stored in page 1, the next 2B objects are stored in page 2, and so on. Next, build the index nodes one level up. Assume an index node holds up to 2B child pointers. Create the first index node as the parent of the first 2B leaf nodes, the second index node as the parent of the next 2B leaf nodes, and so on. Then build the index nodes two levels above the leaf level, and so on. The process stops when there is only one node at a level. This node is the tree root.
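The following Python sketch illustrates this bottom-up construction. It is only a sketch under simplifying assumptions: the Node class and the capacity parameter are made up for the example, and details such as separator keys, leaf sibling pointers, and minimum-occupancy guarantees are omitted.

class Node:
    def __init__(self, children, is_leaf):
        self.children = children   # objects (in a leaf) or child nodes (in an index node)
        self.is_leaf = is_leaf

def bulk_load(objects, capacity):
    # Sort the objects, then fill leaf pages from left to right.
    objects = sorted(objects)
    level = [Node(objects[i:i + capacity], is_leaf=True)
             for i in range(0, len(objects), capacity)]
    # Build index levels bottom-up until a single node (the root) remains.
    while len(level) > 1:
        level = [Node(level[i:i + capacity], is_leaf=False)
                 for i in range(0, len(level), capacity)]
    return level[0]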

If the objects are sorted already, the bulk-loading algorithm has an I/O cost of O(n/B).

Otherwise, the bulk-loading algorithm has asymptotically the same I/O cost as external sort, which is O(n/B · log_B n). Notice that even though the bulk-loading algorithm performs sorting first, it is still B times faster than inserting the objects one at a time into the structure.
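A quick back-of-the-envelope check of the "B times faster" claim, using the simplified cost formulas above (the concrete numbers are assumptions for illustration):

import math

n, B = 300_000_000, 100
one_at_a_time = n * math.log(n, B)          # O(n * log_B n) I/Os
bulk_with_sort = (n / B) * math.log(n, B)   # O(n/B * log_B n) I/Os
print(one_at_a_time / bulk_with_sort)       # the ratio is exactly B = 100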

Aggregation Query in a B+-tree

The B+-tree can also be used to answer the aggregation query: “given a key range R, find the aggregate value of the objects whose keys are in R”. Standard SQL supports the following aggregation operators: COUNT, SUM, AVG, MIN, MAX. For instance, the COUNT operator returns the number of objects in the query range. Since AVG can be computed as SUM/COUNT, we focus on the other four aggregate operators.

Since the B+-tree efficiently supports the range query, it makes sense to utilize it to answer the aggregation query as well. Let’s first look at some concepts.

Associated with each aggregate operator, there is an init value and an aggregate function. The init value is the aggregate of an empty set of objects. For instance, the init value for the COUNT operator is 0. The aggregate function computes the aggregate value. There are two versions. One version takes the aggregate values of object sets S1 and S2 and computes the aggregate value of the set S1 ∪ S2. The other version takes the aggregate value of a set S1 and an object o and computes the aggregate value of S1 ∪ {o}. For instance, if we know the COUNT of S1 and the COUNT of S2, the COUNT of S1 ∪ S2 is simply the sum of the two.
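For concreteness, the init values and per-object aggregate functions of the four remaining operators can be written down as follows; this is a sketch in Python, and the names and layout are assumptions rather than a reproduction of the original table.

import math

# For each operator: (init value, function combining an aggregate value with one object o).
AGGREGATES = {
    "COUNT": (0,         lambda agg, o: agg + 1),
    "SUM":   (0,         lambda agg, o: agg + o),
    "MIN":   (math.inf,  lambda agg, o: min(agg, o)),
    "MAX":   (-math.inf, lambda agg, o: max(agg, o)),
}

# The set-combining version is analogous: COUNT and SUM combine two aggregate values
# by addition, MIN by min, and MAX by max.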


The B+-tree can support the aggregation query in the following way. We keep a temporary aggregate value, which is initially set to the init value. A range search is performed on the B+-tree. For each object found, its value is aggregated into the temporary aggregate value on the fly. When all objects whose keys are in the query range have been processed, the temporary aggregate value is returned.

However, this approach is not efficient, as the I/O cost is O(log_B n + t/B), which grows linearly with the number of objects in the query range (divided by B). If the query range is large, the algorithm needs to access too many disk pages. Ideally, we would like an approach whose query performance is independent of the number of objects in the query range.

A better way is to store local aggregate values in the tree. In more detail, along with each child pointer, we store the aggregate value of all objects in the corresponding sub-tree. By doing so, if the query range fully contains the key range of a sub-tree, we can take the associated local aggregate value and avoid browsing the sub-tree. We call such a B+-tree with extra aggregate information an aggregation B+-tree. The algorithm to perform an aggregation query using the aggregation B+-tree is shown below.

Algorithm Aggregation(x, R)

Input: a node x of an aggregation B+-tree, the query key range R.

Action: Among objects in the sub-tree rooted at x, compute the aggregate value of those whose keys belong to R.

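One way to sketch this algorithm in Python is the following. It is a sketch only: the node attributes is_leaf, entries, child, key_range, and aggr are assumed names for the node layout (the text itself only mentions x.child[i].aggr), and init/combine denote the init value and aggregate function of the chosen operator.

def aggregation(x, R, init, combine):
    # Aggregate the values of objects under node x whose keys fall in R = (lo, hi).
    lo, hi = R
    agg = init
    if x.is_leaf:
        # Leaf: aggregate the qualifying objects directly.
        # (For COUNT, combine with 1 per object instead of the object's value.)
        for key, value in x.entries:
            if lo <= key <= hi:
                agg = combine(agg, value)
        return agg
    for c in x.child:
        c_lo, c_hi = c.key_range          # key range covered by c's sub-tree
        if c_hi < lo or hi < c_lo:
            continue                      # group (3): disjoint from R, skip the sub-tree
        if lo <= c_lo and c_hi <= hi:
            agg = combine(agg, c.aggr)    # group (2): fully contained, use the stored aggregate
        else:
            # group (1): partial overlap, recurse into the child
            agg = combine(agg, aggregation(c, R, init, combine))
    return agg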

The algorithm starts by examining the root node. Here the child pointers are divided into three groups. (1) There are at most two child pointers whose key ranges intersect (but are not fully contained in) the query range R. (2) The child pointers between them have key ranges fully contained in R. (3) The child pointers outside of them have key ranges that do not intersect R.

For child pointers in group (2), the local aggregate value stored at the child pointer (represented by x.child[i].aggr) is aggregated into the temporary aggregate value, and the examination of the sub-tree is avoided. This is shown in step 3(a)i of the algorithm. For child pointers in group (3), no object in the corresponding sub-trees can contribute to the query result, so the examination of these sub-trees is also avoided.

For each of the (at most two) child pointers whose key ranges intersect R, a recursive call to Aggregation is performed. This is shown in step 3(a)ii of the algorithm. If we go one level down, in each of the two child nodes there can be at most one child pointer whose key range partially intersects R. Take the left child node as an example. If there is a child pointer whose key range partially intersects R, all child pointers to its left have key ranges outside of R and all child pointers to its right have key ranges fully contained in R. Thus the algorithm examines at most two paths from root to leaf.

Theorem: The I/O cost of the Aggregation query algorithm is O(log_B n).

The above theorem shows that the aggregation query performance is independent of the number of objects in the query range.
