Functional Data Structures:Binary Search Trees,Path Copying

Binary Search Trees: Path Copying

Stacks are unusual in that there is never a need to update an existing node. However, for most data structures, there is such a need. For example, consider inserting an element into a binary search tree (see Chapter 3). At the very least, we need to update the new node’s parent to point to the new node. But how can we do this if nodes are immutable? The solution is a technique called path copying. To update an existing node, we copy the node and make the necessary changes in the copy. However, we then have to update the existing node’s parent in a similar fashion to point to the copy. In this way, changes propagate all the way from the site of the update to the root, and we end up copying that entire path. That may seem like a lot of copying, but notice that all nodes not on that path are shared between the old and new versions.

To see how path copying works in practice, consider a simple implementation of integer sets as unbalanced binary search trees. Figure 40.5 shows an implementation in Haskell and Figure 40.6 shows the same implementation in Java.

The key to understanding path copying lies in the insert operation. Consider the case where the element being inserted is larger than the element at the current node. In the Java implementation, this case executes the code return new Tree(t.left,t.element,insert(x,t.right));

First, insert calls itself recursively on the right subtree, returning a pointer to the new right subtree. It then allocates a new tree node, copying the left and element fields from the old node, and installing the new pointer in the right field. Finally, it returns a pointer to the new node. This process continues until it terminates at the root. Figure 40.7 illustrates a sample insertion. Notice how the parts of the tree not on the path from the root to the site of the update are shared between the old and new trees.

This functional implementation of binary search trees has exactly the same time complexity as an ordinary non-persistent implementation. The running time of insert is still proportional to the length of the search path. Of course, the functional implementation al- locates more space, but even that issue is not clear cut. If the old tree is no longer needed, then the just-copied nodes can immediately be garbage collected, leaving a net space in- crease of one node—exactly the same space required by a non-persistent implementation. On the other hand, if the old tree is still needed, then the just-copied nodes cannot be garbage collected, but in that case we are actively taking advantage of functionality not supported by ordinary binary search trees.

image

Of course, the binary search trees described above suffer from the same limitations as ordinary unbalanced binary search trees, namely a linear time complexity in the worst case. Whether the implementation is functional or not as no effect in this regard. However, we can easily apply the ideas of path copying to most kinds of balanced binary search trees (see Chapter 10), such as AVL trees [17, 29], red-black trees [25], 2-3 trees [30], and weight-balanced trees [2]. Such a functional implementation retains the logarithmic time complexity of the underlying design, but makes it persistent.

Path copying is sufficient for implementing many tree-based data structures besides binary search trees, including binomial queues [7, 15] (Chapter 7), leftist heaps [18, 24] (Chapter 5), Patricia tries [26] (Chapter 28), and many others.

Comments

Popular posts from this blog

Data Structure Visualization:Introduction and Value of Data Structure Rendering

Collision Detection:Penetration Depth Computation

Concurrent Data Structures:Linked Lists