Splay Trees

Introduction
In this chapter we discuss the following topics:
1. Introduction to splay trees and their applications
2. Splay trees: description, analysis, algorithms, and optimality.
3. Linking and Cutting Trees
4. Case Study: Application to Network Flows
5. Variants of Splay Trees.
There are various data structures, like AVL trees, red-black trees, and 2-3 trees (Chapter 10), which support operations like insert, delete (including deleting the minimum item), and search (or membership) in O(log n) time each. Splay trees, introduced by Sleator and Tarjan [13, 15], support all these operations in O(log n) amortized time, which roughly means that, starting from an empty tree, a sequence of m of these operations will take O(m log n) time (deterministic); an individual operation may take either more or less time (see Theorem 12.1). We discuss some applications in the rest of this section.
Assume that we are searching for an item in a “large” sorted file. If the item is in the kth position, then we can find it in O(log k) time by exponential and binary search. Similarly, finger search trees (Chapter 11) can be used to search for any item at distance f from a finger in O(log f) time. Splay trees can search (again in the amortized sense) for an item at distance f from any finger, which need not even be specified, in O(log f) time (see Section 12.4.2). Since the finger need not be specified, the time taken is the minimum over all possible fingers (again in the amortized sense).
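The exponential-plus-binary search mentioned above can be sketched as follows: a doubling phase finds a window of size O(k) containing the item, and a binary search within that window finishes in O(log k) total comparisons. This is a minimal Python sketch on a sorted list; the function name and the use of the standard `bisect` module are our own choices, not from the chapter.

```python
import bisect

def exponential_search(a, x):
    """Search sorted list a for x in O(log k) comparisons, where k is
    the position of x. Returns the index of x in a, or -1 if absent."""
    if not a:
        return -1
    # Doubling phase: grow hi until a[hi] >= x (or we run off the end),
    # so x, if present, lies in the window [hi // 2, hi].
    hi = 1
    while hi < len(a) and a[hi] < x:
        hi *= 2
    lo = hi // 2
    hi = min(hi + 1, len(a))
    # Binary search restricted to the O(k)-sized window.
    i = bisect.bisect_left(a, x, lo, hi)
    return i if i < len(a) and a[i] == x else -1
```

The doubling phase takes O(log k) steps to reach a window of size O(k), and the binary search over that window adds another O(log k), giving O(log k) overall.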
If we know the frequency or probability of access of each item, then we can construct an optimum binary search tree (Chapter 14) for these items; the total time over all accesses will be the smallest for the optimal binary search tree. If we do not know the probabilities (or access frequencies) and we use splay trees, the total time taken for all accesses will still be the same as that for the optimal binary search tree, up to a multiplicative constant (see Section 12.4.1).
In addition, splay trees can be used almost as a “black box” in linking and cutting trees (see Section 12.5). Here we need the ability to add (or subtract) a number to the key values of all ancestors of a node x.
Moreover, in practice, the re-balancing operations (rotations) are much simpler than those in height-balanced trees. Hence, in practice, we can also use splay trees as an alternative to height-balanced trees (like AVL trees, red-black trees, and 2-3 trees) if we are interested only in the total time. However, some experimental studies [3] suggest that, for random data, splay trees outperform balanced binary trees only for highly skewed data; and for applications like “vocabulary accumulation” of English text [16], even standard binary search trees, which do not have good worst-case performance, outperform both balanced binary trees (AVL trees) and splay trees. In any case, the constant factor and the algorithms are not simpler than those for the usual heap; hence it will not be practical to use splay trees for sorting (say, as in heap sort), even though the resulting algorithm will take O(n log n) time, unless the data has some degree of pre-sortedness, in which case splay sort is a practical alternative [10]. Splay trees, however, cannot be used in real-time applications, since an individual operation may be slow.

Splay trees can also be used for data compression. As splay trees are binary search trees, they can be used directly [4] with guaranteed worst-case performance. They are also used in data compression with some modifications [9]. Routines for data compression can be shown to run in time proportional to the entropy of the input sequence [7], both for usual splay trees and for their variants.
Splay Trees
Let us assume that for each node x, we store a real number key(x).
In any binary search tree, the left subtree of a node x contains items with key values less than key(x), and the right subtree of x contains items with key values larger than key(x).
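This invariant, and the standard descent it supports, can be sketched as follows. This is a minimal Python sketch; the Node fields and helper names are our own, not from the chapter.

```python
class Node:
    """A binary-search-tree node with a parent pointer."""
    def __init__(self, key):
        self.key = key
        self.left = None     # subtree of keys < self.key
        self.right = None    # subtree of keys > self.key
        self.parent = None

def bst_insert(root, key):
    """Insert key (if absent) and return the root of the tree."""
    if root is None:
        return Node(key)
    x = root
    while True:
        if key < x.key:
            if x.left is None:
                x.left = Node(key)
                x.left.parent = x
                break
            x = x.left
        elif key > x.key:
            if x.right is None:
                x.right = Node(key)
                x.right.parent = x
                break
            x = x.right
        else:
            break  # key already present
    return root

def bst_search(root, key):
    """Standard descent: go left on smaller keys, right on larger.
    Returns the node holding key, or None if absent."""
    x = root
    while x is not None and key != x.key:
        x = x.left if key < x.key else x.right
    return x
```

Both routines take time proportional to the depth of the node reached, which is the quantity splaying is designed to keep small in the amortized sense.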
In splay trees, we first search for the query item, say x, as in a usual binary search tree: compare the query item with the value in the root; if it is smaller, recursively search in the left subtree; if it is larger, recursively search in the right subtree; and if it is equal, we are done. Then, informally speaking, we look at each disjoint pair of consecutive ancestors of x, say y = parent(x) and z = parent(y), and perform a certain pair of rotations. As a result of these rotations, x comes in place of z.
In case x has an odd number of proper ancestors, there is a terminal case in which x has become a child of the root and has to be dealt with separately: we rotate the edge between x and the root. This step is called the zig step (see Figure 12.1).
If x and y are both left or both right children of their respective parents, then we first rotate the edge between y and its parent z and then the edge between x and its parent y. This step is called the zig-zig step (see Figure 12.2).
If x is a left (respectively right) child and y is a right (respectively left) child, then we first rotate the edge between x and y and then the edge between x and z. This step is called the zig-zag step (see Figure 12.3).
These rotations together not only make x the new root, but also, roughly speaking, halve the depth (length of the path to the root) of all ancestors of x in the tree. If the node x is at depth d, splay(x) takes O(d) time, i.e., time proportional to that of accessing the item in node x.
Formally, splay(x) is a sequence of rotations which are performed (as follows) until x becomes the root:
• If parent(x) is the root, then we carry out the usual rotation; see Figure 12.1.
• If x and parent(x) are both left (or both right) children of their parents, then we first rotate at y = parent(x) (i.e., the edge between y and its parent) and then rotate at x; see Figure 12.2.
• If x is a left (respectively right) child but parent(x) is a right (respectively left) child of its parent, then we first rotate at x and then again rotate at x; see Figure 12.3.
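The three cases above can be sketched as a bottom-up splay on nodes carrying parent pointers. This is a minimal Python sketch, assuming a Node class with key, left, right, and parent fields; the field and helper names are our own, not from the chapter.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate(x):
    """Rotate the edge between x and its parent, lifting x one level."""
    y, z = x.parent, x.parent.parent
    if y.left is x:                       # x is a left child: right rotation
        y.left, x.right = x.right, y
        if y.left:
            y.left.parent = y
    else:                                 # x is a right child: left rotation
        y.right, x.left = x.left, y
        if y.right:
            y.right.parent = y
    y.parent, x.parent = x, z
    if z is not None:                     # re-attach x under y's old parent
        if z.left is y:
            z.left = x
        else:
            z.right = x

def splay(x):
    """Repeat zig, zig-zig, or zig-zag steps until x is the root."""
    while x.parent is not None:
        y = x.parent
        z = y.parent
        if z is None:
            rotate(x)                     # zig: parent(x) is the root
        elif (z.left is y) == (y.left is x):
            rotate(y)                     # zig-zig: rotate at y, ...
            rotate(x)                     # ... then at x
        else:
            rotate(x)                     # zig-zag: rotate at x, ...
            rotate(x)                     # ... then again at x
    return x
```

After the first rotation of a zig-zag step, z has become x's parent, so "rotate at x and then again rotate at x" rotates the edge between x and y and then the edge between x and z, exactly as described above.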