Finger Search Trees: Applications
Applications
Finger search trees have, e.g., been used in algorithms within computational geometry [3, 8, 20, 24, 28, 41] and string algorithms [10, 11]. In the rest of this chapter we give examples of the efficiency that can be obtained by applying finger search trees. These examples typically allow one to save a factor of O(log n) in the running time of algorithms compared to using standard balanced search trees supporting O(log n) time searches.
Optimal Merging and Set Operations
Consider the problem of merging two sorted sequences X and Y of length n and m, respectively, where n ≤ m, into one sorted sequence of length n + m. The canonical solution is to repeatedly insert each x ∈ X into Y. This requires that Y supports searches and insertions of new elements, i.e. a suitable representation of Y is a balanced search tree. This immediately implies an O(n log m) time bound for merging. In the following we discuss how finger search trees allow this bound to be improved to O(n log(m/n)).
Hwang and Lin [27] presented an algorithm for merging two sorted sequences using optimal O(n log(m/n)) comparisons, but did not discuss how to represent the sets. Brown and Tarjan [12] described how to achieve the same bound for merging two AVL trees [1]. Brown and Tarjan subsequently introduced level linked (2,3)-trees and described how to achieve the same merging bound for them [13].
Optimal merging of two sets also follows as an application of finger search trees [26]. Assume that the two sequences are represented as finger search trees, and that we repeatedly insert the n elements from the shorter sequence into the longer sequence using a finger that moves monotonically from left to right. If the ith insertion advances the finger di positions, the total work of performing the n finger searches and insertions is O(∑ log di), where ∑ di ≤ m. By concavity of the logarithm the sum is maximized when all di = m/n, so the total work becomes bounded by O(n log(m/n)).
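The left-to-right finger merge can be sketched in Python. This is a minimal sketch in which an exponential (galloping) search over a plain sorted list stands in for the finger search tree, so each search from the current finger costs O(log di) comparisons; unlike a real finger search tree, list operations here are not O(1) amortized per insertion. The function name `finger_merge` is ours:

```python
import bisect

def finger_merge(xs, ys):
    """Merge sorted xs into sorted ys (len(xs) <= len(ys)) with a
    left-to-right finger.  Each search gallops from the current finger,
    costing O(log d_i) comparisons when the finger advances d_i
    positions."""
    out = []
    finger = 0                      # current position in ys
    for x in xs:
        # Exponential search: double the step until ys[finger+step] >= x.
        step = 1
        while finger + step < len(ys) and ys[finger + step] < x:
            step *= 2
        # Binary search inside the located range of size O(d_i).
        hi = min(finger + step, len(ys))
        pos = bisect.bisect_left(ys, x, finger, hi)
        out.extend(ys[finger:pos])  # copy the skipped block
        out.append(x)
        finger = pos
    out.extend(ys[finger:])
    return out
```

Summing the O(log di) search costs over all n insertions gives exactly the O(n log(m/n)) bound discussed above.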
Since sets can be represented as sorted sequences, the above merging algorithm immediately gives rise to optimal, i.e. O(log((n+m)!/(n! · m!))) = O(n log(m/n)) time, algorithms for set union, intersection, and difference operations [26]. For a survey of data structures for set representations see Chapter 33.
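The same monotone-finger idea yields, e.g., set intersection. In the hedged sketch below, a plain binary search restricted to the unscanned suffix stands in for the finger search (a real finger search tree would make each search cost O(log di)); the name `sorted_intersection` is ours:

```python
import bisect

def sorted_intersection(xs, ys):
    """Intersection of two sorted sequences: search for each element of
    the shorter sequence in the longer one with a finger that only moves
    right.  With finger searches the total work is O(n log(m/n))."""
    if len(xs) > len(ys):
        xs, ys = ys, xs
    out, finger = [], 0
    for x in xs:
        pos = bisect.bisect_left(ys, x, finger)  # search right of finger
        if pos < len(ys) and ys[pos] == x:
            out.append(x)
        finger = pos
    return out
```

Union and difference follow the same pattern, differing only in which elements are emitted.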
Arbitrary Merging Order
A classical O(n log n) time sorting algorithm is binary merge sort. The algorithm can be viewed as the merging process described by a balanced binary tree: Each leaf corresponds to an input element and each internal node corresponds to the merging of the two sorted sequences containing respectively the elements in the left and right subtree of the node.
If the tree is balanced then each element participates in O(log n) merging steps, i.e. the O(n log n) sorting time follows.
Many divide-and-conquer algorithms proceed as binary merge sort, in the sense that the work performed by the algorithm can be characterized by a treewise merging process. For some of these algorithms the tree determining the merges is unfortunately fixed by the input instance, and the running time using linear merges becomes O(n · h), where h is the height of the tree. In the following we discuss how finger search trees allow us to achieve O(n log n) time for unbalanced merging orders too.
Consider an arbitrary binary tree T with n leaves, where each leaf stores an element. The tree T may be arbitrarily unbalanced, and the elements may appear at the leaves in arbitrary order. Associate to each node v of T the set Sv of elements stored at the leaves of the subtree rooted at v. If we for each node v of T compute Sv by merging the two sets of the children of v using finger search trees, cf. Section 11.5.1, then the total time to compute all the sets Sv is O(n log n).
The proof of the total O(n log n) bound is by structural induction, where we show that in a tree with n leaves the total merging cost is O(log(n!)) = O(n log n). Recall that two sets of size n1 and n2 can be merged in O(log((n1+n2)!/(n1! · n2!))) time. By induction we get that the total merging cost in a subtree whose root has two children with subtrees of size n1 and n2, respectively, becomes:

O(log(n1!)) + O(log(n2!)) + O(log((n1+n2)!/(n1! · n2!))) = O(log((n1+n2)!)).
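The treewise merging can be sketched as follows. This is a minimal illustration of the structure of the computation: sorted lists stand in for finger search trees and a linear merge is used for brevity, whereas the O(n log n) bound requires the finger-based merge of Section 11.5.1. The tuple encoding of T and the name `merge_tree` are our assumptions:

```python
def merge_tree(node):
    """Compute S_v for every node of a binary tree bottom-up.
    A node is either a leaf value or a pair (left, right).  Returns the
    sorted set S_v of the root; intermediate S_v sets are computed along
    the way, one merge per internal node."""
    if not isinstance(node, tuple):          # leaf: singleton set
        return [node]
    left = merge_tree(node[0])
    right = merge_tree(node[1])
    # Merge the two children's sets (linear merge shown; finger merge
    # costs O(log((n1+n2)!/(n1!*n2!))) per node in the analysis).
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]
```

The tree shape is irrelevant to the analysis: however unbalanced T is, the per-node binomial costs telescope to O(log(n!)).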
The above approach of arbitrary merging order was applied in [10, 11] to achieve O(n log n) time algorithms for finding repeats with gaps and quasiperiodicities in strings. In both these algorithms T is determined by the suffix-tree of the input string, and the Sv sets denote the set of occurrences (positions) of the substring corresponding to the path label of v.
List Splitting
Hoffmann et al. [25] considered how finger search trees can be used for solving the following list splitting problem, which is also applied, e.g., in [8, 28]. Assume we initially have a sorted list of n elements that is repeatedly split into two sequences until we end up with n sequences each containing one element. If the splitting of a list of length k into two lists of length k1 and k2 is performed by a simultaneous finger search from each end of the list, followed by a split, the searching and splitting can be performed in O(log min(k1, k2)) time.
Here we assume that the splitting order is unknown in advance.
By assigning a list of k elements a potential of k − log k ≥ 0, the splitting into two lists of size k1 and k2 releases the following amount of potential:

(k − log k) − (k1 − log k1) − (k2 − log k2) = log k1 + log k2 − log k ≥ log min(k1, k2) − 1,
since max(k1, k2) ≥ k/2. The released potential allows each list splitting to be performed in amortized O(1) time. The initial list is charged n − log n potential. We conclude that starting with a list of n elements, followed by a sequence of at most n − 1 splits requires total O(n) time.
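The simultaneous search from both ends can be sketched over a sorted Python list. Galloping from the two ends in lockstep stops after O(log min(k1, k2)) steps; the list slicing below is linear, unlike the O(log min(k1, k2)) split supported by a level-linked tree, so this is only a sketch of the search strategy (the name `two_ended_split` is ours):

```python
import bisect

def two_ended_split(lst, x):
    """Split sorted lst into (elements <= x, elements > x), galloping
    simultaneously from both ends so that the boundary is located in
    O(log min(k1, k2)) probing steps."""
    n = len(lst)
    step = 1
    while step < n:
        if lst[step - 1] > x:          # boundary lies in the first `step` items
            pos = bisect.bisect_right(lst, x, 0, step)
            return lst[:pos], lst[pos:]
        if lst[n - step] <= x:         # boundary lies in the last `step` items
            pos = bisect.bisect_right(lst, x, n - step, n)
            return lst[:pos], lst[pos:]
        step *= 2
    pos = bisect.bisect_right(lst, x)  # boundary near the middle
    return lst[:pos], lst[pos:]
```

Whichever side of the split is smaller is found first, which is exactly what the potential argument above charges for.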
Adaptive Merging and Sorting
The area of adaptive sorting addresses the problem of developing sorting algorithms that perform o(n log n) comparisons for inputs with a limited amount of disorder, for various measures of disorder; e.g. the measure Inv counts the number of pairwise inversions in the input. For a survey of adaptive sorting algorithms see [18].
An adaptive sorting algorithm that is optimal with respect to the disorder measure Inv has running time O(n log(Inv/n)). A simple adaptive sorting algorithm optimal with respect to Inv is insertion sort, where we insert the elements of the input sequence from left to right into a finger search tree. Insertions always start at a finger on the last element inserted. Details on applying finger search trees in insertion sort can be found in [13, 31, 32].
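A hedged sketch of this insertion sort over a plain Python list: the finger is the position of the last inserted element, and each insertion gallops left or right from it, so searching costs O(log di) comparisons where di is the distance moved; the O(n) cost of `list.insert` would be avoided by a real finger search tree. The name `finger_insertion_sort` is ours:

```python
import bisect

def finger_insertion_sort(seq):
    """Insertion sort with a finger on the last inserted element.
    Each insertion gallops from the finger toward the insertion point,
    so nearly-sorted inputs (small Inv) need few comparisons."""
    out, finger = [], 0
    for x in seq:
        if not out:
            out.append(x)
            continue
        if x >= out[finger]:
            # Gallop right from the finger.
            step, lo = 1, finger
            while lo + step < len(out) and out[lo + step] < x:
                step *= 2
            pos = bisect.bisect_left(out, x, lo, min(lo + step, len(out)))
        else:
            # Gallop left from the finger.
            step, hi = 1, finger
            while hi - step > 0 and out[hi - step] >= x:
                step *= 2
            pos = bisect.bisect_left(out, x, max(hi - step, 0), hi)
        out.insert(pos, x)
        finger = pos            # finger now rests on the element just inserted
    return out
```

On a sorted input every gallop stops after one probe, giving O(n) comparisons overall; in general the comparison count degrades gracefully with Inv.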
Another adaptive sorting algorithm based on finger search trees is obtained by replacing the linear merging in binary merge sort by an adaptive merging algorithm [14, 33–35]. The classical binary merge sort algorithm always performs Ω(n log n) comparisons, since in each merging step where two lists each of size k are merged the number of comparisons performed is between k and 2k − 1.
The idea of the adaptive merging algorithm is to identify consecutive blocks from the input sequences which are also consecutive in the output sequence, as illustrated in Figure 11.5. This is done by repeatedly performing a finger search for the smallest element of the two input sequences in the other sequence and deleting the identified block from the other sequence by a split operation. If the blocks in the output sequence are denoted Z1, ..., Zk, it follows from the time bounds of finger search trees that the total time for this adaptive merging operation becomes O(∑ log |Zi|), summing over i = 1, ..., k. From this merging bound it can be argued that merge sort with adaptive merging is adaptive with respect to the disorder measure Inv (and several other disorder measures). See [14, 33, 34] for further details.
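The block-identifying merge can be sketched as follows; plain binary search stands in for the finger searches and splits, so only the block structure of the algorithm is illustrated (the name `adaptive_merge` is ours):

```python
import bisect

def adaptive_merge(a, b):
    """Merge sorted a and b by repeatedly searching for the head of one
    sequence in the other and moving the identified block wholesale.
    With finger search trees, locating and splitting off block Z_i costs
    O(log |Z_i|), giving the O(sum log |Z_i|) bound."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            # Block of a preceding (or equal to) the head of b.
            k = bisect.bisect_right(a, b[j], i)
            out.extend(a[i:k]); i = k
        else:
            # Block of b strictly preceding the head of a.
            k = bisect.bisect_right(b, a[i], j)
            out.extend(b[j:k]); j = k
    out.extend(a[i:])
    out.extend(b[j:])
    return out
```

When the two inputs interleave in only a few long blocks, as happens for inputs with few inversions, k is small and the merge is far cheaper than the 2k − 1 comparisons of a linear merge.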