Trees with Minimum Weighted Path Length:Height Limited Huffman Trees

Height Limited Huffman Trees

In this section only, for technical reason, we assume that the length of the path is the number of its vertices. For a sequence S of weights the total cost is changed by adding the sum of weights in S.

Assume we have the same problem as in the case of Huffman trees with additional restric- tion that the height of the tree is limited by a given parameter L. A beautiful algorithm for this problem has been given by Larmore and Hirschberg, see [16].

Reduction to the Coin Collector Problem

The main component of the algorithm is the reduction to the following problem in which the crucial property play powers of two. We call a real number dyadic iff it has a finite binary representation.

Coin Collector problem:

Input: A set I of m items and dyadic number X, each element of I has a width and a weight, where each width is a (possibly negative) power of two, and each weight is a positive real number.

Output: CoinColl(I, X) - the minimum total weight of a subset S I whose widths sum to X.

The following trivial lemma plays important role in the reduction of height limited tree problem to the Coin Collector problem.

LEMMA 14.5 Assume T is a full binary tree with n leaves, then

image

Assume we have Huffman coding problem with n items with the sequence of weights weights W = w1, w2,..., wn. We define an instance of the Coin Collector problem as follows:

image

image

FIGURE 14.3: A Huffman tree T for the items 1, 2, 3, 4 of height limited by 4 and the corresponding solution to the Coin Collector problem. Each node of the tree can be treated as a package consisting of leaves in the corresponding subtree. Assume weight(i) = i. Then in the corresponding Coin Collector problem we have weight(i, h) = i, width(i, h) = 2h.

The intuition behind this strange construction is to view nodes as packages consisting of of original items (elements in the leaves). The internal node v which is a package consisting of (leaves in its subtree) items i1, i2,..., ik can be treated as a set of coins (i1, h), (i2, h),... (ik, h), where h is the level of v, and weight(ij , h) = weight(ij ). The total weight of the set of coins is the cost of the Huffman tree.

Example Figure 14.3 shows optimal Huffman tree for S = (1, 2, 3, 4) with height limited by 4, and the optimal solution to the corresponding Coin Collector problem. The sum of widths of the coins is 4+1, and the total weight is minimal. It is the same as the cost of the Huffman tree on the left, assuming that leaves are also contributing their weights (we scale cost by the sum of weights of S).

Hirschberg and Larmore, see [16], have shown the following fact.

LEMMA 14.6 The solution CoinColl(IW , XW ) to the Coin Collector Problem is the cost of the optimal L-height restricted Huffman tree for the sequence W of weights.

The Algorithm for the Coin Collector Problem

The height limited Huffman trees problem is thus reduced to the Coin Collector Problem. The crucial point in the solution of the latter problem is the fact that weights are powers of two.

Denote MinWidth (X) to be the smallest power of two in binary representation of number X. For example MinWidth (12) = 4 since 12 = 8 + 4. Denote by MinItem(I) the item with the smallest width in I.

LEMMA 14.7 If the items are sorted with respect to the weight then the Coin Collector problem can be solved in linear time (with respect to the total number |I| of coins given in the input).

Proof The recursive algorithm for the Coin Collector problem is presented below as a recursive function CoinColl(I, X). There are several cases depending on the relation between the smallest width of the item and minimum power of two which constitutes the binary representation of X. In the course of the algorithm the set of weights shrinks as well as the size of X. The linear time implementation of the algorithm is given in [21].

image

The last two lemmas imply the following fact.

THEOREM 14.5 The problem of the Huffman tree for n items with height limited by L can be solved in O(n · L) time.

Using complicated approach of the Least Weight Concave Subsequence the complexity has been reduced to nL log n + n log n in [1]. Another small improvement is by Schieber

[49]. An efficient approximation algorithm is given in [40–42]. The dynamic algorithm for Huffman trees is given in [50].

Comments

Popular posts from this blog

Data Structure Visualization:Introduction and Value of Data Structure Rendering

Binary Space Partitioning Trees:BSP Tree as a Hierarchy of Regions.

Drawing Trees:HV-Layout