Randomized Graph Data-Structures for Approximate Shortest Paths:A Randomized Data-Structure for Decremental APASP.

A Randomized Data-Structure for Decremental APASP

There are a number of applications that require efficient solutions of the APASP problem for a dynamic graph. In these applications, an initial graph is given, followed by an on-line sequence of queries interspersed with updates that can be insertion or deletion of edges. We have to carry out the updates and answer the queries on-line in an efficient manner. The goal of a dynamic graph algorithm is to update the solution efficiently after the dynamic changes, rather than having to re-compute it from scratch each time.

The approximate distance oracles described in the previous section can be used for answering approximate distance query in a static graph. However, there does not seem to be any efficient way to dynamize these oracles in order to answer distance queries in a graph under deletion of edges. In this section we shall describe a hierarchical data structure for efficiently maintaining APASP in an undirected unweighted graph under deletion of edges. In addition to maintaining approximate shortest paths for all-pairs of vertices, this scheme has been used for efficiently maintaining approximate shortest paths for pair of vertices separated by distance in an interval [a, b] for any 1 a < b n. However, to avoid giving too much detail in this chapter, we would outline an efficient algorithm for the following problem only.

APASP-d : Given an undirected unweighted graph G = (V, E) that is undergoing deletion of edges, and a distance parameter d n, maintain approximate shortest paths for all-pairs of vertices separated by distance at-most d.

Main Idea

For an undirected unweighted graph G = (V, E), a breadth-first-search (BFS) tree rooted at a vertex u V stores distance information with respect to the vertex u. So in order to maintain shortest paths for all-pairs of vertices separated by distance d, it suffices to maintain a BFS tree of depth d rooted at each vertex under deletion of edges. This is the approach taken by the previously existing algorithms.

The main idea underlying the hierarchical data-structure that would provide efficient update time for maintaining APASP can be summarized as follows : Instead of maintaining exact distance information separately from each vertex, keep small BFS trees around each vertex for maintaining distance information within locality of each vertex, and some what larger BFS trees around fewer vertices for maintaining global distance information.

We now provide the underlying intuition of the above idea and a brief outline of the new techniques used.

Let Bd denote the BFS tree of depth d rooted at vertex u V . There exists a simple algorithm for maintaining a BFS tree u under deletion of edges that takes a total of µ(Bd ) · d time, where µ(t) is the number of edges in the graph induced by tree t. Thus the total update time for maintaining shortest path for all-pairs separated by distance at-most d is of the order of uV µ(Bu) · d. Potentially µ(Bu) can be as large as θ(m), and so the total update time over any sequence of edge deletions will be O(mnd). Dividing this total update cost uniformly over the entire sequence of edge deletions, we can see that it takes O(nd) amortized update time per edge deletion, and O(1) time for reporting exact distance between any pair of vertices separated by distance at-most d.

In order to achieve o(nd) bound on the update time for the problem APASP-d, we closely look at the expression of total update time  uV µ(Bd) · d. There are n terms in this expression each of potential size θ(m). A decrease in either the total number of terms or the size of each term would give an improvement in the total update time. Thus the following simple ideas come to mind.

• Is it possible to solve the problem APASP-d by keeping very few depth-d BFS  trees ?

Is there some other alternative t for depth bounded BFS tree Bd that has o(m) bound on µ(t) ?

While it appears difficult for any of the above ideas to succeed individually, they can be combined in the following way: Build and maintain BFS trees of depth 2d on vertices of a set S V of size o(n), called the set of special vertices, and for each remaining vertex u V \S, maintain a BFS tree (denoted by BS ) rooted at u and containing all the vertices that lie closer to u than the nearest special vertex, say N (u, S).

Along the above lines, we present a 2-level data-structure (and its generalization to k- levels) for the problem APASP-d.

It can be seen that unlike the tree Bd , the new BFS tree BS might not contain all the vertices lying within distance d from u. In order to ensure that our scheme leads to a solution of problem APASP-d, we use the following observation similar to that of 3-approximate distance oracle in the previous section. If v is a vertex lying within distance d from u but not present in BS , an approximate distance from u to v can be extracted from the tree rooted at the nearest special vertex N (u, S). This is because (by triangle inequality) the distance from N (u, S) to v is at most twice the distance from u to v.

For our hierarchical scheme to lead to improved update time, it is crucial that we establish sub-linear upper bounds on µ(BS ). We show that if the set S is formed by picking each vertex independently with suitable probability, then µ(BS ) = O˜(m/|S|) with probability arbitrarily close to 1.

image

Hierarchical Distance Maintaining Data-Structure

Based on the idea of “keeping many small trees, and a few large trees”, we define a k-level hierarchical data-structure for efficiently maintaining approximate distance information as follows. (See Figure 38.5)

image

Let v be a vertex within distance d from u. If v is present in Bd,S1 , we can report exact distance between them. Otherwise, (as will soon become clear) we can extract the approximate distance between u and v from the collection of the BFS trees rooted at the vertices u, p(u), ··· , pk1(u) (see Figure 38.5). The following Lemma is the basis for estimating the distance between two vertices using the hierarchy Hk .

LEMMA 38.6 Given a hierarchy Hk , if j < k 1 is such that v is not present in any of

image

It follows from Lemma 38.6 and Lemma 38.7 that if l is the smallest integer such that v is present in the BFS tree rooted at pl(u) in the hierarchy Hk , then we can report δ(pl(u), u) + δ(pl (u), v) as an approximate distance between u and v. Along these lines, we shall present an improved decremental algorithms for APASP-d.

Bounding the Size of Bd,S under Edge-Deletions

We shall now present a scheme based on random sampling to find a set S V of vertices that will establish a sub-linear bound on the number of vertices (ν(BS )) as well as the number of edges (µ(BS )) induced by BS under deletion of edges. Since Bd,S BS , so these upper bounds also hold for Build the set S of vertices by picking each vertex from V independently with probability n . The expected size of S is O(n ). Consider an ordering of vertices V according to their levels in the BFS tree BS (see Figure 38.7). The set of vertices lying at higher levels than the nearest sampled vertex in this ordering is what constitutes the BFS tree BS . Along similar lines as that of Lemma 38.1, it follows that the expected size of this set (and hence ν(BS )) is n . Moreover, it can be shown that ν(BS ) is no more than 4n ln n with probability > 1 1 . Now as the edges are being deleted, the levels of the vertices in the tree BS may fall, and so the ordering of the vertices may change. There will be a total of m such orderings during the entire course of edge deletions. Since the vertices are picked randomly and independently, therefore, the upper bound of 4n ln n holds for ν(BS ) with probability (1 1 ) for any of these orderings. So we can conclude that ν(BS ), the number of vertices never exceeds ( 4n ln n ) during the entire course of edge deletions with probability To bound the number of edges induced by BS , consider the following scheme. Pick every edge independently with probability n . The set S consists of the end points of the sampled edges. The expected size of S is O(nc). Consider an ordering of the edges according to their level in BS (level of an edge is defined as the minimum of the levels of its end points).

Along the lines of arguments given above (for bounding the the number of vertices of BS ), it can be shown that µ(BS ), the number of edges induced by BS remains 4m ln n during the entire course of edge deletions.

Note that in the sampling scheme to bound the number of vertices of tree BS , a vertex  v is picked with probability n . Whereas in the sampling scheme for bounding the number of edges in the sub-graph induced by BS , a vertex v is picked with probability degree(v)·n .

It can thus be seen that both the bounds can be achieved simultaneously by the following random sampling scheme :

image

image

LEMMA 38.8 [Even, Shiloach [7]] Given a graph under deletion of edges, a BFS tree u,u V can be maintained in O(d) amortized time per edge deletion.

For maintaining a Bd,S tree under edge deletions, we shall use the same algorithm of [7] with the modification that whenever the depth of Bd,S has to be increased (due to recent

edge deletion), we grow the tree to its new level min {d, δ(u, N (u, S)) 1}. We analyze the total update time required for maintaining Bd,S as follows.

There are two computational tasks : one extending the level of the tree, and another that of maintaining the levels of the vertices in the tree Bd,S under edge deletions. For the first task, the time required is bounded by the edges of the new level introduced which is O(µ(Bd,S )). For the second task, we give a variant of the proof of Even and Shiloach

[7] (for details, please refer [7]).

The running time is dominated by the processing of the edges in this process. In-between two consecutive processing of an edge, level of one of the end-points of the edge falls down by at least one unit. The processing cost of an edge can thus be charged to the level from which it has fallen. Clearly the maximum number of edges passing a level i is bounded by µ(Bd,S ). The number of levels in the tree Bd,S is min{d, ν(Bd,S )}. Thus the total cost for maintaining the BFS tree Bd,S over the entire sequence of edge deletions is O(µ(Bd,S ) · min {d, ν(Bd,S )}).

LEMMA 38.9 Given an undirected unweighted graph G = (V, E) under edge deletions, a distance parameter d, and a set S V ; a BFS tree Bd,S can be maintained in

image

image

To report distance from u to v, we start form the level 0. We first inquire if v lies in Bd,S1 . If v does not lie in the tree, we move to the first level and inquire if v lies in B2d,S2 . It follows from the Corollary 38.2 that if δ(u, v) d, then proceeding in this way, we eventually find a vertex pl (u),l k 1 in the hierarchy Hk such that v is present in the BFS tree rooted at pl (u). (See Figure 38.5). We then report the sum of distances from pl(u) to both u and v.

image

image

Based on the data-structure of [7], the previous best algorithm for maintaining all-pairs exact shortest paths of length d requires O(nd) amortized update time. We have been able to achieve o(nd) update time at the expense of introducing approximation as shown in Table 38.1 on the following page.

Comments

Popular posts from this blog

0/1 Knapsack Problem Memory function.

Binary Space Partitioning Trees:BSP Tree as a Hierarchy of Regions.

Drawing Trees:HV-Layout