Multi-Dimensional Packet Classification: Introduction and Performance Metrics for Classification Algorithms

By احمد جاد الله فرحات - May 18, 2015

Introduction

Chapter 48 discussed algorithms for 1-d packet classification.

In this chapter we consider multi-dimensional classification in detail. First we discuss the motivation and then the algorithms for multi-dimensional classification. As we will see, packet classification on multiple fields is in general a difficult problem. Hence, researchers have proposed a variety of algorithms which, broadly speaking, can be categorized as “basic search algorithms,” geometric algorithms, heuristic algorithms, or hardware-specific search algorithms. In this chapter, we will describe algorithms that are representative of each category, and discuss which type of algorithm might be suitable for different applications.

Until recently, Internet routers provided only “best-effort” service, servicing packets in a first-come-first-served manner. Routers are now called upon to provide different qualities of service to different applications, which means routers need new mechanisms such as admission control, resource reservation, per-flow queueing, and fair scheduling. All of these mechanisms require the router to distinguish packets belonging to different flows.

Flows are specified by rules applied to incoming packets. We call a collection of rules a classifier. Each rule specifies a flow that a packet may belong to based on some criteria applied to the packet header, as shown in Figure 49.1. To illustrate the variety of classifiers, consider some examples of how packet classification can be used by an ISP to provide different services.

Figure 49.2 shows ISP1 connected to three different sites: enterprise networks E1 and E2 and a Network Access Point (NAP), which is in turn connected to

Table 49.2 shows the flows that an incoming packet must be classified into by the router at interface X. Note that the flows specified may or may not be mutually exclusive. For example, the first and second flows in Table 49.2 overlap. This is common in practice, and when no explicit priorities are specified, we follow the convention that rules closer to the top of the list take priority (referred to as the “First matching rule in table” tie-breaker rule in Chapter 48).
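The first-matching-rule convention can be sketched as a simple linear search. This is only an illustration: the flow names, the field names (`src_site`, `dst_port`), and the values below are hypothetical and are not taken from Table 49.2.

```python
from typing import Dict, List, Optional, Tuple

# A rule is (flow name, {field name: inclusive (low, high) range}).
# Fields that a rule does not mention are treated as wildcards.
# All names and values here are made up for illustration.
Conditions = Dict[str, Tuple[int, int]]
Rule = Tuple[str, Conditions]


def matches(conditions: Conditions, packet: Dict[str, int]) -> bool:
    """True if every field named by the rule lies inside its range."""
    return all(low <= packet[field] <= high
               for field, (low, high) in conditions.items())


def classify(rules: List[Rule], packet: Dict[str, int]) -> Optional[str]:
    """Scan the rules top to bottom and return the first flow that matches."""
    for name, conditions in rules:
        if matches(conditions, packet):
            return name
    return None


# The first two rules overlap: web traffic from site 1 satisfies both.
rules: List[Rule] = [
    ("web-from-site-1", {"src_site": (1, 1), "dst_port": (80, 80)}),
    ("all-from-site-1", {"src_site": (1, 1)}),
    ("everything-else", {}),
]

print(classify(rules, {"src_site": 1, "dst_port": 80}))  # web-from-site-1
print(classify(rules, {"src_site": 1, "dst_port": 25}))  # all-from-site-1
print(classify(rules, {"src_site": 2, "dst_port": 80}))  # everything-else
```

Because the scan stops at the first hit, swapping the first two rules would make the more specific web rule unreachable, which is why rule order acts as the priority.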

49.1.1 Problem Statement

Each rule of a classifier has d components. R[i] is the ith component of rule R, and is a regular expression on the ith field of the packet header. A packet P is said to match rule R if, ∀i, the ith field of the header of P satisfies the regular expression R[i]. In practice, a rule component is not a general regular expression but is often limited by syntax to a simple address/mask or operator/number(s) specification. In an address/mask specification, a 0 (respectively 1) at bit position x in the mask denotes that the corresponding bit in the address is a don’t-care (respectively significant) bit. Examples of operator/number(s) specifications are eq 1232 and range 34-9339. Note that a prefix can be specified as an address/mask pair where the mask is contiguous, i.e., all bits with value 1 appear to the left of bits with value 0 in the mask. It can also be specified as a range of width equal to 2^t, where t = 32 − prefixlength. Most commonly occurring specifications can be represented by ranges.

An example real-life classifier in four dimensions is shown in Table 49.3. By convention, the first rule R1 is of highest priority and rule R7 is of lowest priority. Some example classification results are shown in Table 49.4.

Longest prefix matching for routing lookups is a special case of one-dimensional packet classification. All packets destined to the set of addresses described by a common prefix may be considered to be part of the same flow. The address of the next hop to which the packet should be forwarded is the associated action. The length of the prefix defines the priority of the rule.
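To make the equivalence between the specification forms concrete, here is a minimal sketch that turns a prefix into both an address/mask pair and a range of width 2^t, and turns operator/number(s) strings into ranges. The helper names are assumptions made for this example; only the `eq` and `range` spellings come from the text, while `lt` and `gt` are hypothetical spellings for "less than" and "greater than". The addresses are from a documentation-only block.

```python
import ipaddress
from typing import Tuple

Range = Tuple[int, int]  # inclusive (low, high)


def ip(text: str) -> int:
    """Dotted-quad IPv4 address as a 32-bit integer."""
    return int(ipaddress.IPv4Address(text))


def mask_match(value: int, address: int, mask: int) -> bool:
    """Address/mask match: bits that are 1 in the mask are significant."""
    return (value & mask) == (address & mask)


def prefix_to_mask(prefix_length: int, width: int = 32) -> int:
    """Contiguous mask: prefix_length ones followed by t zeros."""
    t = width - prefix_length
    return ((1 << width) - 1) ^ ((1 << t) - 1)


def prefix_to_range(address: int, prefix_length: int, width: int = 32) -> Range:
    """Range of width 2**t covered by the prefix, t = width - prefix_length."""
    t = width - prefix_length
    low = (address >> t) << t  # clear the t don't-care bits
    return low, low + (1 << t) - 1


def operator_to_range(spec: str, width: int = 16) -> Range:
    """Convert an operator/number(s) string into an inclusive range."""
    top = (1 << width) - 1
    op, _, arg = spec.partition(" ")
    if op == "eq":
        return int(arg), int(arg)
    if op == "range":
        low, high = arg.split("-")
        return int(low), int(high)
    if op == "lt":  # hypothetical spelling
        return 0, int(arg) - 1
    if op == "gt":  # hypothetical spelling
        return int(arg) + 1, top
    raise ValueError(f"unsupported specification: {spec!r}")


# The prefix 192.0.2.0/24 expressed both ways accepts the same addresses.
base, length = ip("192.0.2.0"), 24
mask = prefix_to_mask(length)
low, high = prefix_to_range(base, length)
for probe in ("192.0.1.255", "192.0.2.0", "192.0.2.255", "192.0.3.0"):
    value = ip(probe)
    assert mask_match(value, base, mask) == (low <= value <= high)

print(operator_to_range("eq 1232"))        # (1232, 1232)
print(operator_to_range("range 34-9339"))  # (34, 9339)
```

A non-contiguous mask does not in general correspond to a single range, so a classifier that must support such masks would need to keep the address/mask form rather than convert everything to ranges.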

Performance Metrics for Classification Algorithms

1. Search speed: Faster links require faster classification. For example, a link running at 10 Gbps can carry up to 31.25 million packets per second (assuming minimum-sized 40-byte TCP/IP packets).

2. Low storage requirements: Small storage requirements enable the use of fast memory technologies like SRAM (Static Random Access Memory). SRAM can be used as an on-chip cache by a software algorithm and as on-chip SRAM for a hardware algorithm.

3. Ability to handle large real-life classifiers.

4. Fast updates: As the classifier changes, the data structure needs to be updated.

We can categorize data structures into those that can add or delete entries incrementally, and those that need to be reconstructed from scratch each time the classifier changes. When the data structure is reconstructed from scratch, we call it “pre-processing.” The required update rate differs among applications: a very low update rate may be sufficient in firewalls, where entries are added manually or infrequently, whereas a router with per-flow queues may require very frequent updates.

5. Scalability in the number of header fields used for classification.

6. Flexibility in specification: A classification algorithm should support general rules, including prefixes, operators (range, less than, greater than, equal to, etc.) and wildcards. In some applications, non-contiguous masks may be required.
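The search-speed figure in metric 1 follows from dividing the link rate by the size of a minimum-sized packet in bits. The sketch below reproduces that arithmetic and also derives the resulting time budget per packet. The function names are assumptions for this example, and the calculation ignores any link-layer framing overhead, in line with the 40-byte TCP/IP packet assumption stated in the text.

```python
def packets_per_second(link_bps: int, packet_bytes: int) -> float:
    """Worst-case packet arrival rate, ignoring link-layer framing overhead."""
    return link_bps / (packet_bytes * 8)


def time_budget_ns(link_bps: int, packet_bytes: int) -> float:
    """Time available to classify one packet at that rate, in nanoseconds."""
    return 1e9 / packets_per_second(link_bps, packet_bytes)


LINK_BPS = 10_000_000_000  # 10 Gbps
MIN_PACKET_BYTES = 40      # minimum-sized TCP/IP packet, as assumed in the text

rate = packets_per_second(LINK_BPS, MIN_PACKET_BYTES)
print(rate / 1e6)                                  # 31.25 (million packets/s)
print(time_budget_ns(LINK_BPS, MIN_PACKET_BYTES))  # 32.0 (ns per packet)
```

Under these assumptions a classifier on a 10 Gbps link has about 32 ns per packet, which is why the number of memory accesses per lookup matters as much as the total storage.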
