Multi-Dimensional Packet Classification:Introduction and Performance Metrics for Classification Algorithms.

Introduction

Chapter 48 discussed algorithms for 1-d packet classification.

In this chapter we consider multi-dimensional classification in detail. First we discuss the motivation and then the algorithms for multi-dimensional classification. As we will see, packet classification on multiple fields is in general a difficult problem. Hence, researchers have proposed a variety of algorithms which, broadly speaking, can be categorized as “basic search algorithms,” geometric algorithms, heuristic algorithms, or hardware-specific search algorithms. In this chapter, we will describe algorithms that are representative of each category, and discuss which type of algorithm might be suitable for different applications.

Until recently, Internet routers provided only “best-effort” service, servicing packets in a first-come-first-served manner. Routers are now called upon to provide different qualities of service to different applications which means routers need new mechanisms such as ad- mission control, resource reservation, per-flow queueing, and fair scheduling. All of these mechanisms require the router to distinguish packets belonging to different flows.

Flows are specified by rules applied to incoming packets. We call a collection of rules a classifier. Each rule specifies a flow that a packet may belong to based on some criteria applied to the packet header, as shown in Figure 49.1. To illustrate the variety of classifiers, consider some examples of how packet classification can be used by an ISP to provide different services.

Figure 49.2 shows ISP1 connected to three different sites: enterprise networks E1 and E2 and a Network Access Point(NAP), which is in turn connected to

image

Table 49.2 shows the flows that an incoming packet must be classified into by the router at interface X. Note that the flows specified may or may not be mutually exclusive. For example, the first and second flows in Table 49.2 overlap. This is common in practice, and when no explicit priorities are specified, we follow the convention that rules closer to the top of the list take priority (referred to as the “First matching rule in table” tie-breaker rule in Chapter 48).

image

49.1.1 Problem Statement

Each rule of a classifier has d components. R[i] is the ith component of rule R, and is a regular expression on the ith field of the packet header. A packet P is said to match rule R, if i, the ith field of the header of P satisfies the regular expression R[i]. In practice, a rule component is not a general regular expression but is often limited by syntax to a simple address/mask or operator/number(s) specification. In an address/mask specification, a 0 (respectively 1) at bit position x in the mask denotes that the corresponding bit in the address is a don’t care (respectively significant) bit. Examples of operator/number(s) specifications are eq 1232 and range 34-9339. Note that a prefix can be specified as an address/mask pair where the mask is contiguous — i.e., all bits with value 1 appear to the left of bits with value 0 in the mask. It can also be specified as a range of width equal to 2t where t = 32 pref ixlength. Most commonly occurring specifications can be represented by ranges. An example real-life classifier in four dimensions is shown in Table 49.3. By convention, the first rule R1 is of highest priority and rule R7 is of lowest priority. Some example classification results are shown in Table 49.4  Longest prefix matching for routing lookups is a special-case of one-dimensional packet classification. All packets destined to the set of addresses described by a common prefix may be considered to be part of the same flow. The address of the next hop where the

image

packet should be forwarded to is the associated action. The length of the prefix defines the priority of the rule.

Performance Metrics for Classification Algorithms

1. Search speed — Faster links require faster classification. For example, links running at 10Gbps can bring 31.25 million packets per second (assuming minimum sized 40 byte TCP/IP packets).

2. Low storage requirements — Small storage requirements enable the use of fast memory technologies like SRAM (Static Random Access Memory). SRAM can be used as an on-chip cache by a software algorithm and as on-chip SRAM for a hardware algorithm.

3. Ability to handle large real-life classifiers.

4. Fast updates — As the classifier changes, the data structure needs to be updated.

We can categorize data structures into those which can add or delete entries incrementally, and those which need to be reconstructed from scratch each time the classifier changes. When the data structure is reconstructed from scratch, we call it “pre-processing”. The update rate differs among different applications: a very low update rate may be sufficient in firewalls where entries are added manually or infrequently, whereas a router with per-flow queues may require very frequent updates.

5. Scalability in the number of header fields used for classification.

6. Flexibility in specification — A classification algorithm should support general rules, including prefixes, operators (range, less than, greater than, equal to, etc.) and wildcards. In some applications, non-contiguous masks may be required.

Comments

Popular posts from this blog

Data Structure Visualization:Introduction and Value of Data Structure Rendering

Collision Detection:Penetration Depth Computation

Concurrent Data Structures:Linked Lists