MpFPC—A Parallelization Method for Fast Packet Classification


The associate editor coordinating the review of this manuscript and approving it for publication was Victor S. Sheng.

Packet classification is a core technology for implementing Internet devices and network services. It searches a set of rules for the operation or task to be performed on a packet, according to information carried by the packet (such as source address, destination address, source port, destination port and protocol) [10]. With the development of cutting-edge network technologies such as network function virtualization (NFV) and software-defined networking (SDN), the Internet carries more and more application services, and the scale of network traffic, backbone routing tables and firewall access control rules has increased explosively, which places higher requirements on the packet classification ability of network equipment [10]. With the continuous improvement of network communication cable manufacturing technology and the increasing link bandwidth, the optical fiber transmission rate of current universal interfaces has reached or even exceeded 400Gbps [3]. Assuming 64 bytes per packet, meeting the wire-speed requirement means completing one packet classification within 1.28ns. However, existing software packet classification algorithms are generally designed for a particular number of dimensions, especially one-dimensional and two-dimensional classification. Some algorithms can be extended to high-dimensional classification, but they cannot meet the space complexity and time complexity requirements simultaneously.
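The 1.28ns figure follows directly from the link rate and the packet size. The short sketch below reproduces the arithmetic; the function name is illustrative, and, like the text, the calculation ignores per-frame overhead such as the Ethernet preamble and inter-frame gap.

```python
def per_packet_budget_ns(link_rate_bps: float, packet_bytes: int) -> float:
    """Time that one packet occupies the link, in nanoseconds."""
    bits = packet_bytes * 8
    return bits / link_rate_bps * 1e9


budget = per_packet_budget_ns(400e9, 64)  # 512 bits on a 400 Gbps link
print(f"{budget:.2f} ns per packet")
print(f"{1e3 / budget:.2f} million packets per second")
```

At this rate a classifier has to sustain roughly 781 million lookups per second, which is the motivation for the parallel design that follows.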
Considering these requirements, we design a novel parallelization method for fast packet classification (MpFPC) based on distributed computing. The proposed method achieves high efficiency without additional space requirements, and it is suitable for large volumes of data, especially for real-time packet classification that requires high speed. The main innovations of this method are as follows. Firstly, the decision tree is grouped into several sub-decision trees for parallel packet classification, and the improved grouping process avoids the rule replication problem of the traditional decision-tree-based method. Secondly, during classification, each data packet is compared against the interval values of the root nodes of the decision subtrees. After the matching interval is found, the packet is distributed to the corresponding decision subtree for classification. A packet that cannot match any decision subtree is directly given the classification decision discard, which also improves the packet classification efficiency to a certain extent.
The rest of this paper is organized as follows. Related work is presented in Section II. Section III gives the problem description and presents the algorithms for constructing the classification decision tree, grouping the rules and classifying packets in a distributed manner. Section IV gives the classification results of the MpFPC and Uscuts methods, followed by a comparison and analysis of the experimental results. Finally, the conclusion is drawn in Section V.

II. RELATED WORK
Packet classification is a hard problem with high complexity. From a geometric point of view, packet classification can be treated as a point location problem. For n non-overlapping hyper-rectangles in k-dimensional space, it has been proved that the best bounds for locating a point are either $O(\log n)$ time with $O(n^k)$ space, or $O((\log n)^{k-1})$ time with $O(n)$ space [4]. Therefore, the worst-case complexity of algorithmic packet classification is extremely high, which makes it impractical to achieve the wire-speed requirement within the capabilities of current memory technology. However, packet classification rules in real-life applications have some inherent characteristics that can be exploited to reduce the complexity.
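To give a feel for how steep these bounds are, the sketch below evaluates both trade-offs for an illustrative rule set. The values n = 1000 and k = 5 are assumptions chosen only for this example, constant factors are ignored, and logarithms are taken to base 2.

```python
import math


def point_location_bounds(n: int, k: int):
    """Return (time, space) magnitudes for the two trade-offs quoted above.

    Constant factors are ignored, so the numbers indicate growth only.
    """
    fast_lookup = (math.log2(n), n ** k)             # O(log n) time, O(n^k) space
    small_memory = (math.log2(n) ** (k - 1), n)      # O((log n)^(k-1)) time, O(n) space
    return fast_lookup, small_memory


fast, small = point_location_bounds(1000, 5)
print(f"fast lookup : ~{fast[0]:.0f} steps, ~{fast[1]:.0e} storage units")
print(f"small memory: ~{small[0]:.0f} steps, ~{small[1]} storage units")
```

Under these assumptions the first option needs on the order of 10^15 storage units, while the second needs nearly ten thousand steps per packet, so neither extreme is practical at wire speed.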

A. HARDWARE-BASED METHODS
In industry, large routers and high-end classifiers use hardware devices, such as ternary content addressable memory (TCAM) [5], field programmable gate arrays (FPGA) [6] and specialized network processor chips, to achieve high-performance packet classification. Hardware-based classification can reach high speed, but it is expensive, consumes a lot of power, and has low flexibility and scalability [5]. Therefore, in academia, researchers are more willing to seek software-based solutions for packet classification [7].
According to the existing research, packet classification algorithms can be roughly divided into algorithms based on dimension decomposition and algorithms based on space division [8]. An algorithm based on space division usually partitions the whole rule space into several subspaces, then divides the rule set into several groups and places each group in a subspace. This type of method can be further divided into two main subcategories: tuple-space-based methods [9]-[11] and decision-tree-based methods [12], [13].

B. DIMENSION-DECOMPOSITION-BASED METHODS
The algorithm based on dimension decomposition decomposes each rule into multiple dimensions of a certain number of bytes or bits. Each dimension is searched separately, and the partial results are then combined to obtain the final search result. Representative classical algorithms include BV (bit vector), ABV (aggregated bit vector) and RFC (recursive flow classification) [14]-[16]. These methods are fast, but as the rule set grows, their space consumption increases exponentially in the worst case.

C. TUPLE-SPACE-SEARCH-BASED METHODS
The algorithm based on tuple space constructs a hash table for each different prefix length, and the rule components with the same prefix length are stored in the same hash table. When classifying packets, all hash tables are accessed sequentially until the longest matching prefix is found. Representative multi-dimensional packet classification algorithms include TSS (Tuple Space Search) [17], PartitionSort [18] and TupleMerge [12].

D. DECISION-TREE-BASED METHODS
The algorithm based on a decision tree recursively decomposes the multidimensional space where the rules are located and builds a decision tree. The root node of the decision tree represents the whole multidimensional space, and each non-root node represents a subspace. The recursive decomposition ends when the number of rules associated with a node is less than or equal to a preset threshold. When receiving a packet, the method first accesses the root node of the decision tree, and then decides which node to access next according to the space division. Usually, a packet classification algorithm based on a decision tree eventually reaches a leaf node. At the leaf node, the algorithm sequentially examines the rules associated with it, of which there are at most the threshold number, to determine the rule that the packet matches. Representative classical algorithms include Hicuts [19] and HyperCuts [20].
Generally speaking, the methods based on tuple space suffer from uneven subspace distribution and rule replication, which affect their classification performance to a certain extent. Although traversing the decision tree can achieve logarithmic time complexity, the method still needs to match the rules sequentially after finding the leaf node of the decision tree. The execution time of this sequential matching is linear in the number of rules, which greatly reduces the classification efficiency. In addition, when the multidimensional space where the rules are located is divided, a rule may need to be copied to multiple rule groups, so the method requires a large amount of storage space. Efficuts, SmartSplit and other methods [21]-[23] reduce the depth of the constructed decision tree compared with the classical methods, but they have not fundamentally solved the problems of rule replication and sequential matching at the leaf nodes. To address these two problems, Uscuts, a decision tree packet classification method based on cell space division, was proposed. The classification time complexity of this method is O(k log n) [24]. However, for the classification of massive data packets, its classification speed may still become a bottleneck.
At present, among software-based classification methods, the decision-tree-based method has the greatest advantage in classification speed. In a data center network environment in particular, massive numbers of data packets may need to be classified instantaneously, and classification speed then matters most. However, current packet classification algorithms are still far from meeting the wire-speed requirement, which directly limits the application scalability of network devices in the new generation Internet. Considering that the industry mostly uses parallelization to process massive data, if both the data packets and the rules can be grouped, the packet classification problem may be handled in parallel. Based on this idea, this paper proposes a new parallelization method for fast packet classification based on distributed computing.

III. THE PROPOSED APPROACH

A. PROBLEM DESCRIPTION
According to the analysis of previous studies, the biggest obstacle to grouping rules directly is that rules are often entangled, with overlaps and conflicts between them [24]. Therefore, if the rules are divided into several groups and the data packets are distributed to these groups for classification, the classification results will generally be inconsistent with those of the original rules. As shown in Figure 1, the classification rule set contains nine rules, which are divided into three rule groups Rg1, Rg2 and Rg3. Suppose that packet e(5, 3) passes through policy R and is matched against the nine rules from top to bottom. It will first match $r_2$, and its classification result is discard. Rg1 and Rg2 can both match packet e as well, but their classification decisions differ, namely discard and accept, while no rule in Rg3 can match e. It can be seen that if there are conflicts in the classification rules, dividing the rules directly will lead to "inconsistency" errors [25] in classification decisions. Moreover, copying data packets to every rule group for classification increases memory consumption because of the replication, and an "incomplete" error [25] occurs when the rules in a group cannot match the data packet.
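The effect can be reproduced with a small first-match classifier. The nine rules below are hypothetical: the rule set of Figure 1 is not listed in the text, so these ranges were chosen only so that packet e(5, 3) behaves as described above (discard under the full policy and under Rg1, accept under Rg2, no match under Rg3).

```python
from typing import List, Optional, Tuple

# (name, F1 range, F2 range, decision); all ranges are inclusive.
Rule = Tuple[str, Tuple[int, int], Tuple[int, int], str]

# Hypothetical rules, not the contents of Figure 1.
RULES: List[Rule] = [
    ("r1", (0, 2), (0, 9), "accept"),
    ("r2", (4, 6), (2, 4), "discard"),
    ("r3", (7, 9), (0, 1), "accept"),
    ("r4", (0, 1), (5, 9), "discard"),
    ("r5", (3, 8), (0, 5), "accept"),
    ("r6", (6, 9), (6, 9), "discard"),
    ("r7", (0, 3), (0, 1), "accept"),
    ("r8", (6, 9), (4, 9), "discard"),
    ("r9", (0, 4), (6, 9), "accept"),
]


def first_match(rules: List[Rule], packet: Tuple[int, int]) -> Optional[str]:
    """Return the decision of the first rule that matches, or None."""
    x, y = packet
    for _name, (x_lo, x_hi), (y_lo, y_hi), decision in rules:
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi:
            return decision
    return None


e = (5, 3)
groups = [RULES[0:3], RULES[3:6], RULES[6:9]]  # Rg1, Rg2, Rg3
print("full policy:", first_match(RULES, e))
print("per group  :", [first_match(g, e) for g in groups])
```

The per-group answers disagree with each other (the "inconsistency" error), and the third group returns no decision at all (the "incomplete" error), which is why the rules must be made independent before they are grouped.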
Therefore, the first problem to be solved is how to process the original classification rules so that the rules become independent of each other while the rule semantics remain unchanged. The second problem is how to group these rules and divide them evenly among the parallel computing nodes, such that the semantics of all the rules divided among the nodes are the same as those of the original rules. In addition, generally speaking, if each node has the same computing power, the more evenly the rules are distributed across the nodes, the higher the computing efficiency. Finally, the method also needs to consider how to properly group the massive data packets to be classified and send the grouped packets to the computing nodes for decision-tree-based classification. We first briefly introduce the solutions to these problems, and then explain the specific steps of the method in detail with examples. To make the rules independent of each other while keeping their semantics unchanged, we put forward a solution based on multidimensional matrix mapping (FDM) in our previous work [26]. That method preprocesses the original rules so that the target rules have the same semantics as the original rules but are independent of each other. The result of rule mapping in this method is a series of independent cell spaces; an example of the mapping process is given in Section III-B. Dividing all rules evenly among the pre-deployed computing nodes is more complex. We need to consider the efficiency of the subsequent packet classification process, and an exactly even division is difficult to achieve in most cases. Letting the mean value equal the total number of rules divided by the number of computing nodes, we design a division method based on a greedy strategy: each time, we assign to the current node a set of rules whose size is as close as possible to the mean value, until all rules are divided.
Finally, we consider how to distribute the massive data packets to be classified to the computing nodes. In this paper, packets are distributed according to the root node interval of the subtree corresponding to each rule group, so as to avoid replication in the process of packet distribution. In addition, because the rules are conflict-free and redundancy-free after FDM mapping, there is no rule replication in the classification decision tree, which not only reduces the memory consumption but also improves the classification efficiency of the decision tree.
For ease of understanding, we illustrate the greedy division method with Figure 2. Let the decision tree T be composed of several subtrees in the $F_1$ dimension. As shown in Fig. 2, the decision tree T has six subtrees $T_1, T_2, \ldots, T_6$. Suppose the numbers of branches in the subtrees (we define the path from the root node to a leaf node of the subtree as a branch, and the number of branches corresponds to the number of rules) are {7, 5, 3, 4, 2, 4}; then the total number of rules is N = 7 + 5 + 3 + 4 + 2 + 4 = 25. Assuming that the number of deployed parallel computing nodes is n = 4, the average number of rules to be assigned to each node is N/n = 25/4 = 6.25. The greedy partition method takes the subtrees whose total number of rules is as close as possible to N/n and deploys them to the corresponding computing node, without splitting any subtree. In this way, all rules in the subtrees are divided among the computing nodes. As shown in Fig. 3, subtree $T_1$ contains seven rules and $T_2$ contains five rules. Compared with $T_1 + T_2$, the number of rules contained in $T_1$ alone is closer to the average of 6.25, that is, |7 − 6.25| < |(7 + 5) − 6.25|, so the seven rules in $T_1$ are assigned to node1. Similarly, the five rules in $T_2$ are assigned to node2, and the seven rules in $T_3 + T_4$ are assigned to node3. Finally, the remaining six rules in $T_5 + T_6$ are assigned to node4.
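The greedy division can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name is hypothetical, subtrees are identified by their position in the list, and a tie between stopping and extending a group is resolved in favor of stopping, which is an assumption.

```python
from typing import List


def greedy_partition(branch_counts: List[int], num_nodes: int) -> List[List[int]]:
    """Assign consecutive subtrees (by index) to nodes without splitting any.

    For every node except the last, subtrees are added while doing so brings
    the group's branch total closer to the mean; the last node takes the rest.
    """
    mean = sum(branch_counts) / num_nodes
    groups: List[List[int]] = []
    i, m = 0, len(branch_counts)
    for _ in range(num_nodes - 1):
        if i >= m:                      # more nodes than subtrees
            groups.append([])
            continue
        group, total = [i], branch_counts[i]
        i += 1
        # Extend only if the next subtree moves the total closer to the mean.
        while i < m and abs(total + branch_counts[i] - mean) < abs(total - mean):
            total += branch_counts[i]
            group.append(i)
            i += 1
        groups.append(group)
    groups.append(list(range(i, m)))    # remaining subtrees go to the last node
    return groups


# Branch counts of T1..T6 from the example above, with four nodes.
print(greedy_partition([7, 5, 3, 4, 2, 4], 4))
```

With the counts from the example, the groups are T1, T2, T3 + T4 and T5 + T6, which carry 7, 5, 7 and 6 rules respectively, matching the assignment described in the text.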
Next, we briefly explain the execution steps of the algorithm. For the convenience of description, we use a small example of 2-tuple rule set shown in Figure 5 for subsequent discussions.

B. MpFPC ALGORITHM
The implementation of the algorithm includes the following four steps: (1) preprocess the original rules, mapping them into the multidimensional matrix space with the rule mapping method to form a series of independent cell spaces; (2) in each dimension in turn, divide the multidimensional matrix space where the rules are located to build a classification decision tree; (3) based on the number of deployed parallel computing nodes and the number of branches in each subtree, distribute the subtrees to the computing nodes as evenly as possible; (4) following the decision-tree-based packet classification method, distribute large-scale data packets and classify them in parallel.
The input of the algorithm is the number of nodes n and the original k-dimensional classification rules R. After the rule preprocessing and decision tree construction in Step 1 and Step 2, several (k − 1)-level classification decision subtrees are output.
Step 3 shows how to distribute these subtrees to each computing node as evenly as possible. Finally, step 4 describes the process of distributing and parallel classifying massive packets according to the distributed subtrees.
STEP 1: Rule Preprocessing
For the input classification rules R, the k-dimensional rules are mapped to the k-dimensional matrix space in reverse order by using the rule mapping method, and a series of independent cell spaces is formed after mapping. Generally speaking, classification rules can be expressed in interval form [...], where $F_i$ represents a field such as the source address, destination address, source port or destination port, $D(F_i)$ represents the corresponding domain value interval, decision represents the decision (accept or discard) of the rule, and k is the dimension. $M^k$ can describe a k-dimensional matrix space. The coordinates of each dimension are expressed by $F_i$, and the corresponding coordinate interval is $[0, D(F_i)]$, $1 \le i \le k$. Fig. 4 defines a two-dimensional matrix space in which the coordinate intervals of both dimensions are [0, 9]. According to the idea of rule mapping based on the FDM design model, any rule of the form [...]
In the mapping process, we use the cell space cs (corresponding to a k-dimensional rectangle in the k-dimensional matrix space) to represent a region that is finally decided to be accepted: $[(l_1, l_2, \ldots, l_k)(d_1, d_2, \ldots, d_k)]$, where $l_i$ and $d_i$ refer to the minimum boundary value and the range of the region in each dimension respectively. According to reference [26], if the number of rules is n, the time complexity of rule preprocessing (corresponding to the rule mapping method in the FDM design model) is O(kn), where k refers to the rule dimension.
Suppose the original classification strategy contains nine rules, as shown in Fig. 5 [...]

STEP 2: Constructing the Classification Decision Tree
The purpose of this step is to construct a classification decision tree from the cell spaces obtained by the rule preprocessing in Step 1 (as shown in Fig. 7).
Generally speaking, according to the definition of the coordinate projection interval $P(u, v, F_i)$ of cell spaces u and v on a given dimension $F_i$, the spatial relationship $R(u, v, F_i)$ of any two cell spaces u and v must satisfy one of the six relationships {crossed, covered, included, disjunctive, adjacent, equivalent} [24].
Definition 1: Let cell spaces [...]
Taking the two-dimensional case as an example, the spatial relationship and coordinate projection interval of two cell spaces u and v on dimension $F_1$ are shown in Fig. 8. Based on Definition 1, the coordinate projection interval of n cell spaces on dimension $F_1$ can be recorded as $P(cs_1, \ldots, cs_n, F_1)$. Next, we give the steps for constructing the classification decision tree T, as shown in Algorithm 1.

Algorithm 1 Constructing Classification Decision Tree
Input: n cell spaces cs_1∼cs_n.
Output: the classification decision tree T.
Begin
    t := 1;
    T ← P(cs_1, ..., cs_n, F_1);  // add the intervals of P(cs_1, ..., cs_n, F_1) in turn as child nodes of root; each one becomes the root of a subtree T_i
    while (t < k) {
        t++;
        for each subtree root T_i created at the previous level {
            find all the cell spaces associated with the subspace corresponding to T_i (recorded as {cs});
            seek the coordinate projection interval P({cs}, F_t) of {cs} on dimension F_t;
            T_i ← P({cs}, F_t);  // add the intervals of P({cs}, F_t) as child nodes of T_i
        }
    }
    return T;
End

Taking Fig. 7 as an example, in the initial case T is a one-level decision tree that includes only the root node root. Six cell spaces ($cs_1$∼$cs_6$) in the two-dimensional matrix space form four coordinate projection intervals {[2,3], [4,4], [5,7], [8,9]} on the $F_1$ dimension. The root node {a} of the subtree $T_1$ corresponds to the interval [2,3], and its associated cell spaces $cs_1$ and $cs_5$ form two coordinate projection intervals [2,3] and [7,8] in the $F_2$ dimension. These two projection intervals are added to the subtree $T_1$ as the child nodes {e, f} of the root node {a}. Similarly, the projection intervals [0,3] and [7,8] are added to the subtree $T_2$ as child nodes {g, h}, the projection intervals [0,2] and [5,9] are added to the subtree $T_3$ as child nodes {i, j}, and the projection interval [7,8] is added to the subtree $T_4$ as a child node {k}. The decision tree T is thus formed, as shown in Fig. 7.
It can be seen from Fig. 7 that the decision tree T is composed of four subtrees, and the intervals of the subtree roots are in increasing order. Similarly, the intervals of the leaf nodes within each subtree are also in increasing order. The analysis shows that the branches formed from the root node of T to each leaf node are independent of each other, so the rules corresponding to these branches can be directly divided and grouped. It should be pointed out that, to make it easier to divide the subsequent data packets among the groups, we regard each subtree as a whole and do not cut it.
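The construction can be sketched as follows, under two stated assumptions. First, because Definition 1 is not fully reproduced here, the projection intervals are computed by a standard elementary-interval decomposition (cutting each axis at every cell boundary), which stands in for the paper's definition. Second, the six cell spaces are hypothetical integer rectangles, written as inclusive (lo, hi) pairs per dimension, chosen so that the resulting tree has the same intervals as the example above; they are not the cell spaces of Fig. 7.

```python
from typing import List, Tuple

Interval = Tuple[int, int]        # inclusive integer interval [lo, hi]
Cell = Tuple[Interval, ...]       # one interval per dimension
Node = Tuple[int, int, list]      # (lo, hi, children)


def build_level(cells: List[Cell], dim: int, k: int) -> List[Node]:
    """Build the sorted child nodes for dimension `dim` (0-based)."""
    if dim == k or not cells:
        return []
    # Cut the axis wherever a cell starts or just after a cell ends.
    cuts = sorted({c[dim][0] for c in cells} | {c[dim][1] + 1 for c in cells})
    nodes: List[Node] = []
    for lo, nxt in zip(cuts, cuts[1:]):
        hi = nxt - 1
        covering = [c for c in cells if c[dim][0] <= lo and hi <= c[dim][1]]
        if covering:              # skip gaps that no cell space covers
            nodes.append((lo, hi, build_level(covering, dim + 1, k)))
    return nodes


def count_branches(node: Node) -> int:
    """Number of root-to-leaf paths below (and including) this node."""
    _lo, _hi, children = node
    return 1 if not children else sum(count_branches(c) for c in children)


# Hypothetical cell spaces (F1 interval, F2 interval).
CELLS: List[Cell] = [
    ((2, 3), (2, 3)), ((4, 4), (0, 3)), ((5, 7), (0, 2)),
    ((5, 7), (5, 9)), ((2, 4), (7, 8)), ((8, 9), (7, 8)),
]

tree = build_level(CELLS, 0, 2)
for lo, hi, children in tree:
    print([lo, hi], "->", [[c_lo, c_hi] for c_lo, c_hi, _ in children])
print("branches per subtree:", [count_branches(t) for t in tree])
```

Because sibling intervals are produced from sorted cut points, they come out disjoint and in increasing order, which is the property the later binary search relies on.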

STEP 3: Dividing the Decision Subtree into Computing Nodes
Suppose there are n parallel computing nodes. This step aims to divide all subtrees obtained in Step 2 among these n nodes approximately evenly according to the number of branches. Assuming that the decision tree T has m subtrees and the numbers of branches in the subtrees are $l_1, l_2, \ldots, l_m$ respectively, let $L = l_1 + l_2 + \cdots + l_m$; then the division principle is to make the number of branches assigned to each node as close to $L/n$ as possible.
In the sequence $\{l_1, l_2, \ldots, l_m\}$, the first $t + 1$ values are successively taken starting from $l_1$. If the sum of the first $t$ values is closer to $L/n$ than the sum of the first $t + 1$ values, that is, $\left|\sum_{i=1}^{t} l_i - L/n\right| < \left|\sum_{i=1}^{t+1} l_i - L/n\right|$, the first $t$ subtrees corresponding to $l_1, \ldots, l_t$ are taken and assigned to the first computing node. Continuing from $l_{t+1}$, several subtrees are taken and assigned to the second computing node by the same method.
Finally, all the remaining subtrees are assigned to the last computing node. The specific subtree grouping process is described in Algorithm 2. Intuitively, Algorithm 2 divides all subtrees of the decision tree T into groups according to the number of pre-deployed computing nodes and the number of branches in each subtree. The principle of division is that the numbers of branches in the groups are as close to each other as possible. When classifying data packets, we first determine whether the value of the data packet in the corresponding dimension matches the interval of a subtree root node. If so, the data packet is sent to the computing node holding that subtree for further classification; if no interval can be matched, the packet decision is directly determined as discard.
As shown in Figure 9, the decision tree T has four subtrees $T_1$∼$T_4$, and the numbers of branches in the subtrees are {2, 2, 2, 1} respectively, so the total number of branches is 7. Assuming that the number of computing nodes is 3, we divide the seven branches among the three computing nodes as evenly as possible, so the target number of branches per node is 7/3 ≈ 2.33.
    [...]
            break;
        } else continue;
    } else continue;
    }
    return Group(1)∼Group(n).
End

[...] all the remaining two intervals {[5,7], [8,9]} are divided into the third node.

STEP 4: Classifying Packets in Parallel
Different from the traditional decision-tree-based packet classification method, this method first divides the packets to be classified according to the interval value of each subtree root node, which is approximately equivalent to reducing the packet classification scale to 1/n of the original (n is the number of deployed computing nodes). After the data packets are divided among the parallel computing nodes, they are classified according to the subtrees (decision trees) assigned in advance. According to the division process in Step 3, the scale of the classification decision tree in each node is also reduced to approximately 1/n of T.
As far as data packet classification based on decision tree is concerned, its essence is a query operation. For the decision tree or any of its subtrees, the interval coordinate values corresponding to the child nodes of the root node are strictly increasing, so the binary search method can be directly applied in the process of classification. As shown in Fig. 10, the root node of the decision tree T has four child nodes, and the corresponding interval coordinate values are [2,3], [4,4], [5,7] and [8,9] respectively, satisfying the strict increasing relationship.
Consider a k-tuple packet $P:(e_1, e_2, \ldots, e_k)$. To classify it, start from the root node of the decision tree and first run a binary search over all the child nodes of the root node to judge whether the first field $e_1$ of the packet is included in the interval of some node. If no interval can be matched, it can be determined directly that packet P cannot match any rule, and the packet decision is recorded as discard. Otherwise, the search continues on the subtree under that node. If each field $e_i$ (i := 1 to k) of packet P matches the interval of the node at each layer of a subtree branch, it can be determined that the packet P matches the subtree, and the decision of P is accept.
The classification process is shown in Figure 10. Firstly, the method groups the data packets and divides them among the corresponding computing nodes according to the value of the first dimension of each packet. Then, at each computing node, the packet decision is determined by the decision tree method. Take the packets to be classified $E = \{p_1, p_2, p_3, p_4, p_5, p_6\}$ = {(2,5), (1,8), (4,7), (6,3), (2,8), (8,8)} as an example. First, determine the value $e_1$ of the first dimension of each packet and compare it with the interval values of the child nodes of the root. It can be seen that $p_2$:(1,8) cannot match any interval value, which means that $p_2$ cannot match any rule in the decision tree, so the decision of packet $p_2$ is directly determined as discard. In addition, $p_1$ and $p_5$ are assigned to the first node, $p_3$ is assigned to the second node, and $p_4$ and $p_6$ are assigned to the third node.
The decision-tree-based packet classification then continues at each node. In the subtree corresponding to the first node, the second dimension of $p_1$:(2,5) is 5, which matches neither of the intervals [2,3] and [7,8] of the two child nodes, so the classification decision of $p_1$ is discard, while the second dimension of $p_5$:(2,8) is 8, which matches the interval [7,8], so the decision of $p_5$ is accept. Similarly, the other packets are classified at the second and third nodes: the decisions of $p_3$:(4,7), $p_4$:(6,3) and $p_6$:(8,8) are accept, discard and accept respectively. It should be noted that, because the interval values of the child nodes are strictly increasing, binary search can be used in packet matching. The specific process of packet classification is described in Algorithm 3.

Algorithm 3 Classifying Packets in Parallel
Input: the packet P:(e_1, e_2, ..., e_k), the decision tree T.
Output: the packet decision ('accept' or 'discard').
Begin
    /* Binary search over all the child nodes of root to judge whether e_1 is included in the interval of the s-th subtree Child(root, s). If so, send P to the computing node where the subtree is located; otherwise, the decision of P is discard. */
    if (Bisearch(root, e_1) == true) {
        send P to the computing node where the subtree Child(root, s) is located;
    } else {
        P → discard;
        return;
    }
    /* At the computing node that received P, start from the second dimension and search for e_i on the subtree Child(root, s) (denoted as root_s). If e_i is included in the interval of the t-th subtree of root_s, continue to search for e_(i+1) on the subtree Child(root_s, t). */
    for (i := 2 to k) {
        if (Bisearch(root_s, e_i) == true) {
            root_s := Child(root_s, t);
        } else {
            P → discard;
            return;
        }
    }
    P → accept;
End
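The dispatch and classification steps can be sketched together on the example tree. This is an illustrative sketch, not the authors' Java and Hadoop implementation: the tree literal and the subtree-to-node assignment are copied from the worked example, the function names are hypothetical, and a thread pool merely stands in for the distributed computing nodes.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor

# Example tree: each node is (lo, hi, children); siblings are sorted and disjoint.
TREE = [
    (2, 3, [(2, 3, []), (7, 8, [])]),   # T1
    (4, 4, [(0, 3, []), (7, 8, [])]),   # T2
    (5, 7, [(0, 2, []), (5, 9, [])]),   # T3
    (8, 9, [(7, 8, [])]),               # T4
]
# Subtree index -> computing node index, as in the example (T3 and T4 share a node).
SUBTREE_TO_NODE = [0, 1, 2, 2]


def find_child(children, value):
    """Binary search for the interval containing value; returns index or -1.

    The list of lower bounds is rebuilt on each call for brevity; a real
    implementation would precompute it once per node.
    """
    idx = bisect_right([lo for lo, _hi, _c in children], value) - 1
    if idx >= 0 and value <= children[idx][1]:
        return idx
    return -1


def classify_in_subtree(subtree, packet):
    """Match fields e_2..e_k level by level below the subtree root."""
    node = subtree
    for value in packet[1:]:
        idx = find_child(node[2], value)
        if idx < 0:
            return "discard"
        node = node[2][idx]
    return "accept"


def classify_all(packets, num_nodes=3):
    """Dispatch packets by their first field, then classify per node."""
    decisions = {}
    queues = [[] for _ in range(num_nodes)]
    for p in packets:
        s = find_child(TREE, p[0])
        if s < 0:
            decisions[p] = "discard"      # no root interval matches
        else:
            queues[SUBTREE_TO_NODE[s]].append((TREE[s], p))

    def run_node(queue):
        return [(p, classify_in_subtree(t, p)) for t, p in queue]

    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        for results in pool.map(run_node, queues):
            decisions.update(results)
    return decisions


E = [(2, 5), (1, 8), (4, 7), (6, 3), (2, 8), (8, 8)]
for packet, decision in sorted(classify_all(E).items()):
    print(packet, decision)
```

Each packet is sent to exactly one node, so nothing is replicated during distribution, and packets whose first field matches no root interval are discarded before any node does work.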

IV. EXPERIMENTAL RESULTS AND ANALYSIS
The proposed algorithms were implemented in Java (JDK 1.7), and our experiments were carried out on a desktop PC running Windows 10 with 16 GB of memory and an Intel(R) Core(TM) i7-10510U processor at 1.80 GHz. To realize the distributed processing of packet classification, we built a Hadoop platform with one master and eight slave nodes. The implementation of the algorithm includes the following steps. Firstly, the rules are mapped by FDM [26] to generate independent cell spaces in the multidimensional space. Then, the classification decision tree is constructed based on the spatial relationships of the cell spaces, and the decision tree is divided and deployed to the computing nodes as evenly as possible according to the number of nodes. When classifying data packets, we first group the data packets according to the root node interval coordinate values of the subtrees held by each node, and then distribute them to the computing nodes for classification. The classification results of all data packets are output directly at each computing node. It should be pointed out that the FDM mapping of rules and the construction of the decision tree can be carried out offline in advance. After each classification decision subtree is deployed to its computing node, the data packets can be classified in a distributed manner. Therefore, this method can be used for the parallel classification of large-scale data packets.
To verify the efficiency of the MpFPC classification method, we choose three rule sets of different sizes (100, 1000 and 10000 rules respectively) and four data sets of different sizes (10KB, 1MB, 100MB and 200MB respectively) to test the time required to classify data packets with MpFPC and Uscuts [24]. The test results are shown in Table 1, Table 2 and Table 3 respectively. Taking Table 1 as an example, when the number of rules is 100 and the packet size is 100MB, the classification speed of each node and the average value are plotted in Figure 11. It can be seen that the classification time on each node fluctuates slightly around the average value, which shows that the algorithm performs well in dividing the decision tree as evenly as possible according to the number of nodes. This property is important for improving the classification efficiency of the algorithm.
Next, taking the number of classification rules as the abscissa and the packet classification speed as the ordinate, the MpFPC and Uscuts methods are used to classify packets of four different sizes when the number of rules is 100 and 10000 respectively, and the classification times are plotted in Fig. 12 and Fig. 13. It can be seen that as the packet size increases, the classification speed advantage of MpFPC over Uscuts becomes more obvious. For example, when the packet size is 100MB, the classification time of the MpFPC algorithm is about 1/3 of that of the Uscuts algorithm.
To further test the relationship between the performance of the MpFPC method and the number of nodes, we compare the classification speed when the number of rules is 1000 and the packet size is 100MB and 200MB with three, five and eight computing nodes respectively. The classification speeds are shown in Table 4, Table 5 and Table 6.
For the sake of intuition, we compare the classification speeds of the Uscuts method and of the MpFPC method in the three node configurations and plot them in Fig. 14. As shown in the figure, compared with Uscuts, the classification speed of MpFPC is significantly improved, and as the number of nodes increases, the classification speed of the MpFPC method also increases. For example, when the packets to be classified total 200MB, the time required by the Uscuts method is about 3000ms, while the average time required by the MpFPC method is 1762ms with three nodes, 1408ms with five nodes and only about 1000ms with eight nodes. Therefore, by adding nodes and improving their computing performance, the packet classification speed is expected to improve further toward the wire-speed requirement.
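The timings quoted for the 200MB case can be turned into speedup figures. The sketch below uses only the numbers stated above (the Uscuts and eight-node values are approximate in the text); parallel efficiency, defined here as speedup divided by the number of nodes, is a derived metric added for illustration and is not reported in the paper.

```python
USCUTS_MS = 3000.0                               # approximate, from the text
MPFPC_MS = {3: 1762.0, 5: 1408.0, 8: 1000.0}     # nodes -> average time in ms


def speedup_and_efficiency(baseline_ms: float, parallel_ms: float, nodes: int):
    """Return (speedup over the baseline, speedup per node)."""
    speedup = baseline_ms / parallel_ms
    return speedup, speedup / nodes


for nodes, ms in MPFPC_MS.items():
    s, eff = speedup_and_efficiency(USCUTS_MS, ms, nodes)
    print(f"{nodes} nodes: speedup {s:.2f}x, efficiency {eff:.2f}")
```

The speedup rises from about 1.7x with three nodes to about 3x with eight, so the gain grows with the number of nodes but less than proportionally, which is consistent with part of the time being spent on packet distribution and platform overhead.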

V. CONCLUSION
With the development of network applications, higher requirements are placed on the speed of network packet classification. In this paper, the traditional single-threaded packet classification framework is improved, and a parallel method based on distributed computing is used to classify data packets with a decision tree. The proposed algorithm has two innovations. Firstly, the decision tree is divided into several sub-decision trees for distributed packet classification, and the FDM mapping method based on a multidimensional matrix is applied before constructing the decision tree to remove rule conflicts and redundancy, which avoids the rule replication problem of the traditional decision tree method. Secondly, during packet classification, each packet is compared against the interval value of the root node of each decision subtree. After the matching interval is found, the packet is distributed to the corresponding decision subtree for classification, which realizes the distributed processing of packets and further improves the efficiency of packet classification. The experimental results show that MpFPC classifies faster than Uscuts, and that its speed advantage becomes more obvious as the packet size increases. The results also show that the classification speed increases as nodes are added, which provides a possible new way to meet the wire-speed classification requirement in the new generation network.
The design method discussed in this paper is not limited to packet classification; it can also be extended to other applications, such as OpenFlow switches, firewalls and security gateways. As network applications continue to evolve, in our future work we will provide new methods that address the OpenFlow requirements of high-speed classification and fast rule updates.