CTLA: Compressed Table Look up Algorithm for Open Flow Switch

The TCAM memory of an Open Flow switch grows as more entries are added to its flow table. Looking up an IP address requires finding the longest matching prefix, and to keep up with link speeds the IP lookup operation on the forwarding table must also be sped up. TCAM scalability and storage, however, are constrained by its high power consumption and circuit density. Most existing studies improve either time efficiency or space efficiency, but not both; to boost performance further, this study focuses on algorithms that are both time and space efficient. To strike a balance between fast data access and efficient storage, this study proposes combining compression with a quick lookup mechanism to satisfy the space and speed requirements of the Open Flow switch. Because the data is compressed, performance improves: less memory is required to store the lookup table and fewer bits need to be searched. The lookup complexity of the proposed approach is $O(\log(\log n/2))$ and the average space reduction is 61%.


I. INTRODUCTION
One of the major functions of Open Flow [1] is to forward packets according to the set of rules defined in the forwarding table. When a packet arrives at an input port of the switch, a match can be performed in the flow table using the destination IP address as a key. The flow table size is typically limited, and as the number of computers and switches grows, the number of flow table entries also grows, degrading the performance of the switch. According to the Open Flow switch specification [2], the switch suffers from limited storage capacity. The flow table consists of several entries, each mapping to its corresponding forwarding information; flow table fields include the VLAN ID, source or destination MAC address, source or destination IP address, and so on. These entries are stored in a memory called TCAM. The Open Flow switch forwarding table is limited in storage capacity and consumes significant power [3]. As the number of network users increases, there is a demand either to accommodate more entries in the existing switch through an optimization algorithm or to replace the old switch with a new one with a larger flow table. If the flow table entries are not optimized, network performance degrades due to frequent controller updates. Hence, optimizing the space of the flow table is a challenging task in the Open Flow switch [4].
Definition I: Given an IP address p of length w, convert it into a compressed address p' of length w', where w' is strictly less than w.
To decrease the memory requirement of the flow table, a variety of data compression techniques can be applied. Dictionary schemes [5] and statistical schemes are the two main categories of data compression techniques. These techniques convert a sequence of binary information into another binary representation whose length is smaller than the original, so the compressed representation occupies less space than the uncompressed one. The flow table is then searched against destination IP addresses using a process known as longest prefix matching (LPM) [6].
Definition II: Given a destination IP address d and a set of prefixes P (the prefix database), find the longest prefix in P that matches d.
Due to the continuous growth of flow table entries, finding an efficient data structure for IP lookup is crucial. With the huge traffic in IP forwarding, IP lookup cannot keep pace with current optical link speeds, so lookup speed has become a major bottleneck for high-performance networks [7], [8]. IP lookup in a compressed flow table performs better than in an uncompressed one. Lookup can be done either in hardware or in software, and its speed depends mainly on the number of memory accesses. Content addressable memory (CAM) is a hardware mechanism for performing IP lookup that returns an exact match for a given IP address.
Modern networks use classless inter-domain routing (CIDR) [9], in which IP addresses have varying prefix lengths. A Ternary Content Addressable Memory (TCAM) can be used for variable-length prefix matching [10]. A recent survey by the CAIDA consortium [11] shows that most prefix lengths are 24 bits. Network routing is done based on address prefixes, and a shorter prefix covers a larger address space with fewer bits; to restrict the growth of the forwarding table, many network system designers limit the prefix length distribution. In the IPv4 prefix length distribution, most prefixes lie in the range of 16 to 24. TCAM uses three symbols to represent data in memory: 0, 1, and X (don't care). For a given IP address, the TCAM generates multiple matched prefixes; the result set is then sent to a priority encoder to select the longest prefix, which points to the corresponding SRAM entry storing the next-hop information. The main advantage of TCAM is that it generates the result in a single clock cycle; its drawbacks are limited memory and high power consumption (10-15 Watts per chip). In this paper, a statistical compression scheme is used to reduce the number of stored bits of IPv4 or IPv6 addresses in the flow table by applying the standard Huffman algorithm [12]. A trie-based lookup algorithm is then performed on the compressed flow table.
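As a concrete illustration of the matching a TCAM performs, here is a minimal software sketch of longest-prefix matching over masked entries, where the "don't care" bits are simply the bits beyond each prefix length. A real TCAM compares all entries in parallel and resolves ties with a priority encoder; the loop below only emulates that behaviour, and all identifiers are illustrative.

```python
def ip_to_bits(ip: str) -> int:
    """Convert a dotted-quad IPv4 address to a 32-bit integer."""
    a, b, c, d = (int(x) for x in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def lpm(table, dst: str):
    """Return the next hop of the longest matching prefix, or None.

    `table` maps (prefix_int, prefix_len) -> next_hop. A TCAM would
    compare all entries in one cycle; the priority-encoder step is
    emulated here by keeping the match with the largest prefix_len.
    """
    addr = ip_to_bits(dst)
    best_len, best_hop = -1, None
    for (prefix, plen), hop in table.items():
        # Bits beyond plen are "don't care": mask them off before comparing.
        mask = ((1 << plen) - 1) << (32 - plen) if plen else 0
        if (addr & mask) == (prefix & mask) and plen > best_len:
            best_len, best_hop = plen, hop
    return best_hop

table = {
    (ip_to_bits("10.0.0.0"), 8): "A",
    (ip_to_bits("10.1.0.0"), 16): "B",
}
print(lpm(table, "10.1.2.3"))  # longest match is 10.1.0.0/16 -> "B"
print(lpm(table, "10.2.0.0"))  # only 10.0.0.0/8 matches -> "A"
```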
The rest of the paper is organised as follows: Section II describes related work, Section III summarises the contribution, Section IV proposes the compressed table lookup algorithm, Section V discusses the results, and Section VI concludes.

II. RELATED WORK
The Open Flow switch specification prescribes storing the flow entries in TCAM memory even though that memory is limited [1], [3], [13], [14], [15]. An optimum compression technique that drastically reduces the number of bits to be stored in memory was proposed in [12]. This compression is based on string frequency: a string with high frequency is encoded with a short binary code, while others receive longer codes. Bonny et al. [16] used a canonical Huffman compression method to drastically reduce the routing table size. Compressing an IP table using a prefix tree in optimal time was proposed in [17]: the whole data set is grouped into categories (alphabets, numbers, operators, and all remaining symbols as one group), the symbols in each group are arranged in non-increasing order of frequency, and a code is assigned to each symbol in each group. These methods increased the compression ratio by 2-3% over the adaptive Huffman coding technique [18], but suffered from having to group and arrange the symbols of each group separately.
An architecture for the forwarding table using a graph-theoretic approach, encoding the memory requirement per row of the table, is proposed in [19]. This scheme applied separate encoding to each column of the forwarding table, giving a worst-case complexity of O(d), where d is the number of columns; this is independent of the values and the number of rows in the table. A hardware-based longest prefix matching method that reduces the power consumption of the forwarding table by up to 30% is proposed in [10].
Though this approach offers fast IP lookup and reduced power consumption, it has limited memory capacity. A single-trie representation that combines level compression with a prefix DAG (Directed Acyclic Graph) is proposed in [20]; existing lookup algorithms could use this method to improve lookup performance. That work used a heuristic algorithm to perform IP lookup.
TCAM consumes a large amount of power when performing IP lookup and adds to the switch latency. A TCAM lookup for the Open Flow switch that fully searches the flow key in a hash table is proposed in [21]. An optimal lookup algorithm for forwarding tables of limited size is proposed in [22]. A prefix tree representing all entries in the forwarding table is proposed in [23]; it uses an efficient data structure to compress any tree such that the number of memory accesses to find the LPM for any entry depends only on the prefix length, not on the stored prefix value. A technique combining a directed acyclic graph (DAG) with a level-compressed (LC) trie is proposed in [24] and demonstrated a 34% improvement in IP lookup over the LC trie [25]. A binary prefix tree for sorting prefixes is proposed in [26], later extended to a multi-way tree for better performance. Ruiz-Sanchez et al. [27] described a forwarding table implementation that uses a simple trie data structure to find the longest prefix match (LPM). A binary trie has many single-child nodes without branching; these bits must still be examined even though there is no branching, and the one-child nodes occupy extra memory. Path compression removes such one-child nodes from the binary trie and stores the relevant information in other nodes; the worst-case lookup time is O(w) and the space complexity is O(n), which is better than the plain binary trie. To improve the search time, more than one bit can be examined at a time; this group of bits is called a stride, and stride selection reduces the number of levels in the trie. Every node in a multi-bit trie [28] has 2^k children, where k is the stride value. For example, examining 2 bits at a time requires only 16 memory accesses in the worst case for an IPv4 address, so the search time complexity of the multi-bit trie is O(w/k). To reduce the search time further, binary search on prefix lengths was introduced by Waldvogel [29]: the search space is halved in each iteration, giving a time complexity of O(log w). A technique for rule compression that combines rules according to non-prefix wildcards shared by all rules with the same output port number is presented in [30]. Machine learning methods have been used to identify overlapping rules in ACLs and resolve conflicts [31]. After analysing the connections between match fields, RETCAM divides the redundancy between different fields into three categories [32]; its task-based compression scheme dynamically adjusts the TCAM's size based on the quantity and complexity of the operations to be performed on the flow table. The survey in [33] provides a foundation for reviewing current packet header compression research and standards. In the method of [34], each IP address is represented as a collection of key-value pairs, where the key is the IP prefix and the value is the matching IP address; high-performance IP lookups are enabled by a multibit trie that quickly searches IP addresses by prefix, and the work experimented with various stride values, measuring the lookup time over different data sets. Today's concern is optimizing space while reducing IP lookup time: the goal is a design that performs the lookup operation in less time and consumes less space.

III. CONTRIBUTION
This paper proposes a new method that compresses the forwarding table and performs IP lookup efficiently. The proposed method combines the Huffman encoding scheme with a modified trie data structure. In Huffman coding, all the octet values of the IP addresses are grouped into segments, and a tree is constructed from the frequency of each element in each segment. A minimum priority queue is used to construct the Huffman tree for each segment; the construction takes O(n log n) time, since each insertion into the priority queue takes O(log n). The whole forwarding table is then rewritten using the new binary code words from the respective segments. From the compressed forwarding table, a complete trie is constructed based on the maximum prefix length; all next-hop information resides at the leaf nodes. A separate hash table is maintained at each level to store the node values, and a perfect hash function is used to test prefix membership in O(1). A binary search mechanism locates the corresponding level of the trie: at a level k, k bits are extracted from the incoming IP address and hashed to check whether the prefix is present; if so, the binary search over the level tables is applied recursively to find the best-matched next hop. The given IP address is first converted to its Huffman code, which is stored in an array.
The major contributions of the proposed approach are:
• Many studies concentrate only on space- or time-efficient algorithms; to further improve performance, this study focuses on algorithms that are both time and space efficient.
• The height of the trie data structure is the primary factor influencing search time, and compression is a key factor in lowering the height; Huffman coding is crucial when storage is scarce.
• The proposed method combines the Huffman encoding scheme with a modified trie data structure.
• Alongside the trie data structure, a combination of hashing and binary search is used to improve the performance of search operations.
• A recursive binary search procedure over the compressed forwarding table has complexity O(log(log n/2)).
• On average, this scheme saves 37% and 61% of memory for the IPv4 and IPv6 data sets, respectively. A theoretical comparison of the proposed approach with existing approaches is shown in Table 1.

IV. PROPOSED COMPRESSED TABLE LOOK UP ALGORITHM (CTLA)
Fig. 1 shows an overview of the proposed method. With Huffman coding, effective compression is achieved by lowering the average length of the encoded IP address: less common symbols receive longer codes and more common ones receive shorter codes. Since the original data can be perfectly restored from the compressed data, this is a lossless compression technique; Huffman encoding produces variable-length codes without any loss of information. The IP addresses are first split octet-wise, and a Huffman tree is constructed for each octet sequence. After the Huffman tree is built, each node's left edge is labelled 0 and its right edge 1. Because the number of bits per IP address in the lookup table drops noticeably, trie searching requires far fewer comparisons than searching without compression. When the routing information protocol (RIP) is updated, the Huffman tree is rebuilt. In this arrangement, a minimum priority queue keeps the minimum frequency at the root: the two minimum-frequency values are repeatedly removed from the queue and their sum is combined into a new node. Once the full tree has been constructed, left pointers are labelled 0 and right pointers 1, and a binary code is created for each value in the set by iteratively traversing the Huffman tree, as shown in Algorithm 1.
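The octet-wise Huffman construction described above can be sketched as follows, using a min-priority queue (`heapq`) to hold the minimum-frequency nodes. This is a generic textbook Huffman builder standing in for Algorithm 1, not the authors' exact pseudocode, and all names are illustrative.

```python
import heapq
from collections import Counter

def huffman_codes(octets):
    """Build Huffman codes for one octet column of the forwarding table.
    Left edges are labelled '0', right edges '1'."""
    freq = Counter(octets)
    # Heap entries: (frequency, tie_breaker, tree). A tree is either a
    # leaf octet value or a (left, right) pair of subtrees; the unique
    # tie_breaker keeps tuple comparison from reaching the tree field.
    heap = [(f, i, v) for i, (v, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)  # two minimum-frequency nodes
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tie, (t1, t2)))  # merged node
        tie += 1
    codes = {}
    def walk(tree, code):
        if isinstance(tree, tuple):       # internal node
            walk(tree[0], code + "0")
            walk(tree[1], code + "1")
        else:                             # leaf: record its code word
            codes[tree] = code or "0"     # single-symbol edge case
    walk(heap[0][2], "")
    return codes

octets = [192, 192, 192, 10, 10, 172]
codes = huffman_codes(octets)
# The most frequent octet (192) receives the shortest code.
assert len(codes[192]) <= len(codes[172])
```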
The original forwarding table is then modified using the Huffman coding method. The algorithm takes every entry of the forwarding table and splits it into three or four sets based on the prefix length. Each set, together with its frequencies, is passed to the Huffman tree construction algorithm, and the Huffman code replaces each octet of every entry in the forwarding table. Algorithm 2 describes the lookup over the compressed forwarding table. Forwarding Table 2 shows destination IPs with the corresponding next-hop information. Table 3 shows the frequencies of the first octet, Fig. 2 shows the Huffman tree construction for the first octet, and Table 4 shows the binary codes for the first octet values. The total number of bits required for the first octet after compression is 92; the compression ratio is 92/240 = 38.33%, i.e., only 38.33% of the original memory is required.
Hence, the percentage of space reduction is 61.66% for the first octet.
Table 5 shows the frequencies of the second octet, Fig. 3 shows the Huffman tree construction for the second octet, and Table 6 shows the binary codes for the second octet values. After compression, the second octet requires 35.41% of the original memory; hence, the percentage of space reduction is 64.58% for the second octet. Table 7 shows the frequencies of the third octet, Fig. 4 shows the Huffman tree construction for the third octet, and Table 8 shows the binary codes for the third octet values. After compression, the percentage of space reduction is 58.33% for the third octet. Table 9 shows the concatenated binary code for each entry (IP address) in the forwarding table. Without compression, 30 * 24 = 720 bits are required.
The overall compression ratio is 280/720 = 38.89%: one can use 38.89% of the memory to represent these entries in the forwarding table. Hence, one can save up to 61.11% of the space in the forwarding table.
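The overall space-saving arithmetic can be checked in a few lines, with the bit counts taken from the running example in the text:

```python
# 30 table entries, 24 bits (three octets) per uncompressed entry.
compressed_bits = 280          # total bits after Huffman coding (from the text)
uncompressed_bits = 30 * 24    # 720 bits without compression

ratio = compressed_bits / uncompressed_bits
print(f"memory used: {ratio:.2%}")      # ~38.89% of the original
print(f"space saved: {1 - ratio:.2%}")  # ~61.11%
```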
Theorem 1: Construction of the Huffman tree takes O(n log n) time.
Proof: Assume the set S contains n entries. The Huffman tree is constructed bottom-up: each merging operation creates a new node by combining the two minimum-frequency nodes, and each such operation costs O(log n). There are (n − 1) merging operations for n elements, so building the tree requires O(n log n) time in the worst case.
Consider the fixed standard data set shown in Table 10 as an illustration of the compression. This data set has two fields, the IP prefix and the next hop. A binary trie for the data set is constructed by extracting the binary information according to the prefix length; the trie holds all values at the leaf level. During the insertion of each prefix into the trie, the node values are added to level-wise hash tables. Since the key values at each level are known, a perfect hash h(key) = key mod size^2 can be applied to find any key value in a level table. Once all prefixes from the data set have been added, a binary search identifies the particular level in which a key value should be searched; instead of scanning the prefix from the root down to a leaf, this binary search reduces the search time. Once the level is located, perfect hashing is applied on that level table using the corresponding bits of the incoming prefix, revealing whether the prefix exists in the table. If so, the binary search procedure is repeated until the match is found in the trie. Fig. 5 shows the complete binary trie constructed for forwarding Table 10; all next-hop information resides at the leaf nodes, and each level of the trie maintains its own level table. A destination IP address x is first encoded into its compressed binary code using the values stored in the arrays. Consider the key to be searched, x = 10001. Binary search yields the middle level of the trie structure shown in Fig. 5; for the given destination IP x, the middle level is 2. Now extract the first two bits of x from left to right (10). Perfect hashing on level table T[2] indicates whether this key (10) is present in the table; if so, proceed down the trie until the match is found, at which point the corresponding next-hop information for the given key x is returned.
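A minimal sketch of the level-table lookup described above. It assumes, as in the paper's complete trie, that every ancestor of a stored prefix is itself a trie node, so the presence of a key's first k bits in the level-k table is monotone in k and binary search over the levels is valid. Python dictionaries stand in for the per-level perfect hash tables, and all names are illustrative.

```python
_MISS = object()  # sentinel distinct from any stored value

def build_level_tables(prefixes):
    """Build per-level tables from (prefix_bit_string, next_hop) pairs.
    tables[k] maps every k-bit trie-node string to its next hop (leaves)
    or None (internal nodes)."""
    max_len = max(len(p) for p, _ in prefixes)
    tables = [dict() for _ in range(max_len + 1)]
    for bits, hop in prefixes:
        for k in range(1, len(bits)):
            tables[k].setdefault(bits[:k], None)  # ancestor (internal) node
        tables[len(bits)][bits] = hop             # leaf carries the next hop
    return tables

def ctla_lookup(tables, key):
    """Binary search over trie levels: O(log h) level probes, each an
    O(1) hash probe. Returns the next hop of the deepest matching leaf,
    or None if only internal nodes (or nothing) matched."""
    low, high, match = 1, len(tables) - 1, None
    while low <= high:
        mid = (low + high) // 2
        node = tables[mid].get(key[:mid], _MISS)
        if node is _MISS:
            high = mid - 1   # no trie node this deep: search shallower
        else:
            match = node     # node exists; remember the hop if it is a leaf
            low = mid + 1    # try to match a longer prefix
    return match

prefixes = [("10001", "P1"), ("101", "P2"), ("11", "P3")]
tables = build_level_tables(prefixes)
print(ctla_lookup(tables, "10001"))  # -> P1
print(ctla_lookup(tables, "11"))     # -> P3
```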
The recurrence relation for the binary search over the height h is T(h) = T(h/2) + O(1): each iteration halves the number of levels, and the hash probe into each level table returns the key in constant time O(1).
By the Master Theorem [35] (case 2), T(h) = O(log h). Since the height of the complete trie is log n by Theorem 2, substituting h = log n gives a search time complexity of O(log log n).
Theorem 2: The height of a complete binary trie with n nodes is log n.
Proof: Assume every level of the trie holds the maximum number of nodes, so n = 2^(h+1) − 1 ≤ 2^(h+1). Taking the logarithm of both sides gives h = O(log n).
If an octet value is not already present, it is inserted into the Huffman tree and the resulting binary code is stored in the corresponding array; by Theorem 1, this takes O(n log n) effort.
Theorem 4: The search complexity of the proposed approach before compression is O(log log n).
Proof: By Theorem 2, the height of the complete binary trie is h = log n. To search a key x, note that every level of the trie stores its node values in a table. The search procedure starts from the middle level, middle = (0 + h)/2, which indicates a level in the trie. Extract 'middle' bits of x starting from the most significant bit, and apply hashing on these extracted bits to check whether they are present in the middle-level table. If so, recursively compute a new middle level and apply hashing until the match is found. The binary search over the levels takes log h steps, and each table probe costs O(1) because of hashing, so the total cost of finding the next hop is log h + O(1).
Since h = log n, this is O(log log n), which proves the theorem.
Theorem 5: Inserting or deleting a prefix in the modified trie takes O(log n).
Proof: To insert or delete a key x, traverse the complete binary trie of height h, checking the bits of x from the root to a leaf and branching left or right according to each bit. In the worst case this takes O(h), which is O(log n) by Theorem 2.
Theorem 6: Lookup in CTLA takes O(log(log n/2)).
Proof: By Theorem 4, looking up an IP address in the modified trie structure takes O(log log n). The forwarding table is compressed by the Huffman encoding scheme, so the number of levels in the CTLA trie reduces to (log n)/2. Because of this reduction in the number of levels, the overall lookup complexity reduces to O(log(log n/2)).

V. RESULTS AND DISCUSSION
The proposed method was simulated in C++ on a PC with an Intel i5-2400 CPU clocked at 3.10 GHz. The experiment was first carried out with various trie-based algorithms on the Netbench dataset [36]. Fig. 6 shows the results of applying the binary trie, path-compressed trie, multi-bit trie with stride = 2 and 3, and the proposed CTLA to the uncompressed forwarding table; it clearly shows that uncompressed-table IP lookup using the CTLA trie outperforms the binary trie, path-compressed trie, and multi-bit trie with stride = 2 and 3. Fig. 7 shows the results of applying the same structures to the compressed forwarding table; it clearly shows that compressed-table IP lookup using the CTLA trie again outperforms the binary trie, path-compressed trie, and multi-bit trie with stride = 2 and 3.
FIGURE 8. Look up time for the compressed and the uncompressed forwarding table using the proposed trie.
Fig. 8 shows the results of applying the proposed CTLA to the compressed and uncompressed forwarding tables; it clearly shows that compressed-table IP lookup using the CTLA trie outperforms lookup over the uncompressed trie structure.

VI. CONCLUSION AND FUTURE WORK
In this paper, the flow table is compressed using Huffman codes, and IP lookup is then performed on the compressed table. Octets are compressed because their values repeat frequently: the more often an octet value occurs, the shorter its binary code. The full forwarding table is then rewritten with the new, shorter codes. For the prefix sets above, this method saves 61% of memory on average. The proposed trie data structure performs IP lookup efficiently: the search procedure applied to the compressed forwarding table has complexity O(log(log n/2)), because compression reduces the modified trie to (log n)/2 levels. The space complexity after compression is O(n log n/2), compared with O(n log n) without compression. Because of the reduction in space, lookup efficiency can be improved further.
Future Work: Requirements for effective compression and fast look-up techniques may change often in dynamic datasets.The system needs to be flexible enough to adjust to modifications in the data structure without sacrificing efficiency.It can be difficult to achieve real-time compression while keeping latency low, particularly in systems with strict performance requirements.
For each octet of an IP address, the binary code is stored in one of four arrays, a[0:255], b[0:255], c[0:255], and d[0:255]; each octet value ranges from 0 to 255. To add an address x to F, check the arrays at the corresponding indices: if the octet values already exist, their frequencies are incremented, and the binary codes of all octets are concatenated into a single code stored in F along with the next hop. To delete an IP address x from F, check the arrays at the corresponding indices and, if the octet values exist, decrement their frequencies. When the octet values are already present, an address can be added or deleted in constant time O(1).
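The octet-array bookkeeping above can be sketched as follows. `OctetTables` and its methods are illustrative names, and Python lists of dictionaries stand in for the arrays a[0:255]..d[0:255]; a production version would also store each slot's Huffman code.

```python
class OctetTables:
    def __init__(self):
        # One (frequency, code) table per octet position, indices 0..255.
        self.tables = [[{"freq": 0, "code": None} for _ in range(256)]
                       for _ in range(4)]

    def add(self, ip: str) -> bool:
        """Insert: bump the frequency of each octet value. If every value
        was already present, the update is O(1); a brand-new value forces
        a Huffman-tree rebuild, which costs O(n log n) by Theorem 1."""
        new_symbol = False
        for pos, octet in enumerate(int(x) for x in ip.split(".")):
            slot = self.tables[pos][octet]
            if slot["freq"] == 0:
                new_symbol = True
            slot["freq"] += 1
        return new_symbol  # True -> caller should rebuild the Huffman codes

    def delete(self, ip: str):
        """Delete: decrement the frequency of each octet value."""
        for pos, octet in enumerate(int(x) for x in ip.split(".")):
            slot = self.tables[pos][octet]
            if slot["freq"] > 0:
                slot["freq"] -= 1

t = OctetTables()
print(t.add("192.168.1.1"))  # first appearance of these octets -> True
print(t.add("192.168.1.1"))  # all octets already present -> False
```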

FIGURE 6. Look up time for the uncompressed forwarding table using the binary trie, path compressed trie, multi-bit trie with stride = 2, 3 and the proposed CTLA.

FIGURE 7. Look up time for the compressed forwarding table using binary trie, path compressed trie, multi bit trie with stride = 2, 3 and the proposed CTLA.

TABLE 1. Theoretical Analysis of Time and Space Complexity.
Since the height of the complete trie is O(log n), applying binary search makes the search complexity O(log(log n)). If the modified search procedure is applied to the compressed forwarding table, the complexity becomes O(log(log n/2)), because compression reduces the modified trie to (log n)/2 levels. The space complexity after compression is O(n log n/2); without compression it is O(n log n). The total time to construct the Huffman tree is O(n log n), and inserting a prefix into the modified trie takes O(log n/2); for n entries in the forwarding table this amounts to O(n log n/2). The overall construction time is therefore O(n log n + n log n/2).

TABLE 5. Frequency Table for Second Octet
TABLE 6. Huffman Code for Second Octet
TABLE 7. Frequency Table for Third Octet
TABLE 8. Huffman Code for Third Octet
Each octet column requires 240 bits without compression. The compression ratio 100/240 = 41.66% corresponds to the third octet; the second octet uses 35.41% of the original memory.

TABLE 9. Modified Forwarding Table After Compression
TABLE 10. Prefix Table
Each level of the trie maintains a level table which stores the node values at that level and is consulted when searching for a given destination IP address.
Theorem 3: The update complexity of the compressed table in the worst case is O(n log n).
Proof: Assume a new IP address x is to be added into the forwarding table F.