The ongoing move of hardware platforms to many-core processors challenges traditional software design methodology. It is critical to develop new programming paradigms and efficient ways to port legacy applications. This paper analyzes a typical packet processing application, together with the cache hierarchy and cache behavior of a Raw-architecture many-core processor. It presents an easy-to-implement, run-time dynamic core grouping approach to improve system performance. The approach reduces cache swap latency by grouping neighboring cores attached to the mesh network, and it tunes the group size using experimental data collected beforehand. Test results show that the approach improves the performance of a Deep Packet Inspection (DPI) system by around 10% with only minor code changes.
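The grouping idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`group_neighbor_cores`, `max_hops`, `pick_group_shape`), the rectangular tiling of the mesh, and the use of Manhattan hop count as a stand-in for cache swap latency are all assumptions made for illustration. The measurement table passed to `pick_group_shape` stands for the experimental data gathered beforehand; its format is hypothetical.

```python
from typing import Dict, List, Tuple

Core = Tuple[int, int]  # (row, col) position of a core on the mesh (assumed layout)


def group_neighbor_cores(rows: int, cols: int, gh: int, gw: int) -> List[List[Core]]:
    """Tile a rows x cols mesh into blocks of at most gh x gw adjacent cores.

    Rectangular tiling is an assumption; blocks on the right and bottom
    edges are smaller when the mesh size is not a multiple of the block size.
    """
    if min(rows, cols, gh, gw) < 1:
        raise ValueError("all dimensions must be positive")
    groups: List[List[Core]] = []
    for r0 in range(0, rows, gh):
        for c0 in range(0, cols, gw):
            groups.append([
                (r, c)
                for r in range(r0, min(r0 + gh, rows))
                for c in range(c0, min(c0 + gw, cols))
            ])
    return groups


def max_hops(group: List[Core]) -> int:
    """Largest Manhattan distance between any two cores in a group.

    Used here only as a rough proxy for the worst-case mesh distance
    that data has to travel between members of the same group.
    """
    return max(abs(a[0] - b[0]) + abs(a[1] - b[1]) for a in group for b in group)


def pick_group_shape(measured: Dict[Tuple[int, int], float]) -> Tuple[int, int]:
    """Return the (gh, gw) shape with the highest pre-measured throughput.

    `measured` maps a candidate group shape to a throughput figure
    collected offline (hypothetical format).
    """
    if not measured:
        raise ValueError("no measurements supplied")
    return max(measured, key=measured.get)
```

A caller would first pass its offline measurements to `pick_group_shape`, then hand the chosen shape to `group_neighbor_cores` to obtain the core groups. Keeping groups compact bounds `max_hops`, which is the intuition behind grouping neighboring cores rather than arbitrary ones.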
Published in:
Parallel Computing in Electrical Engineering (PARELEC), 2011 6th International Symposium on
Date of Conference: 3-7 April 2011