Traditional network processors (NPs) adopt a pull model, in which NP cores pull packet data from external memory into local memory, triggered by cache misses or explicit fetch instructions. Because data fetching incurs long latency, hardware multithreading is typically used to hide the waiting time. Multithreading, however, incurs context-switch overhead, making it inefficient for payload-processing applications. We propose a push model for future NP architectures to increase throughput and decrease processing delay. A hardware push unit moves the segments of a packet into a core's local memory ahead of use, reducing hardware thread switching. We present theoretical analyses comparing the performance of the pull and push models, and verify the design on our FPGA-based THNPU NP platform. Experimental results indicate that the push model not only improves system throughput but also reduces delay, at the cost of only a small increase in logic gates.
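The pull/push contrast described above can be illustrated with a first-order cycle-count model. The sketch below is a hypothetical illustration, not the paper's analysis: all parameters (fetch latency, switch overhead, processing cycles, segment count) are assumed values chosen only to show the structural difference between the two models.

```python
# Hypothetical first-order model contrasting pull vs. push NP cores.
# All parameters are illustrative assumptions, not measurements
# from the THNPU platform described in the paper.

FETCH_LATENCY = 100    # cycles to pull a segment from external memory
SWITCH_OVERHEAD = 20   # cycles per hardware context switch
PROCESS_CYCLES = 60    # cycles of computation per segment
SEGMENTS = 8           # segments per packet


def pull_model_delay():
    """Pull model: each segment triggers a fetch; the core switches
    threads to hide fetch latency, paying switch overhead twice per
    miss (switch away, then switch back)."""
    per_segment = PROCESS_CYCLES + 2 * SWITCH_OVERHEAD
    # If work on other threads cannot fully cover the fetch latency,
    # the residual stall is also paid.
    stall = max(0, FETCH_LATENCY - per_segment)
    return SEGMENTS * (per_segment + stall)


def push_model_delay():
    """Push model: a hardware push unit stages segments into local
    memory ahead of use, so the core processes back to back with
    no thread switches."""
    return SEGMENTS * PROCESS_CYCLES


if __name__ == "__main__":
    print("pull:", pull_model_delay(), "cycles")
    print("push:", push_model_delay(), "cycles")
```

Under these assumed parameters the push model's per-packet delay is lower because it removes both the context-switch overhead and any residual stall cycles; the real gain on hardware depends on how well the push unit can stay ahead of the processing core.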