The quick-start extension of the transmission control protocol (TCP) and the explicit control protocol (XCP) are experimental congestion control schemes that use router feedback to overcome limitations of TCP's standard mechanisms. Both approaches require additional packet processing in every router and therefore raise the question of whether, and how, this can be achieved in high-speed routers. This paper studies the realization complexity of the quick-start and XCP router functions on a network processor. We show that in both cases synchronization issues among parallel processing entities have to be considered, and that these issues affect router performance. We develop and compare different synchronization mechanisms for highly parallel packet processing. Our prototype implementation on an Intel IXP network processor allows us to quantify the impact on throughput and delay caused by the additional packet processing in the fast path. The measurements reveal that quick-start and XCP processing is feasible at line speeds of multiple Gbit/s, with quick-start being simpler to scale.