The fine-grained parallelism inherent in FPGAs has encouraged their use in packet processing systems. To facilitate debugging and performance evaluation, designers require on-chip monitors that abstract away low-level details and provide a system-level perspective. In this paper, we present five architectures that permit transaction-based, communication-centric monitoring of packet processing systems. We compare the resource requirements and filtering functionality of each architecture, demonstrating that sequential matching is more resource-efficient than parallel matching. We also show that generic filtering incurs low overhead compared to specialised filtering while providing additional flexibility. Finally, we present a scalable architecture that is more flexible and more adaptable to matching requirements than the other architectures. These monitoring architectures permit the implementation of a highly effective test system that provides a system-level perspective and is more resource-efficient than conventional RTL debug environments.