I. Introduction
The ever-growing volume of traffic in modern networks motivates the use of machine learning (ML) in networking [1]. For network monitoring in cybersecurity, a promising idea is to classify traffic as early as possible, directly within network devices. Resources in network devices have traditionally been constrained in terms of memory, processing capacity, and the set of available operations. Consequently, these devices have been treated as “dumb” from a traffic monitoring perspective, performing only the essential functions required for the network to operate. The emergence of new data plane architectures raises the hope that network devices will perform functions beyond simple traffic forwarding. Doing so alleviates the burden on the control and management planes and decentralizes a portion of the processing. Additionally, processing within the network device is faster and reduces the need to offload work to the control plane. Network programmability entails the ability to specify and modify algorithms in both the control and the data plane [2].