Abstract:
Generative pre-trained transformers are highly effective as generative models and classifiers and are widely used in natural language processing and computer vision. This work explores generative pre-trained transformer-based models for proprietary protocol network traffic. Building a pre-trained model for such traffic is non-trivial, however, because the formats are heterogeneous and unknown and proprietary protocol traffic datasets are extremely scarce. In this paper, we present PNetGPT, a pre-trained transformer-based model for generating proprietary protocol network traffic. We construct the first dataset of two real-world proprietary protocols. After training on this dataset, PNetGPT can generate high-quality proprietary protocol network traffic to support applications such as reverse analysis, protocol fuzz testing, and intrusion detection. We evaluated PNetGPT on two real proprietary protocols and demonstrated state-of-the-art (SOTA) performance in handling heterogeneous unknown formats. The code and datasets are available at: https://github.com/Snail1502/PNetGPT
Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025