MalSSL—Self-Supervised Learning for Accurate and Label-Efficient Malware Classification


Abstract:

Malware classification with supervised learning requires a large labeled dataset, and labeling is expensive and time-consuming. In this paper, we explore the efficacy of self-supervised learning techniques for malware classification. We propose MalSSL, a self-supervised learning-based method that uses image representation to classify malware. MalSSL classifies unlabeled malware images using contrastive learning and data augmentation. The model is first trained on the unlabeled Imagenette dataset as a pretext task and then retrained on an unlabeled malware dataset for the downstream tasks. Two downstream tasks were used to evaluate the system: 1) malware family classification and 2) malware versus benign classification. MalSSL reaches an accuracy of 98.4% for malware family classification on the Malimg dataset and an accuracy of 96.2% for malware versus benign classification on the Maldeb dataset. Our findings suggest that the proposed system accurately classifies malware without the need for labeled data, with higher accuracy than other self-supervised methods. This research contributes to advancing the state of the art in malware classification and underscores the potential of self-supervised learning as a viable way to address the dynamic landscape of malware threats.
Topic: Computational and artificial intelligence, Computers and information processing
Published in: IEEE Access ( Volume: 12)
Page(s): 58823 - 58835
Date of Publication: 22 April 2024
Electronic ISSN: 2169-3536
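
The abstract names three ingredients: an image representation of malware, data augmentation, and contrastive learning. The following is a minimal sketch of how those pieces can fit together, not the authors' implementation. The abstract does not state the image width, the augmentations, or the loss function, so the fixed width of 64, the flip-and-shift augmentation, and the NT-Xent loss (the loss used by SimCLR) are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
import numpy as np


def bytes_to_grayscale(data: bytes, width: int = 64) -> np.ndarray:
    """Interpret raw bytes as 8-bit pixels and reshape them into a 2-D image.

    The width of 64 is an assumption; the last row is zero-padded.
    """
    pixels = np.frombuffer(data, dtype=np.uint8)
    height = -(-len(pixels) // width)  # ceiling division
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[: len(pixels)] = pixels
    return padded.reshape(height, width)


def two_views(image: np.ndarray, rng: np.random.Generator):
    """Return two randomly augmented copies of one image (a positive pair).

    Horizontal flip and a small horizontal shift are illustrative choices.
    """
    def augment(img: np.ndarray) -> np.ndarray:
        out = img[:, ::-1] if rng.random() < 0.5 else img
        shift = int(rng.integers(-4, 5))
        return np.roll(out, shift, axis=1)

    return augment(image), augment(image)


def nt_xent_loss(z1: np.ndarray, z2: np.ndarray, temperature: float = 0.5) -> float:
    """NT-Xent contrastive loss over a batch of paired embeddings.

    z1[i] and z2[i] are embeddings of two views of the same sample;
    every other embedding in the batch acts as a negative.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own negative
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), positives].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_binary = bytes(rng.integers(0, 256, size=1000, dtype=np.uint8))
    image = bytes_to_grayscale(fake_binary)
    view_a, view_b = two_views(image, rng)
    # Stand-in embeddings; in practice an encoder network would produce them.
    z = rng.normal(size=(8, 16))
    print(image.shape, view_a.shape, nt_xent_loss(z, z))
```

In a full pipeline, an encoder network would map each augmented view to an embedding and be trained to minimize this loss on unlabeled images, which is what lets the approach avoid labels during representation learning.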
