PlausMal-GAN: Plausible Malware Training Based on Generative Adversarial Networks for Analogous Zero-Day Malware Detection

Zero-day malicious software (malware) refers to a previously unknown or newly discovered software vulnerability. The fundamental objective of this paper is to enhance the detection of analogous zero-day malware by learning efficiently from plausible generated data. To detect zero-day malware, we propose a malware training framework based on analogous malware data generated using generative adversarial networks (PlausMal-GAN). The PlausMal-GAN can produce analogous zero-day malware images with high quality and high diversity from existing malware data. The discriminator, as a detector, learns various malware features using both real and generated malware images. In terms of performance, the proposed framework showed higher and more stable performance on the generated analogous malware images, which can be assumed to be analogous zero-day malware data. We obtained reliable accuracy with the proposed PlausMal-GAN framework using representative GAN models (i.e., deep convolutional GAN, least-squares GAN, Wasserstein GAN with gradient penalty, and evolutionary GAN). These results indicate that the proposed framework is beneficial for detecting and predicting numerous analogous zero-day malware derived from known malware when developing and updating malware detection systems.


I. INTRODUCTION
Malware can be defined as malicious software that is designed to cause outages, denial of activity, collection of personal data without user consent, unauthorized access to system resources, and similar inappropriate behaviors. With the rapid development of information technology, the exponential increase in malware has become one of the main threats to computer security [1]-[3]. Malware detection has become more difficult as the number and variety of applications increase [4]-[6]: more than 143 thousand new malicious programs targeting mobile devices were detected during 2013 [5], and Kaspersky Lab's research shows that nearly 30% of all computers were threatened at least once during 2018 [7].
Zero-day malware is an unknown or unaddressed software vulnerability that hackers use to do malicious things, such as destroying programs, stealing data, or paralyzing networks [8]. A range of antivirus systems and other strategies are used to help protect against the introduction of malware, which helps in detection if such malware is already present. Antivirus systems typically fail to detect zero-day malware because they rely on signatures to identify malware. Computers are more vulnerable to zero-day malware than to general malware because traditional antivirus systems typically cannot detect zero-day malware. Zero-day malware is an important threat to computer security, and zero-day malware detection is a top priority for malware detection systems.
To detect zero-day malware, we propose a deep learning method of generating arbitrarily modified malware features using the malware's raw code without running it. Malware code based on specific rules and actions generates certain patterns. Examples of the malware sample used in this study are shown in Figures 1 and 11 [9], [10].
When dealing with classification tasks using neural networks, data augmentation techniques have been used to compensate for class imbalance or insufficient data. In malware detection research, several papers have also used simple data augmentation techniques (e.g., sliding window, transformation) to deal with these issues [11], [12].
In this study, we take a different direction: a malware training technique that generates zero-day malware data, rather than one that addresses imbalance or data insufficiency. We propose a plausible malware training framework based on generative adversarial networks (PlausMal-GAN) that is capable of detecting analogous zero-day malware and can handle newly emerging plausible malware. Our main contribution is this malware training framework based on generative adversarial networks (GAN) with generated analogous malware samples. In the first phase, the proposed framework trains a generator and a discriminator on real malware data and generated malware data. In the second phase, the generator is fixed and the discriminator is re-trained on real malware data and on malware data generated by the fixed generator. In principle, the proposed framework can be applied to any kind of GAN model, so we evaluated its performance with recent, representative GAN models. Moreover, we obtained stable performance on abundant analogous zero-day malware test data under conditions with relatively little training data.

II. BACKGROUND

A. MALWARE DETECTION
Owing to the increasing damage caused by malware and zero-day malware, research on malware detection methods has been continuously advancing. We discuss two aspects: general malware detection and zero-day malware detection.
Several reported studies have dealt with malware detection [10], [13]-[17]. Nataraj et al. presented a visualization approach that differs from traditional approaches to malware detection [10]: they transformed the malware's binary information into grayscale malware images. Ye et al. and Ndibanje et al. used the Windows Audit Log and API calls for malware detection [18], [19]. Traditional machine learning algorithms such as hidden Markov models, support vector machines (SVMs), and random forests were also used for malware detection [20]-[23]. Singh et al. proposed a Big Data analysis framework based on random forests for malware detection [24]. Chen et al. attempted to detect malware by analyzing mobile network traffic with machine-learning methods [25]. Recently, many methods have used deep learning and generative adversarial networks (GAN) because the available computing power has increased [11], [12], [26]-[31]. Pascanu et al. used recurrent neural networks to model time-series information in malware classification [26], [32]. Ye et al. presented a heterogeneous deep-learning framework composed of an autoencoder stacked with a layer of associative memory and multilayer restricted Boltzmann machines [27]. Kabanga et al. used converted malware images as input to convolutional neural networks (CNNs) [28]. Yan et al. used a CNN and long short-term memory networks to learn from grayscale images and opcode sequences, respectively, and used a stacking ensemble for malware classification [11]. The aforementioned methods have the disadvantage that they detect only certain variants of malware. Malware developers use obfuscation techniques, such as null byte injection, code exchange, and subroutine reordering, to create new variants with signatures different from those of existing malware. Because the aforementioned methods learn only from malware that has been discovered so far, unlearned malware will not be detected.
To detect attacks that bypass deep-learning methods [33], Wang et al. proposed a resistant method that is robust to adversarial malware samples by nullifying arbitrary features [33]. However, in this way, malware characteristics are randomly removed, which risks removing not only unnecessary features but also important ones. There are now hybrid methods that combine static and dynamic methods [22], [34]. While these methods can be effective for malware detection, they have the disadvantage of being time-consuming and highly complex.
Recently, several methods have been developed for zero-day malware detection [13], [14], [35], [36]. Venkatraman and Alazab used a similarity matrix of malware for visualization in order to detect zero-day malware [14]. This method makes it possible to observe visually that different malware families exhibit significantly different behavior patterns. Gupta and Rani proposed a Big Data framework to address the Big Data problem caused by the increase in malware [35]. They attempted to detect zero-day malware using Big Data analysis techniques and machine-learning algorithms, modeling a series of opcodes. Owing to the increasing threat of malware in cyber-physical systems, Huda et al. proposed a detection method that uses techniques such as SVM and K-means to detect unknown malware by extracting knowledge and essential structures from unlabeled, cheap, readily available data [36]. In the aforementioned zero-day malware detection methods, certain rules are fixed, and zero-day malware that does not follow these rules cannot be detected. Recently, Kim et al. proposed the transferred deep-convolutional generative adversarial network (tDCGAN), which generates fake malware and learns to distinguish it from real malware [13]. This method not only improved malware detection performance but also showed promise in a zero-day attack experiment. However, because the method did not consider high diversity (e.g., plausible diversity) or quality in the generated zero-day malware, and did not measure them numerically (e.g., with the Fréchet inception distance), it is difficult to regard it as focused on zero-day malware detection. In contrast, we implemented an analogous zero-day malware classifier with GAN models that create new high-diversity, high-quality malware images for plausible malware augmentation. The generated data is used to build a robust detector for zero-day malware detection.

B. DATA AUGMENTATION
Data augmentation encompasses a suite of techniques that enhance the size and quality of training datasets so that better deep learning models can be built with them [37], [38]. Simple data augmentations based on basic image manipulations include flipping, cropping, rotation, and translation [37], [38]. More recently, GAN-based approaches create artificial instances from a dataset such that they retain characteristics similar to the original set [39], [40]. In malware detection, several papers have applied data augmentation to address imbalance or data insufficiency [12], [41].
To the best of our knowledge, no studies to date have focused on the high diversity and quality of plausible malware in terms of analogous malware augmentation, which is an important factor to investigate for the various transformations or analogous data augmentation used in a zero-day malware detection system. In this study, we propose a plausible malware training framework based on GAN that considers high diversity in generating analogous zero-day malware data. Moreover, the proposed method showed stable performance even with relatively little training data. We applied several recent GAN models (i.e., deep convolutional GAN (DCGAN) [42], least-squares GAN (LSGAN) [43], Wasserstein GAN with gradient penalty (WGAN-GP) [44], and evolutionary GAN (E-GAN) [40]) to our design, which suggests that the framework can be adapted reliably to state-of-the-art GAN models.

C. GENERATIVE ADVERSARIAL NETWORKS
GAN [39] is a deep-learning model designed to generate data similar to the given training data. Unlike the original GAN, which uses only one objective function (e.g., minimax), the E-GAN proposed by Wang et al. [40] uses several objective functions (i.e., minimax, heuristic, and least-squares). Generators using each objective function are evaluated by a discriminator, and the best-performing generator is chosen to evolve to the next stage. In the process of evolution, the evolved generator is expected to gradually adapt to the discriminator, which means that it can provide high-quality, high-diversity samples and learn the real data distribution. The evolutionary process consists of three stages (i.e., variation, evaluation, and selection). First, given an individual $G_\theta$ in the population, the variation stage uses variation operators to produce its offspring $\{G_{\theta_1}, G_{\theta_2}, \ldots\}$. In particular, several copies of each individual (parent) are created, each of which is modified by a different mutation. Each modified copy is then regarded as one child. Second, in the evaluation stage, the quality of each child is evaluated by a fitness function $F$ that depends on the current environment (i.e., the discriminator $D$). Third, in the selection stage, all children are ranked according to their fitness values and the worst ones are removed. The rest remain alive (i.e., free to act as parents) and evolve to the next iteration.
Whereas the generator uses multiple objective functions, the discriminator keeps the objective function of the original GAN. The discriminator $D$ is trained to distinguish between the real data sample $x \sim p_{data}(x)$ and the generated data sample $\tilde{x} \sim p_{gen}(x)$:

$$L_D = -\mathbb{E}_{x \sim p_{data}}[\log D(x)] - \mathbb{E}_{\tilde{x} \sim p_{gen}}[\log(1 - D(\tilde{x}))]. \tag{1}$$
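As a concrete illustration of the discriminator objective in Eq. (1), the following minimal sketch estimates the loss from batches of discriminator outputs. The function name `discriminator_loss`, the `eps` stabilizer, and the sample values are assumptions made for illustration; they are not taken from the paper's implementation.

```python
import math

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Monte Carlo estimate of Eq. (1).

    d_real: discriminator outputs D(x) for real samples, each in (0, 1)
    d_fake: discriminator outputs D(x~) for generated samples, each in (0, 1)
    eps:    small constant (an assumption) that guards against log(0)
    """
    real_term = -sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Illustrative values only: a discriminator that separates real from fake
# well incurs a lower loss than one that outputs 0.5 everywhere.
confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
print(confident < undecided)
```

An undecided discriminator (all outputs 0.5) yields a loss of about 2 log 2, the familiar equilibrium value of the original GAN objective.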

III. METHODS
In this section, we describe a plausible malware training framework based on generative adversarial networks (GAN) that generates analogous malware with a malware classifier and trains the discriminator as a malware detector. Figure 2 shows the architecture of the proposed framework.

A. PLAUSMAL-GAN FRAMEWORK
To generate analogous malware samples for each kind of malware, in the first step the proposed framework trains a generator and a discriminator, based on a GAN with a malware classifier, using real malware data and generated malware data. The discriminator not only discriminates real from fake but also learns to classify malware classes. In the second step, the generator is fixed and the discriminator is re-trained on real malware data and on malware data generated by the fixed generator. Figure 3 shows an overview of the process of the proposed framework.

The auxiliary classifier GAN (AC-GAN) [45] introduced a structure that produces data that match class labels as well as being close to real data. For the malware classifier, the architecture of the proposed framework follows the AC-GAN structure (Figure 2). Our malware generator generates fake malware samples $\tilde{x}$ from a noise sample $z$ for a malware class $c$, and the discriminator distinguishes not only between real $x \sim p_{data}(x)$ and fake $\tilde{x} \sim p_{gen}(x)$ but also the class $c$. The difference between our method and the existing AC-GAN is that the discriminator does not learn the class information of the generated malware samples, only the class information of the real malware samples. The discriminator training loss is defined accordingly.

We considered the standard GAN approach (minimax), the least-squares approach, the heuristic approach, and the combination of the preceding three approaches for the DCGAN, LSGAN, WGAN-GP, and E-GAN models in the proposed framework, respectively. In E-GAN, an evolutionary step consists of three sub-steps: variation, evaluation, and selection. In the variation step, we adopt as mutations three interpretable and complementary objectives proposed by Wang et al. [40]: the minimax mutation, the heuristic mutation, and the least-squares mutation. Figure 4 shows the differences among these three objective functions.
In addition, we added a classification loss to the existing mutation functions, because the generated data must not only be close to real data but also correspond to the given class.

The minimax mutation is similar to the minimax objective function of the original GAN, which aims to minimize the log probability that the discriminator is correct. In the original GAN, gradient vanishing can occur when the discriminator output for generated data is close to zero (i.e., $D(\tilde{x}) \to 0$). In other words, if the discriminator is confident that the generated malware data is fake, the generator may not train well. However, we were able to alleviate this problem to some extent by adding a classification loss. Although its gradients are gentle early in training, once the generated malware distribution is somewhat similar to the real malware distribution, the minimax mutation provides a steep gradient, which later allows stable learning:

$$M_G^{minimax} = \mathbb{E}_{\tilde{x} \sim p_{gen}}[\log(1 - D(\tilde{x})) - \log p(c \mid \tilde{x})].$$

Whereas the minimax mutation minimizes the log probability that the discriminator is correct, the heuristic mutation maximizes the log probability that the discriminator is mistaken. With this mutation, the gradient is steep even when the discriminator is convinced that the generated malware data is fake. Thus, unlike the minimax mutation, the heuristic mutation can avoid a vanishing gradient, which suggests the possibility of better learning in the early stages than with the minimax mutation.

Lastly, the least-squares mutation is similar to the least-squares objective function of the LSGAN, which aims to deceive the discriminator by penalizing the generator. With this mutation, we get a gentle slope overall and can avoid a vanishing gradient, as with the heuristic mutation. Besides, compared to the heuristic mutation, the least-squares mutation does not assign a very high cost to generating fake malware samples, nor does it assign a very low cost to mode dropping, which partially avoids mode collapse [43]:

$$M_G^{least\text{-}squares} = \mathbb{E}_{\tilde{x} \sim p_{gen}}[(D(\tilde{x}) - 1)^2 - \log p(c \mid \tilde{x})].$$

Require: batch size $m = 32$; discriminator's updating steps per iteration $n_D = 1$; number of parents $\mu = 1$; number of mutations $n_m = 3$; Adam hyper-parameters $\alpha = 0.0002$, $\beta_1 = 0.5$, $\beta_2 = 0.99$; the hyperparameter $\gamma$ of the evaluation function.
Require: initial discriminator's parameters $w_0$; initial generator's parameters $\{\theta_0^1, \theta_0^2, \ldots, \theta_0^\mu\}$.
for number of training iterations do
  for $k = 0, \ldots, n_D$ do
    Sample a batch of $\{x^{(i)}\}_{i=1}^{m} \sim p_{data}$ (training data), and a batch of $\{(c, z)^{(i)}\}_{i=1}^{m} \sim p_{c,z}$ (noise sample $z$ by class $c$).

In the evaluation step, 1) the quality and 2) the diversity of the generated malware samples are measured and evaluated. To detect zero-day malware, it is important to generate high-diversity, high-quality malware samples, so we adopted the evaluation step of the E-GAN architecture.
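The three mutation objectives described above can be compared numerically with a short sketch. The minimax and least-squares forms follow the equations in the text; the heuristic form used here, the non-saturating term -log D(x~) plus the same classification term, is an assumption made by analogy with the other two. All function names and sample values are hypothetical.

```python
import math

def minimax_mutation(d_fake, p_class):
    """E[log(1 - D(x~)) - log p(c | x~)], the minimax mutation."""
    terms = [math.log(1.0 - d) - math.log(p) for d, p in zip(d_fake, p_class)]
    return sum(terms) / len(terms)

def heuristic_mutation(d_fake, p_class):
    """Assumed form: E[-log D(x~) - log p(c | x~)]."""
    terms = [-math.log(d) - math.log(p) for d, p in zip(d_fake, p_class)]
    return sum(terms) / len(terms)

def least_squares_mutation(d_fake, p_class):
    """E[(D(x~) - 1)^2 - log p(c | x~)], the least-squares mutation."""
    terms = [(d - 1.0) ** 2 - math.log(p) for d, p in zip(d_fake, p_class)]
    return sum(terms) / len(terms)

def slope_wrt_d(mutation, d, p=0.9, h=1e-6):
    """Central finite-difference slope of a mutation with respect to D(x~)."""
    return (mutation([d + h], [p]) - mutation([d - h], [p])) / (2 * h)

# Slopes when the discriminator is confident the sample is fake (D(x~) = 0.01).
for name, fn in [("minimax", minimax_mutation),
                 ("heuristic", heuristic_mutation),
                 ("least-squares", least_squares_mutation)]:
    print(name, slope_wrt_d(fn, 0.01))
```

Under these assumptions, the heuristic slope is far steeper than the minimax slope near D(x~) = 0, while the least-squares slope stays moderate, which matches the qualitative behavior described for the three mutations.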
First, the quality fitness score is used as a measure of quality. The generated malware image, produced from the noise sample for a given class, is fed into the discriminator $D$, and the output value is used. We multiply the output of $D$ by the probability of that class to measure the image quality score for each class, and we use the average value. The closer the value is to 1, the closer the malware data is to real data, that is, the higher its quality:

$$F_q = \mathbb{E}_{\tilde{x} \sim p_{gen}}[D(\tilde{x}) \times \log p(c \mid \tilde{x})].$$

Second, the diversity fitness score is used as a measure of malware diversity. This method uses the negative log-gradient-norm of the discriminator. When the generator generates data that greatly changes the gradient of the discriminator, the discriminator is likely to determine that the generated malware data is fake. In contrast, when the generator generates data that does not change the discriminator gradient significantly, the generated malware data is not labeled as fake and tends to achieve high diversity.

Using the two fitness scores above, the criterion for the E-GAN evaluation combines them with a weight $\gamma > 0$ that balances the quality and diversity measurements.
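For reference, in the E-GAN formulation of Wang et al. [40], on which this evaluation step is based, the diversity fitness score and the combined criterion take the form below. This is a sketch that assumes the framework reuses them unchanged; the discriminator objective inside the gradient norm may differ here because of the added classification term.

```latex
F_d = -\log \left\lVert \nabla_D \left( -\mathbb{E}_{x \sim p_{data}}[\log D(x)]
      - \mathbb{E}_{\tilde{x} \sim p_{gen}}[\log(1 - D(\tilde{x}))] \right) \right\rVert,
\qquad
F = F_q + \gamma F_d .
```

A small gradient norm gives a large $F_d$, so offspring whose samples do not provoke large discriminator updates are rewarded for diversity.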
In the selection step, the offspring with the highest fitness score is selected and proceeds to the next variation step. Throughout the evolution process, the generator will gradually generate data for each class as well as generating data similar to real data. We use the converged generator for malware detection in the next step.
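The selection step can be sketched as follows, assuming the combined fitness is F = F_q + γ F_d as in E-GAN [40]. The function name `select_offspring`, the score values, and the choices of `gamma` are hypothetical and serve only to show how the weight shifts the selection.

```python
def select_offspring(candidates, gamma, survivors=1):
    """Rank offspring by combined fitness and keep the best ones.

    candidates: list of (name, quality_score, diversity_score) tuples
    gamma:      weight balancing quality against diversity (must be > 0
                in the framework; 0 is used below only as a contrast)
    survivors:  number of offspring passed on to the next variation step
    """
    scored = [(fq + gamma * fd, name) for name, fq, fd in candidates]
    scored.sort(reverse=True)  # highest combined fitness first
    return [name for _, name in scored[:survivors]]

# Illustrative scores only (not measured values).
children = [
    ("minimax", 0.40, -2.0),
    ("heuristic", 0.55, -1.5),
    ("least-squares", 0.50, -0.5),
]
best_with_diversity = select_offspring(children, gamma=0.1)
best_quality_only = select_offspring(children, gamma=0.0)
print(best_with_diversity, best_quality_only)
```

With these toy numbers, weighting diversity changes which offspring survives, which is the purpose of the balance parameter.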

B. MALWARE DETECTION
For analogous zero-day malware augmentation, the malware generator generates high-quality and high-diversity images. We use the discriminator's classifier as a malware detector. The discriminator is trained anew as a malware detector, without adversarial training against the generator, using both generated and real malware images. The objective function of the discriminator is redefined as

$$L_D = -\mathbb{E}_{x \sim p_{data}}[\log p(c \mid x)] - \mathbb{E}_{\tilde{x} \sim p_{gen}}[\log p(c \mid \tilde{x})]. \tag{9}$$

When training the discriminator, the generator is not trained and only generates malware images. Figure 3 shows the training of the discriminator as a malware detector with data augmentation.
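The detector objective of Eq. (9) is a sum of two classification (cross-entropy) terms, one over real images and one over images produced by the fixed generator. The following minimal sketch evaluates it from predicted class probabilities; the function name `detector_loss`, the `eps` stabilizer, and the sample values are assumptions for illustration.

```python
import math

def detector_loss(p_real, p_generated, eps=1e-12):
    """Monte Carlo estimate of Eq. (9).

    p_real:      p(c | x) assigned to the true class of each real image
    p_generated: p(c | x~) assigned to the conditioning class of each
                 generated image (the generator is fixed in this phase)
    eps:         small constant (an assumption) that guards against log(0)
    """
    real_term = -sum(math.log(p + eps) for p in p_real) / len(p_real)
    gen_term = -sum(math.log(p + eps) for p in p_generated) / len(p_generated)
    return real_term + gen_term

# Illustrative values only: better class predictions on both real and
# generated images lower the loss.
good = detector_loss([0.95, 0.90], [0.85, 0.80])
poor = detector_loss([0.40, 0.30], [0.35, 0.25])
print(good < poor)
```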

IV. EXPERIMENTS AND RESULTS
This section describes the experiments and results for evaluating the proposed framework.

A. DATASETS

1) MICROSOFT MALWARE CLASSIFICATION CHALLENGE DATASET
To verify the data generation and detection performance of the proposed framework, we used malware data from the Microsoft dataset [9]. Each malware file was a byte file, and we used the binary code written in it. The dataset contains 10,868 malware samples, divided into 9,781 training samples and 1,087 test samples (9:1 train-test ratio). Appendix B, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TETC.2022.3170544, shows the malware types used and the number of samples for each type [9].
    $\log p(c^{(i)} \mid G_\theta((c, z)^{(i)}))$
    $w \leftarrow \mathrm{Adam}(g_w, w, \alpha, \beta_1, \beta_2)$
  end for
end for

As Nataraj et al. did [10], we convert the malware binary code into an image, called a malware image. If $k$ is the length of the binary code, $C$ is the number of columns, and $R$ is the number of rows of the converted image, the sizes of the converted columns and rows are calculated from $k$. The malware images were so large that they were reduced to 128 × 128 using Pillow, a Python imaging library. We then used the jet colormap to represent them as RGB color images.
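The byte-to-image conversion described above can be sketched in plain Python. The rule used here to pick the number of columns (the largest power of two not exceeding the square root of k) is a hypothetical placeholder, not the paper's formula, and the zero padding of the last row is likewise an assumption. Resizing to 128 × 128 and applying a colormap would follow as separate steps with an imaging library.

```python
import math

def image_shape(k, width=None):
    """Return (rows, cols) for a binary of k bytes.

    The default width rule is a hypothetical stand-in: the largest power
    of two that does not exceed sqrt(k).
    """
    if width is None:
        width = 2 ** int(math.log2(math.sqrt(k))) if k > 1 else 1
    rows = math.ceil(k / width)
    return rows, width

def bytes_to_grid(data, width=None, pad=0):
    """Lay out raw bytes row by row as a 2-D grid of grayscale values (0-255)."""
    rows, cols = image_shape(len(data), width)
    padded = list(data) + [pad] * (rows * cols - len(data))
    return [padded[r * cols:(r + 1) * cols] for r in range(rows)]

# Ten bytes laid out four per row; the last row is zero-padded.
grid = bytes_to_grid(bytes(range(10)), width=4)
print(grid)
```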

2) MALIMG DATASET
In Supplementary Materials Appendix C, available in the online supplemental material, we show the frequency distribution of malware families and their variants in the Malimg dataset [10]. We found malware classes that share a family name (i.e., Worm: Allaple.A and Allaple.L, PWS: C2Lop.gen!G and C2Lop.P, Trojan: Lolyda.AA1 and Lolyda.AA2, TDownloader: Swizzor.gen!I and Swizzor.gen!E). As shown in Table 1 and Figure 11, these eight malware classes form four pairs, each consisting of two different classes with similar family names and similar properties. For the second zero-day malware experiment, we evaluated these classes from the Malimg dataset, which together comprise 5,543 malware samples from 8 different malware families.

B. EXPERIMENTAL DETAILS
The experiment is divided into two parts: an existing malware classification experiment and analogous zero-day malware attack experiments. In the existing malware classification experiment, we compared the proposed framework using representative GANs (i.e., DCGAN, LSGAN, WGAN-GP, and E-GAN) with the experimental results of previous methods [13]. In the proposed framework, we used the same network structure for all models (Supplementary Table S2, available online). In the first analogous zero-day malware attack experiment, we also compared our framework with the four GAN models against the results of previous methods (i.e., random forest, decision tree, nearest neighbors, Naive Bayes, multi-layer perceptron (MLP) [46], CNN [47], GAN [39], and tDCGAN [13]). In the second zero-day malware experiment, we compared phase 1 alone with phases 1&2 of the proposed framework using the four representative GAN models.
The operating system of the computer used in the experiments was Ubuntu 16.04.2 LTS, and the central processing unit was an Intel Xeon Gold 6148. The random-access memory was Samsung DDR4 16 GB × 4, and the graphics processing unit was a TITAN XP. We implemented the proposed framework with the PyTorch library. The network architectures of the generator and discriminator are shown in Supplementary Table S2, available online.

C. ANALYSIS OF GENERATED MALWARE DATA
Figure 5 shows examples of the malware images generated using the Microsoft dataset [9]. Qualitatively, the generated malware images are similar to the real malware images, which indicates that the proposed framework can also generate modified malware or analogous zero-day malware.
We chose the Fréchet inception distance (FID) [48] as a quantitative metric for evaluating generator convergence. The FID uses a pre-trained Inception v3 network to extract features of the generated images and the real images, and then models the distribution of the extracted features as a multivariate Gaussian with mean $\mu$ and covariance $\Sigma$. The FID between the real images $x$ and the generated images $g$ is computed as

$$\mathrm{FID}(x, g) = \lVert \mu_x - \mu_g \rVert_2^2 + \mathrm{Tr}\left(\Sigma_x + \Sigma_g - 2(\Sigma_x \Sigma_g)^{1/2}\right),$$

where Tr is the sum of all the diagonal elements. A lower FID implies that the distributions of the real images and the generated images are closer. It also means that the generated images have high quality and high diversity. As shown in Table 2, our proposed framework has the lowest FID score, which means that its generator produced high-quality and high-diversity malware samples. Although a generated sample with a low FID is not actually new malware, it is likely to resemble a variant of existing malware. This allows us to expect effective data augmentation with the generated data.
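To make the FID computation concrete, the sketch below evaluates the Fréchet distance between two Gaussians under the simplifying assumption of diagonal covariances, in which case the trace term reduces to a sum over per-dimension variances. The full metric uses complete covariance matrices of Inception features and a matrix square root; the function name `fid_diagonal` and the input values are hypothetical.

```python
import math

def fid_diagonal(mu_x, var_x, mu_g, var_g):
    """Frechet distance between two Gaussians with diagonal covariances.

    mu_x, mu_g:   per-dimension feature means (real, generated)
    var_x, var_g: per-dimension feature variances (real, generated)
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_x, mu_g))
    # For diagonal covariances, Tr(Sx + Sg - 2 (Sx Sg)^(1/2)) becomes
    # a sum of vx + vg - 2 * sqrt(vx * vg) over the dimensions.
    cov_term = sum(vx + vg - 2.0 * math.sqrt(vx * vg)
                   for vx, vg in zip(var_x, var_g))
    return mean_term + cov_term

# Identical statistics give zero; a shift in the mean or a change in
# spread increases the distance.
same = fid_diagonal([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
shifted = fid_diagonal([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0])
wider = fid_diagonal([0.0], [1.0], [0.0], [4.0])
print(same, shifted, wider)
```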

D. MALWARE CLASSIFICATION
To derive a more accurate estimate of model prediction performance, we used 10-fold cross-validation for all methods in the existing malware classification experiment using the Microsoft dataset [9]. The average classification accuracy achieved by the proposed framework was 95.56%, which is higher than that of the previous methods. Table 3 shows the numerical classification results with four different models (i.e., DCGAN, LSGAN, WGAN-GP, and E-GAN). Because performance was highest with the E-GAN model, only the proposed framework with this model was used for some further analyses (i.e., Table 4, Figures 7 and 9).
To verify the performance of the proposed malware classifier model, we show a confusion matrix in Figure 7. We calculated the precision, recall, and F1-score for each malware type and summarized them in Table 4. We also compared the classification accuracies of the proposed framework with the four different GAN models over the training iterations in Figure 6. The E-GAN model showed higher classification performance than the other representative models.
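Per-class precision, recall, and F1-score follow directly from a confusion matrix. The sketch below assumes that rows index the true class and columns the predicted class; the orientation of the paper's confusion matrix is not stated here, and the toy matrix is illustrative rather than taken from the reported results.

```python
def per_class_metrics(confusion):
    """Compute (precision, recall, F1) for each class.

    confusion[i][j]: number of samples of true class i predicted as class j
    (row = true class, column = predicted class; an assumed convention).
    """
    n = len(confusion)
    metrics = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, but wrong
        fn = sum(confusion[c]) - tp                       # true c, but missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics

# Toy two-class example (illustrative counts only).
toy = [[8, 2],
       [1, 9]]
results = per_class_metrics(toy)
print(results)
```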

E. ZERO-DAY MALWARE

1) ZERO-DAY MALWARE EXPERIMENT I USING GENERATED ANALOGOUS ZERO-DAY MALWARE
We modeled plausible zero-day malware for the analogous zero-day malware attack experiments using the Microsoft dataset (Figure 8) [9]. The previous study assumed that zero-day attacks can be modeled by introducing noise into existing malware data [13]. The noise was generated with the structural similarity (SSIM) method, which uses the structural similarity of images [49]. We likewise used the SSIM method for systematic noise generation. Calculating the SSIM value for a pair of images $x$, $y$ includes calculating $\mu_x$, $\mu_y$ as the means of the pixels of the images $x$, $y$.
Table 5 shows the results of the analogous zero-day malware attack experiment, which was divided into an experiment with an 8:2 combined ratio and one with a 7:3 combined ratio. We used 10-fold cross-validation (i.e., a train-test ratio of 9:1). In the 8:2 combined ratio experiments, the models of the proposed framework were more accurate than recent previous methods [13], and we obtained stable accuracy with all tested GAN models under all SSIM conditions. Moreover, in the 7:3 combined ratio experiments, we obtained reliably high average accuracies of 98.62%, 98.37%, 98.51%, and 99.49% for the proposed framework with the DCGAN, LSGAN, WGAN-GP, and E-GAN models, respectively. In particular, decreasing the SSIM value or increasing the combined noise ratio produces samples that differ more from existing malware and can therefore be regarded as analogous zero-day attacks, yet the proposed framework showed stable performance for all SSIM values and combined ratios. As a result, the proposed framework achieved high and stable performance even for large variations of existing malware (e.g., combined ratio 7:3 or SSIM value 0.6) in an analogous zero-day malware attack. Moreover, for a more thorough performance evaluation, we conducted an experiment under a limited-training-data condition by changing the train-test ratio from 9:1 to 5:5, that is, from 10-fold to 2-fold cross-validation. This experiment evaluated a wider variety of zero-day malware data by increasing the number of existing test data (the average number of zero-day malware samples rose from 506 (122 to 1,122) to 14,850 (4,262 to 33,710)). As shown in Table 6 and Figure 9, the proposed framework maintained stable test performance (> 99%) even though the training data were reduced to half and the analogous zero-day malware test data increased.
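The SSIM value mentioned above combines the means, variances, and covariance of two images. The sketch below computes a single global SSIM value over flattened grayscale pixels using the standard constants of the SSIM definition [49] (K1 = 0.01, K2 = 0.03). Computing one global statistic instead of averaging over local windows, and the function name `global_ssim`, are simplifying assumptions; the pixel values are illustrative.

```python
def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized, flattened images."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# An image compared with itself scores 1; a perturbed copy scores lower.
original = [0, 50, 100, 150, 200, 250]
perturbed = [30, 20, 130, 120, 230, 220]
ssim_same = global_ssim(original, original)
ssim_noisy = global_ssim(original, perturbed)
print(ssim_same, ssim_noisy)
```

Lower SSIM values correspond to stronger perturbation of the original image, which is how the SSIM conditions in the experiment grade the distance from existing malware.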

2) ZERO-DAY MALWARE EXPERIMENT II USING MALWARE DATA WITH SIMILAR FAMILY NAMES
We conducted zero-day malware attack experiment II with malware data from different classes that share a family name and have similar properties, taken from the Malimg dataset [10]. We found data in the Malimg dataset that are very suitable for zero-day malware experiments (Table 1 and Figure 11). We trained and tested four classes using two different family name data with similar properties ( (3,494). Session B poses the challenging problem of learning from a small amount of training data. This is a major issue not only in machine learning in general but also in developing malware detection technology, especially for zero-day malware. Even if a sample is derived from the same malware family, it is zero-day malware that has not been learned before, and it can cause severe performance degradation in the initial period because very limited data are available for learning. To verify that the proposed framework can handle zero-day malware and limited-data problems, we designed a second zero-day experiment using similar malware families from the Malimg dataset. The experiment comprises training sessions A and B, and in each we evaluated the proposed framework with phase 1 only and with phases 1&2. The proposed framework deals with analogous new data through phase 1, which trains the generator and discriminator, and phase 2, which trains the discriminator on the analogous zero-day malware data. Table 7 and Figure 10 show that the models trained through phase 2 performed better than those trained with phase 1 only in all sessions (A and B). Particularly interesting results were obtained in session B, where training was performed with a small amount of training data. In session B, training with only phase 1 of the proposed framework performed very poorly for all tested GAN models. This experiment demonstrates that existing GAN approaches (i.e., phase 1 of the proposed framework) may not respond properly to new data.
On the other hand, the final model trained through phase 2 of the proposed framework showed very stable and high average accuracy (> 98.65%) (Table 7 and Figure 10). Consequently, the proposed framework can learn very effectively when little data is available, showing excellent performance on the zero-day malware detection problem. In practice, zero-day malware is known to be often derived from variations of existing malware [8], [13]. To explore the limits of the proposed framework's performance, we evaluated it on restricted data, even though we used two different datasets [9], [10]. The first zero-day experiment assumes a plausible zero-day malware attack produced by transforming existing malware, rather than using actual zero-day malware attack data. Additionally, we designed a second zero-day experiment using similar malware families from different malware types. Although we obtained outstanding results in these zero-day experiments, a richer malware database might have allowed more meaningful interpretation and discussion.
The GAN-based image-processing approach has a one-way limitation in the malware detection field: malware code can be converted to an image, but not the reverse [8], [13], [29]. However, conversion back to malware code is not required to achieve the objectives of this study. The proposed framework aims to detect the myriad of similar malware that can be made with slight changes. Even though the proposed framework cannot reproduce malware code, it can detect and classify analogous malware that is highly similar to the learned malware samples. In addition, if a new type of zero-day malware that was not used for training appears, the proposed method has the advantage of being able to learn the new type quickly and apply it. Therefore, in terms of practicality and convenience, it is a very helpful framework for developing zero-day malware detection software.
Meanwhile, as is known from adversarial attacks, the performance of many machine-learning-based systems is greatly reduced or neutralized by small distortions (e.g., added noise) [50], [51]. Malware detection is no different, and some attackers will exploit this vulnerability. Therefore, it is necessary to build a security system that is robust and stable against such easy modifications. The proposed framework generates plausible new malware from existing malware and learns from it, so it can serve as a complementary measure for dealing with these challenges.

V. CONCLUSION
In the present study, we proposed a framework based on plausible malware training and augmentation using a generative adversarial network to address the problems caused by malware and analogous zero-day malware. In particular, because zero-day malware is often created by modifying existing malware, the proposed framework with representative GAN models performed augmentation with high-quality, high-diversity evolved malware images. For detection and classification, the discriminator was trained using malware images generated by the generator and was robust to zero-day malware. Moreover, the proposed framework achieved high and stable average accuracy in the analogous zero-day malware attack experiments. We believe that the proposed plausible zero-day malware detection approach has important advantages for antivirus systems in computer security because it does not require inefficient malware signature analysis. In this study, the malware code was converted to malware images of fixed size through crop and pad operations for efficient learning. These processes could, in fact, reduce the signatures of malware. In future studies, we will expand the malware types with various malware datasets (including zero-day malware) and address the problem of varying malware lengths. Moreover, further research should be conducted to develop an optimized GAN model for our proposed framework for extensive zero-day malware detection. It will also be interesting to use explainable AI techniques (e.g., [52]) to gain a further understanding of zero-day malware features, thus allowing the zero-day malware detection AI and its creators to learn better from their mistakes. Finally, cases of extreme change, such as new types of zero-day malware, deserve further investigation to extend the possible application spectrum.