Tier-Based Optimization for Synthesized Network Intrusion Detection System

The innovation and evolution of hacking methodologies have led to a sharp rise in cyber attacks, highlighting the need for enhanced network security approaches. Network intrusion detection systems based on machine learning are playing a significant role in the domain of network security. However, designing an optimal framework for a network intrusion detection system is an ongoing concern. In this study, an optimal framework for a network intrusion detection system based on image processing is proposed. The framework is a fusion of augmented feature selection flow with an image transformation and enhancement methodology. Initially, the proposed framework reduces the number of features to achieve overall efficiency. Later, the non-image data is transformed into images. The transformed images are then enhanced for achieving effective anomaly detection based on a deep-learning classifier. The proposed method is implemented on three diverse benchmark datasets of intrusion detection. To illustrate the efficiency of the proposed framework it is compared with some of the most recent publications on image-processing-based network intrusion detection systems.


I. INTRODUCTION
The associate editor coordinating the review of this manuscript and approving it for publication was Marina Gavrilova.
The pervasive use of interconnected computer systems has become an irreplaceable aspect of organizational and daily life activities. Concurrently, it has led to concerns about the online privacy and security of users [1], [2]. According to recent surveys, approximately 5.1 billion cyberattacks were reported in 2021 [3], [4]. The reports also indicate a surge in sophisticated and high-impact cyberattacks on critical infrastructure globally [4], [5]. Such a high number of cyberattacks indicates the need to enhance network security approaches. Machine Learning (ML) based Network Intrusion Detection Systems (NIDS) are considered to be among the most effective approaches to counter network attacks. However, sustaining the efficiency and effectiveness of ML-based NIDS against ever-mutating network attacks is a highly challenging task. Designing an optimal framework for ML-based NIDS is an ongoing struggle [6], [7], [8]. There is a constant trade-off between achieving high efficiency and high effectiveness. An ML-based NIDS with high efficiency may not be highly effective, while one with high effectiveness may not be highly efficient [9], [10]. In efforts to optimize ML-based NIDS, researchers have worked on multidimensional approaches, e.g., feature selection, data augmentation, classification algorithms, and hybrid algorithms [11], [12]. Even with all these efforts, the number of successful malicious attacks is increasing rapidly. Hence, a refined and scalable intrusion detection method is essential to counter the cybersecurity concern.
With the advancements in Deep Learning (DL) and image processing, security experts are exploring the possibilities of using DL and image processing for NIDS [13], [14], [15], [16], [17]. DL is an improved form of the neural network (NN), as it overcomes three significant training-phase issues of NN, namely over-fitting, vanishing gradient, and computational load [18]. The convolutional neural network (CNN) is among the DL models that are designed predominantly for image data [19]. CNN is among the recent and highly accurate classification approaches in image processing [20]. In the last few decades, the use of image processing in the healthcare industry has driven researchers to achieve extreme precision in image analysis, detection, and classification [21]. The exploration of image processing for NIDS is intriguing due to the high-precision results achieved by CNN and image processing methods. The fusion of image processing with NIDS is relatively new and requires innovation. One of the major concerns for image processing-based NIDS is the conversion of non-image network traffic into images for visual processing.
A few of the prominent methods for converting non-image data into images are converting a one-dimensional vector into a multi-dimensional matrix, using the Fourier domain, and using spectrogram-based image transformation [14], [15], [16]. These approaches raise concerns regarding general applicability and the quality of the image transformation, which are discussed in the related work section. Such issues have opened the door to further exploration of methods that can improve the conversion of non-image data into images. In our earlier study [17], we implemented an image processing-based NIDS with a CNN-based classifier. In that study, we used all the features of the implemented datasets. The inclusion of all features was based on the notion that higher-pixel images are conducive to detecting anomalies [22]. In the prior study [17], the accuracy of anomaly detection was above 90% on all the datasets. This work enhances the earlier work by augmenting the proposed framework with feature selection. This study attempts to contribute to two key areas of image-based NIDS. First, the framework uses a reformed filter-based feature selection flow to achieve overall optimization of the NIDS. The augmented feature selection approach increases the overall efficiency of the NIDS. Second, it presents an innovative framework to transform non-image data into image format. This transformation is divided into two steps. Initially, the framework transforms non-image data into images. Later, the converted images are enhanced to attain improved anomaly detection using a CNN-based classifier. Even with the smaller pixel representation of the images, the proposed framework achieved a detection rate of over 92% on the CSE-CIC-IDS 2018 [23], CIC-IDS 2017 [24], and ISCX-IDS 2012 [25] datasets.
The remainder of the paper is structured as follows: Section II discusses the related studies on ML, DL, and image processing-based NIDS. Section III presents the proposed methodology and elaborates on each step of the framework. Section IV gives details on the implementation of the proposed NIDS. Section V highlights the results and comparison of the proposed and recent prominent image-based NIDS approaches. Section VI discusses the outcomes of the proposed methodology in contrast to the results of the implemented comparative approaches. Section VII concludes the study with a future direction of research.

II. RELATED WORK
Researchers have worked extensively to incorporate ML in NIDS. Despite the extended research, achieving an optimal framework for ML-based NIDS remains a challenging and ongoing task. To optimize ML-based NIDS, researchers have explored hybrid and innovative approaches for data pre-processing, feature selection, and prediction algorithms. In recent developments, researchers are exploring DL for NIDS solutions [26]. Among these DL models, CNN is considered a highly effective and efficient model, generally due to its ability to reconstruct features and learn in-depth patterns from images [27]. Table 1 presents a summary of recent and prominent publications in the domain of image processing-based NIDS.
As seen in Table 1, most of the papers use two main approaches to convert non-image data into image format. One is to simply transform one-dimensional data into a multidimensional matrix. The second is to use the Fourier domain for the transformation of non-image data into image format. Both methods have advantages and disadvantages. For instance, the first approach is highly efficient but can compromise the correlation between features [32]. Such a compromise can influence the NIDS's ability to detect sophisticated attacks. On the other hand, image transformation based on the Fourier domain may have complexity issues when it comes to big data [33]. These issues highlight the room for improvement in converting non-image network traffic into image format. The application of CNN in the domain of NIDS brings high precision for classification. However, this high precision relies heavily on the transformation of non-image network data into images for a CNN-based classifier. For example, Xiao et al. [28] use a fusion of principal component analysis (PCA) and auto-encoder (AE) for feature engineering, after which two-dimensional images are created for a CNN-based classifier. The proposed hybrid approach was not very successful in detecting minority attack labels in the dataset. Similarly, Zhang et al. [34] proposed a highly complex approach for converting non-image data into images. The approach used the P-Zigzag algorithm to create two-dimensional greyscale images for a CNN (gcForest) classifier. Despite the high computational cost, the model was very effective in detecting anomalies. Further, Jiang et al. [35] proposed an effective but highly complex IDS approach. The framework initially balances the dataset by using one-side selection (OSS) to decrease the large number of samples in the majority category and the Synthetic Minority Over-sampling Technique (SMOTE) to increase the number of minority-class samples.
After data balancing, the spatial features are extracted using CNN, and temporal features are extracted using a bi-directional long short-term memory (BiLSTM). The fusion of CNN and BiLSTM also creates a deep layered network for classification. The discussed research highlights two main concerns of image processing-based NIDS. One is the continuous challenge of achieving an optimized NIDS framework. The second is the room for improvement in the approach of converting non-image network data into image format.

III. PROPOSED METHOD
The proposed framework is a fusion of two phases. In this section, both phases are discussed separately to give a clear idea of the proposed framework. First, the augmented flow of feature selection and data transformation is discussed. Second, the process of converting non-image data to image format is elaborated. Figure 1 represents the overall flow of the proposed NIDS framework.

A. DATASET PRE-PROCESSING
The datasets used for the implementation of the proposed NIDS framework are CSE-CIC-IDS 2018 [23], CIC-IDS 2017 [24], and ISCX-IDS 2012 [25]. These datasets are among the benchmarked and well-known datasets for testing NIDS [36]. The datasets are generated by modeling real-world traffic and attack patterns. To create the datasets, the attacks were generated over several days based on existing tools and profiles. The datasets also contain a large volume of both normal and attack traffic generated by various operating systems. For these reasons, the datasets present diverse and sophisticated attack approaches that are highly suitable for testing the proposed NIDS framework. The pre-processing steps applied to the datasets are the same for all the conducted experiments. Primarily, simple data cleaning is applied to the datasets. The basic cleaning resolved the issues of missing values, duplicate samples, and infinite-value records in the datasets. Samples with negative time values are also removed. In the CIC-IDS 2017 and ISCX-IDS 2012 datasets, the samples labeled ''BENIGN'' and ''NORMAL'', respectively, are very high in quantity. To avoid bias, the samples of the ''BENIGN'' and ''NORMAL'' classes are reduced. In the CIC-IDS 2017 dataset, the two classes ''Infiltration'' and ''Heartbleed'' are removed as they had insufficient samples. The three classes representing web attacks in the CIC-IDS 2017 dataset are merged into one class, ''Web attack''. Further, SMOTE is applied with Edited Nearest Neighbors (ENN) to clean the training sets of each dataset. The combined SMOTE-ENN procedure balances all the labels in the datasets. Table 2 presents the details of the datasets after pre-processing.
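The basic cleaning step described above can be illustrated with a minimal sketch. It assumes each flow record is a dictionary and that the time attribute is stored under the hypothetical key `flow_duration`; the actual column names depend on the dataset. Class reduction and SMOTE-ENN balancing are not covered here.

```python
import math


def clean_records(rows, time_key="flow_duration"):
    """Drop records with missing or infinite values, negative time, or exact duplicates.

    `rows` is a list of dicts; `time_key` is a hypothetical column name.
    """
    seen = set()
    cleaned = []
    for row in rows:
        values = list(row.values())
        # Missing values: None or NaN
        if any(v is None for v in values):
            continue
        if any(isinstance(v, float) and (math.isnan(v) or math.isinf(v)) for v in values):
            continue
        # Negative time samples
        if row.get(time_key, 0) < 0:
            continue
        # Exact duplicates (keys are unique, so sorting never compares values)
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


example_rows = [
    {"flow_duration": 10.0, "packets": 3, "label": "BENIGN"},
    {"flow_duration": 10.0, "packets": 3, "label": "BENIGN"},       # duplicate
    {"flow_duration": -1.0, "packets": 4, "label": "BENIGN"},       # negative time
    {"flow_duration": 5.0, "packets": float("inf"), "label": "DDoS"},
    {"flow_duration": 5.0, "packets": None, "label": "DDoS"},
    {"flow_duration": 7.0, "packets": 2, "label": "DDoS"},
]
cleaned_rows = clean_records(example_rows)
```

In this toy input only the first and the last records survive the cleaning.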

B. DATA TRANSFORMATION
Securing a network of diverse interconnected devices is a challenging task for ML-based NIDS. To optimize and facilitate ML-based NIDS, data normalization or transformation plays an integral part. The benchmarked datasets or real network traffic are not normally distributed and are skewed [37]. ML algorithms tend to perform better when the data is normalized, as normalization tends to strengthen the general structure and relation among features [38]. However, identifying the most appropriate normalization or data transformation for a dataset is a non-trivial task. In this study, the statistical approach proposed in [8] is implemented to identify the most appropriate normalization for the three datasets, as this statistical method is simple and efficient in terms of implementation and computational requirements. The only difference between Algorithm 1 in this study and the algorithm suggested in [8] is its position relative to feature selection. In this study, the algorithm is applied before feature selection, while in [8] it was applied after feature selection. Normalizing the dataset first presents a more prominent and suitable correlation between the features for a correlation-based feature selection. Algorithm 1 represents the flow in which the algorithm is implemented in this study.

Algorithm 1
Notation: (.., f_in); n = total features; N_m = m-th normalization.
Step 1: Pre-process the dataset.
Step 2: Apply transformation/normalization/scaling on the dataset (i.e., N_m).
Step 3: Compute the median, mean, and skewness of each feature.
Step 4: Calculate the average median, mean, and skewness of the dataset.
Step 6: Sum the R of median(m), mean(m), and skewness(m) to identify the appropriate normalization (N_m*).

For this study, five prominent normalization methods were implemented on all three datasets: MinMax, Robust scaler, Standard scaler, L2 normalization, and Yeo-Johnson. The MinMax [39] approach can mathematically be represented as (1). Equation (2) represents the Robust scaler [40] method, where 'x' denotes the values, Q1 is the 25th percentile, and Q3 is the 75th percentile.
Mathematically, the Standard scaler [39] can be represented as (3), where 's' signifies the standard deviation and 'µ' indicates the mean.
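For illustration, the three scalers named above can be sketched in plain Python using their commonly used definitions. This is a sketch under assumptions: the robust scaler is centered on the median and divided by the interquartile range (the convention used by common libraries), and the standard scaler uses the population standard deviation; the exact forms of (1) to (3) in this study may differ in detail.

```python
import statistics


def min_max_scale(values):
    """Rescale values to the [0, 1] range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def robust_scale(values):
    """Center on the median and scale by the interquartile range Q3 - Q1 (assumed form)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [(v - q2) / iqr if iqr else 0.0 for v in values]


def standard_scale(values):
    """Subtract the mean and divide by the population standard deviation (assumed form)."""
    mu = statistics.fmean(values)
    s = statistics.pstdev(values)
    return [(v - mu) / s if s else 0.0 for v in values]


feature = [1, 2, 3, 4, 5]
scaled_minmax = min_max_scale(feature)
scaled_robust = robust_scale(feature)
scaled_standard = standard_scale(feature)
```

The robust scaler is less sensitive to outliers than MinMax because it relies on quartiles rather than the extreme values of the feature.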
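The median, mean, and skewness are represented by (6), (7), and (8), respectively, with the median taking the piecewise form

$$\mathrm{Median} = \begin{cases} \dfrac{x_{(n/2)} + x_{(n/2+1)}}{2}, & \text{if } n \text{ is even} \\[6pt] x_{((n+1)/2)}, & \text{if } n \text{ is odd} \end{cases} \tag{6}$$

where ''n'' in (6), (7), and (8) represents the number of attributes or values in the dataset, and ''x'' denotes the attribute or value in a dataset. Further, in (8), ''x̄'' and ''s'' are the mean and standard deviation, respectively. For the suggested statistical process, skewness is considered as an absolute (positive) value. The median, mean, and skewness of the datasets can be seen in Table 3.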
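Steps 3 and 4 of Algorithm 1 can be sketched as follows. The skewness estimator used here, the sum of cubed deviations divided by (n − 1)s³ with s the sample standard deviation, is an assumption; the study only states that the mean and standard deviation enter (8) and that the absolute value is taken. The function names are illustrative.

```python
import statistics


def abs_skewness(values):
    """Absolute skewness: |sum((x - mean)^3) / ((n - 1) * s^3)| (assumed estimator)."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation
    if s == 0:
        return 0.0
    return abs(sum((v - mean) ** 3 for v in values) / ((n - 1) * s ** 3))


def dataset_summary(columns):
    """Average the per-feature median, mean, and |skewness| over all features.

    `columns` maps a feature name to its list of (already normalized) values.
    """
    medians = [statistics.median(col) for col in columns.values()]
    means = [statistics.fmean(col) for col in columns.values()]
    skews = [abs_skewness(col) for col in columns.values()]
    return {
        "median": statistics.fmean(medians),
        "mean": statistics.fmean(means),
        "skewness": statistics.fmean(skews),
    }


toy_columns = {"a": [1, 2, 3, 4, 5], "b": [2, 4, 6, 8, 10]}
toy_summary = dataset_summary(toy_columns)
```

Calling `dataset_summary` once per normalized copy of a dataset yields one row of statistics per normalization method, which is the kind of input a ranking step needs.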
After computing the metrics shown in Table 3, percentile ranking is applied to find the most appropriate normalization method. The formulas for rank and percentile are represented as (9) and (10),
where ''x'' is the number of values beneath the particular value, ''N'' signifies the total number of values, and ''n'' denotes the number of values. Ranks are allotted in descending order. Table 4 presents the ranking of each normalization approach based on the Rank and Percentile method. Based on Table 4, the Yeo-Johnson transformation attained the highest rank among all the normalization methods, except on the CSE-CIC-IDS 2018 dataset, where L2 normalization achieved the same rank as Yeo-Johnson. The classification results presented later show that Yeo-Johnson achieved a higher precision than L2 normalization.
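The rank-summing idea of Algorithm 1 can be sketched as below. This is only an illustration: the direction of ranking is an assumption (a smaller average median, mean, or absolute skewness earns more points), the candidate names `norm_a` to `norm_c` are placeholders, and the numbers are made up rather than taken from Table 3 or Table 4.

```python
def rank_points(values):
    """Give each candidate one point per candidate with a strictly larger value."""
    return {
        name: sum(1 for other in values.values() if other > v)
        for name, v in values.items()
    }


def select_normalization(summaries, metrics=("median", "mean", "skewness")):
    """Sum the per-metric points and return the candidate with the highest total."""
    totals = {name: 0 for name in summaries}
    for metric in metrics:
        points = rank_points({name: s[metric] for name, s in summaries.items()})
        for name, p in points.items():
            totals[name] += p
    best = max(totals, key=totals.get)
    return best, totals


# Hypothetical per-normalization statistics (not results from this study)
toy_summaries = {
    "norm_a": {"median": 0.30, "mean": 0.40, "skewness": 6.0},
    "norm_b": {"median": 0.10, "mean": 0.20, "skewness": 3.0},
    "norm_c": {"median": 0.05, "mean": 0.10, "skewness": 0.5},
}
best_norm, rank_totals = select_normalization(toy_summaries)
```

With these toy values `norm_c` collects the most points on every metric and is therefore selected.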

C. FEATURE SELECTION
In this age of big data, an immense amount of data is transferred every second. Such a high transaction rate makes real-time intrusion detection a problematic task. ML, which is the most suitable methodology for NIDS, tends to suffer from a low anomaly detection rate on high-dimensional data. Traditionally, features are selected after performing basic pre-processing. In this study, we experimented with applying power transformation before filter-based feature selection (FS), as normalizing data before applying a statistical FS can improve the probability of selecting relevant features. The feature selection flow is adopted from the study [7], as represented in Figure 2. Based on the studies in [7] and [11], Pearson correlation (PC) is implemented to select the features from the datasets. Equation (11) gives the mathematical representation of PC,
where P_xy is the PC coefficient value. Table 5 presents the total number of features selected by the PC approach.
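A Pearson-based filter can be sketched as follows. The coefficient itself follows the standard definition; the selection rule (keep a feature only if its absolute correlation with every already kept feature stays below a threshold) and the threshold value of 0.9 are assumptions made for illustration, since the exact selection criterion is defined in [7].

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant feature: correlation undefined, treated as 0
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Keep a feature only if it is not highly correlated with a kept feature.

    `threshold` is a hypothetical cut-off, not a value from the study.
    """
    kept = []
    for name, col in columns.items():
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


toy_features = {
    "f1": [1, 2, 3, 4],
    "f2": [2, 4, 6, 8],   # perfectly correlated with f1
    "f3": [1, 0, 1, 0],
}
selected_features = drop_correlated(toy_features)
```

Here `f2` is discarded because it carries the same information as `f1`, while `f3` is retained.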

D. CONVERTING DATASETS TO IMAGES
After feature selection, the ISCX-IDS 2012, CIC-IDS 2017, and CSE-CIC-IDS 2018 datasets are ready to be transformed into images. As Figure 1 highlights, the transformation of non-image data into image format consists of two phases. Initially, the DeepInsight-based [42] approach is implemented. Kernel Principal Component Analysis (KPCA) [43] is used to map the dataset features from a 1D space to a 2D space. Due to the mapping by KPCA, the dataset features are expected to be linearly separable. The 2D space mapping represents features as points in the Cartesian plane. The plotted points only represent the positions of the features in 2D space and not the attributes of those features. To facilitate the CNN-based classifier, the convex hull algorithm is used to find the smallest rectangle that contains all the mapped features of the dataset. The next step is to transform the Cartesian coordinates into pixels. During this transformation, some of the features are averaged due to the limited number of pixels, which is determined by the size of the image. Because feature selection reduces the number of features, the resulting images have a limited pixel representation. The newly generated frame of pixels represents the positions of the dataset features. The feature attributes are then mapped onto this pixel frame. Features that overlap on the same pixel are averaged and assigned to that pixel location. After this process, each sample in a dataset is converted into an image that represents the sample together with its label. Once all the datasets are converted from non-image data to image format, the Gabor filter [44], [45] is used to further improve the generated images.
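The coordinate-to-pixel step, including the averaging of overlapping features, can be sketched as below. The sketch assumes that the 2D feature coordinates (for example from KPCA) are already available and already aligned with the image axes, so the convex-hull rotation is omitted; the function names and the tiny 2 × 2 image are illustrative only.

```python
def coords_to_pixels(coords, size):
    """Map 2D feature coordinates to (row, col) indices of a size x size image."""
    xs = [c[0] for c in coords]
    ys = [c[1] for c in coords]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)

    def scale(v, lo, hi):
        if hi == lo:
            return 0
        # Clamp so the maximum coordinate falls into the last pixel
        return min(size - 1, int((v - lo) / (hi - lo) * size))

    return [(scale(y, min_y, max_y), scale(x, min_x, max_x)) for x, y in coords]


def sample_to_image(sample, pixels, size):
    """Place the feature values of one sample on the pixel frame.

    Features that share a pixel are averaged; unused pixels stay 0.
    """
    sums = [[0.0] * size for _ in range(size)]
    counts = [[0] * size for _ in range(size)]
    for value, (r, c) in zip(sample, pixels):
        sums[r][c] += value
        counts[r][c] += 1
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else 0.0 for c in range(size)]
        for r in range(size)
    ]


# Four features; the first and third land on the same pixel
toy_coords = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (1.0, 0.0)]
toy_pixels = coords_to_pixels(toy_coords, size=2)
toy_image = sample_to_image([2.0, 4.0, 6.0, 8.0], toy_pixels, size=2)
```

The pixel frame is computed once per dataset, whereas `sample_to_image` is applied to every sample, which is why a smaller feature set directly reduces the cost of generating images.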

1) GABOR FILTER
Gabor filters play a significant role in modifying, extracting, improving, or representing digital graphical data. These filters have also shown remarkable localization properties in both the frequency and spatial domains. Gabor filters can be considered a special kind of band-pass filter. Based on the configuration, they allow a particular band of frequencies to pass while stopping the others. The parameter settings for the Gabor filter depend on the task at hand. To implement the Gabor filter, two types of parameters are configured: first, the parameters that define the form of the Gabor filter, and second, the parameters that determine which features the Gabor filter responds to. The parameters used for the Gabor filter in the proposed framework can be seen in Figure 3.
A two-dimensional Gabor filter can be considered as a sinusoidal signal of a particular frequency and direction, modulated by a Gaussian envelope. To represent the orthogonal directions, the Gabor filter has both real and imaginary components. The complex, real, and imaginary forms of the Gabor filter are represented by Equations (12), (13), and (14), respectively. The real and imaginary components can be used separately or combined into a complex number. In these equations, x′ = x cos θ + y sin θ and y′ = −x sin θ + y cos θ; λ is the wavelength of the sinusoidal part, θ controls the orientation of the Gabor function, γ is the spatial aspect ratio, σ is the standard deviation of the Gaussian envelope, and φ is the phase offset of the sinusoidal function. The parameters λ, θ, γ, σ, and φ define the form of the Gabor function.
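The real component of the Gabor function can be sketched in plain Python using its commonly cited form, exp(−(x′² + γ²y′²) / (2σ²)) · cos(2πx′/λ + φ). The parameter values below are placeholders, not the settings of Figure 3, and the filtering helper uses zero padding, which may differ from the border handling of library implementations.

```python
import math


def gabor_kernel(ksize, sigma, theta, lambd, gamma, phi=0.0):
    """Real part of a Gabor kernel on a ksize x ksize grid (ksize assumed odd)."""
    half = ksize // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            x_r = x * cos_t + y * sin_t       # rotated coordinate x'
            y_r = -x * sin_t + y * cos_t      # rotated coordinate y'
            envelope = math.exp(-(x_r ** 2 + (gamma ** 2) * y_r ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * x_r / lambd + phi))
        kernel.append(row)
    return kernel


def filter_image(image, kernel):
    """Correlate a 2D image (list of lists) with a kernel, using zero padding."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i, k_row in enumerate(kernel):
                for j, k_val in enumerate(k_row):
                    rr, cc = r + i - half, c + j - half
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += k_val * image[rr][cc]
            out[r][c] = acc
    return out


# Placeholder parameters for illustration only
toy_kernel = gabor_kernel(ksize=5, sigma=2.0, theta=0.0, lambd=4.0, gamma=0.5)
toy_filtered = filter_image([[1, 2], [3, 4]], [[0, 0, 0], [0, 1, 0], [0, 0, 0]])
```

Varying θ over several orientations yields a bank of kernels; a single kernel is used here to keep the sketch short.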
After the transformation of images with the Gabor filter, the process of converting non-image data into image format is completed. Figure 4 illustrates an overview of the proposed procedure for converting non-image data into image format.

E. CNN-MODEL
The final block of the proposed framework is the CNN-based classifier. The CNN-based classifier is implemented due to its potential to achieve high accuracy and computational efficiency. It is also among the most prominent classifiers in recent research publications. Implementing a CNN-based classifier also provides grounds for comparing the proposed framework with recent prominent methods. The sequential CNN model implemented for the experiments consists of 12 layers: an input layer, three conv2D layers, four dropout layers, a flatten layer, and three dense layers including the output layer. The kernel size for each convolutional layer is three. The convolutional layers and the dense layers use ReLU as the activation function, whereas the output layer uses softmax. A dropout rate of 0.2 is used for the dropout layers. For training, the Adam optimizer with a learning rate of 0.001 is used. Sparse categorical cross-entropy is used as the loss function. The CNN model is implemented with the help of Keras (a Python library). Table 6 presents a summary of the parameter settings for the CNN model.
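To make the output activation and the loss concrete, the following plain-Python sketch computes a softmax over the output-layer scores and the sparse categorical cross-entropy for one sample, where the true class is given as an integer index. It illustrates the two functions only and is not the Keras implementation used in the experiments.

```python
import math


def softmax(logits):
    """Convert raw output-layer scores into class probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(probs, label):
    """Negative log-probability assigned to the true class index `label`."""
    return -math.log(max(probs[label], 1e-12))


toy_probs = softmax([0.0, 0.0])
toy_loss = sparse_categorical_crossentropy(toy_probs, label=1)
```

The "sparse" variant takes integer class labels directly, so the labels do not need to be one-hot encoded.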

IV. IMPLEMENTATION
The proposed framework is implemented in the Python (v 3.6) programming language using GPU-enabled TensorFlow (v 2.3.1) with the Keras framework. The Python-based DeepInsight tool is publicly accessible [42]. The tool was downloaded and integrated into the proposed framework. The Gabor kernel is created using the cv2 library. The filter2D method then convolves the images with the Gabor kernel to extract specific patterns from the images. To highlight the general applicability of the proposed framework, three different NIDS benchmark datasets are used. After converting the NIDS datasets to images, each image dataset is classified using the CNN classifier. To estimate the efficiency of the CNN classification, precision, accuracy, F1-score, recall, Cohen's kappa coefficient, and the receiver operating characteristic (ROC) curve are measured as performance assessment metrics. The classification precision, accuracy, F1-score, recall, and kappa coefficient are computed using Equations (15) to (19).
The accuracy represents the ratio of correctly predicted events to the total number of events. Precision can be defined as the percentage of correctly classified attacks among all the samples classified as attacks. The recall represents the ratio of all the correctly predicted attack samples to all the actual attack samples. The F1 score is the harmonic mean of precision and recall and is used to examine the correctness of a classification model. TP (True Positive) and TN (True Negative) are the correctly classified attack and normal events, respectively, whereas FP (False Positive) denotes normal events incorrectly classified as attacks and FN (False Negative) denotes attack events incorrectly classified as normal. The ROC curve is a visual depiction of the classification model at all classification thresholds. In equation (19), 'p_0' is the observed agreement (the overall accuracy) of the ML model, and 'p_e' signifies the agreement between the ML model estimates and the true class or label values that would be expected by chance. The CNN model is trained for 100 epochs with an 80/20 ratio of train and test datasets, respectively. Figure 5 highlights the flow, including the components, of the proposed NIDS framework. The components highlighted are the datasets used for the experimentation, the normalization approach, the feature selection method, the non-image to image conversion, and the image enhancement procedure.
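The metrics described above can be sketched for the binary (attack versus normal) case as follows. The experiments in this study are multi-class, so this is a simplified illustration using the standard definitions; the counts in the example are hypothetical.

```python
def binary_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, F1, and Cohen's kappa from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    # Cohen's kappa: observed agreement p0 versus chance agreement pe
    p0 = accuracy
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (p0 - pe) / (1 - pe) if pe != 1 else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only
toy_metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```

For these counts the accuracy is 0.85 and the kappa coefficient is 0.7, which shows how kappa discounts the agreement that would be expected by chance.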

V. RESULTS AND COMPARISON
To highlight the capability of the suggested framework it is compared with some of the recent notable approaches. The five comparative NIDS approaches implemented for comparison are shown in Table 7. Table 7 also presents a summary of the method adopted by the study to transform the non-image data into an image format.
To provide comparable grounds for evaluation, the proposed framework and the comparison methods are implemented with the same settings, such as the datasets after pre-processing and the CNN model for classification, while the approach for converting the non-image datasets to image format follows the method described in the respective published work. The first comparative approach used the fast Fourier transform (FFT) to create images from non-image data. The FFT is an optimized and fast algorithm for computing the discrete Fourier transform (DFT). The DFT can be represented as equation (20), where X(N) is the signal sampling in the time domain. To generate the FFT-based images, 5184 sampling points were taken as per the process explained in the research paper. To implement the STFT-based spectrogram images, the short-time Fourier transform (STFT) of a discrete-time signal x[n] can be represented as equation (21).
The x[n] = (f_1, .., f_{n−1}) represents the input data vector with 'f' as the features of the dataset, while 'm' represents the time and 'ω' represents the angular frequency. The mathematical representation of the Hanning window function (w_hn[n]) is given in equation (22), where 'N' represents the length of the observation time. The final step of generating the spectrogram images is based on equation (23).
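The windowed transform described above can be sketched in plain Python. The sketch assumes the symmetric Hann window 0.5(1 − cos(2πn/(N − 1))) and takes the squared magnitude of the DFT of each windowed frame as the spectrogram value; the frame length and hop size are hypothetical parameters, and a direct DFT is used instead of the FFT for brevity.

```python
import cmath
import math


def hann_window(length):
    """Symmetric Hann (Hanning) window of the given length (assumed form)."""
    if length == 1:
        return [1.0]
    return [0.5 * (1 - math.cos(2 * math.pi * n / (length - 1))) for n in range(length)]


def dft(signal):
    """Direct discrete Fourier transform (O(n^2)); an FFT gives the same result faster."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def spectrogram(signal, frame_len, hop):
    """Squared-magnitude STFT: one row of frequency bins per windowed frame."""
    window = hann_window(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append([abs(c) ** 2 for c in dft(frame)])
    return frames


toy_window = hann_window(5)
toy_spectrum = dft([1, 1, 1, 1])
toy_spec = spectrogram(list(range(8)), frame_len=4, hop=2)
```

Each row of `toy_spec` corresponds to one time frame, so the nested list can be rendered directly as a two-dimensional image.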
With the help of equations (21) to (23), the datasets were converted into spectrogram-based image datasets. The paper that implemented 2D gray-scale images presented two different methods of generating images from non-image datasets. Method one generates a 3-channel RGB (Red, Green, Blue) image, while method two generates a 1-channel 2D gray-scale image. Both methods follow the same process to generate the initial image for RGB and gray-scale conversion. After the initial pre-processing, the features of the dataset are re-scaled to values between 0 and 255. Then 2D images of 13*9 and 13*6 pixels are generated for the CSE-CIC-IDS 2018 and NSL-KDD datasets, respectively. For comparison purposes, the 2D gray-scale images of the datasets were generated based on the process defined by the paper. The fourth competitor is our earlier work, which followed the same approach as in this paper except for the augmented feature selection adopted in this study. The fifth approach is based on the DeepInsight methodology to create images from non-image data. This implementation highlights the image classification results without the fusion of the Gabor filter. In contrast to the comparative approaches, which were implemented on one or two datasets, the proposed framework in this paper is implemented on three different datasets. This highlights that the proposed framework is generally applicable and can achieve high-precision results. Table 8 presents the results of the proposed framework in contrast with the implemented comparative approaches.
The confusion matrix and ROC of the proposed and comparative approaches for the ISCX-IDS 2012 dataset can be seen in Figures 6 and 7, respectively. In Figure 7 (i.e., the ROC), Class 0 represents BruteForceSSH, Class 1 represents DDoS, and the remaining class labels in the ROC follow the sequence of the confusion matrix labels (i.e., Figure 6). Figures 8 and 9 present the respective confusion matrix and ROC of the proposed and comparative approaches for the CIC-IDS 2017 dataset. The 'Class' labels in the ROC (i.e., Figure 9) are in the same sequence as in Figure 8. That is, Class 0 represents BENIGN, Class 1 represents Bot, and so on.
The confusion matrix and ROC of the proposed and comparative approaches for the dataset CSE-CIC-IDS 2018 can be seen in Figures 10 and 11 respectively. As mentioned earlier the 'Class' labels in the ROC (i.e. Figure 11) are in the same sequence as in Figure 10. That is Class 0 represents Benign, Class 1 represents Bot, and onwards.
As the focus of this study is to achieve an optimized NIDS framework, the time consumed by each competitive approach and by the proposed framework is also compared. The time comparison covers only the time consumed by each method in transforming non-image data into image format, as the remaining steps are the same for each comparative methodology. The Python function 'time' [46] is used for time computation. Table 9 shows the time used by each implemented approach.
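A timing measurement of this kind can be sketched with the standard `time` module. The helper name `timed` and the timed call are illustrative; in practice the wrapped function would be the routine that converts a dataset into images.

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed wall-clock seconds)."""
    start = time.time()
    result = func(*args, **kwargs)
    elapsed = time.time() - start
    return result, elapsed


# Illustrative call; any conversion routine could be timed the same way
toy_result, toy_elapsed = timed(sum, [1, 2, 3])
```

Wall-clock timing depends on the hardware and on other running processes, so such measurements are only comparable when all methods run in the same environment.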
Understandably a time-based evaluation may not be a standard approach to signify the efficiency of the proposed framework. For instance, factors like hardware can influence the time dynamic of implemented methodology. However, for this study, all the approaches are implemented in the same environment. Therefore the time-based comparison can provide a rough intuition for the efficiency of the proposed and compared frameworks.

VI. DISCUSSION
Based on the results highlighted in the earlier section, it can be seen that the proposed NIDS framework was able to achieve competitive results. In this section, the results achieved on each dataset are discussed separately, starting with the CSE-CIC-IDS 2018 dataset. The proposed framework was able to achieve a precision of almost 98% on this dataset with only 72 features. In contrast, our earlier work [17] achieved a slightly higher precision on the same dataset but with 79 features. The other competitors were not able to achieve a precision higher than the suggested framework, despite using all 79 features of the dataset. The results of the proposed framework on the CIC-IDS 2017 dataset are the highest among the comparative approaches, even though the proposed method used only 61 features as compared to the 79 features used by all the comparative approaches. On the ISCX-IDS 2012 dataset, the proposed framework also attained the highest precision among the implemented methodologies, using only 41 features, whereas the competitive methods used 82 features of the dataset. As discussed earlier, in the era of big data a reduced number of features can play a vital role in optimizing an ML-based NIDS. The core purpose of this study was to attain an optimized framework for image processing-based NIDS. The implementation results highlight that the suggested system can play a significant role in optimizing image processing-based NIDS.

VII. CONCLUSION
The NIDS is among the most fundamental parts of providing network security. NIDS based on ML and DL is considered highly effective against elusive attacks on the network. DL algorithms are considered highly efficient in understanding the patterns of normal and abnormal behaviors on a network. Due to the advancements in the field of image processing, security experts are exploring the possibilities of building efficient NIDS based on image processing. In this study, a new framework for NIDS based on image processing is presented. The proposed framework follows a three-tier approach to generate a refined and improved representation of the non-image-based NIDS dataset. The framework reduces the number of features to achieve low computational cost with high precision. The feature selection process also normalizes the data for better interpretation of features by DL-based models. Although in image processing a larger image generally means higher precision, the proposed framework reduces the number of features and employs a fusion of DeepInsight with the Gabor filter to generate highly representative images of the non-image-based dataset. Such a representation can assist a CNN in learning deep and useful patterns from the images. To evaluate the efficiency and general applicability of the proposed framework, three different network intrusion detection datasets were used. The proposed framework achieved high accuracy on these datasets. For future work, it is planned to explore methods that can assist in identifying appropriate parameters for implementing the Gabor filter on network flow. Identifying such an approach can avoid the need to implement a bank of Gabor filters on the non-image datasets. Further, we plan to evaluate the potential of the proposed framework with a variety of other ML-based classifiers and to inspect methods that can identify attacks in live network traffic.