
Malware Detection: A Framework for Reverse Engineered Android Applications Through Machine Learning Algorithms


The image represents the proposed model of the article.

Abstract:

Today, Android is one of the most used operating systems in smartphone technology. This is the main reason Android has become the favorite target for hackers and attackers. Malicious code is being embedded in Android applications in such a sophisticated manner that detecting and identifying an application as malware has become one of the toughest jobs for security providers. In terms of ingenuity and sophistication, Android malware has progressed to the point where it is increasingly impervious to conventional detection techniques. Approaches based on machine learning have emerged as a much more effective way to tackle the intricacy and originality of evolving Android threats. They function by first identifying current patterns of malware activity and then using this information to distinguish between identified threats and unidentified threats with unknown behavior. This research paper uses features of reverse-engineered Android applications and machine learning algorithms to find vulnerabilities present in smartphone applications. Our contribution is twofold. Firstly, we propose a model that incorporates more innovative static feature sets and a larger dataset of malware samples than conventional methods. Secondly, we use ensemble learning with machine learning algorithms, e.g., AdaBoost and Support Vector Machine (SVM), to improve our model's performance. Our experimental results show 96.24% accuracy in detecting malware extracted from Android applications, with a 0.3% False Positive Rate (FPR). The proposed model incorporates previously ignored detrimental features such as permissions, intents, Application Programming Interface (API) calls, and so on, and is trained by feeding a solitary arbitrary feature, extracted by reverse engineering, as an input to the machine.
Published in: IEEE Access ( Volume: 10)
Page(s): 89031 - 89050
Date of Publication: 04 February 2022
Electronic ISSN: 2169-3536


SECTION I.

Introduction

Mobile devices have become an integral part of most people's daily lives. Android now runs on the vast majority of these devices, accounting for an average of 80% of the global market share over the past years [1]. As Android continues to spread to a growing range of smartphones and consumers around the world, malware targeting Android devices has increased as well. Because Android is an open-source operating system, it is exposed to a high level of danger, with malware authors and programmers embedding unwanted permissions, features and application components in Android apps. The option to extend its capabilities with third-party software is appealing, but it also carries the risk of malicious attacks. As the number of smartphone apps increases, so does the security problem of unnecessary access to personal resources. As a result, applications are becoming more insecure and are being used for stealing personal information, SMS fraud, ransomware, and similar attacks.

In contrast to manual static analysis of AndroidManifest.xml, source files and Dalvik bytecode, and to dynamic analysis, which runs a program in a controlled environment to study its behavior, machine learning learns the underlying rules and behavior patterns of benign and malicious apps from data. The static attributes derived from an application are used extensively in machine learning methodologies. The tedious task of obtaining them can be relieved if the static features of reverse-engineered Android applications are extracted automatically and machine learning algorithms such as SVM, logistic regression, ensemble learning and others are used to train a model that predicts these malware applications [2].

Machine learning employs a range of methodologies for data classification. SVM is a strong learner that plots each data item as a point in n-dimensional space (where n denotes the number of features), with the value of each feature being the value of one coordinate. It then performs classification by locating the hyperplane that best separates the two classes. Boosting or ensemble techniques such as AdaBoost, in contrast, assign higher weights to misclassified samples so that subsequent learners focus on them, and can be used in conjunction with other machine learning algorithms. When combined with weak classifiers, our preliminary model benefits from deploying such models since they achieve a high degree of classification precision. References [3], [4] and [5] use such classifiers in their system models to reach the highest accuracy. However, using ensemble or strong classifiers can cause problems such as multicollinearity, which occurs in a regression model when two or more independent variables are strongly associated with one another. In multivariate regression, this means that one independent variable can be predicted from another independent variable. This topic could form a detection study in its own right, with several experiments and results based on machine learning models [6], [7].
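
The reweighting idea behind AdaBoost can be illustrated with a minimal sketch. It assumes binary (0/1) features and uses one-feature decision stumps as weak learners. The function names, the +1/-1 label encoding (+1 for malware) and the toy data are illustrative assumptions, not the implementation used in this study.

```python
import math


def stump_predict(row, feature, polarity):
    """Weak learner: vote `polarity` if the feature is present, else the opposite."""
    return polarity if row[feature] == 1 else -polarity


def train_stump(X, y, weights):
    """Return (weighted_error, feature, polarity) of the best one-feature stump."""
    best = None
    for feature in range(len(X[0])):
        for polarity in (1, -1):
            error = sum(
                w for w, row, label in zip(weights, X, y)
                if stump_predict(row, feature, polarity) != label
            )
            if best is None or error < best[0]:
                best = (error, feature, polarity)
    return best


def adaboost(X, y, rounds=3):
    """Train `rounds` stumps, raising the weight of misclassified samples each round."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, feature, polarity = train_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # vote strength of this stump
        ensemble.append((alpha, feature, polarity))
        # Misclassified samples are multiplied by exp(+alpha), correct ones by exp(-alpha).
        weights = [
            w * math.exp(-alpha * label * stump_predict(row, feature, polarity))
            for w, row, label in zip(weights, X, y)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, row):
    """Weighted vote of all stumps; +1 means malware in this sketch."""
    score = sum(a * stump_predict(row, f, p) for a, f, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): an app is "malware" when at least two of three features are set.
X = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]
y = [1 if sum(row) >= 2 else -1 for row in X]

single = adaboost(X, y, rounds=1)
boosted = adaboost(X, y, rounds=3)
```

On this toy set a single stump classifies only six of the eight samples correctly, while three boosting rounds recover the majority rule, because each round shifts weight onto the samples the previous stump got wrong.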

In the latest versions of the Android operating system (OS), any app that requires access privileges may ask the OS for permission, and the OS asks the user through a pop-up whether to approve or decline the request. Many studies have been conducted on the effectiveness of this resource management strategy. Research shows that consumers tend to grant all requests made by applications [8]. Moreover, more than 70% of Android mobile applications request permissions that the app does not actually need. A chess game that asks for access to photographs, requests SMS and phone call permissions, or loads unwanted packages is an example of such extra requested authorization. Determining whether an app is malicious without understanding the app is therefore a tough challenge. As a result, successful monitoring of malicious apps will provide extra information to customers to assist them and defend them from information disclosure [9]. Figure 1 elaborates the Android risk framework through the Google Play platform, which is then manually configured by the Android device developers.

FIGURE 1. - Android security framework.

In contrast to other smartphone operating systems, such as iOS, Android allows users to install apps from untrusted outlets such as file-sharing sites or third-party app stores. The malware problem has become so severe that 97% of all smartphone malware now targets Android phones. As the number of smartphones grows, about 3.25 million new malicious Android applications are discovered in a year. This is roughly equivalent to a new Android malware sample appearing every ten seconds [10]. The primary aim of mobile malware is to gain access to user data stored on the device and to user information used in confidential financial activities, such as banking. Infected file extensions, files received via Bluetooth, links to infected code in SMS, and MMS application links are all ways in which mobile malware can propagate [11]. There are some strategies for locating apps that request additional features. Using these approaches, it should be easy to assess whether the applications labelled as suspicious and requiring extra authorization are malicious.

Static analysis methodologies are the most fundamental of all approaches. The manifest file and source code are examined before the program is run [12]. For many machine learning tasks, such as enhancing predictive performance or simplifying complicated learning problems, ensemble learning is regarded as the most advanced method. It improves on a single model's prediction performance by training several models and combining their predictions. Boosting, bagging, and random forest are examples of common ensemble learning techniques [13]. In summary, the main contributions of our study are as follows:

  1. We present a novel subset of features for static detection of Android malware, consisting of seven selected feature sets (Permissions, App Components, Method Tags, Intents, Packages, API Calls, and Services/Receivers) that together comprise around 56,000 features. We assess their stability on a collection of more than 500k benign and malicious Android applications, which includes a larger malware sample set than any state-of-the-art approach. The results show detection accuracy increasing to 96.24% with 0.3% false positives.

  2. With the additional features, we have trained six classifier models or machine learning algorithms and also implemented a boosting ensemble learning approach (AdaBoost) with a Decision Tree, based on binary classification, to enhance our prediction rate.

  3. Our model is trained on large, time-aware samples of malware collected within recent years, covering a later Android API level than state-of-the-art approaches.

This research paper uses binary vector mapping for classification, assigning 0 to malicious applications and 1 to non-harmful ones, for predictive analysis of each application fed to the model implemented in the study. The technique eases the process by reducing prediction errors. Figure 2 shows the procedure for a better understanding of the concept applied later in our study. Both categories of applications pass through static analysis and are then processed for feature extraction. After extraction, the features are represented as 0s and 1s. The matrix displays the extracted characteristics of each application used in the dataset.
FIGURE 2. - Static binary matrix extraction.
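
The binary mapping described above can be sketched in a few lines. The helper names and the three sample apps are hypothetical; only the label convention (0 for malicious, 1 for benign) is taken from the text.

```python
def build_vocabulary(apps):
    """Collect every distinct feature name seen in any app, in a stable order."""
    return sorted({feature for features, _ in apps for feature in features})


def to_binary_matrix(apps, vocabulary):
    """Map each app to a 0/1 row (one column per feature) and a 0/1 label."""
    column = {name: j for j, name in enumerate(vocabulary)}
    matrix, labels = [], []
    for features, is_malware in apps:
        row = [0] * len(vocabulary)
        for feature in features:
            if feature in column:
                row[column[feature]] = 1  # 1 = feature present in this app
        matrix.append(row)
        labels.append(0 if is_malware else 1)  # convention from the text: 0 = malicious
    return matrix, labels


# Hypothetical apps: (set of extracted features, is_malware)
APPS = [
    ({"android.permission.SEND_SMS", "android.permission.INTERNET"}, True),
    ({"android.permission.INTERNET"}, False),
    ({"android.permission.CAMERA", "android.permission.INTERNET"}, False),
]

VOCAB = build_vocabulary(APPS)
MATRIX, LABELS = to_binary_matrix(APPS, VOCAB)
```

Each row of `MATRIX` corresponds to one application and each column to one feature in `VOCAB`, so the same column order must be reused when new applications are encoded for prediction.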

There are major issues to be addressed in order to implement our strategy. The high dimensionality of the features makes it difficult to identify malware in many real-world Android applications, and certain features are shared by innocuous apps and malware [14]. In addition, the vast number of features leads to high computational cost. Therefore, from the features derived directly from Android apps, we aim to learn the most common and significant ones. To resolve this problem, the paper implements prediction models and various ensemble learning strategies to boost accuracy [15]. Feature selection is an essential step in all machine learning approaches. An optimal collection of features not only helps improve test outcomes but also helps reduce the complexity of most machine learning algorithms [16].

Studies have extensively suggested three separate methods for identifying Android malware: static, dynamic, and hybrid. Static analysis techniques look at the code without ever running it, so they are slow if carried out manually and produce many false positives [17]. Data obfuscation and dynamic code loading are both significant pitfalls of the technique. That is why automation helps to achieve reliability, accuracy, and lower time consumption [18]. Our approach reverse engineers Android applications, extracts features, and performs static analysis on them without having to execute them. This method entails working on the file with the .apk extension and examining the contents of two files: AndroidManifest.xml and classes.dex. Feature selection techniques and classification algorithms are two crucial areas of feature-based detection of fraudulent applications. Feature filtering methods are used to reduce the dimensionality of a dataset: features (attributes) that are not helpful in the study are omitted from the data collection, and the remaining features are chosen by weighing the representational strength of all the dataset's features [19]. Parsing tools can help learn which permissions, packages or services an application declares by analyzing the AndroidManifest.xml file, such as the permission android.permission.CALL_PHONE, which allows an application to misuse calling abilities. Decoding the classes.dex file with the Jadx-gui disassembler reveals exactly which sensitive APIs an application calls [20]. In certain cases, the presence of two permissions in a single app can signal possible malicious behavior. For example, an application with the RECEIVE_SMS and WRITE_SMS permissions can mask or interfere with received text messages [21], and calling a sensitive API such as sendTextMessage can also be harmful and lead to fraud and theft.
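
As an illustration of the manifest parsing step, the sketch below reads requested permissions from a decoded, plain-text AndroidManifest.xml and flags the permission pair mentioned above. It assumes the manifest has already been decoded to text by a reverse-engineering tool (inside an APK it is stored in a binary format); the function names, the sample manifest and the list of suspicious combinations are illustrative only.

```python
import xml.etree.ElementTree as ET

# Namespace used by android:* attributes in manifest files.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME_ATTR = f"{{{ANDROID_NS}}}name"

# Illustrative list; only the first pair is taken from the text above.
SUSPICIOUS_COMBOS = [
    {"android.permission.RECEIVE_SMS", "android.permission.WRITE_SMS"},
]


def requested_permissions(manifest_xml):
    """Return the set of permission names declared via <uses-permission>."""
    root = ET.fromstring(manifest_xml)
    return {
        element.get(NAME_ATTR)
        for element in root.iter("uses-permission")
        if element.get(NAME_ATTR)
    }


def flagged_combinations(permissions):
    """Return every suspicious combination fully contained in `permissions`."""
    return [combo for combo in SUSPICIOUS_COMBOS if combo <= permissions]


# Hypothetical decoded manifest of a sample application.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.demo">
    <uses-permission android:name="android.permission.RECEIVE_SMS"/>
    <uses-permission android:name="android.permission.WRITE_SMS"/>
    <uses-permission android:name="android.permission.CALL_PHONE"/>
    <application/>
</manifest>
""".strip()

PERMISSIONS = requested_permissions(SAMPLE_MANIFEST)
FLAGS = flagged_combinations(PERMISSIONS)
```

The extracted permission names can be used directly as feature names, and each flagged combination can be added as one extra binary feature.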

So far, we have introduced the main idea of the project. We explained that Android applications pose many threats to their users because of the unnecessary code compiled inside them, and why it is necessary to automate static analysis for the efficient detection of malware applications based on extracted features. The rest of the paper is organized as follows. Related works are examined in Section II. Section III presents the design details of the proposed model. Section IV elaborates the assessment findings and future threats. The experiments and results are detailed in Sections V and VI. Section VII covers our research issues, recommendations, and conclusions for the future.

SECTION II.

Related Works

Linux (the Android core) maintains key aspects of the security infrastructure of the operating system. Before an application or update is installed, Android displays to the user the list of features it requests, and the program is installed on the device once the user grants access. Figure 3 shows the integrated core parts of the Android architecture. It comprises applications at the top layer, followed by an application framework, libraries together with a runtime layer, and a Linux kernel. These layers are further divided into the components that make up an Android application. The Linux kernel is the key part of Android that provides OS functionality to phones, and the Dalvik Virtual Machine (DVM) runs applications on the mobile device. The application layer is the highest layer of the Android architecture. Native and third-party apps such as contacts, email, audio, gallery, clock, sports, and so on are located only in this layer. The application framework provides the classes commonly used to develop Android apps. It also handles the user interface and device infrastructure and provides a common abstraction for hardware access. To facilitate Android development, the platform libraries include many C/C++ core libraries and Java-based libraries such as SSL, libc, Graphics, SQLite, Webkit, Media, Surface Manager, OpenGL, and others. The taxonomy helps the reader grasp the core layers and functionality of the operating system in a logical way.

FIGURE 3. - Taxonomy of android architecture.

The methods proposed in the related work contribute to key aspects such as the features selected for classification and a higher predictive rate for malware detection. Some research has focused on increasing accuracy, some on providing a larger dataset, and some on employing various feature sets, while many studies have combined all of these to improve detection efficiency. In [22], the authors offer a system for detecting Android malware apps to aid in the organization of the Android Market. The proposed framework aims to provide a machine learning-based malware detection system for Android to detect malware apps and improve phone users' safety and privacy. This system monitors different permission-based characteristics and events acquired from Android apps and examines these features using machine learning classifiers to determine whether the program is goodware or malicious. The paper uses two datasets with a combined 700 malware samples and 160 features. Both datasets achieved approximately 91% accuracy with the Random Forest (RF) algorithm. The work in [23] examines 5,560 malware samples, detecting 94% of the malware with minimal false alarms, and the explanations supplied for each detection disclose key features of the identified malware. Another technique [24] exceeds both static and dynamic methods that rely on system calls in terms of resilience. Through three interrelated assessments with satisfactory malware samples from different sources, the researchers demonstrated that the model consistently attains maximum classification performance and better accuracy over nine years compared to two state-of-the-art peer methods that represent static and dynamic methodologies. The model consistently achieved a 97% F1-measure for identifying applications or categorizing malware.
The authors of [25] present an Android malware detection approach dubbed Permission-based Malware Detection Systems (PMDS), based on a study of 2950 samples of benign and malicious Android applications. In PMDS, requested permissions are viewed as behavioral markers, and a machine learning model is built on those markers to detect potentially dangerous behavior in unknown apps from the combination of permissions they require. PMDS identifies 92–94% of all previously unknown malware, with a false positive rate of 1.52–3.93%. The authors of [26] use only the Random Forest supervised classifier, an ensemble learning method, on Android malware samples with 42 features. Their objective was to assess Random Forest's accuracy in identifying Android application activity as harmful or benign. Their Dataset 1 is built from 1330 malicious APK samples and 407 benign ones collected by the authors, with a feature vector gathered for each application. Based on an ensemble learning approach, Congyi proposes a concept in [27] for recognizing and distinguishing Android malware. First, a static analysis of the Android Manifest file in the Android Application Package (APK) is done to extract system characteristics such as permission calls, component calls, and intents. Then, to detect malicious apps, they deploy the XGBoost technique, which is an implementation of ensemble learning. More than 6,000 Android apps from the Kaggle platform provided the initial data for this experiment. They tested both benign and malicious apps based on 3 feature sets, with a testing set of 2,000 samples and the remaining data forming a training set of 6,315 samples. Additional approaches include [28], an SVM-based malware detection technique for the Android platform that uses both dangerous permission combinations and vulnerable API calls as features in the SVM algorithm.
Its dataset includes 400 Android applications: 200 benign apps from the official Android market and 200 malicious apps from the Drebin dataset. The method in [29] determines whether a program is dangerous and, if so, categorizes it as part of a malware family. It obtains up to 99.82% accuracy with zero false positives for malware detection at a fraction of the computation power of state-of-the-art methods, but incorporates a minimal feature set. The results of [30] demonstrate that deep learning is adequate for classifying Android malware and that it is much more successful when additional training data is available. A permission-based strategy for identifying malware in Android applications is described in [31], which uses filter feature selection algorithms to pick features and implements machine learning algorithms such as Random Forest, SVM, and J48 to classify applications as malware or benign. The research in [32] provides feature selection using a Genetic Algorithm (GA) approach for identifying Android malware. For identifying and analyzing Android malware, three alternative classifier techniques with distinct feature subsets were built and compared using GA. Another technique achieves satisfactory accuracy, but its FPR is very high and its sample set is limited [33].

One important matter that has been considered by very few of the studies is the sustainability of the model as applications evolve. This issue is still a challenge for our research as well. A model's ability to classify gradually decreases over time as new features or evolved applications are created. Only [29] and [26] address this issue and introduce it as concept drift, describing the lower performance of their systems after some time. Our research does not address this problem either, but we suggest some potential studies to initiate solutions for model sustainability in the research issues and challenges section. Another matter that can arise when implementing machine learning algorithms is the multicollinearity problem, which we discussed in the introduction. It arises because the algorithms depend on multiple variables embedded in these machine learning or deep learning models. Although it is one of the rising issues in the area and could be present in our study, it would be better treated as separate research. Our model already incorporates a wide range of evaluations and analyses of Android application feature sets, but this would be a great opportunity to further enhance the models for future use. There are relevant studies that support alleviating this challenge by detecting a model's dependencies, comparing multiple models together and then calculating the impact of the best-performing model. The authors of references [34], [35], [36] consider different scenarios for different machine learning models to identify suitable measures for each.

Tables 1 and 2 elaborate on the novelty of our approach and compare state-of-the-art methodologies in several categories. Table 1 focuses on the key categories: malware samples, feature sets, the method proposed, accuracy, false-positive rate, the API level (reflecting increasingly complex application behavior) and the system environment for data processing. It also shows that our sample set and feature set are larger and achieve satisfactory accuracy with a 0.3% FPR, the lowest false-positive rate apart from DroidSieve. Our contribution lies in the upgraded API levels and the large sample sizes, together with enhanced feature sets, for detecting malware. Table 2 takes a more in-depth approach and shows the key features present in the proposed and other approaches, along with the time awareness of the data collected.

TABLE 1 Relative Techniques Analysis on Basis of Multiple Factors in Comparison to Proposed Approach (PER: Permissions, STR: String, API: Application Programming Interface, INT: Intents, PKG: Package, APP-C: App Components, SR: Services, RS: Receivers)
TABLE 2 Relative Techniques Analysis on Basis of Features and Sample Collected in Comparison to Proposed Approach

A. Reverse Engineered Applications Characteristics

Different Android apps have different functionalities. If an app is to use device resources, the corresponding permissions must be declared in the Android Manifest file. Different types of programs therefore have different permission declarations [37], [38]. Static analysis of the system then identifies an application as malicious or benign, with the classifier making its decisions based on features. The article shows the taxonomy diagram for the features present in Android applications [39]. It comprises all the components present in APK files and shows how they appear when reverse engineered using a disassembler, in our case Jadx-gui. Fig. 4 shows the process of APK file disassembly.

FIGURE 4. - Reverse engineering APK files architecture.

1) AndroidManifest.xml

In the root folder of any reverse-engineered application, there must be an AndroidManifest.xml file. The manifest file gives essential information about the mobile application, which the framework requires before executing any of the app's code. The permission mechanism should protect the application's key elements, which include the Activity, Service, Content Provider, and Broadcast Receiver. This is mainly accomplished by associating these components with the relevant element in the manifest definition and letting Android dynamically enforce the features in the closely associated contexts [28].

Fig. 5 shows the taxonomy of the Android manifest components, which contain all the requested permissions, packages, intents and features for extraction.

FIGURE 5. - Taxonomy of android manifest.

B. Feature Set Extraction

Feature filtering decreases the dimensionality of the data collection by deleting features that are not useful for the study. We chose features based on their ability to represent the whole dataset. An effective feature selection process improves efficiency by reducing the dataset size and the time spent on classification. Our process does not use a modified Android emulator, because it is not a convenient approach and we intend our system to work with physical devices in the future. Jadx carries out the decompilation and inspection of source code. The system concentrates on hooking byte-level API calls [40]. For our dataset, features were extracted from over 100,000 applications, yielding around 56,000 extracted features. Opcode and API features of functions and methods are extracted from the disassembled Smali and Manifest files of an APK file. Each Smali file is segmented by method, and the frequency of Dalvik opcodes for every method is determined by scanning the Dalvik bytecode. During the same bytecode scan, it is also possible to determine how often each method invokes a hazardous API, which verifies whether a hazardous API is invoked in that method. For string features, strings are collected from the entire set of Smali files without separating them by method [41].
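
The per-method scan of Smali code described above can be sketched as follows. This is a simplified illustration under stated assumptions: the set of sensitive API names, the regular expressions and the sample method are hypothetical, and real Smali files contain constructs (annotations, packed-switch tables, array data) that this sketch does not handle.

```python
import re
from collections import Counter

# Illustrative set of sensitive API method names (assumption, not the paper's list).
SENSITIVE_APIS = {"sendTextMessage", "getDeviceId"}

METHOD_START = re.compile(r"^\.method\s+(.*)$")
# Matches e.g.: invoke-virtual {v0, v1}, Lcom/example/Foo;->bar(I)V
INVOKE = re.compile(r"^invoke-\S+\s+\{[^}]*\},\s*(\S+?)->([\w<>$]+)\(")


def scan_smali(text):
    """Return {method_signature: (opcode_counts, sensitive_api_counts)}."""
    result = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        start = METHOD_START.match(line)
        if start:
            current = start.group(1).split()[-1]  # drop access modifiers
            result[current] = (Counter(), Counter())
            continue
        if line.startswith(".end method"):
            current = None
            continue
        if current is None or line[0] in ".:":
            continue  # directives (.locals) and labels (:cond_0) are not opcodes
        opcodes, apis = result[current]
        opcodes[line.split()[0]] += 1
        call = INVOKE.match(line)
        if call and call.group(2) in SENSITIVE_APIS:
            apis[call.group(2)] += 1
    return result


# Hypothetical Smali method that sends a text message.
SAMPLE_SMALI = """
.method public sendPremium()V
    .locals 6
    invoke-static {}, Landroid/telephony/SmsManager;->getDefault()Landroid/telephony/SmsManager;
    move-result-object v0
    const-string v1, "12345"
    invoke-virtual/range {v0 .. v5}, Landroid/telephony/SmsManager;->sendTextMessage(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Landroid/app/PendingIntent;Landroid/app/PendingIntent;)V
    return-void
.end method
"""

STATS = scan_smali(SAMPLE_SMALI)
```

The two counters per method correspond to the opcode frequency and the hazardous API invocation frequency mentioned in the text; either can be thresholded to produce binary features.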

We cannot expect a reliable predictive model when the number of features in a dataset exceeds the number of observations. In other words, when we do not have enough data to train our machine on, building a model that captures the association between the predictor variables and the response variable becomes problematic.

The system used in this study also incorporates larger feature sets for classification. Although this problem arises quite often in machine learning, the choice of model for detection or classification strongly affects how well high-dimensional data is handled. Support vector machines and AdaBoost handle such data relatively better than other algorithms because of the way they partition a high-dimensional space with hyperplanes. Another factor for our datasets was the tool used for extracting them. Androguard automates parsing and analysis to further break down the components of application APKs after decompiling, and supports encoding the data in binary form, making it easy to use relevant data for classification. It uses certain functionality to obtain useful features from the manifest files of these Android applications, reducing the acquisition of irrelevant features. Although the data in this study works well for evaluation, the datasets will need to be updated as applications continue to evolve.

Other authors have presented many tools and proposals for dealing with high-dimensional data, such as [42], [43], including methods such as filtering and wrapping to enhance robustness.
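
As one example of a filter method, the sketch below ranks binary features by information gain with respect to the class label and keeps the top k. Information gain is a common filter criterion chosen here for illustration; the text does not state which criterion the study uses, and the function names and toy data are assumptions.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        result -= p * math.log2(p)
    return result


def information_gain(column, labels):
    """Reduction in label entropy after splitting on one binary feature."""
    n = len(labels)
    gain = entropy(labels)
    for value in (0, 1):
        subset = [y for x, y in zip(column, labels) if x == value]
        gain -= (len(subset) / n) * entropy(subset)
    return gain


def select_top_k(matrix, labels, names, k):
    """Return the names of the k features with the highest information gain."""
    scored = []
    for j, name in enumerate(names):
        column = [row[j] for row in matrix]
        scored.append((information_gain(column, labels), name))
    scored.sort(key=lambda item: (-item[0], item[1]))  # ties broken by name
    return [name for _, name in scored[:k]]


# Toy data (hypothetical): labels follow the text's convention, 0 = malicious, 1 = benign.
NAMES = ["SEND_SMS", "INTERNET", "CAMERA"]
MATRIX = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
]
LABELS = [0, 0, 1, 1]

SELECTED = select_top_k(MATRIX, LABELS, NAMES, k=1)
```

In this toy set `INTERNET` appears in every app and `CAMERA` is independent of the label, so both score zero, while `SEND_SMS` separates the classes perfectly and is the only feature kept.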

The feature set of our model includes:

  • F1 → Permissions

  • F2 → API Calls

  • F3 → Intents

  • F4 → App Components

  • F5 → Packages

  • F6 → Services

  • F7 → Receivers

1) Permissions

A permission is a security feature that limits access to certain information on a smartphone, with the role of protecting sensitive data and functions that might be exploited to harm the user's experience. A unique label is assigned to every permission, which typically denotes a restricted operation. Google further categorizes permissions into four protection levels: normal, dangerous, signature, and signatureOrSystem. Researchers take a variety of approaches to evaluating Android permissions [44]. Normal (also called safe) permissions, such as VIBRATE and SET_WALLPAPER, are permissions without risk. The Android package installer does not ask the user to approve these permissions. Permissions at the dangerous level present a warning to the user before they take effect and require the user's consent. The signature and signatureOrSystem protection levels cover the riskiest permissions. Signature permissions are granted only to applications signed with the same certificate as the application that declares the permission [45]. The Linux kernel also acts as an abstraction layer between the hardware and the rest of the stack. A variety of C/C++ core libraries, such as libc and SSL, are used in the libraries layer. The Dalvik virtual machine and core libraries are part of the Android Runtime. The application framework defines classes for developing Android applications, as well as a standardized structure for hardware control and the management of user experience and app resources. API libraries are used by both proprietary and third-party applications [46]. Table 3 shows some dangerous permissions that pose problems in the reverse-engineered applications.

TABLE 3 Dangerous Permissions (Malware Probability)

2) Intents

An Android Intent is the message delivered among modules such as activities, content providers, broadcast receivers, and services. It is commonly used with the startActivity() function to launch activities, and intents are likewise used to start services and to deliver broadcasts to broadcast receivers. Individual intent counts are used as a continuous feature in classification. To provide more specificity, we divide the list of intents into further groups: intents containing the string android.net, which are linked to the network manager; intents containing com.android.vending, which relate to billing transactions; and intents addressing framework components (com.android), which can prove to be harmful elements in these apps.
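
The grouping of intents by prefix can be sketched as below. The group names and the sample intent actions are illustrative; the three prefixes are the ones named in the text. The more specific prefix com.android.vending is checked before com.android so that billing intents are not counted as framework intents.

```python
from collections import Counter

# (group name, prefix), ordered from most to least specific.
INTENT_GROUPS = [
    ("network", "android.net"),
    ("billing", "com.android.vending"),
    ("framework", "com.android"),
]


def group_intents(intent_actions):
    """Count intents per group; anything unmatched is counted as 'other'."""
    counts = Counter()
    for action in intent_actions:
        for group, prefix in INTENT_GROUPS:
            if action.startswith(prefix):
                counts[group] += 1
                break
        else:
            counts["other"] += 1
    return counts


# Hypothetical intent actions extracted from one application.
SAMPLE_INTENTS = [
    "android.net.conn.CONNECTIVITY_CHANGE",
    "com.android.vending.INSTALL_REFERRER",
    "com.android.vending.billing.InAppBillingService.BIND",
    "com.android.launcher.action.INSTALL_SHORTCUT",
    "android.intent.action.BOOT_COMPLETED",
]

COUNTS = group_intents(SAMPLE_INTENTS)
```

The resulting per-group counts can be used as the continuous intent features mentioned above, or thresholded into binary features.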

3) API Calls

Protected APIs are resources that are accessible only through the operating system. GPS, camera, SMS, Bluetooth, and network or data are some examples. To make use of such resources, the application must declare them in its manifest [47]. Cost-sensitive APIs are those whose use can incur costs, such as SMS, data or network, and NFC. Each version includes these APIs in the OS-controlled set of protected APIs that require the device user's explicit permission. API calls that grant access to sensitive information or device resources are commonly found in malicious code. These calls are isolated and compiled into a separate feature set because they might contribute to harmful activity. Table 4 lists dangerous API features:

TABLE 4 Frequently Deployed Malware Sensitive API Calls

4) API Components

An application that needs data or functionality from another application (for example, a route from point A to point B based on the user’s location) makes a call to that application’s API, stating the data or functionality it requires. The other application then supplies the data or functionality that the first one requested. For privacy reasons, some API features must be declared and not used in these apps. These components relate to the broadcast features present in these applications.

5) Packages, Services and Receivers

The package manifest is always located in the package’s root and provides information about the package, such as its registered name and sequence number. It also specifies important information to convey to the user, such as a user-friendly name for the program that is displayed in the User Interface (UI). For packages, the file is in .json format.

Following a publish-subscribe model, Android apps can send and receive broadcast messages from the Android system and other Android apps. These broadcasts are sent when a noteworthy event occurs. The Android system, for example, sends broadcasts when system events occur, such as the system booting up or the smartphone starting to charge. Apps can register to receive specific broadcasts [48]. When a broadcast is sent, the system automatically routes it to the applications that have registered to receive that type of broadcast.

Services, unlike activities, do not have a graphical user interface. They are used to build long-running background processes or a rich communications API that other programs may access. In the manifest file, all services are represented by <service> elements, and they allow the developer to invalidate the structure of the application.

C. Classification

The chosen features in the signature database are separated into training and test data and used to recognize Android malware apps with traditional machine learning techniques [49]. There are three machine learning paradigms: supervised learning, unsupervised learning, and reinforcement learning. This paper focuses on supervised learning, which comprises algorithms that learn a model from externally provided instances of known data and known results, so that the learned model can predict outcomes for new data [50]. Deploying ensemble techniques and strong classifiers helps classify our binary feature sets, resulting in correctly categorized malware and benign samples. We believe that these classification mechanisms produce efficient outputs because of their sorting nature. Fig. 6 explains the process of the learning model.

FIGURE 6. - Machine learning process.

For our model, algorithm selection is comparative: AdaBoost, Naive Bayes, Decision Tree, K-Neighbors, Gaussian NB, Random Forest, and Support Vector Machine classifiers are reviewed against one another to determine which algorithm gives the most accurate predictions.

1) Algorithm Characteristics Appraisal

The suggested algorithms were assessed using Python. FPR and accuracy are used to evaluate our comparative algorithm trials [51]. These measures, derived from the following basic quantities, are defined below:

  • Accuracy: Accuracy is one criterion used to evaluate classification techniques. True Positive (TP) refers to the number of malicious apps that were correctly classified as malicious, and False Positive (FP) refers to the number of safe applications that were misidentified as malicious. True Negative (TN) counts the benign applications that were correctly identified as benign, and False Negative (FN) denotes the number of malicious apps that were wrongly labelled as normal [52].

  • False Positive Rate: Measures how often the model raises false alarms, i.e., the share of benign apps that it wrongly flags as malicious (FP).

    \begin{align*} {(Accuracy)}_{m,b}=&\frac{{(TP)}_{m,b}+{(TN)}_{m,b}}{All~samples} \tag{1}\\ {(FPR)}_{m,b}=&\frac{{(FP)}_{m,b}}{{(FP)}_{m,b}+{(TN)}_{m,b}} \tag{2}\end{align*}
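The two measures can be computed directly from confusion-matrix counts, as in the sketch below. FPR is computed here with the standard definition, FP / (FP + TN). The counts are hypothetical values chosen for illustration and are not the paper's results.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy = (TP + TN) / all samples."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no samples")
    return (tp + tn) / total


def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (FP + TN): share of benign apps wrongly flagged as malware."""
    if fp + tn == 0:
        raise ValueError("no benign samples")
    return fp / (fp + tn)


# Hypothetical confusion counts (illustration only).
tp, tn, fp, fn = 480, 470, 30, 20
acc = accuracy(tp, tn, fp, fn)
fpr = false_positive_rate(fp, tn)
print(f"accuracy={acc:.3f}, FPR={fpr:.3f}")
```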


Equations (1) and (2) give the accuracy and false positive rate formulas used to calculate the Detection Rate (DR) and precision, where the subscripts (m, b) denote the malicious or benign applications with respect to True Positive (TP), True Negative (TN) and False Positive (FP). To classify the dataset, which contains both benign and malicious applications, our models define a hyperplane that separates the two categories as widely as possible. One class corresponds to malware and the other to benign applications [53]. The test data are then treated as unknown applications, which are classified by projecting them onto the subspace to determine whether they fall on the malicious or the benign side of the hyperplane [54]. The model then compares all predictions with the original labels to assess the proposed model’s malware identification accuracy [55]. Static features yield satisfactory accuracy and precision of more than 90%. More noteworthy is that restricting the analysis of API calls to a single part of the Android platform allows the creation of the most representative feature space, in which malicious and benign applications can be distinguished more easily [56], [57]. If the value of the classification target exceeds the probability estimate, the testing sample is assigned that label [58]. In the general illustrative scenario, objects are either blue or red and the dividing line marks the border: an object on the right side is labelled blue, meaning benign, and likewise an object on the other side is labelled red. This is an example of linear classification, but not all classification tasks are this simple, and more complex structures are needed to separate the groups [59], [60].
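Deciding which side of a hyperplane a sample falls on can be sketched with a plain linear decision function, f(x) = w·x + b. The weights, bias and feature meanings below are hypothetical values picked for the example; in practice they would be learned by the classifier.

```python
def decision_value(weights, bias, features):
    """Signed score w.x + b; its sign tells which side of the hyperplane x lies on."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify(weights, bias, features):
    """Return 'benign' for a non-negative score and 'malicious' otherwise."""
    return "benign" if decision_value(weights, bias, features) >= 0 else "malicious"


# Hypothetical weights for three binary features, e.g.
# [SEND_SMS permission, boot-completed intent, camera API call].
weights = [-1.5, -1.0, 0.25]
bias = 1.0

app_a = [0, 0, 1]  # uses only the camera API
app_b = [1, 1, 0]  # requests SMS permission and listens for boot events
print(classify(weights, bias, app_a))
print(classify(weights, bias, app_b))
```

When the two classes cannot be separated by a single line or plane, a non-linear kernel or an ensemble of learners takes the place of this simple linear rule.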

SECTION III.

Proposed Methodology

The major goal of our research is to determine which criteria are most helpful in detecting malware on mobile phones, particularly those running Android. We train six machine learning algorithms, namely AdaBoost, Support Vector Machine, Decision Tree, KNN, Naive Bayes and Random Forest, and compare how accurately they classify applications. The methodology is divided into two parts: Pre-Processing (explaining the prerequisite processing) and the Proposed Model (model functionalities and components).

A. Pre-Processing

The resulting datasets include APK files from numerous apps (with both malware and benign characteristics). The Jadx-GUI decompiler is then used to reverse engineer the APK files so that features can be extracted from the Android manifest file’s feature set for further processing. These stages are regarded as pre-processing: they precede the actual assessment and are essential before any testing or training with predictive models.

Features are extracted with Androguard, an open-source tool that extracts prioritised features from files and converts them into binary values. To label each Android application, we employ binary labels, i.e., 1 for benign and 0 for malware. Figure 7 shows our technique’s pre-processing framework and flow structure, which must be completed before the classifiers are tested.
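The conversion of extracted features into binary values can be sketched as below. This is an illustration under stated assumptions: the vocabulary is a tiny hypothetical subset, the feature set is assumed to have been extracted from the manifest beforehand, and the label follows the convention given for the prediction program (1 for benign, 0 for malware).

```python
# Hypothetical feature vocabulary; the real feature set is far larger.
VOCABULARY = [
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.VIBRATE",
    "android.intent.action.BOOT_COMPLETED",
]


def to_binary_vector(app_features, vocabulary=VOCABULARY):
    """Map a collection of feature names to a 0/1 vector over the vocabulary."""
    present = set(app_features)
    return [1 if name in present else 0 for name in vocabulary]


# Features assumed to have been extracted from one app beforehand.
extracted = {
    "android.permission.SEND_SMS",
    "android.intent.action.BOOT_COMPLETED",
}
vector = to_binary_vector(extracted)
label = 0  # 1 = benign, 0 = malware
print(vector, label)
```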

FIGURE 7. - Flow analysis of our research.

The operations enclosed in the rectangle must be carried out beforehand to ensure efficient data collection. The decompiler and the extractor play the main role here, improving and easing the model’s data classification for the detection of malware applications. Our study touches on the challenges of multicollinearity and high-dimensional data: the feature extraction section discusses the better output for high-dimensional data, but the issue of collinearity remains open and can be addressed as a novel contribution in future work.

Following the extraction process and the assembly of datasets containing useful features, training and testing are carried out. For our model, a comparative approach is adopted based on Naive Bayes, Decision Tree, K-Neighbors, Gaussian NB, Random Forest, Support Vector Machine and AdaBoost classifiers. The comparative evaluation provides an accurate assessment of which algorithm to use for our model’s predictions. The installation package is a ZIP-compressed bundle of files that includes the manifest file (AndroidManifest.xml) and classes.dex. The manifest file describes an Android application, namely the activities, services, broadcast receivers, and content providers that make up the application. The methodology and the classification are explained earlier in the related work section. The next section describes the model functionality.
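Because the installation package is a ZIP archive, its core entries can be inspected with Python's standard zipfile module. The sketch below builds a small in-memory stand-in for an APK (placeholder bytes, not a real application) so that it runs without any external file. Note that the manifest inside a real APK is stored as binary XML, so reading its contents needs a dedicated parser such as the tools named above.

```python
import io
import zipfile


def list_core_entries(apk_bytes: bytes) -> dict:
    """Report whether the manifest and classes.dex are present in an APK archive."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        names = set(apk.namelist())
    return {
        "AndroidManifest.xml": "AndroidManifest.xml" in names,
        "classes.dex": "classes.dex" in names,
    }


def make_dummy_apk() -> bytes:
    """Build an in-memory stand-in for an APK (placeholder content only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"<binary xml placeholder>")
        apk.writestr("classes.dex", b"dex placeholder")
        apk.writestr("res/layout/main.xml", b"")
    return buffer.getvalue()


report = list_core_entries(make_dummy_apk())
print(report)
```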

B. Proposed Model

The model gathers information from many Android applications (Google Play). These apps are reverse-engineered (decompiled through Jadx-GUI) and then subjected to static analysis to extract features. For the training phase, our approach, shown in Figure 8, uses the extracted characteristics to create a vector mapping parsed through Androguard. The contribution lies in the proposed feature section, which encompasses nearly 56,000 extracted features from the feature set seen in Figure 8. The collected features are then written to a dataset .csv file that encodes benign and malware properties as 1 or 0. Once the datasets are generated, the features are ready for classification by predictive models. We wrote a Python program to compare the classification performance of the machine learning algorithms on the collected dataset, and we then employ the most accurate algorithms to train our models for malware and benign detection. The system’s approach and operation are detailed in Figure 8, which depicts the whole methodology of our model and the algorithm learning phase, including the training model processing for detection. Figure 9 shows the training cycle of the program: the model is first constructed and then evaluated. The data are then cycled into testing and fed to the trained model for further prediction analysis of the Android applications.
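A dataset file of this kind can be sketched with the standard csv module: one row per application, one 0/1 column per feature, and a final label column. The column names and rows below are hypothetical, and an in-memory buffer stands in for the .csv file on disk.

```python
import csv
import io

FEATURES = ["SEND_SMS", "READ_CONTACTS", "BOOT_COMPLETED"]  # hypothetical columns

# Two illustrative apps; label 1 = benign, 0 = malware.
rows = [
    {"SEND_SMS": 1, "READ_CONTACTS": 1, "BOOT_COMPLETED": 1, "label": 0},
    {"SEND_SMS": 0, "READ_CONTACTS": 0, "BOOT_COMPLETED": 0, "label": 1},
]


def write_dataset(rows, features):
    """Serialize binary feature rows plus a label column as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(features) + ["label"])
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def read_dataset(text, features):
    """Parse CSV text back into a feature matrix X and a label vector y."""
    X, y = [], []
    for row in csv.DictReader(io.StringIO(text)):
        X.append([int(row[name]) for name in features])
        y.append(int(row["label"]))
    return X, y


csv_text = write_dataset(rows, FEATURES)
X, y = read_dataset(csv_text, FEATURES)
print(csv_text)
print(X, y)
```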

FIGURE 8. - Proposed methodology of our system.

FIGURE 9. - Training model Processing.

The next section discusses future threats and predictions: insecure Android applications that request unnecessary permissions give an attacker an easy way to steal private data or launch major attacks. The experimental evaluation of our methodology is presented afterwards.

SECTION IV.

Future Threats and Prediction

By 2020, mobile applications were projected to be installed onto consumer devices over 205 billion times. Statistics from Marketing Land suggest that 57 percent of all digital content time is spent on mobile devices. Our daily activities depend on social networking, bank transfers, business operations, and mobile managed services applications. According to Juniper sources, over two billion individuals, almost 40% of the world’s total population, use mobile banking services. These predictions and future threats are based on theoretical data collected through an extensive survey of journals, online forums and research articles.

Developers pay close attention to developing software that gives us a comfortable and seamless experience, and when someone enthusiastically installs a mobile application that requires personal information, the user stops thinking about the security consequences. This is why people do not look closely at the permissions or feature updates that applications request [61]. They simply download the application they want and, when prompted during installation, overlook everything else and start using the app. Most of these applications never even ask for the consumer’s consent, and hackers use the consumer’s information without their knowledge. At the end of 2020 and the beginning of 2021, the threat was rising:

  • 70% of Google Play Store applications require access to one or more “dangerous” permissions and packages, up from 66.6% in Q1 2020, which is a 5 percent rise. 69.4% of applications for children (13 years of age) request at least one risky permission, up from 68.8% in 2020 (a 1 percent rise).

  • Over 2.3 million applications altogether, and over 2.1 million applications for children, need at least one harmful authorization.
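The percentage rises quoted in the list can be checked with a short calculation. The sketch assumes the rises are meant as relative changes between the two quoted shares; the function name is an illustrative choice.

```python
def relative_increase(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100


# Shares of apps requesting at least one dangerous permission, as quoted above.
all_apps = relative_increase(66.6, 70.0)
kids_apps = relative_increase(68.8, 69.4)
print(f"all apps: {all_apps:.1f}% relative rise")
print(f"children's apps: {kids_apps:.1f}% relative rise")
```

Under this reading, the move from 66.6% to 70% is a relative rise of about 5.1%, and the move from 68.8% to 69.4% is about 0.9%, which matches the rounded figures of 5 percent and 1 percent.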

Figure 10 shows the percentages for 2020, proceeding into 2021, for both application categories under the permission criteria: applications for everyone and applications for kids, for the year 2019–2020. Based on these statistics, the predicted rate for the coming years (until 2025) suggests a grave danger from this unnecessary access at each level of the Android API. The graph shows a 5% increase in applications with dangerous malware. This places a heavy burden on application security and indicates that, if nothing is done in time, the number of such applications will grow further in the future.
FIGURE 10. - Graph of application threat increase by 5%.

According to multiple tech reviews published in 2021, research on 2,500 top-ranked and rising applications found that over two portions of the most popular Android applications on Google Play request excessive user permissions and access. These permissions allow apps, among other unwanted behaviors, to launch harmful scripts and access messages unnecessarily through unwanted built-in features [62]. The reviews stated that, with the increasing use of application components and features and the release of new Android frameworks and APIs each year, threats are likely to increase from 5% to 15%. The average Android user has roughly 80 applications installed, so at least one app on the phone is likely to demand additional authorization. Excessive authorizations may jeopardize user data and privacy or even allow device hacks.

Figure 11 shows the increase in dangerous malware up to 2020 with every newer API level. Figure 12 shows the fastest-rising apps from 2016 to 2021 and the percentage of dangerous permissions and packages these applications acquire [63]. These applications are used daily, and if they are involved in unnecessary and third-party access, countermeasures need to be applied to them, as this is going to be a major threat in the future.

FIGURE 11. - Increase in android malware statistics.

FIGURE 12. - Third-party well-known dangerous apps increase from 2013 to 2020.

The figure also shows the need to measure these threats and devise countermeasures, or at least to present models that provide more secure procedures for these well-known applications. These apps provide many opportunities, but as more private and intellectual property is stored in them, certain remedies need to be proposed.

SECTION V.

Experimental Results

This section presents the results of our experimentation. We first describe the basic setup required to carry out our implementation, then briefly discuss the datasets we collected, and finally discuss the actual contribution.

A. Experiment Setup

Our environment is based on Windows 8.1 Pro with an Intel® Core(TM) i5-2450 CPU at 2.50 GHz. The system has 4.00 GB of installed memory (RAM) and a 64-bit Operating System (OS) on an x64-based processor.

For the generated dataset, Androguard 3.3.5 (latest release) is used for decompiling and feature extraction, and the extracted features are stored as binary vectors in structured .csv files. We installed Python 3.8.12 on our system to implement and execute the training and testing scripts for the imported machine learning models.

B. Dataset

Three different datasets are used for our implementation, mainly comprising apps from Google Play. The static features of the first two datasets, which contain API calls, permissions, intents, packages, receivers and services, were collected from MalDroid [64] and DefenseDroid [65], which include around 14,000 malware samples. The model also uses a third dataset, generated from our own set of applications, of around 6000 malware samples and 2421 benign samples. Applications in this dataset were randomly selected from Google Play, and their APKs were then reverse-engineered with the Jadx-GUI tool. The features present in our selected applications are then extracted into binary data using Androguard. All the datasets from the different platforms are combined so that a single training run incorporates more feature sets than state-of-the-art approaches (explained in Table 5), with the aim of achieving higher accuracy in malware classification. The datasets are first used to train every algorithm for a comparative classification analysis. After the accuracy of the algorithms is evaluated, the higher-performing algorithms are trained and tested again on the dataset; features inserted into the database are then fed to them, and our model forecasts the output for a given Android application’s extracted features. Table 6 presents the training and testing ratio of the datasets and the number of columns before and after pre-processing.

TABLE 5 Sample Datasets
TABLE 6 Datasets Ratio (Training & Testing), MalD (MalDroid), DefenseD (DefenseDroid), GD (Generated Dataset), Pre-Pro (Pre-Processing)

The next subsection elaborates the discussion and presentation of the programs for our machine learning algorithms.

C. Machine Learning Algorithm and Ensemble Learning

Six models have been selected for experimentation, including the strong classifiers AdaBoost, SVM and Random Forest. The model runs KNN, NB, RBF, Decision Tree and SVM, and we also apply AdaBoost to the Decision Tree by calculating the weighted error of the Decision Tree over its data points. As the input parameters are not jointly optimized, AdaBoost is less prone to overfitting. AdaBoost can improve the performance of existing weak classifiers: misclassified data points receive higher weights, and once those points are classified correctly, the model’s accuracy improves. Figure 13 shows the functioning of the boosting technique.
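One round of the re-weighting step can be sketched in a few lines. The sketch uses the classic binary AdaBoost update (weighted error, learner weight alpha, exponential re-weighting), which may differ in detail from the library implementation used in the experiments; the labels and predictions are made-up values.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost round: weighted error, learner weight alpha, new sample weights."""
    # Weighted error: total weight of the misclassified points.
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    if not 0 < error < 1:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    alpha = 0.5 * math.log((1 - error) / error)
    # Raise the weight of misclassified points and lower it for correct ones.
    updated = [
        w * math.exp(alpha if t != p else -alpha)
        for w, t, p in zip(weights, y_true, y_pred)
    ]
    total = sum(updated)
    return error, alpha, [w / total for w in updated]


# Five samples with uniform starting weights; the weak learner
# (e.g. a shallow decision tree) misclassifies only the last one.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0]
weights = [0.2] * 5
error, alpha, new_weights = adaboost_round(weights, y_true, y_pred)
print(error, alpha, new_weights)
```

After this round the single misclassified sample carries half of the total weight, so the next weak learner is pushed to classify it correctly.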

FIGURE 13. - Boosting mechanism.

Since there is a distinct boundary between the two categories, ensemble methods and SVM perform well on cleanly structured datasets produced by adequate extraction processes. Another significant benefit of the SVM algorithm is that it can handle high-dimensional data, which is useful in machine learning applications. As seen in the diagram above, AdaBoost’s re-weighting property helps our weak learner (Decision Trees) achieve higher accuracy and handle a wider range of misclassified binary feature inputs.

D. Program Parameters

Our project is based on Python 3.9.7, and we divided the implementation into two programs. The first program compares the accuracy of the respective models, namely AdaBoost, Decision Tree, KNN, SVM, Naive Bayes, and Random Forest, for the comparative analysis. The program uses different import and split functions to train the models and then stores the result in a variable for the testing model. It uses sklearn.model_selection for the split functions, accuracy_score for accuracy readings, pandas to read the database, and NumPy to convert the testing model data into rr format.

In the program, x holds the features and y holds the labels, while the value plotted for each algorithm in Figure 19 is its accuracy percentage. The x and y parameters are split with the train_test_split function with shuffle = True, so each algorithm receives a randomly shuffled selection of values from the dataset. Figures 14 and 15 show the import modules and parameter values set in our program.

FIGURE 14. - Representation of the modules of our program.

FIGURE 15. - Program parameters and split functions.

FIGURE 16. - Fit and pred function for SVM.

FIGURE 17. - Predictive measures for AdaBoost.

FIGURE 18. - Results stored to acc variable and plotted by plt.bar function.

FIGURE 19. - Models accuracy percentage w.r.t label.

First, all the algorithms are imported into the program and fitted on the training data, meaning the machine is trained on the given datasets. Each algorithm takes random binary feature values of apps from the dataset, and its accuracy score is stored in a separate variable. After training, the program passes the testing data to a predictive function. The program is designed to distinguish normal from harmful permission features through the dataset’s binary values (0, 1) and stores those results via the pred() function. The program uses the fit() function, which takes the training data (the x and y parameters) as arguments and fits our two models (AdaBoost and SVM), which are then applied to the testing data. At the end, the scores obtained for each of our algorithms are assigned to the variable acc. When the program is executed, every algorithm accesses the dataset and predicts the dataset values for the Android features. Figures 16 and 17 show the key functions for our models AdaBoost and SVM discussed above.
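The overall flow of such a comparison program can be sketched with scikit-learn as follows. This is not the authors' script: the dataset is synthetic, the labelling rule is a hypothetical stand-in, and the model settings are library defaults. Only the general steps (shuffled split, fit, predict, accuracy stored in acc) follow the description.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary feature matrix standing in for the extracted-feature CSV.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(400, 20))
# Hypothetical rule: malware (0) when at least two of the first three features are set.
y = (X[:, :3].sum(axis=1) < 2).astype(int)  # 1 = benign, 0 = malware

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, shuffle=True, random_state=0
)

models = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "SVM": LinearSVC(),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
}

acc = {}
for name, model in models.items():
    model.fit(x_train, y_train)   # train on the training split
    pred = model.predict(x_test)  # predict labels for the held-out split
    acc[name] = accuracy_score(y_test, pred)

for name, score in sorted(acc.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The scores printed for this toy data say nothing about the real datasets; the point is only the shape of the train, predict and score loop.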

Figure 17 also explains the predictive procedure of the ensemble model, with 1000 malware sample runs and the given features used to train for a single predictive classification output. The same fit() function is used for dataset training. The model applies higher weights to the decision tree algorithm across row values, and the prediction is stored in yhat. Accuracy is then reported as the mean and standard deviation (mean(n_acc_scores), std(n_acc_scores)) for the binary classification output of malware. Figure 18 shows the plotted accuracy values after the data is trained on the models.
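One way to obtain a mean and standard deviation of accuracy is repeated cross-validation, sketched below with scikit-learn. That cross-validation produced the scores is an assumption made for this example, as are the synthetic data and the fold settings; the names n_acc_scores and yhat mirror those mentioned in the text.

```python
from numpy import mean, std
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Synthetic stand-in for the feature dataset (1000 samples).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=0
)

# The default weak learner of AdaBoostClassifier is a depth-1 decision tree.
model = AdaBoostClassifier(n_estimators=50, random_state=0)

# Assumed evaluation scheme: 5 folds repeated twice gives 10 accuracy scores.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)
n_acc_scores = cross_val_score(model, X, y, scoring="accuracy", cv=cv)
print(f"Accuracy: {mean(n_acc_scores):.3f} ({std(n_acc_scores):.3f})")

# Fit on all data and predict the class of a single row.
model.fit(X, y)
row = X[:1]
yhat = model.predict(row)
print(f"Predicted class: {yhat[0]}")
```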

Figure 19 shows the accuracy percentages for our models; the highest is 96.24%, and the graph displays the highest rate of correct predictions among all the algorithms, supporting the validity of the research work. The graph is plotted by training the algorithms on the datasets to verify which algorithm classifies the applications’ features most accurately. Program 1 (the Python script for model accuracy) imports all of the algorithms and executes them one by one on these datasets to train them, producing the most precise values after testing. In the case of AdaBoost, we first trained the Decision Tree on the dataset and then used its classified values for training with higher weights in AdaBoost. AdaBoost takes the samples and features classified by the decision trees and, after training on those features again, generates higher weights for correct results. (x, y) are the values stored by the decision trees, which are given as input to AdaBoost to enhance accuracy; hence it is the model with the highest accuracy in Fig. 19. When all the models have finished training, the script generates a graph using the plt.bar command to display the algorithm that classifies the most applications correctly. Figure 19 and Table 7 show the accuracy and the label value, which reflects the training data each algorithm randomly took to train its model.

TABLE 7 Shows the Label Values for Each Algorithm and Their Accuracy Percentage

SECTION VI.

Model Precision Evaluation

After training the algorithms on the datasets and obtaining their accuracy percentages, we developed a second program that reuses the properties of the previous code to predict the state of an application from the dataset input. For this program, the algorithms with greater prediction capabilities, i.e., AdaBoost and SVM, are imported through sklearn’s linear SVC and the AdaBoost classifier in sklearn.ensemble. The database stores input features in the rr Python module as the feed for the trained models, with 1 designating benign applications and 0 designating malware applications; an app that uses unnecessary features therefore produces the output 0, helping the user understand that it is a malicious app. When the program executes, the algorithms take the input from the database and categorize the features based on what the algorithms were trained on. So, if malware applications are fed as input to the database, the trained model predicts the outcome and labels the state of the application.

Following the import of the trained models, random_state = 0 and a test size of 0.25 are set for the algorithms. The program imports the normalize function from sklearn.preprocessing, which rescales each sample (each row of the data matrix) that has at least one non-zero component independently of the other samples so that it has unit norm. The program also imports from sklearn.feature_extraction.text the functionality that transforms an array of text data into a token count matrix, and finally reports the accuracy score of these algorithms using sklearn.metrics, which implements loss, score, and utility functions to quantify performance in the classification of the feature sets. The parameters for this program are the same as in the previous program, but to fix the features for every algorithm, x holds the features for the trained models and y holds the application labels to be predicted. When the program executes, it works in the same manner but reports the precision value instead of the plotted accuracy percentage of the algorithms, and finally prints the value of the pred() function applied to the model’s testing data. Figures 20 and 21 show SVM and AdaBoost prediction on the features extracted for a single input.
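The token counting and row normalization steps can be sketched with scikit-learn's CountVectorizer and normalize. The feature tokens below are illustrative, and treating each app's feature list as a small text document is an assumption made for the example.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import normalize

# Each "document" lists the feature tokens of one app (illustrative values).
apps = [
    "SEND_SMS READ_CONTACTS BOOT_COMPLETED SEND_SMS",
    "VIBRATE SET_WALLPAPER",
]

vectorizer = CountVectorizer(lowercase=False)
counts = vectorizer.fit_transform(apps)   # sparse token count matrix, one row per app
unit_rows = normalize(counts, norm="l2")  # rescale each row independently to unit L2 norm

print(sorted(vectorizer.vocabulary_))     # token names in column order
print(counts.toarray())
print(np.round(unit_rows.toarray(), 3))
```

Normalizing per row keeps an app with many repeated tokens from dominating purely because of its length.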

FIGURE 20. - Prediction function for SVM for testing data for the database.

FIGURE 21. - Prediction function for AdaBoost for testing data for the database.

The prediction results of the program are discussed next. As the code executes, the models take the features provided in the dataset for a single application. The result displayed in Figure 22 shows a benign application. When permission features are again fed as input, Figure 23 shows a malware application, based on the features the trained models pick out. In the same manner, the database is fed with binary feature values and the model predicts the result as 1 or 0. Figures 16 and 17 elaborate on the predictive function that allows AdaBoost and SVM to predict the status of the applications from the input. Figures 22, 23, 24 and 25 are output screenshots, with 1 indicating benign and 0 indicating harmful applications, for random application features on the respective models.
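Predicting the state of a single application from its binary features can be sketched as follows. The training rows, the four feature columns and the model settings are hypothetical and far smaller than the real datasets; only the output convention (1 for benign, 0 for malware) follows the text.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import LinearSVC

# Toy training data with hypothetical binary feature columns:
# [SEND_SMS, READ_CONTACTS, BOOT_COMPLETED, VIBRATE]
X_train = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
])
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 1 = benign, 0 = malware

models = {
    "SVM": LinearSVC().fit(X_train, y_train),
    "AdaBoost": AdaBoostClassifier(n_estimators=10, random_state=0).fit(X_train, y_train),
}

# One feature row per application to be checked.
benign_app = np.array([[0, 0, 0, 1]])
malware_app = np.array([[1, 1, 1, 0]])
for name, model in models.items():
    print(name, model.predict(benign_app), model.predict(malware_app))
```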

FIGURE 22. - Output [1] representing the benign application (SVM).

FIGURE 23. - Output [0] representing the malware application (SVM).

FIGURE 24. - Output [1] representing the benign application (AdaBoost).

FIGURE 25. - Output [0] representing the malware application (AdaBoost).

A. Results

Our models’ predictions show accuracies of 96% and 92% for our two best-performing systems. The proposed model does not achieve the highest accuracy or predictive rate, but it contributes enhanced and larger feature sets (containing around 56,000 newly extracted features) and datasets of applications at the latest API levels, collected in recent years, compared with state-of-the-art approaches. Another reason for the lower predictive rate is the limitation of our resources and environment for processing and generating these datasets with our models. The novelty and contributions are explained in Tables 1 and 2.

Figures 26, 27, 28 and 29 show the runs performed on the datasets with our trained models. The applications in orange are non-harmful apps; those that cross the line merely pass sensitive features, which does not pose much of a threat to the application, but it still shows the model issue for indicating true negatives for zero apps. The applications in black are harmful applications. The false positives in this category, i.e., those falling among the non-harmful apps, amount to about 3–4 applications for AdaBoost and 6–7 for SVM per 1000 runs, as shown in the figures, with accuracies of 96% and 92% for AdaBoost and SVM, respectively.

FIGURE 26. - Orange entries for non-harmful applications in AdaBoost.

FIGURE 27. - Black entries for harmful applications in AdaBoost.

FIGURE 28. - Orange entries for non-harmful applications in SVM.

FIGURE 29. - Black entries for Harmful applications in SVM.

All four figures are plotted against a hyperplane that divides the application classifications into two sections, i.e., harmful and non-harmful applications. The region above the line represents harmful apps (black and red), and applications lying below the line are non-harmful. The plotted hyperplanes help in understanding the predictions: Figs. 27 and 29 show successful classifications above the line and 3–4 apps below the line, indicating misclassifications. The same applies to non-harmful apps in orange (Figs. 26 and 28), where points above the line are misclassifications, but they do not pose serious threats.

What follows is a comparative review of our models on both malicious and benign applications, together with the experimental results for cumulative accuracy and FPR. The purpose of plotting a comparative graph of malware detection is to understand the relative behaviour of both parameters. Figure 30 presents a comparative analysis of both models in terms of malicious and benign applications. Red triangles represent the classifications and detections of AdaBoost, and squares represent those of SVM. The graph shows the malware section for the runs performed, and the values above the hyperplane fall in the non-harmful category. The misclassification rates of 0.7 for SVM and 0.3 for AdaBoost are plotted, with malware applications falling into the true positive category.

FIGURE 30. - Comparative analysis of malicious and benign applications in AdaBoost and SVM.

Nevertheless, the models reach 96.24% accuracy in predicting the application categories.

We use accuracy and FPR as evaluation metrics in this project. Precision is computed as the percentage of truly harmful samples among those tagged as malware by the detection system, and it shows the system's capacity to discriminate malware properly. False Positive Rate (FPR) is the criterion for judging how often the model raises a false indication, i.e., flags an application as malware when it is not. The quantitative experimental results are presented in Table 8, which reports accuracy, false positive rate and the related predictive measures after testing on binary input for 1000 runs of our two best-performing models, trained and tested on mixed datasets containing features and malware samples. The operational speed advantage of AdaBoost is not apparent on the datasets used here for classification and prediction. However, given AdaBoost's structural features with parallel learning, we anticipate that it will perform better on bigger datasets. We reached the same conclusion after analyzing a much bigger dataset with over 500,000 apps.
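
As a minimal sketch of the metrics named above, assuming the standard confusion-matrix definitions (the paper's own equation is given in Section IV and is not reproduced here), accuracy, precision and FPR can be computed from the counts of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). The counts in the example are hypothetical and are not the paper's results.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all applications that were classified correctly."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return (tp + tn) / total


def precision(tp: int, fp: int) -> float:
    """Share of apps tagged as malware that are truly harmful."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of benign apps that were wrongly tagged as malware."""
    benign = fp + tn
    return fp / benign if benign else 0.0


# Hypothetical counts for 1000 test applications (illustration only).
tp, fp, tn, fn = 480, 3, 497, 20
demo_accuracy = accuracy(tp, fp, tn, fn)      # (480 + 497) / 1000
demo_precision = precision(tp, fp)            # 480 / 483
demo_fpr = false_positive_rate(fp, tn)        # 3 / 500
```

Keeping the three functions separate makes it explicit which denominator each metric uses, which is where accuracy, precision and FPR are most often confused.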

TABLE 8 Experimental Results (AdaBoost and SVM). Selected = features selected in the model, MalD = MalDroid, DefenseD = DefenseDroid, GD = Generated Dataset, FPR = False Positive Rate, Acc = Accuracy

In Table 8, both models are compared after training on the datasets; the table specifies the accuracy, the FPR, and the features used and selected for the corresponding samples. The FPR is also presented in Figures 26 to 28 above, where it is shown relative to a hyperplane. Accuracy and false positives were measured with the equation described in Section IV (algorithm characteristics) over the number of runs of the model. The results show 96.24% as the highest accuracy for the model after experimentation and a false positive rate of 0.3% for the ensemble approach.

The related works establish the originality of our model and highlight its novel features and sample size. In conclusion, our model still falls a few percentage points short in terms of accurate detection. To explain this, Table 9 presents some properties of similar studies with higher performance rates, indicating the elements that put the efficiency of our system in context.

TABLE 9 Relative Resources (Pro = Processing, Acc = Accuracy, FPR = False Positive Rate)

The model in [29] has exceptional computational and processing power, with a much stronger environment for testing and training on its datasets. The model in [24] has somewhat similar resources with higher processing power, but its sample size is very limited in comparison to ours. A few other studies describe similar technical advantages, which left us to work under more restrictive conditions. Table 9 presents some key properties of these comparable systems.

SECTION VII.

Research Issues and Challenges

This section highlights the prevalent and crucial issues in our experiment. These hurdles arise at various stages of our work and may be gradually rectified in future work.

  1. Features declared mostly on the device are more durable than features specific to the applications and can therefore usually automate malware detection. The range of Android parameters to process is rather large, and malware is difficult to detect properly if the features are not extracted properly.

  2. The number of apps is still increasing fast. Malware apps can potentially still be identified by combining detection with methods based on AI or machine learning, such as deep learning, which makes detection more sophisticated and makes it easier to identify apps and regulate the prediction rate.

  3. Application behaviours in the malware ecosystem encourage non-emerging threats. Our study does not incorporate rider analysis or the behaviour of repackaged malware. It simply takes the reverse-engineered APK files, passes the given context to AndroGuard, and extracts the features as binary vectors. This is nevertheless a major issue and a key challenge given the advancement of Android malware. It will be the subject of our follow-up project, which will perform differential or effective analysis on reverse-engineered applications to determine the effects of these applications and their results.

  4. Over time, applications introduce new features with enhanced malware abilities, which is why we would have to upgrade the system whenever the model's FPR increases after execution. The simplest explanation of why the model degrades on evolved features is that our datasets are binary matrices extracted from features currently implemented in these applications, not from features that will be present in evolved apps in the coming years. With new features, we would have to reverse-engineer the apps and extract those features to form an updated dataset and retrain the classifiers. [66], [67], [68] and [69] discuss this key issue and propose possible solutions, but for our model and with the resources available we have only worked with current features. In future work, we will consider model sustainability and how to classify malware so that our system can detect it even if its features are not yet implemented.

  5. The introduction mentions the problem of multicollinearity, i.e., the rise of dependent variables among machine learning algorithms, which complicates the interpretation of results. This can be taken up as future work, testing several models that handle multicollinearity, because our model already performs processing-intensive detection schemes to reach its accuracy on malware in Android application features. We will address this issue and incorporate it to produce an efficient solution to the problem. The authors in [70], [71], [72] propose some solutions to tackle this challenge, which can help answer readers' queries.
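
The multicollinearity concern in the last item can be illustrated with a small check for strongly correlated columns in a binary feature matrix. This is only a sketch under assumptions: it uses the Pearson correlation between pairs of binary columns, the 0.9 threshold is an arbitrary illustrative value, and the permission names and values are made up for the example. It is not the method of this paper or of [70], [71], [72].

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[int], y: List[int]) -> float:
    """Pearson correlation of two equally long columns (0.0 if either is constant)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("columns must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / sqrt(var_x * var_y)


def correlated_pairs(
    columns: Dict[str, List[int]], threshold: float = 0.9
) -> List[Tuple[str, str, float]]:
    """Return feature pairs whose absolute correlation reaches the threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical binary feature columns, one entry per application.
demo_columns = {
    "SEND_SMS": [1, 1, 0, 0, 1, 0],
    "READ_SMS": [1, 1, 0, 0, 1, 0],
    "INTERNET": [1, 0, 1, 1, 1, 0],
}
flagged = correlated_pairs(demo_columns)
```

Pairs flagged this way are candidates for merging or dropping before training, since two near-identical columns give a classifier redundant information.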

A. Limitations

The technique in this paper is based on binary classification of lightweight code of static feature sets present in the Android manifest file. The three major limitations of our method are:

  1. The research does not include dynamic or runtime application features. In the future, we will consider the dynamic aspects of Android applications, including real-time permissions, API requests and any further features that can be extracted. We will evaluate the behavioural traits of an app using a mixture of dynamic and static analysis to discover harmful tendencies.

  2. Our system lacks long-term sustainability, meaning that it will need to be upgraded for forthcoming API levels and malware collections, or for new features present in Android applications.

  3. The constraint of a slow, low-powered processing environment is another reason for the lower accuracy and predictive measures of our model in comparison to a few other top detection techniques that achieve higher accuracy.
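
The sustainability limitation, together with the earlier point that the system must be upgraded when the FPR rises, suggests a simple retraining trigger. The sketch below is an assumption about how such a trigger could look, not something the paper implements: the function name `needs_retraining`, the relative tolerance of 50% and the sample FPR values are all hypothetical.

```python
from typing import List


def needs_retraining(
    baseline_fpr: float, recent_fprs: List[float], tolerance: float = 0.5
) -> bool:
    """Flag retraining when the mean recent FPR exceeds the baseline
    by more than the given relative tolerance (0.5 means 50% higher)."""
    if not recent_fprs:
        return False  # nothing observed yet, so no evidence of degradation
    mean_recent = sum(recent_fprs) / len(recent_fprs)
    return mean_recent > baseline_fpr * (1.0 + tolerance)


# Hypothetical FPR values expressed as fractions (0.003 corresponds to 0.3%).
baseline = 0.003
stable_window = [0.003, 0.004, 0.003]
drifting_window = [0.005, 0.006, 0.007]

stable_flag = needs_retraining(baseline, stable_window)
drifting_flag = needs_retraining(baseline, drifting_window)
```

A trigger of this kind only signals that the feature set may be outdated; rebuilding the dataset from newly extracted features would still be a separate step.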

SECTION VIII.

Conclusion

In this research, we devised a framework that can detect malicious Android applications. The proposed technique takes into account various elements of machine learning and achieves 96.24% accuracy in identifying malicious Android applications. We first define and pick functions to capture and analyze the behavior of Android apps, leveraging reverse engineering of applications and AndroGuard to extract features into binary vectors, and then use Python build modules and split shuffle functions to train the model on benign and malicious datasets. Our experimental findings show that the suggested model has a false positive rate of 0.3 with 96% accuracy in the given environment, with enhanced and larger feature and sample sets. The study also found that ensemble and strong learner algorithms perform comparatively better when dealing with classification and high-dimensional data. The suggested approach is restricted to static analysis, does not address sustainability concerns, and fails to address a key multicollinearity barrier. In the future, we will consider model resilience in terms of enhanced and dynamic features. The issue of dependent variables, or high intercorrelation between machine learning algorithms before employing them, is also a promising field.
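
The two data-preparation steps summarized above, encoding extracted features as binary vectors and shuffling and splitting the samples before training, can be sketched in plain Python. This is an illustration under assumptions: the extraction from APK files with AndroGuard is not shown, the feature sets are taken as already extracted, and the helper names, vocabulary, sample apps and the 20% test fraction are hypothetical.

```python
import random
from typing import List, Sequence, Set, Tuple


def to_binary_vector(app_features: Set[str], vocabulary: Sequence[str]) -> List[int]:
    """Encode one app as 1/0 flags, one flag per vocabulary entry."""
    return [1 if feature in app_features else 0 for feature in vocabulary]


def shuffle_split(
    rows: List[List[int]], labels: List[int], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[List[int]], List[int], List[List[int]], List[int]]:
    """Shuffle the samples and split them into train and test parts.

    Returns (train_rows, train_labels, test_rows, test_labels).
    """
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = int(round(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [rows[i] for i in train_idx], [labels[i] for i in train_idx],
        [rows[i] for i in test_idx], [labels[i] for i in test_idx],
    )


# Hypothetical vocabulary and already-extracted feature sets (1 = malware, 0 = benign).
vocabulary = [
    "android.permission.SEND_SMS",
    "android.permission.INTERNET",
    "android.intent.action.BOOT_COMPLETED",
]
apps = [
    ({"android.permission.SEND_SMS", "android.intent.action.BOOT_COMPLETED"}, 1),
    ({"android.permission.INTERNET"}, 0),
    ({"android.permission.SEND_SMS", "android.permission.INTERNET"}, 1),
    (set(), 0),
    ({"android.permission.INTERNET", "android.intent.action.BOOT_COMPLETED"}, 0),
]
rows = [to_binary_vector(features, vocabulary) for features, _ in apps]
labels = [label for _, label in apps]
train_rows, train_labels, test_rows, test_labels = shuffle_split(rows, labels)
```

The resulting train and test lists can then be passed to whichever classifier is being evaluated; fixing the seed keeps the split identical across repeated runs.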

Cites in Papers - |

Cites in Papers - IEEE (21)

Select All
1.
Vivek Menon U, Vinoth Babu Kumaravelu, Vinoth Kumar C, Rammohan A, Sunil Chinnadurai, Rajeshkumar Venkatesan, Han Hai, Poongundran Selvaprabhu, "AI-Powered IoT: A Survey on Integrating Artificial Intelligence With IoT for Enhanced Security, Efficiency, and Smart Applications", IEEE Access, vol.13, pp.50296-50339, 2025.
2.
A.Padmavathi, K.Yasaswy, K.Jayesh Rahul, "Detecting Malicious Threats On Mobile Applications Using Machine Learning Approaches", 2024 International Conference on Communication, Control, and Intelligent Systems (CCIS), pp.1-6, 2024.
3.
Nathanael Berliano Novanka Putra, Jonathan Sebastian Marbun, Rheva Anindya Wijayanti, Dzakwan Al Dzaky Bewasana, Nurul Qomariasih, "Counter Attack Malware Application Using Automatic Reverse Engineering Web Application", 2024 IEEE Asia Pacific Conference on Wireless and Mobile (APWiMob), pp.109-114, 2024.
4.
Priya Matta, Atika Gupta, Sopan Talekar, Shashank Vyas, Priyanka Rastogi, Upma Jain, "SeLCM: An Efficient and Robust Malware Detection Model", 2024 2nd International Conference on Advances in Computation, Communication and Information Technology (ICAICCIT), vol.1, pp.1326-1331, 2024.
5.
Prashant Bhooshan, Shiva Darshan S. L, Nidhi Sonkar, "Comprehensive Android Malware Detection: Leveraging Machine Learning and Sandboxing Techniques Through Static and Dynamic Analysis", 2024 IEEE 21st International Conference on Mobile Ad-Hoc and Smart Systems (MASS), pp.580-585, 2024.
6.
Balachandra Chikkoppa, Hanumanthappa J, Vijeeta Pati, Shridhar Allagi, Liset S. Rodriguez-Baca, Carlos F. Cruzado, "A Comparative Study of Malware Detection in Enterprise Networks", 2024 2nd World Conference on Communication & Computing (WCONF), pp.1-5, 2024.
7.
S. Kalaiselvi, S. Poorani, G Shakthi Sri, M Rathnaa, M Sowmiya Sree, "Hybrid Machine Learning Approach for Malware Analysis", 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), pp.1-6, 2024.
8.
Shailaja Uke, Gayatri Gite, Haider Hirkani, Inderdeep Bassan, Isha Raghvani, "Malware Detection and Classification for URLs using Ensemble Learning", 2024 4th International Conference on Pervasive Computing and Social Networking (ICPCSN), pp.248-263, 2024.
9.
Thai Vu Nguyen, Duc N. M. Hoang, Long Bao Le, "Multi-Head Attention Based Malware Detection with Byte-Level Representation", 2024 IEEE Wireless Communications and Networking Conference (WCNC), pp.1-6, 2024.
10.
Hayat Hussain Reshi, Karan Singh, "Enhancing Malware Detection using Deep Learning Approach", 2024 International Conference on Automation and Computation (AUTOCOM), pp.497-501, 2024.
11.
Pawan Kumar, Sukhdip Singh, "WoS Bibliometric-based Review for Security Testing of Android Applications using Malware Analysis", 2024 5th International Conference on Innovative Trends in Information Technology (ICITIIT), pp.1-6, 2024.
12.
M SujayKumar Reddy, Phanitha Sri Thota, "Reverse Engineering of Android Malware Classification Using Semi-Supervised Learning", 2024 10th International Conference on Advanced Computing and Communication Systems (ICACCS), vol.1, pp.995-1000, 2024.
13.
Rahmat Junaidi, Teddy Mantoro, Media Anugerah Ayu, Umar Aditiawarman, "Analysis of Malware Inserted in APK Files in the Case of “Undangan Nikah.apk” Using Reverse Engineering", 2023 International Conference on Technology, Engineering, and Computing Applications (ICTECA), pp.1-5, 2023.
14.
Muna Muhammad, Ahthasham Sajid, Gunjan Chhabra, Hamed Taherdoost, Inam Ullah Khan, Keshav Kaushik, "Security & Privacy Issues in IoT Using Blockchain and ML", 2023 Seventh International Conference on Image Information Processing (ICIIP), pp.733-740, 2023.
15.
Innocent Barnet Mijoya, Shiraz Khurana, Nishant Gupta, Keshav Gupta, "Malware Detection in Mobile Devices Using Hard Voting Ensemble Technique", 2023 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS), pp.116-121, 2023.
16.
Almaha Almuqren, Mounir Frikha, Abdullah Albuali, "Automated Malware Detection Based on a Machine Learning Algorithm", 2023 IEEE Tenth International Conference on Communications and Networking (ComNet), pp.1-12, 2023.
17.
Vishesh Tanwar, K. R. Ramkumar, "A Survey on the Role of Reverse Engineering in Security Attacks", 2023 International Conference on Research Methodologies in Knowledge Management, Artificial Intelligence and Telecommunication Engineering (RMKMATE), pp.1-6, 2023.
18.
Chhaya Negi, Amit Kumar Mishra, Anshika Verma, Zoya Ganguli, Siddhant Thapliyal, Mohammad Wazid, D. P. Singh, "A Robust Approach for Malware Attacks Detection in the Internet of Things Communications", 2023 World Conference on Communication & Computing (WCONF), pp.1-6, 2023.
19.
Nor Zakiah Gorment, Ali Selamat, Lim Kok Cheng, Ondrej Krejcar, "Machine Learning Algorithm for Malware Detection: Taxonomy, Current Challenges, and Future Directions", IEEE Access, vol.11, pp.141045-141089, 2023.
20.
Reem Alrawili, Michael Oliva, Amber Honnef, Emily Sawall, Ali Abdullah S. AlQahtani, "Malware and Average Individual", 2022 IEEE Asia Pacific Conference on Wireless and Mobile (APWiMob), pp.1-6, 2022.
21.
Tom Meurs, Marianne Junger, Erik Tews, Abhishta Abhishta, "Ransomware: How attacker’s effort, victim characteristics and context influence ransom requested, payment and financial loss", 2022 APWG Symposium on Electronic Crime Research (eCrime), pp.1-13, 2022.

Cites in Papers - Other Publishers (16)

1.
Sandeep Kumar Davuluri, Mukesh Soni, Ghayth ALMahadin, Richard Rivera, Jinal Upadhyay, Pavan Patel, "Mining Intelligence Hierarchical Feature for Malware Detection", Intelligent Computing and Networking, vol.1172, pp.221, 2025.
2.
Yash Sharma, Anshul Arora, "A comprehensive review on permissions-based Android malware detection", International Journal of Information Security, 2024.
3.
Sadananda Lingayya, Praveen Kulkarni, Rohan Don Salins, Shruthi Uppoor, V. R. Gurudas, "Detection and analysis of android malwares using hybrid dual Path bi-LSTM Kepler dynamic graph convolutional network", International Journal of Machine Learning and Cybernetics, 2024.
4.
Ehtesham Hashmi, Muhammad Mudassar Yamin, Sule Yildirim Yayilgan, "Securing tomorrow: a comprehensive survey on the synergy of Artificial Intelligence and information security", AI and Ethics, 2024.
5.
Noah Oghenefego Ogwara, Krassie Petrova, Mee Loong Yang, Stephen G. MacDonell, "A Risk Assessment Framework for Mobile Apps in Mobile Cloud Computing Environments", Future Internet, vol.16, no.8, pp.271, 2024.
6.
Saygın Diler, Yıldırım Demir, "Çoklu Doğrusal Bağlantı Olması Durumunda Veri Madenciliği Algoritmaları Performanslarının Karşılaştırılması", Nicel Bilimler Dergisi, vol.6, no.1, pp.40, 2024.
7.
Ali Raza, Zahid\\xa0Hussain Qaisar, Naeem Aslam, Muhammad Faheem, Muhammad\\xa0Waqar Ashraf, Muhammad\\xa0Naman Chaudhry, "TL‐GNN: Android Malware Detection Using Transfer Learning", Applied AI Letters, 2024.
8.
Lubna Javaid Haji, Sudesh Kumar, "Feature Selection-Based Machine Learning Model for Malware Detection", Proceedings of the International Conference on Machine Learning, Deep Learning and Computational Intelligence for Wireless Communication, pp.509, 2024.
9.
Amardeep Singh, Hamad Ali Abosaq, Saad Arif, Zohaib Mushtaq, Muhammad Irfan, Ghulam Abbas, Arshad Ali, Alanoud Al Mazroa, "Securing Cloud-Encrypted Data: Detecting Ransomware-as-a-Service (RaaS) Attacks through Deep Learning Ensemble", Computers, Materials & Continua, vol.79, no.1, pp.857, 2024.
10.
Rahul Gupta, Kapil Sharma, Ramesh Kumar Garg, "Innovative Approach to Android Malware Detection: Prioritizing Critical Features Using Rough Set Theory", Electronics, vol.13, no.3, pp.482, 2024.
11.
Muhammad Aamir, Muhammad Waseem Iqbal, Mariam Nosheen, M. Usman Ashraf, Ahmad Shaf, Khalid Ali Almarhabi, Ahmed Mohammed Alghamdi, Adel A. Bahaddad, "AMDDLmodel: Android smartphones malware detection using deep learning model", PLOS ONE, vol.19, no.1, pp.e0296722, 2024.
12.
Sangeeta Rani, Khushboo Tripathi, Ajay Kumar, "Machine learning aided malware detection for secure and smart manufacturing: a comprehensive analysis of the state of the art", International Journal on Interactive Design and Manufacturing (IJIDeM), 2023.
13.
Anuradha Dahiya, Sukhdip Singh, Gulshan Shrivastava, "Android malware analysis and detection: A systematic review", Expert Systems, 2023.
14.
Amir Djenna, Ahmed Bouridane, Saddaf Rubab, Ibrahim Moussa Marou, "Artificial Intelligence-Based Malware Detection, Analysis, and Mitigation", Symmetry, vol.15, no.3, pp.677, 2023.
15.
Elliot Mbunge, Benhildah Muchemwa, John Batani, Nobuhle Mbuyisa, "A review of deep learning models to detect malware in Android applications", Cyber Security and Applications, vol.1, pp.100014, 2023.
16.
Ahmed Sabbah, Adel Taweel, Samer Zein, "Android Malware Detection: A Literature Review", Ubiquitous Security, vol.1768, pp.263, 2023.
