Integrating Machine Learning Algorithms with Quantum Annealing Solvers for Online Fraud Detection

Fraudulent transaction identification is crucial in today's digital world, and machine learning has proven effective in addressing this challenge. However, existing systems often detect fraudulent activities only after they occur and so lack real-time efficacy. Additionally, the highly imbalanced nature of fraud data complicates traditional machine learning approaches. To overcome these limitations, we propose a novel fraud detection framework using Quantum-Enhanced Support Vector Machines (Q-SVM). Leveraging quantum annealing solvers, the Q-SVM exhibits remarkable improvements in both speed and accuracy when applied to a highly imbalanced bank loan dataset, while maintaining competitive performance on a moderately imbalanced Israel credit card transaction dataset. We evaluate detection performance by implementing twelve machine learning methods and observe that feature selection significantly enhances detection speed with a marginal accuracy improvement. Our findings highlight the potential of Q-SVM for highly imbalanced data, while affirming the viability of conventional machine learning approaches for non-time-series data. These insights help in choosing suitable detection approaches, considering the trade-offs between speed, accuracy, and cost. Our study highlights the promising role of Quantum Machine Learning (QML) in fraud detection, fostering future research in quantum computing applications.


I. INTRODUCTION
Fraudulent transactions pose a major financial threat to businesses worldwide, causing significant monetary losses and reputational damage. In the US, businesses suffer an average annual loss of $4 billion, while insurance companies in the UK face £1.6 billion in fraudulent claims [1]. Beyond the financial impact, businesses also suffer from missed sales opportunities and reputational risks due to fraudulent activities. The proliferation of mobile technologies has led to a surge in online transactions, with a 110% increase in e-commerce transactions in the US alone during early 2020 [2]. This surge has been accompanied by a corresponding rise in web attacks and associated fraudulent activities, presenting a substantial challenge for effective fraud detection. Existing fraud detection systems often fall short of real-time or near-real-time detection, identifying fraud only after the damage has occurred [3]. This limitation undermines their ability to actually prevent losses. In addition, the skewed nature of transaction datasets, where genuine transactions far outnumber fraudulent ones, further complicates the detection process. To address the need for advanced fraud detection solutions, this study proposes a novel approach that integrates quantum annealing solvers with machine learning algorithms to enable high-quality, real-time or near-real-time fraud detection. By using quantum annealing, this research aims to overcome the limitations of classical computing and improve the efficiency of fraud detection algorithms.
Vol 12 Issue 08, Aug 2023 Page 460
The present study focuses on the second and third challenges: the timely detection of fraudulent activities and the handling of imbalanced datasets. To this end, the research explores autoregressive models for analysing the non-stationary time series data generated by online transactions, as in [4]. Both traditional linear autoregressive models and deep autoregressive networks with a quadratic formulation are considered. Additionally, transformation techniques such as the power transform, square root, and log transform are investigated for converting non-stationary data into stationary form, ensuring the effectiveness of data modelling and forecasting. In summary, the integration of quantum annealing with machine learning techniques presents a promising avenue for businesses to strengthen their fraud detection capabilities in the dynamic landscape of online transactions. By providing real-time insights and the ability to distinguish accurately between normal and fraudulent transactions, this approach offers potential solutions to combat the rising tide of fraudulent activities, safeguarding businesses' financial interests and customer trust.
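The transformations named above can be illustrated with a short sketch. The function names, the Box-Cox form of the power transform, the `lam` parameter, and the example series are all assumptions made for illustration; the differencing step is an addition that is commonly paired with these transforms and is not stated in the text.

```python
import math

def log_transform(series):
    """Natural-log transform; requires strictly positive values."""
    return [math.log(x) for x in series]

def sqrt_transform(series):
    """Square-root transform; requires non-negative values."""
    return [math.sqrt(x) for x in series]

def power_transform(series, lam):
    """Box-Cox style power transform (lam is an assumed tuning parameter)."""
    if lam == 0:
        return log_transform(series)
    return [(x ** lam - 1.0) / lam for x in series]

def difference(series):
    """First-order differencing: removes a trend left after the transform."""
    return [b - a for a, b in zip(series, series[1:])]

# Hypothetical transaction volumes that double at every step (non-stationary).
volumes = [100.0, 200.0, 400.0, 800.0, 1600.0]

# After a log transform the growth becomes a constant step,
# so the differenced series is flat.
log_diff = difference(log_transform(volumes))
print(log_diff)
```

In this toy series the log transform turns multiplicative growth into a constant increment, which is the sense in which such transforms help move data toward stationarity.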

A. Research Background
Online fraud detection is a critical challenge in the rapidly evolving digital landscape. Traditional fraud detection methods often struggle to keep up with sophisticated fraudsters. As a result, there is a growing interest in leveraging cutting-edge technologies to improve fraud detection systems.
Machine learning algorithms, particularly those based on deep learning and ensemble techniques, have shown promising results in various domains [5,6]. They excel at identifying complex patterns and anomalies in large datasets, making them strong candidates for fraud detection. However, these algorithms may still encounter limitations when dealing with high-dimensional data or combinatorial optimization problems. Quantum annealing [7] solvers, such as D-Wave's quantum computers [8], offer a novel approach to complex optimization tasks. They leverage quantum phenomena to explore a large solution space and find optimal or near-optimal solutions to hard optimization problems [9]. The integration of machine learning algorithms with quantum annealing solvers presents a compelling opportunity for enhancing online fraud detection, as demonstrated by many prior works [10][11][12][13][14][15][16]. By combining the strengths of both, this hybrid approach can potentially achieve better accuracy and efficiency in detecting fraudulent activities in real time. The machine learning algorithms can pre-process the data and extract relevant features, reducing the problem's dimensionality, and then feed the transformed data into the quantum annealer for optimization. This integration could lead to more robust fraud detection systems capable of dealing with the growing sophistication of online fraudsters. However, challenges such as limited qubit connectivity and noise in quantum annealing systems must be carefully addressed during the integration process.
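To make the kind of problem handed to an annealer more concrete, the sketch below poses a tiny quadratic unconstrained binary optimization (QUBO) problem, the input form accepted by D-Wave annealers, and solves it by exhaustive search. The exhaustive search is only a classical stand-in for the annealer, and the coefficients and their reading as a feature-selection problem are hypothetical, not taken from this study.

```python
from itertools import product

def qubo_energy(x, Q):
    """Energy of binary assignment x under QUBO coefficients Q[(i, j)]."""
    return sum(coeff * x[i] * x[j] for (i, j), coeff in Q.items())

def brute_force_minimum(Q, n_vars):
    """Search all 2**n_vars assignments (classical stand-in for an annealer)."""
    best_x, best_energy = None, float("inf")
    for x in product((0, 1), repeat=n_vars):
        energy = qubo_energy(x, Q)
        if energy < best_energy:
            best_x, best_energy = x, energy
    return best_x, best_energy

# Hypothetical coefficients: each diagonal entry rewards keeping a feature,
# and the off-diagonal entry penalises keeping features 0 and 1 together
# because they are assumed to be redundant.
Q = {(0, 0): -1.5, (1, 1): -1.0, (2, 2): -1.0, (0, 1): 2.0}

best_x, best_energy = brute_force_minimum(Q, n_vars=3)
print(best_x, best_energy)  # (1, 0, 1) -2.5
```

Exhaustive search scales as 2^n and is shown only because it makes the objective explicit; an annealer is intended for instances where that enumeration is impractical.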

B. Importance of credit dataset
The credit dataset is central to integrating machine learning algorithms with quantum annealing solvers for online fraud detection. Online fraud has become a prevalent issue in the digital era, with fraudsters constantly evolving their tactics to exploit vulnerabilities in payment systems. Machine learning algorithms are effective tools for fraud detection because they can analyse vast amounts of transactional data and identify suspicious patterns in real time. However, as fraudsters become more sophisticated, conventional machine learning algorithms may struggle to keep up with the rapidly changing landscape. This is where quantum annealing solvers can offer advantages. Quantum annealing leverages quantum mechanics to solve optimization problems, which makes it well suited to the combinatorial optimization tasks that arise in fraud detection. By integrating machine learning algorithms with quantum annealing solvers, businesses can achieve enhanced fraud detection capabilities that adapt quickly to new fraud patterns. The credit dataset plays a crucial role in this integration, as it serves as the foundation for training and validating the machine learning models. The dataset provides a diverse range of credit transaction samples, reflecting both genuine and fraudulent activities. This diversity helps the models remain robust and generalize across fraud scenarios. The credit dataset is therefore the linchpin for combining machine learning algorithms with quantum annealing solvers into a fraud detection system that can respond to the evolving online fraud landscape.

SMOTE, or Synthetic Minority Over-sampling Technique, is a widely used data augmentation approach in machine learning designed to tackle class imbalance within datasets. In many real-world scenarios, one class, referred to as the minority class, may be severely underrepresented compared to the other classes, known as majority classes. This imbalance can lead to biased models that perform poorly when predicting the minority class. To address this problem, SMOTE generates synthetic samples for the minority class by creating new instances interpolated between existing data points of that class. The method follows these steps: first, it selects a minority instance and identifies its k nearest neighbours within the minority class. Next, new samples are generated along the line segments that connect the chosen instance and its neighbours in the feature space. In this way, SMOTE expands the representation of the minority class, providing more balanced data for model training. The user controls the oversampling ratio, which determines the number of synthetic samples generated for the minority class, so the augmentation can be tuned to the needs of the classification task. By applying SMOTE, machine learning models can achieve better performance on imbalanced datasets, reducing bias and enhancing predictive accuracy for all classes, including the minority class, as in [17].
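The interpolation steps described above can be sketched as follows. This is a minimal illustration written from the description, not a reference implementation; the function name `smote`, its parameters, and the sample points are assumptions.

```python
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Generate n_synthetic points by interpolating between minority samples.

    minority: list of feature tuples from the minority class.
    k: number of nearest minority neighbours considered for each base point.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        # Step 1: pick a minority instance.
        base = rng.choice(minority)
        # Step 2: find its k nearest minority neighbours (squared Euclidean).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        # Step 3: place a new point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

# Hypothetical two-feature fraud samples (the minority class).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_synthetic=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority points, it always lies inside the region already occupied by the minority class. Oversampling of this kind is normally applied to the training split only, so that the test set still reflects the original class distribution.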

D. Dataset Preparation
A credit card dataset like [18] typically contains information about credit card transactions made by customers, including attributes such as transaction amount, merchant category, transaction date, customer demographics, historical transaction data, and whether the transaction was fraudulent or not. The goal of applying machine learning techniques to such a dataset is to build predictive models that can effectively detect fraudulent transactions and improve the overall security and fraud prevention measures of credit card companies.

Data Preprocessing
The first step is to pre-process the credit card dataset, which involves handling missing values, normalizing or scaling the data, and encoding categorical variables. All data must be converted into a numerical format suitable for machine learning algorithms.
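The three preprocessing operations can be sketched in plain Python. The column names, the sample values, and the choice of mean imputation and min-max scaling are assumptions made for illustration; the text does not say which variants were used.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed ones."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Encode a categorical column as 0/1 indicator vectors."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories

# Hypothetical columns from a credit card dataset.
amounts = [20.0, None, 100.0, 60.0]
merchant = ["grocery", "travel", "grocery", "online"]

scaled = min_max_scale(impute_mean(amounts))
encoded, cats = one_hot(merchant)
print(scaled)
print(cats, encoded)
```

In practice the imputation mean and the scaling range would be computed on the training split and then reused on the test split, so that no information from the test data leaks into training.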

Data splitting
Data splitting divides the dataset into two distinct subsets: the training set and the test set. The training set is used to fit the machine learning models, and the test set is used afterwards to assess their performance and their ability to generalize to new, unseen data. This practice checks that the model is not merely memorizing the training data but can make accurate predictions on data it has never encountered before.
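A minimal sketch of such a split is given below. Stratifying by class is an assumption added here because fraud data is imbalanced; the text does not state whether the split was stratified, and the function name, test fraction, and labels are illustrative.

```python
import random

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_indices, test_indices), preserving class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one sample of every class in the test set.
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

# Hypothetical labels: 16 genuine (0) and 4 fraudulent (1) transactions.
labels = [0] * 16 + [1] * 4
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With a purely random split, a small test set could end up with no fraudulent samples at all, which would make recall on the fraud class impossible to measure.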

Feature Selection
Fraud detection depends on identifying the features in the dataset that contribute most to predicting fraud. Feature selection techniques are used to pinpoint the most important attributes, as observed in [19]. In addition, dimensionality reduction methods such as Principal Component Analysis (PCA) are applied to extract the most informative aspects of the data, supporting efficient fraud prediction while minimizing unnecessary complexity.
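One simple filter-style way to rank features is by the absolute Pearson correlation between each feature and the fraud label, sketched below. This is an illustrative technique chosen for brevity and is not necessarily the method used in [19] or in this study; the feature names and values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0 or std_y == 0:
        return 0.0
    return cov / (std_x * std_y)

def rank_features(columns, labels):
    """Order feature names from most to least correlated with the label."""
    scores = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical feature columns and fraud labels (1 = fraudulent).
columns = {
    "amount": [10.0, 12.0, 11.0, 95.0, 90.0, 9.0],
    "hour": [1.0, 13.0, 22.0, 3.0, 14.0, 9.0],
    "constant_flag": [1.0] * 6,
}
labels = [0, 0, 0, 1, 1, 0]

ranking = rank_features(columns, labels)
print(ranking)
```

A univariate filter like this ignores interactions between features, which is one reason it is often combined with a projection method such as PCA or with model-based importance scores.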

Model selection
Model selection for fraud detection on the credit card dataset involves considering several machine learning algorithms. Each algorithm, such as Support Vector Machines (SVM), Logistic Regression, Decision Trees, Random Forests, Gradient Boosting, and Neural Networks, offers distinct strengths, and its suitability depends on factors such as dataset size, complexity, and the desired level of interpretability.

Model training
Model training fits the chosen machine learning model to the training set. Through this process, the model learns to differentiate between legitimate and fraudulent transactions on the basis of the input features.

Performance comparison
The performance of the trained model is assessed on the test set. For binary classification tasks such as fraud detection, several common evaluation metrics are used: accuracy, precision, recall, F1-score, and Receiver Operating Characteristic (ROC) curves.
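The threshold-based metrics listed above follow directly from the confusion-matrix counts, as the sketch below shows. The function name and the label vectors are hypothetical and are not results from this study.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # genuine passed
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical test-set labels and predictions (1 = fraudulent).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced data, accuracy alone can be misleading, because a model that labels every transaction as genuine still scores highly; precision, recall, and F1 on the fraud class give a clearer picture.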

II. PREVALENCE OF RANDOM FOREST IN CREDIT DATASET
Random Forest (RF) is widely used on credit-related datasets, including in work that integrates machine learning algorithms with quantum annealing solvers for online fraud detection, primarily because of its strong performance on complex, high-dimensional data. RF can handle both numerical and categorical features, making it suitable for credit datasets that often contain diverse types of information. The integration of machine learning algorithms with quantum annealing solvers brings its own advantages to online fraud detection: quantum annealing can explore a broader solution space than classical optimization techniques, potentially leading to improved model optimization and better fraud detection accuracy. Moreover, RF's ability to cope with class imbalance and its relatively low risk of overfitting are advantageous in fraud detection scenarios where genuine transactions significantly outnumber fraudulent ones, helping the model remain accurate and robust on imbalanced data. RF's prevalence in credit datasets thus stems from its robustness, its versatility, and its capacity to complement quantum computing in tackling the challenges of online fraud detection.

III. LITERATURE SURVEY
Financial fraud detection is a critical aspect of safeguarding the integrity of financial systems and protecting stakeholders from potential losses. Data mining techniques have emerged as valuable tools for identifying fraudulent activities within vast and complex datasets. The classification framework in [20] aims to provide a systematic approach to detecting financial fraud by employing various data mining algorithms, such as decision trees, support vector machines, and neural networks. Its academic review of the literature reveals a substantial body of research on this topic, showcasing the efficacy of data mining in detecting fraud across diverse financial domains, including banking, credit card transactions, and insurance claims. The reviewed studies highlight the advantages of data-driven approaches, demonstrating their ability to handle large-scale data, recognize intricate patterns, and adapt to evolving fraud strategies. By leveraging these techniques, financial institutions can enhance their fraud detection capabilities and mitigate potential risks, thereby fostering a more secure and trustworthy financial environment.
Data mining plays a crucial role in detecting and preventing credit card fraud, a prevalent issue in today's digital world. Credit card fraud occurs when unauthorized individuals gain access to sensitive financial information and use it for illicit purposes. To combat this, financial institutions and payment processors employ data mining techniques to analyze vast amounts of transactional data in real time. Data mining enables the identification of patterns, anomalies, and trends associated with fraudulent activities. Various machine learning algorithms, such as neural networks, decision trees, and logistic regression, are used in [21] to build predictive models that can flag suspicious transactions. These models take into account factors such as transaction amount, location, time, and user behavior to distinguish between legitimate and fraudulent activities. Continuous monitoring and refinement of these models are vital to stay ahead of evolving fraud tactics. By leveraging data mining, financial institutions can minimize losses, protect their customers, and maintain trust in the digital payment ecosystem, making transactions more secure and reliable.
Champion-challenger analysis is a technique employed in credit card fraud detection to improve the performance of fraud detection models. In this context, the "champion" model [22] is the existing fraud detection system, while the "challenger" model is a newly proposed or alternative approach. The objective is to compare their performance and determine whether the challenger can outperform the existing champion. To achieve better results, researchers and practitioners have explored hybrid ensemble techniques that combine the strengths of multiple algorithms. These ensembles may incorporate traditional machine learning methods alongside deep learning models. The combination can improve fraud detection accuracy, as each method brings its own capabilities for identifying fraudulent transactions. Deep learning models, such as neural networks, are adept at capturing intricate patterns and relationships within the data, enabling them to discern fraudulent activities effectively. By integrating these capabilities into an ensemble framework, credit card fraud detection systems can achieve higher precision and recall, reducing false positives and false negatives. This hybrid approach allows financial institutions to adapt their fraud detection systems to the evolving nature of credit card fraud, providing a robust defense against fraudulent activities while minimizing the impact on legitimate transactions.
Machine learning plays a pivotal role in enhancing cyber security and financial systems through its application to credit card fraud detection [23]. As digital transactions become more prevalent, the need for robust solutions that swiftly identify and thwart fraudulent activities has become paramount. Machine learning algorithms are well suited to this domain because they can analyze vast amounts of transactional data and identify patterns associated with fraudulent behavior. The process typically involves training the models on historical transactional data labeled as either genuine or fraudulent. Commonly used algorithms include logistic regression, decision trees, random forests, and neural networks. These models learn to recognize suspicious patterns, such as unusual purchase locations, large transactions, or an abnormal frequency of transactions. Real-time fraud detection is achieved by feeding new transaction data into the trained model, which then assigns a probability of fraud. If the probability exceeds a predetermined threshold, the transaction is flagged for further investigation, and appropriate measures can be taken to prevent financial losses and protect the cardholder. Overall, credit card fraud detection using machine learning has become an indispensable tool for financial institutions to secure their systems and safeguard their customers' assets from unauthorized activities.
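The scoring-and-threshold step described above can be sketched with a logistic model. The weights, bias, feature layout, and threshold values are hypothetical placeholders for a trained model and are not parameters from this study.

```python
import math

def fraud_probability(features, weights, bias):
    """Logistic score: maps a weighted sum of features to a value in (0, 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def flag_transaction(features, weights, bias, threshold=0.5):
    """Flag the transaction when its fraud probability exceeds the threshold."""
    return fraud_probability(features, weights, bias) > threshold

# Hypothetical trained parameters for two features:
# [scaled transaction amount, 1.0 if the purchase location is unusual].
weights = [2.0, 1.5]
bias = -3.0

small_local = [0.1, 0.0]
large_unusual = [0.9, 1.0]

print(flag_transaction(small_local, weights, bias))
print(flag_transaction(large_unusual, weights, bias))
print(flag_transaction(large_unusual, weights, bias, threshold=0.6))
```

Raising the threshold reduces false alarms at the cost of missing more fraud, so the threshold is usually chosen from the precision-recall trade-off that the application can tolerate.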

IV. PROPOSED SYSTEM
We propose a novel machine learning-based approach to address misclassification problems. Our system combines a new data preprocessing technique for feature transformation with Support Vector Machines (SVM), K-Nearest Neighbor (KNN), and a Random Forest Classifier. This integration aims to achieve the best accuracy by reducing bias and instability while accounting for the heterogeneity and size of the data. The proposed system uses these techniques to perform rigorous classifier tests, providing a robust solution for accurate classification and fraud detection in online systems.

A. Block Diagram
The block diagram of our machine learning framework-based credit dataset schema is shown in Figure 1.

B. Deployed Approaches
The machine learning approaches used in our framework for the credit dataset are detailed below.

ISSN 2456-5083
Artificial Neural Networks (ANNs) are computational models inspired by the structure and function of the human brain. They consist of interconnected nodes, also known as neurons, organized in layers. Each neuron receives inputs, processes them using an activation function, and produces an output. The connections between neurons carry weights, which are adjusted during training to optimize the network's performance.
1. Input Layer and Feature Extraction: In Artificial Neural Networks, the input layer receives the credit dataset's features, such as transaction amount, merchant category, and customer demographics, which are then processed to extract relevant patterns and information.

2. Hidden Layers and Learning Representation: The hidden layers of the neural network learn to represent complex relationships and patterns within the data, enabling the network to capture non-linear dependencies between features and improve fraud detection performance.

F. Advantages
The advantages of our machine learning framework-based Online Fraud Detection are as follows:
• Highest accuracy
• Reduced time complexity
• Easy to use

G. Applications
The probable applications of our machine learning framework-based Online Fraud Detection are as follows:
• E-commerce fraud prevention
• Real-time transaction monitoring
• Identity verification

V. RESULT AND ANALYSIS
This section presents the results of the proposed machine learning-based Online Fraud Detection technique.

A. Home Page
The screenshot of the home page of the proposed Online Fraud Detection system is shown in Figure 2.

E. User Home Page
The screenshot of the User home page of the proposed Online Fraud Detection system is shown in Figure 6. The user can access his or her portal from this page.

VI. COMPARISON TABLE
As the comparison in Table 1 shows, the models vary significantly in their performance metrics for credit card fraud detection. Random Forest (RF) outperforms the other models with the highest accuracy of 85%, striking a good balance between precision (81%) and recall (92%), as indicated by its F1-score of 86%. SVM achieves a remarkably high precision of 99%, but its recall and F1-score are low at 15% and 26%, respectively. The K-Nearest Neighbors (KNN) and Extra Trees (ET) models show similar performance, each with 74% accuracy. While KNN performs better in recall (63%) and F1-score (70%), ET excels in precision (76%). The Artificial Neural Network (ANN) exhibits the lowest accuracy of 50% and struggles with precision (45%) despite having a higher recall of 86%, resulting in a relatively lower F1-score of 69%. In summary, RF appears to be the most promising model for credit card fraud detection in this comparison, with a balanced trade-off between precision and recall. However, the specific requirements and objectives of the application should be considered when choosing the most suitable model.
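As a worked check of how the F1-score relates to the precision P and recall R quoted above, the standard harmonic-mean definition can be applied to the RF and SVM figures. The formula is the usual definition of F1; it is assumed, not stated in the text, that the reported scores were computed this way.

```latex
F_1 = \frac{2PR}{P + R}, \qquad
F_1^{\mathrm{RF}} = \frac{2(0.81)(0.92)}{0.81 + 0.92} = \frac{1.4904}{1.73} \approx 0.86, \qquad
F_1^{\mathrm{SVM}} = \frac{2(0.99)(0.15)}{0.99 + 0.15} = \frac{0.297}{1.14} \approx 0.26
```

Because F1 is a harmonic mean, it is pulled toward the smaller of the two values, which is why SVM's very high precision cannot compensate for its low recall.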

VII. CONCLUSION
Our user-friendly application, "Integrating Machine Learning Algorithms with Quantum Annealing Solvers for Online Fraud Detection Model," has successfully incorporated Support Vector Machines (SVM) with an accuracy of 57%, K-Nearest Neighbor (KNN) with an accuracy of 74%, a Random Forest Classifier with an accuracy of 85%, an ExtraTreesClassifier with an accuracy of 74%, and Artificial Neural Networks (ANN) with an accuracy of 74%. Through rigorous testing, we identified the most effective approaches for distinguishing between Fully Paid and Charged Off cases in online fraud detection. The integration of quantum annealing solvers further enhances the model's efficiency and accuracy. With these advancements, our application provides a robust and reliable tool for online fraud detection, giving businesses the means to make more informed decisions and safeguard against fraudulent activities.

Figure 1. Block diagram of our machine learning framework-based credit dataset

3. Output Layer and Fraud Prediction: The output layer of the neural network provides a probability score or a binary classification (fraudulent or not) based on the learned representations, allowing the model to predict and detect potential fraudulent credit card transactions.
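The input, hidden, and output layers described for the ANN can be sketched as a single forward pass. The layer sizes, the ReLU and sigmoid activations, and all weights below are hypothetical choices for illustration; the architecture actually used in this study is not specified in the text.

```python
import math

def relu(values):
    """Hidden-layer activation: keeps positive values, zeroes the rest."""
    return [max(0.0, v) for v in values]

def sigmoid(x):
    """Output activation: maps a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias for each neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def forward(features, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer -> single fraud-probability output."""
    hidden = relu(dense(features, hidden_w, hidden_b))
    logit = dense(hidden, [out_w], [out_b])[0]
    return sigmoid(logit)

# Hypothetical, untrained parameters for a network with 2 inputs,
# 2 hidden neurons and 1 output neuron.
hidden_w = [[1.0, -1.0], [0.5, 0.5]]
hidden_b = [0.0, 0.0]
out_w = [1.0, 2.0]
out_b = -1.5

probability = forward([0.5, 1.0], hidden_w, hidden_b, out_w, out_b)
print(probability)
```

During training, the weights and biases would be adjusted by backpropagation so that the output probability matches the fraud labels; only inference is shown here.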

Figure 2. Screenshot of the Home Page

B. About Page
The screenshot of the About page of the proposed Online Fraud Detection system is shown in Figure 3. This page provides a detailed description of the project.

Figure 3. Screenshot of the About Page

C. Register
The screenshot of the Register page of the proposed Online Fraud Detection system is shown in Figure 4. The user can register with the required details.

Figure 4. Screenshot of the Register Page

D. Login
The screenshot of the Login page of the proposed Online Fraud Detection system is shown in Figure 5. The user can log in with the required details.

Figure 5

Figure 6. Screenshot of the User Home Page

F. Upload Page
The screenshot of the Upload page of the proposed Online Fraud Detection system is shown in Figure 7. Here, the credit dataset is uploaded to the proposed system.

Figure 7. Screenshot of the Upload Page

G. View Data
The screenshot of the View Data page of the proposed Online Fraud Detection system is shown in Figure 8. Here, the uploaded credit dataset can be viewed.

Figure 8

Figure 9. Screenshot of the Preprocessing Page

I. Model Training
The screenshot of the Model Training page of the proposed Online Fraud Detection system is shown in Figure 10. Here, the models are trained on the credit dataset.

Figure 10. Screenshot of the Model Training Page

J. Prediction Page
The screenshot of the Prediction page of the proposed Online Fraud Detection system is shown in Figure 11. On this page, fraud is predicted from the uploaded data points.

Figure 11