Empowering Glioma Prognosis With Transparent Machine Learning and Interpretative Insights Using Explainable AI

The primary objective of this research is to create a reliable technique to determine whether a patient has glioma, a specific kind of brain tumour, by examining various diagnostic markers, using a variety of machine learning as well as deep learning approaches, and involving XAI (explainable artificial intelligence) methods. Through the integration of patient data, including medical records and genetic profiles, machine learning algorithms have the ability to predict how each individual will react to different medical interventions. To guarantee regulatory compliance and inspire confidence in AI-driven healthcare solutions, XAI is incorporated. Machine learning methods employed in this study include Random Forest, decision trees, logistic regression, KNN, AdaBoost, SVM, CatBoost, LGBM classifier, and XGBoost, whereas the deep learning methods include ANN and CNN. Four alternative XAI strategies, namely SHAP, Eli5, LIME, and the QLattice algorithm, are employed to comprehend the predictions of the model. XGBoost, a machine learning model, achieved accuracy, precision, recall, F1 score, and AUC of 88%, 82%, 94%, 88%, and 92%, respectively. The best characteristics according to the XAI techniques are IDH1, Age at diagnosis, PIK3CA, ATRX, PTEN, CIC, EGFR and TP53. By applying data analytic techniques, the objective is to provide healthcare professionals with a practical tool that enhances their capacity for decision-making, improves resource management, and ultimately raises the bar for patient care. Medical experts can customise treatments and improve patient outcomes by taking into account each patient's particular characteristics. XAI provides justifications to foster faith amongst patients and medical professionals who must rely on AI-assisted diagnosis and treatment recommendations.


I. INTRODUCTION
The most typical kind of central nervous system cancer, arising from glial cells, is called glioma [1]. Glioma is a kind of brain cancer that penetrates the outer layer of the brain, and glioblastoma is its most malignant type [1]. Gliomas are growths of cells that start in the brain or spinal cord. The cells found in gliomas mimic healthy brain cells called glial cells, which facilitate the functionality of nerve cells by encircling them. When a glioma grows, a clump of cells called a tumour forms [2]. The tumour can expand to the point that it presses against the brain or the spinal cord, causing signs and symptoms. The symptoms depend on the precise region of the brain or the spinal cord that is impacted. Malignant gliomas can invade healthy brain tissue and grow swiftly. Treatment options for gliomas generally involve surgery, radiation therapy, chemotherapy, and other treatments. Depending on where the glioma is, different symptoms could be present; the type of glioma, its size, and the rate of its growth can all affect the symptoms. Common symptoms and indications of glioma comprise headache, nausea, decreased brain activity, memory loss, alterations in personality or temperament, issues with vision, and issues with speech [2], [3]. In Figure 1, the method of treating a glioma tumour is shown. In 2007, the World Health Organization (WHO) categorized brain tumours according to cell type and grade (grades I-IV), adopting histopathological criteria for diffuse gliomas that relate to similarities with suspected cells of origin and anticipated levels of differentiation [4], [5]. Grade I tumours tend to occur in children and are typically benign, indicating that they are usually treatable [6]. Three tumour types are classified as grade II: oligodendrogliomas, astrocytomas, and oligoastrocytomas, which are a combination of both. Adults commonly experience them. All low-grade gliomas have the potential to grow into high-grade tumours in later stages. Anaplastic astrocytomas, anaplastic oligodendrogliomas, and anaplastic oligoastrocytomas are all examples of grade III tumours. Compared to grade II, they are more aggressive and invasive. According to the WHO classification, grade IV glioma, identified as Glioblastoma Multiforme (GBM), is one of the deadliest tumours [6].
Brain tumours classified as gliomas might differ from one another in terms of their genetic and molecular makeup. Several factors are taken into account when predicting how a glioma would behave and how it will progress. The variables considered for glioma prediction are Gender, Age at diagnosis, Race, ATRX, PTEN, EGFR, CIC, MUC16, TP53, IDH1, GRIN2A, IDH2, FAT4, FUBP1, BCOR, RB1, CSMD3, NOTCH1, SMARCA4, PIK3CA, NF1, PIK3R1, and PDGFRA [7]. While gender may not have a direct impact on glioma prediction, it can be thought of as a demographic feature that may be associated with other factors. The age of the patient at diagnosis is a crucial prognostic indicator; in general, younger patients typically experience better outcomes than older people. For instance, gliomas in paediatric patients could differ in their features and prognoses. The incidence and prognosis of gliomas can also vary depending on race. ATRX, PTEN, EGFR, CIC, MUC16, TP53, IDH1, GRIN2A, IDH2, FAT4, FUBP1, BCOR, RB1, CSMD3, NOTCH1, SMARCA4, PIK3CA, NF1, PIK3R1, and PDGFRA are examples of genes in which mutations can occur. These particular genetic changes or mutations may manifest in glioma cells. Each of these changes alters the tumour's molecular make-up and, consequently, its behaviour. Patients with glioma may have different treatment options and a different prognosis depending on whether these mutations are present or absent. For instance, better prognoses are linked to IDH1 and IDH2 mutations [8], while poorer prognoses are linked to amplification of EGFR [9] and PTEN loss. TP53 mutations may influence the tumour's virulence [10]. Genetic testing is essential for glioma molecular profiling in order to identify the tumour subtype and customize treatment. All of these factors may be taken into consideration by glioma prediction models to evaluate the hazards and possible treatments for specific patients. Machine learning algorithms, for example, can examine a mixture of these elements to forecast how the disease would most likely develop and the best course of treatment. It is imperative to consult with medical professionals who can provide tailored guidance based on these variables and the most recent discoveries about the diagnosis, management, and prognosis of gliomas [11].
Prediction of whether a patient is suffering from a glioma tumour can be achieved with artificial intelligence and machine learning classifiers. Artificial intelligence (AI) refers to the creation of computer systems and algorithms that are capable of performing activities that frequently need human intelligence, like learning from data, reasoning, solving problems, and understanding natural language. Robots and autonomous systems are included, along with various additional techniques and purposes, including neural networks and machine learning [12]. AI algorithms perform complex operations on enormous volumes of data. These activities consist of recognizing text and images, remote medical care, precise disease identification, and disease prediction [13]. A subfield of artificial intelligence called machine learning (ML) is concerned with developing algorithms and models that enable computers to process data, draw conclusions, and make predictions without being explicitly programmed [14], [16]. It involves using statistical techniques to help computers become more proficient at a particular task through iterative learning from experience. The healthcare industry can benefit greatly from machine learning since it can analyse vast amounts of patient data to find trends, predict disease outcomes, and help with early diagnosis. In addition, machine learning can be used to streamline administrative procedures, enhance medicine development, and improve treatment regimens, ultimately leading to better patient care and cheaper healthcare costs [17], [18]. As a result of the recognition that classifiers must be utilised appropriately in order to provide transparency, accountability, and ethics, Explainable AI (XAI) was created [15]. Black-box models gain new potential because of the explainability element, which also gives healthcare stakeholders the assurance of interpreting deep learning (DL) [19], [20] and machine learning (ML) algorithms [19], [21]. Transparency in predictive analysis is essential for the healthcare sector, and XAI intends to work on improving it [15]. The idea of creating artificial intelligence systems and machine learning models in a way that allows people to understand and interpret their decisions and behaviours is known as explainable artificial intelligence (XAI). XAI aspires to render artificial intelligence accountable and transparent by outlining the justifications for the decisions that AI systems make [22].
Machine learning algorithms can forecast whether or not a patient has been diagnosed with a glioma tumour using genetic and molecular makeup markers. By examining a wide range of genetic markers, the algorithms can determine the seriousness of a person's illness and the possibility of complications. This aids in early diagnosis and detection, assists in early treatment planning for patients, and prevents subsequent issues, improving the health of the patient. Continuous observation and analysis can also enable early intervention for patients who are at risk, enhancing overall healthcare and health outcomes. Explainable Artificial Intelligence (XAI) is essential for fostering trust, addressing ethical concerns, ensuring regulatory compliance, debugging models, enhancing user understanding, and promoting collaboration. It provides transparency and interpretability in AI systems, making them more accountable and accessible.
Following is the organization of the remaining content: related work is illustrated in Section II. Materials and methods are discussed in Section III. The results of the study are discussed in considerable detail in Section IV. Section V addresses the conclusions drawn from the classifiers along with their probable applicability.

II. RELATED WORK
To forecast whether a patient is diagnosed with glioma or not, a number of studies have already used machine learning approaches. The following research projects have significantly advanced knowledge. Using multi-modal MR image fusion, Ouerghi et al. [23] examined the role of radiomic feature integration in conjunction with machine learning techniques in distinguishing low-grade gliomas from high-grade gliomas. Eighty histologically verified glioma patients from the MICCAI BraTS 2019 dataset, i.e., 40 high-grade gliomas and 40 low-grade gliomas, were analyzed for this research [23]. Five machine learning algorithms were created and examined using the fused and the recovered data under a tenfold cross-validation plan. As an outcome, the random forest model, which used 21 characteristics chosen from the raw data, achieved the highest accuracy of 96.5%. Utilizing texture information from 153 multi-parametric MRI patients, a radiomics approach was proposed by Tian et al. [24]. For separating grade III from grade IV and LGGs from HGGs, respectively, SVM models were created utilizing 30 and 28 optimum characteristics. The accuracy of the SVM (support vector machine) algorithm was 96.8% for separating LGGs from HGGs and 98.1% for separating grade III from grade IV, which was acceptable compared to utilizing single-sequence MRI or histogram parameters [24]. A total of 285 cases collected for the Brain Tumour Segmentation 2017 Challenge were examined by Cho et al. [25]. Five prominent characteristics were chosen for the machine learning models using the minimal redundancy maximum relevance algorithm. The three different classifiers (support vector machines, logistic regression, and random forest) obtained 94% mean accuracy for the training class and 92.13% maximum accuracy for the test class [25]. For training cohorts, they displayed an average AUC of 0.94, whereas for test cohorts it was 0.9030 (0.9010 for logistic regression, 0.8866 for the support vector machine, and 0.9213 for the random forest). A non-invasive glioma prediction framework was proposed by Wu et al. [26]. Between 2012 and 2016, experiments were carried out on about 161 cases of glioma from the Henan Provincial People's Hospital. The outcomes showed that the de-redundancy algorithm was widespread and had an accurate grading impact. The 2D segmented tumour was used to calculate 346 radiomics characteristics. A candidate feature set was built using mutual information, and then an elastic net was used to carry out the feature selection. The prediction model was developed using linear regression to obtain the necessary sensitivity of 93.57%, specificity of 86.53%, AUC of 0.9638, and accuracy of 91.30%. Cao et al. [27] produced a quantitative framework according to tumour location and tumour volume characteristics, employing data from 229 The Cancer Genome Atlas LGG and GBM patients [27]. Two sampling approaches were used in the construction and testing of LASSO (least absolute shrinkage and selection operator) regression and nine machine learning models: institution-based and repeat random sampling (with a 70% training set and a 30% validation set) [27]. The best results were obtained via stack modelling and support vector machines (AUC > 0.900, accuracy > 0.790 for the validation set derived from institution-based sampling; AUC > 0.930 for the average validation set and accuracy > 0.850 for repeat random sampling). The regression model demonstrated the best performance for the LASSO approach (institution-based sample validation set, with AUC 0.909 and model accuracy of 0.830). From 735 images, Rathore et al. [28] retrieved 2D quantitative imaging characteristics, including conventional, clinical, and textural characteristics. Tenfold cross-validation using the 735 glioma images resulted in a successful verification of the texture features (accuracy: 75.12%, AUC: 0.652) using the SVM algorithm. Table 1 summarizes the related work with a deep and wide review.
The current study is concerned with using XAI methods like Eli5, SHAP, QLattice and LIME to enhance prediction of glioma, a brain tumour. A recent development in machine learning called Explainable AI (XAI) aims to address the unresolved question of how ''black box'' artificial intelligence (AI) algorithms determine decisions. In an effort to make decision-making processes and models more understandable and comprehensible, this field conducts research on them.
The findings cited above indicate that prediction has already been done using ML and AI algorithms. This article adds to the body of literature in the following ways: 1. Pearson's correlation, mutual information and principal component analysis, as feature selection approaches, were used to identify the most crucial attributes, and this study compared these different feature selection techniques. 2. A ground-breaking customized ''ensemble-stacking'' approach was developed and put into use to improve performance using baseline classifiers. 3. In this unique investigation, four XAI algorithms were applied to the given data to clarify predictions: ELI5, LIME, QLattice and SHAP.

A. DATASET DESCRIPTION
We used a dataset that has been made publicly available for this study [39]. The Glioma Grading Clinical and Mutation Features Dataset is accessible through the UCI Machine Learning Repository. Given that gliomas are prevalent primary tumours of the brain and are classified as either GBM (Glioblastoma Multiforme) or LGG (Lower-Grade Glioma) based on imaging and histological criteria, the grading procedure heavily relies on clinical and molecular/mutation aspects. Three clinical characteristics and the twenty most frequently mutated genes from the TCGA-LGG and TCGA-GBM brain glioma projects are taken into consideration in this dataset.
The dataset includes the clinical and molecular/mutation factors that are used in the prediction task to identify patients with specific clinical, molecular, and mutational characteristics as LGG or GBM. The molecular/mutation factors are encoded as 0 = NOT_MUTATED and 1 = MUTATED. There are 889 patients with 23 attributes in this set. The target variable in this dataset is grade (a binary classification problem), which states whether a patient has LGG or GBM. Out of the 889 patients, 487 suffered from LGG while 352 suffered from GBM. Of the 23 attributes, there was 1 numerical attribute and 22 categorical attributes. The attributes had no null values. Table 2 lists the dataset's attributes.

B. DATASET PREPROCESSING
The preprocessing of a dataset turns unprocessed data into forms that are comprehensible and useful. Raw datasets present several issues, including defects, unpredictable behaviour, absence of trends, and unpredictability [40]. Preprocessing is also required in order to address missing values and discrepancies. The dataset was preprocessed, and a few further operations were required to prepare it for deployment.
Balancing the dataset allows a model to be trained easily by preventing it from becoming biased toward one class. There are two approaches for balancing data: undersampling and oversampling. Despite being simple to implement as well as capable of improving model run-time, undersampling has some downsides: the elimination of data points from the original set of data can result in a loss of important information. Oversampling, in turn, can produce misleading scores by replicating minority-class samples. In this study, the dataset was already appropriately balanced, so there was no need to implement any extra balancing algorithms [41], [42].
Data normalization entails rescaling the attributes to ensure their mean is equal to 0 and their variance is equal to 1. Standardization aims to preserve the variations in value limits while reducing every attribute to a comparable scale. We employed the standard scaler technique for feature scaling. Following the standardization procedure, an outlier no longer has a strong impact on the dataset; moreover, standardization does not impose a bounded range on the values.
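As a rough illustration of this standardization step (not the study's actual code), scikit-learn's StandardScaler rescales a column to zero mean and unit variance; the age values below are fabricated for demonstration only:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Fabricated sample of the dataset's single numerical attribute (age at diagnosis).
ages = np.array([[34.0], [51.0], [62.0], [45.0], [70.0]])

scaler = StandardScaler()               # rescales to zero mean, unit variance
ages_scaled = scaler.fit_transform(ages)

print(round(float(ages_scaled.mean()), 6))  # ~0.0
print(round(float(ages_scaled.std()), 6))   # ~1.0
```

The same fitted scaler would then be applied to the test split with `scaler.transform`, so that test data is scaled by the training statistics rather than its own.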
To allow algorithms to analyze and learn from non-numerical aspects, categorical data must be transformed into numerical format for use in data analysis and machine learning. A key method for representing categorical data as binary vectors in machine learning and data analysis is one-hot encoding. Every category or label in the dataset appears as a binary vector, with just a single component set to ''hot'' (set to 1) and all other elements set to ''cold'' (set to 0). The one-hot encoding method of feature engineering is critical for preparing data for tasks such as classification and regression [43]. For our study, one-hot encoding was not performed.

C. FEATURE SELECTION
In this study, Pearson's correlation, principal component analysis and mutual information were used to select the best characteristics. These algorithms aided in the extraction of crucial attributes while also reducing the quantity of data.

1) PEARSON'S CORRELATION
Pearson's correlation coefficient analysis was conducted after an initial assessment of the dataset to see how each attribute influenced the outcome. An attribute and the output are perfectly associated if the coefficient value ''r'' reaches close to +1 or -1, while a value of ''0'' indicates no association. A positive correlation coefficient value implies that the attribute influenced the outcome positively; if the value was negative, the outcome was influenced in the opposite direction. The method of evaluating correlation coefficients is based on the idea that analyzing the degree to which specific attributes are associated can help to determine the value of an attribute collection in a dataset [44], [45]. A few variables correlated positively, while others correlated negatively. Figure 2 depicts the correlation heatmap.
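A minimal sketch of how such a correlation analysis can be computed with pandas; the mutation flags below are fabricated for illustration and are not drawn from the actual dataset:

```python
import pandas as pd

# Toy stand-in for the glioma dataset: binary mutation flags and the binary Grade target.
df = pd.DataFrame({
    "IDH1":  [1, 1, 0, 1, 0, 0, 1, 0],
    "ATRX":  [1, 0, 0, 1, 0, 0, 1, 0],
    "Grade": [0, 0, 1, 0, 1, 1, 0, 1],   # 0 = LGG, 1 = GBM
})

corr = df.corr(method="pearson")          # pairwise Pearson r in [-1, +1]
print(corr["Grade"].sort_values())
```

In this toy sample IDH1 is exactly the inverse of Grade, so its coefficient with the target is -1; a heatmap such as Figure 2 is simply a colour-coded rendering of this correlation matrix.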

2) MUTUAL INFORMATION (MI)
The mutual information method is an effective approach for selecting features [46]. This filtering method takes into consideration the numerical properties of the dataset. Mutual information depends on entropy, which is a measure of how unpredictable the features are. The attributes were ranked according to the relative contribution of each to the target variable, as shown in Figure 3.
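A hedged sketch of mutual-information-based ranking using scikit-learn's `mutual_info_classif`; the two features here are synthetic (one deliberately informative, one pure noise), not taken from the study's data:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)                  # synthetic binary grade label

informative = y ^ (rng.random(200) < 0.1)    # mostly tracks y (10% of labels flipped)
noise = rng.integers(0, 2, 200)              # unrelated binary feature
X = np.column_stack([informative, noise])

# Higher mutual information = larger reduction in label entropy given the feature.
mi = mutual_info_classif(X, y, discrete_features=True, random_state=0)
print(mi)  # the informative feature should score clearly higher than the noise
```

Sorting features by these scores yields a ranking like the one visualized in Figure 3.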

3) PRINCIPAL COMPONENT ANALYSIS (PCA)
Principal component analysis (PCA) is frequently employed in modern data analysis. The purpose of PCA is to find a highly meaningful basis for re-expressing a particular set of data. Dimensionality reduction, data compression, feature extraction, and data visualization are just a few examples of its uses. Principal components, which are produced by linearly combining the original variables, are a set of new orthogonal variables produced by PCA [47]. Using PCA, one can make certain datasets less dimensional, which enhances interpretability while retaining most of the information. It accomplishes this by introducing fresh, uncorrelated covariates. This discovery of new variables, or what we refer to as the principal components, reduces the difficulty of solving the eigenvalue/eigenvector problem [48]. The PCA technique aids in determining which set of data most accurately captures the topic under study. The PCA technique yields several components, reducing the dimensionality of a multivariate dataset by condensing the dimensions into a smaller number of variables [49], as shown in Figure 4.
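A small illustration of this dimensionality reduction with scikit-learn's PCA, using synthetic data (three highly correlated columns plus one low-variance noise column) rather than the study's attributes:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
base = rng.normal(size=(100, 1))
# Three correlated columns + one small-noise column: most variance lies in one direction.
X = np.hstack([base, base * 0.9, base * 1.1, rng.normal(size=(100, 1)) * 0.1])

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)   # project onto the top two orthogonal components

print(X_reduced.shape)                         # (100, 2)
print(pca.explained_variance_ratio_.round(3))  # first component dominates
```

Because the first three columns are nearly collinear, the first principal component captures almost all of the variance, which is exactly the kind of redundancy PCA is used to condense.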

4) IMPORTANT FEATURES
Some features appeared consistently across the three techniques: IDH1, Age at diagnosis, PIK3CA, ATRX, PTEN, CIC, EGFR and TP53. These characteristics were selected for further examination. Table 3 depicts the list of key features from Pearson's correlation, mutual information and PCA.

D. MACHINE LEARNING TERMINOLOGIES
The steps involved in machine learning are dataset selection and preparation, model training, model deployment and assessment of the model's performance. For the purpose of improving the model's performance, iterative testing as well as enhancements to the model are commonly employed. The end result of machine learning is the building of an algorithm that accurately generalizes to freshly acquired data and addresses the query at hand. To choose the best model, hyperparameter tuning must be effective. Hyperparameter selection aims to carry forward the beneficial outcomes from previous training cycles, and model improvement is possible by modifying the algorithm's parameters [45]. We used the grid search optimization technique in this study, which acquires optimized parameter values. Grid search is a tuning technique that exhaustively examines each value in the predefined hyperparameter space to carry out comprehensive parameter searching. Many machine learning algorithms' performance is influenced by their hyperparameter configurations [50].
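The grid search procedure can be sketched with scikit-learn's `GridSearchCV`; the classifier, parameter grid and synthetic data below are illustrative assumptions, not the study's actual search space:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary-classification data standing in for the glioma dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Every combination in this grid is trained and scored with 5-fold cross-validation.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print(search.best_params_)          # the winning hyperparameter combination
print(round(search.best_score_, 3)) # its mean cross-validated accuracy
```

Exhaustive search is affordable here because the grid has only four combinations; real grids grow multiplicatively with each added hyperparameter.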
Machine learning ensemble models are used in a variety of ways, such as bagging, boosting and stacking. By stacking models, we may train several of them to tackle related issues and then integrate the results to create a more powerful model [51]. Making use of this concept, we built three stacks on two distinct levels. Figure 5 illustrates stacking with a graphic demonstration. Random forest, KNN, logistic regression and decision trees made up the initial stack. Tree-based models such as XGBoost, LightGBM, CatBoost and AdaBoost made up the second stack. The ultimate stack was created by further ensembling the aforementioned stacks.
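The two-level stacking idea can be sketched with scikit-learn's `StackingClassifier`. This is a simplified stand-in, not the study's exact configuration: GradientBoosting and AdaBoost substitute for the external XGBoost/LightGBM/CatBoost libraries, and the data is synthetic:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (StackingClassifier, RandomForestClassifier,
                              AdaBoostClassifier, GradientBoostingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=12, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Level-1 stack: the paper's first group (random forest, KNN, logistic regression, decision tree).
stack1 = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
                ("knn", KNeighborsClassifier()),
                ("lr", LogisticRegression(max_iter=1000)),
                ("dt", DecisionTreeClassifier(random_state=0))],
    final_estimator=LogisticRegression())

# Level-1 stack of boosted trees (stand-ins for XGBoost/LightGBM/CatBoost/AdaBoost).
stack2 = StackingClassifier(
    estimators=[("gb", GradientBoostingClassifier(random_state=0)),
                ("ada", AdaBoostClassifier(random_state=0))],
    final_estimator=LogisticRegression())

# Level-2 "final stack" ensembles the two level-1 stacks.
final_stack = StackingClassifier(
    estimators=[("s1", stack1), ("s2", stack2)],
    final_estimator=LogisticRegression())

final_stack.fit(X_tr, y_tr)
print(round(final_stack.score(X_te, y_te), 3))
```

Each stack trains its base learners on cross-validated folds and feeds their predictions to a meta-learner, so the final stack effectively learns how to weight the two level-1 ensembles.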
XAI techniques were applied to interpret the model outputs. The implication of interpretability using XAI models is the ability to understand and make sense of the decisions or outcomes generated by a model. It provides transparency, ensuring that the inner workings of the model are accessible and can be explained in a human-understandable manner. The following XAI models were applied in this study: 1. SHAP (SHapley Additive exPlanations): by determining the relative contribution of every feature to the resulting estimation and prediction, this model-neutral method assesses the outcome for any machine learning model. The entire workflow of an automated machine learning project is achieved using a machine learning pipeline, which is made up of several connected data processing modules. It typically involves preparing the data, selecting the model, selecting features, adjusting hyperparameters, and evaluating. By offering a methodical and automated perspective on the whole process, this pipeline has been designed to maximize the efficacy of various machine learning models. The machine learning (ML) pipeline applied in this study is depicted in Figure 6.

A. PERFORMANCE METRICS
Our AI models have been evaluated and compared using classification measures like precision, recall, F1 score, accuracy and AUC (Area Under the Curve) score. Our classifiers aim to identify patients with gliomas, a particular type of brain tumour.
1. Accuracy: The accuracy refers to the ability to correctly distinguish between patients who have LGG (Lower-Grade Glioma) and those who have GBM (Glioblastoma Multiforme). To ascertain whether the forecast was accurate, it is necessary to calculate the percentage of true positive as well as true negative outcomes among all of the examined cases.
2. Precision: The proportion of patients who actually have a Lower-Grade Glioma or Glioblastoma Multiforme out of all patients predicted as such is determined by this statistic. This means that individuals who have been diagnosed with a glioma that was not actually a glioma are also taken into account.
3. Recall: This performance metric is defined as the ratio of patients correctly identified as having LGG (Lower-Grade Glioma) to all patients who were actually affected. False-negative events are highlighted by this statistic; when false-negative cases are rare, this metric is high.
4. F1 score: This evaluation statistic represents the combined precision and recall ratings of a model.
Log loss of 6.167, Hamming loss of 0.178, Jaccard score of 0.690, and Matthews correlation coefficient (MCC) of 0.644 are the related loss metrics for the Final Stack. Table 3 displays a summary of these test results.
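The four metrics above can be computed directly from confusion-matrix counts. The counts below are hypothetical, chosen only to make the arithmetic concrete (they are not the study's results):

```python
# Hypothetical confusion counts for a glioma classifier (GBM treated as the positive class).
tp, fp, fn, tn = 66, 8, 5, 99

accuracy  = (tp + tn) / (tp + tn + fp + fn)   # fraction of all cases classified correctly
precision = tp / (tp + fp)                    # penalized by false positives
recall    = tp / (tp + fn)                    # sensitivity; penalized by false negatives
f1        = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(round(accuracy, 3), round(precision, 3), round(recall, 3), round(f1, 3))
```

Note how the few false negatives (5) keep recall high, while the false positives (8) pull precision down slightly; F1 sits between the two.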
Using the Grid Search approach and 5-fold cross-validation, hyperparameter tuning was applied to all algorithms in order to prevent overfitting. The models' selected hyperparameters are listed in Table 5.
The AUC for the final stack model is represented in Figure 7. For a test size of 0.2 and balanced data, the final stack model received an AUC value of 90%. The precision-recall (PR) curve and confusion matrix of the final stack model are demonstrated in Figure 7, with a precision of 79%. This study used heterogeneous classifiers along with feature selection techniques to enhance performance. To help doctors predict whether a patient is diagnosed with a glioma tumour, i.e., Lower-Grade Glioma or Glioblastoma Multiforme, these models could be deployed in hospitals.
The major objective of deep learning (a branch of machine learning) is to train artificial neural networks to learn and predict from data. Its capacity to automatically identify and represent intricate patterns in data has led to its enormous rise in popularity. This makes it ideal for a variety of applications, including natural language processing, autonomous systems, image and speech recognition, and more. The ability of deep learning models, especially deep neural networks with numerous hidden layers, to automatically extract hierarchical features from unprocessed data enables them to carry out tasks that were once believed to be beyond the reach of traditional machine learning techniques. Deep learning continues to expand our knowledge of artificial intelligence and has revolutionized a number of areas, including healthcare, banking, and self-driving automobiles [52].
Artificial Neural Networks (ANNs) are computer processing systems which heavily borrow from the way the human brain operates. An enormous number of interconnected computational nodes (neurons) form the fundamental building block of artificial neural networks (ANNs). These distributed neurons work together to optimize the outcome and learn from the input [53]. The input layer receives the data, which is subsequently distributed to the hidden layers after being loaded as a multidimensional vector. The hidden layers perform the learning process, using the preceding layer's decisions to assess whether a stochastic modification will eventually enhance or damage the output. The term ''deep learning'' refers to systems that have multiple hidden layers stacked on top of each other [53]. In our study, we employed an ANN architecture to predict whether a patient is suffering from Lower-Grade Glioma (LGG) or Glioblastoma Multiforme (GBM), and this model performed with 84% accuracy, 88% AUC and 80% precision. The ANN architecture for glioma classification is illustrated in Figure 8.
ReLU served as the activation function for the input and hidden layers, while sigmoid served as the activation function for the output layer. As seen in Figure 9, the accuracy on the training and test sets is satisfactory for the ANN architecture over 30 epochs. ANNs are composed of neurons which can learn to optimize by themselves; the fundamental component of innumerable artificial neural networks (ANNs) is still a single neuron that receives an input and then executes an action (like a scalar product followed by a non-linear function) [53].
The class score that is generated as the ultimate output will still be represented by the network using a single perceptual scoring function (weight) applied to the input vectors. The last layer of the architecture, which has loss functions linked to the classes, complies with all the usual rules designed for ordinary ANNs [53].
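A single forward pass through such a network can be sketched in NumPy. The layer sizes, random weights and input vector below are hypothetical (a trained model would learn the weights by backpropagation); only the activation choices, ReLU hidden and sigmoid output, mirror the architecture described above:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=23)                 # one patient: 23 input attributes (hypothetical values)

# Illustrative random weights; a real model learns these during training.
W1, b1 = rng.normal(size=(16, 23)), np.zeros(16)   # hidden layer with ReLU activation
W2, b2 = rng.normal(size=(1, 16)),  np.zeros(1)    # output layer with sigmoid activation

h = relu(W1 @ x + b1)                   # hidden representation
p = sigmoid(W2 @ h + b2)[0]             # probability-like score for GBM
print(p)                                # threshold at 0.5 to assign LGG vs GBM
```

The sigmoid squashes the final score into (0, 1), which is why it suits the binary LGG/GBM output, while ReLU keeps hidden-layer gradients well-behaved during training.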
The primary application of CNNs is in pattern detection within images, which is the sole discernible distinction between them and ANNs. This further minimizes the number of parameters required to initialize the model and allows us to incorporate attributes unique to individual images into the design, thereby improving the network's suitability for image-focused applications. A CNN's primary benefit over its predecessors is that it can identify important traits automatically without human oversight, making it the most widely used architecture [54]. Convolutional neural networks are made up of several building blocks, including fully connected, pooling, and convolution layers. They use a backpropagation algorithm to autonomously and adaptively learn the spatial hierarchies of information [55]. Convolutional neural networks are a kind of feedforward neural network which are capable of extracting features from data that has convolution patterns. Unlike traditional feature extraction methods, a CNN does not require manual feature extraction. A CNN's architecture is influenced by how people see things. An artificial neuron is analogous to a biological neuron; CNN kernels are diverse sensors that can react to varied stimuli; activation functions mimic the process by which neural electric signals that surpass a specific threshold are passed on to the subsequent neuron. The creation of loss functions and optimizers allows the CNN system as a whole to learn what is expected [56].
In our study, we also employed a CNN architecture to predict whether a patient is suffering from Lower-Grade Glioma (LGG) or Glioblastoma Multiforme (GBM), and this model performed with 84% accuracy, 88% AUC and 80% precision. The CNN architecture for glioma classification is illustrated in Figure 11. LeakyReLU was the activation function employed in the convolution layers as well as the dense layers, whereas sigmoid was used in the output layer. As seen in Figure 12, the accuracy on the test and training sets is satisfactory for the CNN architecture over 30 epochs. Figure 13, for the same architecture and epochs, illustrates the training and validation losses acquired for additional analysis, with a drop in the loss throughout training indicating good convergence.
The results obtained from the various models are visually depicted in the form of a bar graph in Figure 14. The dataset is shuffled at the outset and the data is divided in an 80:20 (train:test) ratio. The training data is also validated on all the models, and the results indicate that the models overcome the issue of overfitting. Table 6 summarizes the results on the training data.
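The shuffle-and-split protocol can be sketched as follows; the synthetic data generated here is only a stand-in for the glioma dataset, and the random forest and its settings are illustrative placeholders rather than the study's tuned models:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in matching the dataset's shape (889 patients, 23 features).
X, y = make_classification(n_samples=889, n_features=23, random_state=42)

# Shuffle, then split 80:20 (train:test) as in the study.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.20, shuffle=True, stratify=y, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_tr, y_tr)
train_acc = clf.score(X_tr, y_tr)
test_acc = clf.score(X_te, y_te)
gap = train_acc - test_acc  # a small train/test gap suggests little overfitting
```

Comparing training and test accuracy in this way is the check summarized in Table 6: a model that scores far higher on the training split than on the held-out split is overfitting.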

C. EXPLAINABLE ARTIFICIAL INTELLIGENCE (XAI)
Within the healthcare industry, XAI is still in its infancy, despite its promise to enhance the use of AI. Interpretability serves as a guiding principle in the responsible deployment of AI, ensuring that complex models can be understood and scrutinized. It not only reinforces trust and compliance but also empowers users to harness the benefits of advanced technologies while remaining vigilant to potential biases and ethical considerations. With interpretability, the right models can be (semi-)automatically found, their criteria and justifications optimized, partners included, analytics incorporated, a level of safety and accountability ensured, and strategies for integrating them with the clinical workflow recommended [57]. Four XAI models are used in this study: QLattice, Eli5, SHAP and LIME. With the use of the feature importance methodologies discussed above, we can better comprehend the relevance of particular attributes. For interpretation, the final stacking model was used [58].
The SHAP model interprets a model's output by considering the significance of every feature to the machine learning model's prediction [59]. By applying principles from game theory, it calculates the contribution of each attribute to the model's output and offers comprehensible and straightforward explanations. This procedure is used to assess the prediction's accuracy [60]. Figure 15 shows the beeswarm plot and bar chart produced by SHAP analysis for local interpretation. According to the figure, the primary risk factors for glioma tumour in a patient are IDH1, Age at diagnosis, ATRX, CIC, PTEN, EGFR, IDH2, RB1, TP53, MUC16, NOTCH1 and FUBP1.
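The game-theoretic idea underlying SHAP can be illustrated with an exact (brute-force) Shapley-value computation on a toy model. The three-feature linear risk score below is a hypothetical stand-in, chosen because for a linear model each feature's Shapley value recovers its coefficient times its deviation from the baseline:

```python
import itertools
import math

def shapley_values(predict, baseline, x):
    """Exact Shapley value of each feature for one prediction.

    predict maps a {index: value} dict to a score; features absent
    from a coalition are held at their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            for S in itertools.combinations(others, r):
                # Marginal contribution of feature i given coalition S.
                base = {j: baseline[j] for j in range(n)}
                without_i = {**base, **{j: x[j] for j in S}}
                with_i = {**without_i, i: x[i]}
                weight = (math.factorial(len(S))
                          * math.factorial(n - len(S) - 1)
                          / math.factorial(n))
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

# Toy risk score over three hypothetical markers (illustration only).
f = lambda v: 3.0 * v[0] + 1.0 * v[1] + 0.0 * v[2]
phi = shapley_values(f, baseline=[0.0, 0.0, 0.0], x=[1.0, 1.0, 1.0])
```

The values also satisfy the efficiency property: they sum to the difference between the prediction for the instance and the baseline prediction, which is what the SHAP force plot visualizes. The SHAP library approximates this computation efficiently for real models, where exact enumeration is intractable.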
The machine learning model's predictions are explained via the LIME technique, which creates a locally relevant model centered on the predicted point [60]. By selecting the subset of the original features that matter most for the prediction and building a basic model to explain the relationship between those features and the model's output, it produces interpretations that are understandable locally. Patients with gliomas, categorized as either Glioblastoma Multiforme (GBM) or Lower-Grade Glioma (LGG), are seen in Figure 16. From this figure it can be inferred that the green colour in the bar chart marks the attributes critical in determining a patient's likelihood of having GBM, while the red colour indicates the factors that were crucial for predicting that the patient has LGG.
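The local-surrogate idea behind LIME can be sketched from scratch: perturb samples around the instance of interest, weight them by proximity, and fit a simple weighted linear model. The black-box function, kernel width, and instance below are illustrative assumptions, not the study's model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Hypothetical black-box: a nonlinear risk score over two features.
def black_box(X):
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - X[:, 1] ** 2)))

x0 = np.array([0.5, 0.5])                          # instance to explain
Z = x0 + rng.normal(scale=0.1, size=(500, 2))      # perturbations around x0
w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.01)  # proximity kernel weights

# Weighted linear surrogate: faithful locally, invalid globally.
surrogate = LinearRegression().fit(Z, black_box(Z), sample_weight=w)
local_effects = surrogate.coef_  # per-feature local importance for x0
```

The signs of `local_effects` correspond to the green/red bars in a LIME chart: a positive coefficient pushes the prediction toward one class near this instance, a negative one toward the other.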
Eli5 is another XAI technique for interpreting and evaluating model predictions. It functions as a Python toolkit for visualizing and debugging predictions using a standard API. It facilitates investigators' understanding of black-box models and also provides functionality for multiple platforms [61]. Figure 17 depicts how strongly various factors influenced the prediction of whether a patient is diagnosed with a glioma brain tumour. The chart demonstrates that IDH1, Age at diagnosis, Race, PIK3CA, BCOR, CIC, PDGFRA and EGFR are the most important features in predicting glioma.
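One of the importance measures Eli5 exposes (through its PermutationImportance helper) is permutation importance: shuffle one feature column and measure the resulting drop in held-out accuracy. The sketch below uses scikit-learn's equivalent implementation on synthetic data, which is only a stand-in for the glioma features:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in: with shuffle=False, columns 0 and 1 are the
# informative features and the rest are noise.
X, y = make_classification(n_samples=600, n_features=6, n_informative=2,
                           n_redundant=0, shuffle=False, random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=7)

model = RandomForestClassifier(random_state=7).fit(X_tr, y_tr)

# Accuracy drop when each column is shuffled, averaged over repeats;
# this is the kind of per-feature weight Eli5 reports.
result = permutation_importance(model, X_te, y_te,
                                n_repeats=20, random_state=7)
ranking = np.argsort(result.importances_mean)[::-1]
```

A chart of `result.importances_mean` sorted by `ranking` plays the same role as Figure 17: the features whose shuffling hurts accuracy most are the ones the model relies on.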
QLattice is a quantum-mechanics-inspired machine learning platform that uses a probabilistic graphical model to find complex connections and patterns in data [62]. Several thousand potential models are examined by QLattice before selecting the model that most effectively addresses the issue at hand. The programmer must initially set up a few variables, including labels, input properties, and other variables; these variables are referred to as registers. Once the registers are defined, further models can be derived from this XAI technique. The model collection is referred to as a ''QGraph'', and these graphs are made up of nodes and edges. Every node includes an activation function, while every edge has a weight assigned to it. The QGraph is run using the features to generate important knowledge. The QLattice module in Python is implemented through the ''Feyn'' package. The QGraph is displayed in Figure 18, and Equation (a) gives an explanation of the model's transfer function as well as information about mutual information, Pearson's correlation and principal component analysis. The data indicates that the most significant factors for glioma prediction are IDH1, Age at diagnosis and IDH2. The 'add' function is also used to comprehend the results.
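QLattice itself is accessed through the Feyn package, whose API we do not reproduce here; however, the mutual-information and Pearson's-correlation measures reported alongside the QGraph can be sketched directly. The synthetic data below is an illustrative stand-in for the glioma features:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Stand-in data: with shuffle=False, columns 0 and 1 carry the signal.
X, y = make_classification(n_samples=500, n_features=5, n_informative=2,
                           n_redundant=0, shuffle=False, random_state=1)

# Mutual information between each feature and the class label:
# captures nonlinear dependence, not just linear association.
mi = mutual_info_classif(X, y, random_state=1)

# Pearson's correlation of each feature with the (binary) label:
# captures linear association only, bounded in [-1, 1].
pearson = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
```

Ranking features by these scores is the register-screening intuition: features with high mutual information or correlation with the label are the ones a QGraph is most likely to build its transfer function from.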
Comparing the XAI approaches discussed above: for SHAP, the beeswarm plot shows how all the attributes work together to predict glioma, whereas the force plot shows only the most important characteristics that enhance glioma prediction [63], [64]. SHAP can be interpreted locally as well as globally, and in comparison to the other XAI techniques it offers a wider variety of visualization plots to help comprehend the importance of each feature. In LIME, each attribute's contribution to predicting glioma can be seen.
We have created visualizations for patients who have suffered from LGG (Low-Grade Glioma) and GBM (Glioblastoma Multiforme) [65], [66]. LIME reveals the weights of the attributes; however, in this case our understanding is local rather than global, which implies that only individual patient predictions can be investigated in greater detail. The claim is supported by QLattice's demonstration of the key factors that cause glioma and by a transfer equation [67], [68]. Using its popular quantum-computing-inspired approach, QLattice trains the model to recognise predictions; however, in comparison with the other methods, it requires a large amount of processing time and resources. Eli5 shows the relative significance of each characteristic in predicting gliomas [69]. Eli5 is an incredibly effective method for tree-based models, including random forests and decision trees; as of now, however, Eli5 does not support deep learning classifiers and other baseline models.

V. DISCUSSION
This study employed machine learning to evaluate a patient's risk of either an LGG or a GBM. The dataset contained 889 patients in total. Pearson's correlation, principal component analysis and mutual information were employed in the feature selection process. The machine learning models used for prediction included Random Forest, Decision Tree, KNN, Logistic Regression, SVM (Linear, Sigmoid), Stack 1 (RF, LR, DT, KNN), AdaBoost, CatBoost, LGBM, XgBoost, Stack 2, and the Final Stack. As a preliminary decision support system, glioma tumours can be predicted using these ML models. The Final Stack model was used for prediction since it performed better than any other model. To improve our understanding of the results, four XAI methodologies (SHAP, Eli5, LIME and QLattice) were employed and contrasted.
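The stacking idea behind Stack 1 (RF, LR, DT, KNN base learners combined by a meta-learner) can be sketched with scikit-learn's StackingClassifier. The synthetic data, hyperparameters, and logistic-regression meta-learner below are placeholders, not the study's tuned configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Stand-in for the glioma dataset (889 patients, 23 features).
X, y = make_classification(n_samples=889, n_features=23, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=3)

# Base learners mirror Stack 1; the meta-learner combines their
# cross-validated predictions into the final decision.
stack = StackingClassifier(
    estimators=[('rf', RandomForestClassifier(random_state=3)),
                ('lr', LogisticRegression(max_iter=1000)),
                ('dt', DecisionTreeClassifier(random_state=3)),
                ('knn', KNeighborsClassifier())],
    final_estimator=LogisticRegression(max_iter=1000))
acc = stack.fit(X_tr, y_tr).score(X_te, y_te)
```

Because the meta-learner is trained on out-of-fold predictions of the base models, stacking can outperform any single base learner, which matches the Final Stack's position as the best predictor in this study.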
Machine learning (ML) models that evaluate medical imaging data, including MRI scans, are used to predict glioma tumours, covering both low-grade gliomas and high-grade glioblastomas (Glioblastoma Multiforme), by identifying the tumour type and stage based on features such as size, shape, and location. To improve forecasts, these models additionally take into account patient characteristics including age, gender, family history, pre-existing diseases, and genetic data. Tumour progression and possible transition into GBM are assessed by taking into account treatment history and responsiveness to medicines. Predicting the entire impact of glioma tumours requires an understanding of patient outcomes, including survival rates and quality of life. ML models enable medical personnel to make informed treatment decisions for patients with gliomas by combining information from many sources.
Glioma is the most common primary cerebral tumour, affecting about 7 out of 100,000 people globally every year [70]. Glioblastoma (GBM) is the most prevalent and malignant form of glioma; despite great efforts and significant progress over the past few decades in understanding the root causes of glioma development, there is still no remedy for GBM, and the average survival of patients with this particular diagnosis is only 12 to 15 months [70]. The authors of the Upsala Journal of Medical Sciences, one of the oldest medical publications in Sweden, have offered a medical perspective on the attributes leading to the prediction of glioma tumours, as the XAI models have interpreted the anticipated results. Large-scale genomic analyses of human glioblastoma have revealed many mutated genes. These analyses indicate that the main signaling pathways disrupted in the majority of tumours are the p53 pathway, which is primarily regulated by p53 and p14, and the receptor tyrosine kinase (RTK)/RAS/PI3K pathway, which is primarily regulated by EGFR amplification, PDGFRA overexpression or amplification, and PTEN inactivation [70]. The authors of the Chinese Medical Journal have provided a medical viewpoint on the characteristics that contribute to the likelihood of glioma tumours [71]. Genetic alterations such as the BRAFV600E mutation, FGFR1 alteration, and MYB or MYBL1 rearrangement are often observed only in low-grade gliomas, whereas high-grade gliomas include diffuse hemispheric glioma with H3G34 mutation, midline glioma with H3K27 alteration, and TP53 and ATRX mutations [72]. The New England Journal of Medicine (NEJM) reports that gliomas carry changes to a number of genes, such as PTEN, TP53, EGFR and CDKN2A. As a tumour progresses towards a high grade, these changes typically take place in a specific order. Though EGFR amplification and PTEN loss or mutation are suggestive of higher-grade tumours, the TP53 mutation appears to happen quite early in the formation of an astrocytoma [73], [74].
Several studies have employed different kinds of machine learning and deep learning models to enhance the prediction and forecasting of glioma tumours. A range of machine learning techniques was employed by Niu et al. [75], who sought to uncover the molecular pathways behind gliomas using machine learning in conjunction with protein-protein interaction networks. They identified 19 genes separating grade I from grade II, 21 genes separating grade II from grade III, and 20 genes separating grade III from grade IV. The glioma phases were then predicted using five machine learning techniques based on the chosen critical genes. After comparison, the grade II-III prediction framework was developed using a supplementary naive Bayes classifier that was 72.8% accurate. Furthermore, random forest was used in the construction of the grade I-II and grade III-IV prediction models, which yielded accuracy rates of 97.1% and 83.2%, respectively. Sun et al. [76] aimed to examine the glioma grading efficacy of commonly used radiomics feature selection and classification methods. Using MRI data, quantitative radiomics characteristics were derived from the tumour areas of 210 patients with Glioblastoma Multiforme and 75 patients with Low-grade glioma. Next, 15 feature selection and 15 classification algorithms were examined for their diagnostic performance under two test modes, ten-fold cross-validation and percentage split. To further optimize the prediction, the roles of tumour sub-region, MRI modality, feature type, and number of selected characteristics were compared. According to the results, the best performance in differentiating between LGG and GBM was obtained by integrating the multilayer perceptron classifier with the linear support vector machine feature selection method; this was observed in both percentage split (0.953, AUC: 0.981) and ten-fold cross-validation (0.944, AUC: 0.986). Bhatele et al. [77] provided an overview of the most advanced machine learning-based methods for classifying gliomas. Their suggested method was based on a hybrid feature extraction approach and a hybrid ensemble learning model, which uses the Central Pixel Neighbourhood Binary Pattern, Discrete Wavelet Decomposition and Gray Level Run Length Matrix techniques to classify glioma into two grades based on fused MRI sequences: Low-grade Glioma and High-grade Glioma [77]. This hybrid ensemble learning model, called the Improved eXtreme Gradient Boosting model, was utilized in this work. Two popular global datasets are used to evaluate the proposed method, BRATS 2013 and BRATS 2015, which include a range of MRI fusion combinations, along with a well-balanced local dataset comprising MRI images of low-grade and high-grade glioma from different MRI centres in Madhya Pradesh, India. Utilizing the suggested method, the Enhanced eXtreme Gradient Boosting ensemble model yielded an optimal accuracy of more than 90% on the local dataset with the fused T1C + T2 + FLAIR MRI sequences. Sudre et al. [78] conducted a study to assess the diagnostic value of a dynamic susceptibility contrast MRI paradigm in classifying treatment-naïve gliomas into grades II-IV and across isocitrate dehydrogenase (IDH) mutation status in a multicentre patient group. Retrospective identification was carried out on 333 individuals from 6 tertiary centres who were both molecularly and histologically diagnosed with a primary glioma tumour (IDH-mutant = 151 or IDH-wildtype = 182). Using the collected characteristics, a random-forest technique was used to estimate and predict grade or mutation status. Over 53% of gliomas were correctly categorized when the gliomas were graded, and 87% of the cases received a grade classification. Table 7 displays a deeper description of the important markers used to predict glioma, and Table 8 displays a comparison of our suggested methodology with the models that are currently in use.

VI. LIMITATIONS AND FUTURE SCOPE

A. LIMITATIONS
The quality and availability of the datasets determine how well the model performs, so it is necessary to continuously expand and improve them. Thorough testing, scalability assessments, and external validation are essential before implementation in healthcare institutions in order to guarantee robustness across a variety of clinical settings.
Given the intrinsic complexity of glioma prognosis, careful interpretation of the model's predictions is necessary even with Explainable AI for interpretability. Throughout real-world implementation, ethical issues, patient privacy, and regulatory compliance are critical and require constant attention. These elements highlight how crucial it is to translate our research into useful therapeutic applications using a methodical and cautious approach.

B. FUTURE SCOPE
The future direction of our proposed study involves an integrated approach to enhance the glioma prediction model. We will focus on incorporating advanced features, continuously expanding the dataset through collaborations with diverse medical institutions worldwide, and integrating the latest imaging modalities. Leveraging state-of-the-art deep learning algorithms and transfer learning techniques will be pivotal, especially when dealing with large datasets. Additionally, international collaboration can be sought to combine data from different countries, creating a globally representative initiative for glioma research. Implementing a cloud-based system will facilitate scalability and collaboration across geographical boundaries. Rigorous medical validation and clinical trials will be conducted, and educational programs will bridge the knowledge gap between informatics and medical experts. Ethical considerations, including patient privacy and data security, will remain paramount throughout the study. This holistic approach aims to not only advance the field of glioma prediction but also provide a valuable tool equally beneficial to both medical professionals and machine learning experts.

VII. CONCLUSION
Early detection and treatment of glioma tumours can improve the prognosis and reduce the risk of complications. Thus, we used machine learning and XAI techniques to predict glioma, classified as Low-Grade Glioma (LGG) or Glioblastoma Multiforme (GBM) based upon histological and imaging criteria. Beyond enhancing transparency, interpretability (XAI) acts as a bridge between the technical intricacies of machine learning models and real-world decision-makers. It empowers individuals to navigate complex predictions, fostering a symbiotic relationship between human judgment and AI capabilities while addressing regulatory, ethical, and practical considerations. This study's dataset comprised 889 patients with 23 features. The three feature selection techniques used were mutual information, Pearson's correlation, and the Principal Component Analysis (PCA) algorithm. The XgBoost model reached a maximum accuracy of 88%, while the stacked model attained 82% accuracy. For interpreting the model's predictions, four XAI techniques were used: SHAP, QLattice, Eli5 and LIME.
The most important factors for the prediction of glioma were determined to be IDH1, age at diagnosis, PIK3CA, ATRX, PTEN, CIC, EGFR, and TP53. Furthermore, classifier reliability was evaluated by comparing the suggested method with other relevant studies. These classifiers/models can be utilized by medical professionals as a decision support system to predict gliomas. An interface might be used to implement real-time glioma tumour screening, whereby this methodology could be used to predict glioma in a broader population.

FIGURE 1. Glioma progression depicted in a detailed process flow, emphasizing key stages and events.

FIGURE 3. The mutual information algorithm ranks traits by their relevance in a concise evaluation.

FIGURE 5.
FIGURE 4. Cumulative variation in a dataset visualized through graphical representation.

TABLE 3. Critical characteristics and key features inherent in the dataset, offering fundamental insights for analysis and interpretation.

FIGURE 6. Essential stages in the machine learning pipeline.

FIGURE 7. a) AUC curve, b) PR curve, and c) Confusion matrix for the final model.

FIGURE 8. Configurational layout defining the architecture of the artificial neural network (ANN).
Figure 10, for the same model architecture and for 30 epochs, illustrates the training and validation losses acquired for additional analysis, with a drop in the loss throughout training indicating good convergence. Similar to traditional artificial neural networks (ANNs), convolutional neural networks are composed of neurons.

FIGURE 9. Accuracy curve depicting the performance of the artificial neural network (ANN).

FIGURE 10. Loss curve illustrating the performance of the artificial neural network (ANN).

FIGURE 11. Architectural configuration delineating the structure of the convolutional neural network (CNN).

FIGURE 12. Accuracy curve illustrating the performance of the convolutional neural network (CNN).

FIGURE 13. Loss curve depicting the performance of the convolutional neural network (CNN).

FIGURE 16. Interpretation of LIME (Local Interpretable Model-agnostic Explanations) in predicting patients with glioma tumours.

FIGURE 18. Model predictions elucidated through QGraph and transfer function for a comprehensive understanding.

TABLE 1. An overview of related work.

TABLE 1. (Continued.) An overview of related work.

TABLE 2. Comprehensive overview detailing the key attributes present in the dataset.

TABLE 4. Summary of outcomes derived from the machine learning models employed in this research when applied to the test dataset.

TABLE 5. Compilation of hyperparameters utilized in the Grid Search process.

TABLE 7. An overview of markers used to predict glioma.

TABLE 7. (Continued.) An overview of markers used to predict glioma.

TABLE 8. Comparative analysis between our proposed methodology and existing approaches.