EXplainable AI for Decision Support to Obesity Comorbidities Diagnosis

This paper describes the implementation of a comprehensive clinical decision support system (CDSS) for predicting the risk factors of obesity-related comorbidities and for characterizing the indirect connections between such comorbidities and non-communicable diseases. In particular, the direct correlation between obesity and diabetes, cardiovascular disease, and heart disease is analyzed using machine learning (ML) predictive models, while the connection of these co-occurring disorders to numerous additional non-communicable diseases is analyzed via a graph-based user interface. The proposed CDSS is therefore structured around three main components: ML predictive models based on publicly available datasets, explainable artificial intelligence (XAI) local and global model interpretation, and a graph-based representation of non-communicable disease connections. Multiple ML models are presented for risk assessment, and they are compared on the basis of key performance indicators. The best-performing model for each disease proved to be the multi-layer perceptron for diabetes and heart disease, and extreme gradient boosting for cardiovascular disease. Comorbidity risk factor prediction and an XAI local model explanation are performed on significant case studies. In addition, an XAI global model interpretation is given for the entire dataset, providing insights into the features' contribution to the models. Moreover, the graph-based visualization of indirect disease co-occurrence is performed by filtering connections according to different relative risk thresholds. This interface can be exploited by healthcare professionals to obtain, according to their needs and clinical approach, a global perspective on the prevention of obesity and its associated pathologies, as well as on long-term treatment and care provision.


I. INTRODUCTION
Numerous health issues, including type 2 diabetes, dyslipidemia, cardiovascular disease, respiratory issues, and various types of cancers, are all closely correlated with obesity and overweight [1], [2], [3]. Such comorbidities raise the risk of several non-communicable diseases (NCDs) [4], which can cause mortality [5]. Due to its link to significant chronic diseases and the consequent financial burden on the health care system, the treatment and prevention of obesity constitute a key issue in the public health system.
The associate editor coordinating the review of this manuscript and approving it for publication was Shadi Alawneh.
For healthcare professionals (HCPs) working in this context, clinical decision support systems (CDSSs) can be essential tools for predicting and preventing comorbidities and the associated NCDs.
Most CDSSs are not knowledge-based: they are not programmed to adhere to medical knowledge, but need data sources to feed statistical pattern recognition or to train machine learning (ML) algorithms. Artificial intelligence (AI) is widely used in the field of decision support and specifically in obesity studies, either for cross-sectional studies, as in this paper, or for retrospective/prospective studies [6]. Numerous ML methods have been analyzed and compared in the literature, each suited to specific needs or datasets [7]. Most of the issues related to such widespread use of AI are due to data availability, heterogeneous data structures [8], and the difficulty of comprehending the logic behind most AI algorithms, which produce recommendations where the operator can access only the final decision without insight into the process (black boxes) [9].
Methods that explain how AI systems work can be incorporated into the CDSS, implementing the paradigm of explainable artificial intelligence (XAI), with the following goals for both the end user and the designer:
• enabling HCPs to evaluate whether the system's output is reliable,
• improving trust in the HCP-patient relationship,
• revealing new insights regarding what the AI system learned from the data,
• finding possible weaknesses while discovering the underlying causes of faults more efficiently,
• recognizing what the algorithm was tuned for and managing the associated choices.
In this work, we propose and thoroughly analyze the implementation of a comprehensive CDSS based on XAI (hereafter referred to as XAI-CDSS) consisting of the integration of the following three components: ML predictive models, based on publicly available datasets, for the calculation of obese subjects' risk factors for the selected comorbidities (i.e., diabetes, cardiovascular disease, and heart disease); XAI plots for the interpretation of the local prediction and of the global model; and an interactive multi-node graph interface, fundamental to highlight indirect links between obesity, the selected comorbidities, and NCDs.

A. DATA DESCRIPTION
Data for pathologies such as diabetes, cardiovascular disease, and heart disease were extracted from public datasets, as documented in this section. In selecting these datasets, attention was focused on those including features that characterize the obesity condition. In particular, the body mass index (BMI) was considered and, when missing, the height and weight needed for its calculation. An increasing BMI is indeed associated with the development of numerous comorbidities [5]. The percentage of the overweight and obese categories, based on the BMI feature, was calculated for each comorbidity dataset in order to verify its suitability to the scope of the work.
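The BMI derivation and the category percentages mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the cut-offs are the commonly used WHO adult thresholds (an assumption, since the text does not state them), and the records are hypothetical.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Assumed WHO adult cut-offs; the source does not state its thresholds."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal"
    if value < 30.0:
        return "overweight"
    return "obese"


def category_share(bmis, category: str) -> float:
    """Percentage of records falling in the given BMI category."""
    hits = sum(1 for b in bmis if bmi_category(b) == category)
    return 100.0 * hits / len(bmis)


# Hypothetical (weight in kg, height in cm) records, not taken from any dataset.
records = [(60, 170), (80, 170), (95, 170), (115, 170)]
bmis = [bmi(w, h) for w, h in records]
overweight_pct = category_share(bmis, "overweight")
obese_pct = category_share(bmis, "obese")
```

Applying `category_share` to each comorbidity dataset would give percentages of the kind reported later for the overweight and obese groups.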

2) CARDIOVASCULAR DISEASE (CVD)
The dataset from Kaggle [11] consists of 70,000 records of patient data, 11 features, and the binary target. The included features are objective (age, height, weight, gender), from examination (systolic blood pressure, diastolic blood pressure, cholesterol, glucose), subjective (smoking, alcohol intake, physical activity), and the binary target (presence of cardiovascular diseases). Data were collected simultaneously with the medical examination.

3) HEART DISEASE (HD)
Data come from the 2020 annual CDC survey data of 400,000 adults related to their health status. A selection of 18 features out of the 279 original ones was extracted from the Kaggle dataset [12]. The included features are the following: BMI, smoking, alcohol drinking, stroke, physical health, mental health, difficulty in walking, gender, age category, race, diabetes, physical activity, generic health, sleep time, asthma, kidney disease, skin cancer, and the target binary feature (heart disease).

4) DISEASE CO-OCCURRENCE GRAPH
The data underlying the long-term disease co-occurrence graph come from the Danish Disease Trajectory Browser (DTB) [13] and were downloaded from [14]. The DTB explores 25-year longitudinal, population-wide disease progression patterns covering the entire population of Denmark. The trajectories that include diseases classified by the ICD-10 [15] as cardiovascular (I48-Atrial fibrillation and flutter), heart disease (I51-Complications and ill-defined descriptions of heart disease), and diabetes (E14-Unspecified diabetes mellitus) were extracted in JSON format, including node and edge parameters. The nodes represent the ICD10 code of each pathology, and the edges are the directional connections between pairs of diseases that were developed by the same subject within a 5-year time horizon. For each path going from a selected comorbidity to an NCD, this information includes the number of patients following the path and the associated relative risk (RR), which compares the risk of a disease in one group with the risk in another group.
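As a reminder of what an RR value expresses, the textbook definition can be sketched as below. The counts are hypothetical, and the DTB may compute its RR with a different matching or sampling procedure; this is only the generic ratio of two risks.

```python
def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Ratio between the risk in the exposed and in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical counts: 30 of 1000 subjects with disease A later develop
# disease B, against 12 of 1000 subjects without disease A.
rr = relative_risk(30, 1000, 12, 1000)
```

An RR above 1 indicates that the second disease is more frequent among subjects who already have the first one.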

B. PRE-PROCESSING AND FEATURE ENGINEERING
Feature selection was based on:
• removal of features with missing values,
• exploratory data analysis (EDA), for dataset analysis and selection of the most significant features using statistical graphics and other methods, including Principal Component Analysis to exclude elements with low variance significance, heat maps to exclude elements with high correlation, and cluster analysis for group identification,
• variance threshold [16], to remove features with variance lower than 0.05,
• codebook study, to remove the characteristics with no valid answers (e.g. 'ever told') in the description,
• experts' opinions, to select relevant information based on the enrollment protocol,
• class balancing based on the target feature.
A one-hot encoding procedure was used to encode categorical variables, creating a binary column for each category, as required to feed categorical data to ML models. Moreover, numerical variables were scaled using a min-max scaler (for the heart disease dataset) and a standard scaler (for the diabetes and cardiovascular disease datasets) in order to adapt the datasets to ML algorithms, which tend to perform better on data covering small ranges. The specific scaler was chosen based on the performance of each model.
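The encoding, scaling, and variance-filtering steps above can be illustrated with plain-Python equivalents. This is a sketch of the underlying formulas rather than the scikit-learn transformers used in the paper; the column names and values are hypothetical.

```python
from statistics import mean, pstdev, pvariance


def one_hot(values):
    """Return one binary column per category, plus the category order."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def min_max_scale(values):
    """Rescale to [0, 1]; assumes the column is not constant."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Zero mean, unit (population) standard deviation; assumes a non-constant column."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def low_variance_columns(columns, threshold=0.05):
    """Names of columns whose variance is lower than the threshold."""
    return [name for name, col in columns.items() if pvariance(col) < threshold]


# Hypothetical columns, for illustration only.
columns = {"bmi": [22.0, 31.0, 27.5, 40.0], "constant_flag": [1, 1, 1, 1]}
to_drop = low_variance_columns(columns)
encoded, order = one_hot(["a", "b", "a"])
```

Here `to_drop` contains only the constant column, since its variance is zero.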

1) DIABETES DATASET
The first step of data cleaning was performed by removing 24,206 duplicate lines and outliers based on statistical information; no observations with missing values were found. A target class imbalance was detected in the original dataset (84.7% healthy, 15.3% diabetic). A random undersampling procedure was carried out by eliminating records from the class "healthy" in order to restore the target class balance. Based on the feature engineering analysis, Table 1 shows the features selected to train the predictive algorithm. The percentages of the overweight and obese categories, based on the BMI feature, were 36.37% and 41.55%, respectively.
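Random undersampling of the majority class can be sketched as follows. The function name, the row layout, and the seed are illustrative assumptions; the paper does not specify its implementation.

```python
import random


def random_undersample(rows, label_of, seed=0):
    """Randomly drop rows so that every class keeps as many rows as the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n_min))
    rng.shuffle(balanced)  # avoid leaving the classes in contiguous blocks
    return balanced


# Hypothetical dataset: 15 positive and 85 negative records.
rows = [{"id": i, "target": 1 if i < 15 else 0} for i in range(100)]
balanced = random_undersample(rows, lambda r: r["target"])
```

After the call, both classes contain 15 records, at the cost of discarding most of the majority class.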

2) CARDIOVASCULAR DISEASE DATASET
The first step of data cleaning was performed by removing 3,208 duplicate lines and outliers based on statistical information; no observations with missing values were found. The target variable is approximately balanced (48.8% healthy, 51.2% cardiovascular disease). Based on the feature engineering analysis, Table 2 shows the features selected to train the predictive algorithm. Moreover, the BMI feature was obtained indirectly from the 'weight' and 'height' feature columns as the weight (kg) divided by the height (m) squared. The percentages of the overweight and obese categories, based on the BMI feature, were 35.97% and 28.68%, respectively.

3) HEART DISEASE DATASET
The first step of data cleaning was performed by removing 27,315 duplicate lines and outliers based on statistical information; no observations with missing values were found. A target class imbalance was detected in the original dataset (90.8% healthy, 9.2% heart disease). A random undersampling procedure was carried out by eliminating records from the class "healthy" in order to restore the target class balance. Based on the feature engineering analysis, Table 3 shows the features selected to train the predictive algorithm. The percentages of the overweight and obese categories, based on the BMI feature, were 37.34% and 33.95%, respectively.

C. PREDICTIVE MODELS
ML algorithms for predictive modeling were applied to the pre-processed datasets in order to assess the risk factors associated with the selected obesity comorbidities. A binary classification (positive or negative for the pathology) was performed based on the risk factor (respectively higher or lower than 50%). The predictive analysis consisted of applying different ML algorithms: Multi-Layer Perceptron (MLP), Extreme Gradient Boosting (XGB), Logistic Regression (LR), Nearest Neighbors (NN), Random Forest (RF), Decision Tree (DT), and Linear Support Vector Machine (LSV). The performances of the analyzed algorithms are presented for each pathology in Section III-A. The pre-processed dataset was divided into training and testing datasets by random sampling: 75% of the data was used for training, and the remaining 25% was held out for testing. Furthermore, the models were calibrated on the training dataset by hyperparameter tuning using the grid-search procedure [17]. In particular, Sklearn GridSearchCV [16] was used: it includes an internal k-fold cross-validation technique to calculate the score for each combination of parameters on the grid.
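The split, grid search, and 50% decision threshold described above can be sketched with scikit-learn as follows. The synthetic data, the parameter grid, and the accuracy scoring are assumptions made for illustration; the values actually selected are those reported in the paper's tables.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data; the paper's datasets are not reproduced here.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)

# 75% of the data for training, 25% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical grid; each combination is scored with 5-fold cross-validation.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=5,
    scoring="accuracy",  # assumed metric
)
search.fit(X_train, y_train)

# Predicted risk of the positive class, thresholded at 50%.
risk = search.predict_proba(X_test)[:, 1]
predicted_positive = risk > 0.5
test_accuracy = (predicted_positive == (y_test == 1)).mean()
```

The same pattern applies to the other classifiers by swapping the estimator and its grid; `search.best_params_` holds the selected combination.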

1) MULTI-LAYER PERCEPTRON (MLP)
The MLP, which belongs to the family of artificial neural networks, was implemented using the Sklearn package [18]. The grid-search procedure, including a k-fold (k=5) cross-validation for model optimization and hyperparameter tuning, led to the selection in Table 4, where:
• Random state controls the randomness of the estimator, in order to ensure the reproducibility of the results.
• Max iter specifies the maximum number of iterations allowed for the optimization algorithm to converge. For stochastic solvers ("sgd", "adam") this parameter represents the number of training epochs.
• Learning rate init expresses the initial learning rate value. It controls the step size in the weight update and is used only with the solver types "sgd" or "adam".
• Learning rate indicates the learning rate schedule applied to weight updates.
• Hidden layer sizes is a tuple in which the i-th value represents the number of neurons in the i-th hidden layer. In this case, a different hidden layer size was provided for each of the datasets.
• Activation represents the activation function applied to the hidden layer neurons. 'Relu' is the rectified linear unit function, which returns f(x) = max(0, x).

2) EXTREME GRADIENT BOOST (XGB)
The XGB model was implemented using the XGB Classifier of the xgboost package [19]. The grid-search procedure, including a k-fold (k=5) cross-validation for model optimization and hyperparameter tuning, led to the selection in Table 5, where:
• Random state is the random number seed of the estimator; it ensures the reproducibility of the results.
• Learning rate is a parameter that controls the step size at which the algorithm updates the weights of the model.
• N. estimators describes the number of trees to be used to implement the boosting technique.
• Tree method specifies the XGB tree construction algorithm to be used.

3) LOGISTIC REGRESSION (LR)
The LR model was implemented using the LR Classifier of the sklearn package [16]. The grid-search procedure, including a k-fold (k=5) cross-validation for model optimization and hyperparameter tuning, led to the selection in Table 6, where:
• Random state controls the randomness of the estimator, in order to ensure the reproducibility of the results.
• Solver specifies the algorithm to be used for parameter optimization.
• C is a positive float value, which defines the inverse of the regularization strength. Smaller values specify a stronger regularization.
107770 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
• Max Iter specifies the maximum number of iterations to be performed.
• N Jobs specifies the number of parallel jobs to run. If set to -1, all CPUs are used.

4) NEAREST NEIGHBORHOOD (NN)
The NN model was implemented using the NN Classifier of the sklearn package [16]. The grid-search procedure, including a k-fold (k=5) cross-validation for model optimization and hyperparameter tuning, led to the selection in Table 7, where:
• Random state controls the randomness of the estimator, in order to ensure the reproducibility of the results.
• Weights describes the weight function to be used for predictions. The value "uniform" means that all nearby points are assigned equal weights, while "distance" assigns each neighbor a weight equal to the inverse of its distance.
• p is the parameter used in the Minkowski distance. The value p = 1 produces a metric equivalent to the Manhattan distance, while the value p = 2 yields the Euclidean distance.
• N Neighbors specifies the number of neighbors k.
• N Jobs specifies the number of parallel jobs to run for finding nearby points. If set to -1, all CPUs are used.
• Algorithm determines the algorithm used to calculate the nearest points.
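The roles of p and of the weight function can be illustrated with a short sketch. The points and neighbor lists are hypothetical, and the voting helper is a simplified stand-in for what a nearest-neighbors classifier does internally.

```python
def minkowski(a, b, p):
    """Minkowski distance: p = 1 gives Manhattan, p = 2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def weighted_vote(neighbors, weights="uniform"):
    """Pick the label with the largest total weight.

    `neighbors` is a list of (distance, label) pairs with distance > 0.
    """
    votes = {}
    for dist, label in neighbors:
        w = 1.0 if weights == "uniform" else 1.0 / dist
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)


u, v = (0.0, 0.0), (3.0, 4.0)
manhattan = minkowski(u, v, 1)  # |3| + |4|
euclidean = minkowski(u, v, 2)  # sqrt(3^2 + 4^2)

# One close neighbor of class 1 against two farther neighbors of class 0.
neighbors = [(0.5, 1), (2.0, 0), (2.5, 0)]
uniform_label = weighted_vote(neighbors, "uniform")
distance_label = weighted_vote(neighbors, "distance")
```

With uniform weights the two farther neighbors win the vote, while with inverse-distance weights the single close neighbor dominates.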

5) RANDOM FOREST (RF)
The RF model was implemented using the RF Classifier of the sklearn package [16]. The grid-search procedure, including a k-fold (k=5) cross-validation for model optimization and hyperparameter tuning, led to the selection in Table 8, where:
• Random state controls the randomness of the estimator, in order to ensure the reproducibility of the results.
• Criterion represents the criterion used to evaluate the quality of splits. Two examples are the Gini impurity index and the entropy measurement.
• Class Weights determines the weights assigned to the different classes during training. In particular, if class_weight is equal to "balanced", the target label values y are used to automatically adjust the weights so that they are inversely proportional to the frequency of samples in the different classes in the input data.
• Max Features represents the number of features to consider when looking for the best split.
• N Estimators describes the number of trees in the forest.
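The "balanced" class-weight heuristic mentioned above follows the formula n_samples / (n_classes × class count), as documented by scikit-learn. A plain-Python sketch with hypothetical labels:

```python
from collections import Counter


def balanced_class_weights(y):
    """Weights inversely proportional to class frequency."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical target with 85 healthy (0) and 15 diseased (1) records.
y = [0] * 85 + [1] * 15
weights = balanced_class_weights(y)
```

The minority class receives the larger weight, so its errors count more during training.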

6) DECISION TREE (DT)
The DT model was implemented using the DT Classifier of the sklearn package [16]. The grid-search procedure, including a k-fold (k=5) cross-validation for model optimization and hyperparameter tuning, led to the selection in Table 9, where:
• Random state controls the randomness of the estimator, in order to ensure the reproducibility of the results.
• Criterion represents the criterion used to evaluate the quality of splits. Two examples are the Gini impurity index and the entropy measurement.
• Class Weight determines the weights assigned to the different classes during training. In particular, if class_weight is "balanced", the target label values y are used to automatically adjust the weights so that they are inversely proportional to the frequency of samples in the different classes in the input data.
• Splitter represents the strategy used to choose the split at each node.

7) LINEAR SUPPORT VECTOR (LSV)
The LSV model was implemented using the LSV Classifier of the sklearn package [16]. The grid-search procedure, including a k-fold (k=5) cross-validation for model optimization and hyperparameter tuning, led to the selection in Table 10, where:
• Random state controls the randomness of the estimator, in order to ensure the reproducibility of the results.
• C is a positive float value, which defines the inverse of the regularization strength. Smaller values specify a stronger regularization.
• Max Iter specifies the maximum number of iterations to be performed.

D. EXPLAINABLE ARTIFICIAL INTELLIGENCE (XAI)
The XAI interface was integrated into the CDSS to enhance the understanding of how the ML algorithm provides predictions. This integration aims to boost the confidence of healthcare professionals (HCPs) in utilizing this technology for clinical decision-making. It is based on Shapley Additive eXplanations (SHAP) [20], which builds on cooperative game theory by matching each input feature of the model with a player of the game, and the model function with the rules of the game. The returned Shapley value represents the contribution that each player (feature) gives, through its participation in any possible coalition, to the expected value of the game result. These Shapley values sum up to the difference between the result of the game when all players are present, i.e. the output of the current model $f(x)$, and the result of the game when no player is present, i.e. the output of the baseline model. The XAI tool was used for both local and global model explanations. For the local explanation, the models were tested on specific case study subjects. A waterfall graph is returned to the user, along with the risk factor prediction, as a local feature importance plot, where the bars represent the SHAP values for each feature. The global explanation was obtained as a bar plot where the global importance of each feature is the mean absolute SHAP value for that feature over all the given samples in the training dataset.
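The additive property described above can be checked on a toy model by computing exact Shapley values over all coalitions. This is a brute-force sketch under stated assumptions: the value of a coalition is taken as the average model output when the coalition's features are fixed to the subject's values and the others are drawn from a background set. The SHAP library relies on faster estimators, and the toy model and data are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, background):
    """Exact Shapley values of `model` at `x`, plus the baseline output."""
    n = len(x)

    def value(coalition):
        # Average output with coalition features fixed to x, others from background.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in coalition else row[i] for i in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return phi, value(set())


# Hypothetical linear toy model and background data.
def toy_model(z):
    return 0.5 * z[0] + 0.3 * z[1] - 0.2 * z[2]


background = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]
x = [3.0, 1.0, 2.0]
phi, baseline = shapley_values(toy_model, x, background)
```

The sum of `phi` equals `toy_model(x) - baseline`, which is the decomposition that a waterfall plot displays bar by bar.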

E. DISEASE CO-OCCURRENCE GRAPH VISUALIZATION
The proposed XAI-CDSS incorporates a user-friendly multinode graph-based visualization tool.This tool enables HCPs to interactively visualize and analyze the directional connections, referred to as paths or links, between the selected obesity comorbidities (diabetes, cardiovascular, and heart disease) and the designated non-communicable diseases (NCDs).Its integration aims to provide HCPs with an intuitive interface for exploring and characterizing these connections.
A new compatible graph implementation, based on the data in Section II-A4, has been developed in order to be integrated into the proposed XAI-CDSS. The following Python packages have been used: NetworkX [21] to build the graph and pyvis [22] for an interactive view. Nodes are characterized by two attributes: 'label', which indicates the ICD10 code of the pathology, and 'group', which represents, through color, the source node and the degree of connection. The edges have the following attributes: 'label', which indicates the RR, and 'weight', which indicates the number of patients following the path from pathology 1 to pathology 2 and is graphically converted into the link width. In particular, the edges connecting the obesity node to the three selected comorbidities show a weighted relative risk $RR_w$, based on the predicted risk $f(x)$ resulting from the model prediction and on the absolute RR based on data. It is therefore defined as $RR_w = f(x) \cdot RR$.
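A minimal NetworkX sketch of this graph structure is given below. The node and edge attribute names follow the text, while the group names, RR values, patient counts, predicted risks, and NCD target codes are hypothetical placeholders. The interactive pyvis rendering is omitted.

```python
import networkx as nx


def build_comorbidity_graph(predicted_risk, comorbidity_rr, ncd_edges):
    """Directed graph: obesity -> comorbidities (weighted RR) -> NCDs (RR from data)."""
    graph = nx.DiGraph()
    graph.add_node("obesity", label="obesity", group="source")
    for code, rr in comorbidity_rr.items():
        graph.add_node(code, label=code, group="comorbidity")
        rr_w = predicted_risk[code] * rr  # RR_w = f(x) * RR
        graph.add_edge("obesity", code, label=round(rr_w, 2))
    for source, target, rr, n_patients in ncd_edges:
        if target not in graph:
            graph.add_node(target, label=target, group="ncd")
        # 'weight' is the number of patients on the path, drawn as link width.
        graph.add_edge(source, target, label=rr, weight=n_patients)
    return graph


# Hypothetical inputs: predicted risks f(x), data-based RR, and NCD edges.
predicted_risk = {"E14": 0.85, "I48": 0.78, "I51": 0.61}
comorbidity_rr = {"E14": 2.0, "I48": 1.5, "I51": 1.2}
ncd_edges = [("E14", "N18", 2.7, 1200), ("I48", "I50", 3.1, 900)]
graph = build_comorbidity_graph(predicted_risk, comorbidity_rr, ncd_edges)
```

Because the obesity edges depend on $f(x)$, the graph is rebuilt for each patient, whereas the NCD edges stay the same across patients.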

III. RESULTS
This section presents the findings of the proposed XAI-CDSS in a comprehensive cross-sectional study, focusing on the selected pathologies associated with obesity: diabetes, cardiovascular disease, and heart disease. First, the performances of the predictive models are presented. Second, three case studies are analyzed, showing the prediction outcome and the local XAI explanation. The global model explanation is then provided, presenting the feature importance for prediction across the entire analyzed population. Finally, the integrated graph-based visualization of the NCD connections to the selected comorbidities is shown.

A. MODEL PERFORMANCES
The predictive models in Section II-C were trained on pre-processed data as described in Sections II-A and II-B. The performance of the different models on the test dataset is reported in Tables 11, 12, and 13 for the three comorbidities. The discrepancy between the models' performance indicators on the training and test datasets was also calculated: the performance in the test phase decreased by between 1% and 15%, indicating a low risk of overfitting. Moreover, given the difficulty in finding new, original, and more controlled medical datasets, such results are considered satisfactory as a proof of concept for the developed methodologies; they are encouraging for future application to further studies and pathology associations.
The confusion matrices are presented for the MLP models in Figures 1, 2, and 3 and for the XGB models in Figures 4, 5, and 6, as these models achieve the best accuracy for the considered pathologies. Given the comparison among the predictive model performances, the MLP was applied to diabetes and heart disease, and the XGB to cardiovascular disease, due to their superior accuracy compared to the other considered predictive models.
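Commonly used indicators can be derived from a binary confusion matrix as sketched below, together with a train-to-test drop. The counts and scores are hypothetical, the indicators are standard choices rather than the paper's confirmed list, and treating the drop as a relative percentage is an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def relative_drop(train_score, test_score):
    """Percentage decrease from the training score to the test score."""
    return 100.0 * (train_score - test_score) / train_score


# Hypothetical counts and scores, for illustration only.
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=90)
drop_pct = relative_drop(train_score=0.90, test_score=0.81)
```

A small drop suggests that the model generalizes beyond its training data, which is the argument made above about overfitting.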

B. LOCAL MODEL EXPLANATION
Three case studies are presented here: Case Study 1 is a subject predicted positive for all three correlated pathologies; Case Study 2 is a subject predicted negative for all three correlated pathologies; Case Study 3 is a subject predicted positive for two comorbidities out of three. For each case, the results shown to the user are presented in terms of probability and XAI local explanation.

1) CASE STUDY 1: PATIENT POSITIVE TO ALL DISEASES
The subject presented is an obese subject with the following features: age: 61 years old, gender: male, height: 170 cm, weight: 115 kg, systolic pressure: 120 mmHg, diastolic pressure: 80 mmHg, cholesterol: 235 mg/dl (normal), physical activity: active, difficulty in walking: no, cardiac disease/attack: yes, education: some college or technical school. The risk factors given by the predictive models are 84.99% for developing diabetes, 77.65% for developing cardiovascular disease, and 61.43% for developing heart disease. Figure 7 (a)(b)(c) shows the XAI waterfall diagrams with the local explanation of the predicted risk factor for each pathology. For diabetes, the subject has a high risk (84.99%) of developing the pathology. High BMI, high blood pressure, education level (some college or technical school), and high cholesterol contribute significantly to this high risk. Age and past heart disease/attack push the prediction toward diabetes with a moderate contribution. The male gender and the absence of difficulty in walking give a slightly negative contribution to the prediction.
For cardiovascular disease, the subject has a high risk (77.65%) of developing the pathology. High BMI is the main contributor to this high risk. Age, above-normal cholesterol, systolic and diastolic blood pressure, and height also go in the same direction, with a lower contribution. Weight, diastolic blood pressure, physical activity, and male gender have a negative contribution to the prediction.
For heart disease, the subject has a moderately high risk (61.43%) of developing the pathology. The XAI waterfall diagram in Figure 7(c) gives an idea of how each feature contributed to the prediction, allowing the HCP to decide whether the model response is reliable. BMI and male gender are the main contributors to the risk of developing the disease. Age also goes in the same direction, with a lower contribution. The absence of difficulty in walking gives a slightly negative contribution to the prediction.

2) CASE STUDY 2: PATIENT NEGATIVE TO ALL DISEASES
The subject presented is an obese subject with the following features: age: 44 years old, gender: male, height: 160 cm, weight: 100 kg, systolic pressure: 120 mmHg, diastolic pressure: 80 mmHg, cholesterol: 130 mg/dl, physical activity: active, difficulty in walking: no, cardiac disease/attack: no, education: graduate. The probabilities given by the predictive models are 50.29% for not developing diabetes, 71.79% for not developing cardiovascular disease, and 84.97% for not developing heart disease.
The subject's probability of not developing diabetes is borderline, very close to the 50% threshold. The XAI waterfall diagrams in Figure 8 (a)(b)(c) are therefore significant for the HCP in assigning the proper meaning and trust to the prediction, giving an idea of how each feature contributed to the model results.
For diabetes, the subject has a borderline probability (50.29%) of not developing the pathology. Young age (44 years), not having high cholesterol, and college graduate education are the main contributors to the prediction of not developing the considered pathology. Moderate contributors in the same direction are: not having a past heart disease/attack, the female gender, and not having difficulty walking. The high BMI and high blood pressure push toward the development of diabetes.
Systolic blood pressure, age, and normal cholesterol are the main contributors to the probability of the subject being healthy with respect to cardiovascular disease. The high BMI and the diastolic blood pressure push instead toward the prediction of developing the pathology. The other features provide very low contributions to the prediction.
For heart disease, the subject has a high probability (84.97%) of not developing the pathology. The young age (40-44 years) is the main contributor to such a favorable prediction. The female gender and the absence of difficulty in walking push slightly toward a healthy subject. High BMI is instead a negative contributor to the prediction.

3) CASE STUDY 3: PATIENT WITH DIFFERENT PREDICTIONS FOR THE COMORBIDITIES
The subject presented is an obese subject with the following features: age: 50 years old, gender: female, height: 165 cm, weight: 99 kg, systolic pressure: 150 mmHg, diastolic pressure: 110 mmHg, cholesterol: 150 mg/dl, physical activity: inactive, difficulty in walking: no, cardiac disease/attack: no, education: some high school. The predictions given by the models are a 77.59% risk of developing diabetes, a 90.03% risk of developing cardiovascular disease, and a 75.86% probability of not developing heart disease. The XAI waterfall diagrams in Figure 9 (a)(b)(c) are significant for the HCP in assigning the proper meaning and trust to the prediction, giving an idea of how each feature contributed to the model results. For diabetes, the subject has a high risk (77.59%) of developing the pathology. High blood pressure, education level (some high school), and high BMI contribute positively to this high risk. Normal cholesterol, the age range (50-54 years), no difficulty in walking, and no past heart disease push toward a prediction of a healthy subject.
For cardiovascular disease, the subject has a high risk (90.03%) of developing the pathology. High systolic and diastolic blood pressure and no physical activity are the main contributors to the high probability of developing the disease. Normal cholesterol moderately pushes toward the prediction of a healthy subject.
For heart disease, the subject has a high probability (75.86%) of not developing the pathology. A relatively young age (50-54 years), the female gender, and the absence of difficulty in walking are the main contributors to such a favorable prediction. High BMI gives instead a negative contribution to the prediction.

C. GLOBAL MODEL EXPLANATION
The XAI bees warm test, Figure 10, performed on the training dataset, gives a global view on how each feature value contributes in terms of signed Shapley value to the pathology risk factor calculation for each model.It represents a scatter plot including cross-information on the normalized feature values and the correspondent signed Shapley value.The continuous color scale represents the feature values normalized in the high and low range (high: red, low: blue); the position of each point on the x-axes, represents the Shapley contribution for each data point.Such visualization can give important scientific and clinical insights into the pathologies, but it also could give important information to the machine learning algorithm designer in order to understand how to choose features and find algorithm faults.
Figure 10(a) shows the global XAI for diabetes prediction using the MLP model. Significantly negative Shapley values are associated with low values (blue) of the features "high blood pressure", "age", "high cholesterol", and "BMI", which therefore have a strong impact on the prediction of healthy subjects. High values of "high blood pressure" and "high cholesterol" push toward diabetes prediction with moderate impact. According to this global explanation, the main net contributors to diabetes prediction are high values of the features "age" and "BMI", and the presence of "difficulty walking" and "heart disease/attack". A further outcome of the global XAI is that the feature "education", which comes from the socio-cultural context, gives an interesting contribution to the prediction: the model associates a low education level with the development of diabetes, while a high education level pushes toward a healthy prediction.

Figure 10(b) shows the global XAI for cardiovascular disease prediction using the XGB model. Significantly negative Shapley values are associated with low values (blue) of the features "systolic blood pressure", "age", "diastolic blood pressure", and "weight", which have a strong impact on the prediction of healthy subjects. High values of "systolic blood pressure", "age", "cholesterol", "BMI", "diastolic blood pressure", and "weight" push toward cardiovascular disease prediction. According to this global explanation, the main net contributors to cardiovascular disease prediction are high values of the features "systolic blood pressure", "diastolic blood pressure", and "cholesterol", while "height", "physical activity", and "gender" are not significant, since their contribution to the prediction is ambiguous.

Figure 10(c) shows the global XAI for heart disease prediction using the MLP model. Significantly negative Shapley values are associated with low values (blue) of the feature "age", which has a strong impact on the prediction of healthy subjects. According to this global explanation, the main net contributors to heart disease prediction are high values of the feature "age", the presence of "difficulty walking", and high values of "BMI". The feature "gender" clearly indicates that male subjects are more prone to heart disease.
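The reading of the beeswarm plots given above (which features matter most, and whether high or low values push the prediction up or down) can also be approximated numerically. The sketch below is a hypothetical illustration: the feature names, the feature matrix `X`, and the Shapley matrix `phi` are invented toy data, and splitting each feature at the midpoint of its range is an assumption of this sketch, not a step described in the paper.

```python
from statistics import mean


def global_summary(feature_names, X, phi):
    """Rank features by mean |Shapley value| and report the mean signed
    contribution for the high and low halves of each feature's range."""
    summary = []
    for j, name in enumerate(feature_names):
        values = [row[j] for row in X]
        contribs = [row[j] for row in phi]
        mid = (min(values) + max(values)) / 2
        high = [c for v, c in zip(values, contribs) if v >= mid]
        low = [c for v, c in zip(values, contribs) if v < mid]
        summary.append({
            "feature": name,
            "importance": mean(abs(c) for c in contribs),
            "mean_phi_high": mean(high) if high else 0.0,
            "mean_phi_low": mean(low) if low else 0.0,
        })
    return sorted(summary, key=lambda s: s["importance"], reverse=True)


# Toy data (invented): 4 subjects, 3 features, one Shapley value per cell.
names = ["age", "BMI", "height"]
X = [[30, 22, 170], [45, 27, 165], [60, 31, 180], [72, 35, 175]]
phi = [[-0.20, -0.10, 0.01],
       [-0.05, 0.02, -0.01],
       [0.15, 0.08, -0.02],
       [0.25, 0.12, 0.02]]

result = global_summary(names, X, phi)
for row in result:
    print(row)
```

With this toy data, "age" ranks first, with positive mean contributions for high values and negative ones for low values, whereas "height" has contributions that average out to zero in both halves. That is the kind of ambiguous pattern described above for features that are not significant.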

D. GRAPH-BASED VISUALIZATION OF COMORBIDITY
As proven by disease trajectory applications in medical research [23], [24], [25], the multi-node graph user interface is an important support for the HCPs and the patient, providing a global view of the development of further pathologies connected to the current obesity condition. The ICD10 code legend is fundamental for an immediate interpretation of this visual and interactive interface. The graph can be read starting from the obesity node (orange). The paths from the obesity node to the selected comorbidity nodes (yellow), i.e., diabetes, cardiovascular disease, and heart disease, are labeled with an RR weighted as a function of the predicted risk factor f(x) for each disease. The paths related to the other NCD connections are labeled with the RR coming from the dataset, as described in Section II-A. The width of each connection is weighted by the number of patients included in that path, i.e., patients who developed both connected diseases within a time horizon of 5 years. According to the case study and the HCP approach, the visualization of nodes and paths can be filtered by selecting the most significant connections with high RR or by visualizing the low RR connections.

Figure 11 represents the comorbidity graph filtered according to the RR values of the edges. Figure 11(a) includes the diseases connected to the obesity comorbidities with a high probability (edges with RR ≥ 2.5) and gives the HCP a first view of the pathologies that might have a high impact on the patient's health and must therefore be addressed in short-term prevention and treatment. Figure 11(b) and (c) include, respectively, connections with an intermediate range of RR (edges with 1.5 ≤ RR ≤ 2.5) and with a low RR (edges with RR ≤ 1.5). In the latter, given the high number of connections, the extended names of the diseases are not shown and can be found in the ICD10 code [15] legend. These graphs can guide the HCP in medium/long-term prevention and treatment.
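The RR-based filtering and the patient-weighted edge widths described above can be sketched as follows. This is a minimal illustration under stated assumptions: the RR values and patient counts in `EDGES` are invented, the helper names are hypothetical, and half-open bands (RR ≥ 2.5, 1.5 ≤ RR < 2.5, RR < 1.5) are used so that each edge falls into exactly one band.

```python
# Hypothetical comorbidity edges: the ICD-10 codes exist, but the RR values
# and patient counts are invented for illustration only.
EDGES = [
    {"source": "E66", "target": "E11", "rr": 3.1, "patients": 1200},
    {"source": "E66", "target": "I10", "rr": 2.6, "patients": 900},
    {"source": "E11", "target": "I25", "rr": 1.9, "patients": 400},
    {"source": "I10", "target": "G47", "rr": 1.2, "patients": 150},
    {"source": "E11", "target": "M17", "rr": 1.4, "patients": 80},
]


def rr_band(rr, low=1.5, high=2.5):
    """Assign an edge to one RR band (half-open intervals, an assumption)."""
    if rr >= high:
        return "high"
    if rr >= low:
        return "intermediate"
    return "low"


def filter_edges(edges, band):
    """Keep only the edges whose RR falls in the requested band."""
    return [e for e in edges if rr_band(e["rr"]) == band]


def edge_widths(edges, min_width=1.0, max_width=8.0):
    """Scale drawing widths linearly with the number of patients per edge."""
    counts = [e["patients"] for e in edges]
    lo, hi = min(counts), max(counts)
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return [min_width + (max_width - min_width) * (c - lo) / span for c in counts]


high_rr = filter_edges(EDGES, "high")
mid_rr = filter_edges(EDGES, "intermediate")
low_rr = filter_edges(EDGES, "low")
widths = edge_widths(EDGES)
```

Each filtered list corresponds to one view of the graph, in the spirit of the three panels of Figure 11, and `widths` could be passed to any graph-drawing library as the edge thickness.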

IV. CONCLUSION
This work presents the development of an XAI-CDSS, a decision support system for predicting comorbidities and non-communicable diseases associated with obesity. As a first step, multiple machine learning predictive models were implemented to determine the risk factors for diabetes, cardiovascular disease, and heart disease in association with obesity. The best-performing models, MLP for diabetes and heart disease and XGB for cardiovascular disease, were chosen on the basis of a performance comparison in terms of accuracy, precision, recall, F1 score, ROC AUC, and confusion matrices. In addition, XAI plots produced with SHAP were used to explain how the best-performing models determined their predictions, showing how each feature contributed. Specific case studies were presented to test the model predictions and to visualize the local XAI model explanation. Moreover, global SHAP beeswarm plots were presented to interpret the impact of each feature value on the prediction, giving an interesting insight into how ML extracts meaning from data. Finally, a graph-based interface was developed and integrated into the XAI-CDSS to give the HCP a global view of the impact that obesity could have on patient health over different time horizons, showing the connections and the associated relative risk of developing further pathologies that are not directly associable with obesity or with its direct comorbidities. The developed XAI-CDSS represents a valid support to clinicians for predicting widespread pathologies correlated directly and indirectly with obesity, guiding the HCPs from short-term to long-term prevention and treatment.

FIGURE 1. MLP classifier confusion matrices for the training and the test datasets: Diabetes.

FIGURE 2. MLP classifier confusion matrices for the training and the test datasets: Cardiovascular diseases.

FIGURE 3. MLP classifier confusion matrices for the training and the test datasets: Heart diseases.

FIGURE 4. XGB classifier confusion matrices for the training and the test datasets: Diabetes.

FIGURE 5. XGB classifier confusion matrices for the training and the test datasets: Cardiovascular diseases.

FIGURE 6. XGB classifier confusion matrices for the training and the test datasets: Heart diseases.

FIGURE 7. Case Study 1: XAI waterfall graphics for a subject with risk factors: (a) 84.99% of developing diabetes, (b) 77.65% of developing cardiovascular disease, and (c) 61.43% of developing heart disease.

FIGURE 8. Case Study 2: XAI waterfall graphics for a subject with risk factors: (a) 50.29% of not developing diabetes, (b) 71.79% of not developing cardiovascular disease, and (c) 84.97% of not developing heart disease.

FIGURE 9. Case Study 3: XAI waterfall graphics for a subject with risk factors: (a) 77.59% of developing diabetes, (b) 90.03% of developing cardiovascular disease, and (c) 75.86% of not developing heart disease.

FIGURE 10. Global XAI beeswarm graphics showing how each feature contributes to the model prediction for the entire population of the training dataset. (a) Diabetes, (b) Cardiovascular diseases, (c) Heart diseases.

GRAZIA V. AIOSA received the B.S. degree in computer engineering and the M.S. degree in telecommunication engineering from Dipartimento di Ingegneria Elettrica, Elettronica e Informatica (DIEEI), Università degli Studi di Catania, Italy, in 2016 and 2019, respectively. She is currently an Early Stage Researcher with DIEEI. Her research interests include machine learning and artificial intelligence.

MAURIZIO PALESI (Senior Member, IEEE) received the M.Sc. and Ph.D. degrees in communication and computer engineering from Università degli Studi di Catania, Catania, Italy, in 1999 and 2003, respectively. He is an Associate Professor of computer engineering with Università degli Studi di Catania, and a Visiting Associate Professor with the Indian Institute of Technology Guwahati, Guwahati, India. His current research activity is focused in the area of domain specific architectures. He is a member of HiPEAC. He has served as the general chair and the TPC co-chair for several international conferences and workshops. He has served as a guest editor for 20 special issues in top-level journals. He serves as an associate editor for 12 international journals.

FRANCESCA SAPUPPO (Member, IEEE) was born in Catania, Italy, in 1979. She received the M.S. degree in electronic engineering and the first Ph.D. degree in electronics and automation from Università degli Studi di Catania, Italy, in 2003 and 2007, respectively. She is currently pursuing the second Ph.D. degree in industrial and information engineering with Università degli Studi di Messina, carrying out research on explainable data-driven model identification and artificial intelligence. From 2008 to 2016, she carried out research with Università degli Studi di Catania, covering topics including multiphysics models for composite materials, model order reduction methods applied on MEMS and electronic circuits, real-time electro-optical instrumentation, cellular nonlinear networks, image processing for biomedical applications, microfluidics and micro-optics based on polymeric materials. From 2017 to 2022, she worked with industry as data scientist of green energy monitoring, optimization and control, and in education.

TABLE 3. Heart disease features.

TABLE 11. Model performances for diabetes models in the test dataset.

TABLE 12. Model performances for cardiovascular disease models in the test dataset.

TABLE 13. Model performances for heart disease models in the test dataset.