Early Detection of Pancreatic Cancers Using Liquid Biopsies and Hierarchical Decision Structure

Objective: Pancreatic cancer (PC) is a silent killer because it is difficult to detect and, to date, no effective treatment has been developed. In the US, the current 5-year survival rate is 11%. Therefore, PC has to be detected as early as possible. Methods and procedures: In this work, we combine ultrasensitive nanobiosensors for protease/arginase detection with an information fusion based hierarchical decision structure to detect PC at the localized stage by means of a simple Liquid Biopsy. The problem of early-stage detection of pancreatic cancer is modelled as a multi-class classification problem. We propose a Hard Hierarchical Decision Structure (HDS) along with appropriate feature engineering steps to improve the performance of conventional multi-class classification approaches. Further, a Soft Hierarchical Decision Structure (SDS) is developed to additionally provide confidences of predicted labels in the form of class probability values. These frameworks overcome the limitations of existing research studies that employ simple biostatistical tools and do not effectively exploit the information provided by ultrasensitive protease/arginase analyses. Results: The experimental results demonstrate that an overall mean classification accuracy of around 92% is obtained using the proposed approach, as opposed to 75% with conventional multi-class classification approaches. This illustrates that the proposed HDS framework outperforms traditional classification techniques for early-stage PC detection. Conclusion: Although this study is based on only 31 pancreatic cancer patients and a healthy control group of 48 human subjects, it has enabled combining Liquid Biopsies and Machine Learning methodologies to reach the goal of earliest PC detection. The provision of both decision labels (via HDS) and class probabilities (via SDS) helps clinicians identify instances where statistical model-based predictions lack confidence. This further aids in determining whether more tests are required for better diagnosis. Such a strategy makes the output of our decision model more interpretable and can assist with the diagnostic procedure. Clinical impact: With further validation, the proposed framework can be employed as a decision support tool for clinicians to help detect pancreatic cancer at early stages.


FIGURE 1. (A): Design principles of a nanobiosensor for protease detection. The OFF mode occurs when the distance between the fluorophore TCPP (tetrakis-carboxyphenyl-porphyrin), the Fe/Fe3O4 nanoparticle, and the FRET acceptor cyanine (Cy) 5.5 is reduced. Upon cleavage of the oligopeptide tether by a suitable protease, this distance increases and leads to an increase in fluorescence intensity, which is called the ON mode. (B): TEM and HRTEM of dopamine-coated Fe/Fe3O4 core/shell nanoparticles. (C): Typical emission spectra of a nanosensor for protease detection after 1 h of incubation at 37 °C (λexc = 421 nm). Low: buffer; middle: nanosensor; high: nanosensor after incubation with the respective enzyme. Reproduced with permission from reference [22], copyright Elsevier, Amsterdam 2021.
The survival rate of PC drops sharply with a later stage of detection. According to the American Cancer Society, the 5-year relative survival rate for PC is 39% at the localized stage, 13% at the regional stage, and 3% at the distant stage [1]. Therefore, a feasible and cost-effective Liquid Biopsy [9] for PC detection would be of great value if it is capable of detecting PC at the localized stage, preferably by means of a simple blood test. Liquid biopsies are of great interest for diseases like pancreatic cancer, where tissue samples are limited. Some liquid biopsies exploited for PC consist of identifying and quantifying tumor-associated components released from all tumor sources that can be present in blood, serum, or plasma, such as circulating tumor DNA (ctDNA), circulating tumor cells (CTCs), and extracellular vesicles (EVs) [10], [11]. A major difficulty with these liquid biopsy technologies is the inability to isolate pure tumor-associated components: the isolates typically contain a mix of tumor- and non-tumor-associated components, which makes these technologies insufficient to stand alone [11]. Moreover, with the exception of the protease-activity technology discussed here, none of the "classic" approaches to Liquid Biopsies, such as the capture and detection of circulating tumor cells or circulating tumor DNA, DNA-methylation studies, or the analysis of the content of extracellular vesicles, is capable of reliably detecting stage 1 PC [12]-[16].
In 2018, the Bossmann group established a panel of seven proteases (cathepsins B and E, matrix metalloproteinases (MMPs) 1, 3, and 9, urokinase plasminogen activator (uPA), and neutrophil elastase) and arginase as a suitable panel of enzymes for early PC detection [5]. This selection was based on gene expression analysis using data from NCBI GEO, Entrez Gene ID, Unigene ID, and Gene Symbol [17]. Protease and arginase activities in serum were measured with Fe/Fe3O4 core/shell nanobiosensors with an average particle size of 15 nm [5], [18]-[22]. The functional principle is shown in Figure 1 (A). Each protease cleaves its respective consensus sequence and releases the fluorophore TCPP, which escapes the Förster quenching sphere of the nanoparticle plus tethered cyanine 5.5 dye (FRET pair) [18]. Upon escape, TCPP fluorescence increases and can be detected by a clinical plate reader. The enzyme activity correlates with the fluorescence intensity [5]. Note that the nanobiosensor for arginase activity detection is not cleaved. Arginase performs a "post-translational" modification, converting peptide-bound arginine into ornithine. The latter changes the dynamics of the peptide tether, which increases TCPP fluorescence [20].
Although statistically significant differences between the protease/arginase activity patterns of the group of all pancreatic cancer patients (n = 31) and the group of healthy, age- and gender-matched volunteers (n = 48) could be established using these Fe/Fe3O4-based nanobiosensors, the overall sizes of the investigated groups were too small to establish the feasibility of early PC detection beyond reasonable doubt. Furthermore, simple biostatistics (e.g., performing Welch tests [23] and calculating p-values between data groups [23], [24]) do not extract the maximum information available from ultrasensitive protease/arginase analyses. Therefore, we have employed information fusion based hierarchical decision structures for early-stage detection of pancreatic cancer. Classification models based on hierarchical decision structures have attracted significant research attention in recent years because they have demonstrated appreciable predictive performance in a wide variety of engineering applications such as text classification [25], intrusion detection [26], manufacturing [27], and credit scoring prediction [28]. Biomedical applications such as generation of molecular graphs [29], lung nodule malignancy classification [30], COVID-19 detection [31], skin lesion classification [32], and detection of Alzheimer's disease [33] have also used hierarchical learning methods to build efficient classifiers. However, such hierarchical decision models have not been proposed for the early-stage detection of PC. Given the limited sample size in such studies, exploiting a hierarchical classification structure helps to reduce the complexity of the model at each step, thereby making it possible to improve on the performance of traditional multi-class classification approaches. In this work, we propose a novel hierarchical framework for early-stage detection of pancreatic cancer. Firstly, a hard hierarchical decision structure (HDS) coupled with feature engineering at each step provides better performance than traditional multi-class classification approaches. Secondly, a soft hierarchical decision structure (SDS) additionally provides the confidence associated with predicted labels in the form of probability values for each class.
The major purpose of using computational methods for early pancreatic cancer detection is to detect the onset of pancreatic cancer in the group of chronic pancreatitis patients, which would permit a maximal time for successful treatment with emerging methods, such as immunotherapy [34]. The key contributions of this work are as follows:
• This work, for the first time, proposes the use of ultrasensitive nanobiosensors for protease/arginase detection and integrates it with an information fusion based hierarchical decision structure to detect pancreatic cancer (PC) at the localized stage by means of a simple Liquid Biopsy.
• An HDS, coupled with appropriate feature engineering steps, is proposed to improve the performance of traditional multi-class classification approaches. Results illustrate an improvement of roughly 17 percentage points in mean classification accuracy with the proposed HDS scheme relative to conventional multi-class classification approaches.
• To better assist the clinician's decision-making and provide insights into the decision criteria driving the statistical methods, an SDS is developed to provide confidence scores associated with predicted labels, in the form of class probability values.
The decision labels and class probability values obtained from the HDS and SDS, respectively, help clinicians recognize situations where the predictions of the computational models are uncertain, and thus determine whether more tests are necessary for a more accurate diagnosis. In this manner, the proposed framework has the potential to serve as an effective decision support tool for early-stage PC detection.
The remainder of this article is organized as follows: details pertaining to the dataset are presented in Section II, and the proposed methodology is described in Section III. The results are discussed in Section IV, followed by concluding remarks in Section V.

II. DESCRIPTION OF THE DATASET
The dataset resulting from protease/arginase activity quantified using ultrasensitive nanobiosensors consists of a set of eight biomarkers. The biomarkers were identified from the NCBI Gene Expression Omnibus (GEO) database, which is publicly accessible. Biomarkers were proteases with significant differences in expression levels between two samples, a primary tumor sample and a healthy tissue sample, both of which had to be from Homo sapiens. The features for each sample comprise values corresponding to a panel of seven proteases and arginase, selected based on gene expression analysis using data from Unigene ID, Gene Symbol, NCBI GEO, and Entrez Gene ID [17]. Protease/arginase activity for each identified biomarker was quantified in human serum samples obtained from the Biospecimen Repository Facility in the Cancer Center of the University of Kansas Medical Center [35]. The group sizes were as follows: "Healthy" volunteers (n = 48) and pancreatic cancer patients (n = 31), the latter further divided into "Localized" (earlier stage) and "Metastatic" (later stage) pancreatic cancer. Localized pancreatic cancer samples show no indication that the cancer has spread outside the pancreas, while metastatic pancreatic cancer indicates that it has spread to other parts of the body as well. Protease/arginase activity was quantified in serum samples after 60 min of incubation, and this dataset was then used to develop a computational prediction model.
Although the sample size for this research study is limited, it is important to note that we are proposing a unique, one-of-a-kind approach for early cancer detection. We believe that providing these initial results will stimulate more follow-on efforts in this direction, creating a significant clinical impact. Specifically, for this study, the team was not able to obtain more disease samples and had to work with the maximum number of samples the biospecimen repository at the University of Kansas Medical Center was able to recruit. It is important to note that the activity measured with this nanobiosensor technology depends strongly on the collection protocol and the quality of the serum samples, as previously observed by this team. For this reason, the better approach was to work with a smaller but well-defined sample set instead of obtaining a larger sample set from different repositories, which would have introduced additional variables into the comparison analysis. Furthermore, we provide confidence intervals for all our inferences in Section IV in order to account for sample size effects and better illustrate the power of the results.
The training set formation process for the individual binary classifiers at the respective hierarchical steps is illustrated in Figure 2. Eighty percent of all the instances in the dataset are randomly selected to train the binary classifier in the first hierarchical step. This step results in the isolation of samples belonging to the "healthy" group. Then, 80% of the remaining instances, i.e., the samples from the "localized" and "metastatic" groups, are selected at random to form the training set for the binary classifier in the second hierarchical step. This strategy for preparing the training sets for the individual binary classifiers is adopted due to the limited number of samples available in the dataset.
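The training-set formation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `hierarchical_training_sets` is hypothetical, the second-step draw is assumed to be taken from all non-healthy samples, and the 16/15 split between localized and metastatic samples is invented for illustration because the text reports only the total of 31 PC patients.

```python
import random


def hierarchical_training_sets(labels, train_fraction=0.8, seed=0):
    """Return sorted index lists for the step-1 and step-2 training sets.

    labels: list of class names, one per sample.
    Step 1 draws from all samples; step 2 draws only from the
    non-healthy ("Localized" / "Metastatic") samples.
    """
    rng = random.Random(seed)

    # Step 1: 80% of every instance in the dataset.
    all_idx = list(range(len(labels)))
    step1 = sorted(rng.sample(all_idx, round(train_fraction * len(all_idx))))

    # Step 2: 80% of the instances that are not healthy (assumed reading).
    non_healthy = [i for i, y in enumerate(labels) if y != "Healthy"]
    step2 = sorted(rng.sample(non_healthy, round(train_fraction * len(non_healthy))))
    return step1, step2


# Hypothetical label vector: 48 healthy and 31 PC samples. The 16/15 split
# between localized and metastatic is made up for illustration only.
labels = ["Healthy"] * 48 + ["Localized"] * 16 + ["Metastatic"] * 15
step1_idx, step2_idx = hierarchical_training_sets(labels)
print(len(step1_idx), len(step2_idx))
```

Fixing the seed keeps the two random draws reproducible, which matters when so few samples are available.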

III. METHODS
In this work, the problem of early-stage detection of pancreatic cancer is modelled as a multi-class classification problem. The data derived from the experiments consist of three classes, namely "Healthy", "Localized" pancreatic cancer, and "Metastatic" pancreatic cancer. Two information fusion based decision structures are proposed: 1) An HDS with specific feature engineering at each step for better performance relative to conventional classification approaches. 2) An SDS that provides the confidence associated with predicted labels in the form of probability values for each class.

A. HARD HIERARCHICAL DECISION STRUCTURE
The fundamental premise of the proposed information fusion based HDS involves tailoring the statistically most significant features with appropriate weights to execute an efficient binary classification task at each hierarchical step. The proposed HDS is shown in Figure 3. The individual elements involved in building the HDS are described next.

1) COMPUTING WEIGHTS FOR FEATURES
The first step in the proposed HDS framework is to identify whether the given sample belongs to the healthy (null hypothesis) or the non-healthy group. The p-value of each feature is used in equation (1) to obtain the corresponding feature weights.

$$w_i = \frac{-\log_e f_i}{\sum_{j=1}^{N} \left(-\log_e f_j\right)} \tag{1}$$

Here, $w_i$ is the weight corresponding to feature $f_i$, $-\log_e f_i$ represents the negative natural logarithm of the p-value corresponding to feature $f_i$, and $N$ is the total number of features. The p-values and the computation of the corresponding weights for all the features in the dataset under consideration are presented in Table 1.
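The weight computation can be illustrated with a short sketch. It assumes that each weight is the negative natural logarithm of the feature's p-value, normalized so that the weights sum to one over the N features, which is one reading consistent with the symbols defined above. The function name `feature_weights` is hypothetical, and the p-values are invented for illustration; they are not the values in Table 1.

```python
import math


def feature_weights(p_values):
    """Map per-feature p-values to weights that sum to one.

    Each weight is -ln(p_i) divided by the sum of -ln(p_j) over all
    features, so a smaller p-value yields a larger weight.
    """
    scores = {name: -math.log(p) for name, p in p_values.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# Hypothetical p-values for illustration only; NOT the values in Table 1.
example_p = {"MMP1": 0.001, "MMP9": 0.01, "Arginase": 0.1}
weights = feature_weights(example_p)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

Under this normalization, a feature whose p-value is ten times smaller gains a fixed additive amount of score rather than a tenfold larger weight, which keeps one very small p-value from dominating the others.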

2) SELECTING FEATURES FOR EACH HIERARCHICAL STEP
If a given sample is identified as "non-healthy" in the first hierarchical step, the next step is aimed at determining the degree or extent of abnormality involved. The second hierarchical step determines if the given sample belongs to a "localized" or "metastatic" group. The corresponding binary classifier uses a subset of the features rather than using all the features obtained from experiments, as in the first hierarchical step. This feature engineering step identifies the most relevant features, thereby simplifying the models and making them easier to interpret. Moreover, this allows to have shorter

B. SOFT HIERARCHICAL DECISION STRUCTURE
While the HDS offers a three-class classifier, it does not provide any information regarding the confidence associated with the decisions. This drawback is addressed in the proposed SDS framework that provides confidences associated with the predicted labels in the form of probability values for each class. The proposed SDS is shown in Figure 4.
It is essentially an extension of the HDS, where the prediction for each sample is accompanied by the probabilities of that sample belonging to each of the three classes. The differences between these probability values indicate the confidence associated with the corresponding prediction. For a given instance, if the probability value corresponding to one of the classes is significantly higher than the rest, the confidence associated with such a prediction is HIGH. On the other hand, if there is no significant difference between the probability values corresponding to all the classes, the associated confidence is LOW. This framework helps clinicians determine whether additional tests are required for proper diagnosis.
All steps in the SDS are probabilistic extensions of the HDS. For example, the first step in the SDS results in two values indicating the probabilities of the given sample being "healthy" or "non-healthy", represented by $P(H)$ and $P(\bar{H})$ respectively. These values are essentially the probability estimates of the classification model trained at the first hierarchical step for a given sample and are obtained using the predict_proba() method of trained scikit-learn [39] models. The second hierarchical step evaluates the probabilities of the given sample being "localized" or "non-localized", given that it belongs to the "non-healthy" group, represented by $P(L|\bar{H})$ and $P(\bar{L}|\bar{H})$ respectively. These values are the probability estimates of the classification model trained at the second hierarchical step for a given sample. As a result, the probabilities of a sample being "localized" or "metastatic" are evaluated based on equations (2) and (3) respectively.

$$P(L) = P(\bar{H}) \, P(L|\bar{H}) \tag{2}$$

$$P(M) = P(\bar{H}) \, P(\bar{L}|\bar{H}) \tag{3}$$
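The combination of the two step-wise estimates into three class probabilities can be sketched as below. This is a minimal illustration assuming the chain rule is applied to the step-wise estimates, i.e., each second-step probability is multiplied by the first-step probability of being non-healthy. The function names and the margin-based confidence measure are hypothetical; the text describes HIGH and LOW confidence only qualitatively and gives no threshold.

```python
def sds_probabilities(p_healthy, p_localized_given_non_healthy):
    """Combine the two step-wise estimates into three class probabilities."""
    p_non_healthy = 1.0 - p_healthy
    return {
        "Healthy": p_healthy,
        "Localized": p_non_healthy * p_localized_given_non_healthy,
        "Metastatic": p_non_healthy * (1.0 - p_localized_given_non_healthy),
    }


def confidence_margin(class_probs):
    """Gap between the largest and second-largest class probability."""
    top, runner_up = sorted(class_probs.values(), reverse=True)[:2]
    return top - runner_up


# Hypothetical step-wise estimates, e.g. one row of predict_proba() per step.
probs = sds_probabilities(p_healthy=0.2, p_localized_given_non_healthy=0.75)
predicted = max(probs, key=probs.get)
margin = confidence_margin(probs)
print(predicted, round(margin, 3))
```

By construction the three probabilities sum to one, so a small margin between the top two classes is a direct signal that the prediction should be treated with caution.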

IV. RESULTS

A. HARD HIERARCHICAL DECISION STRUCTURE
The proposed framework is evaluated by training a series of hierarchical classification models that consider several combinations of binary classifiers in both hierarchical steps, indicated in Figure 3. The classification methods considered for the individual binary classifiers include: (i) Gaussian Naïve Bayes (GNB) [40], (ii) Decision Tree (DT) [41], (iii) Support Vector Machine (SVM) [42], (iv) k-Nearest Neighbors (kNN) [43], (v) Random Forest Classifier (RFC) [41], [43], and (vi) Logistic Regression (LR) [44]. To avoid overfitting, we used k-fold (k = 5) cross-validation as a resampling method for training and evaluating the performance of the classification models. The combinations of classification methods exhibiting an overall mean accuracy score of more than 85% are reported in Table 2. The sensitivity and specificity of all model combinations are also indicated. The 95% confidence intervals for the evaluation metrics (accuracy score, sensitivity, and specificity) are computed from the mean and standard deviation of the k-fold cross-validated values.
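One way to turn per-fold scores into a 95% confidence interval is sketched below. The exact interval formula is not stated in the text, so this sketch assumes a normal approximation, mean ± 1.96 · s / √k, with s the sample standard deviation over the k folds. The function name and the fold accuracies are hypothetical and are not results from this study.

```python
import math
import statistics


def kfold_confidence_interval(fold_scores, z=1.96):
    """Return (mean, lower, upper) for a list of per-fold scores.

    Assumes a normal approximation: mean +/- z * s / sqrt(k).
    """
    mean = statistics.mean(fold_scores)
    std = statistics.stdev(fold_scores)  # sample standard deviation
    half_width = z * std / math.sqrt(len(fold_scores))
    return mean, mean - half_width, mean + half_width


# Hypothetical accuracies from k = 5 folds; not results from this study.
fold_accuracies = [0.90, 0.94, 0.92, 0.88, 0.96]
mean_acc, ci_low, ci_high = kfold_confidence_interval(fold_accuracies)
print(f"{mean_acc:.3f} [{ci_low:.3f}, {ci_high:.3f}]")
```

With only five folds, a Student's t critical value would give a wider interval than z = 1.96; the normal approximation is used here only to keep the sketch short.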
The training sets were formed as described in Section II, and evaluation was performed over all the instances in the dataset under consideration. In Table 2, it can be observed that the best performance (overall mean accuracy score of 92.40%) is obtained using kNN for binary classification in the first hierarchical step and SVM in the second step. Moreover, the sensitivity and specificity scores for this case are observed to be the most favorable among all combinations of classification methods. The corresponding confusion matrix is presented in Table 3. In contrast, the maximum mean classification accuracy obtained from the conventional multi-class classification approach using individual classification methods is 74.66%, as indicated in Table 4. The 95% confidence intervals associated with the predictions of the statistical models are reported using the mean and standard deviation of the k-fold (k = 5) cross-validated accuracy scores. This demonstrates that the HDS framework outperforms the conventional multi-class classification approaches for early-stage detection of pancreatic cancer. The superior performance of the proposed HDS framework is primarily attributed to the following reasons: (i) the features are weighted in the first hierarchical step based on their distinguishing ability, unlike traditional multi-class classification approaches, which give equal importance to all the features; (ii) only a subset of features that can confidently differentiate between localized and metastatic PC is considered in the second hierarchical step, instead of accounting for all the features irrespective of their differentiating capability; (iii) splitting a multi-class classification problem into stepwise binary classification tasks allows for a simpler feature representation and better learning.
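The two-step hard decision flow evaluated here can be sketched generically as follows. This is an illustration under stated assumptions, not the authors' implementation: the function `hds_predict` is hypothetical, the weights are assumed to be applied by scaling each feature value, the second step is assumed to use unweighted values, and simple threshold rules stand in for trained classifiers so that the sketch runs on its own. All numeric values are invented.

```python
def hds_predict(sample, weights, step1_classifier, step2_features, step2_classifier):
    """Two-step hard decision: Healthy vs. non-healthy, then Localized vs. Metastatic.

    sample: dict mapping feature name -> measured value.
    weights: dict mapping feature name -> step-1 weight.
    step1_classifier / step2_classifier: callables that return a label.
    """
    # Step 1 uses every feature, scaled by its weight (assumed weighting scheme).
    weighted = {name: weights[name] * value for name, value in sample.items()}
    if step1_classifier(weighted) == "Healthy":
        return "Healthy"
    # Step 2 uses only the selected subset of (unweighted) features.
    subset = {name: sample[name] for name in step2_features}
    return step2_classifier(subset)


# Stand-in rules so the sketch runs without a trained model; in the study these
# roles are played by trained classifiers such as kNN (step 1) and SVM (step 2).
def toy_step1(x):
    return "Healthy" if sum(x.values()) < 1.0 else "Non-healthy"


def toy_step2(x):
    return "Metastatic" if x["MMP9"] > 3.0 else "Localized"


# Hypothetical weights and measurements for illustration only.
weights = {"MMP1": 0.5, "MMP9": 0.3, "Arginase": 0.2}
sample = {"MMP1": 2.0, "MMP9": 2.5, "Arginase": 1.0}
label = hds_predict(sample, weights, toy_step1, ["MMP9"], toy_step2)
print(label)
```

Passing the classifiers in as callables mirrors the way different combinations of binary classifiers are tried at each step, since either step can be swapped without touching the decision flow.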

B. SOFT HIERARCHICAL DECISION STRUCTURE
The proposed SDS framework supports computation of the confidences associated with the predicted labels in the form of probability values for each class. Examples of a correct and an incorrect prediction are shown in Figure 5 and Figure 6, respectively.
The instance shown in Figure 5 is correctly classified as "Healthy". It can be seen that the probability of this sample belonging to the "Healthy" class is significantly higher than those of the other classes. In such a situation, the clinician can have sufficiently high confidence in the model prediction, and it can be concluded that no further tests are required. In contrast, the instance shown in Figure 6 actually belongs to the "Healthy" class but is misclassified as "Metastatic". Additionally, it can be observed that the difference between the probabilities of the "Healthy" and "Metastatic" classes is not as significant as in the instance shown in Figure 5. One of the fundamental limitations of standard AI-based decision-making models is that they attempt to impose a strict conclusion in the form of an output by selecting the most appropriate option among all the possibilities. In such a scenario, the confidence in and faithfulness of the predictions of these computational models is disputable, particularly for crucial applications such as medical diagnosis. The proposed SDS framework overcomes this shortcoming by specifying the confidence associated with model predictions in the form of class probability values. This information helps the clinician perceive the lack of confidence in the model predictions and nudges them to possibly prescribe further tests prior to diagnosis. In a sense, the SDS builds off the HDS and makes it more "interpretable" to the end user.

V. CONCLUSION
In this work, we have combined the use of ultrasensitive nanobiosensors for protease/arginase detection with an information fusion based statistical framework to detect PC at the localized stage by means of a simple Liquid Biopsy. Information fusion based hierarchical decision structures are proposed for early-stage detection of pancreatic cancer. The HDS, coupled with feature engineering at each step, exhibits an overall accuracy score of around 92%, as opposed to around 75% obtained with conventional multi-class classification techniques. The SDS builds off the HDS to achieve a more "interpretable" outcome by providing the confidence associated with predictions in terms of probability values for each class. This information can be used by clinicians to perceive the lack of confidence in model predictions and to examine whether any further tests are required before making a final decision. The prime advantage of using such computational methods for early-stage detection of pancreatic cancer is detecting the onset of pancreatic cancer in the group of chronic pancreatitis patients, which would allow a maximal time for successful treatment with emerging methods like immunotherapy.