Individualized Models for Glucose Prediction in Type 1 Diabetes: Comparing Black-Box Approaches to a Physiological White-Box One

Objective: Accurate blood glucose (BG) prediction are key in next-generation tools for type 1 diabetes (T1D) management, such as improved decision support systems and advanced closed-loop control. Glucose prediction algorithms commonly rely on black-box models. Large physiological models, successfully adopted for simulation, were little explored for glucose prediction, mostly because their parameters are hard to individualize. In this work, we develop a BG prediction algorithm based on a personalized physiological model inspired by the UVA/Padova T1D Simulator. Then we compare white-box and advanced black-box personalized prediction techniques. Methods: A personalized nonlinear physiological model is identified from patient data through a Bayesian approach based on Markov Chain Monte Carlo technique. The individualized model was integrated within a particle filter (PF) to predict future BG concentrations. The black-box methodologies considered are non-parametric models estimated via gaussian regression (NP), three deep learning methods: long-short-term-memory (LSTM), gated recurrent unit (GRU), temporal convolutional networks (TCN), and a recursive autoregressive with exogenous input model (rARX). BG forecasting performances are assessed for several prediction horizons (PH) on 12 individuals with T1D, monitored in free-living conditions under open-loop therapy for 10 weeks. Results: NP models provide the most effective BG predictions by achieving a root mean square error (RMSE), RMSE = 18.99 mg/dL, RMSE = 25.72 mg/dL and RMSE = 31.60 mg/dL, significantly outperforming: LSTM, GRU (for PH = 30 minutes), TCN, rARX, and the proposed physiological model for PH=30, 45 and 60 minutes. Conclusions: Black-box strategies remain preferable for glucose prediction even when compared to a white-box model with sound physiological structure and individualized parameters.

Individualized Models for Glucose Prediction in Type 1 Diabetes: Comparing Black-Box Approaches to a Physiological White-Box One Giacomo Cappon , Francesco Prendin , Andrea Facchinetti , Giovanni Sparacino , and Simone Del Favero

I. INTRODUCTION
T YPE 1 diabetes (T1D) is a chronic autoimmune disease caused by the progressive destruction of beta cells in the pancreas, which leads to the inability of producing endogenous insulin by the organism, [1].As a result, blood glucose concentration (BG) tends to exceed the hyperglycemic threshold (BG > 180 mg/dL), a situation that, if frequent and prolonged, could lead to several serious cardiovascular long-term complications, as well as nephropathy and neuropathy, [2].To reduce BG levels, administration of exogenous insulin several times a day is necessary.Unfortunately, excessive exogenous insulin dosing could lead patients to hypoglycemia, i.e., BG < 70 mg/dL, which is dangerous even in the short-term since it could cause fainting, light-headiness, coma and even death, [3].
Effective T1D treatment relies on BG frequent monitoring, made through either the classic fingerstick device, [4], or more modern minimally invasive continuous glucose monitoring (CGM) sensors, [5], [6], [7], and is far from being trivial, [8].Indeed, T1D management represents, from a patient perspective, a life-long learning process to understand how several everyday factors (e.g., illness, diet, and physical activity) affect BG levels and how interventions (e.g., rescue carbohydrate intake and, of course, insulin administration) can be used to keep BG in the safe range.In this context, many efforts have been made by the research community to provide new tools able to help patients with T1D, [9], [10].Among them, CGM-based algorithms able to predict future BG concentration in real-time have the potential to significantly improve T1D therapy efficacy, [11], by enabling proactive therapeutic decisions based on the expected future glucose levels, rather than the current one.
Despite several different methodologies have been proposed for real-time BG prediction, [11], the problem remains open due to several unknown disturbances altering BG levels and to the high inter-/intra-patient variability in glucose physiology.The existing prediction techniques rely on a mathematical model which describes the relationship between a certain set of input features and BG.The choice of a suitable model of glucose-insulin regulation is a critical step and the options spans from "black-box" models, completely data-driven, to models based on mechanistic/semi-mechanistic nonlinear description of metabolic physiology.
Among the white-box models, the nonlinear physiological models available in the T1D literature, there are the so-called minimal models, [24], that proposed simplified descriptions of the physiology with a few equations and model parameters.This parsimonious parametrization grants identifiability in predefined experimental conditions.Unfortunately, these models have proved too rigid and simplistic to allow accurate prediction, [11].A possible white-box alternative are maximal models, commonly used in computer simulations, [25], [26], [27], [28].They provide a more realistic physiological description by using several equations with many parameters.Despite their appealing feature of having a clear and solid physiological ground, their use for glucose prediction were substantially less investigated, since their many parameters are hard to be estimated form easily accessible patient data (i.e.CGM, meal and insulin data), making them hard to personalize and thus limiting their predictive effectiveness.Moreover, they have a nonlinear structure, requiring sophisticated tools both for parameters estimation and for the computation of glucose prediction.
In this work, we face the above mentioned challenges and explore the potential of using a white-box maximal-model based methodology for glucose prediction, comparing it to black-box alternatives.The white-box model adopted is inspired by the UVA/Padova T1D Simulator, [28], accepted by the US Food and Drug Administration (FDA) as a replacement of animal preclinical testing of closed-loop drug delivery systems.A Bayesian approach, implemented by Markov Chain Monte Carlo (MCMC), [29], is used to estimate the large number of parameters in the presence of complex nonlinear dynamics.The obtained personalized model is then used within a nonlinear prediction scheme based on a particle filter methodology, [30].
The so derived white-box glucose prediction approach is compared to four black-box algorithms: a nonparametric techniques that recently proved effective in glucose prediction, [17], three deep learning algorithms (long short term memory (LSTM), gated recurrent unit (GRU), temporal convolutional network (TCN)), and a recursive autoregressive with exogenous input (rARX) model.Algorithms' assessment has been performed for different prediction horizon (PH) on a dataset which comprises 12 subjects monitored for about 10 weeks in daily-life conditions.
The paper is organized as follows.Section II describes the nonlinear physiological model of glucose-insulin regulation used for BG prediction.Section III describes the black-box methodologies considered for this work.Section IV illustrates the dataset and the evaluation metrics.Section V reports the results and discussion.Finally, Section VI summarizes the main findings, draws some conclusions and proposes possible future developments.

II. WHITE-BOX GLUCOSE PREDICTION MODEL
In this work we explored the use of a white-box maximal models for glucose prediction.The approach is based on two components: i) a nonlinear model of glucose-insulin regulation, which is personalized to capture patient-specific physiology using a Bayesian methodology, and ii) a particle filter that leverages on such model and, handling the non-linearity, allows to formulate glucose predictions.
In the following, Section II-A describes the structure of the physiological model, Section II-B discusses the model identification procedure, and Section II-C reports details on the employed prediction scheme.

A. Physiological Model of Glucose-Insulin Regulation
The physiological model of glucose-insulin regulation used in this work has two inputs, insulin infusion I(t), and carbohydrate intake CHO(t), and one output, the interstitial glucose concentration IG(t).The model is composed of three subsystems describing subcutaneous insulin absorption, oral glucose absorption, and glucose-insulin kinetics.As for a preliminary version of the model presented in [31], we started from the maximal physiological model implemented in the UVA/Padova T1D Simulator (T1DS), [32], and to simplify it as much as possible to reduce the number of parameters to be estimated for individualization while retaining its ability to achieve good glucose prediction.The details about model structure and its parameters can be found in Appendix A.
The overall physiological model is a nonlinear time-invariant state-space model: where x x x phy (t) is the state vector, defined as where x ins is the state vector of the subcutaneous insulin absorption subsystem, x oral is the state vector of the oral glucose absorption subsystem, and x glu is the state vector of the glucoseinsulin kinetics subsystem; is the input vector; f phy (•) is the state update function obtained combining ( 6), (7), and (9); θ phy is the set of unknown model parameters (formally defined in Appendix A) whose estimation will be discussed in the next section.

B. Offline Bayesian Estimation of Personalized Model Parameters by Markov Chain Monte Carlo
Model personalization has been performed by identifying for each patient the unknown model parameters θ phy using the training data Y := {CGM (t k ), t k = k • T s , k = 1, . . ., D} and U := {u phy (t k ), t k = k • T s , k = 1, . . ., D} where D is the number of data points available.
The identification has been performed by adopting a Bayesian approach and specifically, in this work θ phy is estimated through its posterior mean defined as In fact the posterior mean is known to be the minimum variance unbiased estimator of θ phy .
The Bayes theorem allows to obtain the a posteriori density function p θ|Y,U (θ|Y, U ) as: where p Y |θ,U (Y |θ, U) is the likelihood function, i.e., the probability of observing a certain Y given the parameter vector θ and the input U .Even using (3), the integral in ( 2) is analytically intractable, therefore it has to be approximated by resorting to MCMC [29].In particular, we generate N samples θ i , i = 1, . . ., N from the posterior distribution p θ|Y,U (θ|Y, U ), by creating a Markov Chain whose stationary distribution is exactly this posterior (target distribution).Then, these samples θ i are used to perform Monte Carlo integration to obtain a point estimate of θ phy : To build such a chain, the Single Component Metropolis-Hastings (SCMH) algorithm has been used [29].Implementative details about the implemented SCMH procedure can be found in Appendix B. An open-source software implementation of the proposed offline Bayesian estimation approach can be found at. 1

C. Real-Time Glucose Prediction Through Particle Filter
Up to this point, we focused on estimating the parameters of the physiological model of a T1D subject, to capture the patientspecific dynamics.This process can be done "offline" using the available information obtained from retrospective patient data.
As mentioned above, in this section, we will present how we used such a personalized model to predict, in real-time, future glucose concentrations.This task has to be performed "online", so we resorted to a sequential algorithm that at each timestep t k , when a new measurement y(t k ) = CGM (t k ) becomes available, updates the current estimate of the model state x(t k ) and uses it to infer future glucose concentration.In particular, we employ the particle filter (PF), [30], the state-of-the-art sequential Bayesian prediction technique capable of handling the nonlinear structure of the model. 1 [Online].Available: https://github.com/gcappon/replay-bgPF is based on the recursive update of the posterior probability function p(x(t k )|y(t 1:k ), u(t 1:k )) where y(t 1:k ) is a shorthand for the variables y(t 1 ), . . ., y(t k ) and u(t 1:k ) indicates u(t 1 ), . . ., u(t k ).
The one step-ahead prediction step assumes that the posterior probability p(x(t k−1 )|y(t 1:k−1 ), u(t 1:k−1 )) is available at time t k−1 and uses such a posterior probability to infer Then, when at time t k a new measurement, y(t k ), becomes available, in the measurement update step such a measurement is used to compute the posterior probability The two steps are then repeated for each available measurement in the dataset.
PF performs these steps using a sampled approximation of the probability functions at play: where {x p (t k−1 )} P p=1 is a set of P points, called "particles", in the support of p(x(t k−1 )|y(t 1:k−1 ), u(t 1:k−1 )).Each particle is associated to a weight {w p (t k−1 )} P p=1 , p w p (t k−1 ) = 1, and p(x(t k )|y(t 1:k−1 ), u(t 1:k )) where {x p (t k )} P p=1 are the P particles representing p(x(t k )|y(t 1:k−1 ), u(t 1:k )), each associated to a weight {w * p (t k )} P p=1 , p w * p (t k ) = 1.The one step-ahead prediction step of the PF is particularly convenient.It can be shown that where p(x(t k )|x(t k−1 )) is fully described by the state update equations of (1).
Regarding the measurement update step, it is possible to demonstrate that it holds where p(y(t k )|x(t k ), u(t 1:k )) is the likelihood function that is fully specified by (1).
Details on how these quantities are calculated in practice are reported in Appendix C.
As an additional result, the posterior probability p(x(t k )|y(t 1:k ), u(t 1:k )) is furtherly used by the PF to compute the posterior probabilities p(x(t k+i )|y(t 1:k ), u(t 1:k+i )), ∀i = 1, . . ., P H describing the state distribution predicted i steps-ahead in time up to P H steps ahead (with P H the prediction horizon).
In particular, it is possible show that:

. , P H
that is completely specified by (1).Finally, a point estimate of future CGM values at time t k+i , i = 1, . . ., P H can be derived using the expectation of the posterior:

. , P H.
An open-source software implementation of the proposed real-time glucose prediction approach based on PF can be found at. 2

A. Linear Non-Parametric Models
Although the metabolic physiology is nonlinear, several contributions have shown that its approximation with linear models is an appealing option for BG prediction, [17], [33].For this reason, our analysis considered the use of an advanced nonparametric (NP) identification techniques for linear models, [34].In particular, this approach estimates the unknown impulse response related to insulin, meal and glucose, from noisy measurements.Unlike the standard approach that constraints the unknown functions to a parametric structure, [35], the nonparametric approach estimates the unknown impulse responses over a infinite-dimensional set given by a Reproducing Kernel Hilbert Space (RKHS).Such a space is completely specified by the choice of the kernel.In this case, the Stable Spline kernel is used as it incorporates key prior knowledge, such as smoothness and stability of the predictor impulse responses to be estimated.In [17], this approach proved to be the most effective among several linear and nonlinear methods for glucose prediction.

B. Nonlinear Deep Learning Approaches
As described in [11], [36], there are a growing number of deep learning methodologies to forecast BG levels.In particular, due to their ability to handle time-series and sequential data, there is an increasing trend to develop both recurrent and convolutional neural networks (RNN and CNN, respectively) for BG forecasting, [21], [22], [37].Unlike traditional feed-forward neural networks in which the information flows from the input towards the output layer, RNN are characterized by recurrent units with loops propagating the information back to the same unit, [38].So, each learning step takes into account not only the current input, but also what was learnt from the previous inputs, [38].In this work we implemented two multi-input RNN based on: i) Long Short Term Memory (LSTM) cells and ii) Gated Recurrent Units (GRU).Specifically, they are composed by a single hidden layer which comprises 30 units, as in [21].Similarly to [23], [39], we also investigated the use of CNN-based algorithms, by developing a Temporal Convolutional Network (TCN) which combines 3 causal and dilated convolutional layers (equipped with 8, 16, and 32 filters) that are used for extracting features from the inputs.Finally, all the proposed deep learning algorithms are fed by three input channels (past history of glucose data, meal intake, and insulin injections) and equipped with an output layer (dense), comprising a number of units corresponding to the future BG levels to forecast (in this work 12, corresponding to a PH = 60 minutes, sampling time is 5 minutes), as in [21], [40].As shown in Fig. 1, once fed by input data, the algorithms provide as output a trajectory of 12 future consecutive glucose samples.Of note, the deep learning models are developed within Python (Keras library) and trained using a Nvidia Titan RTX.

C. Recursive Models
Finally, we explored adaptive strategies that have the potential to accommodate the changes over time in patient metabolism.In particular, we implemented the personalized recursive autoregressive model with exogenous input (rARX) proposed by Finan et al. [41].As the other methodologies, this approach takes in input past CGM values, carbohydrate intakes, and insulin recordings.Beside using this data to predict future CGM values, each time a new CGM sample is collected the model parameters are adapted according to a well-established recursive estimation scheme [35].

A. Dataset
The dataset used in this study is the Ohio Type 1 Diabetes Mellitus dataset updated on the 2020 release, [42], from now on referred as the OhioT1DM.The OhioT1DM dataset, comprises 12 subjects with T1D monitored with a Medtronic Enlite CGM system for 8 weeks.Participants wore an insulin pump (Medtronic 530 G or 630 G) and a wearable system (Basis Peak fitness or Empatica Embrace) to measure physiological variables, such as skin temperature, skin conduction, and heart rate.In addition, subjects reported information on meals: timing, amount, and type (that is, breakfast, lunch, dinner, snack, hypoglycemia treatment).Each subject in the OhioT1DM data Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.set is split a set (about the intial 6 monitoring weeks) and into a test set (roughly the last 10 days).This dataset represents a challenge for BG predictive algorithms: glucose dynamics recorded in daily-life conditions are much more complex to describe than those generated by simulation tools or those recorded during well-controlled in-hospital trial sessions, since in the former the patient is exposed to substantially larger disturbances to glucose homeostasis.Handling data recorded under free-living conditions raises some technical issues mainly about synchronization and completeness of the recorded information.In particular, the OhioT1DM dataset presents long portion of missing CGM readings and the sampling time is not homogeneous.Therefore, all signals were aligned into a uniform time grid with a sampling period of T s = 5 minutes.Any CGM gap in the training set shorter than 30 minutes was interpolated with a first order polynomial while a simple and causal zero-order-hold imputation was performed on the test set.
A preliminary investigation indicates that alternative real-time extrapolation techniques (e.g.linear extrapolation) do not lead to a significant improvement in the overall prediction performance.

B. Algorithms Assessment Metrics
Evaluation of prediction accuracy has been performed in the test set by comparing the obtained glucose predictions with the actual CGM values using different P H. Two, frequently used, performance metrics are considered: the root mean square error (RMSE) and the time gain (TG), defined as follows: (5) The higher T G, the more prompt and useful the prediction and T G = 0 means that the model prediction is comparable to looking at the current glucose level.
Performance metrics of Physiological model (within the PF) are compared against NP, LSTM, GRU and TCN models using a paired t-test unless the hypothesis of normal distribution was rejected by a Lilliefors test (p-value < 0.05).In this case, Wilcoxon ranksign test was used.Reported p-values are two-tailed and considered statistically significant when < 0.05.

V. RESULTS AND DISCUSSION
Table I reports the prediction performance in terms of RMSE and TG, for PH = 30, 45, and 60 minutes achieved by the white-box physiological model (hereafter labeled as PHY), and the considered black-box methodologies (i.e., NP, LSTM, GRU, TCN, and rARX).From Table I, three main outcomes can be observed: i) black box algorithms outperform PHY in terms of RMSE, for all prediction horizons, ii) all methodologies seem to achieve similar TG (no statistically significant difference was found when comparing all pairs of prediction approaches with a paired t-test corrected with the Bofferroni method), and iii) that there are no large differences in terms of RMSE between the considered black-box approaches.
Specifically, the RMSE obtained by PHY is larger than the one of the other methodologies by approximately 5 mg/dL, 7 mg/dL, and 9 mg/dL for PH = 30, 45, and 60 minutes, respectively.This corresponds to a performance deterioration of about 25%.The NP approach allows to achieve the lowest RMSE: median RMSE = 18.99 mg/dL, RMSE = 25.72 mg/dL and RMSE = 31.60mg/dL for PH = 30, 45 and 60 minutes, respectively.The performance improvement is found to be statistically significant with respect to LSTM and GRU (p-value = 0.04 and p-value = 0.03) for PH = 30 minutes, while no significant difference is found for longer prediction horizons.Furthermore, NP model showed to be significantly better than TCN, with p-value = 0.001, p-value = 0.006 and p-value = 0.009, for PH = 45 and PH = 60 minutes, respectively.The NP approach achieves the largest median improvement with respect to PHY, decreasing RMSE by 26.5%, 26.2% and 24.9%, for PH = 30, 45 and 60, respectively (p-values < 10 -4 for all the considered PHs).The numerical results obtained in this work by LSTM, GRU and TCN are comparable with what has been obtained in other literature contributions dealing with the assessment of individualized deep learning algorithms for BG forecasting in the OhioT1DM dataset, [22], [23], [33].In [22], a recurrent convolutional network granted a mean RMSE = 20.6 mg/dL, 26.8 mg/dL and 33.9 mg/dL for PH = 30, 45 and 60 minutes, respectively.Similarly, the TCN and the LSTM tested in [23]  To the best of our knowledge, there are only two other literature contributions investigating maximal model individualization, [43], [44].However, both in [43], [44], rich datasets of frequent plasma glucose and insulin measurements collected in a clinical setting were required, thus strongly limiting the applicability of these methodologies to be used on data collected in free-living conditions.Furthermore, none of two models derived was used for glucose prediction.
Regarding the use of particle filters to obtain glucose predictions, the work of [45] showed the potential of this approach for the scope.However, in the prediction scheme of [45], they adopted an average model, thus not personalizing the methodology at the patient level.
With the aim of better understanding the consistent difference between PHY and black-box models in glucose prediction, we consider an illustrative example on a one-day long data portion extracted from the test set, depicted in Fig. 2. In the top panel, CGM data (grey dashed line) of a subject of the OhioT1DM dataset (ID:570) is shown with the 30-minute ahead-in-time prediction of PHY, NP, and LSTM models, in blue, yellow and red, respectively.Middle and bottom panels show meal and insulin data, respectively.Analyzing the extracted portion of data, there are two meals with similar amount (30 g/min at 9:10, 35 g/min at 17:35) and two similar corresponding insulin boluses delivered with the so-called dual-wave mode, [46].It is interesting to note that the two corresponding postprandial Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
responses are very different.In the first case, glucose increases from 200 mg/dL to 300 mg/dL within an hour after the meal, whereas, in the second case, glucose remains almost flat (about 20 mg/dL excursion) and then decreases after an hour, reaching hypoglycemia.The first postprandial excursion is aligned with the physiological expectation that a meal should be followed by a glucose increase, while in the second postprandial excursion this does not happen, for causes that are hard to guess and that might span from an high fat meal composition to psychological stress slowing down digestion.The (unavoidably simplified) physiological model structure has not the flexibility to cover both types of responses and the model imposes a similar shape to the two postprandial glucose excursions leading, as a consequence, to an extremely large prediction error observed during the second meal.On the contrary, the black-box approaches prove more flexible and are able to produce two different postprandial shapes despite the similar inputs.
As a final consideration, we report a preliminary investigation of the computational power required to compute a prediction using each method.In fact, in the final deployment of a glucose prediction algorithm, this step might have be performed in real-time on portable hardware.As a proxy, we considered the computation time required on a DELL desktop PC equipped with an Intel(R) Core(TM) i7-9700 CPU @ 3.00 GHz, 16.0 GB RAM.The code was not optimized, PHY (considering N=5000 particles), NP and rARX are implemented in Matlab while LSTM, GRU and TCN were implemented using Python (Keras library).Therefore, this result offers only an early exploration of this issue.The average time required to PHY to compute the 30-minutes ahead prediction is about 30 ms and is one order of magnitude larger than the time required to the other methods: the NP method needs 2 ms, rARX 0.05 milliseconds and the nonlinear methods (less than 3 milliseconds).Moreover, it should be noted that the computation time of PHY is strongly affected by the number of particle considered (here 5000), whereas the other approaches exhibit a computational time less sensitive to the method hyperparamters.The time required to train the models is not considered, as this computationally demanding step could be performed on a remote server.
Future work will attempt to increasing the flexibility of whitebox models.This will include considering time-varying model parameters estimated in real-time by the PF, to track patientspecific intraday variability and meal-to-meal differences in CHO absorption.Moreover, further investigations will compare white-box and black-box modeling approaches for control purposes, i.e. when they are used within a model-based closed-loop controller or as core element of a decision support systems.In fact, it is unclear if superior prediction performance will translate into superior glycemic control or if the physiologically grounded, albeit simplified, structure of white-box model could lead to more robust and effective control actions.Finally, future work will analyze the impact of newer and more accurate CGM sensors on prediction accuracy.In fact, the CGM sensor used in the dataset considered in this work was less accurate than the newer generation devices currently available in the market [47], [48].

VI. CONCLUSION
Real-time glucose prediction algorithms are key for developing next-generation tools to improve diabetes care.The diabetes research community intensively focused on the use of black-box prediction approaches, investigating many techniques spanning from linear models to deep learning-based approaches.On the other hand, white-box maximal models for glucose predictions are less investigated, due a number of technical difficulties they pose.
This work compared five black-box methodologies (a nonparametric technique that recently proved effective in glucose prediction, [17], a LSTM, [33], a GRU, a TCN, [23], and a rARX [41]), with a newly developed white-box technique based on a nonlinear physiological model of glucose-insulin dynamics, whose parameters are individualized through a MCMC approach and embedded in a PF to predict future glucose values.On the data under study, collected by T1D patients in free-living conditions, the considered black-box methodologies significantly outperform the white-box approach for all the PH under study.Moreover, among the data-driven algorithms, the best performance are achieved by the linear NP approach, that grants statistically significant improvement in the performance with respect to LSTM and GRU for PH = 30 minutes, and to TCN for all the considered PH.One possible reason for the differences in performance between white-box and black-box models might reside in the fact that the first are less flexible in accommodating the large variety of patterns observed in the data and that might be caused by multiple unmodeled factors, including variability in meal absorption, different meal compositions, stress, illnesses, physical activity, inaccuracy in estimating carbohydrate content of a meal.Future works will aim at increasing the flexibility of white-box models and at comparing white-box and black-box model for control purposes.

APPENDIX A PHYSIOLOGICAL MODEL OF GLUCOSE-INSULIN REGULATION
This appendix reports a detailed description of the nonlinear physiological model of glucose-insulin regulation used by the white-box glucose prediction approach presented in Section II.The model is composed of three subsystems describing subcutaneous insulin absorption, oral glucose absorption, and glucose-insulin kinetics, to each presented in one of the following sections.

A. Subcutaneous Insulin Absorption Subsystem
The subcutaneous insulin absorption model is a slightly simplified version of the one used in the T1DS model, described in [49] and illustrated in Fig. 3.The model is composed of three compartments.Exogenous insulin I is infused to the first compartment where it appears after a delay β.In the first compartment, representing insulin in a non-monomeric state, insulin is transformed in a monomeric state and then diffused to Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.the plasma.The model equations are: where I sc1 (mU/kg) and I sc2 (mU/kg) represent the insulin in a non-monomeric and monomeric state, respectively; I p (mU/l) is the plasma insulin concentration; k d (min −1 ) is the rate constant of diffusion from the first to the second compartment; k a2 (min −1 ) is the rate constant of subcutaneous insulin absorption from the second compartment to the plasma; k e (min −1 ) is the fractional clearance rate; V I (l/kg) is the volume of insulin distribution; β (min) is the delay in the appearance of insulin in the first compartment.A priori information on model parameters has been obtained from the literature, [49].Specifically, V I and β have been set to population values, i.e. 0.126 l/kg and 8 min, respectively.Furthermore, k d has been constrained to k d ≥ k a2 since the two combinations are interchangeable.Unknown model parameters, identified via the MCMC procedure presented in Section II-B, are

B. Oral Glucose Absorption Subsystem
The oral glucose absorption subsystem model, taken from [50], represents a simplified version of the model used in the T1DS and is illustrated in Fig. 4. It describes the gastrointestinal tract as three-compartment system: the first two compartments account for food in the stomach (solid and grinded state), while the third compartment models the upper small intestine where CHO is absorbed.Model equations are: where Q sto1 (mg/kg) and Q sto2 (mg/kg) are the glucose amount in the stomach in a solid and liquid state, respectively; Q gut (mg/kg) is the glucose concentration in the intestine; k gri (min −1 ) is the rate constant of grinding; k empt (min −1 ) is the rate constant of gastric emptying; k abs (min −1 ) is the rate constant of intestinal absorption; CHO (mg/kg/min) is the ingested carbohydrate rate.Model (7) allows to estimate the rate of glucose appearance in plasma Ra (mg/kg/min) as: where f (dimensionless) is the fraction of the intestinal content absorbed in the plasma.A priori information on model ( 7) has been obtained from the literature, [50].In particular, we set f equal to 0.9 and we constrained k gri = k empt .Furthermore, k abs has been constrained to k abs ≤ k empt since the two combinations are interchangeable.As such, the unknown model parameters, estimated using the MCMC approach presented in Section II-B, are θ oral = [k abs , k empt ].

C. Glucose-Insulin Kinetics Subsystem
The subsystem of glucose-insulin kinetics is based on a wellknown two-compartment model that describes the impact of the plasmatic insulin action and glucose rate of appearance in plasma glucose concentration introduced in [51].The model is further equipped with a third compartment to describe the transport of glucose from plasma to the interstitium where it is measured by the sensor.The model is illustrated in Fig. 5. Model equations are: where G (mg/dl) is the plasma glucose concentration, X (min −1 ) is the insulin action on glucose disposal and production; SG (min −1 ) is the glucose effectiveness describing glucose ability, per se, to promote glucose disposal and inhibit glucose production; G b (mg/dl) is the basal glucose concentration in the plasma; V G (dl/kg) is the volume of glucose distribution; p 2 (min −1 ) is the rate constant of insulin action dynamics; SI (ml/μ U•min) is the insulin sensitivity; I pb (mU/l) is the basal insulin concentration in the plasma; IG (mg/dl) is the interstitial glucose concentration; α (min) is the delay between the plasmatic and interstitial glucose concentration compartments; and ρ(G) is a function, introduced by [52], that allows to better describe Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
the insulin action in the hypoglycemic range by increasing its impact when glucose concentration decreases below a certain level.Furthermore, to account for patient-specific intraday insulin sensitivity variability, [53], the parameter SI is considered time-varying over the day: A priori information on parameter distributions has been obtained from the literature, [54].V G has been fixed to population value, i.e.
This partitioning scheme has been chosen since it improves MC mixing and allows to break the correlation between SI and p 2 , known to be critical from the literature, [55].An iteration i of the algorithm consists of five steps p = 1, . . ., 5 and each step updates the p-th partition of θ phy , θ p , by approval/rejection of a sample φ p extracted from the proposal density function q p (•|•).Specifically, as prescribed by the SCMH procedure, approval occurs with probability α α = min 1, π(φ p |θ i,−p )q p (θ i−1,p |φ p , θ i,−p ) π(θ i−1,p |θ i,−p )q p (φ p |θ i−1,p , θ i,−p ) with π(θ p |θ i,−p ) proportional to the posterior of θ p given that the other components θ −p assume the value θ −p = θ i,−p : where θ i,−p comprises all the other components of θ phy except for θ p , and p θ (θ p |θ i,−p , U) is the prior probability distribution of θ p given θ i,−p .Precisely, θ i,−p contains the most updated version of each component as available at the current stage of the algorithm: Components up to p − 1 have already been updated when processing the p-th components at iteration i, while other components, from p + 1 to 5, have not been updated yet, so their value computed in the previous iteration i − 1 is used.
For what it concerns the proposal distribution, we used a Gaussian centered in the value assumed by θ p in the previous chain iteraction where Σ p is a tuning parameter that regulates the acceptance rate of the chain.We set Σ p to a diagonal matrix whose components are an estimate of the conditional standard deviation of each element of partition p, sd(θ phy p |Y, U ), multiplied by a scaling factor 2.4/ √ d, where d is the number of parameters to be Algorithm 1: Adaptive Single Component Metropolis Hastings.
estimated in p-th partition, as suggested in [56].This estimates is computed by running two exploratory MCMCs for nIter = 600 iterations and updated every 1500 iterations of the algorithm, thus implementing an adaptive SCMH.Finally, the convergence of the MCMC has been verified through the well-known Raftery-Lewis criterion, [29], which provides the number of iterations necessary to ensure the Markov Chain to represent the target posterior distribution.
The Adaptive Single Component Metropolis Hasting is summarized in Algorithm 1.

APPENDIX C REAL-TIME PREDICTION THROUGH PARTICLE FILTER: IMPLEMENTATIVE DETAILS
In the following, we present the numerical scheme implemented by PF to perform the one step-ahead prediction, measurement update, and multiple step-ahead prediction.
In view of this, to draw the new set of particles it is sufficient to let each particle x p (t k−1 ) evolve according to model (1), and corrupt it with a realization of the noise v. Measurement update step: The PF algorithm sets the weight w * p (t k ) of each p-th particle x p (t k ) to the likelihood function evaluated on x p (t k ) w * p (t k ) = p(y(t k )|x p (t k ), u(t 1:k )). ( In particular, the statistics of the stochastic modelling error e, p(y(t k )|x p (t k ), u(t 1:k )) is defined as: p(y(t k )|x(t k ), u(t 1:k )) = N (y(t k ) − y p (t k ), SD ).(13) where y p (t k ) is obtained using (1) and SD is the constant standard deviation of the error.Weights are then normalized such that p w * p (t k ) = 1.This provides a sampled form representation of the posterior density p(x(t k )|y(t 1:k−1 ), u(t 1:k )) Resampling step: To improve the accuracy of PF, the measurement update step is completed by updating the set of particles.Specifically, {x p (t k )} P p=1 are substituted with a new set of P particles, {x * p (t k )} P p=1 generated from the sampled representation of p(x(t k )|y(t 1:k−1 ), u(t 1:k )) such that Pr(x * p (t k ) = x p (t k )) = w * p (t k ).This step is performed by a well-established resampling algorithm [57].
Multiple steps-ahead prediction: Multiple steps ahead predictions can be obtained as follows.First, the probabilities p(x(t k+i )|y(t 1:k ), u(t 1:k+i )), ∀, i = 1, . . ., P H are obtained in sampled form starting from p(x(t k )|y(t 1:k ), u(t 1:k )) and propagating the P particles {x p (t k )} P p=1 i steps ahead as we did in the one step-ahead prediction step.Then, the set of particles {y p (t k+i |t k )} P p=1 is computed and used to obtain p(y(t k+i )|y(t 1:k ), u(t 1:k+i ))∀, i = 1, . . ., P H in sampled form.
Finally, a point estimate of glucose i steps-ahead, y(t k+i ), is obtained as the average computed over the sampled form of p(y(t k+i )|y(t 1:k ), u(t 1:k+i )), i.The implemented PF is summarized in Algorithm 2.

Abstract-Objective:
Accurate blood glucose (BG) prediction are key in next-generation tools for type 1 diabetes (T1D) management, such as improved decision support systems and advanced closed-loop control.Glucose prediction algorithms commonly rely on black-box models.Large physiological models, successfully adopted for simulation, were little explored for glucose prediction, mostly because their parameters are hard to individualize.In this work, we develop a BG prediction algorithm based on a personalized physiological model inspired by the UVA/Padova T1D Simulator.Then we compare white-box and advanced black-box personalized prediction techniques.Methods: A personalized nonlinear physiological model is identified from patient data through a Bayesian approach based on Markov Chain Monte Carlo technique.The individualized model was integrated within a particle filter (PF) to predict future BG concentrations.The black-box methodologies considered are non-parametric models estimated via gaussian regression (NP), three deep learning methods: longshort-term-memory (LSTM), gated recurrent unit (GRU), temporal convolutional networks (TCN), and a recursive autoregressive with exogenous input model (rARX).BG forecasting performances are assessed for several prediction horizons (PH) on 12 individuals with T1D, monitored in free-living conditions under open-loop therapy for 10 weeks.Results: NP models provide the most effective BG predictions by achieving a root mean square error (RMSE), RMSE = 18.99 mg/dL, RMSE = 25.72 mg/dL and RMSE = 31.60mg/dL, significantly outperforming: LSTM, GRU (for PH = 30 minutes), TCN, rARX, and the proposed physiological model for PH=30, 45 and 60 minutes.Conclusions:

Fig. 2 .
Fig. 2. Representative subject (ID:570) of the OhioT1DM dataset.The upper panel shows CGM data (grey dashed line) and the 30-min ahead prediction obtained using PHY (blue line), NP (yellow line) and LSTM (red line).Middle panel shows the CHO content of the meal, in g/min.Bottom panel shows injected insulin (U/min).
achieved a mean RMSE = 20.23 mg/dL and RMSE = 20.11mg/dL for PH = 30 minutes, and RMSE = 34.21mg/dL and 33.10 mg/dL for PH = 60 minutes.Finally, despite permitting model adaptation, rARX performs slightly worse than the other black-box models employed in this work but still better than PHY, by achieving a median RMSE = 20.18mg/dL, 27.91 mg/dL and 34.14 mg/dL for PH = 30, 45 and 60 minutes, respectively.