A personalised Bayesian approach for early intervention in gestational weight gain management towards pregnancy care

Pre-pregnancy body mass index and weight gain management are associated with pregnancy outcomes in expecting women. Poor gestational weight gain (GWG) management can increase the risk of adverse complications. These risks can be alleviated by lifestyle-based interventions if an undesired GWG trend is detected early in the pregnancy. The current literature lacks analyses of gestational weight gain data that track the pregnancy over time. In this work, we collect longitudinal gestational weight gain data from women during their pregnancy and model their weight measurements to predict the end-of-pregnancy weight gain and classify it in accordance with the medically recommended guidelines. The measurement frequency is often highly variable, so that segments of data can be missing, and the need to predict early using few data points complicates data modelling. We propose a Bayesian approach to forecast weight gain while effectively dealing with the limited data availability for early prediction. We validate on diverse populations from Europe and China. We show that, using an individual's data only up to mid-way through the pregnancy, our approach produces mean absolute errors of 2.45 kg and 2.82 kg in forecasting end-of-pregnancy weight gain on these populations respectively, whereas the best of the state-of-the-art yields 8.17 kg and 6.60 kg on the respective populations. The proposed method can serve as a tool to keep track of an individual's pregnancy and achieve GWG goals, thus supporting the prevention of excessive or insufficient weight gain during pregnancy.


I. INTRODUCTION
In this increasingly obesogenic society, weight management is a key lifestyle-related condition that affects people of all ages and ethnicities. One of the most important demographic groups affected is pregnant women: 47% of pregnant women gain too much weight over the gestational period and around 23% gain too little weight during their pregnancy [1]. The Institute of Medicine (IOM) updated the recommended set of guidelines [2] on how much weight women in different BMI categories should gain during their pregnancy to encourage optimal health for the mother and her child (Table 1). With only 30% of women in the normal weight category after pregnancy [1], most women either do not follow the guidelines or realize too late in the pregnancy that an intervention or control of the weight gain is necessary.

TABLE 1: IOM guidelines [2] for weight gain and rate of weight gain during pregnancy with respect to BMI. The guidelines assume a weight gain of 0.5 − 2 kg in the first trimester of pregnancy.

Risks associated with undesired weight gain: Several studies associate gestational weight gain with pregnancy-related outcomes. For example, excessive Gestational Weight Gain (GWG) can pose several short- and long-term risks such as fetal macrosomia and postpartum weight retention leading to maternal obesity [3]. Women entering pregnancy with a high pre-pregnancy Body Mass Index (BMI) are at increased risk of gestational diabetes [4]. Excessive GWG can also result in large-for-gestational-age infants and/or caesarean delivery or other labor and delivery complications [1]. In terms of risks for the offspring, Oken et al. [5] and Sridhar et al. [6] found that exceeding the recommended guidelines was associated with a 46% increase in the odds of having an overweight/obese child after adjusting for maternal pre-pregnancy BMI, race/ethnicity, age at delivery, education, child age, birthweight, gestational age at delivery, gestational diabetes, parity, infant sex, total metabolic equivalents, and dietary pattern. Additionally, cardiovascular diseases in later stages of the offspring's life are reported in [7]. Conversely, gaining too little weight during pregnancy is also not considered healthy. Evidence exists for a correlation between inadequate weight gain and perinatal mortality. Davis et al. studied over 100,000 records from the National Center for Health Statistics (NCHS) 2002 Birth Cohort Linked Birth/Infant Death Data and indicated that inadequate gestational weight gain is highly associated with increased odds of infant death up to 1 year after birth [8]. Other reported risks include increased risk of preterm birth or small-for-gestational-age infants [1] and failure to initiate breastfeeding [2]. Several fixed factors, such as age, ethnicity and genetics, are associated with undesired gestational weight gain [9], [3]. Apart from these fixed factors, modifiable lifestyle factors such as the amount of physical activity and food intake also show a high correlation with gestational weight gain [10]. Several intervention studies [11], [12] showed that lifestyle-based interventions can improve the outcome of gestational weight gain if the intervention is timely, preferably initiated before the start of the pregnancy [13].
In this work, we aim to reliably predict the gestational weight gain using the weight measurements from initial days of the pregnancy. Our proposed approach uses the weight gain measurements from other subjects in the training data to generate prior information about the (personal) model of the test subject. The model is then trained on the available limited data of the test subject along with the generated prior information resulting in an increase in the performance of the overall system, which we discuss later. Our proposed solution can help prenatal care providers in risk assessment during a pregnancy and provide adaptive coaching to the mothers. Moreover, mothers can track the rate of weight gain and use the model to monitor weight gain, thus reducing GWG related risks at the end of their pregnancy.
We use real-life weight measurements, mostly self-reported (measurements consistent with regular midwife/hospital visits), from 233 expecting mothers during their pregnancy in Europe and China. We formulate the task as an absolute weight prediction problem, with the end goal of predicting the weight at the end of the pregnancy and classifying whether it is within the IOM recommended guidelines. We restrict our analysis to mothers with a singleton pregnancy, because data from mothers expecting more than one child are very rare and the guidelines for gestational weight gain consider singleton pregnancies [2].
Lifestyle interventions can take the form of personal coaching by traditional health-care providers, eHealth mobile-application based coaching, or a mix of both [14], [15]. A schematic diagram of a solution following a mix of both is provided in Fig. 1, where recorded weight measurements are sent for processing along with meta-data, and feedback/alerts can be shared with the individual and/or caregivers. Recommended weight gain during pregnancy varies from person to person based on their BMI ranges. Women with underweight pre-pregnancy BMI are expected to gain weight at a higher rate than women that were overweight before pregnancy. This calls for personalization of the learning method. Additionally, it is important to note that the estimation problem is a multi-step forecasting problem: we train a model using self-reported weights from the start of the pregnancy period (e.g. the first 180 days) and use this model to forecast the weight at the end of pregnancy (around day 270-280).

FIGURE 1: Schematic diagram of GWG weight estimation (weight gain measurements, GWG forecast model, and comparison with the IoM guidelines as below normal, normal or above normal).

The primary contributions of this paper are:
• collecting weight gain data from women across time during the course of their pregnancy in a practical scenario (for example, via self-reporting),
• building a personalised model for GWG trend prediction using as little personal data as possible,
• a unique raw weight gain transformation approach that reduces inter-BMI class variance for accurate GWG modelling,
• validating the proposed approach across different geographical regions and examining model transfer to evaluate the generalizability of the approach.

II. RELATED WORKS
Various works [1], [16] study the association of pre-pregnancy BMI, the amount of weight gain during pregnancy and the health risks to mothers and infants. Diana et al. propose a differential equation model for pregnant women in different pre-pregnancy BMI categories that predicts the GWG resulting from changes in energy intake [17]. This method helps predict the impact of changes in dietary energy intake on GWG in these BMI categories. Although this tool helps in understanding dietary needs, there exist no studies that help pregnant women understand and track the absolute weight gain during their pregnancy in a personalised manner based on the individual's weight gain data. Several time series forecasting methods exist in the literature, such as state-space approaches (e.g. Kalman filtering [18]) and Autoregressive Integrated Moving Average (ARIMA) [19], that learn structure from the time series data for few-step-ahead predictions, given sufficient historical personal data. However, their forecasts tend to converge towards the mean as the forecast horizon increases, thus giving inaccurate predictions [20]. Alternatively, a polynomial model of low order (1, 2, or 3) can be used to estimate the end-of-pregnancy weight gain using weight measurements from the start of the pregnancy period, if enough reliable weight measurements collected uniformly over time are available for training. However, there are two major challenges: i) weight measurement data are often noisy, incomplete, sparse and non-uniformly sampled due to their self-reported nature, and ii) available data from the initial days of the pregnancy are often limited, complicating the training of a model. A polynomial fit using maximum likelihood estimation (MLE) and ARIMA each suffer from at least one of these challenges. In the recent decade, deep learning approaches such as Long Short-Term Memory (LSTM) networks [21] have become popular, and they are known to model non-linearities in the data very well for forecasting. However, the lack of individual training data for early prediction in our case, together with the high number of trainable parameters, makes them unsuitable for the practical scenario at hand. Fig. 2 illustrates the early prediction of weight gain measurements for two subjects using state-of-the-art methods.
In this paper, we experiment with parametric Bayesian regression to model the time series data. In contrast to the previous work [22], our algorithm incorporates meta-data such as pre-pregnancy weight and BMI to improve the efficacy. We also test the generalization capability of our proposed algorithm on new data from a different geographic region by training our proposed approach on data from one region and testing the learned model on another region. We show that our approach outperforms state-of-the-art in early weight gain prediction by using data from training subjects to create an a-priori model estimate and then tuning it to model the test subject's limited available personal training observations. To our knowledge, this is the first study that uses few weight measurements from the early days of pregnancy to estimate the end-of-pregnancy weight gain.

III. DATABASE
Data from diverse pregnant women were collected in Europe ($D_E$) and China ($D_C$). Women who were in gestational week 5 or later were recruited randomly from midwife practices in Europe and private hospitals in China. The details of these datasets are described below:

1) $D_E$
Two midwife locations recruited 90 participants in Eindhoven, The Netherlands over a period of three months. However, data from only 80 women were considered for the final analysis, as 10 subjects dropped out of the study due to miscarriage or technical problems. 40% of the women were experiencing their first pregnancy, while for another 40% it was their second and 20% had more than two previous pregnancies. Education level was generally high, with more than 60% having at least a college degree. This means that women with low or no education are under-represented in this data. This may be relevant, as it is well known that Socio Economic Status (SES) is correlated with nutrition, weight gain and lifestyle factors in general. 9% of women reported smoking. The weight data were collected using a WiFi-connected weight scale, the Withings WS30. The participants were asked to weigh themselves at least once per week, and the recorded weight data were sent to the cloud via a mobile application. Post-hoc analysis shows that participants recorded 2.0 ± 1.4 measurements per week. Overall, 86% of participants adhered to the study measurement protocol, with most of the women measuring more than once per week.

2) $D_C$
Two hospitals recruited 366 subjects living in Shanghai, China. After filtering out the subjects that had a disease or left the study midway, the pregnancy weight gain data of 153 women were considered. About 2/3 of the subjects were having their first pregnancy and only a few were pregnant for the 3rd or 4th time. The overwhelming majority of the subjects have received at least a college degree, which together with a median household income of 2811 − 4200 US$ per month indicates their relatively high socio-economic status. The weight data were collected weekly at home as well as on regular visits to the hospital. The in-hospital weight data were highly correlated with the in-home data, indicating that the in-home measurements were reliable for further analysis.

FIGURE 2: State-of-the-art methods, MLE (order = 1, 2, 3), ARIMA and LSTM, used to predict the end-of-pregnancy weight gain for the i-th subject. The prediction accuracy that can be obtained from the data shown in the left subplot is superior to the accuracy obtained from the data shown in the right subplot. The data shown in (a) are of higher quality at the start of the pregnancy period (i.e. more uniformly sampled, less sparse).
Additional meta-data such as age, height and pre-pregnancy weight were also collected for both datasets. The participants provided informed consent before data collection, and the study was approved by the Internal Ethics Committee for Biomedical Experiments of the involved organizations (ICBE reference numbers 2015-0079 and 2017-0189 for $D_E$ and $D_C$ respectively). It is important to note that $D_C$ is sparser than $D_E$ in time. The maximum number of samples for an individual is 37 in $D_C$ and 230 in $D_E$. This is one of the reasons why modelling such data is difficult. The data in $D_C$ show less variability among individual subjects in terms of pre-pregnancy BMI class (Table 2). Table 3 shows the data distribution in our sample dataset pre- and post-pregnancy for under, within and over guidelines. Interestingly, the distribution of our sample dataset is close to that in [1], which is obtained from a large population of more than a million women, with almost half of the women gaining above the recommended guidelines. This further strengthens the need for this study.

Notation.
We are given a population of $N-1$ subjects who, by means of self-reporting tools, acquired $N-1$ time series of gestational weight gain measurements $\{(\mathbf{t}^i, \mathbf{y}^i)\}_{i=1}^{N-1}$, where $\mathbf{t}^i = [t^i_1, t^i_2, t^i_3, \cdots, t^i_{m_i}]$ represents the input gestational days up to delivery day $t^i_{m_i}$ and $\mathbf{y}^i = [y^i_1, y^i_2, y^i_3, \cdots, y^i_{m_i}]$ represents the output weight gain for the $i$-th subject, where $y^i_k = y(t^i_k)$. It is important to note here that $t^i_1$ does not necessarily equal $t^j_1$ for $i, j \in \{1, 2, \cdots, N-1\}$. This is because each subject acquires measurements at different times according to their personal preferences and adherence to data collection.
Additionally, we are given individual weight measurements $D$ from the initial $t_d$ days of the pregnancy of the test subject (the $N$-th subject). We call this the personal-training data. Weight gain data from the $N-1$ training subjects over the entire gestational period are called the public-training data.
The objective is to learn function(s) $f$ from the given public- and personal-training data such that
$$y^i_k = f(t^i_k) + \epsilon^i_k, \qquad (1)$$
where $\epsilon^i_k \sim \mathcal{N}(0, \sigma^2)$ is independent and identically distributed (i.i.d.) Gaussian noise. Our parametric approach learns information about the parameters a-priori from the public-training data. We then use this prior knowledge along with the personal-training data to build personalised models and learn $f$. The individual's future weight gain at delivery time $t_m$ is forecast using the learned model $f$. Before we discuss the parametric regression, we introduce a pre-processing technique that transforms the input data using the IOM guidelines.

A. TRANSFORMATION USING IOM GUIDELINES
We subtract the pre-pregnancy weight from each measurement to obtain the weight gain data. After using the pre-pregnancy weight to standardize the data, we propose to transform the obtained weight gain data by introducing a non-linear trend controlled by the subject's pre-pregnancy BMI. This trend is based on the pre-pregnancy BMI classes and their respective expected rates of weight gain in accordance with the IOM guidelines. Lower and upper guidelines are obtained using linear interpolation based on the total weight gain and the rate of weight gain suggested by the IOM guidelines (Table 1). For the $i$-th subject with pre-pregnancy BMI class $bmi_i$ at time $t_k$, the guideline bounds are extrapolated over time, where $bmi_i \in$ {'underweight', 'normal', 'overweight', 'obese'} is calculated using the pre-pregnancy BMI, and $\Delta_{min} = 0.5$ kg and $\Delta_{max} = 2$ kg are the first trimester (90 days) minimum and maximum gains respectively according to the guidelines (Table 1). $\alpha^{bmi}_{min}$ and $\alpha^{bmi}_{max}$ are the minimum and maximum allowed weight gains during the second and third trimester in the IOM guidelines (Table 1).

It should be noted that we introduce a non-linear trend in our pre-processing approach by multiplication with $\rho(t)$ instead of the standard division-based normalisation. As Fig. 3a and Table 1 suggest, an underweight woman is allowed a larger weight-gain bandwidth than an obese woman. We multiply the original weight gain data by this bandwidth factor $\rho$, calculated from the pre-pregnancy BMI class, which allows an underweight woman a wider window of weight gain than an obese woman (Fig. 3b). Such scaling ensures that the data across different subjects and BMIs are closer to each other in the transformed space for a better fit. Fig. 4 shows how the original and transformed data scale across each BMI class among all the subjects in dataset $D_E$.
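The piecewise-linear guideline bounds described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weekly rates are the commonly quoted IOM (2009) second/third-trimester ranges and are an assumption here, since Table 1 did not survive in this text; the function name `guideline_bounds` is hypothetical. The bandwidth factor $\rho(t)$ is deliberately not reproduced because its exact definition cannot be recovered from the text.

```python
# Assumed IOM (2009) second/third-trimester rates in kg per week, keyed by
# pre-pregnancy BMI class: (minimum rate, maximum rate). Not taken from the paper.
IOM_WEEKLY_RATE = {
    "underweight": (0.44, 0.58),
    "normal": (0.35, 0.50),
    "overweight": (0.23, 0.33),
    "obese": (0.17, 0.27),
}
DELTA_MIN, DELTA_MAX = 0.5, 2.0  # first-trimester gain range in kg (from the text)
FIRST_TRIMESTER_DAYS = 90        # length of the first trimester (from the text)


def guideline_bounds(day, bmi_class):
    """Return (lower, upper) recommended cumulative weight gain in kg at a gestational day."""
    rate_min, rate_max = IOM_WEEKLY_RATE[bmi_class]
    if day <= FIRST_TRIMESTER_DAYS:
        # Linear ramp from 0 kg at conception to the first-trimester range.
        frac = day / FIRST_TRIMESTER_DAYS
        return DELTA_MIN * frac, DELTA_MAX * frac
    # After the first trimester, grow at the class-specific weekly rate.
    weeks = (day - FIRST_TRIMESTER_DAYS) / 7.0
    return DELTA_MIN + rate_min * weeks, DELTA_MAX + rate_max * weeks
```

Under these assumed rates, the band at day 280 is wider for the 'underweight' class than for the 'obese' class, which matches the bandwidth argument made in the text.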

B. REGRESSION
We can fit a $p$-th order polynomial with $f = w_0 + w_1 t + w_2 t^2 + \cdots + w_p t^p$ in eq. (1) and estimate the coefficients $\mathbf{w} = [w_0, w_1, \cdots, w_p]^T$ by maximizing the likelihood $L(\mathbf{w}) = P(D|\mathbf{w})$ over an individual's personal-training data $D$,
$$\hat{\mathbf{w}} = \arg\max_{\mathbf{w}} P(D|\mathbf{w}). \qquad (7)$$
Eq. (7) refers to the model learnt from the individual's sparse, limited observations up to the given $t_d$ days. Next, we exploit the public-training data and find the maximum likelihood point estimates (MLE) $\hat{\mathbf{w}}_i$ for each individual time series in the public-training data following eq. (7). If we assume Gaussianity over the distribution of $\mathbf{w}$ such that $\mathbf{w} \sim \mathcal{N}(\mu_{\hat{w}}, \Sigma_{\hat{w}})$, we can find a closed-form maximum a posteriori (MAP) solution $\mathbf{w}_{MAP}$ analytically. Here, $\mu_{\hat{w}} = \mathrm{mean}([\hat{\mathbf{w}}_1, \hat{\mathbf{w}}_2, \cdots, \hat{\mathbf{w}}_{N-1}]^T)$ and $\Sigma_{\hat{w}} = \mathrm{cov}([\hat{\mathbf{w}}_1, \hat{\mathbf{w}}_2, \cdots, \hat{\mathbf{w}}_{N-1}]^T)$ are the mean and covariance of the polynomial coefficients $\hat{\mathbf{w}}_1, \hat{\mathbf{w}}_2, \cdots, \hat{\mathbf{w}}_{N-1}$, each obtained from the individual gestational weight gain data of one of the $N-1$ subjects in the public-training data. This distribution over the MLE estimates of the coefficients, $p(\mathbf{w})$, is acquired from the $N-1$ subjects in the public-training data as an a-priori estimate. The likelihood learnt from the personal-training data and the a-priori distribution learnt from the population data are then combined using Bayes' theorem to obtain the posterior $p(\mathbf{w}|D)$, whose maximum is the MAP estimate of the coefficients:
$$p(\mathbf{w}|D) = \frac{P(D|\mathbf{w})\, p(\mathbf{w})}{P(D)}. \qquad (8)$$
We can ignore $P(D)$ in eq. (8) as it does not depend on $\mathbf{w}$.
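To make the MAP step concrete, the sketch below builds a Gaussian prior from per-subject coefficient vectors and computes the closed-form MAP estimate for polynomial regression with Gaussian noise. It is a sketch under stated assumptions rather than the authors' code: the noise variance `noise_var` is assumed known, all function names are hypothetical, and the estimate uses the standard equivalent form $\mathbf{w}_{MAP} = \mu + \Sigma\Phi^T(\Phi\Sigma\Phi^T + \sigma^2 I)^{-1}(\mathbf{y} - \Phi\mu)$, which avoids inverting the prior covariance.

```python
def design_matrix(days, order):
    """Rows [1, t, t^2, ..., t^order] for each gestational day t."""
    return [[t ** k for k in range(order + 1)] for t in days]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def prior_from_population(coeffs):
    """Mean and sample covariance of per-subject MLE coefficient vectors."""
    n, p = len(coeffs), len(coeffs[0])
    mean = [sum(w[j] for w in coeffs) / n for j in range(p)]
    cov = [[sum((w[j] - mean[j]) * (w[k] - mean[k]) for w in coeffs) / (n - 1)
            for k in range(p)] for j in range(p)]
    return mean, cov


def map_estimate(days, gains, prior_mean, prior_cov, noise_var, order):
    """MAP coefficients for a Gaussian prior and i.i.d. Gaussian noise (assumed known variance)."""
    phi = design_matrix(days, order)
    n, p = len(days), order + 1
    # phi_sigma = Phi * Sigma (n x p); its transpose equals Sigma * Phi^T for symmetric Sigma.
    phi_sigma = [[sum(phi[i][k] * prior_cov[k][j] for k in range(p))
                  for j in range(p)] for i in range(n)]
    # gram = Phi * Sigma * Phi^T + noise_var * I (n x n)
    gram = [[sum(phi_sigma[i][k] * phi[j][k] for k in range(p))
             + (noise_var if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Residual of the observations with respect to the prior-mean curve.
    resid = [gains[i] - sum(phi[i][k] * prior_mean[k] for k in range(p))
             for i in range(n)]
    alpha = solve(gram, resid)
    return [prior_mean[j] + sum(phi_sigma[i][j] * alpha[i] for i in range(n))
            for j in range(p)]
```

With a single personal observation the estimate stays close to the population prior except along the direction that observation constrains, which is the regularizing behaviour that makes early prediction from few points feasible.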

C. CLASSIFICATION USING GUIDELINES
We further extend the prediction results for better interpretation by classifying the predicted weight gain into three classes, 'underweight', 'normal', and 'overweight', represented as the integer values −1, 0 and 1 respectively. For this purpose, we compare the predicted weight gain with the recommended weight-gain guidelines at the delivery day $t^i_{m_i}$ to get the 3-class classification output. Following eqs. (2) and (3), the classification function $c(t^i, y^i(t^i))$ for the $i$-th subject, eq. (9), is defined as a function of time $t^i$ and weight gain value $y^i(t^i)$: it takes the value −1 when the weight gain lies below the lower guideline, 0 when it lies within the guidelines, and 1 when it lies above the upper guideline.
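The three-class rule can be written as a small function. This is an illustrative sketch: the name `classify_gain` is hypothetical, the guideline bounds are passed in as plain numbers, and treating gains exactly on a bound as 'normal' is an assumption, since the text does not state how boundary values are handled.

```python
def classify_gain(gain, lower, upper):
    """Map a weight gain (kg) to -1 (below), 0 (within) or 1 (above) the guideline band.

    `lower` and `upper` are the guideline bounds at the day of interest.
    Values exactly on a bound count as within the band (an assumption).
    """
    if gain < lower:
        return -1
    if gain > upper:
        return 1
    return 0


# Hypothetical example: a predicted 17.2 kg gain against an 11.5 to 16 kg band.
predicted_class = classify_gain(17.2, 11.5, 16.0)
```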

V. EXPERIMENTS
We experiment with polynomials of order 1 to 5 to fit our weight-gain data. We empirically chose a third-order polynomial, as it obtains the minimum prediction error among all orders in cross-validation. However, with transformation-based pre-processing, we choose order 2 for modelling $y_{transformed}$, as the transformation itself raises the order of the non-linearity by 1.

A. STATE-OF-THE-ART
ARIMA. This is a time series forecasting approach [19] that exploits correlations in historical data. Forecasting with ARIMA requires uniformly spaced samples of the time series. We fit an ARIMA(p,d,q) model by i) enforcing equi-spaced sampling of the personal-training data through linear interpolation between samples, ii) performing a grid search over the hyperparameters [23] to find an optimal autoregressive order, degree of differencing, and moving average order, and iii) forecasting multiple steps ahead in time, using the hyperparameters optimised over the training part (GWG data until day $t_d$), to find the end-of-pregnancy gestational weight gain.
LSTM. We evaluate an LSTM-based regression network with 200 hidden units, trained to minimise the mean absolute error using the 'adam' optimization method [24].
MLE. We also tested a polynomial fitting approach following maximum likelihood estimation (MLE) with different order polynomials. Order 2 produces best results (among the orders 1 to 5).
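The resampling step of the ARIMA baseline (step i above) can be illustrated as follows. This is a minimal sketch that assumes integer gestational days and a daily target grid; the function name `resample_daily` is hypothetical, and the ARIMA fitting and grid search themselves are omitted because they depend on a statistics library.

```python
def resample_daily(days, gains):
    """Linearly interpolate irregular (day, gain) samples onto every integer day.

    Assumes integer gestational days; returns (grid_days, grid_gains) covering
    the first to the last observed day inclusive.
    """
    pairs = sorted(zip(days, gains))
    out_days, out_gains = [], []
    for (d0, g0), (d1, g1) in zip(pairs, pairs[1:]):
        # Fill [d0, d1) on a daily grid; d1 itself starts the next segment.
        for day in range(d0, d1):
            frac = (day - d0) / (d1 - d0)
            out_days.append(day)
            out_gains.append(g0 + frac * (g1 - g0))
    # Close the grid with the last observed sample.
    out_days.append(pairs[-1][0])
    out_gains.append(pairs[-1][1])
    return out_days, out_gains
```

Interpolated points are not new measurements, so long gaps in the self-reported data are effectively replaced by straight lines, which is one reason sparse early data are hard for this baseline.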

B. EVALUATION METRIC
The performance of regression was computed using the Mean Absolute Error (MAE),
$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| y^i(t^i_{m_i}) - \hat{y}^i(t^i_{m_i}) \right|,$$
where $\hat{y}^i$ denotes the predicted weight gain. We use the accuracy $acc$ as the metric for evaluating classification performance, defined using eq. (9) as
$$acc = \frac{1}{N} \sum_{i=1}^{N} I\left( c(t^i_{m_i}, \hat{y}^i(t^i_{m_i})) = c(t^i_{m_i}, y^i(t^i_{m_i})) \right) = \frac{\#\text{correct predictions in recommended guidelines}}{\#\text{total subjects}}, \qquad (10)$$
where $I$ is the indicator function such that $I(A) = 1$ if event $A$ occurs and 0 otherwise, and $t^i_{m_i}$ is the delivery day for the $i$-th subject. The accuracy at a time $t_j$ is the accuracy (averaged over $N$ users) calculated using eq. (10) when the personal-training data for each subject are considered to be available only until day $t_j$. Next, we calculate the normalized area under the accuracy curve (AuAC) to evaluate the performance of a given approach with respect to the training data available between days $T_0$ and $T_1$. We omit $T_0$ from the notation $AuAC_{T_1 - T_0}$ and use $AuAC_{T_1}$ to denote the AuAC until day $T_1$ for simplicity, as $T_0 = 120$ is fixed in our analysis. This is because at least one subject exists with no recorded weight gain measurement before day 120. Fig. 5 shows two exemplary curves A and B, with B being better at early prediction than A, hence $AuAC^B_{160} > AuAC^A_{160}$.
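The three evaluation metrics can be sketched in a few lines. The MAE and accuracy follow their standard definitions; the AuAC normalisation shown here (trapezoidal area divided by the interval length $T_1 - T_0$) is an assumption, since the defining equation is not recoverable from the text. All function names are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted end-of-pregnancy gains (kg)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def accuracy(actual_classes, predicted_classes):
    """Fraction of subjects whose predicted guideline class matches the actual class."""
    hits = sum(1 for a, p in zip(actual_classes, predicted_classes) if a == p)
    return hits / len(actual_classes)


def auac(cutoff_days, accuracies):
    """Trapezoidal area under the accuracy curve, divided by the day range (assumed normalisation)."""
    points = list(zip(cutoff_days, accuracies))
    area = sum((d1 - d0) * (a0 + a1) / 2.0
               for (d0, a0), (d1, a1) in zip(points, points[1:]))
    return area / (cutoff_days[-1] - cutoff_days[0])
```

Under this normalisation the AuAC of a constant accuracy curve equals that accuracy, so the score stays on the same 0 to 1 scale as `accuracy`.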

VI. RESULTS
We evaluate the performance of the described approaches in terms of MAE and accuracy of the predicted weight gain (class) against the actual end-of-pregnancy weight gain (class). In order to validate the performance, we perform leave-one-subject-out cross validation, where training dataset in each iteration consists of public-training data (weight-gain from N − 1 subjects) and personal-training data from the test subject as defined in section IV. We experiment by varying the amount of available personal-training data until a certain day in pregnancy and perform cross-validation to evaluate the performance of different approaches against training data availability. We also present performance measures for early prediction by taking day '140' as the early threshold as it is mid-way through the pregnancy. Finally, we study the effects of transferring model learnt from one geographic region to infer the data from subjects in another geographic region.
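The leave-one-subject-out protocol with a training cut-off day can be sketched as a generator of folds. This is an illustrative sketch under assumptions: subjects are stored as a dictionary of day-sorted `(day, gain)` lists, the cut-off is inclusive, the last recorded measurement stands in for the end-of-pregnancy target, and the name `leave_one_subject_out` is hypothetical.

```python
def leave_one_subject_out(subjects, cutoff_day):
    """Yield (test_id, public_training, personal_training, target) for each subject.

    subjects: dict mapping subject id -> list of (day, gain) tuples sorted by day.
    public_training: full series of all other subjects.
    personal_training: the test subject's samples up to cutoff_day (inclusive, assumed).
    target: the test subject's last recorded (day, gain), used as the forecast target.
    """
    for test_id, series in subjects.items():
        public = {sid: s for sid, s in subjects.items() if sid != test_id}
        personal = [(d, g) for d, g in series if d <= cutoff_day]
        yield test_id, public, personal, series[-1]


# Hypothetical toy data: two subjects, personal data available until day 140.
toy = {
    "a": [(100, 3.0), (150, 6.0), (275, 12.0)],
    "b": [(130, 4.0), (270, 11.0)],
}
folds = list(leave_one_subject_out(toy, 140))
```

Repeating the loop for a range of cut-off days gives one accuracy value per day, which is the curve from which the AuAC is computed.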

A. WEIGHT GAIN TREND VISUALISATION
We predict the trend of weight gain on both datasets $D_E$ and $D_C$ and present in Fig. 6 what such a prediction looks like with limited training data. Fig. 6 shows the personal-training data up to 140 days into the pregnancy and the best and worst prediction results in terms of mean absolute error, alongside the actual weight gain measurements during the later stages of pregnancy, using the proposed approach with transformation. Since we are concerned with the end-of-pregnancy weight gain, we calculate the MAE between actual and predicted weight gain right before the delivery date, while also showing the predicted trend of weight gain for these subjects. The errors in prediction for the (best, worst) cases in $D_E$ and $D_C$ are (0.93, 9.24) kg and (0.03, 11.42) kg respectively. One can see that in Fig. 6(c) and (d) there is only a single training observation before day 140. Table 4 shows the confusion matrix for predicting the different classes according to the recommended guidelines on both datasets with training data until day 140. Next, we perform LOOCV over all the subjects in each dataset by varying the availability of personal-training data before a given day in gestational age and calculate the performance averaged over all the subjects.

VOLUME 4, 2016

B. COMPARISON WITH STATE-OF-THE-ART
To compare the performance of the proposed approach with the state-of-the-art methods, we study the mean absolute error (MAE) and accuracy (acc) against different amounts of available personal-training data. Fig. 7 shows that our proposed method outperforms the state-of-the-art approaches in early detection (until day 160). All the improvements of the proposed method $P_T$ over the state-of-the-art are statistically significant on both datasets $D_C$ and $D_E$, based on a paired t-test with equal variances and p < 0.05.
Furthermore, the difference between ARIMA and the proposed method is not statistically significant when training data are available until days 170 to 210, for both datasets. Additionally, from Fig. 7, it can be observed that the MAE reduces and the accuracy increases with increasing availability of personal-training data. A paired t-test with equal variances suggests that these improvements are statistically significant only for dataset $D_E$ when sufficient training data are available (day 190 onwards) and are never statistically significant for $D_C$.
Next, in addition to accuracy, we quantify the performance of all the approaches against different availability of training data using a single metric, the AuAC between day 120 and day 140. These values for the different methods are presented in Table 5. Also, the accuracy score with training data until day 140 reported in Table 5 suggests an improvement of around 25.9% and 31.1% over the best of the state-of-the-art for datasets $D_E$ and $D_C$ respectively.

C. EFFECT OF MODEL TRANSFER BETWEEN DATASETS
We test model transfer with the proposed approach in two settings: i) we train the MAP model on $D_E$ and test the learnt model on $D_C$, and ii) we perform leave-one-out cross validation (LOOCV) on $D_C$. Fig. 8 compares model transfer with and without the transformation-based processing step. It can be observed in Fig. 8 that the accuracy of model transfer based on $P_T$ is greater than that of LOOCV until day 160, i.e. in early prediction. However, the accuracy of the proposed MAP approach without the transformation is almost always better with model transfer.

VII. DISCUSSION
Predicting weight gain reliably in pregnant women as early as possible is at the heart of this study. We first collect weight-gain datasets in two different geographies and then build prediction models that utilise prior information generated from the public-training dataset to tune the personal model for accurate estimation of the end-of-pregnancy weight gain. The percentage of the most represented class post-pregnancy is set as a baseline for comparing prediction accuracy. According to Table 3, this baseline is 0.49 for $D_E$ and 0.44 for $D_C$, as marked in Fig. 7.
With a limited amount of personal-training data available for prediction of weight gain, our MAP-based Bayesian approach forms an a-priori estimate of the model coefficients based on the public-training data model coefficients. This addition of a prior to the model also acts as a type of regularization. This results in high performance gains in early prediction of around 25.9% and 31.1% over the best of the state-of-the-art for datasets $D_E$ and $D_C$ respectively. Additionally, including the transformation-based processing step improves the performance further (Table 5). This is because our transformation step introduces a non-linearity in time based on pre-pregnancy BMI that scales each subject's raw weight gain data with respect to the allowed rate of weight gain, thus scaling each time series to a similar range. Also, the polynomial fit for the transformed time series is done with one lower order (p = 2) than the ordinary MAP fit (p = 3), which improves the generalization ability of the fit. As is evident from Fig. 6(b), the worst result occurs when the person's weight gain trend is different from that of any of the available subjects in the public-training data and the personal-training data (until day 140) are also insufficient to capture this trend. We think that there are two ways in which this can be addressed: 1) increasing the amount of personal-training data and/or 2) increasing the size of the public-training data by adding more subjects, which reduces the variance of the model. Fig. 6(c) and (d) show the best and worst results on dataset $D_C$. It can be observed that in both cases only a single personal-training observation is present before day 140; it lies close in time to the test data in the best case (Fig. 6(c)) and further away in time from the forecast horizon in the worst case (Fig. 6(d)). One can infer that points close in time to the forecast horizon are more important for reliable prediction than those farther away in time.
Table 4(a) and (b) suggest that most of the prediction errors are to the neighbouring classes. The accuracy is lowest for the underweight class as it is the most under-represented class (Table 3) in our dataset. Fig. 7a,c and Fig. 7b,d show the mean absolute error in prediction and the accuracy for different datasets averaged over all the subjects. Fig. 7 shows that the prediction error reduces and accuracy improves as the personal-training data availability increases.
Although at a glance at Fig. 7(a), it might look like ARIMA's MAE for dataset D C is less than proposed P T when training data is available as early as day 170. However, as described in subsection VI-B the low mean absolute error is statistically insignificant as compared to P T until day 210. As more personal-training data is available by day 210 for dataset D C and by day 240 for dataset D E , the personal mod- : Peformance scores (mean absolute error and accuracy) for the proposed approach with respect to state-of-the-art on D E and D C . A single (abscissa, ordinate) pair in the figure represent the performance score (ordinate) averaged over all the subjects with respect to availability of training data until a certain day (abscissa). MAE reduces (a,c) and accuracy increases (b,d) as availability of training data increases. Majority label percentage in respective datasets is taken as the accuracy baseline. els based on ARIMA tend to become more accurate than the proposed approach. Although, this could be of importance in problems with low forecast-horizon, but in cases where early forecast is needed such as ours, proposed approach outperforms ARIMA.
LSTM-based deep learning approaches train well when a large amount of training data is available. In our case, such personal-training data is not available for two reasons: i) the data is sampled irregularly and at a very low sampling frequency, and ii) early intervention requires using as little personal-training data as possible. For these reasons, we believe that our approach will scale better than the LSTM-based approach even when more subjects participate. Fig. 8 shows that the 'Transfer' model works better than 'LOOCV' irrespective of pre-processing. Table 4(c) shows that model transfer brings a large improvement in predicting the class "underweight" without compromising the performance on the other classes. The dataset D E exhibits more variability in weight-gain trends across the different BMI classes, with pre-pregnancy BMI ranging from 20 to 28 kg/m². This might be one reason why the model trained on this dataset generalizes well to D C.
Our proposed Bayesian approach with pre-processing achieves a prediction MAE of only 2.45 kg (D E) and 2.82 kg (D C) and a classification accuracy of 67.5% (D E) and 58.9% (D C) at day 140 (midway through the pregnancy) for early intervention. In comparison, the best of the state-of-the-art approaches has an MAE of 8.17 kg (D E) and 6.60 kg (D C) and an accuracy of up to 53.8% (D E) and 44.8% (D C). Fig. 7 shows that our approach predicts better than the state-of-the-art when trained on data available up to days 120 to 240, and predicts close to the state-of-the-art during the very last few days of the pregnancy. AuAC 140 can be thought of as an early intervention score that measures how accurate the classification is as the amount of training data varies from day 120 until day 140. The early prediction performance of our technique with transformation corresponds to an AuAC 140 of 0.65 (D E) and 0.53 (D C). Another key step in this work was to apply model transfer to test the generalisation capability of the model between two different geographic regions, which further improves the prediction capability on the sparser dataset D C.
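One way to compute such an early intervention score is sketched below. It assumes that AuAC 140 is the area under the accuracy-versus-day curve between day 120 and day 140, obtained with the trapezoidal rule and divided by the length of the interval so that the score stays between 0 and 1. This normalisation, the function name `au_ac` and the example accuracies are assumptions made for illustration, not values or definitions taken from this work.

```python
import numpy as np

def au_ac(days, accuracy, start_day=120, end_day=140):
    """Normalised area under the accuracy curve between start_day and end_day.

    `days` must be sorted in increasing order; `accuracy` holds the
    classification accuracy obtained with training data up to each day.
    """
    days = np.asarray(days, dtype=float)
    accuracy = np.asarray(accuracy, dtype=float)
    mask = (days >= start_day) & (days <= end_day)
    d, a = days[mask], accuracy[mask]
    if d.size < 2:
        raise ValueError("need at least two evaluation days in the interval")
    # Trapezoidal rule, written out to avoid depending on a specific NumPy version.
    area = np.sum(0.5 * (a[1:] + a[:-1]) * np.diff(d))
    return area / (d[-1] - d[0])

# Hypothetical accuracies evaluated every 10 days.
days = [110, 120, 130, 140, 150]
accuracy = [0.48, 0.50, 0.58, 0.62, 0.66]
print(f"AuAC 140 (sketch): {au_ac(days, accuracy):.3f}")
```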

VIII. CONCLUSION
In this study, we propose an efficient early weight-gain prediction system for pregnant women. We validate our approach and show its efficacy on a unique dataset from two diverse geographical regions. Our approach combines a-priori information learnt from the public-training data with personal-training data, tuning the model parameters for each individual based on this prior information. Additionally, we incorporate a pre-processing step that scales the data using meta-data such as pre-pregnancy weight and BMI, which gives an additional boost in performance. Our results show reliable estimation of end-of-pregnancy weight gain, which can help pre-natal care providers deliver proper interventions and reduce the risks of adverse maternal and neonatal effects of excessive or inadequate GWG.
CHETANYA PURI received the Master's degree in Telecommunication Systems Engineering from the Indian Institute of Technology, Kharagpur, India. He is currently pursuing his PhD at KU Leuven, Belgium and working as an early stage researcher on the 'HEART' project (HEart related Activity Recognition system based on IoT). His focus lies in building machine learning algorithms that can deal with limited availability of data and their applications in healthcare. His research interests are machine learning, digital signal processing, anomaly detection, and time series analysis.
Prior to joining KU Leuven, he worked as a researcher at Research and Innovation, Tata Consultancy Services, India, where he was involved in building anomaly detection techniques for cardiac health estimation using electrocardiogram, photoplethysmogram and phonocardiogram signals from wearables and other non-invasive sensors.