Using Epidemic Modeling, Machine Learning and Control Feedback Strategy for Policy Management of COVID-19

Coronavirus disease (COVID-19) is one of the world’s most challenging pandemics, affecting people around the world to a great extent. Previous studies forecasting the COVID-19 pandemic have either lacked generalization and scalability or lacked surveillance data. City administrators have also often relied heavily on open-loop, belief-based decision-making, which prevents them from identifying and enforcing timely policies. In this paper, we conduct mathematical and numerical analyses based on closed-loop decisions for COVID-19. Combining epidemiological theory with machine learning models gives this study a more accurate prediction of COVID-19’s growth and suggests policies to regulate it. The Susceptible, Infectious, and Recovered (SIR) model was analyzed using a machine learning model to estimate the optimal constant parameters, namely the infection and recovery rates of the coupled nonlinear differential equations that govern the epidemic model. To modulate the optimized parameters that regulate pandemic suppression and mitigation, a systematically designed feedback-based strategy was implemented. We also used pulse width modulation to modify on-off signals in order to regulate policy enforcement according to established metrics, such as the Infection Recovery Ratio (IRR). This made it possible to determine what type of policy should be implemented in a country, as well as how long it should be implemented. Using datasets from Johns Hopkins University for six countries (India, Iran, Italy, Germany, Japan, and the United States), we show that our 30-day prediction errors are almost all below 3%. Our model proposes a threshold mechanism for policy control that divides policy implementation into seven states. For example, if IRR > 80, we suggest a complete lockdown, whereas if 10 < IRR < 20, we suggest encouraging people to stay at home and organizations to work at 50% capacity.
The model accurately predicted the cases of all countries that implemented a policy control strategy at an early stage. Furthermore, we determined that implementing closed-loop strategies at different times during a pandemic effectively controlled it.


I. INTRODUCTION
Human populations have been affected by communicable diseases since ancient times [1]. Two of the most ancient and deadly diseases of humanity, tuberculosis and malaria, ravaged Ancient Egypt for more than 5,000 years and are still a major health problem today [2]. In 2009, a new A/H1N1 influenza virus emerged, causing the first global pandemic in 40 years [3]. Within the first two months of the outbreak, the disease had spread to more than 70 countries [...]

The associate editor coordinating the review of this manuscript and approving it for publication was Yiqi Liu.

[...] generally relies on keeping the pandemic ($R_0$) below a time. This can have the unintended consequence of causing economic activities to plummet, thereby putting millions of jobs at risk. The aim of this work is to assist city officials in making data-informed decisions that keep people safe while sustaining economic activities during any pandemic outbreak. This work addresses how effectively we can modulate and manage public health policies by developing hybrid models that are not purely data-driven but are also based on control feedback and time-based monitoring of the country. It also addresses how we can measure a pandemic's contagiousness using a novel threshold mechanism to control the pandemic. In our research, we explored how feedback can help stabilize and slow the progression of this deadly viral infection. Using engineering principles, we have come up with a practical approach to provide policymakers with concrete guidance, one that takes both medical and socioeconomic factors into account. We relied on feedback-based strategies to control the outbreak and manage the longer-term caseload effectively. The goal of this paper is to develop a fundamental understanding of the impact of interventions on the pandemic's spread and of their ability to forecast its progression. We present a novel data-based modeling approach that is equally efficient in controlling and suppressing the spread of viruses. In this study, we address the following challenges: (1) how to evaluate public health policies using models that are not only data-driven but also based on a control feedback strategy; (2) how to calculate the Infection Recovery Ratio (IRR) to assess the contagiousness of pathogens, as explained in [19]; and (3) how to identify the best-fit model for predicting the COVID-19 outbreak using different analytical techniques. This work addresses these issues and provides the following contribution: [...]

[...]ing algorithms [29], [30], [31]. The main limitation is that [...]

It has been observed that most studies concentrate mainly on either classical epidemiological models or machine learning models for COVID-19 pandemic predictions, both of which have limitations in generality and scalability, as well as a paucity of monitoring data [8]. Furthermore, it is particularly difficult to calculate mortality rates among reported cases (the case fatality ratio) in the early stages of an epidemic. These inaccuracies and biases can therefore carry over to estimates of the impact of the public health measures being taken to contain COVID-19 in the community [32]. Researchers have also applied feedback-based control theory to control the number of infections by monitoring $R_0$ in conjunction with the number of fatalities [18]. Although $R_0$ is a standard measure of disease spread, it indicates neither the severity of an infectious disease nor the rapidity of a pathogen's spread [33]. This paper uses a novel metric to measure the disease's contagiousness.

In [34], [35], and [36], simple social distancing control actions are analyzed for controlling the impact of a pandemic (more specifically, universal single-interval social distancing is used). These studies presented different techniques for determining the optimal single-interval control action in order to minimize the infected peak prevalence, which is defined as the maximum proportion of infected individuals. Almost all of these proposals assume a full lockdown, a scenario that is somewhat unrealistic.

The aim of this study is to develop a hybrid strategy that combines communicable disease modeling, data-driven methods, and control feedback theory, a standard tool in control engineering, to address the limitations outlined above. Even though the aforementioned models are useful for predicting epidemic spread, they lack the granularity necessary to analyze individual behaviors during epidemics and the relationship between individual decisions and epidemic spread. Therefore, such a high-level analysis is of limited use to city officials in adjusting public health policy guidelines [37].

In this section, we provide details of the proposed work with respect to the SIR model, machine learning, and the control feedback strategy.

Multiple mathematical models have been developed to examine the spread of infection. One of these is the SIR model, which is composed of three coupled nonlinear ordinary differential equations. The model assumes that the population size is constant and divides the population into three parts: Susceptible ($s$), Infected ($i$), and Recovered ($r$). The SIR differential equations can be expressed as follows [38]: [...] where $s(t)$ is the susceptible population at time $t$, $i(t)$ is the infected population at time $t$, $r(t)$ is the recovered population at time $t$, $k_1$ is the infection rate, and $k_2$ is the recovery rate.

In these differential equations, the ratio of the constant coefficients $k_1$ (the infection rate of the pathogen) and $k_2$ (its recovery rate) determines the IRR, which is $k_1/k_2$, as described in [19]. This paper employs a machine learning (ML) model [...] Specifically, we calculate the loss function (LF) for training data spanning $T$ days, also referred to as time period $T$. In addition, $t$ refers to the day within that time period.

The ML model is trained to minimize the LF in order to find the optimal values of $k_1$ and $k_2$. To minimize the LF, the limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm with bound constraints (L-BFGS-B) [39] is used with the minimize optimization function in the scikit-learn library [40]. Training the ML model determines the optimal values of $k_1$ and $k_2$, which are the constants in the SIR model. Using the SIR differential equations in conjunction with ML strategies, the model was trained to predict SIR curves, with $k_1$ and $k_2$ optimized after training.

SIR graphs include curves representing actual data, as well as the S, I, and R curves, which are fitted curves obtained from training the model, along with a prediction for a near-future date. In this study, the initial values of I and R are taken from real-world data, and the initial value of the susceptible population is calculated using the appropriate ratio when compared with the number of cases in mainland China.

Table 1 shows that the default values of $k_1$ and $k_2$ are set to 0.001. Observations have shown that when the initial values of the S, I, and R populations are changed, the curves change and take on a new shape and behavior. This behavior will be described in the results section. By using ML, the $k_1$ and $k_2$ parameters of the SIR model are optimized to find the point of intersection of the infected and recovered cases. A simple but elegant method for controlling the multiple input signals related to the various trigger points is to look at the intersection point between the patients who are infected and those who have recovered. These trigger points are denoted by phases: phase $P_1$ policies are enforced when predicted infected patients exceed predicted recovered patients, and phase $P_2$ policies are not enforced when predicted infected patients fall below predicted recovered patients. Feedback control can be used to modulate signals between phases, as discussed in Section III-C. [...] follows: [...] where $n$ is the degree of the polynomial and $c$ is a set of coef- [...]

In the SIR model, the constant value determines the pulse width (PW) of the control signal, enabling the government to decide which policy to implement. In addition, the constants $k_1$ and $k_2$ of those differential equations are also used to calculate the IRR, which is defined as $k_1/k_2$. In this study, a square wave is used as the control signal, while a PWM signal controls when the signal is turned on or off. Digital signals can be either 0 or 1, so we use a signal value of 1 to represent complete lockdown (strict restriction policies must be implemented) and 0 to represent no restriction. Actions are defined as the enforcement of a policy within a specific phase. Figure 2 illustrates the phase transitions.
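The phase rule based on the intersection of the predicted infected and recovered curves can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and keeping the signal on for the whole horizon when the curves never cross is a placeholder assumption, not the paper's rule.

```python
import numpy as np


def intersection_day(infected, recovered):
    """Index of the first day on which predicted recovered >= predicted infected.

    Returns None when the curves do not cross inside the prediction horizon.
    """
    infected = np.asarray(infected, dtype=float)
    recovered = np.asarray(recovered, dtype=float)
    crossed = np.flatnonzero(recovered >= infected)
    return int(crossed[0]) if crossed.size else None


def square_wave(n_days, pulse_width):
    """On-off control signal: 1 (policies enforced) for pulse_width days, then 0."""
    signal = np.zeros(n_days)
    # Placeholder assumption: with no crossing, stay on for the whole horizon.
    on_days = n_days if pulse_width is None else min(pulse_width, n_days)
    signal[:on_days] = 1.0
    return signal


# Illustrative predicted curves (made-up numbers, not data from the paper).
infected = [5, 8, 10, 9, 6, 3]
recovered = [0, 1, 3, 6, 9, 12]
pw = intersection_day(infected, recovered)
print(pw, square_wave(len(infected), pw))
```

Here day 0 stands for the first date in the training data, so the returned index plays the role of the pulse width measured from that date.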
Specifically, the PW is calculated as the difference in x-coordinates between the first date in the training data and the intersection point of the SIR curves for infected and recovered patients. The start date is the first date in the training data that is considered in training the model. Thus, pandemic control can only be achieved if real cases follow the path of the predicted curves after this tipping point. It is often the case that the curves do not intersect in the near future because the IRR is high or an epidemic is ravaging the country. Therefore, instead of using a large PW in such cases, we use thresholds of IRR values to define the PW as follows: [...]

In the next section, we will introduce the transfer function, [...] We will assume the system is a simple single-pole filter in this article, with the following characteristics [42]: [...] With the above equations where $\omega_0$ is proposed as: [...]

Modulated control signals can be passed through a transfer function, where the PW of the input signal and the pole of the transfer function are set by parameters of the SIR model. If only square waves are used, policies are strictly on-off: the lockdown is imposed abruptly, without warning, and it is lifted just as suddenly, since all restrictions are revoked at once. Instead, the control signal is passed through a transfer function to allow a gradual implementation of policies.
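The smoothing effect of a single-pole filter on a square wave can be illustrated with a short sketch. It assumes the common first-order low-pass form $H(s) = \omega_0 / (s + \omega_0)$ and an arbitrary illustrative value for $\omega_0$; the paper's own expression for $\omega_0$ is not reproduced here, and the function name is hypothetical.

```python
import numpy as np


def single_pole_filter(u, omega0, dt=1.0):
    """Pass a sampled signal through H(s) = omega0 / (s + omega0).

    The input is held constant over each sampling interval (zero-order hold),
    which gives the exact recursion y[n] = a*y[n-1] + (1 - a)*u[n-1]
    with a = exp(-omega0 * dt) and y[0] = 0.
    """
    u = np.asarray(u, dtype=float)
    a = np.exp(-omega0 * dt)
    y = np.zeros_like(u)
    for n in range(1, len(u)):
        y[n] = a * y[n - 1] + (1.0 - a) * u[n - 1]
    return y


# A 30-day horizon with restrictions on for the first 20 days (illustrative values).
square = np.concatenate([np.ones(20), np.zeros(10)])
policy_signal = single_pole_filter(square, omega0=0.3)
print(np.round(policy_signal, 2))
```

The filtered signal rises towards 1 gradually and decays gradually once the square wave switches off, so its intermediate values can be read as progressively stricter or looser policy levels. A larger `omega0` makes the ramp steeper.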

Following the filtering phase, the control signal enters the state of imposition of the most restrictive measures that have the least economic impact on the population and maintain a balance between public health and economic needs. Restrictions are implemented progressively, depending on the IRR and $k_1$. For instance, in a country with a high IRR, complete lockdown is imposed more quickly, with fewer intermediate states of pandemic control. As the y-value changes within the PW from 0 to 1, it indicates the specific policy that the government should implement. The gradual implementation of policies ensures that economic activities are not disrupted and that stricter policies are imposed only when necessary. In Figure 3, phase transitions and state transitions are illustrated using a modulated signal passed through a transfer function while considering the parameters of the SIR model.

Table 1 provides the data fed into the ML model for each of the six countries included in the study, as well as the IRR, $k_1$, and $k_2$ outputs of the ML model. The optimal values are obtained using the L-BFGS-B algorithm after a model has been fitted using the SIR differential equations. Additionally, [...]

The framework of the proposed work is shown in Figure 4. [...] then PWM signals regulate when the signal is on or off, and a transfer function is subsequently used to implement the policy at different states depending on the optimized IRR values. In this way, policy enforcement can be implemented gradually and turned on or off as necessary, rather than abruptly imposed on the public without warning during the phase.

A. DATASET
SIR curves were plotted using a dataset from Johns Hopkins University (JHU), which includes a time series of COVID-19 cases [43]. The dataset contains country-specific distributions of confirmed cases, deaths, and recovered cases, as well as day-specific counts for each category. It is a time-series dataset that covers COVID-19 information for a duration of 510 days, from 1/22/20 to 6/14/21. The number of days of data used for training varies for each country, and we match the training data duration in such a way that data before the onset of COVID-19 spread in a country is used for training and the SIR curves are predicted for the next 30 days (which is the size of the testing data [...]

To demonstrate the predictability of SIR curves, six countries were selected from around the world: the US, India, Iran, Italy, Germany, and Japan. The COVID-19 pandemic curve is drawn from January 22, 2020 forward, with each country having its own demographics, social habits, and policies regarding the pandemic. [...] for India, Italy, Iran, Germany, the US, and Japan, respectively. Here, the blue line represents the number of active cases in the country. The red dashed line and the red dashed rectangle represent the predicted behavior of the infected curve and the actual behavior of infected cases across the country, respectively. The green dotted line and the green dotted circle represent the predicted behavior of recovered cases and the actual recovered cases in the country, respectively. The SIR graphs of India, Italy, Iran, and Germany show the least variation between actual and predicted values. By contrast, the actual values for Japan and the US do not match the regression predictions. This is because these countries have implemented some restrictive measures to combat the spread of the disease. In 2020, the spread in the US was growing at a faster rate, and the IRR reached 150, which was alarming. Our hypothesis suggests that a nationwide lockdown could have prevented the widespread disruptions caused by the COVID-19 pandemic. The plot shows that, in the US, imposing a lockdown would have decreased the [...]

[...] Table 1, while solid lines represent the control signal passing through a single-pole filter with $\omega_0$ as the pole. As the final control signal is not a square wave, it suggests gradual [...]

The control signal's y-coordinates are used to determine which policy to adopt. If a country has a high IRR factor, such as the US, less time is spent in the $S_0$ to $S_5$ states and, therefore, the $S_6$ state is reached relatively faster than with respect to Japan, which has a low IRR factor. As seen from the PW of Japan, the majority of the PW indicates that the country falls into the $S_3$ to $S_5$ states. In contrast to Iran, the control signal does not indicate a total lockdown within the next 30 days.

[...] Table 1. The black line represents the proposed policy control signal to be implemented by the country. The black policy control signal is the output signal of the transfer function when the red square wave is fed through as input.

In Table 2, we show the y-coordinate of the control signal and the policy that a country needs to follow to attain a given state. Depending on the pandemic control signal's x-coordinate interval, the country determines how long it will take to implement a particular policy. [...] Figure 9 (c)).

In the second case, India's curves only correspond to the period around the outbreak of the pandemic. As seen in [...]

Epidemiological models such as the SIR model lack heterogeneity and can easily be programmed and analyzed [44].

There is also a shared opinion that SIR models are not completely appropriate due to their inflexibility [...] into account, it would recommend a full lockdown within the same period instead of the above measure. This is because the infection rate is higher than the recovery rate. As a result of gradual implementation, the health care system is not overloaded and is capable of effectively treating the pandemic.

In this paper, data analysis has been conducted using 2020 COVID-19 datasets. In addition, the study examines threshold mechanisms for implementing state policies during pandemic outbreaks. The threshold values for these policies were determined with a limited dataset. In the future, we plan to use ML to calculate and predict the thresholds for implementing policies during pandemic outbreaks.

This paper uses the SIR model, a feedback strategy based on control theory, and regression models to study the behavior of the COVID-19 pandemic. We deduced that the SIR model is best suited for analyzing long-term trends in the spread of diseases, whereas the regression model provides better results than the SIR model during the outbreak phase. The results can be quantified using the mean squared error between the predicted and actual values. At an early stage of the pandemic, the results shown in the paper make it evident that there is a large difference between the predicted and actual curves of the number of infected people. The graphical results and the mean squared error show that the SIR model cannot provide a useful early prediction of the epidemic in this case. However, we use the SIR parameters along with feedback-based control theory to implement policy guidelines in the form of phases and states. The control feedback strategy helped determine not only the type of policy that should be implemented in a country but also the length of time for which it should be implemented. The model was evaluated against regression analysis in understanding when and where the pandemic strategy should be employed. As a result of this study, country officials would be able to control the pandemic without negatively impacting the economy. In countries such as India and the US, the SIR curves do not converge because the infection rate increases much faster than the recovery rate. In addition, we found that the SIR model along with machine learning accurately computed the actual and predicted cases for all countries with the exception of Japan and the US, since these countries did not implement a policy control strategy at an early stage, whereas India, Iran, Italy, and Germany did. Furthermore, the states that we proposed, based on SIR parameters and a feedback strategy rooted in control theory, can be used effectively by public officials. We found that if these state policies had been implemented at the right time, the pandemic would have been controlled without affecting the economies of the aforementioned countries.
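The error measure mentioned above can be computed as in the following sketch. The mean squared error follows its usual definition; the percentage error is included only as one plausible reading of the 30-day prediction error quoted in the abstract, which is an assumption. The numbers are made up for illustration.

```python
import numpy as np


def mean_squared_error(actual, predicted):
    """Average of the squared differences between predicted and actual values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((predicted - actual) ** 2))


def mean_percentage_error(actual, predicted):
    """Mean absolute error relative to the actual values, in percent (assumed metric)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - actual) / np.abs(actual)) * 100.0)


# Illustrative values only; these are not results from the paper.
actual = [100, 110, 120]
predicted = [102, 108, 123]
print(mean_squared_error(actual, predicted), mean_percentage_error(actual, predicted))
```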

For future work, we plan to use ML for predicting IRR values. Furthermore, the idea of a control feedback strategy for policy management can be tried in different use cases for pandemic control, two of which are presented here.
On account of pandemic spread, it is basic to know the spa-