Models for digitally contact-traced epidemics

Contacts between people are the absolute drivers of contagious respiratory infections. For this reason, limiting and tracking contacts is a key strategy for the control of the COVID-19 epidemic. Digital contact tracing has been proposed as an automated solution to scale up traditional contact tracing. However, the required penetration of contact tracing apps within a population to achieve a desired target in the control of the epidemic is currently under discussion within the research community. In order to understand the effects of digital contact tracing, several mathematical models have been proposed. In this article, we survey the main ones and we propose a compartmental SEIR model with which it is possible, differently from the models in the related literature, to derive closed-form conditions regarding the control of the epidemic as a function of the contact tracing apps penetration and the testing efficiency. Closed-form conditions are crucial for the understandability of models, and thus for decision makers (including digital contact tracing designers) to correctly assess the dependencies within the epidemic. With our model, we find that digital contact tracing alone can rarely tame an epidemic: for unrestrained COVID-19, this would require a testing turnaround of around 1 day and app uptake above 80% of the population, which are very difficult to achieve in practice. However, digital contact tracing can still be effective if complemented with other mitigation strategies, such as social distancing and mask-wearing.


I. INTRODUCTION
Since April 2020, the WHO has been recommending two main and complementary strategies to curb the COVID-19 epidemic: social distancing on the one hand, and test, trace, treat (the famous 3 T's) on the other. As for any respiratory viral infection, the sooner we are able to "remove" contagious people from interacting with others, the sooner the epidemic will be restrained. Indeed, if, on average, each infected person infects fewer than one susceptible person, rather than, for example, two or three, the epidemic will die out naturally [1]. Of course, for this to be effective, all three T's must be carried out swiftly. Contact tracing without testing is impossible: first, you have to know that a person is potentially contagious before being able to track down their contacts. Similarly, tracing must be completed as quickly and as thoroughly as possible: the longer it takes to identify past contacts, the longer the time a potentially contagious person spends unknowingly infecting other people. Then, contagious people must be immediately isolated, and treated if necessary.

All authors are with CNR-IIT, Via G. Moruzzi 1, 56124, Pisa (e-mail: first.last@iit.cnr.it). This work was partially funded by the SoBigData++, HumaneAI-Net, and SAI projects.
Contact tracing can be performed manually or digitally. Manual contact tracing involves reconstructing the infected person's history of contacts in the days before they were detected as contagious. This is typically done through interviews with the infected person. Manual contact tracing suffers from two main problems: i) it is labor-intensive, hence it struggles to keep up when the number of daily new cases is high, and ii) the contagious person might not be able to recall their past contacts precisely (simply because they forget some or because some chance contacts with strangers are not noticed in the first place). Digital contact tracing, performed by means of smartphone apps that automatically detect and register contacts (typically via Bluetooth), has the potential to overcome both limitations. The research community has already identified convincing solutions that provide reasonable tradeoffs between privacy and tracing accuracy [2]-[5]. Specifically, decentralized Bluetooth-based contact tracing has emerged as the solution of choice, and privacy-preserving apps based on this approach have been deployed in many countries [6], [7]. Recent research proposals involve leveraging blockchains and IoT for more efficient and privacy-preserving tracking [8]. The vast majority of deployed apps leverage the Exposure Notification protocol, jointly rolled out by Apple and Google in Spring 2020. However, digital contact tracing comes with its own problems. The main one is that, for it to be effective, a significant percentage of the population must have the app installed [9]. Raising this percentage may not be as easy as it seems [10]. For example, people with old smartphones (typically not supporting Bluetooth Low Energy or for which an updated operating system is not available) cannot use the tracing functionality. Fear of government intrusion into privacy deters other potential participants.
Due to the limitations discussed above, the percentage of people with an installed and fully functioning contact tracing app will be far from 100%. Thus, key questions are, among others: how large should this percentage be for digital contact tracing to be effective in containing the epidemic? How does this percentage depend on the contact patterns between people? How does it depend on the implementation of other mitigation measures (such as social distancing and mask-wearing)? Digital contact tracing has yet to be properly evaluated as a public health measure through a large-scale assessment [11]. The answers to the above questions must therefore come from mathematical models and simulations. The network of people with the contact tracing app installed is just another instance of a mobile social network [12]: people interact with each other socially and these interactions are mirrored in the anonymous data collected by the contact tracing app. Thus, by taking advantage of the properties of this mobile social network, we will investigate the above questions.
Ferretti et al. [9] adapted a model introduced by Fraser et al. [1] (and based on the popular von Foerster equation) in order to tackle the same research problem. However, the model in [9] can only be solved numerically, and hence it is unable to yield a closed-form condition under which the epidemic can be controlled based on the characteristics of the digital contact tracing in place. In this article, to complement the model in [9], our goal is to propose a deterministic compartmental model for digital contact tracing that provides closed-form conditions for the control of an epidemic. Closed-form control conditions are crucial for the understandability of models and are instrumental for decision makers and computer scientists working on digital contact tracing.
The contributions of this paper are the following:
• To complement the model in [9], we propose a deterministic compartmental model for digital contact tracing. The advantage of this modeling approach is that closed-form solutions can be obtained, and hence analytical conditions on the control of the epidemic can be derived. Conversely, the von Foerster equation on which [9] is based is more accurate than simple compartmental models, but can only be solved numerically.
• Alongside the compartmental model, we also introduce a standalone model that captures how the testing delay affects the efficacy of the detection of infected people, depending on the duration of the latent window (the period during which an infected person is not yet contagious) and the contagious window (during which the infected person is contagious but has yet to develop symptoms). This model is general, and can be solved in closed form for some common distributions describing these time intervals.
• We apply the model to realistic COVID-19 epidemic scenarios, showing that, even with high penetration of digital contact tracing, the control of the epidemic is extremely difficult without additional mitigation strategies.

The rest of the paper is organized as follows. In Sections II-III, we overview the main results in the related literature regarding COVID-19 modeling and we summarize the properties of the disease itself that are important from the modeling standpoint. Our deterministic compartmental model is presented in Section IV, together with the model on the efficacy of detection of infected persons. The proposed model is then applied to a set of realistic epidemiological scenarios in Section V. Finally, Section VI concludes the paper.

II. A BRIEF OVERVIEW ON COVID-19
From the modeling standpoint, a crucial aspect is to understand when infected people become contagious. For any viral disease, the typical timeline is the following. Following contact with a contagious person, an individual may become infected. However, they do not become contagious immediately: there is a latent period during which the person is infected but not yet contagious (i.e., the virus is replicating but its quantity is not yet enough to infect another person). Another important stage is the incubation period, which goes from the infection time to the time when the person starts developing symptoms. The latent period may be shorter than the incubation period: this means that an infected person becomes contagious before developing symptoms. Clearly, this makes controlling the spread of the disease harder, since the contagious person who has not developed any symptoms is not aware of their contagiousness. Despite being the subject of hot debates in the initial phases of the COVID-19 epidemic, it is now clear that asymptomatic and pre-symptomatic carriers play a major role in the spread of the SARS-CoV-2 virus [13]- [27].
SARS-CoV-2 is an airborne virus, i.e., it travels through the air. The typical transmission pathway occurs when a contagious person talks, sneezes, or coughs, producing infectious droplets that are inhaled by people in close proximity. Less frequently, these droplets may fall on nearby surfaces and then contribute to transmitting the disease when a susceptible person touches the contaminated surface and then touches their face (eyes, mouth, etc.). The latter transmission pathway is known as environmental transmission. It is not known to play a major role in the COVID-19 epidemic, hence we will not consider it in the modeling. A third transmission pathway is that of aerosol [28]-[31]: when a contagious person talks, sneezes, or coughs, they also produce some smaller droplets (known as aerosol) that evaporate before they can fall to the ground [32]. This means that, with aerosol transmission, the dry virus lingers in the air for a considerable time and travels long distances. The bad news is that common face masks (such as surgical and cloth ones) are not well equipped to contain such small droplets. Thus, aerosol transmission is much more challenging than droplet transmission, which can be easily contained by relying on widespread social distancing and lower-grade mask-wearing. However, model-wise, they can both be captured by appropriately tuning the probability of infection upon contact.

III. MODELS OF EPIDEMICS
There are two main modeling approaches in the related literature: mathematical models and agent-based models. Mathematical models of epidemics typically lay out a system of Ordinary Differential Equations (ODE) or Partial Differential Equations (PDE) that describes how the number of susceptibles, infected, etc., varies over time. Sometimes these systems can be solved in closed form and provide very useful trends describing what-if scenarios. Otherwise, numerical solutions can be obtained. Due to their nature, these models are based on several simplifying initial assumptions to make the mathematical representation of the phenomenon tractable. On the opposite side of the modeling spectrum, there are agent-based models. Agent-based models are computational models where agents (corresponding to people) interact, in simulation, according to some predefined rules, which can be arbitrarily complex [33]-[39]. They are conceptually very similar to the models used in transportation simulation. They recreate synthetic populations in terms of demographics, traffic flows, etc., and then an epidemic is simulated. Recently, machine learning approaches have gained popularity. Tomy et al. [40] exploit Graph Neural Networks (GNN) to solve the problem of inferring the state of the entire population by observing just a few individuals. GNNs have also been used to forecast pandemic evolution [41]-[43], for the temporal reconstruction of epidemic spreading [44], and for identifying patient zero in an epidemic [45]. More traditional approaches have also been used, e.g., in [46], for epidemic forecasting. Since neither agent-based nor machine learning approaches are the focus of this work, we will not discuss them further.
By far, the most used mathematical model is the classical SIR model and its many variations [47]. In the basic version, people are divided into three compartments (denoted with S, I, R, hence the name of the model). In S, people are susceptible to the disease, i.e., they can become infected upon contact with an infectious person. Infectious people are in compartment I. After a certain time spent in compartment I, infected people recover and move to compartment R. Transitions between compartments are then modeled as follows:

$$S \xrightarrow{\;\beta\;} I \xrightarrow{\;\gamma\;} R$$

Parameters β and γ describe the rates at which susceptibles become infected and infected recover, respectively. The SIR model can be described by a system of ordinary differential equations that can be solved in a closed form. This representation of an epidemic is referred to as deterministic, because the above equations are an approximation, holding for very large populations, of the stochastic version (see footnote 4) of the SIR model [47]. The simple SIR model has been extended in several directions, adding the exposed compartment (where people infected but not yet contagious reside), which we also use in this work, and many more (see [47] for a general discussion and [48] for an application to COVID-19). It has also been used to study the two-pathogen case [49], when two pathogens insist on the same population (we do not consider this case in our work).

Footnote 4: Stochastic SIR models have been extensively studied in the complex systems community, as they are well suited to capturing the effect of network topology. At the beginning of a disease outbreak, individual variability (such as whether a node is a "hub" in the contact network) plays a major role in determining whether an epidemic will occur or not, and deterministic models (which treat all nodes as equal) are not an appropriate choice. In fact, deterministic models assume a fully mixed population, meaning that an infected individual has the same probability of infecting any susceptible node in the network. This assumption is needed to write down the ODE system, but it is only considered reasonable once an epidemic has started and several nodes are already contributing to it. Our model, similarly to other ODE-based models, should be used when a disease has already reached its epidemic phase, in which accurately capturing the outbreak stage is no longer important while having tools for studying controllability becomes essential. Please refer to [47], and references therein, for a detailed discussion.

The deterministic compartmental models discussed above are based on the simple assumption that the time spent in each compartment can be reasonably approximated with an exponential random variable (the Markovian assumption). When this is not the case, other types of models must be considered. An important class of non-Markovian models are those based on the McKendrick-von Foerster equation, which adds a so-called age structure to the model [47]. Originally, this model was designed to capture births and deaths in the dynamics of population growth in cellular biology: offspring are generated at a young age, and death typically occurs at an old age. Hence, keeping track of the population age over time was essential to predict the evolution of the population size. When applied to epidemiology, age is seen from the point of view of infection, i.e., it corresponds to the time since the individual became infected. The birth rate at infection age τ then becomes the rate at which a person infected τ days ago produces offspring, i.e., newly infected people. Thus, the infection rate is no longer constant over time and depends on the current age profile of the population.
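Going back to the basic deterministic SIR model introduced earlier in this section, the following minimal Python sketch integrates the standard SIR equations with a forward-Euler scheme. It is only an illustration: the function name, the step size, the population size, and the parameter values (β = 0.5 and γ = 0.25 per day) are assumptions chosen for the example, not values taken from the paper.

```python
def simulate_sir(beta, gamma, n, i0, days, dt=0.01):
    """Forward-Euler integration of the deterministic SIR model.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I

    Returns the final (S, I, R) and the peak number of infectious people.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    peak_i = i
    for _ in range(int(round(days / dt))):
        new_infections = beta * s * i / n * dt   # S -> I
        new_removals = gamma * i * dt            # I -> R
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        peak_i = max(peak_i, i)
    return s, i, r, peak_i


# Illustrative parameters (assumptions): R0 = beta / gamma = 2.
s, i, r, peak = simulate_sir(beta=0.5, gamma=0.25, n=1_000_000, i0=10, days=200)
print(f"final removed fraction: {r / 1_000_000:.3f}, peak infectious: {peak:.0f}")
```

With β > γ the number of infectious people first grows and then declines, while with β < γ it declines from the start, which is the threshold behavior that the control conditions discussed in the paper build upon.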
Finally, a related active area of study for COVID-19 is the correct estimation of the parameters that describe the dynamics of infection [50]- [55]. This is important both for purely mathematical models and for agent-based models, because a correct estimation of the epidemic parameters allows researchers to correctly set up their assumptions and simulations.

IV. FACTORING IN DIGITAL CONTACT TRACING
The McKendrick-von Foerster model introduced in Sec. III has been used in [9], a seminal paper dedicated to evaluating the efficacy of contact tracing for COVID-19. The model by Ferretti et al. [9] is based on the one proposed in [1], with parameter values customized to the COVID-19 setting. This model is the de facto reference for estimating the effectiveness of digital contact tracing, and the vast majority of forecasts, coming both from within the scientific community and from news outlets, have been based on its results. By its very nature, the McKendrick-von Foerster model can only be solved numerically. Hence, it is unable to yield a closed-form condition under which the epidemic can be controlled based on the characteristics of the digital contact tracing in place. Thus, in this work, we complement the results of Ferretti et al. [9] by showing how a simpler model, whose control condition is solvable in closed form, can be obtained. The notation we use in the paper is summarized in Table I at the end of the section.
We start with a simplified version of the model, presented in Figure 1, for illustrative purposes. As usual for deterministic compartmental models, we start with a population of constant size N, i.e., the number of people in all states must add up to N. When looking at a large population, the short-term variation in its size is small and can be neglected. The goal of the model is to capture how infection spreads through the population and to assess whether the spread can be stopped or not by means of digital contact tracing and the resulting quarantine of contacts. We assume that the epidemic is faster than the long-term dynamics of births and deaths in the population, so we ignore the latter. People can be in one of four states: S (susceptible), E (exposed), I (infectious), R (removed). Since people can be either tracked (with a contact tracing app) or untracked, we duplicate these states to account for this difference (thus, each state is marked with subscript T or U for tracked and untracked, respectively). We do not need to duplicate the removed state because people in R do not contribute to the epidemic anymore. While we use the common letters S, E, I, R to denote the states, we slightly adjust the default meaning of the states to take into account asymptomatic and presymptomatic transmission. In this model, then, exposed means infected but not yet contagious, while infectious means infected and contagious but with no symptoms (this includes the pre-symptomatic and the asymptomatic phase of the disease). The removed state includes all infected (whether contagious or not) that have been isolated and/or have recovered.

Figure 1. How people move across SEIR compartments. In red, the arcs associated with containment measures: quarantine of exposed plus isolation of those infected and contagious.
In this simplified model, a person is isolated either because she is infected and has been tracked by the contact tracing app or because she is contagious and has started developing symptoms. We do not include a dedicated state where people are both contagious and symptomatic because it is reasonable to assume that people with symptoms will isolate themselves (and therefore join the removed state). The fraction of tracked people is indicated by α, i.e., α is the fraction of the population that has installed the considered contact tracing app. Thus, at time t = 0, we have αN people in state $S_T$ (corresponding to people that are susceptible and tracked) and (1 − α)N people in $S_U$ (susceptible but not tracked). From the susceptible state, people can only move to the exposed state. Recall that "exposed", in this case, means infected but not yet contagious. Therefore, the time spent in the exposed state corresponds, without containment measures in place, to the latent period of the disease. However, there is a crucial difference between untracked and tracked exposed: tracked exposed will be notified by health authorities about their previous contact with a tracked contagious person and they will be isolated, i.e., they will move to the removed state. The same happens for tracked infectious. Removed people are isolated, and hence cannot infect anyone. This is the crucial contribution of contact tracing. Below, we summarize how the transitions from each state can be modeled.
• $S_U \to E_U$: The rate at which people leave the $S_U$ state is given by the effective contact rate β (the rate at which there is an encounter between an S and an I and this encounter generates an infection) times the number of possible encounters between people in $S_U$ and those in I (regardless of whether the encounter is with a tracked or an untracked infected).
• $S_T \to E_T$: The same holds true for the rate at which tracked susceptibles leave $S_T$, with the appropriate change from U to T of S's subscript with respect to the previous case.
• $E_U \to I_U$: Untracked exposed (people in the $E_U$ state) stay there until they become contagious, and this happens at a rate equal to the inverse of the average latent period, $1/E[L]$.
• $E_T \to I_T$: Instead, tracked exposed will either become contagious and move to state $I_T$ (this happens at the same rate as in the untracked case) or be warned of having had contact with an infected person and told to self-isolate (this happens with rate θ, discussed in detail in Sec. IV-A).
• $I_U \to R$: Untracked contagious ($I_U$) are isolated when they begin to develop symptoms, and this happens at a rate γ.
• $I_T \to R$: Instead, tracked contagious ($I_T$) can be either isolated when they start developing symptoms (similarly to the untracked case) or be informed that they have had contact with an infected person and told to self-isolate (this happens with rate ψ, discussed in detail in Sec. IV-A).
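The transitions listed above can be sketched as a small simulation. The following Python code is a minimal forward-Euler illustration, not the authors' implementation. Several elements are assumptions made for the example: the name `eps` for the exposed-to-infectious rate (its symbol is not legible in the text), the normalization of the infection term by N, the split of the initial infectious seed according to α, and all numerical values of β, θ, ψ, and α.

```python
def simulate_traced_seir(beta, eps, gamma, theta, psi, alpha,
                         n=1_000_000.0, i0=100.0, days=300.0, dt=0.01):
    """Forward-Euler sketch of the tracked/untracked SEIR model.

    eps is a placeholder name (assumption) for the exposed-to-infectious rate.
    Returns (state, cumulative_infections); state maps compartment names to sizes.
    """
    # Assumption: susceptibles and the initial infectious seed are split by app uptake.
    s_t, s_u = alpha * (n - i0), (1.0 - alpha) * (n - i0)
    i_t, i_u = alpha * i0, (1.0 - alpha) * i0
    e_t = e_u = r = 0.0
    cumulative = i0
    for _ in range(int(round(days / dt))):
        # Infection pressure from all infectious people, tracked or not.
        force = beta * (i_t + i_u) / n
        s_t_to_e = force * s_t * dt            # S_T -> E_T
        s_u_to_e = force * s_u * dt            # S_U -> E_U
        e_t_to_i = eps * e_t * dt              # E_T -> I_T (end of latent period)
        e_u_to_i = eps * e_u * dt              # E_U -> I_U
        e_t_to_r = theta * e_t * dt            # E_T -> R (notified and quarantined)
        i_t_to_r = (gamma + psi) * i_t * dt    # I_T -> R (symptoms or notification)
        i_u_to_r = gamma * i_u * dt            # I_U -> R (symptoms only)
        s_t -= s_t_to_e
        s_u -= s_u_to_e
        e_t += s_t_to_e - e_t_to_i - e_t_to_r
        e_u += s_u_to_e - e_u_to_i
        i_t += e_t_to_i - i_t_to_r
        i_u += e_u_to_i - i_u_to_r
        r += e_t_to_r + i_t_to_r + i_u_to_r
        cumulative += s_t_to_e + s_u_to_e
    state = {"S_T": s_t, "S_U": s_u, "E_T": e_t, "E_U": e_u,
             "I_T": i_t, "I_U": i_u, "R": r}
    return state, cumulative


# Illustrative run (all values are assumptions): latent period of 3 days,
# contagious-but-asymptomatic period of 2 days, 80% app uptake.
_, without_tracing = simulate_traced_seir(1.0, 1 / 3, 0.5, 0.0, 0.0, alpha=0.8)
_, with_tracing = simulate_traced_seir(1.0, 1 / 3, 0.5, 0.3, 0.3, alpha=0.8)
print(f"cumulative infections without tracing: {without_tracing:.0f}")
print(f"cumulative infections with tracing:    {with_tracing:.0f}")
```

Setting θ = ψ = 0 recovers a plain SEIR model in which the tracked and untracked populations behave identically, which makes the effect of the two tracing-related transitions easy to isolate.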
The key point in being able to add the effect of contact tracing to the SEIR equations is to adequately model the red transitions in Figure 1. The rates θ and ψ capture how effective digital contact tracing is in removing infected people, and factor in testing delay as well as epidemic features (latent period, contagious period, etc.). We will discuss this aspect in Section IV-A and provide a methodology to derive θ and ψ. For now, let us assume that we can assign proper values to all the parameters of the model. In Theorem 1 below, we discuss how to solve the model and how to assess whether the epidemic can be controlled or not depending on the efficacy of contact tracing. The proof of the theorem can be found in Appendix A.
Theorem 1. The epidemic described by the SEIR model in Figure 1 can be controlled when the following condition is true:

Remark. The closed-form condition in Theorem 1 could not have been obtained with the model used by Ferretti et al. in [9], which can only be solved numerically. Closed-form conditions are crucial for the understandability of models, and thus for decision makers (including digital contact tracing designers) to correctly assess the dependencies within the epidemic. Note also that the complexity of evaluating this condition is O(1), hence it does not depend on the size of the population, unlike what happens, e.g., with agent-based models. Thus, C1 provides an easily interpretable and fast, albeit approximate, answer to the controllability problem.
To illustrate the intuition behind condition C1 in Theorem 1, let us consider two ideal cases separately: i) instantaneous tracing (θ + ψ → ∞) and ii) perfect app uptake (α = 1). These correspond to the two dimensions of digital contact tracing: how good we are at detecting infections among tracked people and how many people we are able to track. When tracing is instantaneous (the first case above), the threshold on α (derived from Equation 1) converges to $1 - \gamma/\beta$. For SIR models, the ratio $\beta/\gamma$ corresponds to the basic reproduction number $R_0$ [56], hence the threshold on α is, interestingly, equivalent to the herd immunity threshold $1 - 1/R_0$. Note that instantaneous tracing alone is not sufficient to control the epidemic: α must be high enough for tracing to cover a large fraction of the population. Superfast tracing that follows only a tiny fraction of the population is basically useless. In the second case (perfect app uptake, i.e., α = 1), condition C1 reduces to γ + θ + ψ > β. Therefore, θ + ψ must be large enough to compensate for a high β (effective contact rate). This means that even under the ideal situation where everyone has the app (α = 1), control of the epidemic may not be attainable if the tracing process is slow. The efficiency of contact tracing is captured by θ and ψ and, in Section IV-A, we discuss how to derive them.
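The two limiting cases discussed above can be checked numerically with a few lines of Python. This sketch encodes only the two special cases stated in the text (it does not implement the general condition C1); the function names and the numerical values are hypothetical and chosen for illustration.

```python
def uptake_threshold_instant_tracing(beta, gamma):
    """Minimum app uptake alpha when tracing is instantaneous: 1 - gamma/beta.

    With R0 = beta / gamma this equals the herd immunity threshold 1 - 1/R0.
    The result is clamped at 0 (an addition of this sketch) for beta <= gamma,
    where the epidemic dies out without any tracing.
    """
    return max(0.0, 1.0 - gamma / beta)


def controllable_with_full_uptake(beta, gamma, theta, psi):
    """Control condition for alpha = 1: gamma + theta + psi > beta."""
    return gamma + theta + psi > beta


# Illustrative values (assumptions): R0 = beta / gamma = 3.
beta, gamma = 1.5, 0.5
print(uptake_threshold_instant_tracing(beta, gamma))       # about 0.667
print(controllable_with_full_uptake(beta, gamma, 0.4, 0.4))
print(controllable_with_full_uptake(beta, gamma, 0.6, 0.6))
```

In this example, slow tracing (θ = ψ = 0.4) fails to control the epidemic even with everyone tracked, while faster tracing (θ = ψ = 0.6) succeeds.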
A. Estimating parameters θ and ψ from contact history
Now, we step back and discuss how to model θ and ψ, which are the rates at which exposed and infectious people are removed, respectively. As discussed above, they capture the effectiveness of testing. To derive them, we have to reconstruct the process from contagion (an encounter with an infectious person that leads to infection) until removal.
Exposure notifications are triggered by tracked people becoming symptomatic and, therefore, being tested. Since SEIR models assume homogeneity in encounters (which boils down to a single β describing the entire contact process, with no distinction between people with high and low levels of social interaction), the contact rate at which the newly symptomatic tracked person met tracked susceptibles is αβ (i.e., the baseline rate scaled by the fraction of tracked people). This rate must be split across the different states in which the previous contact might currently be. Specifically, a past contact can be still exposed, already infectious, or removed. We neglect the removal of susceptibles because the population of susceptibles is very large (by assumption, S ∼ N), hence removing them would not impact the epidemic. Thus, Definition 1 below follows.
Definition 1 (Alertable Contacts). The alertable contacts of a positive person i can be a) in state $E_T$ (no symptoms, not contagious), b) in state $I_T$ (contagious, no symptoms), or c) in state R (symptomatic or recovered, hence already "removed" from the epidemic). We denote the probabilities associated with each of these conditions as $p_E$, $p_I$, and $p_R$, respectively (note that they add up to 1).
Intuitively, health authorities should strive to increase $p_E$ as much as possible, because people in the exposed state have yet to infect anyone. Of course, this might not be possible (e.g., due to testing delays), so the next best thing is to increase $p_I$. Instead, notifying people who are already in the removed state is completely useless from the point of view of epidemic containment. As illustrated in Figure 2, we can model the conversion to symptomatic of a previous contact considering: the length of the latent period L (which, as discussed in Section II, goes from the time of infection to the time a person becomes contagious), the length of the infectious but asymptomatic period C, and the testing delay T (the time it takes for a test result to be available after the person has developed symptoms). Note that L and C are only determined by the properties of the specific disease. On the contrary, T is totally dependent on the efficacy of the testing system in place, hence it can be shortened by human interventions (e.g., using rapid tests rather than molecular ones or scaling up testing facilities). The probability distributions of L, C, and T can be obtained from real data, when available (at the end of the section, we will discuss an example based on a realistic duration of the latent and infectious windows). Using their distributions, we can characterize the only missing time interval in Figure 2: A, which represents the time it takes for the app notification to pop up after contact. In Lemma 1 we derive the distribution of interval A.

Lemma 1. The random variable A describing the time interval between the at-risk contact and the time when the notification from the contact tracing app arrives is distributed as $C' + T$, i.e., as the sum (between random variables) of the residual infectious-but-asymptomatic period $C'$ and the testing delay T.
Proof. As illustrated in Figure 2, A describes the time at which the contact tracing app notification arrives. This time corresponds to the interval between contact with an infectious person and the notification time, hence it includes a residual contagious period (which we denote with $C'$) and the testing delay T. Thus, A is distributed as $C' + T$ (hence its PDF is given by the convolution of the PDFs of $C'$ and T [57]). Mathematically, $C'$ can be obtained by assuming that the contact between the susceptible and the contagious individual occurs at a random point of C, as seen by a random observer: as a result, $C'$ can be derived as the residual duration [58] of C for the infectious person i (i.e., the time left before the person becomes symptomatic and is hence discovered to be positive). Denoting by $\bar{F}_X$ the CCDF of a generic random variable X, the PDF of the residual time $C'$ is the following [58]:

$$f_{C'}(x) = \frac{\bar{F}_C(x)}{E[C]}, \quad x \ge 0. \tag{2}$$

Since the distribution of C can be derived from real epidemic data, that of $C'$ can also be calculated.
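To make the residual-time construction used in the proof more concrete, the Python sketch below computes the mean and variance of the residual contagious period in two ways: by numerically integrating the standard residual-time density (CCDF of C divided by E[C]) and through the well-known moment identities E[residual] = E[C²]/(2E[C]) and E[residual²] = E[C³]/(3E[C]). It assumes, purely for illustration, that C is normally distributed with mean 2 days and standard deviation 0.5, so that negative values are negligible; the function names are hypothetical.

```python
import math


def normal_ccdf(x, mu, sigma):
    """P(X > x) for X ~ Normal(mu, sigma^2)."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def residual_moments_numeric(mu, sigma, steps=20000):
    """Mean and variance of the residual time, integrating f(x) = CCDF_C(x) / E[C]
    over [0, mu + 10 sigma] with the trapezoidal rule."""
    upper = mu + 10.0 * sigma
    h = upper / steps
    m1 = m2 = 0.0
    for k in range(steps + 1):
        x = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        density = normal_ccdf(x, mu, sigma) / mu
        m1 += weight * x * density * h
        m2 += weight * x * x * density * h
    return m1, m2 - m1 * m1


def residual_moments_closed_form(mu, sigma):
    """Same quantities from E[C^2] = mu^2 + sigma^2 and E[C^3] = mu^3 + 3 mu sigma^2."""
    mean = (mu ** 2 + sigma ** 2) / (2.0 * mu)
    second = (mu ** 3 + 3.0 * mu * sigma ** 2) / (3.0 * mu)
    return mean, second - mean ** 2


print(residual_moments_numeric(2.0, 0.5))
print(residual_moments_closed_form(2.0, 0.5))
```

The two approaches agree up to the (tiny) probability mass that the normal distribution places on negative durations, which supports using the closed-form moments when a normal approximation of A is needed.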
Now that A is fully characterized, by deriving its interplay with L and C we obtain $p_E$, $p_I$, and $p_R$ in Lemma 2 below.

Lemma 2. The probabilities $p_E$, $p_I$, and $p_R$ (associated, respectively, with catching a person in state $E_T$, $I_T$, and R) are given by the following:

$$p_E = P(W < 0), \qquad p_I = P(0 < W < C), \qquad p_R = P(W > C) = 1 - p_E - p_I,$$

where we denote by W the difference A − L.

Proof. From Figure 2, we can see that $p_E$ is equivalent to the probability of the notification arriving within the latent period (corresponding to P(A < L)). The probability that the notification arrives during the contagious and asymptomatic period (P(L < A < L + C)) yields $p_I$. The value of $p_R$ can then be obtained by complementing to 1 (or by computing P(A > L + C), i.e., the probability that the notification arrives when the individual is already symptomatic). Rewriting these probabilities in terms of W = A − L yields the thesis.
The above algebra of random variables does not yield closed-form solutions for all distributions, but for some significant ones it does, at least approximately. This happens, e.g., in the Normal case discussed in the next section (Sec. IV-A1). Closed-form solutions can also be obtained with exponential random variables. Once the probabilities $p_E$ and $p_I$ are derived, it is straightforward to obtain rates θ and ψ.
Theorem 2. The rates at which exposed and contagious people are removed (θ and ψ, respectively) are given by the following:

$$\theta = \alpha \beta \, p_E, \qquad \psi = \alpha \beta \, p_I,$$

where $p_E$ and $p_I$ are obtained as in Lemma 2.
Proof. The thesis simply follows from scaling the overall tracked contact rate αβ by the probability that the contact is notified while still in the exposed state or in the infectious state, respectively.
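As a small worked illustration of the scaling argument in the proof, the Python sketch below reads it as θ = αβ·p_E and ψ = αβ·p_I. This reading, the function name, and the numerical values are assumptions made for the example, not results reported in the paper.

```python
def tracing_rates(alpha, beta, p_e, p_i):
    """Removal rates of tracked exposed (theta) and tracked infectious (psi).

    Assumes the tracked contact rate alpha * beta is split according to the
    probabilities p_e and p_i of notifying a contact while still exposed
    or already infectious, respectively.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be a fraction in [0, 1]")
    if p_e < 0.0 or p_i < 0.0 or p_e + p_i > 1.0 + 1e-12:
        raise ValueError("p_e and p_i must be probabilities with p_e + p_i <= 1")
    tracked_contact_rate = alpha * beta
    return tracked_contact_rate * p_e, tracked_contact_rate * p_i


# Illustrative values (assumptions): 50% uptake, beta = 1 per day.
theta, psi = tracing_rates(alpha=0.5, beta=1.0, p_e=0.4, p_i=0.3)
print(theta, psi)
```

Under this reading, both rates grow linearly with the app uptake α, while the testing delay enters only through the probabilities p_E and p_I.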
1) Example with normally distributed characteristic times: For the sake of example, we can now obtain $p_E$, $p_I$, and $p_R$ leveraging the typical average duration of the latent and contagious periods for the original COVID-19 epidemic. From [59], we obtain the average duration of the latent period (E[L] = 3 days) and that of the period during which an infected person is contagious but not yet symptomatic ($E[C] = \gamma^{-1}$ = 2 days). Note that the expectations of L and C correspond to the inverse of the exposed-to-infectious rate and of γ, respectively, in the SEIR model of Section IV. For example, assume that L and C are normally distributed, each with a standard deviation of 0.5 (the occurrence of negative values with this configuration is negligible). We also assume that T is normally distributed, with mean $\mu_T$ and standard deviation $\sigma_T$. It is easy to verify that $A = C' + T$, where $C'$ is the residual contagious period of Lemma 1, can be approximated as normally distributed as well, specifically $A \sim N(E[C'] + \mu_T, \mathrm{Var}(C') + \sigma_T^2)$. Since we are dealing with normally distributed variables, it is easy to obtain their difference and sum using the algebra of normally distributed random variables.

Table I. Notation used in the paper.
Symbol           Meaning
$E[L]$           average time from infected to contagious (inverse of the exposed-to-infectious rate)
$\gamma^{-1}$    time from contagious to removed (recovery/isolation)
θ                rate at which tracked exposed are isolated
ψ                rate at which tracked infectious are isolated
α                fraction of population with the app installed and running
L                latent period
C                infectious-but-asymptomatic period
T                testing delay
A                time between at-risk contact and app notification
W                difference A − L (time between becoming contagious and app notification)
$p_E$            probability of alerting a person in state $E_T$
$p_I$            probability of alerting a person in state $I_T$
$p_R$            probability of alerting a person in state R
Leveraging the formulas we have obtained, we can now better understand the impact of testing delays on the ability to intercept infected people in each stage using contact tracing. In the following, we focus on a tagged pair of people (one tracked infectious $i$ and one tracked susceptible $j$ infected by $i$, analogously to Figure 3) and we study the probability that $j$ is notified when in the exposed, infectious, or removed state, respectively, as we vary the testing delay. Note that, since we focus on a tagged pair of tracked people, this result does not depend on $\alpha$, which is a population-level parameter. As Figure 3 shows, as long as the test turnaround is less than 2 days, the infected person is most likely caught before becoming contagious. Conversely, beyond a 4-day turnaround, we basically intercept only people who are already contagious or even symptomatic. As discussed above, the earlier we intercept infected people, the better. Small testing delays are thus a key ingredient of a containment plan.

V. CAN DIGITAL CONTACT TRACING TAME AN EPIDEMIC?
Now we use the model defined in the previous section and apply it to study the effectiveness of digital contact tracing in controlling the COVID-19 epidemic. To this end, we need COVID-19-specific estimates for $\beta$ (effective contact rate), $\gamma$ (transition rate to symptomatic), and $\epsilon$ (transition rate to contagious), which are the baseline (i.e., not depending on the contact tracing process) transition rates in Figure 1. Then, $\theta$ and $\psi$ (which instead are also determined by the contact tracing and the testing process) can be derived as discussed in Section IV-A. Note that the digital contact tracing process itself can be described with two parameters: $\alpha$, the app uptake, and the test delay $\mu_T$. In the following, we will assess their impact on the control condition C1 (Theorem 1) for different epidemic scenarios. Specifically, we consider three cases. First, we test the effect of an increasing latent period. Then, we study the effect of a longer infectious period with constant transmissibility (see below for details). Finally, we consider what happens when transmissibility increases.

Fig. 3. $p_E$, $p_I$, and $p_R$ as the average testing delay increases.
Regarding the baseline epidemic parameters, generally $\gamma$ and the ratio $\beta/\gamma$, corresponding to the basic reproduction number $R_0$, are estimated early in an epidemic. Hence, in the following, $\beta$ will be set to the value yielding the COVID-19 $R_0$ for the chosen $\gamma$. Intuitively, the basic reproduction number $R_0$ captures the transmissibility of a disease, i.e., the average number of cases directly generated by one infectious person in a population with a very large number of susceptibles. As we also see in the following, the larger $R_0$, the more difficult it is to contain the epidemic with digital contact tracing (and in general).
We start (Figure 4) with a scenario with an average contagious-but-asymptomatic window length ($\gamma^{-1}$) equal to 2 days (typical of COVID-19) and fixed $R_0 = 2$ (this value corresponds to the initial 2020 estimate for COVID-19 in [9]). We test the effect of an increasing latent period length ($\epsilon^{-1} \in \{1, 3, 5\}$ days) on the controllability. Specifically, we plot condition C1 in Equation 1 as a function of app uptake $\alpha$ and testing delay (the two directions along which digital contact tracing can be improved): C1 holds true in the shaded areas of the plot (therefore, the epidemic is under control). Intuitively, the longer the latent period, the easier it is to control the epidemic, because we have more time to intercept tracked people before they become contagious. Figure 4 confirms this: as $\epsilon^{-1}$ (the latent window) increases, the importance of small testing delays is reduced, and an app uptake above 80% may be sufficient to control the outbreak. Looking again at Figure 4, we also note that an increase in $\epsilon^{-1}$ induces a temporal shift of the controllability boundary (the curves delimiting the shaded areas in the figure): in other words, the boundary shifts along the x-axis as the latent window increases.
Next, we consider the effect of the infectious period when the transmissibility is kept constant. Without contact tracing, two epidemics with the same transmissibility evolve in the same way (since the controllability condition reduces to $R_0 < 1$). However, the length of the infectious period affects the effectiveness of contact tracing. Therefore, the digital controllability of epidemics with the same $R_0$ is different if their infectious periods are different. In Figure 5, we set the latent window $\epsilon^{-1}$ at 3 days (its average value for COVID-19) and vary $\gamma^{-1}$ (the duration of the infectious period) in $\{2, 5, 8\}$ days. Note that we want $R_0$ to remain fixed (no change in transmissibility), so we vary $\beta$ accordingly. This implies that each person, on average, infects the same number of people in all three cases captured by Figure 5. We observe that the longer the contagious period, the easier the containment (since the shaded area expands). This was expected because a longer contagious period increases the probability of catching infected people before they become symptomatic. With respect to the controllability boundary (the curves delimiting the shaded areas in the figure), a change in $\gamma$ induces a change in the convexity of the boundary but no horizontal shift.
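Keeping $R_0 = \beta/\gamma$ fixed while the infectious period changes amounts to rescaling $\beta$, as the short sketch below shows. The helper name `beta_for_r0` is hypothetical, and $R_0 = 2$ is used here only as an example value taken from the first scenario.

```python
def beta_for_r0(r0, infectious_days):
    """Return the contact rate beta (1/day) that yields R0 = beta / gamma."""
    if infectious_days <= 0:
        raise ValueError("the infectious period must be positive")
    gamma = 1.0 / infectious_days  # rate is the inverse of the mean duration
    return r0 * gamma


r0 = 2.0  # example value only
betas = {days: beta_for_r0(r0, days) for days in (2, 5, 8)}
for days, beta in betas.items():
    print(f"gamma^-1 = {days} days -> beta = {beta:.3f} per day")
```

A longer infectious period therefore corresponds to a lower contact rate per day: the same number of secondary infections is spread over more time, which leaves more room for tracing to intervene.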
Finally, in Figure 6 we fix the latent period $\epsilon^{-1}$ and the contagious period $\gamma^{-1}$ to their typical COVID-19 values of 3 and 2 days, respectively, and we change the $R_0$ of the epidemic by varying $\beta$. Note that this analysis is especially important, given the rise of new variants with increased transmissibility (and therefore greater $R_0$). We study $R_0 \in \{2, 3, 4, 5, 6\}$. $R_0 = 2$ is the initial estimate for COVID-19 (original strain), later revised to be much higher in some areas (e.g., the estimate in [60] is $R_0 \sim 4$). The Alpha variant (B.1.1.7 lineage) is estimated to feature at least 40% higher $R$ [61] with respect to the original strain, while the Delta variant (B.1.617.2) has transmissibility estimated between 6 and 7 [62], [63]. Note that the apparently even higher transmissibility of the recent Omicron variant (B.1.1.529) seems to be due to immune evasion (e.g., vaccines not as effective as for previous variants) rather than to an actual increase in basic transmissibility [64]. Figure 6 shows that, as expected, the impact of increasing $R_0$ is much more disruptive than that of different latent/contagious windows. Specifically, even with instantaneous testing, the minimum uptake $\alpha$ needed to control the epidemic increases as $R_0$ increases. This means that even a small fraction of untracked people can wreak havoc on the containment measures. In practice, with $R_0 = 4$, the control of the epidemic through digital contact tracing becomes impossible: an uptake above 95% is unrealistic for all the reasons discussed in Section I (e.g., technical problems with old smartphones, distrust by a fraction of the population). In this case, $R_0$ must also be reduced by exploiting mitigation measures (social distancing, masks), in order to reduce the probability of infection upon contact, hence $\beta$.

VI. CONCLUSION
In this work, we have discussed the modeling efforts for COVID-19 and we have proposed a SEIR model that factors in digital contact tracing and is capable of producing a closed-form condition on the controllability of the epidemic. Leveraging this model, we have studied how the penetration of digital contact tracing apps within the population impacts the control of the epidemic. We have found that the penetration must in general be high, hence digital contact tracing may not be sufficient to contain an epidemic, even with a fast turnaround of tests. Additional mitigation strategies must be implemented, such as social distancing and mask wearing. Additionally, the impact of digital contact tracing is highest when the testing delay is low. If the test turnaround is greater than 4 days, digital contact tracing has essentially no impact on containment. In future work, we plan to extend the model to account for more variability among nodes, e.g., in terms of contact patterns (along the lines of [65]), while still striving for closed-form controllability conditions.

Fig. 6. Controllability when $R_0 = \beta/\gamma$ increases (we fix $\gamma$ and we increase $\beta$). We study $R_0 \in \{2, 3, 4, 5, 6\}$. Control is attained in the shaded regions.

APPENDIX A
PROOF OF THEOREM 1
Proof. We start by writing the ODE system corresponding to Figure 1:

The corresponding system of ODEs can be rewritten in matrix form as $\dot{y} = Ay$, where $y = [E_U, I_U, E_T, I_T]^T$ and $A$ is given by the following:

The system in (5) describes a dynamic system. Its stability (corresponding to the epidemic being under control or not) is assessed by studying its eigenvalues (see, e.g., [60]), which correspond to the roots of the characteristic polynomial $p_A(x)$ of matrix $A$. In fact, since the solutions to a system of linear ODEs $\dot{y} = Ay$ are of the form $y(t) = \sum_i c_i e^{r_i t}$ [66] (where the $c_i$'s are constants and the $r_i$'s the eigenvalues/roots), it is clear that a positive root introduces instability into the system, because there would be an exponential function with a positive argument, hence an exponential growth in the epidemic. Therefore, we can study the roots of the characteristic polynomial $p_A(x)$ to assess under which conditions only negative roots exist.
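The link between root signs and epidemic growth can be illustrated on a case small enough to solve directly. The sketch below is an assumption made for illustration and is not the matrix $A$ of the proof: it uses the two-compartment linearization without contact tracing, $\dot{E} = -\epsilon E + \beta I$ and $\dot{I} = \epsilon E - \gamma I$, whose characteristic polynomial is $x^2 + (\epsilon + \gamma)x + \epsilon(\gamma - \beta)$. The function name and the numeric values are hypothetical.

```python
import math


def growth_rates(beta, gamma, epsilon):
    """Roots of x^2 + (epsilon + gamma) x + epsilon (gamma - beta).

    These are the eigenvalues of [[-epsilon, beta], [epsilon, -gamma]],
    a reduced no-tracing system assumed here only for illustration.
    """
    b = epsilon + gamma
    c = epsilon * (gamma - beta)
    # The discriminant equals (epsilon - gamma)^2 + 4*epsilon*beta, so it is
    # never negative for positive rates and both roots are real.
    disc = b * b - 4.0 * c
    root = math.sqrt(disc)
    return (-b - root) / 2.0, (-b + root) / 2.0


# beta > gamma (R0 = 2): one positive root, so the epidemic grows.
_, largest_uncontrolled = growth_rates(beta=1.0, gamma=0.5, epsilon=1 / 3)
# beta < gamma (R0 = 0.8): both roots negative, so the epidemic dies out.
_, largest_controlled = growth_rates(beta=0.4, gamma=0.5, epsilon=1 / 3)
print(f"largest root with R0 = 2.0: {largest_uncontrolled:+.3f}")
print(f"largest root with R0 = 0.8: {largest_controlled:+.3f}")
```

In this reduced case the constant term $\epsilon(\gamma - \beta)$ changes sign exactly at $\beta = \gamma$, which mirrors how the sign of the constant coefficient decides controllability in the four-dimensional system.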
In order to avoid a trivial case, we assume $\beta > \gamma$ (i.e., the epidemic is not under control without contact tracing). Using Descartes' rule of signs, we can derive the number of positive and negative roots of $p_A(x)$ without actually having to solve the polynomial (finding a closed form for the roots would not be feasible in this case). Starting with positive roots, we observe the following signs:
\[ \{+, +, \mathrm{sgn}(k_2), \mathrm{sgn}(k_1), \mathrm{sgn}(k_0)\}, \]
where we have expressed $p_A(x)$ as $\sum_i k_i x^i$, sgn is the sign function (where $\mathrm{sgn}(\cdot) = 1$ corresponds to sign $+$, $\mathrm{sgn}(\cdot) = -1$ corresponds to $-$), and $k_2$, $k_1$ are given by the following, with $k_0$ as defined in Equation 8:
\[ k_2 = -\beta\epsilon + \epsilon^2 + \gamma(\gamma + \psi) + \epsilon(4\gamma + 2\psi + \theta), \]
\[ k_1 = \epsilon^2(2\gamma + \psi + \theta) + \epsilon\{\gamma[2(\gamma + \psi) + \theta] - \beta(\epsilon + \gamma + \psi - \psi\alpha)\}. \]
By studying the functions $\mathrm{sgn}(k_2)$, $\mathrm{sgn}(k_1)$, $\mathrm{sgn}(k_0)$, we obtain the following relationships between the coefficients' signs:
\[ \mathrm{sgn}(k_0) = 1 \Rightarrow \mathrm{sgn}(k_1) = 1 \Rightarrow \mathrm{sgn}(k_2) = 1. \]
In other words, since the rates must all be positive and $\alpha \in [0, 1]$, when coefficient $k_0$ is positive, $k_2$ and $k_1$ must also be positive. This implies that not all possible sign permutations in Equation 7 are attainable, as illustrated in Table II.
Discarding unattainable permutations, we can have at most one sign change across the coefficients of the polynomial, which implies at most one positive root. It thus follows that the condition under which we observe no sign changes is also the condition under which the epidemic can be controlled (zero positive roots). Thanks to Equation 9, we know that $\mathrm{sgn}(k_0) = 1$ is a sufficient condition for this to happen. Then, solving for $k_0 > 0$ (with $k_0$ defined in Equation 8), we obtain condition C1 in Equation 1.
To conclude the proof, we only need to verify that there are no complex roots. This is easy to do by applying Descartes' rule again, this time to $p_A(-x)$. To this aim, we need to change the coefficient sign of odd-power terms (i.e., $k_3$, $k_1$) in Table II and count the sign changes. By summing the sign changes for positive and negative roots corresponding to the same permutation (equivalently, by summing the sign changes per corresponding row in Table II and Table II with the sign of odd-power terms changed), we obtain the total number of real roots. If we do the math, we discover that the total sign changes are at most 4, hence $p_A(x)$ features four real roots. Then, the number of complex roots is given by the difference between the degree of the polynomial and the maximum number of real roots. Since $p_A(x)$ is a polynomial of degree 4, we know that there are no complex roots.
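The sign-counting steps used with Descartes' rule can be sketched as follows. The helper names and the two example quartics are hypothetical (they are not the coefficients $k_i$ of the proof), and the counts returned are upper bounds: the actual number of positive (or negative) real roots equals the bound or falls short of it by an even number.

```python
def sign_changes(coeffs):
    """Count sign changes along a coefficient sequence, skipping zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(a != b for a, b in zip(signs, signs[1:]))


def descartes_bounds(coeffs):
    """Upper bounds on positive and negative real roots of a polynomial.

    `coeffs` lists coefficients from the highest power down to the constant.
    The bound for negative roots is obtained from p(-x), i.e., by flipping
    the sign of the odd-power coefficients.
    """
    degree = len(coeffs) - 1
    flipped = [c if (degree - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]
    return sign_changes(coeffs), sign_changes(flipped)


# (x+1)(x+2)(x+3)(x+4): all coefficients positive, so no positive root.
stable = [1, 10, 35, 50, 24]
# (x-1)(x+2)(x+3)(x+4): a single sign change, so exactly one positive root.
unstable = [1, 8, 17, -2, -24]

print(descartes_bounds(stable))
print(descartes_bounds(unstable))
```

With a single sign change the bound is tight, since one minus a positive even number would be negative; this is why "at most one sign change" translates into a clean dichotomy between zero and one positive root.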