Resilience Estimation of Cyber-Physical Systems via Quantitative Metrics

This paper is about the estimation of the cyber-resilience of CPS. We define two new resilience estimation metrics: <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula>-steerability and <inline-formula> <tex-math notation="LaTeX">$\ell $ </tex-math></inline-formula>-monitorability. They aim at assisting designers to evaluate and increase the cyber-resilience of CPS when facing stealthy attacks. The <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula>-steerability metric reflects the ability of a controller to act on individual plant state variables when, at least, <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula> different groups of functionally diverse input signals may be processed. The <inline-formula> <tex-math notation="LaTeX">$\ell $ </tex-math></inline-formula>-monitorability metric indicates the ability of a controller to monitor individual plant state variables with <inline-formula> <tex-math notation="LaTeX">$\ell $ </tex-math></inline-formula> different groups of functionally diverse outputs. Paired together, the metrics lead to CPS reaching <inline-formula> <tex-math notation="LaTeX">$(k,\ell)$ </tex-math></inline-formula>-resilience. When <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\ell $ </tex-math></inline-formula> are both greater than one, a CPS can absorb and adapt to control-theoretic attacks manipulating input and output signals. We also relate the parameters <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\ell $ </tex-math></inline-formula> to the recoverability of a system. We define recoverability strategies to mitigate the impact of perpetrated attacks. 
We show that the values of <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\ell $ </tex-math></inline-formula> can be augmented by combining redundancy and diversity in hardware and software, in order to apply the moving target paradigm. We validate the approach via simulation and numeric results.


I. INTRODUCTION
Cyber-Physical Systems (CPS) integrate network and software resources to control and monitor physical components operating on different spatial and temporal scales [15]. Examples of CPS include industrial control systems for energy distribution (e.g., smart grids), autonomous vehicles, robotics and next-generation medical systems. Since physical, networked and computational components are deeply intertwined, the protection of the system as a whole highly relies on steerability and monitorability. Steerability refers to the ability of a controller to drive and maintain a Cyber-Physical System (CPS) in a desired operating point, by sending command and control input signals to the system actuators. Monitorability indicates the capability of the system to process and interpret output signals produced by the system sensors, in order to accurately deduce the internal state of the CPS.
The associate editor coordinating the review of this manuscript and approving it for publication was Azwirman Gusrialdi.
The controller follows reference signals such that the CPS ends up being asymptotically stable. In the short term, slight variations of either the inputs or outputs of the system do not affect the stability. However, in the long term, such slight variations may affect and disrupt the system. This is the goal of a cyber-physical adversary. Given the knowledge of central controllers about the physical behavior of a CPS, i.e., steerability and monitorability, the goal of the defender is to face faults and attacks by increasing system recoverability, i.e., the system must be able to adapt and bounce back from stability disruptions as quickly as possible.
Recent studies acknowledge the vulnerability of CPS to integrity and availability attacks [2], [4], [23]. In this paper, we focus on covert attacks [28], [29], i.e., a family of cyber-physical attacks taking the form of physical aggressions against the operation of a CPS, by manipulating input signals to actuators and output signals from sensors. The approach is, however, valid for other families of integrity and availability attacks reported in the control-theoretic literature (the family of covert attacks being reported as the more difficult one to handle [29]).
In this paper, we introduce the notion of k-steerability. The parameter k corresponds to the minimum number of input signals available to act on each individual plant state variable. We also define the concept of ℓ-monitorability. The parameter ℓ reflects the minimum number of output signals that can be used to monitor each individual plant state variable. We study values of k and ℓ with respect to system resilience and the ability to recover a plant state. If, due to covert attacks, h input signals are compromised, then steerability of each individual plant state is not entirely lost as long as h is lower than k. This partial steerability can be leveraged to run a covert attack mitigation plan. If, due to covert attacks, g output signals are hacked, then the ability to detect the condition is not entirely lost as long as g is lower than ℓ. We discuss how k and ℓ can be determined and augmented by adding redundant and diverse hardware. k-steerability and ℓ-monitorability combine together into the (k, ℓ)-resilient metric. Our work is about resilience estimation. It is complementary to work on security risk assessment. In the sequel, we elaborate further on how to use them together. We assume that both input and output signals can be correlated into functionally diverse groups, as a complement to the traditional use of redundancy in critical systems.
The use of redundancy assumes the inclusion of alternative copies, e.g., sensors, actuators and controllers, in order to guarantee system availability. If the system finds itself under attack, and the values of a group of components are not behaving as expected, then such values can be validated against the values of redundant replicas, assuming that there was an attack affecting the system. This technique is complementary to fault tolerance techniques, also used to address situations in which some system components are victims of failures or faults. However, the use of redundancy for security purposes may have some drawbacks. Since the replicas may be seen as identical, once an attacker has managed to compromise one of them, the rest of the replicas can also be compromised very easily. Hence, we need to impose the use of diversity. For instance, if the replicas are geographically distributed, or the replicas compute their values using different physical phenomena, the approach can improve the handling of attacks exploiting the physical nature of vulnerable components. Hence, our approach assumes the existence of different replicas behaving in an independent manner and with non-overlapping patterns (e.g., physical patterns) to handle the attacks.
In terms of contributions¹:
• We propose a novel design-stage approach to cyber-physical resilience, thinking in terms of physical processing and intentional attacks.
• We introduce the steerability and monitorability measures paired together as the (k, ℓ)-resilience metric, k being the degree of steerability and ℓ the degree of monitorability.
• We relate the (k, ℓ)-resilience metric to recoverability, i.e., the number of steps required to recover from attacks.
• We review example CPS and their resilience estimation.
• We validate the approach with numeric simulations and real-world examples, including analysis of the proposed ideas under practical attacks with respect to performance disruption.
Sections II and III introduce CPS modeling, covert attacks and related work. The k-steerability, ℓ-monitorability and (k, ℓ)-resilient concepts are developed in Section IV. The design and evaluation of representative CPS that are (k, ℓ)-resilient are reviewed in Section V. Section VI concludes the paper. Additional details about our simulations and results are available in an appendix.

II. SYSTEM AND ADVERSARY MODELS
A CPS consists of a plant and a controller. They are distributed and communicate through a network. Several mathematical models exist for representing them [6], [21]. In the sequel, we introduce the necessary modeling background, using a CPS with fluid dynamics as an example.

A. DIFFERENTIAL EQUATION REPRESENTATION
Let us consider as a plant an individual cylindrical tank, with a single inflow and a single outflow of liquid. The tank liquid level can be modeled by the following differential equation [30]:
$$\alpha \frac{dh(t)}{dt} = F(t) - a\sqrt{h(t)} \qquad (1)$$
Eq. (1) models the relationship between the instantaneous change of the liquid level and the difference between the inflow rate and outflow rate. As a function of time t (seconds), the level of the liquid in the tank is h(t) (cm). The parameter α represents the cross-sectional area of the tank (cm²). The term F(t) represents the inflow rate (cm³/second). The parameter a denotes the outlet valve coefficient. The outflow rate (cm³/second) is the product of a times the square root of the liquid level at time t, represented by the term $a\sqrt{h(t)}$. Note that, because of the square root term, the system is nonlinear.
The model represented by Eq. (1) is linearized assuming a linear inflow rate and operation around a liquid level $h_0$, termed the operating point. It is assumed that the level is maintained at point $h_0 + \varepsilon$, with $|\varepsilon|$ small. Linearization is based on the observation that the expression $(1 + \varepsilon)^\beta$ is approximately equal to the expression $1 + \beta\varepsilon$, when $|\varepsilon| \ll 1$. In the expression $a\sqrt{h(t)}$, substituting $h(t)$ by the sum $h_0 + \varepsilon$, we get a linear model for the outflow rate:
$$a\sqrt{h_0 + \varepsilon} = a\sqrt{h_0}\sqrt{1 + \frac{\varepsilon}{h_0}} \approx a\sqrt{h_0}\left(1 + \frac{\varepsilon}{2h_0}\right)$$
Inflow rate F(t) is modeled by the product $\gamma\kappa v$. The parameters γ, κ and v respectively denote the valve coefficient, pump coefficient (cm³/V second) and voltage applied to the pump (V). The voltage v is the variable governed by the controller. The resulting linear differential equation modeling the liquid level is:
$$\alpha \frac{dh(t)}{dt} = \gamma\kappa v - a\sqrt{h_0}\left(1 + \frac{\varepsilon}{2h_0}\right) \qquad (2)$$
Eq. (2) is an approximation that remains valid as long as ε is relatively small, that is, at the chosen operating point $h_0$ there are only small level fluctuations. To maintain that condition, the inflow rate $\gamma\kappa v$ must be equal to the outflow rate $a\sqrt{h_0}$.
The state-space representation of a linear CPS is as follows:
$$x_{i+1} = A x_i + B u_i + w_i \qquad (3)$$
$$y_i = C x_i + D u_i + v_i \qquad (4)$$
Eq. (3) models the evolution of the CPS. At time i, given input $u_i$, state $x_i$ and state noise $w_i$, it yields the next state $x_{i+1}$. Eq. (4) models the sensor measurements $y_i$, subject to measurement noise $v_i$. We now map Eq. (2) to a state-space representation. The input $u_i$ is the voltage applied to the pump. Let t be the continuous time corresponding to the discrete time i. The state variable $x_i$ tracks the difference between the liquid level h(t) and operating point $h_0$, i.e., $x_i = h(t) - h_0$, which is the symbol ε in Eq. (2). The corresponding state, input, output and direct transmission matrices are given in Eq. (5). The state vector has one element $x_i[1]$, which is the current level difference ε, w.r.t. the operating point $h_0$. The state matrix A contains one element, which is used to calculate in a transition, from time i to time i + 1, the change in the amount of liquid leaving the tank, i.e., $\frac{a}{2\alpha\sqrt{h_0}} \cdot x_i[1]$. Note that at the operating point, the total amount of liquid leaving the tank is the subtrahend in Eq. (2), divided by α. The input vector has a single element $u_i[1]$. The input matrix B contains one element and calculates the amount of liquid coming into the tank in one transition, i.e., the product $\frac{\gamma\kappa}{\alpha} \cdot u_i[1]$. Note that this mapping has the advantage of linearity, but its fidelity is limited to small fluctuations around an operating point. As we move away from the operating point, the effect of gravity is distorted. This degree of fidelity is nevertheless more than sufficient for the type of analysis conducted in the sequel of this paper.
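The limited fidelity of the linearization can be checked numerically. The following Python sketch compares the square-root outflow term of Eq. (1) with its first-order approximation around the operating point, and integrates both models with a simple Euler scheme. The parameter values (`ALPHA`, `A_OUT`, `H0`), the integration step and all function names are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
ALPHA = 30.0   # tank cross-sectional area (cm^2)
A_OUT = 2.0    # outlet valve coefficient
H0 = 16.0      # operating point (cm)


def outflow_nonlinear(h):
    """Outflow rate a * sqrt(h) of the nonlinear model, Eq. (1)."""
    return A_OUT * math.sqrt(h)


def outflow_linear(eps):
    """First-order approximation of the outflow rate around H0."""
    return A_OUT * math.sqrt(H0) * (1.0 + eps / (2.0 * H0))


def simulate(eps0, inflow, steps, dt, linear):
    """Euler integration of the level deviation eps = h - H0."""
    eps = eps0
    for _ in range(steps):
        out = outflow_linear(eps) if linear else outflow_nonlinear(H0 + eps)
        eps += dt * (inflow - out) / ALPHA
    return eps


# Inflow rate that balances the outflow rate at the operating point.
F_EQ = A_OUT * math.sqrt(H0)

# Approximation error close to, and far from, the operating point.
small = abs(outflow_nonlinear(H0 + 0.5) - outflow_linear(0.5))
large = abs(outflow_nonlinear(H0 + 12.0) - outflow_linear(12.0))
print(f"approximation error at a deviation of 0.5 cm: {small:.5f}")
print(f"approximation error at a deviation of 12 cm:  {large:.5f}")
```

Calling `simulate(1.0, F_EQ, 2000, 0.1, True)` and `simulate(1.0, F_EQ, 2000, 0.1, False)` integrates the linear and nonlinear models from the same small initial deviation, so the two results can be compared directly.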
Together, Equations 3, 4 and 5 model the dynamics of the plant. Integrated in a CPS, the input and output signals are transported over a network. It is reasonable to assume that input and output signals are protected with security protocols. It is also reasonable to expect that such protocols have vulnerabilities, initially unknown but eventually uncovered and exploited by an adversary, in a manner such as the one discussed in the upcoming Section II-C. As a second line of defense, it is also reasonable to believe that attack detection methods, such as the ones discussed in the upcoming Section III, are deployed. The CPS has attack protection and detection. However, this individual cylindrical tank CPS lacks alternative inputs and outputs that can be used to steer and monitor the plant when one input, one output or both are attacked. These backup inputs and outputs should ideally reflect different physical phenomena, protected by different security protocols with of course their own vulnerabilities, but possibly unlikely to be uncovered at the same time as for the main input and output. The systematic estimation of this type of resilience, which at the outset simply calls for common sense, is precisely the purpose of this article.

C. ADVERSARY MODEL
We assume adversaries perpetrating covert attacks. Covert attacks are a family of cyber-physical attacks in which the adversary perturbs the state of a CPS while succeeding in evading detection, i.e., the adversary attempts to remain invisible [32], [34], [37], [38]. It is a powerful attack because it is assumed that the adversary knows the plant dynamics (matrices A, B, C and D) and that input and output signals can be spoofed. While an attack is being carried out, the perpetrator manipulates the measurements to conceal the effect of the spoofed inputs. Hence, from the point of view of an observer, responsible for detecting attacks, the measurements look normal. Using Eqs. (3) and (4), attacks are represented as follows: The variable $u^a_i$, in U, denotes the addition of the adversary to the signals to the actuators. The term $s^a_i$, in $\mathbb{R}^n$, represents the manipulation done by the adversary on the sensor measurements.
The adversary model succinctly captures covert attacks where an adversary has the ability to manipulate actuators and sensors. Attacks can be perpetrated by insiders, but also by outsiders due to communication channel vulnerabilities. For instance, an adversary infiltration was perpetrated on a steel mill using a spear phishing email tactic, first achieving access to the corporate network and then succeeding in entering the plant network [16]. Generic covert attacks exploiting communication channel vulnerabilities have been modeled in numerous papers [29], [31], [36].
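To illustrate why a covert attack is invisible to an observer, the following Python sketch simulates a noise-free scalar plant. It assumes an additive attack: the adversary adds a term to the actuator signal and, knowing the plant dynamics, subtracts the predicted effect from the sensor measurement. The plant values `A`, `B`, `C`, the constant injection of 0.4, the absence of a direct transmission term and all names are assumptions made for this sketch.

```python
# Hypothetical scalar plant (all values are illustrative assumptions).
A, B, C = 0.9, 0.5, 1.0


def run(steps, attack):
    """Return (true states, measurements) with or without a covert attack."""
    x = 0.0    # true plant state
    xa = 0.0   # adversary's own model of the state offset it causes
    states, measurements = [], []
    for _ in range(steps):
        u = 1.0                        # nominal controller input
        ua = 0.4 if attack else 0.0    # adversary's addition to the actuator signal
        sa = -C * xa                   # sensor manipulation hiding the offset
        states.append(x)
        measurements.append(C * x + sa)
        x = A * x + B * (u + ua)       # plant evolves under the spoofed input
        xa = A * xa + B * ua           # adversary tracks the effect of its input
    return states, measurements


nominal_states, nominal_meas = run(50, attack=False)
attacked_states, attacked_meas = run(50, attack=True)

offset = attacked_states[-1] - nominal_states[-1]
gap = max(abs(a - n) for a, n in zip(attacked_meas, nominal_meas))
print(f"state offset caused by the attack: {offset:.3f}")
print(f"largest difference seen in the measurements: {gap:.2e}")
```

Because the plant is linear, the attacked state is the nominal state plus the offset tracked by the adversary, which is why the manipulated measurements coincide with the nominal ones up to rounding.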

III. RELATED WORK
Methods have been devised to detect covert attacks. They all require the analysis of inputs and outputs of the plant. Rubio-Hernan et al. [24]-[26] have revisited challenge-response detectors via authentication techniques, initially proposed by Mo et al. [18], [19], [35]. Hoehn and Zhang [10] and Schellenberger and Zhang [27] developed the idea of external synthetic states that evolve in parallel and are coupled to the physical states of the CPS. Adversaries can apply system identification [30] and machine learning [11], [30] to infer the dynamics of the plant. All detection methods acknowledge that the adversary has the ability to learn the dynamics of the CPS. However, they are all based on the important assumption that the knowledge of the adversary is not perfect. Due to this imperfect knowledge, the adversary makes errors that may be caught by the detection methods. Whether they are caught or not depends on the degree of knowledge of the adversary and the level of difficulty to avoid being detected. To make it challenging, detection methods comprise the integration of time-varying elements (inputs or states) concealed in the dynamics of the plant. Assuming the parameters of these elements are changed fast enough, the dynamics of the plant become a moving target for the adversary [13], [14]. In other words, the adversary does not have enough time to learn properly, makes errors and perpetrates attacks that are not covert [8], [9]. Next, we discuss in more detail the concepts of challenge-response and auxiliary state.

1) CHALLENGE-RESPONSE AUTHENTICATION
Challenge-response detectors, defined in [24]-[26], revisit the authentication signal in [18], [19] to extend error detectors into cyber-physical attack detectors. The resulting scheme provides a real-time protection of the linear time-invariant models of the plant. Built upon Kalman filters and linear-quadratic regulators, the scheme produces authentication signals to protect the integrity of physical measurements communicated over the cyber and physical control space of a networked control system. It is assumed that, without the protection of the networked messages, malicious actions can be conducted to mislead the system towards unauthorized or improper actions, i.e., by disrupting the plant services.
Assume $u^*_i$ as the output of a controller and $u_i$ the control input that is sent to the plant, cf. Eq. (3). The idea of challenge-response authentication is to superpose on the control law $u^*_i$ an authentication signal $\Delta u_i \in \mathbb{R}^p$ that serves to detect integrity attacks. Thus, the control input $u_i$ is given by:
$$u_i = u^*_i + \Delta u_i$$
The authentication signal is a Gaussian random signal with zero mean that is independent both from the state noise ($w_i$) and measurement noise ($v_i$). The authentication signal is used by the detector to identify the malicious signals originated by the adversary. Since the control law $u^*_i$ carries the authentication signal $\Delta u_i$, the detector (physically co-located within the controller) triggers an alarm whenever a malicious signal is observed, i.e., whenever the challenge sent by the controller to the plant is not observed within the measurements returned by the plant. Towards this end, [18], [19] propose to employ a $\chi^2$ detector, i.e., a well-known category of real-time anomaly detectors classically used for fault detection in control systems [3], for the purpose of signaling the anomalies identified in the behavior of the plant.
Further details about more powerful challenge-response detectors, capable of identifying adversaries that use identification tools such as ARX (autoregressive with exogenous input) and ARMAX (autoregressive-moving average with exogenous input) [20] to evade detection, are available in [25], [26].
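The principle of challenge-response authentication can be sketched on a scalar plant. The Python example below is a deliberately simplified stand-in for the schemes cited above: it replaces the Kalman filter by an open-loop model predictor, sets the control law to zero so that the input is the authentication signal alone, and uses a windowed sum of normalized squared residuals as a $\chi^2$-style statistic. The adversary replays previously recorded measurements. All numeric values, the threshold and the names are assumptions made for illustration.

```python
import random

A, B, C = 0.8, 1.0, 1.0    # hypothetical scalar plant
NOISE_STD = 0.01           # state and measurement noise (assumed)
WATERMARK_STD = 1.0        # authentication signal standard deviation (assumed)
WINDOW = 10                # number of residuals in the detection statistic
RESIDUAL_VAR = 4e-4        # assumed nominal residual variance
THRESHOLD = 200.0          # deliberately conservative alarm threshold (assumed)


def run(steps, replay_from, seed=1):
    """Return the detection statistic at each step.

    From step `replay_from` on, the adversary replays measurements recorded
    earlier instead of forwarding the real ones.
    """
    rng = random.Random(seed)
    x, x_hat = 0.0, 0.0
    recorded, residuals, stats = [], [], []
    for i in range(steps):
        y_real = C * x + rng.gauss(0.0, NOISE_STD)
        recorded.append(y_real)
        attacked = replay_from is not None and i >= replay_from
        y = recorded[i - replay_from] if attacked else y_real
        residuals.append(y - C * x_hat)
        window = residuals[-WINDOW:]
        stats.append(sum(r * r for r in window) / RESIDUAL_VAR)
        u = rng.gauss(0.0, WATERMARK_STD)   # control law is zero: u is the challenge
        x = A * x + B * u + rng.gauss(0.0, NOISE_STD)
        x_hat = A * x_hat + B * u           # detector knows the challenge it sent
    return stats


clean = run(200, replay_from=None)
replayed = run(200, replay_from=100)
alarms = [i for i, g in enumerate(replayed) if g > THRESHOLD]
print("first alarm at step:", alarms[0] if alarms else None)
```

The replayed measurements carry an old challenge rather than the current one, so the residuals grow by several orders of magnitude once the replay starts, while they stay near the noise level in the attack-free run.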

2) AUXILIARY STATES
The CPS can also be augmented with a synthetic auxiliary state, synthetic outputs and optionally new inputs [10], [27]. The auxiliary state has linear time-varying dynamics that evolve in parallel with the CPS. The dynamics are concealed from the adversary. Because they are time-varying, they become a moving target that is challenging for an adversary to identify, identification being a precondition to the covert attack [7]. However, the dynamics are known to and used by the operator to detect the covert attack. The operator is in synchrony with the linear time-varying dynamics. It is therefore able to track them properly and compare the actual evolution of the auxiliary dynamics with the expected evolution. Significant discrepancies indicate the presence of anomalies, which can be used to identify the adversary.
The CPS model is extended with the auxiliary state $\tilde{x}_i$ and additional actuators and sensors ($\tilde{u}_i$ and $\tilde{y}_i$) related to the auxiliary state. The state $x_i$ and auxiliary state $\tilde{x}_i$ are correlated. Together with the auxiliary state, the state transformation model is: Together with the additional elements, the sensor measurements are: with Hidden from the adversary, the state sub-matrices $A_{1,i}$ and $A_{2,i}$, the input matrix $B_i$, output matrix $C_i$ and direct transmission matrix $D_i$ are randomized variables. According to the approach proposed by Schellenberger and Zhang [27], the actual matrices are randomly switched from time to time. The operator and CPS are synchronized on the switching sequence, perhaps through a switching signal. This secret is not shared with the adversary. Sensor measurement $\tilde{y}_i$ is visible to the adversary, but changes over time in a random way. The adversary is challenged with learning the random auxiliary system state, input, output and direct transmission matrices.
We have introduced the system and adversary models and reviewed defense methods. In the next sections, we build upon that material and introduce new ideas to address resilience and state recovery.

IV. THE (k, ℓ)-RESILIENT PROPERTY
We define the k-steerability and ℓ-monitorability properties. In conjunction, they define the (k, ℓ)-resilient property.

A. INTER-VARIABLE DEPENDENCIES
To bring to light the dependency between two variables, we use Pearson correlation coefficients.
Definition 1 (Pearson correlation coefficient): Given two random variables, E and F, and n observations for each of them, their correlation coefficient is defined by
$$\rho(E, F) = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{e_i - \mu_E}{\sigma_E}\right)\left(\frac{f_i - \mu_F}{\sigma_F}\right)$$
where $e_1, \ldots, e_n$ ($f_1, \ldots, f_n$), $\mu_E$ ($\mu_F$) and $\sigma_E$ ($\sigma_F$) are the observations, mean and sample standard deviation of random variable E (F). A correlation coefficient is a unitless value between minus one and one. When ρ(E, F) is equal to one, we have perfect positive correlation between E and F. When it is minus one, we have perfect negative correlation. Intuitively, when |ρ(E, F)| is between zero and 0.2, the linear correlation is from null to weak. It is moderate between 0.2 and 0.6. Above 0.6, it is strong [33]. Note that null linear correlation does not necessarily mean that variables E and F are independent.
In such a case, there is no linear dependency revealed by the observations, but a nonlinear dependency is possible. For example, Eq. (1) generates nonlinear output correlated with the input. The existence of correlation can then be confirmed by calculating the correlation coefficient using a linearized version of the output data. Furthermore, correlation is one way to establish dependencies between variables.
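The behavior of the correlation coefficient on linear and nonlinear dependencies can be illustrated with a few lines of Python. The sketch below computes the coefficient from sums of centered products, a form in which the normalization constant cancels out. The data sets and names are assumptions made for illustration.

```python
import math


def pearson(e, f):
    """Pearson correlation coefficient of two equal-length observation lists."""
    n = len(e)
    mu_e, mu_f = sum(e) / n, sum(f) / n
    cov = sum((a - mu_e) * (b - mu_f) for a, b in zip(e, f))
    var_e = sum((a - mu_e) ** 2 for a in e)
    var_f = sum((b - mu_f) ** 2 for b in f)
    return cov / math.sqrt(var_e * var_f)


xs = [i / 10.0 for i in range(-20, 21)]   # observations symmetric around zero
linear = [3.0 * x + 1.0 for x in xs]      # perfect positive linear dependency
inverse = [-2.0 * x for x in xs]          # perfect negative linear dependency
square = [x * x for x in xs]              # nonlinear dependency, no linear trend

print(f"linear:  {pearson(xs, linear):+.3f}")
print(f"inverse: {pearson(xs, inverse):+.3f}")
print(f"square:  {pearson(xs, square):+.3f}")
```

The third data set is fully determined by `xs`, yet its linear correlation with `xs` is null, which is the situation where a null coefficient does not imply independence.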

B. DEPENDENCY GRAPH
Let u, x and y be respectively p-element, m-element and n-element column vectors representing the input, state and output variables of a CPS. We define correlation coefficient matrices to capture the relationships that exist between state variables and input or output variables.
Definition 2 (Input correlation coefficient matrix): The $m \times p$ input correlation coefficient matrix Q is equal to $(q_{i,j})$, where $i = 1, \ldots, m$ and $j = 1, \ldots, p$. An entry $q_{i,j}$ is the correlation coefficient $\rho(x_i, u_j)$ between the state variable $x_i$ and input variable $u_j$.
Definition 3 (Output correlation coefficient matrix): The $m \times n$ output correlation coefficient matrix R is equal to $(r_{i,j})$, where $i = 1, \ldots, m$ and $j = 1, \ldots, n$. An entry $r_{i,j}$ is the correlation coefficient $\rho(x_i, y_j)$ between the state variable $x_i$ and output variable $y_j$.
Definition 4 (Input dependency graph): The input dependency graph is a bipartite graph $G_U = (X, U, E)$ where the two sets of vertices are $X = \{x_1, \ldots, x_m\}$ and $U = \{u_1, \ldots, u_p\}$, the state and input variables. Pearson correlation is used to determine dependencies. There is an edge $(x_i, u_j)$ in E if and only if the absolute value of the correlation between variables $x_i$ and $u_j$, i.e., $|q_{i,j}|$, is greater than or equal to a threshold T. Possible values for T are discussed in Section IV-A. In this article, we use strong correlation and a value of T close to one is chosen.
Definition 5 (Output dependency graph): The output dependency graph is a bipartite graph $G_Y = (X, Y, E)$ where the two sets of vertices are $X = \{x_1, \ldots, x_m\}$ and $Y = \{y_1, \ldots, y_n\}$, the state and output variables. There is an edge $(x_i, y_j)$ in E if and only if the absolute value of the correlation between variables $x_i$ and $y_j$, i.e., $|r_{i,j}|$, is greater than or equal to a threshold T.
For the dependency graph $G_U$ and a vertex x in X, let the expression deg(x) be its input degree, i.e., the number of adjacent vertices in U. Similarly, for the dependency graph $G_Y$ and a vertex x in X, let deg(x) be its output degree, i.e., the number of adjacent vertices in Y. The ℓ-monitorability degree reflects the availability of at least ℓ sensor output signals for monitoring any state variable.
Definition 6 (ℓ-monitorability degree²): Let $G_Y$ be the output dependency graph of a CPS. Let ℓ be equal to $\min_{x \in X} \deg(x)$. Then, the CPS has ℓ-monitorability.
The notion of steerability is related to the control-theoretic concept of controllability. Controllability refers to the ability to drive a system to any state of its state space, under certain constraints [21]. This is consistent with our conceptualization of steerability, but the latter is a weaker and necessary condition emphasizing redundancy that can be evaluated by calculating statistical correlation between state variables and inputs. The idea of monitorability is related to that of observability used in control theory. For example, in Ref. [6] a state variable is observable when it is connected to the outputs, which is consistent with our concept of monitorability. However, the exact techniques behind these two concepts are different and do not capture the same properties. In Ref. [6], observability is determined by computing the rank of an observability matrix. However, this particular technique is not universally recommended in the control literature for observability testing [22]. Our technique measures the correlations between state variables and outputs, intuitively, a necessary condition to ascertain connections between state variables and outputs. While steerability highlights redundancy in inputs, monitorability emphasizes redundancy in outputs.
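We introduce the notion of k-steerability. It indicates that there are at least k actuator input signals available for acting on every single plant state variable.
Definition 7 (k-steerability degree): Let $G_U$ be the input dependency graph of a CPS. Let k be equal to $\min_{x \in X} \deg(x)$. Then, the CPS has k-steerability.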
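The steerability and monitorability degrees can be computed mechanically once the correlation coefficient matrices are available. The following Python sketch builds the edge sets of the input and output dependency graphs from thresholded correlation matrices and takes the minimum degree over the state variables. The matrices `Q` and `R`, the threshold value and the function names are hypothetical and serve only as an illustration.

```python
def dependency_edges(corr, threshold):
    """Edges (i, j) of a dependency graph: |corr[i][j]| >= threshold."""
    return {(i, j) for i, row in enumerate(corr)
            for j, value in enumerate(row) if abs(value) >= threshold}


def min_degree(corr, threshold):
    """Minimum, over state variables (rows), of the number of adjacent vertices."""
    edges = dependency_edges(corr, threshold)
    return min(sum(1 for (i, _) in edges if i == row)
               for row in range(len(corr)))


# Hypothetical correlation matrices for a plant with two state variables.
Q = [[0.97, 0.95, 0.10],           # state x1 versus inputs u1, u2, u3
     [0.05, 0.99, 0.96]]           # state x2 versus inputs u1, u2, u3
R = [[0.99, -0.98, 0.93, 0.20],    # state x1 versus outputs y1, ..., y4
     [0.30, 0.96, -0.97, 0.99]]    # state x2 versus outputs y1, ..., y4
T = 0.9   # threshold for strong correlation (assumed value)

k = min_degree(Q, T)      # steerability degree
ell = min_degree(R, T)    # monitorability degree
print(f"k = {k}, l = {ell}")
```

Taking the absolute value before thresholding treats strong negative correlations as dependencies too, in line with the edge condition of the dependency graphs.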
Definition 8 ((k, ℓ)-resilient): A CPS with k-steerability and ℓ-monitorability is said to be (k, ℓ)-resilient.
The dependency graphs highlight the relationships that exist between the inputs and state variables, and the relationships between state variables and outputs. This is essential to formally determine who can control what and who can monitor what. Besides, being (k, ℓ)-resilient means that the CPS can tolerate a maximum of k − 1 attacked actuators, while being able to act on every single state variable $x_i$, i = 1, . . . , m. It also means that the CPS can withstand no more than ℓ − 1 attacked sensors, while being able to monitor every single state variable $x_i$. As in any system design exercise, there are several objectives that can conflict with each other. In the design of a CPS, security and resilience are two of them. While higher k and/or ℓ achieves higher resilience, this may also translate to more points where an adversary can try to control or monitor the plant, that is, the attack surface is augmented. In this article, we provide a methodology to estimate the resilience of a CPS. Complementing the work presented in this article, there are methodologies for CPS security risk assessment [17]. A fine balance between security risk and resilience must be achieved. Figure 1 schematically represents security risk versus resilience. The x-axis represents resilience. It can be quantified either with the resilience estimates k, ℓ, or a weighted sum thereof. One can also envision a 3D model with one axis for k and another for ℓ, forming together a resilience estimation plane. In other words, resilience can be examined from the point of view of the inputs, outputs or both at the same time. The y-axis represents security risk. It can be quantified with a risk assessment method [17]. Three scenarios are pictured with three different curves. The black solid line represents a case where security and resilience are in equilibrium.
The red dotted line pictures the undesirable case where a growth in resilience implies a strong security risk increase. The blue dashed line shows the most desirable situation, where a growth in resilience may imply a security risk increase, due to the augmentation of points where attacks can be perpetrated, but the increase is moderate and offset by a significant growth in resilience.
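The tolerance of at most k − 1 attacked actuators can be verified by brute force on a small dependency graph. The Python sketch below enumerates all sets of attacked inputs of a given size and checks whether every state variable keeps at least one non-attacked neighbor. The graph `G_U` and the function names are hypothetical, and the exhaustive enumeration is only practical for small numbers of signals.

```python
from itertools import combinations


def survives(adjacency, attacked):
    """True if every state variable keeps at least one non-attacked neighbor."""
    return all(neighbors - attacked for neighbors in adjacency.values())


def worst_case_tolerance(adjacency, signals):
    """Largest h such that any h attacked signals leave all states covered."""
    h = 0
    while h < len(signals) and all(
            survives(adjacency, set(subset))
            for subset in combinations(signals, h + 1)):
        h += 1
    return h


# Hypothetical input dependency graph: state variable -> adjacent inputs.
G_U = {"x1": {"u1", "u2"}, "x2": {"u2", "u3"}}
k = min(len(neighbors) for neighbors in G_U.values())
tolerance = worst_case_tolerance(G_U, ["u1", "u2", "u3"])
print(f"k = {k}, tolerated attacked actuators = {tolerance}")
```

The same check applies to sensors by passing an output dependency graph, in which case the tolerance relates to the monitorability degree.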
We have established the principles of our approach. In the following section, we review a number of designs and explain how the (k, ℓ)-resilient property translates into possibilities of acting on and recovering the state of a plant when attacks are perpetrated. We define performance of a CPS design as the ability to maintain the plant in a target state, despite the fact that there may be actuators and sensors being attacked. We compare performance of the different designs.

V. REVIEW OF (k, ℓ)-RESILIENT DESIGNS
The degree of steerability (k) of a CPS can be increased by introducing a diversity of new actuators. Adding more actuators increases the number of points for acting on a plant. Likewise, monitorability (ℓ) can be increased by introducing a diversity of new sensors. Adding new sensors provides more monitoring points for detecting anomalies and estimating the state of a plant. As discussed at the end of Section IV, we reiterate that increasing k, ℓ or both must be done in conjunction with security risk assessment. For a CPS with fluid dynamics, there is a diversity of sensor types that include flow rate, liquid level, turbidity, water leak, water pressure and gravity liquid level.
Abstracting away noise for the sake of simplicity, we revisit the state-space representation of Eq. (5) augmented with an inflow rate sensor and an outflow rate sensor. The output column vector y comprises three entries: (1) the level difference ($y_1$), (2) inflow rate ($y_2$) and (3) outflow rate ($y_3$): The design comprises three outputs strongly correlated with the liquid level difference, i.e., the correlation is strong between the level difference state variable and each of the outputs. The CPS has three-monitorability because, in the output dependency graph $G_Y$, $\min_{x \in X} \deg(x)$ is equal to three. With respect to Eq. (5), only the output matrix (C) and direct transmission matrix (D) have changed. When there are changes in the actuator configuration, the input matrix (B) needs to be modified. The plant dynamics, represented by the state matrix (A), do not change. Hereafter, we discuss a series of configurations, with increasing k and ℓ, i.e., increasing resilience estimation pairs. We review the different possibilities of state recovery according to their (k, ℓ)-resilient design.

A. SCENARIOS
We use the quadruple-tank plant of Johansson [12] as an experimental testbed, cf. Fig. 2, Part (a), and supplementary material available in an online repository.⁴ There are four tanks and two pumps. Each tank has an outlet at its bottom. Pump 1 pushes liquid into Tanks 1 and 4. Pump 2 pushes liquid into Tanks 2 and 3. Tank 3 is placed above Tank 1. By gravity, liquid from Tank 3 flows into Tank 1. Similarly, Tank 4 is placed above Tank 2. By gravity, liquid from Tank 4 flows into Tank 2. We examine three different designs for this CPS: (1, 1)-, (1, 2)- and (2, 2)-resilient.
(1,1)-resilient CPS - In this initial design, there are four ultrasonic sensors measuring the liquid level (one per tank) and two actuators (mechanical pumps) moving liquid into the tanks. Every pump has one liquid input and two outputs. The sensors and actuators are visible in the cyber space. The plant is observed and controlled from the cyber space. The state representation of the plant is as follows: State matrix A has four rows and four columns.  1, 2, 3, 4). Output matrix C is augmented to represent readings of outflows from the tanks.
(2,2)-resilient CPS - This new design comprises new auxiliary actuators connected to fixed-flow Pumps 3 and 4. The fixed-flow pumps can take over the roles of Pumps 1 and 2, respectively. The input matrix of the plant is updated as follows: where $\eta_1$ ($\eta_2$) is the fraction of the liquid flow of Pump 3 (Pump 4) going to Tank 1 (Tank 2), $1 - \eta_1$ ($1 - \eta_2$) is the fraction going to Tank 4 (Tank 3), and λ and $w_1$ ($w_2$) respectively denote the Pump 3 (Pump 4) coefficient (cm³/V second) and voltage (V), not controllable from the cyber space. For Pumps 3 and 4, the input signals are zero or one, corresponding to off and on. The input column vector $u_i$ now has four rows. The first two rows are the input voltages to Pumps 1 and 2. The last two rows are the off/on (0/1) signals to Pumps 3 and 4. In the sequel, we bridge the (k, ℓ)-resilient property and plant state recoverability.

B. RESILIENCE AND STATE RECOVERABILITY
We connect $(k, \ell)$-resilient estimation to behavioral properties. Building upon Refs. [5], [36], we quantify the resources needed to adapt and bounce back from disruptions. For the sake of simplicity, we abstract away noise. Firstly, we assume that only sensor attacks may occur. When attacks are perpetrated, we show that, under certain conditions, an increased number and diversity of sensors make it possible to recover the state of a CPS. Secondly, we assume that both actuators and sensors can be attacked. While attacks are carried out, we demonstrate that it may be possible to identify which actuators are being attacked and how they are being attacked. Whenever possible, these actuators can be deactivated. Non-attacked actuators can be used to run a resilience plan that steers the CPS into a safe state.

1) ATTACKS ON SENSORS ONLY
Let $x_i$ and $x'_i$ be two states in $\mathbb{R}^m$, with corresponding length-$\tau$ output sequences $y_i, \ldots, y_{i+\tau-1}$ and $y'_i, \ldots, y'_{i+\tau-1}$ resulting from the application of corresponding length-$\tau$ input sequences $u_i, \ldots, u_{i+\tau-1}$ and $u'_i, \ldots, u'_{i+\tau-1}$.
Definition 9 (Recoverable state with sensor attacks): The state of a CPS is recoverable in $\tau$ steps if, for all states $x_i$ and $x'_i$, whenever the corresponding observed output sequences are such that $y_i = y'_i, \ldots, y_{i+\tau-1} = y'_{i+\tau-1}$, then $x_i$ is equal to $x'_i$.

Theorem 1 (Recoverable state with sensor attacks):
The state $x_i \in \mathbb{R}^m$ of an attacked CPS is recoverable in one step if, for $j = 1, \ldots, m$, there is at least one non-attacked sensor implementing an injective function with input state element $x_i[j]$.
Proof: Let $C$ and $D$ be the output and direct transmission matrices of the CPS. Let $x_i$ and $x'_i$ be two states. Because, for $j = 1, \ldots, m$, at least one sensor implements an injective function, whenever the sensor outputs obtained for $x_i$ are equal to those obtained for $x'_i$, we have that $x_i$ is equal to $x'_i$. Every state is uniquely determined by the sensor outputs. A technique such as watermarking [36] can be used to determine which sensors are being attacked. Theorem 1 can then be used to determine the exact state of a CPS under attack.
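The one-step recovery argument of Theorem 1 can be illustrated with a short sketch. It assumes that the set of attacked sensors has already been identified (e.g., by watermarking) and that an inverse is known for each injective sensor function. The function `recover_state` and the toy sensors below are hypothetical and serve only as an illustration.

```python
def recover_state(readings, sensors, attacked):
    """Recover each state element from one step of sensor readings.

    readings: list of sensor outputs, one per sensor
    sensors:  list of (state_index, inverse) pairs, where `inverse` undoes
              the injective function implemented by that sensor
    attacked: set of sensor positions flagged as attacked
    """
    size = 1 + max(index for index, _ in sensors)
    state = [None] * size
    for position, (index, inverse) in enumerate(sensors):
        if position in attacked or state[index] is not None:
            continue  # skip untrusted sensors and already recovered elements
        state[index] = inverse(readings[position])
    if any(value is None for value in state):
        raise ValueError("some state element has no non-attacked sensor")
    return state


# Toy example: two state elements, each observed by two diverse sensors.
sensors = [
    (0, lambda y: y / 2.0),   # sensor 0 reports 2 * x[0]
    (0, lambda y: y - 1.0),   # sensor 1 reports x[0] + 1
    (1, lambda y: y / 4.0),   # sensor 2 reports 4 * x[1]
    (1, lambda y: y + 3.0),   # sensor 3 reports x[1] - 3
]
readings = [99.0, 6.0, 32.0, 5.0]   # true state is (5, 8); sensor 0 is spoofed
recovered = recover_state(readings, sensors, attacked={0})
```

When every sensor observing some state element is attacked, the sketch raises an error, which mirrors the condition of the theorem not being met.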
Case 1: The state of the one-tank system modeled by Eq. (12) is recoverable in one step if only the level-difference sensor ($y_1$) or only the outflow-rate sensor ($y_3$) is attacked, but not both.
Proof: It follows from the fact that both sensor types are injective functions with domain system states and co-domain length-one output traces.
Simulation of Case 1: The one-tank system has one level sensor and one outflow sensor. The state of this system is recoverable if the level sensor or the outflow sensor is attacked, but not both. Fig. 3 (a,b) shows that we can recover the state of the system from the outflow sensor, in case an attack is targeting the level sensor. The simulation is based on Matlab code, available online in a GitHub repository (cf. https://github.com/mirrored-quadruple-tank/). Additional details about the simulation code and results are available in the appendix.
Case 2: The state of the (1, 1)-resilient system, cf. Eq. (13), is not recoverable if one sensor is attacked.
Proof: When one sensor is attacked, there are no additional points of observation (Fig. 3 (c,d)).
Simulation of Case 2: The (1, 1)-resilient system has only four level sensors (one per tank). When an adversary perpetrates an attack on these sensors, the state of the system is not recoverable. Fig. 3 (c) shows the levels in each tank when the system is not attacked. Fig. 3 (d) shows the levels when an attack is perpetrated. Since there is no non-attacked sensor type implementing an injective function on the state elements, the state is not recoverable.
Case 3: The state of the (1, 2)-resilient system, modeled by Eqs. (13) and (14), is recoverable in one step if, for $i = 1, 2, 3, 4$, only level sensors $y[2i-1]$ [...] $(x[1], \ldots, x[m])^T$).
Simulation of Case 3: Details are available in the appendix.
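Recovery of tank levels from a non-attacked sensor family, as in Cases 1 and 3, can be sketched as follows. The sketch assumes that each outflow meter reads a gravity-driven flow following Torricelli's law, which is injective for non-negative levels; Eqs. (12) to (14) may model the outflow sensors differently. The constant `G`, the outlet areas and the function names are illustrative assumptions.

```python
import math

G = 981.0  # gravity, cm/s^2 (illustrative)


def outflow_from_level(h, a):
    """Outflow rate (cm^3/s) of a tank at level h with outlet cross-section a."""
    return a * math.sqrt(2.0 * G * h)


def level_from_outflow(q, a):
    """Inverse of outflow_from_level, valid for q >= 0."""
    return (q / a) ** 2 / (2.0 * G)


def recover_levels(level_readings, outflow_readings, outlet_areas, attacked_family):
    """Return tank levels using only the sensor family that is not attacked.

    attacked_family: None, "level" or "outflow"; attacks on both families
    are not recoverable with this sensor set.
    """
    if attacked_family in (None, "outflow"):
        return list(level_readings)
    if attacked_family == "level":
        return [level_from_outflow(q, a)
                for q, a in zip(outflow_readings, outlet_areas)]
    raise ValueError("state not recoverable: both sensor families attacked")
```

The design choice here is to trust one whole family at a time, which matches the case statements; a per-sensor variant would need a trust flag for each individual sensor.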

2) ATTACKS ON ACTUATORS AND SENSORS
We now assume that both actuators and sensors can be disrupted by an adversary perpetrating a covert attack (cf. Section II-C). When actuators and sensors are attacked, and thanks to redundancy and diversity, it may be possible to determine the state of a CPS and which actuators are attacked. The current state can be recovered provided that the output sequence is unique, w.r.t. that state. Furthermore, we can find out which actuators are attacked and how they are attacked, provided that the output sequence is unique w.r.t. the input sequence. Hence, the entire state of the CPS is recoverable. The non-attacked actuators can be used to mitigate the attack and steer the CPS into a safe condition.
Definition 10 (Recoverable with actuator and sensor attacks): The state of a CPS is recoverable in $\tau$ steps if, for all states $x_i$ and $x'_i$, whenever the corresponding observed output sequences are such that $y_i = y'_i, \ldots, y_{i+\tau-1} = y'_{i+\tau-1}$, then $x_i$ is equal to $x'_i$ and the input signals are such that $u_i = u'_i, \ldots, u_{i+\tau-1} = u'_{i+\tau-1}$.
Case 4: The state of the (2, 2)-resilient system, modeled by Eqs. (13), (14) and (15), is recoverable in one step when, for Tanks 1, 2, 3 and 4, the inflows are greater than zero, but respectively less than $\frac{\eta_1 \lambda w_1}{\alpha}$, $\frac{\eta_2 \lambda w_2}{\alpha}$, $\frac{(1-\eta_2) \lambda w_2}{\alpha}$ and $\frac{(1-\eta_1) \lambda w_1}{\alpha}$.
Proof: When the states $x_i$ and $x_{i+1}$ are recoverable according to Definition 9, the evaluation of the product $A x_i$ in Eq. (6) can be determined. Hence, the exact value of the product $B(u_i + u^a_i)$ can be resolved, i.e., the exact inflow for each tank, despite the presence of the adversary signal $u^a_i$ on actuators. When, for Tanks 1, 2, 3 and 4, the inflows are greater than zero, but respectively less than $\frac{\eta_1 \lambda w_1}{\alpha}$, $\frac{\eta_2 \lambda w_2}{\alpha}$, $\frac{(1-\eta_2) \lambda w_2}{\alpha}$ and $\frac{(1-\eta_1) \lambda w_1}{\alpha}$, there is no way to obtain such flows involving Pumps 3 or 4. It means that Pumps 1 and/or 2 have been functioning, but not Pumps 3 and 4.
Simulation of Case 4: Case 4 is simulated in Fig. 4. Inflows to Tanks 1, 2, 3 and 4 are shown by pump number. In Part (a), because they are variable flow, Pumps 1 and 2 can achieve inflows that are the same as or below the inflows achievable by fixed-flow Pumps 3 and 4. When inflows are below what fixed-flow pumps can achieve, they can only be attributed to variable-flow pumps. When either Pumps 1 and 2 operate or Pumps 3 and 4 operate, we can tell which pair is involved. Discrimination is possible. Part (b) shows a condition where Pumps 1 and 2 are operated in ranges above what fixed-flow pumps can achieve. For example, an adversary adds voltages to signals and provokes inflow increases. Such a condition is achievable operating Pumps 1 and 2 alone, or also in combination with Pumps 3 and 4. For this example, discrimination might be impossible.
VOLUME 9, 2021
FIGURE 3. Simulation of Cases 1 and 2. Part (a) plots the level in a one-tank system under normal operation (solid blue line). In Part (b), assuming solely the ultrasonic sensor is attacked, it is possible to track the level using the outflow sensor (solid red line). In Part (c), tank levels are tracked with ultrasonic sensors in the (1, 1)-resilient system. In Part (d), an adversary spoofs actuators and manipulates sensor signals such that they look as expected (dashed lines), although actual levels (solid lines) are different. The degree of resilience does not enable state recovery. In Part (e), tank levels are tracked with ultrasonic sensors in the (2, 2)-resilient system. In Part (f), an adversary spoofs actuators and manipulates solely ultrasonic sensor signals (dashed lines). Actual levels (solid lines) can be recovered using observations from outflow sensors.
[...] the levels in Tank 1 and Tank 4, when the attack starts at T = 500 seconds.
Case 5: When the state $x_i$ of the (2, 2)-resilient system, modeled by Eqs. (13), (14) and (15), is recoverable in one step, the action of the adversary on actuators can be determined by resolving column vector $u^a_i$ in the following equation: [...] to Pumps 1 and 2. It results in two equations, with one unknown in each of them, i.e., the adversary contributions to the actuator inputs $u^a_i[1]$ and $u^a_i[2]$.
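The reasoning of Cases 4 and 5 can be illustrated with a short sketch. It assumes that Eq. (6) has the linear form $x_{i+1} = A x_i + B(u_i + u^a_i)$ suggested by the proof of Case 4, that only Pumps 1 and 2 are attacked, and that the Tank 1 and Tank 2 rows of $B$ involve Pump 1 and Pump 2, respectively, among the variable-flow pumps. All function names are hypothetical.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def only_variable_pumps(inflows, fixed_inflows):
    """Case 4 test: every tank inflow lies strictly between zero and the
    inflow that the corresponding fixed-flow pump would deliver."""
    return all(0.0 < q < f for q, f in zip(inflows, fixed_inflows))


def attack_on_pumps_1_and_2(A, B, x_now, x_next, u):
    """Case 5 sketch: solve x_next = A x_now + B (u + ua) for ua[0] and ua[1],
    assuming ua[2] = ua[3] = 0 (fixed-flow pumps are not attacked)."""
    predicted = [a + b for a, b in zip(matvec(A, x_now), matvec(B, u))]
    residual = [n - p for n, p in zip(x_next, predicted)]
    # The Tank 1 row isolates Pump 1 and the Tank 2 row isolates Pump 2,
    # which gives two equations with one unknown each.
    return [residual[0] / B[0][0], residual[1] / B[1][1]]
```

Both functions presuppose that the states `x_now` and `x_next` have already been recovered from non-attacked sensors, as required by the case statements.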
3) DISCUSSION
Fig. 6 provides an interpretation of all our simulations. We consider that the performance of a system is the capacity to maintain expected levels in tanks. Hence, the performance degradation corresponds to the deviation from the expected levels. The larger the deviation, the lower the performance. In Fig. 6, we represent these deviations in percentages. Figs. 6 (a), (b), and (c) respectively show the performance of the (1, 1)-, (1, 2)- and (2, 2)-resilient systems, when attacks are perpetrated. When a system is not attacked, performance is 100%. Attacks start at T = 500 seconds. The adversary manipulates inputs to drive more liquid into Tanks 1 and 4. The consequence of the attack is a deviation from the expected system state. Quantifying this deviation, we obtain a percentage of performance loss. When the (1, 1)-resilient system (with no recovery capability) is under attack, it experiences a performance drop. In the (1, 2)- and (2, 2)-resilient systems, it is possible to mitigate the effects of attacks and bounce back. As shown in Figs. 6 (b) and (c), respectively, the (1, 2)-resilient system recovers with graceful degradation, due to the absence of actuator redundancy, while the (2, 2)-resilient system fully recovers.
FIGURE 6. Performance evolution of the (1, 1)-, (1, 2)- and (2, 2)-resilient systems, when they are confronted with a covert attack. Performance degradation corresponds to the deviation from their expected levels. The larger the deviation, the lower the performance. The (1, 1)-resilient system, with no recovery capability, experiences a performance drop. In contrast, the (1, 2)- and (2, 2)-resilient systems recover from the attack. The (1, 2)-resilient system recovers with graceful degradation, due to the absence of actuator redundancy, while the (2, 2)-resilient system fully recovers.
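One possible way to turn level deviations into a performance percentage is sketched below. The exact formula behind Fig. 6 is not given in this section, so the mean relative deviation used here, the floor at zero and the function name `performance` are assumptions made for illustration.

```python
def performance(actual, expected):
    """Performance in percent: 100 minus the mean relative deviation of the
    actual tank levels from the expected ones, floored at zero."""
    deviations = [abs(a - e) / e for a, e in zip(actual, expected)]
    loss = 100.0 * sum(deviations) / len(deviations)
    return max(0.0, 100.0 - loss)
```

With this definition, a system whose levels match the expected ones scores 100%, and a 50% deviation in one of two tanks yields 75%.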

VI. CONCLUSION
We have addressed covert attacks on CPS. We have defined the new $k$-steerability and $\ell$-monitorability control-theoretic concepts. The $k$-steerability concept reflects the ability in a CPS to act on each of its individual plant state variables with at least $k$ functionally diverse groups of input signals. In other words, it reflects the ability of the CPS to mitigate the impact of covert attacks when fewer than $k$ groups of input signals are compromised, using static functional diversity. The $\ell$-monitorability concept reflects the number of observations on each state variable of a CPS that can be used to identify covert attacks. Together, $k$-steerability and $\ell$-monitorability determine the $(k, \ell)$-resilient property of a CPS. If we assume that the detection process is conducted by combining strategies, such as redundancy and diversity in hardware and software techniques, the resulting $(k, \ell)$-resilient concept applies the moving target paradigm, in which the CPS adapts itself to invalidate the acquired knowledge of the adversaries. We have validated our findings by conducting representative simulations. Future work will improve current results by applying dynamic functional diversity, e.g., by applying a functional diversity of components that will evolve over time.

APPENDIX. A. SUPPLEMENTARY MATERIAL TO THE SIMULATIONS
We report in this appendix the simulation of Case 3 (cf. Section V-B). An existing Matlab implementation of the quadruple-tank process for this scenario (available online at https://github.com/karrocon/pcsmatlab) was adapted and complemented with Matlab and Simulink code, w.r.t. the resilience and adversary models presented in this paper. The resulting code is also available online, in our GitHub repository (cf. https://github.com/mirrored-quadrupletank/). The simulation of Case 3, like those of Cases 1, 2, 4, and 5 (already reported in the main body of this paper, cf. Section V-B), implements a proportional-integral (PI) controller based on the differential equations of the quadruple-tank scenario by Johansson in Ref. [12].
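For readers unfamiliar with PI control, a discrete-time PI law can be sketched as follows. This is not the Matlab/Simulink controller from the repositories above: the class name, the gains and the output clamping are hypothetical choices made only for illustration.

```python
class PIController:
    """Discrete-time proportional-integral controller with output clamping."""

    def __init__(self, kp, ki, dt, u_min=0.0, u_max=float("inf")):
        self.kp = kp            # proportional gain (hypothetical value)
        self.ki = ki            # integral gain (hypothetical value)
        self.dt = dt            # sampling period, in seconds
        self.u_min = u_min      # lower bound on the pump voltage
        self.u_max = u_max      # upper bound on the pump voltage
        self.integral = 0.0     # accumulated error

    def update(self, setpoint, measurement):
        """Return the next control signal for the given level measurement."""
        error = setpoint - measurement
        self.integral += error * self.dt
        u = self.kp * error + self.ki * self.integral
        return min(max(u, self.u_min), self.u_max)
```

In a loop, `update` would be called once per sampling period with the expected level as `setpoint` and the sensor reading as `measurement`. Note that the controller acts on whatever measurement it receives, which is why spoofed sensor signals can mislead it.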
Since the valves of the quadruple-tank scenario are not assumed vulnerable (e.g., we assume they cannot be attacked from the cyber space), we build the attacks assuming that the adversary only takes control over the pumps (i.e., the adversary obtains remote access to the system, which allows manipulating the input voltages of the pumps acting as actuators of the quadruple-tank plant). Fig. 7 depicts the idea of our attack for both the original scheme of the quadruple-tank scenario in Ref. [12] and the extended (2, 2)-resilient scheme discussed in Section V-A. By attacking the voltage of the pumps, the adversary changes the inflow levels of the tanks. As depicted in Fig. 7, the adversary adds an attack signal to the input voltage of Pump 1. As a result of the attack, more liquid is pumped into Tanks 1 and 4.
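The additive attack on the input voltage of Pump 1 can be sketched as a small helper. The default start time (T = 500 seconds) and the 50% increase relative to the initial voltage are the values reported elsewhere in this paper for the simulations; the function name and signature are hypothetical.

```python
def attacked_voltages(t, u, u_initial, start=500.0, increase=0.5):
    """Return the pump voltages reaching the plant at time t.

    u         : voltages computed by the controller, [v1, v2, ...]
    u_initial : initial voltages, used to size the attack signal
    From `start` onwards, the adversary adds `increase * u_initial[0]`
    to the voltage of Pump 1; the other inputs are left untouched.
    """
    attack_signal = increase * u_initial[0] if t >= start else 0.0
    return [u[0] + attack_signal] + list(u[1:])
```

Because the attack signal is added on top of the controller output, the controller keeps computing its own voltages and is unaware of the modification unless the sensors reveal it.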
According to the theorems defined in Section V-B, the adversary can also attack the sensors, in order to evade detection (i.e., by attacking both sensors and actuators, the adversary perpetrates a covert attack). The attack against the sensors consists in manipulating the measurement signals of the sensors before they reach the controller (e.g., by means of injection, spoofing and man-in-the-middle cyber attacks, using remote access from the cyber space). Hence, wrong measurements are provided to the controller, to prevent detection of the attack against the actuators (i.e., the pumps). In fact, the measurement modification hides the real state of the system from the controller. In our simulations, we can separate the processing of truthful signals from those manipulated by the adversary. To ease the analysis, two simulations are conducted for each scenario, at the same time. The sensor signals of the second simulation are sent to the controller of the first simulation. Furthermore, during the attack against the actuators, the adversary intercepts the truthful signals from the controller and adds a modified input signal to the plant. This represents the disruption of the plant that is captured by the sensors of the system. Finally, the simulations assume that the attacked input voltage of Pump 1 is increased by 50% w.r.t. its initial value, as shown in Fig. 5(a), in Section V-B. The consequence of this attack is depicted in Fig. 5(b) (also in Section V-B).
FIGURE 7. Simplified representation of two representative attack scenarios. Red lines represent signals generated by the adversary. In Part (a), we assume an adversary perpetrating an attack against the original scheme in Ref. [12]. In Part (b), we assume an attack against the extended (2, 2)-resilient scheme discussed in Section V-A.
As a result of the aforementioned attack simulations, with respect to the (1,2)-resilient system (cf. Section V-A), Fig. 8 shows the plant signals associated with Case 3 (cf. Section V-B). The (1,2)-resilient system has eight sensors (four ultrasonic sensors and four outflow meters) and two actuators (Pumps 1 and 2). If only the ultrasonic level sensors (or only the outflow meters) are attacked, then the state is recoverable. Fig. 8(a) shows the signals from the non-attacked level sensors. When only one family of sensors is attacked (either the ultrasonic sensors or the outflow meters), the system can recover the state by using the non-attacked family, e.g., the outflow sensors, as shown in Fig. 8(b). Notice that if we were conducting the full covert attack on the (2,2)-resilient system (cf. Fig. 7(b)), the controller would also be able to recover the system state, using the additional pumps (Pumps 3 and 4), more effectively, as already indicated in the discussion of Section V-B (and shown in the interpretation of results included in Fig. 6 of Section V-B).
FIGURE 8. Simulation results associated with Case 3 (cf. Section V-A), with regard to the (1,2)-resilient design. In Part (a), we show the levels of the plant under normal operation (the ultrasonic level sensors are not under attack). In Part (b), in attack mode, and assuming solely the ultrasonic sensors are attacked, we track the level using the outflow rate meters.
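The twin-simulation idea described above can be illustrated on a toy example. This is a deliberately simplified sketch, not the authors' Simulink model: it uses a single tank with linear outflow, a constant pump voltage instead of the PI loop, and hypothetical constants. The non-attacked twin provides the spoofed level reported to the controller, while a non-attacked outflow meter still reflects the real level.

```python
def covert_attack_demo(steps=1000, attack_start=500, dt=1.0):
    """Run an attacked plant and its non-attacked twin side by side."""
    k, c, area = 3.0, 0.5, 30.0    # toy pump, outflow and tank constants
    v0 = 3.0                       # nominal pump voltage
    h_twin = h_real = k * v0 / c   # both start at the steady-state level
    reported, actual, from_outflow = [], [], []
    for t in range(steps):
        # Actuator attack: pump voltage increased by 50% after attack_start.
        v_attacked = 1.5 * v0 if t >= attack_start else v0
        h_twin += dt * (k * v0 - c * h_twin) / area
        h_real += dt * (k * v_attacked - c * h_real) / area
        reported.append(h_twin)    # spoofed level sensor, fed by the twin
        actual.append(h_real)      # real level, hidden from the controller
        outflow_reading = c * h_real               # non-attacked outflow meter
        from_outflow.append(outflow_reading / c)   # level recovered from it
    return reported, actual, from_outflow
```

In this toy setting the reported level stays at its nominal value throughout, whereas the level inferred from the outflow meter follows the real one, which illustrates why a second, functionally diverse sensor family supports state recovery.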
FRÉDÉRIC CUPPENS (Member, IEEE) is currently a Professor of Computer Science with the Department of Computer Engineering and Software Engineering, Polytechnique Montréal, Canada. He has worked for more than 20 years on computer security topics, including formal models of security policies, access control to network and information systems, intrusion detection, response and counter-measures, and formal techniques to refine security policies and prove security properties. He has published more than 250 technical articles in refereed journals and proceedings. He received the Ampere Award from SEE in 2015 and the Outstanding Research Award from IFIP WG 11.3 in 2016.
NORA CUPPENS (Member, IEEE) received the engineering degree in computer science, the Ph.D. degree from SupAero, and the HDR degree from University Rennes 1. She is currently a Professor of Computer Science with the Department of Computer Engineering and Software Engineering, Polytechnique Montréal, Canada. She has published more than 100 technical articles in refereed journals and conference proceedings. Her research interests include formalization of security properties and policies, cryptographic protocol analysis, formal validation of security properties, and threat and reaction risk assessment. She received the Outstanding Service Award from IFIP TC 11 in 2016 and the Outstanding Service Award from IFIP WG 11.3 in 2017.
ROMAIN DAGNAS (Member, IEEE) received the License degree in mathematics, computer sciences, and physical sciences from the Faculté des Sciences et Techniques de Limoges, France, in 2017, the master diploma degree in computer sciences from 3iL Ingénieurs, France, in 2019, and the master diploma degree in mathematics, cryptology, application coding from the CRYPTIS, Faculté des Sciences et Techniques de Limoges, France, in 2019. He is currently a Research-Engineer of Cybersecurity, working on the design and the assessment of cyber-resilient systems with the Technological Research Institute SystemX, Palaiseau, France.
JOAQUIN GARCIA-ALFARO (Senior Member, IEEE) received the double Ph.D. diploma degree in computer science from the Autonomous University of Barcelona and the University of Rennes, and the research Habilitation degree from Université Sorbonne VI (Pierre et Marie Curie). He is currently a Professor with the Networks and Telecommunication Services Department, Télécom Sud-Paris (Institut Polytechnique de Paris, France) and an Adjunct Research Professor with Carleton University, Ottawa, ON, Canada. He is involved in several research projects at National and European levels, related to ICT security. His research interests include a wide range of information security problems, with an emphasis on the management of security policies, analysis of vulnerabilities, and enforcement of countermeasures.