Scalable Security Enforcement for Cyber Physical Systems

The security of Cyber-Physical Systems (CPSs) is increasingly important as more and more of these systems are added to the Internet of Things (IoT). As we increase the complexity and connectivity of our smart systems, we likewise broaden their digital attack surface. Recorded attacks on CPSs have caused significant physical impacts, making methods for mitigating attacks of paramount importance. Runtime enforcement (RE) can prevent the violation of security policies: runtime enforcers intervene before the CPS is compromised. Two key challenges remain: (1) for complex systems, methods for automatically composing multiple policies are lacking; and (2) runtime enforcers are themselves executed digitally, meaning they too could have security vulnerabilities. We present the first comprehensive runtime enforcement framework which addresses both challenges. It can compose many security policies in parallel and synthesize these policies into the more trustworthy hardware layers of a system. This removes reliance on potentially vulnerable firmware and software layers. We demonstrate our approach with policies to mitigate a set of attacks on a Fused Filament Fabrication (FFF) 3D printer. The experimental results show linear growth in logic element and register usage as the number of policies increases. This compares favourably to the exponential state space explosion that occurs with the conventional approach of monolithic composition. Additionally, we find that higher enforcer clock frequencies are possible with the proposed parallel approach than with existing serial approaches.


I. INTRODUCTION
Cyber-Physical Systems (CPSs) combine the physical and digital worlds, where embedded digital controllers interact with the physical world through environmental sensors and actuators [1]. Our modern world is reliant on their function, from when you turn your lights on in the morning, which requires electrical generation and distribution (smart grids), to having your smart coffee machine brew your morning coffee automatically based on your wake-up time, to your commute via car, bus, train, or e-bike, which is enhanced by integrated embedded systems both in their operation and in their manufacturing processes.
The associate editor coordinating the review of this manuscript and approving it for publication was Wei Yu.
It is thus imperative that designers of CPSs take security into account when implementing their systems. This is not trivial, as illustrated by the range of high-profile CPS attacks, including the Stuxnet worm damaging Iranian centrifuges [2], the German Steel Mill attack, which prevented a blast furnace from shutting down and caused significant damage [3], and the ransomware attack on Colonial Pipeline, which caused serious disruption to gasoline supply in the United States and resulted in a multi-million dollar payout to the attackers [4].
However, attacks are not limited to industrial plants: Internet of Things (IoT) devices have been compromised and used to launch Distributed Denial of Service (DDoS) attacks [5], [6], weaknesses in over-the-counter drones have been demonstrated [7], and compromise of Additive Manufacturing (AM) can cause propeller defects which fail in flight [8].
In this work, we focus our attention on Fused Filament Fabrication (FFF) AM, known commonly as filament 3D printers. These devices have become increasingly ubiquitous, with a range of models available for hobbyist to commercial use. Filament 3D printers are CPSs as they sense the environment through sensors (for example, temperature), impact the environment through actuators (for example, heaters and motors) to create 3D objects, and are controlled by an embedded controller. These printers are subject to a number of threats, as illustrated in Figure 1. For example, as 3D printers are connected to the internet for remote monitoring and control, they are exposed to internal and external network threats from malicious actors. Attackers could take advantage of security vulnerabilities in the printer's firmware or software to take control of the device. The attacker could then cause a range of damage, potentially including defects in printed objects, damage to the printer, and fire, which poses a significant safety risk. These threats and potential impacts follow a pattern shared by many CPSs, where successful attacks have significant impact on the physical world.
Research in formal methods gives us a reliable mechanism for mitigating security vulnerabilities. The area of runtime verification (RV) [9], [10], [11], [12], [13] considers methods to generate runtime monitors which verify a set of policies while the system is running. If a policy is violated, the monitor raises an alarm. RV approaches are often applied where the systems are black-box (where the inputs and outputs of a system are known but internal behaviour and mechanics are not) or too complex for traditional model checking [14].
A key limitation of RV is that it does not prevent the violation from occurring, and in security, violations cannot be tolerated. The area of runtime enforcement (RE) [15], [16], [17], [18], [19] extends these monitors to intervene before a violation occurs.
In this work, we consider FFF 3D printers as synchronous reactive cyber-physical systems. Synchronous reactive systems are those which react to input stimuli and produce outputs continuously. Their function can be separated into logical ticks that consist of reading inputs, performing computation, and emitting outputs. Following the synchrony hypothesis, these logical ticks are considered as atomic events which occur infinitely faster than the environment produces input stimuli.
The system view of a synchronous reactive bidirectional enforcer is illustrated in Figure 2. The enforcer is placed between the environment (labelled Env.) and controller (labelled Ctrl.) such that it can observe and alter Env. inputs and Ctrl. outputs. For every tick of the controller, the enforcer first inspects Env. Inputs I and, as necessary, edits these to satisfy the set of policies (ϕ) before emitting Safe and Secure Inputs I' to the controller. The controller then executes its reaction to these inputs and emits Ctrl. Outputs O. The output enforcer inspects and, as necessary to satisfy the policy (ϕ), edits these outputs to produce Safe and Secure Outputs O', which are then exposed to the environment. This cycle repeats for every tick.
The number of security policies increases as CPSs and the threat landscape become more complex. Therefore, the need to enforce multiple policies simultaneously has emerged. To simultaneously enforce multiple policies, there exist three methods of composition: monolithic, serial, and parallel.
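The tick cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the single-bit signals, and the toy heater rule are hypothetical and are not taken from the framework itself.

```python
from typing import Callable, Iterable, List, Tuple

def run_ticks(
    env_inputs: Iterable[int],
    controller: Callable[[int], int],
    enforce_input: Callable[[int], int],
    enforce_output: Callable[[int, int], int],
) -> List[Tuple[int, int]]:
    """Run one enforced reaction per tick and return the (I', O') trace."""
    trace = []
    for x in env_inputs:                    # read Env. input I
        x_safe = enforce_input(x)           # input enforcer emits I'
        y = controller(x_safe)              # controller reacts to I' only
        y_safe = enforce_output(x_safe, y)  # output enforcer emits O'
        trace.append((x_safe, y_safe))
    return trace

# Hypothetical signals: x = 1 means "over temperature", y = 1 means "heater on".
def always_heat(x: int) -> int:
    return 1

def pass_through(x: int) -> int:
    return x

def heater_guard(x: int, y: int) -> int:
    # Assumed toy policy: never drive the heater while over temperature.
    return 0 if x == 1 else y

trace = run_ticks([0, 1, 0], always_heat, pass_through, heater_guard)
print(trace)
```

The controller never sees the raw environment input and the environment never sees the raw controller output, which is the bidirectional placement described in the text.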
The monolithic approach consists of taking the product of individual policies to produce a single large policy. This is then synthesised into a single enforcer. This approach has been shown to scale poorly due to state space explosion [20]. The serial approach, introduced for bidirectional synchronous reactive systems in [20], synthesises multiple enforcers that are executed sequentially. This overcomes the scalability issues of the monolithic approach, but is implemented in software, which assumes the firmware and software stack to be safe and secure. The myriad of security vulnerabilities and successful compromises of this stack challenge this assumption. This suggests enforcers executing on software platforms are at risk of being compromised by malicious actors exploiting software vulnerabilities.
This motivates our work to provide a high-trust method for compositional enforcement in bidirectional synchronous systems, as illustrated in Figure 2. We develop a generic framework that supports parallel composition of enforcers in hardware and can be applied to a range of CPSs. To demonstrate this framework, we consider a range of attacks on an FFF 3D printer and develop a range of policies, which we synthesise into enforcers, to defend against the attacks. These enforcers are synthesised to be executed simultaneously and then, through a merge block, a final set of signals that satisfies all policies is emitted.
14386 VOLUME 12, 2024 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
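The scaling argument can be made concrete with a small arithmetic sketch. It assumes, purely for illustration, that every policy has the same number of automaton locations; the function names are hypothetical, and the monolithic figure is an upper bound taken before unreachable or merged locations are removed.

```python
from math import prod

def monolithic_locations(sizes):
    """Upper bound on product-automaton locations: the product of all |Q_i|."""
    return prod(sizes)

def composed_locations(sizes):
    """Locations kept when each policy has its own enforcer: the sum of all |Q_i|."""
    return sum(sizes)

for n in (1, 2, 4, 8):
    sizes = [4] * n  # hypothetical: n policies with 4 locations each
    print(n, monolithic_locations(sizes), composed_locations(sizes))
```

With eight such policies the product bound is 65536 locations, while separate enforcers hold 32 locations in total.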
Contributions. The contributions of this work are as follows:
• We propose a novel compositional framework for bidirectional RE that supports multiple policies with the parallel approach.
• We provide a tool which compiles multiple policies to a hardware description language for parallel compositions. This is an extension to the compiler easy-rte-incremental [20] which we call easy-rte-hardware.
• We consider an FFF 3D printer as our CPS case study, for which we develop a set of attacks that may be carried out by a malicious actor. To mitigate these attacks we develop a set of defense policies. These policies are synthesised into hardware runtime enforcers with monolithic, serial, and parallel compositions.
• Our evaluation and analysis compares monolithic, serial, and parallel compositions on non-functional metrics as the number of policies increases. The results demonstrate exponential growth in compile time, compiled code size, synthesised logic elements, and synthesised registers for the monolithic approach. The serial and parallel approaches compare favourably, with linear growth for these metrics.
Outline. In Section II, we introduce and discuss related work. In Section III, we introduce the preliminaries and notations for RE of synchronous systems. We recall, from existing work, the RE framework for synchronous programs in Section IV. In Section V, we discuss compositional approaches to enforcement, recalling, where appropriate, existing work in monolithic and serial composition before discussing limitations. This motivates our high-trust parallel composition, which we introduce in Section VI. In Section VII, we present the security of 3D printers, with a set of potential attacks and policies that mitigate them. We then explain the synthesis of hardware enforcers using monolithic, serial, and our proposed parallel framework in Section VIII. Results of this implementation are presented in Section IX, where we compare monolithic, serial, and parallel composition of enforcers in hardware. In Section X, we discuss challenges, trade-offs, and applications of the presented work. Finally, conclusions are drawn in Section XI.

II. RELATED WORK
The generation of enforcers from properties is an existing field of research with varying approaches and applications. Schneider [15] proposes enforcers which delay execution with buffering when a sequence of events does not satisfy the security automata. Edit automata [16] allow enforcers to alter the input sequence by suppressing and/or inserting events. The methods in [18] enable the buffering of events, which are released once the sequence satisfies the required policy. These approaches consider only a single direction of communication, often from the controller to the environment, which limits properties to enforce only outputs.
Bidirectional runtime enforcement was added in Mandatory Result Automata (MRAs) [21], which can reason over communication between two parties. Pearce et al. [22] proposed bidirectional runtime enforcement for various cyber-physical threats in industrial applications with timed policies and, in previous work, we modelled bidirectional jamming, injection, and edit attacks to create enforcers that attack a simulated drone system.
In the security domain there are a number of scenarios where policies need to be composed. A prevalent example is firewall policies. There are many simultaneously active policies, such as allowing the flow of traffic between company IP addresses and port ranges, and blocking access to untrustworthy domains and IP addresses. Often these are supported informally by firewalls or device-level security applications. Examples are the Fang [23] and Firmato [24] tools, which allow the composition of security policies for firewall management. Other applications include developing, updating, and visualising access control policies, a subset of runtime enforceable policies, in [25]. Policies are not limited to access control and firewalls. In [26], a tool, Polymer, is proposed for enforcing composable policies in Java applications. Such tools and approaches are common but often informal, meaning they may not guarantee correctness.
In [27], Pinisetty and Tripakis investigated the compositionality of enforcers for a unidirectional RE architecture that allows the enforcer to buffer events (this is equivalent to delaying events). This work explored the synthesis of multiple enforcers, one for each policy, and then whether composing these in series or parallel could satisfy all policies. However, buffering of events is not possible in reactive systems.
For reactive systems, the enforcer must react instantly, and so these approaches which allow combinations of halting or delaying are not adequate.However, enforcement frameworks developed in works such as [19], [20], and [28] are relevant.
Unidirectional reactive enforcers, termed shields, are introduced in [28], where safety properties are synthesised into shields which observe environment inputs and controller outputs. The shields then transform outputs as little as possible to ensure correctness with respect to the safety properties.
Our work is based on [19], which introduces a framework for bidirectional RE, but does not consider composition of multiple policies, and [20], which introduces a framework for incrementally composing policies but considers only serial composition in software.
No existing RE framework considers parallel composition for reactive security enforcers in hardware. The contributions of this work address the risk of security vulnerabilities in software platforms, which undermine enforcer integrity, and simultaneously support composing multiple policies for increasingly complex reactive systems.

A. RELATIONSHIP TO PHYSICAL LAYER FAULT TOLERANCE
The hardware runtime enforcement we propose is similar in spirit to other approaches that use hardware to ensure a minimum quality of service such as physical layer fault tolerance.
Physical layer fault tolerance can be improved with a variety of approaches that use the principles of redundancy, error detection, and correction. In [29] and [30], fault tolerance is considered for multi-agent distributed systems where consensus between agents must be reached without continuous communication. Our work differs as a single agent (enforcer) is responsible for satisfying multiple policies. This requires examining bidirectional communication between the system plant and controller, rather than communication between agents in a multi-agent system.
However, there remains a largely unexplored area of runtime enforcement, called distributed enforcement, where policies can be applied to and between agents. Elements of control theory from multi-agent systems, like those in [29] and [30], may be leveraged for distributed enforcement, though this is beyond the scope of this work.

III. PRELIMINARIES
In this section, we introduce the notations and the safety automata formalism used to define security policies to be monitored and enforced. We also briefly recall the RE problem for synchronous programs (all the constraints that an enforcer should fulfill).
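A finite word over a finite alphabet $\Sigma$ is a finite sequence $\sigma = a_1 \cdot a_2 \cdots a_n$ of members of $\Sigma$, and $\Sigma^*$ denotes the set of finite words over $\Sigma$. The length of a finite word $\sigma$ is denoted as $|\sigma|$. The empty word over $\Sigma$ is denoted by $\epsilon_\Sigma$, or $\epsilon$ when the context makes it evident. Given two words $\sigma$ and $\sigma'$, their concatenation is denoted as $\sigma \cdot \sigma'$. A word $\sigma'$ is a prefix of a word $\sigma$, denoted $\sigma' \preccurlyeq \sigma$, whenever there exists a word $\sigma''$ such that $\sigma = \sigma' \cdot \sigma''$.
We consider a reactive system with finite ordered sets of Boolean inputs $I$ and Boolean outputs $O$. $\Sigma_I = 2^I$ denotes the input alphabet, $\Sigma_O = 2^O$ denotes the output alphabet, and the input-output alphabet is $\Sigma = \Sigma_I \times \Sigma_O$. A bit-vector/complete monomial will be used to represent each input (resp. output) event. For example, let us consider $I = \{P, Q\}$. Then, the input $\{P\} \in \Sigma_I$ is denoted as 10, while $\{Q\} \in \Sigma_I$ is denoted as 01 and $\{P, Q\} \in \Sigma_I$ is denoted as 11. A reaction (or input-output event) has the structure $(x_i, y_i)$, where $x_i \in \Sigma_I$ and $y_i \in \Sigma_O$. Given an input-output word $\sigma = (x_1, y_1) \cdot (x_2, y_2) \cdots (x_n, y_n) \in \Sigma^*$, the input word obtained from $\sigma$ is $\sigma_I = x_1 \cdot x_2 \cdots x_n \in \Sigma_I^*$, which is the projection on inputs ignoring outputs. Similarly, the output word obtained from $\sigma$ is $\sigma_O = y_1 \cdot y_2 \cdots y_n \in \Sigma_O^*$, the projection on outputs ignoring inputs.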
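The bit-vector encoding and the two projections can be sketched as follows. The helper names `encode` and `project` are hypothetical and chosen only for this illustration; events are written as strings of 0s and 1s in the declared signal order.

```python
def encode(event, signals):
    """Encode a set of present signals as a bit-vector string in signal order."""
    return "".join("1" if s in event else "0" for s in signals)

def project(word):
    """Split an input-output word into its input word and its output word."""
    sigma_i = [x for x, _y in word]
    sigma_o = [y for _x, y in word]
    return sigma_i, sigma_o

I = ["P", "Q"]  # ordered input signals, as in the example in the text
print(encode({"P"}, I), encode({"Q"}, I), encode({"P", "Q"}, I))

sigma = [("10", "0"), ("01", "1")]  # an input-output word of two reactions
print(project(sigma))
```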
A policy denoted as $\phi$ (over $\Sigma$) represents a set $L(\phi) \subseteq \Sigma^*$. Given a word $\sigma \in \Sigma^*$, $\sigma \models \phi$ iff $\sigma \in L(\phi)$. A policy $\phi$ is prefix-closed if all prefixes of all words from $L(\phi)$ are also in $L(\phi)$: $L(\phi) = \{w \mid \exists w' \in L(\phi) : w \preccurlyeq w'\}$. Prefix-closed policies are the focus of this study. Security policies are formalized as safety automata, which we define next in this section.
Synchronous programming languages [31] are ideal for developing synchronous reactive systems. They express safety properties via observers [32], which are statically verified (using model checking). Safety automata are analogous to observers but are enforced at runtime.
Definition 1 (Safety Automaton): A safety automaton (SA) $A = (Q, q_0, q_v, \Sigma, \rightarrow)$ is a tuple, where $Q$ denotes the set of states, known as locations, $q_0 \in Q$ is a distinct starting location, $q_v \in Q$ is a distinct non-accepting (violating) location, the alphabet is $\Sigma = \Sigma_I \times \Sigma_O$, and the transition relation is ${\rightarrow} \subseteq Q \times \Sigma \times Q$. Except for $q_v$, all the other locations are accepting (i.e., all the locations in $Q \setminus \{q_v\}$). Location $q_v$ is a distinct violating (trap) location, thus no transitions in $\rightarrow$ from $q_v$ to a location in $Q \setminus \{q_v\}$ exist. Whenever there exists $(q, a, q') \in {\rightarrow}$, we denote it as $q \xrightarrow{a} q'$. Relation $\rightarrow$ is extended to words $\sigma \in \Sigma^*$ by noting $q \xrightarrow{\sigma \cdot a} q'$ whenever there exists $q''$ such that $q \xrightarrow{\sigma} q''$ and $q'' \xrightarrow{a} q'$. A location $q \in Q$ is reachable from $q_0$ if there exists a word $\sigma \in \Sigma^*$ such that $q_0 \xrightarrow{\sigma} q$.
The set of all words accepted by A is denoted as L(A).
Remark 1: We can first determinize and complete a non-deterministic or incomplete automaton provided by the user. We further assume that $Q$ has no (redundant) locations that are unreachable from $q_0$. Hence, in the rest of this work, $\phi$ is a safety policy specified as a deterministic and complete SA $A_\phi = (Q, q_0, q_v, \Sigma, \rightarrow)$.
Following the causality requirement, the enforcer must first alter inputs from the environment in each step according to policy $\phi$, specified as SA $A_\phi$. As a result, we must examine the input policy obtained by projecting $A_\phi$ on inputs.
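A deterministic and complete safety automaton can be represented as a plain transition table. The sketch below assumes a hypothetical policy with a single accepting location; the names `single_location_sa`, `run`, and `accepts` are illustrative only.

```python
from itertools import product

QV = "qv"  # the single violating (trap) location

def bits(n):
    """All bit-vectors of width n as strings, e.g. 00, 01, 10, 11 for n = 2."""
    return ["".join(b) for b in product("01", repeat=n)]

def single_location_sa(n_in, n_out, allowed):
    """Deterministic, complete SA with one accepting location q0 and trap qv."""
    delta = {}
    for x in bits(n_in):
        for y in bits(n_out):
            delta[("q0", (x, y))] = "q0" if allowed(x, y) else QV
            delta[(QV, (x, y))] = QV  # no transition leaves the trap location
    return delta

def run(delta, word, q="q0"):
    """Follow the transition relation over a word and return the final location."""
    for event in word:
        q = delta[(q, event)]
    return q

def accepts(delta, word):
    """A word is accepted when it does not end in the violating location."""
    return run(delta, word) != QV

# Hypothetical policy: the two input signals must never be present together.
sa = single_location_sa(2, 1, lambda x, y: x != "11")
print(accepts(sa, [("10", "0"), ("01", "1")]))
print(accepts(sa, [("11", "0"), ("00", "0")]))
```

Because the trap location only has self-loops, any extension of a violating word is also violating, which matches the prefix-closed reading of the accepted language.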
Definition 2 (Input SA $A_{\phi_I}$): Given $\phi \subseteq \Sigma^*$, specified as SA $A_\phi = (Q, q_0, q_v, \Sigma, \rightarrow)$, the input SA $A_{\phi_I} = (Q, q_0, q_v, \Sigma_I, \rightarrow_I)$ is derived from $A_\phi$ by discarding outputs on the transitions. That is, for every transition $q \xrightarrow{(x,y)} q' \in {\rightarrow}$ where $(x, y) \in \Sigma$, there is a transition $q \xrightarrow{x}_I q' \in {\rightarrow_I}$, where $x \in \Sigma_I$. $L(A_{\phi_I})$ is denoted as $\phi_I \subseteq \Sigma_I^*$.
Example 1 (Example policy defined as SA and its input SA): Consider $I = \{B, Q\}$ and $O = \{X\}$. Let us consider the policy P: "B and Q can't happen at the same time and Q and X can't happen at the same time." Policy P is defined by the safety automaton in Figure 3a. The input SA for the SA in Figure 3a defining policy P is shown in Figure 3b. Though the SA $A_\phi$ is deterministic, the input SA $A_{\phi_I}$ may be nondeterministic. This is the case with the considered example, as shown in Figure 3b.
Lemma 1: Let $A_{\phi_I} = (Q, q_0, q_v, \Sigma_I, \rightarrow_I)$ be the input automaton derived from $A_\phi = (Q, q_0, q_v, \Sigma, \rightarrow)$. The following properties hold:
1) $\forall q, q' \in Q, \forall (x, y) \in \Sigma : q \xrightarrow{(x,y)} q' \implies q \xrightarrow{x}_I q'$.
2) $\forall q, q' \in Q, \forall x \in \Sigma_I : q \xrightarrow{x}_I q' \implies \exists y \in \Sigma_O : q \xrightarrow{(x,y)} q'$.
Lemma 1 is an immediate consequence of Definitions 1 and 2. Property 1 states that if there is a transition from state $q \in Q$ to state $q' \in Q$ in the automaton $A_\phi$ upon input-output event $(x, y) \in \Sigma$, then there is a transition from state $q$ to state $q'$ in the input automaton $A_{\phi_I}$ upon the input event $x \in \Sigma_I$. Property 2 states that if there is a transition from state $q \in Q$ to state $q' \in Q$ upon input event $x \in \Sigma_I$, then there must be an output event $y \in \Sigma_O$ s.t. there is a transition from state $q$ to state $q'$ upon event $(x, y)$ in the automaton $A_\phi$.
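The projection in Definition 2 can be sketched on policy P. This sketch assumes that P is captured by a single accepting location plus the trap location, which may differ from the automaton drawn in Figure 3a; the helper names are hypothetical.

```python
from itertools import product

def bits(n):
    return ["".join(b) for b in product("01", repeat=n)]

def policy_p():
    """Assumed one-location SA for policy P; x = (B, Q) and y = (X)."""
    delta = {}
    for x in bits(2):
        for y in bits(1):
            bad = x == "11" or (x[1] == "1" and y == "1")
            delta[("q0", (x, y))] = "qv" if bad else "q0"
            delta[("qv", (x, y))] = "qv"
    return delta

def input_sa(delta):
    """Discard outputs: map (q, x) to the set of all successor locations."""
    delta_i = {}
    for (q, (x, _y)), q_next in delta.items():
        delta_i.setdefault((q, x), set()).add(q_next)
    return delta_i

def lemma1_holds(delta, delta_i):
    """Check both directions of Lemma 1 (single output bit assumed)."""
    forward = all(n in delta_i[(q, x)] for (q, (x, _y)), n in delta.items())
    backward = all(
        any(delta.get((q, (x, y))) == n for y in bits(1))
        for (q, x), successors in delta_i.items()
        for n in successors
    )
    return forward and backward

delta = policy_p()
delta_i = input_sa(delta)
print(delta_i[("q0", "01")])  # two successors: the input SA is nondeterministic
print(lemma1_holds(delta, delta_i))
```

The input 01 (only Q present) leads either back to q0 or to the trap, depending on the discarded output X, which is where the nondeterminism comes from.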
In the product SA $A_{\phi_1} \times A_{\phi_2}$, all the locations in $(Q_1 \times \{q_v^2\}) \cup (\{q_v^1\} \times Q_2)$ are trap locations. All the outgoing transitions from these locations can be replaced with self-loops, and all such locations can be merged into a single violating location labeled $q_v$. Any outgoing transition from a location in $Q \setminus$ …
The product of SAs is useful to enforce multiple policies using the monolithic approach, by first constructing a product of the given SAs. Given two deterministic and complete SAs $A_{\phi_1}$ and $A_{\phi_2}$, the product SA $A_{\phi_1} \times A_{\phi_2}$ is deterministic and complete, and recognizes the language $L(A_{\phi_1}) \cap L(A_{\phi_2})$.
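A minimal sketch of the product construction, including the merge of all trap pairs into one violating location, is given below. The two policies are hypothetical one-location automata over three inputs and one output, and the helper names are chosen for illustration only.

```python
from itertools import product

def bits(n):
    return ["".join(b) for b in product("01", repeat=n)]

def one_location_sa(allowed):
    """Assumed shape: one accepting location q0 plus trap qv; 3 inputs, 1 output."""
    delta = {}
    for x in bits(3):
        for y in bits(1):
            delta[("q0", (x, y))] = "q0" if allowed(x, y) else "qv"
            delta[("qv", (x, y))] = "qv"
    return delta

def product_sa(d1, d2):
    """Synchronous product; every pair containing a trap is merged into qv."""
    delta = {}
    for (q1, event), n1 in d1.items():
        for (q2, other), n2 in d2.items():
            if event != other:
                continue  # both automata must read the same event
            src = "qv" if "qv" in (q1, q2) else (q1, q2)
            dst = "qv" if "qv" in (n1, n2) else (n1, n2)
            delta[(src, event)] = dst
    return delta

def accepts(delta, word, q):
    for event in word:
        q = delta[(q, event)]
    return q != "qv"

# Hypothetical policies: s1 forbids inputs 1 and 2 together, and input 2 with
# the output; s2 forbids inputs 2 and 3 together.
s1 = one_location_sa(lambda x, y: not (x[0] == x[1] == "1") and not (x[1] == y == "1"))
s2 = one_location_sa(lambda x, y: not (x[1] == x[2] == "1"))
both = product_sa(s1, s2)
start = ("q0", "q0")
print(accepts(both, [("100", "1")], start), accepts(both, [("011", "0")], start))
```

A word is accepted by the product exactly when both component automata accept it, which is the intersection of the two languages.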
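• $\mathrm{editI}_{\phi_I}(\sigma_I)$: Given $\sigma_I \in \Sigma_I^*$, $\mathrm{editI}_{\phi_I}(\sigma_I)$ is the set of input events $x \in \Sigma_I$ s.t. the word obtained by concatenating $x$ after $\sigma_I$ satisfies policy $\phi_I$. Formally, $\mathrm{editI}_{\phi_I}(\sigma_I) = \{x \in \Sigma_I : \sigma_I \cdot x \models \phi_I\}$. When we consider the SA $A_{\phi_I} = (Q, q_0, q_v, \Sigma_I, \rightarrow_I)$, the set of events in $\Sigma_I$ that allow reaching a state in $Q \setminus \{q_v\}$ from a state $q \in Q \setminus \{q_v\}$ is defined as $\mathrm{editI}_{A_{\phi_I}}(q) = \{x \in \Sigma_I : q \xrightarrow{x}_I q' \wedge q' \neq q_v\}$. Let us, for example, consider the SA in Figure 3b derived from the SA in Figure 3a by projecting on inputs. If we consider $\sigma = (10, 0) \cdot (01, 1)$, we have $\sigma_I = 10 \cdot 01$. Then, $\mathrm{editI}_{\phi_I}(\sigma_I) = \Sigma_I \setminus \{11\}$. Moreover, $q_0 \xrightarrow{10 \cdot 01}_I q_0$, and $\mathrm{editI}_{A_{\phi_I}}(q_0) = \Sigma_I \setminus \{11\}$.
• $\mathrm{randEditI}_{A_{\phi_I}}(q)$: If $\mathrm{editI}_{A_{\phi_I}}(q)$ is non-empty, then $\mathrm{randEditI}_{A_{\phi_I}}(q)$ returns an element (chosen randomly) from $\mathrm{editI}_{A_{\phi_I}}(q)$, and it is undefined if $\mathrm{editI}_{A_{\phi_I}}(q)$ is empty.
• $\mathrm{editO}_\phi(\sigma, x)$: Consider an input event $x \in \Sigma_I$ and an input-output word $\sigma \in \Sigma^*$. $\mathrm{editO}_\phi(\sigma, x)$ is the set of output events $y$ in $\Sigma_O$ s.t. the input-output word obtained by concatenating $\sigma$ followed by $(x, y)$ (i.e., $\sigma \cdot (x, y)$) satisfies policy $\phi$. Formally, $\mathrm{editO}_\phi(\sigma, x) = \{y \in \Sigma_O : \sigma \cdot (x, y) \models \phi\}$. When we consider the automaton $A_\phi = (Q, q_0, q_v, \Sigma, \rightarrow)$ specifying policy $\phi$, and an input event $x \in \Sigma_I$, the set of output events $y$ in $\Sigma_O$ that allow reaching a state in $Q \setminus \{q_v\}$ from a state $q \in Q \setminus \{q_v\}$ with $(x, y)$ is defined as $\mathrm{editO}_{A_\phi}(q, x) = \{y \in \Sigma_O : q \xrightarrow{(x,y)} q' \wedge q' \neq q_v\}$. For example, consider policy P defined by the automaton in Figure 3. We have $\mathrm{editO}_{A_\phi}(q_0, 01) = \{0\}$.
• $\mathrm{randEditO}_{A_\phi}(q, x)$: If $\mathrm{editO}_{A_\phi}(q, x)$ is not empty, then $\mathrm{randEditO}_{A_\phi}(q, x)$ returns a random element from $\mathrm{editO}_{A_\phi}(q, x)$, and it is undefined if $\mathrm{editO}_{A_\phi}(q, x)$ is empty.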
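The automaton-based edit functions listed above can be sketched directly over a transition table. The sketch assumes policy P has a single accepting location (Figure 3 may be drawn differently), computes the input set from the full automaton rather than building the input SA first, and models "undefined" as `None`; all helper names are hypothetical.

```python
import random
from itertools import product

def bits(n):
    return ["".join(b) for b in product("01", repeat=n)]

def policy_p():
    """Assumed one-location SA for policy P; x = (B, Q) and y = (X)."""
    delta = {}
    for x in bits(2):
        for y in bits(1):
            bad = x == "11" or (x[1] == "1" and y == "1")
            delta[("q0", (x, y))] = "qv" if bad else "q0"
            delta[("qv", (x, y))] = "qv"
    return delta

def edit_i(delta, q):
    """Inputs x for which at least one output keeps the automaton out of qv."""
    return {x for (p, (x, _y)), n in delta.items() if p == q and n != "qv"}

def edit_o(delta, q, x):
    """Outputs y such that (x, y) leads from q to an accepting location."""
    return {y for (p, (a, y)), n in delta.items() if p == q and a == x and n != "qv"}

def rand_edit(candidates, rng=random):
    """Pick any acceptable event; None stands in for 'undefined' on an empty set."""
    return rng.choice(sorted(candidates)) if candidates else None

p = policy_p()
print(sorted(edit_i(p, "q0")))     # every input except 11
print(sorted(edit_o(p, "q0", "01")))
print(rand_edit(edit_o(p, "q0", "01")))
```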

B. SELECT FUNCTIONS
In this section, we recall the Select functions, the MinD function, and the incremental enforcement function. The incremental security enforcement schemes that we shall be discussing are defined using these Select functions.
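• $\mathrm{SelectO}_\phi(\sigma, x, Y)$: Given an input-output word $\sigma \in \Sigma^*$, an input event $x \in \Sigma_I$, and a set of output events $Y \subseteq \Sigma_O$, $\mathrm{SelectO}_\phi(\sigma, x, Y)$ is the set of output events $y$ in $Y$ s.t. the input-output word obtained by extending $\sigma$ with $(x, y)$ satisfies policy $\phi$. Formally, $\mathrm{SelectO}_\phi(\sigma, x, Y) = \{y \in Y : \sigma \cdot (x, y) \models \phi\}$. Considering the automaton $A_\phi = (Q, q_0, q_v, \Sigma, \rightarrow)$ defining policy $\phi$, and an input event $x \in \Sigma_I$, the set of output events $y$ in $Y$ that allow reaching a state in $Q \setminus \{q_v\}$ from a state $q \in Q \setminus \{q_v\}$ with $(x, y)$ is defined as $\mathrm{SelectO}_{A_\phi}(q, x, Y) = \{y \in Y : q \xrightarrow{(x,y)} q' \wedge q' \neq q_v\}$. For example, consider policy P illustrated in Figure 3.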
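As an illustration of how SelectO narrows a candidate set rather than searching the whole output alphabet, the following sketch applies it to policy P. It assumes P has a single accepting location, and the values shown are computed by this sketch rather than quoted from the source; `select_o` and `policy_p` are hypothetical names.

```python
from itertools import product

def bits(n):
    return ["".join(b) for b in product("01", repeat=n)]

def policy_p():
    """Assumed one-location SA for policy P; x = (B, Q) and y = (X)."""
    delta = {}
    for x in bits(2):
        for y in bits(1):
            bad = x == "11" or (x[1] == "1" and y == "1")
            delta[("q0", (x, y))] = "qv" if bad else "q0"
            delta[("qv", (x, y))] = "qv"
    return delta

def select_o(delta, q, x, candidates):
    """Keep only the candidate outputs that avoid the violating location."""
    return {y for y in candidates if delta[(q, (x, y))] != "qv"}

p = policy_p()
print(select_o(p, "q0", "01", {"0", "1"}))  # Q present, so X must be absent
print(select_o(p, "q0", "10", {"0", "1"}))  # only B present, X is unconstrained
print(select_o(p, "q0", "01", {"1"}))       # nothing acceptable in this subset
```

When the candidate set is the full output alphabet, the result coincides with the set of acceptable output edits for that location and input.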

IV. RUNTIME ENFORCEMENT WITH SAFETY AUTOMATA
To set the scene, we recall the runtime enforcement approach from [19], which presents how any given word $\sigma \in \Sigma^*$ is transformed to comply with the policy $\phi$.
An enforcer for the policy $\phi$ can only alter an input-output event when it is absolutely essential; it cannot block, postpone, or suppress events.
At a high level, an enforcer may be thought of as a function that modifies input-output words. An enforcement function for the policy $\phi$ takes an input-output word over $\Sigma$ as input and produces an input-output word over $\Sigma$ that conforms to $\phi$ as output.
We briefly recall the constraints that an enforcer for any given policy ϕ should satisfy.Formal definitions of these constraints and more details are given in [19].
Constraints that should be satisfied by an enforcer for a given property $\phi$: Several constraints, namely Soundness, Transparency, Monotonicity, Instantaneity, and Causality, must be met by an enforcer for a given property $\phi$.
To be considered sound, the enforcer's output for each input word must always satisfy the property $\phi$. In order to maintain transparency, the enforcer leaves the input event undisturbed when no changes are needed to comply with policy $\phi$. The monotonicity condition means that the enforcer cannot undo what has already been transmitted as output. The instantaneity constraint states that when the enforcer receives a new event, it must respond immediately and provide an output event instantaneously. The causality constraint specifies that the enforcer has to first transform inputs from the environment in each step according to property $\phi$, followed by reading and transforming the output.
Remark 2 (Enforceability): Let $\phi \subseteq \Sigma^*$ be a policy. We recall from [20] that $\phi$ is enforceable iff an enforcer $E_\phi$ for $\phi$ exists that satisfies all of the Soundness, Transparency, Monotonicity, Instantaneity, and Causality constraints. As discussed in [20], not all safety properties are enforceable. The conditions for enforceability (defining when a given policy is said to be enforceable) are also discussed in [20]. Informally, the SA defining the policy satisfies the enforceability condition when every accepting state in the SA has one (or more) transition(s) to an accepting state.
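The informal enforceability condition can be checked mechanically on a transition table. The sketch below encodes only that informal reading (every accepting location has at least one transition to an accepting location); the formal conditions are those of [20], and the two small automata are hypothetical.

```python
def is_enforceable(delta, trap="qv"):
    """True if every accepting location has a transition to an accepting location."""
    accepting = {q for (q, _event) in delta if q != trap}
    return all(
        any(src == q and dst != trap for (src, _event), dst in delta.items())
        for q in accepting
    )

# Hypothetical automata over a two-event alphabet {"a", "b"}.
good = {
    ("q0", "a"): "q0", ("q0", "b"): "qv",
    ("qv", "a"): "qv", ("qv", "b"): "qv",
}
stuck = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "qv", ("q1", "b"): "qv",  # q1 can only move to the trap
    ("qv", "a"): "qv", ("qv", "b"): "qv",
}
print(is_enforceable(good), is_enforceable(stuck))
```

In the second automaton an enforcer that reaches q1 has no event left to substitute, so no edit can keep the run out of the violating location.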
We now recall the definition of an enforcement function from [19] that satisfies the constraints discussed above. Every reaction of the system is an input-output event pair $(x, y)$, where $x \in \Sigma_I$ is the input and $y \in \Sigma_O$ is the output. When an enforcer receives an input-output pair $(x, y)$, it immediately produces the transformed input-output pair $(x', y')$. The enforcer first processes the input $x$ to produce $x'$, and then the output $y$ to produce $y'$ and form the pair $(x', y')$. The enforcement function $E_\phi$ consists of two sub-functions: the input enforcement function $E_I$ and the output enforcement function $E_O$. $E_I$ requires the environment input $x$ to produce the transformed input $x'$. $E_O$ requires the transformed input $x'$ and the controller output $y$ (obtained by running the controller) to produce the transformed event pair $(x', y')$, which is appended to the output of the enforcer.

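Definition 4 (Enforcement Function): The enforcement function $E_\phi : \Sigma^* \to \Sigma^*$ is defined as
$$E_\phi(\sigma) = E_O(E_I(\sigma_I), \sigma_O)$$
where $E_I : \Sigma_I^* \to \Sigma_I^*$ is defined as
$$E_I(\epsilon_{\Sigma_I}) = \epsilon_{\Sigma_I}, \qquad E_I(\sigma_I \cdot x) = \begin{cases} E_I(\sigma_I) \cdot x & \text{if } E_I(\sigma_I) \cdot x \models \phi_I \\ E_I(\sigma_I) \cdot x' & \text{otherwise} \end{cases}$$
where $x' = \mathrm{randEditI}_{\phi_I}(E_I(\sigma_I))$, and $E_O : \Sigma_I^* \times \Sigma_O^* \to \Sigma^*$ is defined as
$$E_O(\epsilon_{\Sigma_I}, \epsilon_{\Sigma_O}) = \epsilon, \qquad E_O(\sigma_I \cdot x, \sigma_O \cdot y) = \begin{cases} E_O(\sigma_I, \sigma_O) \cdot (x, y) & \text{if } E_O(\sigma_I, \sigma_O) \cdot (x, y) \models \phi \\ E_O(\sigma_I, \sigma_O) \cdot (x, y') & \text{otherwise} \end{cases}$$
where $y' = \mathrm{randEditO}_\phi(E_O(\sigma_I, \sigma_O), x)$.
The function $E_\phi$ takes a word in $\Sigma^*$ and outputs another word in $\Sigma^*$. For a word $\sigma \in \Sigma^*$, the projection of $\sigma$ on inputs is $\sigma_I \in \Sigma_I^*$, and the projection of $\sigma$ on outputs is $\sigma_O \in \Sigma_O^*$. The output of function $E_\phi$ is defined through two functions, $E_I$ and $E_O$.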
Function $E_I$: The input enforcement function $E_I$ takes as input the projection on inputs $\sigma_I \in \Sigma_I^*$ of a given word $\sigma \in \Sigma^*$ and returns a word in $\Sigma_I^*$. The function $E_I$ is defined inductively. When the input is $\sigma_I = \epsilon_{\Sigma_I}$, it returns $\epsilon_{\Sigma_I}$. When $\sigma_I \in \Sigma_I^*$ has been read and $E_I(\sigma_I)$ returned as output, and a new input $x$ arrives, there are two possible scenarios:
• If $E_I(\sigma_I)$ followed by the input $x$ satisfies the input policy $\phi_I$, then the input $x$ is appended to the previous output of function $E_I$.
• Otherwise, $E_I(\sigma_I) \cdot x$ does not satisfy $\phi_I$, and so input $x$ is transformed to input $x'$ using $\mathrm{randEditI}_{\phi_I}(E_I(\sigma_I))$, which is appended to the previous output of function $E_I$.
Function $E_O$: The output enforcement function $E_O$ takes, as input, an input word from $\Sigma_I^*$ and an output word from $\Sigma_O^*$ and returns an input-output word in $\Sigma^*$, which is a sequence of tuples with an input and an output for each event. The function $E_O$ is defined inductively. The output of $E_O$ is $\epsilon$ when both the input and output words are empty.
If $\sigma_I \in \Sigma_I^*$ and $\sigma_O \in \Sigma_O^*$ have been read, the output is $E_O(\sigma_I, \sigma_O)$. If another input event $x$ and output event $y$ are observed, there are two possibilities:
• If $E_O(\sigma_I, \sigma_O) \cdot (x, y)$ satisfies $\phi$, the event $(x, y)$ is appended to the previous output of the function $E_O$.
• Otherwise, $\mathrm{randEditO}_\phi(E_O(\sigma_I, \sigma_O), x)$ alters output $y$ to obtain $y'$, and the event $(x, y')$ is appended to the previous output of the function $E_O$.

Remark 3 (Functional definition satisfies constraints): In [19], it is proved that for any given policy $\phi$ that is enforceable, the enforcer defined as function $E_\phi$ (Definition 4) satisfies the Soundness, Transparency, Monotonicity, Instantaneity, and Causality constraints.
Example 2 (Functional definition): Consider the policy P: "MAX_CURRENT and HEATER can't happen at the same time, and MAX_TEMP and HEATER can't happen at the same time" illustrated in Figure 4, where $I = \{\text{MAX\_CURRENT}, \text{MAX\_TEMP}\}$ and $O = \{\text{HEATER}\}$. The output of functions $E_I$ and $E_O$ is illustrated in Table 1. The input sequence $\sigma = (01, 0) \cdot (01, 1)$ (where $\sigma_I = 01 \cdot 01$ and $\sigma_O = 0 \cdot 1$) is processed step by step by the enforcement function. Initially, the input and output words are empty, $\epsilon_{\Sigma_I}$ and $\epsilon_{\Sigma_O}$ respectively, and so the enforcer output is empty, $\epsilon$. The first event, (01, 0) (i.e., MAX_TEMP), satisfies policy P, so it is emitted without any edit. The second event, (01, 1) (i.e., MAX_TEMP and HEATER), giving $\sigma = (01, 0) \cdot (01, 1)$, is acceptable to the input enforcer and so $E_I(\sigma_I) = 01 \cdot 01$. However, the HEATER output of 1 violates P, and so the enforcer transforms the HEATER signal to 0 to satisfy the policy. Thus, the enforcer outputs $(01, 0) \cdot (01, 0)$.
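The run in Example 2 can be reproduced with a small state-based sketch of the enforcement function. It assumes the heater policy has a single accepting location (Figure 4 may differ) and tracks one automaton location instead of evaluating the word-based definition directly; the function and variable names are hypothetical.

```python
import random
from itertools import product

def bits(n):
    return ["".join(b) for b in product("01", repeat=n)]

def heater_policy():
    """Assumed one-location SA; x = (MAX_CURRENT, MAX_TEMP) and y = (HEATER)."""
    delta = {}
    for x in bits(2):
        for y in bits(1):
            bad = y == "1" and "1" in x  # heater on while either limit is hit
            delta[("q0", (x, y))] = "qv" if bad else "q0"
            delta[("qv", (x, y))] = "qv"
    return delta

def enforce(delta, word, q="q0", rng=random):
    """Edit inputs first, then outputs, only when the policy would be violated."""
    out = []
    for x, y in word:
        ok_in = sorted({a for (p, (a, _b)), n in delta.items() if p == q and n != "qv"})
        if x not in ok_in:
            x = rng.choice(ok_in)   # input edit (not needed in this example)
        ok_out = sorted(b for (p, (a, b)), n in delta.items()
                        if p == q and a == x and n != "qv")
        if y not in ok_out:
            y = rng.choice(ok_out)  # output edit
        q = delta[(q, (x, y))]
        out.append((x, y))
    return out

sigma = [("01", "0"), ("01", "1")]
result = enforce(heater_policy(), sigma)
print(result)
```

Here the only acceptable output for input 01 is 0, so the random choice is forced and the second event is emitted as (01, 0), as in the example.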

V. RUNTIME ENFORCEMENT WITH MULTIPLE POLICIES
In this section, we discuss monolithic, incremental, and parallel composition methods for enforcing a set of policies expressed as safety automata (SAs) in the reactive systems framework.
Example 3 (Example Policies): Let $I = \{A, B, C\}$ and $O = \{R\}$. Consider the following policies: $S_1$: "A and B cannot happen simultaneously, and also B and R cannot happen simultaneously" and $S_2$: "B and C cannot happen simultaneously." The safety automata in Figure 5a and Figure 5b define policies $S_1$ and $S_2$ respectively.

A. MONOLITHIC SECURITY ENFORCEMENT
The monolithic approach to the composition of a collection of (SA) policies takes the product of policies. We take the product of SAs as defined in [20] as the intersection of policies. We can then synthesise one enforcer for the resulting policy (product of policies) if this resulting policy is enforceable (as discussed in Remark 2).
Specifically, given any two policies $\phi_1$ and $\phi_2$, to enforce both these policies, we first compute $\phi = \phi_1 \cap \phi_2$ (by computing the product of the SAs for $\phi_1$ and $\phi_2$). Then, if the resulting SA for $\phi$ is enforceable, we synthesize an enforcer for $\phi$ using the approach described in Section IV.
Example 4 (Monolithic Approach): Consider policies $S_1$ and $S_2$ defined as SAs illustrated in Figure 5. The product of these automata, defining $S_1 \cap S_2$, is shown in Figure 6. The policy $S_1 \cap S_2$ is enforceable as every accepting state has one (or more) transition(s) to an accepting state (see Remark 2). The behaviour of an enforcer for $A_{S_1 \cap S_2}$ is illustrated in Table 2 when the input-output word $(100, 1) \cdot (110, 1) \cdot (011, 0)$ is processed incrementally.
FIGURE 6. Product of automata $S_1$ and $S_2$.

TABLE 2. Example illustrating behavior of enforcer for
As discussed in the problem description, we focus on how to improve trust and scalability of enforcers with hardware composition. In our framework we cannot use the existing Definition 4 to compose enforcers as, in this definition, an enforcer reads from the environment and edits any inputs to satisfy the policy it is defined for. The same occurs for controller output: the enforcer reads this and edits any outputs to satisfy the underlying policy. This does not support multiple policies, as an edit made by one enforcer may not be compatible with another enforcer. In our framework we require all enforcers' acceptable solutions to be considered before an edit is selected. In the following section we recall incremental security enforcement [20], which addresses this concern.

B. INCREMENTAL SECURITY ENFORCEMENT
In earlier work [20], the enforcement layer was incrementally expanded to defend against new threats. This work introduced a framework where each enforcer sequentially takes, as input, a set of possible acceptable solutions. This required a redefinition of the enforcement function from Definition 4 and the definition of Select functions. The Select functions produce subsets of possible acceptable solutions which satisfy the policy. This subset is reduced incrementally by each enforcer's Select function until a final set, which is acceptable to all policies, is produced. A final function, MinD, is then used to pick the edit action from the final set. This is repeated for both input and output.
ϕ 2I are their corresponding input policies), we define the enforcement function E ϕ 1 ⇛ E ϕ 2 : * → * as E O (E I (σ I ), σ O ) where: ).
As per the incremental composition in Definition 5, all possible inputs I are passed to the input enforcers. The set obtained via SelectI satisfies all input policies, ϕ 1 and ϕ 2 in this instance.
When a new input event a arrives, it is input to the MinD function along with the output from SelectI. MinD chooses (if required) a suitable element from the set which satisfies all input policies to emit to the controller. If the input a satisfies the policies, it will be in the set from SelectI, and thus will be selected by MinD (this respects the Transparency requirement).
Similarly, all possible output events O and the final input event x are input to the output enforcers SelectO. The set obtained from this satisfies all output policies, ϕ 1 and ϕ 2 in this instance.
When an output event y arrives from the controller, it is input to the MinD function along with the output from SelectO, the set of all possible events which satisfy all policies. MinD chooses (if required) a suitable element from the set which satisfies all output policies to emit to the environment. As with input enforcement, if y satisfies all policies, it will be in the set from SelectO, and thus will be selected by MinD, respecting the Transparency requirement.
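The incremental narrowing described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from [20]: it assumes stateless policies written as predicates over bit-tuples, it assumes MinD selects by minimum Hamming distance, and the names `select`, `min_d`, `enforce_incremental`, `phi_1`, and `phi_2` are hypothetical.

```python
from itertools import product


def all_events(n_bits):
    """Every possible event over n_bits boolean signals."""
    return set(product((0, 1), repeat=n_bits))


def select(policy, candidates):
    """Keep only the candidate events that the policy accepts."""
    return {e for e in candidates if policy(e)}


def min_d(actual, candidates):
    """Pick the candidate closest to the actual event.

    Assumption: distance is the Hamming distance; sorting the candidates
    only makes tie-breaking deterministic in this sketch.
    """
    return min(sorted(candidates),
               key=lambda e: sum(a != b for a, b in zip(actual, e)))


def enforce_incremental(actual, policies, n_bits):
    """Narrow the candidate set one policy at a time, then pick an edit."""
    candidates = all_events(n_bits)
    for policy in policies:
        candidates = select(policy, candidates)
    if not candidates:
        raise ValueError("policies are not jointly enforceable")
    return min_d(actual, candidates)


# Hypothetical stateless policies over events (a, b, c).
def phi_1(e):
    return not (e[0] and e[1])      # a and b are never present together


def phi_2(e):
    return bool(e[2] or not e[0])   # a requires c
```

An event that already satisfies every policy is at distance zero from itself, so it is returned unchanged, which mirrors the Transparency requirement.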

C. A NEW ENFORCEMENT APPROACH
We have now considered monolithic and incremental enforcement approaches to composing multiple policies. The problem of composing multiple enforcers is not straightforward. Multiple enforcers as defined in Definition 4 cannot simply be combined, as demonstrated in [20].
The incremental approach resolves this with the definition of Select functions that output all acceptable solutions. Then a merge function, like Rand (which picks an element randomly from the set), can determine the final output which satisfies all policies.
In our work we consider trust and scalability improvements in hardware composition, and so the incremental approach does not work either, for the following reasons:
• As the enforcers execute in parallel, they do not consider other enforcers' acceptable solutions as in the incremental approach. Instead, parallel enforcers consider only current events (environment input(s) and/or controller output(s)), the internal automaton location, and any clocks. This prevents parallel enforcers from snooping on output from other enforcers, which supports hidden or confidential policies and their respective enforcers.
• As each enforcer produces a set of acceptable solutions simultaneously, these must be combined (via intersection) before an edit can be selected.
• From a practical perspective, passing sets of acceptable solutions (i.e. sets of sets) between hardware components does not scale. The initial set of acceptable solutions has 2^(N I + N O) elements, where N I is the number of input events and N O is the number of output events. As such, there would be at minimum 2^(N I + N O) connections between each hardware component.
For these reasons, the existing enforcement functions are not suitable for parallel hardware enforcement and so we propose a new high-trust enforcement scheme in the following section.
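To make the scaling argument concrete, the following short sketch evaluates the size of the initial candidate set. The function name `candidate_count` is hypothetical, and the signal counts used in the usage note are illustrative values rather than figures reported in this paper.

```python
def candidate_count(n_inputs, n_outputs):
    """Size of the initial set of acceptable solutions: 2^(N_I + N_O)."""
    return 2 ** (n_inputs + n_outputs)


def growth_per_added_signal():
    """Each extra boolean signal doubles the number of candidates."""
    return candidate_count(4, 1) // candidate_count(3, 1)
```

For events with three input bits and one output bit, such as (100, 1), `candidate_count(3, 1)` gives 16 candidates; with ten inputs and six outputs it already gives 65536, which is why passing whole candidate sets between hardware components is impractical.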

VI. HIGH-TRUST HARDWARE ENFORCEMENT
In this section we motivate our hardware implementation of the parallel enforcement strategy. Like previous efforts proposing runtime enforcement in hardware [22], [33], we choose this strategy as pure-hardware systems are, by definition, more secure than those relying on software. Software-based systems, as they run atop a hardware platform, must consider software security, hardware security, and cross-domain security [34]. Pure hardware systems, on the other hand, have a far smaller attack surface since they only need to be concerned with a subset of the aspects of hardware security. Software systems, which rise in complexity from 'bare-metal' application code all the way through to multi-threaded and networked operating systems, introduce a wide attack surface depending on the nature of the application in question. Without such a software layer, an entire class of vulnerabilities is no longer applicable to pure hardware systems.
Still, there are risks in hardware applications, including in the supply chain, although we largely consider these out of scope for this work (refer to the surveys [35], [36] for problems and solutions posed in the broader hardware security literature). The major concern in this work regards functional bugs which may be exploitable by malicious adversaries. This is because we target hardware implementations which are immutable and without update functionality. The only risk, if the production facilities are trustworthy, would be in implementations that are faulty, because these flaws would not be patchable. Ergo, it is essential that the hardware be implemented correctly; the procedures outlined in this section present this formally.

A. PARALLEL HARDWARE ENFORCEMENT
To define a high-trust framework for multi-policy enforcement in hardware we can repurpose the input and output enforcement Select functions from the incremental enforcement function. To compose multiple enforcers in parallel we need to redefine the enforcement function as illustrated in Figure 7:
• composing all input enforcement functions in parallel, such that each accepts input (x) and releases all satisfactory input combinations (X 1 , X 2 ) to a merge block where input that satisfies all policies (x ′ ) is selected and released to the controller.
• similarly, composing all output enforcement functions in parallel, such that each accepts final input (x ′ ) and controller output (y) then releases all satisfactory output combinations (Y 1 , Y 2 ) to a merge block where output that satisfies all policies (y ′ ) is selected and released to the environment.
• E I : * I → * I is defined as: where
• Rand is a function that picks an element randomly from a set.
Note that the parallel composition of enforcers using Select functions works whenever the individual policies and their intersection are enforceable. Given two properties ϕ 1 and ϕ 2 such that ϕ 1, ϕ 2, and ϕ 1 ∩ ϕ 2 are all enforceable, the parallel composition of enforcers of ϕ 1 and ϕ 2 as per the above definition produces a final output that satisfies ϕ 1 ∩ ϕ 2.
Let us consider input enforcement to understand this (similar reasoning applies to output enforcement). As per parallel composition, defined in Definition 6, all input enforcers simultaneously consider all possible inputs, I. The set obtained (using the intersection of the individual SelectI() outputs) is a valid one that satisfies all input properties (ϕ 1 and ϕ 2 in this case). This set is provided to randEdit which chooses, as required, a suitable edit.
Let us consider the following example to understand this further.
Example 5 (Parallel Composition using Select): Let us again consider properties S 1 and S 2 illustrated in Figure 5. Both properties S 1 and S 2 are individually enforceable and the property S 1 ∩ S 2 is also enforceable. When we compose input and output enforcers for these properties in parallel as per Definition 6, the final output obtained satisfies property S 1 ∩ S 2. For example, as shown in Table 3, consider the word (100, 1) • (110, 1) • (011, 0) to be processed. Whenever any input is given, the function randEditI ϕ I always selects a valid element, satisfying all the properties, to pass to the output enforcement function.
Theorem 2 (Parallel composition using Select): Consider two policies ϕ 1, ϕ 2 defined as SA, and where ϕ = ϕ 1 ∩ ϕ 2. If policy ϕ is enforceable, then E ϕ 1 ||E ϕ 2 as per Definition 6 is an enforcer for ϕ (satisfies the Soundness, Transparency, Monotonicity, Instantaneity, and Causality constraints).
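The parallel scheme can be illustrated with a small Python sketch. It is not the hardware realisation and does not use the automata S 1 and S 2 from Figure 5: the policies `phi_1` and `phi_2` below are hypothetical stateless predicates, and `select` and `merge` are illustrative names. Each enforcer computes its Select set independently from the full event space, the merge step intersects the sets, and a Rand-style choice is made only when the observed event is not already acceptable.

```python
import random
from itertools import product


def select(policy, n_bits):
    """One enforcer's acceptable set, computed without seeing other enforcers."""
    return {e for e in product((0, 1), repeat=n_bits) if policy(e)}


def merge(actual, selected_sets, rng=random):
    """Intersect all acceptable sets, then forward or edit the event."""
    common = set.intersection(*selected_sets)
    if not common:
        raise ValueError("intersection of policies is not enforceable")
    if actual in common:
        return actual                     # transparency: no edit needed
    return rng.choice(sorted(common))     # Rand: any acceptable element


# Hypothetical stateless policies over events (a, b, c).
def phi_1(e):
    return not (e[0] and e[1])      # a and b are never present together


def phi_2(e):
    return bool(e[2] or not e[0])   # a requires c
```

Whatever element Rand picks, it lies in the intersection and therefore satisfies both policies, which is the intuition behind the soundness part of Theorem 2.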

VII. SECURITY OF 3D PRINTERS
Functionally, FFF 3D printers are a type of CPS. A filament is fed into an extruder and hotend at the filament material's melting temperature (commonly used plastics range between 200 °C and 250 °C). The hotend nozzle is then moved precisely in 3D space in combination with the extrusion of melted filament to build up, layer by layer, the desired 3D objects.
In recent years FFF printing has expanded rapidly. This includes many quickly produced low-cost models that enable more hobbyists and consumers to begin printing at home, and high-end commercial machines producing parts for application in high-stakes scenarios such as tissue and organ printing [37], automotive [38], and aerospace [39].
These systems, like all digital systems, have potential cybersecurity vulnerabilities [40]. Adversaries taking advantage of these vulnerabilities will be motivated by reasons falling into three broad threat categories for 3D printers [41]: (i) Data Theft (which contains the subset of IP Theft), (ii) Sabotage (where some, but not all, of the attack methods will violate the integrity of the design file or the manufactured part), and (iii) either unauthorized part manufacturing or reverse engineering of parts, both of which can depend on either theft of digital design files or their reproduction via reverse engineering from a physical part. Since hardware parts are manufactured for selling to customers, the reverse engineering and authentication of genuine products is a major challenge.
Attack vectors may fall into purely the digital realm (whereby printer software and/or print files are impacted), the physical realm (where attackers might seek to compromise physical aspects of the print process such as by damaging motors or sensors), or some combination thereof, including compromising the raw printer materials [42]. Further complicating matters, 3D printers are increasingly networked for increased efficiency and monitoring; such features are now commonly native to industry standard designs, and even when not native, may be added via the use of Open Source (OS) tools like OctoPrint [43]. This raises a host of standard cyber-security risks [44], similar to those faced by fast-production, low-security IoT devices [45], [46]. Lateral movement within networks that contain untrusted devices (such as those same fast-production, low-security IoT devices) compounds the risk to a networked 3D printer. Demonstrations of AM vulnerabilities include a sabotage attack on 3D printed quadcopter propellers [8], which caused propeller failure during flight, and FLAW3D [47], a Trojan bootloader that caused a reduction in tensile strength by up to 50%. Given the potential for these sabotage-type attacks to compromise safety-critical design features (for instance, imagine an automotive part which has been subtly damaged to fail at high speeds), we focus primarily on how we can defend against sabotage-type attacks.

A. 3D PRINTER MODEL
As illustrated in Figure 8, we consider a generalised and abstracted 3D printer with:
• A heated print bed and hotend, requiring two heaters
• Current sensors for each heater
• Temperature sensors for the hotend, heatbreak, heatbed, and ambient temperature.
• MAX_CURRENT_HOTEND and MAX_CURRENT_HEATBED represent the maximum current limits for the hotend and heatbed respectively.
The following boolean controller outputs exist:
• EN_HEAT_HOTEND and EN_HEAT_HEATBED signals enable the heaters in the hotend and heatbed respectively.
• EN_MOTOR_X, EN_MOTOR_Y, EN_MOTOR_Z, and EN_MOTOR_E signals enable each axis motor or motors for the X, Y, Z, and E axes respectively.

B. ATTACKS
We propose a set of actions an attacker may carry out:
• Instruct one or more heater(s) to heat (or operate) above the safe temperature. This could impact the quality of printed items, cause damage to hardware, or result in a fire. Explicitly we consider:
A1 Overheat Hotend: Run the hotend heater above the safe maximum temperature
A2 Overheat Heatbreak: Run the hotend heater when heatbreak is above the safe maximum temperature
A3 Overheat Heatbed: Run the heatbed heater above the safe maximum temperature
A4 Overheat Ambient: Run the hotend heater and/or heatbed heater above the safe maximum ambient temperature
• Operate one or more actuator(s) at or above maximums to exceed safe current limits. While hardware fuses are likely in most devices, replacing these takes time, and repeatedly blowing fuses could be considered as a denial of service attack. In the worst case, a hardware fuse may fail or blow slowly, causing damage to the printer. Explicitly we consider:
A5 Overcurrent Hotend: Run the hotend heater above the safe maximum current
A6 Overcurrent Heatbed: Run the heatbed heater above the safe maximum current
• Drive stepper motors beyond their axis limits. Low impact consequences could be damaged axes, stepper motors, or printer frame, impacting print quality. The higher impact consequence is a fire from the stepper motor(s) overheating. Explicitly we consider:
A7 Stall X Axis: Run the X axis stepper motor while stalling for more than one (1) second
A8 Stall Y Axis: Run the Y axis stepper motor while stalling for more than one (1) second
A9 Stall Z Axis: Run the Z axis stepper motors while stalling for more than one (1) second
A10 Stall E Axis: Run the E axis stepper motor while stalling for more than one (1) second

1) POLICY ϕ 1 MITIGATING ATTACK A1 OVERHEAT HOTEND
The attack A1 Overheat Hotend runs the hotend (the component responsible for melting the input filament so that it can be extruded through the nozzle) heater by setting EN_HEAT_HOTEND to true when the hotend is at maximum temperature, indicated by the presence of input signal MAX_TEMP_HOTEND. This risks damaging an in-progress print, damaging the hotend, and causing a fire. Policy ϕ 1, as illustrated in Figure 9, mitigates this. The policy consists of two accepting locations (l0, l1) and one non-accepting violation location (lv).

FIGURE 9. Policy ϕ 1, which captures that the hotend heater should not be enabled when the hotend temperature is at maximum.
When the input reset signal RESET is present, the policy remains in the initial location l0; otherwise, the policy transitions to l1. The policy remains in location l1 if the hotend heater is not enabled (EN_HEAT_HOTEND signal absent) or if the maximum temperature of the hotend is not reached (MAX_TEMP_HOTEND signal absent).
If the hotend heater is enabled (EN_HEAT_HOTEND signal present) while the hotend is at maximum temperature (MAX_TEMP_HOTEND signal present), the policy transitions to a violation. The synthesised enforcer will, therefore, prevent violation by suppressing EN_HEAT_HOTEND, ensuring the policy remains in l1.
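The behaviour of policy ϕ 1 and its enforcer can be mimicked in software as a small state machine. The sketch below follows the textual description only: the function names `phi1_step` and `enforce_phi1` are hypothetical, and the handling of RESET while in l1 is not stated above, so the sketch assumes the policy simply stays in l1.

```python
def phi1_step(loc, reset, max_temp_hotend, en_heat_hotend):
    """Next location of policy phi_1 for one input/output pair."""
    if loc == "l0":
        # Remain in the initial location while RESET is present.
        return "l0" if reset else "l1"
    if loc == "l1":
        # Heating while at maximum temperature is the only violation.
        # Assumption: RESET has no effect once in l1.
        if en_heat_hotend and max_temp_hotend:
            return "lv"
        return "l1"
    return "lv"


def enforce_phi1(loc, reset, max_temp_hotend, en_heat_hotend):
    """Suppress EN_HEAT_HOTEND whenever forwarding it would reach lv.

    Returns the new location and the (possibly edited) output signal.
    """
    if phi1_step(loc, reset, max_temp_hotend, en_heat_hotend) == "lv":
        en_heat_hotend = False
    new_loc = phi1_step(loc, reset, max_temp_hotend, en_heat_hotend)
    return new_loc, en_heat_hotend
```

A heater request made below maximum temperature passes through unchanged, while the same request at maximum temperature is suppressed and the policy stays in l1.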

2) POLICY ϕ 2 MITIGATING ATTACK A2 OVERHEAT HEATBREAK
The attack A2 Overheat Heatbreak runs the hotend heater by setting EN_HEAT_HOTEND to true when the heatbreak (a component, directly attached to the hotend, that is responsible for preventing heat transfer up the filament to prevent blockages) is at maximum temperature, indicated by the presence of input signal MAX_TEMP_HEATBREAK. This risks damaging an in-progress print with inconsistent extrusion, damaging the heatbreak and surrounding components, and causing a fire. Policy ϕ 2, as illustrated in Figure 10, mitigates this. The policy consists of two accepting locations (l0, l1) and one non-accepting violation location (lv).
When the input reset signal RESET is present, the policy remains in the initial location l0; otherwise, the policy transitions to l1. The policy remains in location l1 if the hotend heater is not enabled (EN_HEAT_HOTEND signal absent) or if the maximum temperature of the heatbreak is not reached (MAX_TEMP_HEATBREAK signal absent).
If the hotend heater is enabled (EN_HEAT_HOTEND signal present) while the heatbreak is at maximum temperature (MAX_TEMP_HEATBREAK signal present), the policy transitions to a violation. The synthesised enforcer will, therefore, prevent violation by suppressing EN_HEAT_HOTEND, ensuring the policy remains in l1.

3) POLICY ϕ 3 MITIGATING ATTACK A3 OVERHEAT HEATBED
The attack A3 Overheat Heatbed runs the heatbed (the baseplate which the 3D prints are extruded onto and gradually built on) heater by setting EN_HEAT_HEATBED to true when the heatbed is at maximum temperature, indicated by the presence of input signal MAX_TEMP_HEATBED. This risks damaging an in-progress print, damaging the heatbed, and causing a fire. Policy ϕ 3, as illustrated in Figure 11, mitigates this. The policy consists of two accepting locations (l0, l1) and one non-accepting violation location (lv). When the input reset signal RESET is present, the policy remains in the initial location l0; otherwise, the policy transitions to l1. The policy remains in location l1 if the heatbed heater is not enabled (EN_HEAT_HEATBED signal absent) or if the maximum temperature of the heatbed is not reached (MAX_TEMP_HEATBED signal absent).
If the heatbed heater is enabled (EN_HEAT_HEATBED signal present) while the heatbed is at maximum temperature (MAX_TEMP_HEATBED signal present), the policy transitions to a violation. The synthesised enforcer will, therefore, prevent violation by suppressing EN_HEAT_HEATBED, ensuring the policy remains in l1.

4) POLICY ϕ 4 MITIGATING ATTACK A4 OVERHEAT AMBIENT
The attack A4 Overheat Ambient runs the hotend heater by setting EN_HEAT_HOTEND to true and/or the heatbed heater by setting EN_HEAT_HEATBED to true when the ambient temperature is at or above the printer's maximum operating temperature, indicated by the presence of input signal MAX_TEMP_AMBIENT. This risks reducing the life of printer components if operated beyond safe ambient temperature. Policy ϕ 4, as illustrated in Figure 12, mitigates this. The policy consists of two accepting locations (l0, l1) and one non-accepting violation location (lv).

FIGURE 12. Policy ϕ 4, which captures that neither the hotend nor the heatbed heater should be enabled when the ambient temperature is at maximum.
When the input reset signal RESET is present, the policy remains in the initial location l0; otherwise, the policy transitions to l1. The policy remains in location l1 if the hotend and heatbed heaters are not enabled (both EN_HEAT_HOTEND and EN_HEAT_HEATBED signals absent) or if the ambient temperature is below maximum (MAX_TEMP_AMBIENT signal absent).
If either or both hotend and heatbed heaters are enabled (EN_HEAT_HOTEND and EN_HEAT_HEATBED signals respectively) while the ambient temperature is at maximum (MAX_TEMP_AMBIENT signal present), the policy transitions to a violation. The synthesised enforcer will, therefore, prevent violation by suppressing both the EN_HEAT_HOTEND and EN_HEAT_HEATBED signals, ensuring the policy remains in l1.

5) POLICY ϕ 5 MITIGATING ATTACK A5 OVERCURRENT HOTEND
The attack A5 Overcurrent Hotend runs the hotend heater, by setting EN_HEAT_HOTEND to true, when at or above the maximum hotend current, indicated by the presence of input MAX_CURRENT_HOTEND. This risks wire insulation melting which could cause a fire, and reduces the life of the power supply if operated beyond maximum current. Policy ϕ 5, as illustrated in Figure 13, mitigates this. The policy consists of two accepting locations (l0, l1) and one non-accepting violation location (lv).
When the input reset signal RESET is present, the policy remains in the initial location l0; otherwise, the policy transitions to l1. The policy remains in location l1 if the hotend is not enabled (EN_HEAT_HOTEND signal absent) or if the hotend current is below maximum (MAX_CURRENT_HOTEND signal absent).
If the hotend heater is enabled (EN_HEAT_HOTEND signal present) while the hotend current is at or above the maximum (MAX_CURRENT_HOTEND signal present), the policy transitions to a violation. The synthesised enforcer will, therefore, prevent violation by suppressing the EN_HEAT_HOTEND signal, ensuring the policy remains in l1.

6) POLICY ϕ 6 MITIGATING ATTACK A6 OVERCURRENT HEATBED
The attack A6 Overcurrent Heatbed runs the heatbed heater, by setting EN_HEAT_HEATBED to true, when at or above the maximum heatbed current, indicated by the presence of input MAX_CURRENT_HEATBED. This risks wire insulation melting which could cause a fire, and reduces the life of the power supply if operated beyond maximum current. Policy ϕ 6, as illustrated in Figure 14, mitigates this. The policy consists of two accepting locations (l0, l1) and one non-accepting violation location (lv). When the input reset signal RESET is present, the policy remains in the initial location l0; otherwise, the policy transitions to l1. The policy remains in location l1 if the heatbed is not enabled (EN_HEAT_HEATBED signal absent) or if the heatbed current is below maximum (MAX_CURRENT_HEATBED signal absent).
If the heatbed heater is enabled (EN_HEAT_HEATBED signal present) while the heatbed current is at or above the maximum (MAX_CURRENT_HEATBED signal present), the policy transitions to a violation. The synthesised enforcer will, therefore, prevent violation by suppressing the EN_HEAT_HEATBED signal, ensuring the policy remains in l1.

7) POLICY ϕ 7 MITIGATING ATTACK A7 STALL X AXIS
The attack A7 Stall X Axis continues to run the X axis stepper motor, by setting EN_MOTOR_X to true, when the motor is stalling, indicated by the presence of input STALL_AXIS_X. This risks damaging the stepper motor and the axis frame of the 3D printer. Policy ϕ 7, as illustrated in Figure 15, mitigates this. The policy consists of three accepting locations (l0, l1, l2), one non-accepting violation location (lv), and has one clock, V X, responsible for timing the duration of a stall. When the input reset signal RESET is present, the policy remains in the initial location l0; otherwise, the policy transitions to l1. The policy remains in location l1 until the stepper motor is enabled and is sensed as stalling (both EN_MOTOR_X and STALL_AXIS_X signals present). If this occurs the policy transitions to l2 and the clock V X is reset.
The following transitions can be taken when the policy is in location l2:
• The motor is no longer enabled (EN_MOTOR_X absent) or no longer stalling (STALL_AXIS_X absent) and the clock (V X ) is less than one (1) second, then the policy transitions to l1.
• The motor is enabled (EN_MOTOR_X present) and stalling (STALL_AXIS_X present) while the clock (V X ) is less than one (1) second, then the policy remains in l2.
• The motor is enabled (EN_MOTOR_X present) and stalling (STALL_AXIS_X present) and the clock (V X ) is greater than or equal to one (1) second, then the policy transitions to lv. In this case the synthesised enforcer will prevent violation by suppressing EN_MOTOR_X.
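The timed behaviour of policy ϕ 7 can be approximated in software with a discrete clock. The sketch below is an illustration under stated assumptions rather than the synthesised hardware: the class name `Phi7` and the tick period `dt` are hypothetical, the clock V X only advances while in l2, and the case that is not listed above (the stall ending once the clock has already reached one second) is assumed to return to l1.

```python
from dataclasses import dataclass

STALL_LIMIT_S = 1.0  # one-second stall limit taken from the policy text


@dataclass
class Phi7:
    loc: str = "l0"
    v_x: float = 0.0  # clock V_X, in seconds

    def next_loc(self, reset, stall, en_motor):
        """Location reached for the given signals and current clock."""
        if self.loc == "l0":
            return "l0" if reset else "l1"
        if self.loc == "l1":
            return "l2" if (en_motor and stall) else "l1"
        if self.loc == "l2":
            if en_motor and stall:
                return "lv" if self.v_x >= STALL_LIMIT_S else "l2"
            return "l1"  # assumption: also taken when v_x >= 1 s
        return "lv"

    def tick(self, reset, stall, en_motor, dt):
        """Advance by dt seconds and return the enforced EN_MOTOR_X."""
        if self.loc == "l2":
            self.v_x += dt
        if self.next_loc(reset, stall, en_motor) == "lv":
            en_motor = False  # suppress EN_MOTOR_X to avoid the violation
        nxt = self.next_loc(reset, stall, en_motor)
        if self.loc == "l1" and nxt == "l2":
            self.v_x = 0.0    # clock reset on entering l2
        self.loc = nxt
        return en_motor
```

With a tick of 0.25 s, a motor that keeps stalling is allowed to run for the ticks in which the clock is below one second and is then suppressed; the same structure would apply to the Y, Z, and E axis policies with their own signals and clocks.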

8) POLICY ϕ 8 MITIGATING ATTACK A8 STALL Y AXIS
The attack A8 Stall Y Axis continues to run the Y axis stepper motor, by setting EN_MOTOR_Y to true, when the motor is stalling, indicated by the presence of input STALL_AXIS_Y. This risks damaging the stepper motor and the axis frame of the 3D printer. Policy ϕ 8, as illustrated in Figure 16, mitigates this. The policy consists of three accepting locations (l0, l1, l2), one non-accepting violation location (lv), and has one clock, V Y, responsible for timing the duration of a stall. When the input reset signal RESET is present, the policy remains in the initial location l0; otherwise, the policy transitions to l1. The policy remains in location l1 until the stepper motor is enabled and is sensed as stalling (both EN_MOTOR_Y and STALL_AXIS_Y signals present). If this occurs the policy transitions to l2 and the clock V Y is reset.
The following transitions can be taken when the policy is in location l2:
• The motor is no longer enabled (EN_MOTOR_Y absent) or no longer stalling (STALL_AXIS_Y absent) and the clock (V Y ) is less than one (1) second, then the policy transitions to l1.
• The motor is enabled (EN_MOTOR_Y present) and stalling (STALL_AXIS_Y present) while the clock (V Y ) is less than one (1) second, then the policy remains in l2.
• The motor is enabled (EN_MOTOR_Y present) and stalling (STALL_AXIS_Y present) and the clock (V Y ) is greater than or equal to one (1) second, then the policy transitions to lv. In this case the synthesised enforcer will prevent violation by suppressing EN_MOTOR_Y.

9) POLICY ϕ 9 MITIGATING ATTACK A9 STALL Z AXIS
The attack A9 Stall Z Axis continues to run the Z axis stepper motor, by setting EN_MOTOR_Z to true, when the motor is stalling, indicated by the presence of input STALL_AXIS_Z. This risks damaging the stepper motor and the axis frame of the printer. Policy ϕ 9, as illustrated in Figure 17, mitigates this. The policy consists of three accepting locations (l0, l1, l2), one non-accepting violation location (lv), and has one clock, V Z, responsible for timing the duration of a stall. When the input reset signal RESET is present, the policy remains in the initial location l0; otherwise, the policy transitions to l1. The policy remains in location l1 until the stepper motor is enabled and is sensed as stalling (both EN_MOTOR_Z and STALL_AXIS_Z signals present). If this occurs the policy transitions to l2 and the clock V Z is reset.
The following transitions can be taken when the policy is in location l2:
• The motor is no longer enabled (EN_MOTOR_Z absent) or no longer stalling (STALL_AXIS_Z absent) and the clock (V Z ) is less than one (1) second, then the policy transitions to l1.
• The motor is enabled (EN_MOTOR_Z present) and stalling (STALL_AXIS_Z present) while the clock (V Z ) is less than one (1) second, then the policy remains in l2.
• The motor is enabled (EN_MOTOR_Z present) and stalling (STALL_AXIS_Z present) and the clock (V Z ) is greater than or equal to one (1) second, then the policy transitions to lv. In this case the synthesised enforcer will prevent violation by suppressing EN_MOTOR_Z.

10) POLICY ϕ 10 MITIGATING ATTACK A10 STALL E AXIS
The attack A10 Stall E Axis continues to run the E axis stepper motor, by setting EN_MOTOR_E to true, when the motor is stalling, indicated by the presence of input STALL_AXIS_E. This risks clogging the extruder, damaging the stepper motor and the extruder. Policy ϕ 10, as illustrated in Figure 18, mitigates this. The policy consists of three accepting locations (l0, l1, l2), one non-accepting violation location (lv), and has one clock, V E, responsible for timing the duration of a stall. When the input reset signal RESET is present, the policy remains in the initial location l0; otherwise, the policy transitions to l1. The policy remains in location l1 until the stepper motor is enabled and is sensed as stalling (both EN_MOTOR_E and STALL_AXIS_E signals present). If this occurs the policy transitions to l2 and the clock V E is reset.
The following transitions can be taken when the policy is in location l2:
• The motor is no longer enabled (EN_MOTOR_E absent) or no longer stalling (STALL_AXIS_E absent) and the clock (V E ) is less than one (1) second, then the policy transitions to l1.
• The motor is enabled (EN_MOTOR_E present) and stalling (STALL_AXIS_E present) while the clock (V E ) is less than one (1) second, then the policy remains in l2.
• The motor is enabled (EN_MOTOR_E present) and stalling (STALL_AXIS_E present) and the clock (V E ) is greater than or equal to one (1) second, then the policy transitions to lv. In this case the synthesised enforcer will prevent violation by suppressing EN_MOTOR_E.

VIII. IMPLEMENTATION
To evaluate the proposed parallel enforcement method we introduce easy-rte-hardware, an extension of easy-rte [22] and easy-rte-incremental [20]. The main contribution of this extended compiler is support for policies to be compiled into Verilog hardware components which are run in parallel. Additionally, it adds support for compilation of the monolithic compositions (from easy-rte) and incremental compositions (from easy-rte-incremental) to synthesisable Verilog. The source code for easy-rte-hardware is available online.

A. COMPILING PARALLEL HARDWARE ENFORCERS
The process for compilation, as illustrated in Figure 19, consists of four steps:
• First, the original easy-rte parser creates an intermediate XML file which describes each policy completely.
• Second, our main contribution to the compiler, the easy-rte-hardware parser, ingests the intermediate XML and computes the violation recovery table. This table contains a recovery for every combination of violations that the policies have. This is explained further in the following section.
• Third, our modified easy-rte templater takes the new XML and produces synthesisable enforcer Verilog.
• Fourth, the Verilog file is passed to Quartus to synthesise hardware.

1) VIOLATION RECOVERY TABLE
The following algorithm is pseudocode to produce the violation recovery table:
for all row in violationTable do
    row.Append(violation, recoveries)
end for
The resulting xmlOut is then passed as input to the easy-rte templater to produce synthesisable Verilog code. In the next section, we discuss the synthesised hardware.
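As a rough illustration of what such a table could contain, the Python sketch below pairs every combination of per-policy violations with a combined recovery. It is not the easy-rte-hardware algorithm: the data layout, the name `build_violation_table`, and the choice to combine recoveries by taking the union of suppressed signals are all assumptions, with the per-policy recoveries taken from the descriptions of ϕ 1, ϕ 3, and ϕ 4.

```python
from itertools import product

# Hypothetical per-policy recoveries: the signals each policy suppresses
# when it would otherwise be violated.
RECOVERIES = {
    "phi1": {"EN_HEAT_HOTEND"},
    "phi3": {"EN_HEAT_HEATBED"},
    "phi4": {"EN_HEAT_HOTEND", "EN_HEAT_HEATBED"},
}


def build_violation_table(recoveries):
    """Map each combination of violated policies to one combined recovery.

    Keys are tuples of booleans, one flag per policy in sorted name order.
    Values are the sets of signals to suppress for that combination.
    """
    names = sorted(recoveries)
    table = {}
    for flags in product((False, True), repeat=len(names)):
        suppress = set()
        for name, violated in zip(names, flags):
            if violated:
                suppress |= recoveries[name]
        table[flags] = frozenset(suppress)
    return names, table
```

This sketch enumerates every combination naively, so it is only practical for a handful of policies; it is meant to show the shape of the lookup, in which each enforcer contributes a violation flag and the table returns a single edit acceptable to all of them.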

B. SYNTHESISED COMPOSITIONAL HARDWARE ENFORCERS
The synthesised enforcers are placed between the sensors and actuators (the environment) and the controller of the 3D printer, as illustrated in Figure 20. As illustrated, the enforcer is able to intercept and edit the controller's inputs and outputs as required. In practice, the enforcers could be deployed with custom hardware either on the printer controller's printed circuit board (PCB) or as a standalone PCB. Alternatively, if new or altered security policies are expected, a field-programmable gate array (FPGA) could be used to maintain reconfigurability.

1) MONOLITHIC
Monolithic compositions result in a single hardware component regardless of the number of policies composed. This component is separated into four subparts (input edit, output edit, transition logic, and storage of state and clocks) as illustrated by Figure 21. The execution of monolithic enforcement follows the four-state Finite State Machine (FSM) illustrated in Figure 22. Initially, the enforcer waits for the environment (in location Env.). When an environment input (Input) is produced, the FSM transitions to input enforcement (location E In ) and the input signals are passed to the input edit logic which, if required to prevent a violation, edits the input. This input is then guaranteed to satisfy the input policies and is sent to the controller (FSM transitions to Ctrl).
Once the controller generates output (Output), the FSM transitions to the output edit logic (E Out&Trans ). As with the input edit logic, any potential violation is edited to ensure the output policies are satisfied. The output is then passed to the transition logic block, which determines and executes the appropriate location transitions for the monolithic policy. As illustrated in the centre of Figure 21, the state and clocks are stored such that every other component can read the current location and the value of the clocks. These are updated only by the transition logic block. The output, which is guaranteed to satisfy both input and output policies, is then exposed to the environment as the FSM transitions to Env., where the process begins again when the next input event arrives.

2) INCREMENTAL
Incremental composition results in multiple hardware components, with each policy adding an input and an output enforcement component. Explicitly, there are 2 + 2N hardware components, where N is the number of policies. The hardware components in an incremental composition are illustrated in Figure 23. The FSM in Figure 24 controls sequential execution; as with the hardware components, the FSM expands as additional policies are added. The number of FSM states is 5 + 2N. Regardless of the number of policies, the following pattern of execution occurs. First, the FSM waits in location Env. for the environment to produce an input event (Input), upon which it transitions to the first input enforcement location (E_In_1), where the first input enforcer edit logic determines which, if any, violation recovery reference to emit (this recovery reference is determined at compile time, as described in Section VIII-A1). Each remaining input enforcement edit logic is then executed in sequence as the FSM proceeds through the remaining input enforcement locations (E_In_2, ..., E_In_N). On transitioning to input selection (location SelectIn), the violation resolution component takes each policy's recovery reference to determine whether any edit action is required and, if so, selects the appropriate edit. The input then satisfies all input policies and is exposed to the controller as the FSM transitions to Ctrl.
When the controller produces an output event (Output), each output edit logic component is executed in the same manner as for the input, as the FSM transitions through E_Out_1, E_Out_2, ..., E_Out_N. The violation resolution component then uses each violation recovery reference to determine whether any edit action is required and to select the appropriate edit. The output then satisfies all output policies and is exposed to the environment as the FSM transitions to Env., where the process begins again when the next input event arrives.
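The incremental execution pattern described above can be sketched in software as follows. This is only an illustration of the counting and ordering, not the generated Verilog: the function names are hypothetical, and the name SelectOut for the output resolution location is an assumption made by analogy with SelectIn. The visit order lists only the locations named in the text, so it has 4 + 2N entries while the full control FSM has 5 + 2N states.

```python
def incremental_component_count(n_policies: int) -> int:
    """Hardware components in an incremental composition: 2 + 2N."""
    return 2 + 2 * n_policies


def incremental_fsm_state_count(n_policies: int) -> int:
    """States in the incremental control FSM: 5 + 2N."""
    return 5 + 2 * n_policies


def incremental_visit_order(n_policies: int) -> list[str]:
    """Locations named in the text, in the order one reaction visits them."""
    input_chain = [f"E_In_{i}" for i in range(1, n_policies + 1)]
    output_chain = [f"E_Out_{i}" for i in range(1, n_policies + 1)]
    # "SelectOut" is an assumed name for the output resolution location.
    return ["Env", *input_chain, "SelectIn", "Ctrl", *output_chain, "SelectOut"]


if __name__ == "__main__":
    for n in (1, 4, 10):
        print(n, incremental_component_count(n), incremental_fsm_state_count(n))
    print(incremental_visit_order(2))
```

The length of the visit order grows with N, which is the property the parallel composition is designed to avoid.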

3) PARALLEL
Parallel composition also results in multiple hardware components, with each policy adding an input and an output enforcement component. Explicitly, there are 2 + 2N hardware components, where N is the number of policies. The hardware components in a parallel composition are illustrated in Figure 25. Given the policy's location, the input enforcement component for each policy determines and emits the appropriate violation recovery reference based on the violation recovery table (explained in Section VIII-A1). In addition to determining the appropriate violation recovery references, the output enforcement components are responsible for policy location transitions and for storing the policy location and any clocks. The FSM in Figure 26 controls execution; unlike the incremental control FSM, its number of states is fixed at seven. The FSM initially waits in location Env. for the environment to produce Input, at which point it transitions to the input enforcement location (E_In), where all input enforcement components are executed simultaneously to produce all violation recovery references. The FSM then transitions to SelectI, where the recovery references are supplied to the violation recovery table, which selects whether and how to edit the input signals. The input then satisfies all input policies and is exposed to the controller as the FSM transitions to Ctrl.
Once the controller produces Output, the FSM transitions to E_Out, where all output enforcement components are executed simultaneously to produce their violation recovery references. On transitioning to SelectOut, the violation recovery table produces an appropriately edited set of signals, which is used to update the location of each enforcer policy when the FSM transitions to E_Trans. Finally, the output, which satisfies all output policies, is exposed to the environment as the FSM transitions to Env., where the process begins again when the next input event arrives.
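The control flow of the parallel FSM can be sketched as a small transition table. This is a software illustration under stated assumptions, not the synthesised design: the location names follow the text, while the table layout and the helper names `step` and `run_reaction` are hypothetical. The sketch models control flow only; in hardware, every policy's enforcement component executes simultaneously inside E_In and E_Out.

```python
# Transition table for the seven-state parallel control FSM described above.
# Keys are (location, event); None marks a transition taken without waiting
# for an external event.
PARALLEL_FSM = {
    ("Env", "Input"): "E_In",
    ("E_In", None): "SelectI",
    ("SelectI", None): "Ctrl",
    ("Ctrl", "Output"): "E_Out",
    ("E_Out", None): "SelectOut",
    ("SelectOut", None): "E_Trans",
    ("E_Trans", None): "Env",
}


def step(location, event=None):
    """Return the next location, or stay put if the awaited event is absent."""
    return PARALLEL_FSM.get((location, event), location)


def run_reaction():
    """Trace one full Input/Output reaction that starts and ends in Env."""
    trace = ["Env"]
    location = step("Env", "Input")
    while location != "Env":
        trace.append(location)
        event = "Output" if location == "Ctrl" else None
        location = step(location, event)
    return trace


if __name__ == "__main__":
    print(run_reaction())
```

The trace has the same length for any number of policies, because adding a policy adds enforcement components but no control locations.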

4) UPDATING HARDWARE ENFORCERS
When the security landscape changes, policies may be added or altered. This requires recompilation of Verilog and resynthesis of some hardware blocks. The amount of recompilation and resynthesis varies by method of composition.
Monolithic composition requires complete recompilation and resynthesis of the hardware. This is the most expensive method in both time and computation. Incremental composition requires compilation and synthesis of the new or altered policies, the FSM controller, and the violation resolution block.
Our proposed parallel approach requires compilation and synthesis of only the new or altered policies and the violation resolution block. This means the violation recovery block is resynthesised whenever a policy is added or altered. However, no existing enforcer blocks need to be resynthesised.
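The update cost of each composition method can be summarised with a small sketch. The block labels and the function name below are hypothetical and chosen only for illustration; the sets themselves follow the description above.

```python
def blocks_to_resynthesise(method, changed_policies):
    """Return the hardware blocks to recompile and resynthesise after an update."""
    changed = {f"enforcer:{p}" for p in changed_policies}
    if method == "monolithic":
        # The whole composed enforcer is rebuilt, so nothing is reused.
        return {"monolithic_enforcer"}
    if method == "incremental":
        return changed | {"fsm_controller", "violation_resolution"}
    if method == "parallel":
        # The text lists no FSM controller here; its state count is fixed.
        return changed | {"violation_resolution"}
    raise ValueError(f"unknown composition method: {method}")


if __name__ == "__main__":
    for method in ("monolithic", "incremental", "parallel"):
        print(method, sorted(blocks_to_resynthesise(method, ["phi_5"])))
```

In this sketch, adding a hypothetical policy `phi_5` to a parallel composition touches only its own enforcer and the violation resolution block.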

IX. RESULTS
To benchmark the proposed scalable hardware enforcers for the 3D printer, we describe each policy from Section VII in the easy-rte specification language. We use easy-rte-hardware to compile monolithic, incremental, and parallel compositions of increasing complexity to Verilog. We do this by adding mitigation policies gradually to each composition as follows:
1) ϕ_1
2) ϕ_1 and ϕ_2
3) ϕ_1, ϕ_2, and ϕ_3
4) ϕ_1, ϕ_2, ϕ_3, and ϕ_4
and so on, up to the composition of all ten policies. As the complexity of the composition increases, the number of attacks mitigated increases, thus improving the security of the 3D printer.
We collect performance results for the compilation and hardware resource usage of each composed enforcer. All compilation is performed on a Windows 10 machine with an Intel i7-6700 processor running at 4 GHz and 32 GB of RAM. For hardware synthesis, the command line interface of Quartus Prime Lite 22.1 is used with a Cyclone V (5CGXFC7C7F23C8) as the target device.
For compilation, we report the time for easy-rte to generate the XML policy description and the Verilog composition, and the time for Quartus to perform synthesis. We also report the size of the XML and Verilog (SV) files. For hardware resources, we report logic elements, registers, and maximum clock frequency.
Throughout all results, the monolithic compositions are completed only up to a maximum of six policies. This is due to scalability issues in the easy-rte compiler for product automata. Compositions of seven or more policies were terminated after running for longer than two weeks on highly resourced virtual machines in the cloud. Improving easy-rte monolithic composition performance is beyond the scope of this work, but remains an area of exploration for future work.

A. COMPILATION

1) COMPILE TIME
The XML compile time is reported in Figure 27. The left chart includes all results and clearly illustrates exponential growth in the monolithic XML compile time, with the six-policy composition taking 10,908 seconds (just over 3 hours). The right chart removes the monolithic compositions to better visualise the incremental and parallel results. The beginning of an exponential trend can be seen in compositions eight through ten. The tenth composition took 2.21 seconds for incremental and 3.89 seconds for parallel. Both incremental and parallel demonstrate significantly more scalable XML compile times than monolithic; however, the developing exponential trend for parallel and incremental may limit scalability.
The Verilog compile time is reported in Figure 28. The left chart includes all results and, as with the XML compile time, illustrates exponential growth in the monolithic compile time. The peak monolithic compile time, for six policies, is similar to the XML compile time at 10,915 seconds (just over 3 hours). The right chart removes the monolithic compositions to better visualise the incremental and parallel results. As with the XML compile time, an exponential trend begins to appear for incremental and parallel compositions eight through ten. The tenth composition took 2.42 seconds for incremental and 4.10 seconds for parallel. Both incremental and parallel demonstrate significantly more scalable Verilog compile times than monolithic across the ten 3D printer policy compositions; however, the developing exponential trend may limit scalability.
The Quartus synthesis time is reported in Figure 30. A clear exponential trend is illustrated by the monolithic compositions, with a peak of 365 seconds for six policies. Parallel and incremental illustrate linear trends, with peaks of 152 seconds and 114 seconds respectively. This demonstrates that parallel and incremental are significantly more scalable.
Compilation time results demonstrate that the parallel and incremental methods are more scalable than monolithic. Results from easy-rte-hardware show exponential trends for more complex compositions in parallel and incremental. This is likely due to the limited optimisation in the original easy-rte for larger policies. An example is memory management when producing and reading XML files that do not fit into memory. Results from Quartus illustrate linear time complexity for parallel and incremental, which is encouraging and suggests that, with appropriate optimisation, easy-rte could obtain similar trends.

2) COMPILE SIZE
The size of the compiled XML files is reported in kilobytes (kB) in Figure 31. Exponential trends are illustrated for each composition type: a steeper curve for monolithic, with a peak XML size of 16,877 kB in the six-policy composition, and a peak of 10,963 kB for incremental and parallel in the ten-policy compositions.
The size of the compiled Verilog files is reported in kB in Figure 29. The left chart includes all data, and the right chart has monolithic compositions five and six removed to visualise the incremental and parallel trends more clearly. An exponential trend is illustrated for the monolithic Verilog files, with a peak of 14,971 kB for the six-policy composition. Parallel and incremental results peak at 574 kB and 583 kB respectively.
Compile size trends show that the proposed parallel and existing incremental methods are more scalable than monolithic. However, these approaches still show exponential trends in XML size. This suggests that an XML structure specialised for compositions may further improve scalability.
14404 VOLUME 12, 2024 Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

B. HARDWARE

1) LOGIC ELEMENTS
The number of logic elements used by the synthesised hardware components is illustrated in Figure 32. The incremental and parallel trends are piecewise linear. The first six policies (thermal and current protection policies) are relatively simple compared to the final four (stepper motor protection policies). This is reflected in the steeper linear trend for compositions seven through ten.
No significant monolithic trend is observed across the first six compositions; however, the final composition of six policies consumed 27 logic elements, compared to seven for each of the previous three. This hints that the expected exponential trend may be beginning. However, support for more complex monolithic compositions would be required to verify this.
These low total logic element counts demonstrate that the desired low overhead of runtime enforcement holds for our proposed compositional runtime enforcement.

2) REGISTERS
The number of registers in the synthesised hardware is illustrated in Figure 33. The monolithic results show very low register usage, with a peak of eight registers for the six-policy composition, as registers are only required to hold the current state of the enforcer. Results for larger compositions would be needed to confirm that the monolithic register count remains relatively low.
The incremental and parallel results reflect a piecewise linear trend consistent with the logic element results. Specifically, the increased complexity of the policies added in compositions seven through ten results in steeper linear growth in synthesised registers.

3) MAXIMUM CLOCK FREQUENCY
The maximum clock frequency, in MHz, is reported in Figures 34 and 35. The chart in Figure 34 includes all results, in which the monolithic method shows a downward trend as policy complexity increases. However, Quartus optimisation improves clock frequency for compositions four and five.
The chart in Figure 35 has the monolithic results removed to better illustrate the differences between incremental and parallel compositions. The incremental method shows a negative exponential decay trend in the maximum clock frequency, with a minimum of 6.5 MHz for the ten-policy composition. The parallel method shows a less consistent frequency of between 75 and 80 MHz for compositions one to six. The frequency drops significantly to between 40 and 50 MHz when the more complex stepper motor policies are added in compositions seven through ten.
The results here reflect a key difference between the incremental and parallel methods. As the number of policies increases, the time taken to execute the incremental approach increases. In the parallel approach this is not the case.

A. PARALLEL HARDWARE OR INCREMENTAL SOFTWARE?
In this work, we have proposed a framework in which multiple hardware enforcers enforce simultaneously without suffering from the state space explosion of the monolithic approach (which uses the product of policies). In earlier work, we proposed multiple software enforcers that enforce sequentially, or incrementally. For a designer wishing to deploy enforcers, there is a trade-off between these approaches.
We have discussed how hardware, relative to software, benefits from a reduced attack surface for malicious actors to exploit (Section VI). Our results show that the clock frequency of the proposed parallel approach is higher than that of the incremental approach. Therefore, for maximum speed and a reduced attack surface, parallel enforcers in hardware may be the designer's best choice.
Incremental software enforcers, however, benefit from reduced cost. This is because software can be deployed on cheaper microcontrollers, rather than the more expensive FPGAs or custom hardware required for hardware enforcers. This also affects the ability to redeploy or update enforcers. The incremental approach supports adding new enforcers to the chain, which can then be uploaded to the microcontroller. The hardware approach requires partial resynthesis (see Section VIII-B4) and either upload to the FPGA or entirely new custom hardware. Therefore, in applications where policies are expected to change or grow regularly, a software approach may be the designer's preferred approach.
Ultimately, the requirements of the application and the desired level of security determine which approach is most appropriate. We note that the approach proposed in this work is CPS-agnostic. This means it can be applied to any system whose policies can be modelled as DTA with Boolean inputs and outputs, including applications broader than CPSs.

B. CHALLENGES
During the development of this framework, a number of challenges were encountered; we now discuss two of significance.
A key requirement of the parallel approach was to resolve the scalability challenges faced by the monolithic approach. Therefore, the hardware design supporting parallel composition needed to avoid any exponential growth in resource consumption. This proved to be a significant challenge.
Conceptually, for each input/output event to be enforced, there is a set of possible inputs and outputs. Each enforcer has a subset of these events that it deems satisfactory to its policy. These subsets need to be combined (by intersection) and a final event needs to be selected. It was the intersection operation that was most challenging to optimise.
After a range of approaches were trialled, the violation recovery table design presented in Section VIII-A1 was selected as the most resource-efficient method. It rests on the precalculation of intersections and a final lookup table of recovery actions.
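The conceptual combine-and-select step can be illustrated with a short sketch. It is a software analogy only and makes several assumptions that are not taken from the paper: events are tuples of Boolean signals, each policy is given as an explicit set of accepted events, and the final event is chosen as the accepted event that changes the fewest signals. Unlike the violation recovery table, which precalculates intersections at compile time, this sketch computes the intersection at run time for clarity.

```python
from itertools import product


def all_events(n_signals):
    """Every valuation of n Boolean signals."""
    return set(product((False, True), repeat=n_signals))


def hamming(a, b):
    """Number of signals on which two events differ."""
    return sum(x != y for x, y in zip(a, b))


def select_event(proposed, accepted_sets):
    """Intersect the policies' accepted events and pick one near `proposed`."""
    if not accepted_sets:
        return proposed  # no policies, nothing to enforce
    allowed = set.intersection(*accepted_sets)
    if not allowed:
        raise ValueError("policies have no common acceptable event")
    if proposed in allowed:
        return proposed  # no edit needed
    # Assumed selection rule: fewest changed signals, ties broken by tuple order.
    return min(allowed, key=lambda e: (hamming(proposed, e), e))


# Two hypothetical policies over two Boolean signals.
policy_a = all_events(2) - {(True, True)}
policy_b = all_events(2) - {(False, True)}

print(select_event((True, True), [policy_a, policy_b]))
print(select_event((False, False), [policy_a, policy_b]))
```

Enumerating every event is exponential in the number of signals, which is one reason a precalculated table is attractive in hardware.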
The other noteworthy challenge was producing results for the monolithic approach. As this approach is plagued by rapid exponential growth, the time taken to compile the enforcers quickly grew from minutes to hours. This was amplified by the existing monolithic compiler not having sufficient memory management to support larger compositions. Despite compilation attempts on cloud-based virtual machines with significant compute and memory capacity, some compositions had not completed after a week, and hence our monolithic results end after six combined policies.
Though beyond the scope of this paper, there is value in developing an improved monolithic compiler. This would expand the comparison with our proposed approach.

XI. CONCLUSION AND FUTURE WORK
The importance of safe and secure Cyber-Physical Systems (CPSs) is made clear by the significant human impact when they fail. Applying Runtime Enforcement to these CPSs provides a method to ensure security policies are not violated during system execution. However, as the number of policies increases, methods to compose these policies become important.
In this work, we investigated parallel composition of bi-directional hardware enforcers for enforcing security policies for an FFF 3D printer. We use hardware enforcers as this reduces reliance on a tech stack (e.g. firmware, software, and operating system) with frequently discovered vulnerabilities. In CPSs where updates can be limited, a high-trust execution platform is desirable.
We propose a novel bi-directional hardware RE framework that composes policies in parallel. This is the first work that addresses potential security vulnerabilities in software-based enforcers and supports parallel policy composition for reactive systems. We provide a tool, easy-rte-parallel, which produces Verilog descriptions of these enforcers from a high-level policy description. We compare our proposed approach with monolithic and serial (incremental) methods of policy composition.
Our results demonstrate the expected state space explosion in the monolithic approach and more scalable increases in resource consumption for both the serial (incremental) approach and our proposed parallel approach. Specifically, we show that the consumption of resources is linear and proportional to the complexity of the composed policies. We also show higher clock frequencies compared to the serial (incremental) approach for the same number of composed policies. The results show our proposed approach is well suited to settings where numerous security policies are required, as in CPSs such as 3D printers.
In the future, we would like to demonstrate our hardware enforcers in a wider range of CPSs, study the possibility of combining parallel and serial (incremental) approaches, and investigate the distribution of enforcers.
Let us first recall the constraints from [19] that an enforcer for any given policy ϕ should satisfy. We have informally discussed these constraints in Section IV; further details and explanation of Definition 7 are given in [19].
Definition 7 (Enforcer for ϕ): An enforcer for a given policy ϕ ⊆ Σ* is a function E_ϕ : Σ* → Σ* satisfying the following constraints: (Cau)
Let us prove this theorem using induction on the length of the input sequence σ ∈ Σ*.
Induction basis. Theorem 2 holds for σ = (ϵ_I, ϵ_O), since the function will not release any input-output event as output and thus E_O(E_I(ϵ_I), ϵ_O) = ϵ.

FIGURE 1. Threats to the security of 3D printers and the potential impacts of successful attacks.

FIGURE 2. System view of bidirectional reactive enforcement.

FIGURE 5. Safety automaton for S_1 and S_2.

FIGURE 10. Policy ϕ_2, which captures that the hotend heater should not be enabled when the heatbreak temperature is at maximum.

FIGURE 11. Policy ϕ_3, which captures that the heatbed heater should not be enabled when the heatbed temperature is at maximum.

FIGURE 13. Policy ϕ_5, which captures that the hotend heater should not be enabled when at the maximum current threshold.

FIGURE 14. Policy ϕ_6, which captures that the heatbed heater should not be enabled when at the maximum current threshold.

FIGURE 15. Policy ϕ_7, which captures that the X axis motors should not be enabled when stalling for longer than one (1) second.

FIGURE 16. Policy ϕ_8, which captures that the Y axis motors should not be enabled when stalling for longer than one (1) second.

FIGURE 17. Policy ϕ_9, which captures that the Z axis motors should not be enabled when stalling for longer than one (1) second.

FIGURE 18. Policy ϕ_10, which captures that the E (extruder) axis motor should not be enabled when stalling for longer than one (1) second.

FIGURE 27. Charts illustrating the growth in XML compilation time for Monolithic (Mono.), Incremental (Incr.), and Parallel (Para.). The left chart includes all data points. The right chart excludes the Monolithic results to allow comparison of Incremental and Parallel results. Note the chart scales differ.

FIGURE 29. Charted Verilog (SV) compilation size for Monolithic (Mono.), Incremental (Incr.), and Parallel (Para.). The left chart includes all data points and the right chart removes the 5th and 6th compositions for Monolithic to improve clarity of Incremental and Parallel results. Note the chart scales differ.

FIGURE 30. Time taken for Quartus to synthesise hardware for each composition as the policy count increases.

FIGURE 31. Size of compiled XML policy descriptions for each composition as the number of policies increases.

FIGURE 32. Number of logic elements synthesised for each composition as the number of policies increases.

FIGURE 33. Number of registers synthesised for each composition as the number of policies increases.

FIGURE 34. Maximum clock frequencies for monolithic, incremental, and parallel compositions, including all data points.

FIGURE 35. Maximum clock frequencies for incremental and parallel compositions.

TABLE 1. Functional definition example for policy P.

TABLE 3. Parallel composition scheme using Select.