From Digital Twins to Digital Twin Prototypes: Concepts, Formalization, and Applications

The transformation to Industry 4.0 also transforms the processes of developing intelligent manufacturing production systems. Digital twins may be employed to advance the development of these new (embedded) software systems. However, there is no consensual definition of what a digital twin is. In this paper, we provide an overview of the current state of the digital twin concept and formalize the digital twin concept using the Object-Z notation. This formalization includes the concepts of physical twins, digital models, digital templates, digital threads, digital shadows, digital twins, and digital twin prototypes. The relationships between all these concepts are visualized as class diagrams using the Unified Modeling Language. Our digital twin prototype approach supports engineers in the development and automated testing of complex embedded software systems. This approach enables engineers to test embedded software systems in a virtual context without the need of a connection to a physical object. In continuous integration/continuous deployment pipelines, such digital twin prototypes can be used for automated integration testing and, thus, allow for an agile verification and validation process. In this paper, we demonstrate and report on the application and implementation of a digital twin using the example of two real-world field studies (ocean observation systems and smart farming). For independent replication and extension of our approach by other researchers, we provide a laboratory study published open source on GitHub.


INTRODUCTION
For cyber-physical systems, the Industrial Internet of Things (IIoT), and Industry 4.0 applications, the embedded software is an increasingly crucial asset. With increasing requirements and, hence, increasing complexity, new challenges arise for manufacturers and, in particular, for the engineers of these systems. While in large software companies, software development is often done by distributed teams of engineers [1], this is usually different for small and medium-sized enterprises (SMEs) that develop embedded systems [2]. Especially in SMEs, embedded software is still often developed by the same engineers who also develop the electronics and/or mechanical parts [3].
However, with the demand for context-aware, autonomous, and adaptive robotic systems [4], more advanced software engineering methods have to be adopted by the embedded software community. Consequently, the way these systems are developed has to advance. In future development workflows, the embedded software systems will be the centerpiece of IIoT applications. To achieve this, the community has to move from expert-centric tools [4] to modular systems, whereby domain experts are enabled to contribute parts of the system.
A survey among 2,000 decision makers about trends and challenges in software engineering found that quality is perceived in the software industry as the single most relevant premise to survive [5]. Yet, organizations struggle to achieve software quality along with cost and efficiency [6]. During the development of embedded (software) systems, at some point, thorough and reliable tests are necessary to verify and validate the whole system [7]. A common way to test the control algorithms of an embedded software system is Hardware-in-the-Loop (HIL) testing. An example of HIL testing at large scale is Airbus, which creates iron birds of its aircraft, containing the corresponding electronics, hydraulics, and flight controls [8]. However, many SMEs cannot afford such redundant hardware just for the purpose of testing software. Hence, test automation is among the most popular topics for testing embedded software [9]. Still, automatic quality assurance is a challenge in this context, since hardware is in the loop.
Many different simulation tools were proposed, developed, and sold, with the promise to reduce the costs and time needed for verification and validation. Yet, none of these tools is able to combine all aspects of modern machines during all steps of the production life-cycle, due to the complexity of systems and the high amount of data being processed. Thus, multidisciplinary simulation concepts are increasingly important with regard to scalable and highly modular production environments enabled by cyber-physical systems [10]. Alongside HIL testing, manufacturers implemented different automated testing strategies with In-the-Loop simulations to reduce costs, e.g., Software-in-the-Loop (SIL), Model-in-the-Loop (MIL), and Processor-in-the-Loop (PIL) simulations [11].
One promising technique to enhance the overall software quality of embedded systems is the Digital Twin concept. We start with a discussion of related work in Section 2. As there is no common understanding around the concept, we then dissect the different parts of a digital twin and formally specify the concepts with the Object-Z notation.
Afterwards, applications of digital twins in different industrial contexts are presented to illustrate the approach.

RELATED WORK
Digital twins are a growing topic not only in academia but also in industry, especially in manufacturing [12]. However, there is still no consensual definition of a Digital Twin, as we explain in Section 2.1. Most of the research conducted to find a general definition of a digital twin consists of literature reviews [13]- [16] investigating where digital twins are used, which components are part of them, and which level of integration with the CPS exists. In particular, Kritzinger, Karner, Traar, et al. [16] contributed with their literature review to a consensual understanding about which subsystems are part of a digital twin. They consider the digital model, the digital shadow, and the digital twin as three separate levels of integration in the overall concept of digital twins. In this paper, we extend this work by providing a formalization for all these categories.
With regard to mathematical approaches to formalize the concept of digital twins, there is a lack of research papers. Nevertheless, in Section 2.2, we discuss two approaches [17], [18] that use semi-formal notations to define the relationships between the different components of a digital twin.

The Evolution of the Digital Twin Concept
An innovative method for testing and monitoring embedded systems was used for space missions, dating back to the early Apollo missions conducted by the National Aeronautics and Space Administration (NASA). Here, the "Twin" concept was initially employed during the Apollo missions in the late 1960s as a safety precaution. If a system on the spacecraft failed during the mission, engineers had no access to the capsule. A failure to fix problems in a timely manner could be catastrophic for the space mission. At the time, computational power was insufficient for complex simulations, so NASA engineers came up with the idea of building at least two identical space capsules. One was used for the mission while the other remained on Earth, serving as the "Twin" for simulation purposes. Changes to the system were first tested on the Twin before astronauts received instructions. This approach required both capsules to be maintained exactly the same, including replacing parts on the Twin even if it was not used during a mission. NASA had planned to transfer this approach to the Space Shuttle program, but abandoned the idea due to the high costs.
Half a century later, with advancements in computational power and improved simulations, NASA's Twin concept has evolved into a digital twin. However, there was a second research thread that contributed to the concept. This second thread originated from the manufacturing industry and dates back to 2002, when Grieves [19] first pitched the formation of a Product Lifecycle Management (PLM) center at the University of Michigan. The presentation slide, as depicted in Figure 1, had the title "Conceptual Ideal for PLM" [20], sketched the idea of a digital twin, and named it the "Mirrored Spaces Model" back then [19].
With the Mirrored Spaces Model, Grieves already envisioned three crucial components of digital twins: the physical space, the virtual space, and the data link between the physical and virtual spaces.
Fig. 1: A Digital Twin by Grieves and Vickers [20] consists of the real space (left side), the virtual space (right side), and the link for data flow from real space to virtual space. The opposite direction is done manually by using information to enhance processes (Source: [20]).
Later, in 2016, Grieves and Vickers [20] defined the digital twin as stated in Definition 1:
Definition 1 (Digital twin by Grieves and Vickers [20] (2016)). The Digital Twin is a set of virtual information constructs that fully describes a potential or actual physical manufactured product from the micro atomic level to the macro geometrical level. At its optimum, any information that could be obtained from inspecting a physical manufactured product can be obtained from its Digital Twin. Digital Twins are of two types: Digital Twin Prototype (DTP) and Digital Twin Instance (DTI). Digital Twins are operated on in a Digital Twin Environment (DTE).
Definition 1 considered the digital twin to be a collection of technologies and distinguished between two types: the Digital Twin Prototype (DTP) and the Digital Twin Instance (DTI). The Digital Twin Prototype is a set of blueprints, etc., used to construct or maintain the physical twin. The Digital Twin Instance is the specific instance created after the physical twin has been manufactured and is linked to it throughout its lifecycle. Although the vision by Grieves and Vickers [20] reflected solutions that are possible today, the technology available in 2002 only allowed for a rudimentary implementation of what is known as a digital twin today. Digital twins were seen as a new paradigm for designing, manufacturing, and servicing products [12]. However, the meaning of the term digital twin may vary depending on the sector in which it is utilized [12].
After their introduction, digital twins experienced a hype phase until around the year 2006. This first hype was driven by high hopes in the industry. However, the technology did not live up to the hype, and digital twins became a buzzword in marketing departments rather than a fully realized concept. Newman [21] observed and criticized something similar with regard to microservice architectures. Saracco and Henz [12] emphasize that the industry drove the development of digital twins, while academia ignored it. The revival of interest in digital twins in 2016 was thanks to the maturity of IIoT and CPS technologies, and academia also joined the bandwagon. Digital twins reached the peak of the Gartner Hype Cycle of emerging technologies in 2018 [22]. Furthermore, an increased number of research papers and journal special issues can be observed after 2016.
It was between 2006 and 2016 when Piascik, Vickers, Lowry, et al. [23], and Glaessgen and Stargel [24] proposed their vision for a digital twin for NASA [19]. Piascik, Vickers, Lowry, et al. [23] used the term digital twin in their technology roadmap for NASA. However, they only described the digital twin concept and did not define digital twins. The better known digital twin definition is the one by Glaessgen and Stargel [24] for next generation fighter aircraft and NASA vehicles, shown in Definition 2:
Definition 2 (Digital twin by Glaessgen and Stargel [24] (NASA) (2012)). A Digital Twin is an integrated multiphysics, multiscale, probabilistic simulation of an as-built vehicle or system that uses the best available physical models, sensor updates, fleet history, etc., to mirror the life of its corresponding flying twin. The Digital Twin is ultra-realistic and may consider one or more important and interdependent vehicle systems, including airframe, propulsion and energy storage, life support, avionics, thermal protection, etc.
They tailored their vision for the specific use case of spacecraft, satellites, and space exploration, where simulations play a crucial role due to the high cost of hardware and human resources. These simulations are used both in the development phase, which indicates at least a MIL approach, and to monitor the systems during missions. To detect anomalies during flight, they also included a channel for sending sensor data from the physical twins to their corresponding digital twins. Loading this data into the simulation with a realistic model supersedes NASA's Twin approach from the Apollo missions. This is similar to the data link shown in Figure 1, only with far more advanced technology and tools. A demonstration of their implementation can be seen in the Perseverance Rover that landed on Mars in 2021 [25].
In parallel to the definition by NASA, Garetti, Rosa, and Terzi [26] defined digital twins for manufacturing as shown in Definition 3:
Definition 3 (Digital twin by Garetti, Rosa, and Terzi [26] (2012)). The digital twin consists of a virtual representation of a production system that is able to run on different simulation disciplines that is characterized by the synchronization between the virtual and real system, thanks to sensed data and connected smart devices, mathematical models and real time data elaboration. The topical role within Industry 4.0 manufacturing systems is to exploit these features to forecast and optimize the behaviour of the production system at each life cycle phase in real time.
When attention to digital twin research was rekindled, academia proposed multiple definitions for the concept [13]. These definitions were influenced by the realistic simulation approach put forth by NASA. Rosen, Wichert, Lo, et al. [15] linked the digital twin concept to the Industry 4.0 strategy of the German Platform Industry 4.0 [27]. They illustrated how simulations evolved over time, from mechanics in the 1960s to simulation-based system design and finally to digital twins since 2015. They also highlighted that modularity, autonomy, and connectivity are crucial requirements for digital twins, among other factors.
The definitions provided by Grieves and Vickers [20] and NASA only included an automated connection from the physical twin to its digital twin. Trauer, Schweigert-Recksiek, Engel, et al. [28] conducted an industrial case study to analyze how the industry perceived and defined digital twins between 2002 and 2019. They traced the evolution of digital twins and presented Definition 4 as a result.

Definition 4 (Digital twin by Trauer, Schweigert-Recksiek, Engel, et al. [28] (2020)). A Digital Twin is a virtual dynamic representation of a physical system, which is connected to it over the entire life cycle for bidirectional data exchange.
We present Definition 4 here because it includes the bidirectional data exchange from digital twin to physical twin. This bidirectional interaction allows remote control and operation of the physical twin, as well as new opportunities for collaboration between physical twin and digital twin. This poses a challenge for engineers, who must either develop the software independently for each twin, violating the principle of realistic replication, or use tools like Docker to containerize the physical twin's software for use as a digital twin.
Depending on the research field, the industry, and use cases, the term digital twin is often used synonymously with concepts like Digital Model, Digital Shadow, and Digital Thread [13], [16]. Kritzinger, Karner, Traar, et al. [16] conducted a categorical literature review and analyzed research papers with regard to the proposed concept and how it deviates from a common understanding of the essential parts of digital twins. They classify three subcategories of a digital twin by their level of integration with the physical twin: (i) digital model, (ii) digital shadow, and (iii) digital twin. The differences are depicted in Figure 2.
• Figure 2a shows the digital model. There is no automated connection between the physical object and the digital model. No automated data exchange is realized. State changes in the physical object do not immediately affect the digital model and vice versa.
• If there is an automated one-way data flow from the physical object to the digital object (see Figure 2b), then this is a digital shadow. A change in state of the physical object leads to a change of state in the digital shadow, but not vice versa.
• Figure 2c shows a fully integrated digital twin. The data flows are automated between the physical twin and the digital twin in both directions. In such a configuration, the digital twin might also act as a controlling instance of the physical twin. A change in state of the physical twin directly leads to a change in state of the digital twin and vice versa.
Fig. 2: Subcategories of digital twins by their level of integration with the physical twins (Source: [16]).
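The three integration levels above can be read as a decision rule over the two automated data flows. The following Python fragment is only a minimal sketch of that rule; the names IntegrationLevel and classify are hypothetical and do not appear in [16]. The case of an automated flow only from the digital to the physical object is not covered by the categorization, so the sketch assumes it falls back to a digital model.

```python
from enum import Enum


class IntegrationLevel(Enum):
    """Hypothetical labels for the three subcategories of Figure 2."""
    DIGITAL_MODEL = "digital model"
    DIGITAL_SHADOW = "digital shadow"
    DIGITAL_TWIN = "digital twin"


def classify(physical_to_digital_automated: bool,
             digital_to_physical_automated: bool) -> IntegrationLevel:
    """Map the two automated data flows to an integration level."""
    if physical_to_digital_automated and digital_to_physical_automated:
        # Both directions automated: fully integrated digital twin.
        return IntegrationLevel.DIGITAL_TWIN
    if physical_to_digital_automated:
        # One-way automated flow from the physical object: digital shadow.
        return IntegrationLevel.DIGITAL_SHADOW
    # Assumption of this sketch: anything else counts as a digital model.
    return IntegrationLevel.DIGITAL_MODEL


if __name__ == "__main__":
    for flows in [(False, False), (True, False), (True, True)]:
        print(flows, "->", classify(*flows).value)
```

Under this reading, the only thing that distinguishes a digital shadow from a digital twin is whether the flow back to the physical object is automated.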
With the increasing importance of digital twins, the International Organization for Standardization (ISO) also published the ISO 23247 series, defining a framework to support the creation of digital twins of observable manufacturing elements, including personnel, equipment, materials, manufacturing processes, facilities, environment, products, and supporting documents [29].
Definition 5 (Digital twin by International Organization for Standardization [29] (2021)). A digital twin assists with detecting anomalies in manufacturing processes to achieve functional objectives such as real-time control, predictive maintenance, in-process adaptation, Big Data analytics, and machine learning. A digital twin monitors its observable manufacturing element by constantly updating relevant operational and environmental data. The visibility into process and execution enabled by a digital twin enhances manufacturing operation and business cooperation.
One aspect of ISO 23247 that immediately catches the eye is the absence of any mention of bidirectional communication. The focus is on the monitoring aspect of a digital twin. According to the categorization by Kritzinger, Karner, Traar, et al. [16], ISO 23247 only describes a digital shadow [29].
Since 2018, IIoT platforms have transitioned from basic data hubs to digital twin (DT) platforms. Lehner, Pfeiffer, Tinsel, et al. [30] evaluated the digital twin platforms provided by Amazon Web Services (AWS), Microsoft Azure, and the Eclipse ecosystem and showed that they fulfill many, yet not all, key requirements. Features like bidirectional synchronization between physical and digital twins require additional coding, and automation protocols are not covered yet. According to the categorization of the integration level of digital twins [16], these platforms only help to establish a so-called digital shadow. Modern simulation tools such as AutoDesk, aPriori, or Ansys use IIoT platforms to feed the simulation with data and enable the integration of automation protocols. Often they are promoted with the promise of a digital twin. However, similar to the cloud providers, these tools also just help to establish a digital shadow. The simulation of a physical twin (PT) still does not cover the entire embedded software system that runs on the physical twin and also lacks proper bidirectional synchronization between physical twin and digital twin.

Conceptual Models to Define Digital Twins
The presented research projects and papers leave plenty of room for interpretation of the digital twin concept. This is one reason why there are so many definitions of digital twins.
Fig. 3: Semi-formal description of the relationships between physical twin, digital twin, their connections, and environments as described by Yue, Arcaini, and Ali [17].
Yue, Arcaini, and Ali [17] present a semi-formal approach using UML class diagrams to define the physical twin, the digital twin, and their relationships using the example of an automated warehouse system (AWS). Figure 3 shows these relationships. A state change in one twin triggers a change of state in its counterpart.
Furthermore, they paid attention to two aspects that are often not considered explicitly: fidelity and the twinning rate. Fidelity considers the accuracy and the level of abstraction of the digital twin, and the twinning rate is the interval at which physical twin and digital twin synchronize their states.
However, the semi-formal approach by Yue, Arcaini, and Ali [17] has its flaws. Although they considered the digital model as part of the digital twin, it is not explicitly mentioned in the general overview in Figure 3. Moreover, the digital shadow was ignored completely.
Becker, Bibow, Dalibor, et al. [18] present a similar approach in their conceptual model of digital shadows for CPS, also using UML class diagrams to show the relationships, but solely for the digital shadow. The focus of the digital shadow is on single assets and their information flow from the physical twin to the digital shadow. They also emphasize that an asset's corresponding model is part of the digital shadow and that models can be of different natures/types.
A formal, yet very abstract, mathematical description of the relationships between physical twins, digital shadow, and digital twin was presented by Lv, Lv, and Fridenfalk [31]. A limitation of their approach is that it still leaves a lot of room for interpretation and the mathematical notation is peculiar.
In this paper, we extend and merge the relationship diagrams of Yue, Arcaini, and Ali [17] and Becker, Bibow, Dalibor, et al. [18] by also including the digital model and digital shadow to give a full overview of the Digital Twin concept.In addition, we present the formalization of a digital twin software architecture using the Object-Z notation.

Continuous Twinning
In the development phase of CPS, HIL testing is still the common approach. The pressure to reduce costs [6] led to many different approaches to switch from HIL to SIL. To date, for most industrial applications, sensors and actuators are connected via input/output ports to programmable logic controllers (PLCs). Although new wireless communication technologies and more powerful and efficient single-board computers open up the embedded community to cheaper and faster development processes, the predominance of PLCs will hold for years. It is quite common to use PLCs in a HIL setup, where the PLC is connected to a simulation [32]. Engineers can program the PLC, and the simulation delivers the virtual context with simulated sensors/actuators to the PLC. As only one engineer can work on a HIL system at a time, SIL approaches become more and more popular to enable the collaboration between engineers. Lyu, Atmojo, and Vyatkin [32] demonstrated that a software PLC in a SIL context can be realized with Docker and other tools.
Quality assurance of embedded systems is regulated with standards and norms to ensure robust testing and to prevent malfunctions that might pose a risk to the safety of individuals who work with or use these systems [2]. The aviation industry is renowned for its strict and stringent testing procedures, contributing to the fact that aircraft are the safest mode of transportation, statistically. This was not the case half a century ago, as standards and procedures have evolved through experimentation with different testing strategies.
The digital twin prototype approach presented in this paper enables engineers to produce the first minimum viable product (MVP) with the first implemented device driver and emulator. Thanks to the publish-subscribe architecture, all additional nodes and emulators can be developed and added iteratively. Putting all modules in a source code management system allows all developers to use the digital twin prototype and enhance the entire system incrementally, without the need to connect to the hardware of the physical twin. As a bonus, this also enables automated SIL testing in continuous integration/continuous delivery (CI/CD) pipelines.
By following CI/CD workflows, the development of embedded software systems becomes an agile and incremental process. Beginning with a prototype of a device driver for a single piece of hardware, to entire production plants, to smart factories, agile software development is enabled. This does not only improve the software quality and shorten release cycles, it also allows additional stakeholders to participate in a feedback loop in the development process from the first MVP. Adjusting software requirements or fixing design flaws can be done during development. With this method, digital twins evolve continuously in small incremental steps, rather than in major releases. Nakagawa, Antonino, Schnicke, et al. [33] envision and call this approach Continuous Twinning.

THE DIGITAL TWIN CONCEPT - A FORMALIZATION
As Grieves [34] elaborates, there is a flaw in the categorization of the digital twin definition by Kritzinger, Karner, Traar, et al. [16]. Stating that digital twins have three subcategories, where a digital twin is a subcategory of itself, leads to endless recursion. Furthermore, this increases the confusion around what a digital twin is and what it is not. However, we do not share Grieves' [34] recommendation to ignore the difference between a digital shadow and a digital twin. To enhance clarity around the concepts and relationships between physical twins, digital models, digital shadows, digital threads, digital twin prototypes, digital templates, and digital twins, we formally specify the Digital Twin concept as follows. We propose, similar to Hasselbring [35], a three-level interleaving of formality in the specification: 1) informal prose explanation and illustrations with examples; 2) semi-formal object-oriented modeling with the UML; 3) rigorous formal specification with Object-Z.
Object-Z [36] is a formal specification notation used to describe the behavior of software systems. It extends the Z notation [37] and enables the incorporation of object-oriented concepts, such as classes, objects, inheritance, and polymorphism, into specifications. Additionally, Object-Z allows for the specification of operations that can be performed on objects, along with constraints on attribute values and relationships between objects, all expressed in a mathematical notation. The following specification has been checked using a type checker provided by the Community Z Tools Project [38].
The formal specification is exemplified through an embedded software system comprising a sensor, an actuator which also serves as a data transmitter, and an embedded control system connected to both.This control system manages data and command exchange between these components.All example components are very basic and are only meant to demonstrate the core ideas.A real system would be more complex, including more third-party dependencies, tools, and frameworks.

The Physical Twin
The digital twin concept starts with the physical twin.

Definition 6 (Physical Twin).
A physical twin is a real-world physical System-of-Systems or product. It comprises sensing or actuation capabilities driven by embedded software.
Figure 4 illustrates the deployment diagram of our simple embedded system. In this example, the sensor is connected via an RS232 interface to the controller, and the transmitter is connected via Ethernet. All data collected from the sensor is processed by the controller logic and subsequently sent to an external source via the transmitter. Commands to modify the sensor's behavior are received by the transmitter and forwarded to the sensor through the control logic.
Consider both devices as black boxes that maintain a list of accepted commands, a method for executing tasks based on the commands and returning a result, and functions for sending and receiving data. Additionally, a device driver holds a corresponding list of commands that can be sent to the devices. The lists on the device and the device driver are identical, and the device driver handles command transmission and response reception.
The UML class diagram in Figure 5 depicts the various classes forming the embedded control system. To align with clean code principles, the abstract classes Device and DeviceDriver are introduced first. Sensors and actuators are considered as devices and thus inherit from Device, as depicted on the left side of Figure 5. All devices are connected to the embedded control system.
The crucial elements of embedded software systems are the connections between the control systems and the sensors/actuators. In this example, the connections are established using different PROTOCOL types (TCP or RS232) to facilitate communication between Device and DeviceDriver.
Specifically, SensorDriver inherits from DeviceDriver and employs an RS232Connection to establish a connection with a Sensor. Similarly, Transmitter and TransmitterDriver (which also inherits from DeviceDriver) establish a connection using TCPConnection. While a Device is treated as an external component running on the device, a corresponding DeviceDriver is an integral part of the embedded control system.
A Device consists of two main components: a Connection object and a set of accepted commands (commandList). The Connection object manages data exchange between a Device and a DeviceDriver. The ExecuteCommand function represents the execution of a task after a command has been sent to the Device. It expects a COMMAND object sent by the DeviceDriver and returns a RESPONSE object. The Send and Receive functions utilize the corresponding functions provided by the contained Connection.
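To make the black-box view of Device and DeviceDriver more tangible, the following Python fragment sketches both sides with an in-memory link. It is an illustration under assumptions, not the implementation used in the field studies: the queue-based Connection, the step method, the string commands, and the "OK"/"ERROR" responses are all hypothetical.

```python
from collections import deque


class Connection:
    """In-memory stand-in for the link between a Device and its DeviceDriver."""

    def __init__(self):
        self.to_device = deque()   # commands travelling to the device
        self.to_driver = deque()   # responses travelling to the driver


class Device:
    """Black box holding a Connection and a set of accepted commands."""

    def __init__(self, connection, command_list):
        self.connection = connection
        self.command_list = set(command_list)

    def execute_command(self, command):
        # Hypothetical behaviour: acknowledge known commands, reject others.
        if command not in self.command_list:
            return "ERROR: unknown command"
        return f"OK: {command}"

    def receive(self):
        return self.connection.to_device.popleft()

    def send(self, response):
        self.connection.to_driver.append(response)

    def step(self):
        """Receive one command, execute it, and send back the response."""
        self.send(self.execute_command(self.receive()))


class DeviceDriver:
    """Control-system side; holds the same command list as the device."""

    def __init__(self, connection, command_list):
        self.connection = connection
        self.command_list = set(command_list)

    def send(self, command):
        if command not in self.command_list:
            raise ValueError(f"command not in command list: {command}")
        self.connection.to_device.append(command)

    def receive(self):
        return self.connection.to_driver.popleft()


if __name__ == "__main__":
    link = Connection()
    commands = {"START", "STOP"}
    sensor, driver = Device(link, commands), DeviceDriver(link, commands)
    driver.send("START")
    sensor.step()
    print(driver.receive())
```

Because the driver only talks to the Connection, the Device object can later be replaced by an emulator without changing the driver code.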
To facilitate the exchange of data from a sensor to another process, such as the control logic, EventHandler objects are introduced. It can be assumed that these EventHandler objects are implemented in a manner similar to the Observer pattern, which also encompasses publish/subscribe architectures.
In this setup, all events received from the Sensor are emitted to all listeners through a Producer, and processes receive these events by including a Consumer.
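The Producer/Consumer setup can be sketched in a few lines of Python. This is a minimal in-process illustration of the Observer idea only; the EventBus registry and the received list are hypothetical helpers, and a real system would more likely rely on a publish/subscribe middleware.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical in-process registry mapping an event name to listeners."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, event, callback):
        self._listeners[event].append(callback)

    def publish(self, event, data):
        # Notify every listener registered for this event.
        for callback in list(self._listeners[event]):
            callback(data)


class EventHandler:
    """Base class: every handler is registered for one specific event."""

    def __init__(self, bus, event):
        self.bus = bus
        self.event = event


class Producer(EventHandler):
    def emit(self, data):
        self.bus.publish(self.event, data)


class Consumer(EventHandler):
    def __init__(self, bus, event):
        super().__init__(bus, event)
        self.received = []
        bus.subscribe(event, self.consume)

    def consume(self, data):
        self.received.append(data)


if __name__ == "__main__":
    bus = EventBus()
    producer = Producer(bus, "NEW-DATA")
    logger, controller = Consumer(bus, "NEW-DATA"), Consumer(bus, "NEW-DATA")
    producer.emit("21.5")
    print(logger.received, controller.received)
```

The sketch shows the one-to-many character of the relationship: a single emit call reaches every Consumer registered for the same event.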

Object-Z Formalization
The specification of this simple embedded system follows a bottom-up approach. The deployment diagram, as depicted in Figure 4, can be defined using the Object-Z notation. To achieve this, some basic type definitions are introduced: PROTOCOL represents the communication protocols utilized between the devices and the control system, while EVENT is the type employed for data exchange between processes.
Basic type definitions introduce new types in Z and Object-Z. Their internal structure is considered irrelevant for the specification. In this particular specification, any details that are not architecturally relevant are abstracted this way.
The various PROTOCOL types used in the schema architecture are subsequently defined through an axiomatic definition. In this context, TCP and RS232 are established as values of type PROTOCOL:
TCP, RS232 : PROTOCOL
Up until this point, only basic types have been introduced. However, as Object-Z is object-oriented, objects are also created. In this context, the parent class is denoted as DATA, and it will later be specialized through inheritance into classes specific to the various data types. Communication between devices is represented as a sequence of bits. Given that standard data types such as integers, floats, or strings are irrelevant for the specification, only a bit representation is utilized.
As both a device and its corresponding device driver exchange either RESPONSE or COMMAND, the corresponding schemas inherit from the DATA class. In this context, RESPONSE can represent either MEASUREMENT or STATUS:

[Object-Z class schemas: RESPONSE, MEASUREMENT, STATUS]
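The inheritance structure of the data types can be mirrored in a short Python sketch. It is not a translation of the Object-Z schemas; the field name bits and the validation in __post_init__ are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Data:
    """Parent class: payloads are represented only as a sequence of bits."""
    bits: Tuple[int, ...]

    def __post_init__(self):
        # Assumption of this sketch: reject anything that is not a bit.
        if any(bit not in (0, 1) for bit in self.bits):
            raise ValueError("bits must contain only 0 and 1")


class Command(Data):
    """Sent from a device driver to its device."""


class Response(Data):
    """Sent from a device back to its device driver."""


class Measurement(Response):
    """A response that carries sensor data."""


class Status(Response):
    """A response that reports the state of the device."""


if __name__ == "__main__":
    reading = Measurement((1, 0, 1, 1))
    print(isinstance(reading, Response), isinstance(reading, Data))
```

A driver written against Response then accepts both Measurement and Status objects, which is the point of specializing through inheritance.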
Once the data types have been formalized, the various components and their connections can be configured. Initially, the abstract Connection class is defined. In its operations, the symbol ? denotes input parameters and the symbol ! denotes outputs [36].
A Connection possesses a type and manages bit sequences, represented as a stream (dataStream). The Write function appends bit sequences to the stream, while the Read function extracts them by reading bits from it.
The specific implementations, RS232Connection and TCPConnection, are named after the types they set for the Connection class from which they inherit. The symbol ↓ denotes the union of Connection with all of its subtypes. Because Connection is abstract, any Connection object has to be an instance of a subtype that implements it. The symbol © denotes object containment [36].
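As an illustration, the Connection hierarchy can be sketched in Python. This is a minimal sketch and not part of the Object-Z specification: the snake_case names (write, read, data_stream), the Protocol enumeration, and the use of a deque for the stream are assumptions made for this example.

```python
from collections import deque
from enum import Enum


class Protocol(Enum):
    """Values of the basic type PROTOCOL."""
    TCP = "TCP"
    RS232 = "RS232"


class Connection:
    """Abstract connection: a protocol type plus a stream of bits."""

    protocol = None  # set by the concrete subclasses

    def __init__(self):
        if self.protocol is None:
            raise TypeError("Connection is abstract; use a concrete subclass")
        self.data_stream = deque()  # corresponds to dataStream

    def write(self, bits):
        """Append a bit sequence to the end of the stream."""
        if any(b not in (0, 1) for b in bits):
            raise ValueError("only bits (0 or 1) can be written")
        self.data_stream.extend(bits)

    def read(self, n):
        """Remove and return up to n bits from the front of the stream."""
        count = min(n, len(self.data_stream))
        return [self.data_stream.popleft() for _ in range(count)]


class RS232Connection(Connection):
    protocol = Protocol.RS232


class TCPConnection(Connection):
    protocol = Protocol.TCP
```

In this sketch, read(n) returns at most n bits, so reading from an empty stream yields an empty list rather than an error; this behavior is a choice of the sketch.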

↾(event)
event : EVENT

Each EventHandler registers for a specific EVENT, which can represent, for example, a simple response from the Device. In this example, the EventHandler is an abstract class, and Producer and Consumer are the specific implementations. Assuming both register for the same EVENT, like "NEW-DATA", a Producer can emit new events, and the Consumer receives and handles all incoming events. It is important to note that this relationship is not one-to-one but rather one-to-many, allowing for an indefinite number of Consumers to listen to the same Producer.
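The one-to-many relationship between a Producer and its Consumers can be sketched as follows. This is an illustrative sketch under assumptions: the EventBus registry and the method names (register, notify, consume, emit) are hypothetical and only stand in for the mechanism by which handlers for the same EVENT find each other.

```python
from collections import deque


class EventBus:
    """Hypothetical registry that maps an event name to its consumers."""

    def __init__(self):
        self.listeners = {}

    def register(self, event, consumer):
        self.listeners.setdefault(event, []).append(consumer)


class EventHandler:
    """Abstract base: every handler registers for exactly one EVENT."""

    def __init__(self, bus, event):
        self.bus = bus
        self.event = event


class Consumer(EventHandler):
    def __init__(self, bus, event):
        super().__init__(bus, event)
        self.queue = deque()
        bus.register(event, self)

    def notify(self, data):
        """Called by a producer; stores the data until it is consumed."""
        self.queue.append(data)

    def consume(self):
        """Return the oldest pending data object, or None if none is pending."""
        return self.queue.popleft() if self.queue else None


class Producer(EventHandler):
    def emit(self, data):
        """Notify every consumer registered for this producer's event."""
        for consumer in self.bus.listeners.get(self.event, []):
            consumer.notify(data)
```

Any number of Consumer objects can register for the same event name, and each of them receives its own copy of every emitted item.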
The main function of a Producer is Emit, which is called with a DATA object and notifies all Consumers.

After introducing the basic classes, the logic of the embedded control system can be defined. The DeviceDriver manages all communication between the control system and the Device, with communication being established through the Connection class. In this scenario, assume this DeviceDriver is straightforward and serves as a relay between the control logic and the device.
The Consumer handles all incoming DATA from the control logic and forwards them to the device. When responses are received from the device, the emitter forwards these responses to all listeners.
In Object-Z, the symbol ⨾ represents sequential composition. Therefore, the Send function first receives an incoming event by invoking consumer.Consume; only after the result of that call is available is it passed to the Connection, which then sends the command to the device. Conversely, incoming responses from the device are received from the connection using connection.Read and subsequently emitted to all listeners through emitter.Emit.

Now that the abstract classes for Device, Connection, and DeviceDriver have been established, we can proceed to define the concrete classes for the sensor, named Sensor, and its corresponding device driver, SensorDriver, as depicted in Figure 3a. In this example, all incoming commands are dispatched by the control logic, consumed by the driver, and subsequently forwarded to the sensor via the connection. Vice versa, all responses from the sensor are emitted as events by the corresponding producer and can be listened to by all consumers.
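The relay behavior of the DeviceDriver, with its strictly ordered steps, can be sketched as follows. This is a simplified sketch under assumptions: LoopbackConnection is a hypothetical stand-in for a Connection, the consumer is reduced to an inbox queue, and the emitter is reduced to a list of callbacks.

```python
from collections import deque


class LoopbackConnection:
    """Hypothetical stand-in for a Connection: one queue per direction."""

    def __init__(self):
        self.to_device = deque()
        self.from_device = deque()

    def write(self, command):
        self.to_device.append(command)

    def read(self):
        return self.from_device.popleft() if self.from_device else None


class DeviceDriver:
    """Relay between the control logic and a device."""

    def __init__(self, connection):
        self.connection = connection
        self.inbox = deque()   # commands arriving from the control logic
        self.listeners = []    # callbacks notified on device responses

    def send(self):
        """Consume one command first, then write it to the connection."""
        if not self.inbox:
            return False
        command = self.inbox.popleft()     # step 1: consumer.Consume
        self.connection.write(command)     # step 2: connection.Write
        return True

    def receive(self):
        """Read one response first, then emit it to all listeners."""
        response = self.connection.read()  # step 1: connection.Read
        if response is None:
            return False
        for listener in self.listeners:    # step 2: emitter.Emit
            listener(response)
        return True
```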
The essence of this specification lies in the communication between a device and its device driver, which is captured by the Communication schema. In this instance, the device is a Sensor, and the driver is a SensorDriver. Both the device and the driver share the same commandsList and are connected through an RS232Connection.
In Object-Z, the symbol ∥ signifies the execution of functions in parallel [36]. Therefore, ReadFromDevice illustrates the Sensor sending data while the corresponding SensorDriver reads it. Conversely, ReadFromDriver represents the reverse scenario, with communication from the SensorDriver to the Sensor.

The details of the control system are not within the scope of this specification. The control logic for an embedded system is often some form of a state machine. State machines fully automate a system, but do not adapt to new or changed processes on the fly. Modern Industry 4.0 applications incorporate autonomous behavior, extracted or learned from gathered data, and thus include architectures different from state machines. Furthermore, the orchestration of processes, including different commands to different sensors and actuators, can be quite complex. However, for this example, the only function of the ControlLogic class is to execute the commands received from the transmitter and return the responses from the sensor.
Assume the commands from the transmitter only include a period for the sensor's sample rate. To configure the period, the function sendCmd processes events sequentially from the transmitter queue. For each event, ChangeBehavior executes SetPeriod to set the sample rate internally, and the newly configured period is then sent as a command to the sensor, which adjusts its sample rate accordingly. This message exchange is logged in a list called dataLog.
All events originating from the sensor are handled by sendRsp and are sent to the transmitter without any alterations. Once again, the message exchange is recorded in the dataLog list through the LogData command.
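The behavior of the ControlLogic described above can be sketched as follows. This is a minimal sketch under assumptions: the transmitter and sensor sides are modeled as plain queues and lists, the ChangeBehavior step is folded into send_cmd, and the tuple format of the log entries is hypothetical.

```python
from collections import deque


class ControlLogic:
    def __init__(self):
        self.period = 0
        self.data_log = []               # corresponds to dataLog
        self.from_transmitter = deque()  # queued command events (period values)
        self.from_sensor = deque()       # queued sensor responses
        self.to_sensor = []              # commands forwarded to the sensor
        self.to_transmitter = []         # responses forwarded unchanged

    def set_period(self, value):
        self.period = value

    def log_data(self, direction, message):
        self.data_log.append((direction, message))

    def send_cmd(self):
        """Process queued commands one by one, in arrival order."""
        while self.from_transmitter:
            value = self.from_transmitter.popleft()
            self.set_period(value)               # set the sample rate internally
            self.to_sensor.append(self.period)   # forward the new period
            self.log_data("cmd", self.period)

    def send_rsp(self):
        """Forward all sensor responses to the transmitter without alteration."""
        while self.from_sensor:
            response = self.from_sensor.popleft()
            self.to_transmitter.append(response)
            self.log_data("rsp", response)
```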
With all required classes defined, the schema of the EmbeddedControlSystem from Figure 4 can be defined.

The Digital Model
Modeling and simulation are powerful methods utilized in various fields to evaluate complex systems, processes, and knowledge. They empower researchers, engineers, and decision-makers to examine real-world phenomena within controlled and virtual environments. This, in turn, enables them to make informed decisions and gain insights into the system under investigation. At the core of modeling lies the concept of mathematical modeling, which plays a pivotal role in formally capturing the essence of the system.
Mathematical models are representations of real-world systems employing mathematical equations, relationships, and logical structures. They provide a means to describe and quantify the behavior of a system. While mathematical models are not confined to any specific domain, in this work, we concentrate on their application in the engineering domain.
Before the advent of computers, the construction of machines was primarily carried out on drawing boards. This paradigm shifted with the introduction of computer-aided design (CAD), enabling the creation of 2D and 3D models that could be easily shared with and replicated by others. Over the past decades, advancements in tooling and computational power have facilitated the substitution of real prototypes with virtual prototypes. This transition has significantly reduced design cycles and lowered design costs. When components of a system are governed by mathematical relationships, virtual prototypes can be rigorously tested in simulations across a wide range of conditions. This allows for the evaluation of potential design weaknesses, providing immediate feedback on design decisions.
The Digital Model serves as a central component of a digital twin. However, most definitions merely mention digital models, assuming that researchers share a common understanding of what a model entails. This often leads to the assumption that a CAD model constitutes the entirety of a digital model, while a simulation is considered something more than a digital model, despite both being forms of mathematical models. Hence, we define a digital model as follows:

Definition 7 (Digital Model). A digital model describes an object, a process, or a complex aggregation. The description is either mathematical or a computer-aided design (CAD).

This definition encompasses various aspects of digital modeling, including the use of CAD as the foundational model for system design, its utilization within simulation tools involving complex processes, and even purely mathematical models.

Introducing the State Machine Example
Although the physical twin is defined as including (autonomous) behaviors instead of a state machine, this example could also be implemented as a state machine, whose states can be modeled as follows. A state machine M can be represented by a 5-tuple M = (Q, Σ, δ, q₀, F), which consists of a finite set of states Q, a finite set of input symbols known as the alphabet Σ, a transition function δ : Q × Σ → Q, an initial or starting state q₀ ∈ Q, and a set of accept states F ⊆ Q. The creation of state machines, often done using tools like LabVIEW, remains a common approach employed by engineers for programming machines. This practice falls within the scope of the provided definition of a digital model.
The UML state diagram of the state machine of the embedded control system is presented in Figure 6. Upon initiation, the initial state is STANDBY, with the corresponding period value for the sensor's sample rate set to 0, indicating that no samples are taken at this point. If a command with a numeric value x > 0 is issued, the state machine transitions to the ACTIVE state. Conversely, if a command with a value x = 0 is received, the state reverts to STANDBY. For values of x < 0, the state of the system changes to OFF.
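Written out, the transitions described above correspond to the following transition function. This is a sketch that assumes the next state depends only on the command value x and not on the current state q, since no guard on the current state is stated; the alphabet Σ (the admissible command values) and the set of accept states F are left open here.

```latex
\[
Q = \{\mathit{STANDBY}, \mathit{ACTIVE}, \mathit{OFF}\}, \qquad q_0 = \mathit{STANDBY}
\]
\[
\delta(q, x) =
\begin{cases}
\mathit{ACTIVE}  & \text{if } x > 0,\\
\mathit{STANDBY} & \text{if } x = 0,\\
\mathit{OFF}     & \text{if } x < 0,
\end{cases}
\qquad \text{for all } q \in Q .
\]
```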

Object-Z Formalization
This state machine can also be specified in Object-Z. First, the class diagram is displayed in Figure 7. STATE is the parent class:

STATE
  ↾(execute)
  execute

The execute method is overridden by the child states. For this example, the specific code that is executed is irrelevant. The states the state machine can be in are defined as subclasses.

The EventStateMachine encapsulates the logic responsible for state changes upon receiving COMMAND events and maintains both a variable state of type STATE and a period, which is a number. Initially, the period is set to 0, corresponding to the initial state set as STANDBY. The ProcessEvent function is responsible for modifying the state of the state machine in response to incoming events.
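An object-oriented reading of this structure can be sketched as follows. This is an illustrative sketch under assumptions: the strings returned by execute are placeholders, the method name process_event mirrors ProcessEvent, and resetting the period to 0 when entering OFF is an assumption that is not stated in the text.

```python
class STATE:
    """Parent class; the child states override execute."""

    def execute(self):
        raise NotImplementedError


class STANDBY(STATE):
    def execute(self):
        return "standby"   # placeholder behavior


class ACTIVE(STATE):
    def execute(self):
        return "sampling"  # placeholder behavior


class OFF(STATE):
    def execute(self):
        return "off"       # placeholder behavior


class EventStateMachine:
    def __init__(self):
        self.period = 0          # initially no samples are taken
        self.state = STANDBY()   # initial state

    def process_event(self, value):
        """Update period and state from the value carried by a COMMAND event."""
        if value > 0:
            self.period, self.state = value, ACTIVE()
        elif value == 0:
            self.period, self.state = 0, STANDBY()
        else:
            # Assumption: the period is reset when the system switches off.
            self.period, self.state = 0, OFF()
        return self.state.execute()
```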

The Digital Template
In their initial definition of digital twins, Grieves and Vickers [20] view the digital twin as a collection of information necessary for constructing and monitoring the physical object. Specifically, the digital twin prototype can be regarded as a virtualized set of blueprints, bills of materials, technical manuals, and similar documentation. When combined with the digital model, which can be used to extract all the information needed for creating blueprints and bills of materials, it can indeed be employed to construct and maintain the physical twin. However, this approach does not completely virtualize the physical twin, as later demonstrated by the example of the OSI Model in Figure 17. Thus, the early interpretation of this definition does not fully realize a digital twin of a physical twin.
To encompass all available materials for constructing and maintaining the physical twin, including the software running the physical twin and the digital model, these components can be bundled together into a comprehensive package. We refer to this bundle as the Digital Template.

Definition 8 (Digital Template). A digital template serves as a framework that can be tailored or populated with specific information to generate the physical twin. It encompasses the software operating the physical twin, its digital model, and all the essential information needed for constructing and sustaining the physical twin, such as blueprints, bills of materials, technical manuals, and similar documentation.

Grieves and Vickers [20] initially defined the digital template as a digital twin prototype. However, Grieves [39] later expanded upon this definition: the digital twin prototype comprises all the products that can be made, including all their variants, which take shape over time, from an idea to a first manufactured article [39]. We still consider early versions of this digital twin prototype to be only a digital template. However, once fully developed, they could also include the digital twin prototype definition presented later in this work.

Object-Z Formalization
The UML class diagram of a digital template is depicted in Figure 8. The digital template includes all documents that either describe the physical twin or are required to build it. Furthermore, it includes the digital model the real system is derived from and the software that operates the physical twin later. For an Object-Z formalization, the general class Document is defined:

The Digital Thread
With the development of CPS, machines began interacting with servers tasked with monitoring and controlling them. This paradigm also applies to digital twins. In this context, the communication channel facilitating such interaction is referred to as a digital thread. Taking inspiration from Leiva [40], we define the digital thread as follows:

Definition 9 (Digital Thread). The digital thread refers to the communication framework that allows a connected data flow and integrated view of the physical twin's data and operations throughout its life-cycle.

Data accumulated from physical objects can only be preserved if these objects possess an interface for storing the generated data. Similar to the general digital twin definitions, there is currently no universally accepted and standardized solution for digital threads, given their diverse applications across various domains.
Furthermore, it is crucial to understand that the digital thread encompasses more than just the communication protocol. It also involves applications and functionalities that assist in tasks such as monitoring, analysis, planning, and execution. These applications have the capacity to incorporate and share knowledge derived from the digital template and the gathered data, preserving the physical twin's evolution through time [41].

Object-Z Formalization
The UML class diagram for a digital thread between the previously formalized physical twin and a digital twin, which will be defined later in this paper, is illustrated in Figure 9. The DigitalThread consists of a PTtoDTConnection, which sends measurement and status messages (see the RESPONSE Object-Z class), and a DTtoPTConnection, which sends commands to the physical twin. To send data, a TransmitterDriver is used to establish a Connection. Notice that this connection is not between a DeviceDriver and a Device, but between two transmitters, e.g., using the LoRaWAN protocol. Both connection types gather data from processes (DigitalThreadProcess). In general, these processes can be different in each digital thread. Referencing our example again, the ControlLogic represents a PTDigitalThreadProcess, since it forwards all sensor messages to the transmitter, which can then transmit the data to the digital twin. On the digital twin's side, the DTDigitalThreadProcesses can include many different kinds of processes. However, at least one process is always included: the process that decides which command is sent to the physical twin to adjust its sample rate. Since the digital thread is meant to show the evolution of the physical twin over its life-cycle, all the gathered data has to be stored in some form of a database. Hence, the database is a DigitalThreadProcess that is part of the digital thread.
Formalizing this with Object-Z, we first define the DigitalThreadProcess.

The MAPE-K reference model comprises the following stages:

• Monitor: This is the first stage of the framework. In this phase, the system continuously collects data and monitors its performance and the surrounding environment. This can involve data from various sensors, actuators, or monitoring tools that gather information about the system's behavior, resource utilization, and external conditions.
• Analyze: To gain insights into the system's behavior and performance, the data collected through monitoring is analyzed. The goal is to identify patterns, anomalies, and potential issues and, hence, to understand the current state of the system.
• Plan: Based on the analysis of the system's current state, the system formulates a plan for actions to be taken. This plan may involve adjustments, optimizations, or corrective measures aimed at improving system performance, resource allocation, or other relevant parameters.
• Execute: In the last phase, the system carries out the actions defined in the planning stage. These actions can be automatic or semi-automatic, depending on the level of autonomy and control designed into the system. The system implements the planned changes to achieve the desired state.
• Knowledge: This component is critical for learning and adaptation. It involves maintaining a repository of historical data, models, policies, and best practices. The system uses this knowledge to make more informed decisions in subsequent iterations of the MAPE-K loop. Over time, the system becomes better at self-optimization and self-management by learning from its past experiences.
These stages are executed sequentially, one after another, and all have permanent access to the Knowledge about the system. The realization of the data flow between the different stages is part of the Digital Thread. Applications around the different stages, which are, for instance, connected via APIs, are also part of the Digital Thread if they provide the user with better insight into the corresponding physical twin.
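The sequential execution of the stages with shared access to the Knowledge can be sketched as follows. This is a toy sketch under assumptions: the knowledge base is a plain dictionary, and the threshold rule and the action name "reduce_load" are hypothetical and only serve to make the loop observable.

```python
def monitor(knowledge, reading):
    """Collect a reading and store it in the shared knowledge."""
    knowledge["history"].append(reading)
    return reading


def analyze(knowledge, reading):
    """Flag a reading that exceeds the threshold stored in the knowledge."""
    return {"value": reading, "anomaly": reading > knowledge["threshold"]}


def plan(knowledge, analysis):
    """Plan a corrective action only when an anomaly was found."""
    return ["reduce_load"] if analysis["anomaly"] else []


def execute(knowledge, actions):
    """Carry out the planned actions and record them in the knowledge."""
    knowledge["executed"].extend(actions)
    return actions


def mape_k_iteration(knowledge, reading):
    """Run the four stages one after another; all share the same knowledge."""
    observed = monitor(knowledge, reading)
    analysis = analyze(knowledge, observed)
    actions = plan(knowledge, analysis)
    return execute(knowledge, actions)
```

For example, with knowledge = {"threshold": 80, "history": [], "executed": []}, a reading of 42 leads to no action, whereas a reading of 95 leads to the single planned action being executed and recorded.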

The Digital Shadow
To fully harness the potential of the digital thread, a process situated at either end of the digital thread must consolidate all the disparate elements into a platform that users can utilize to gain insights into the current state of the physical twin. In the context of the digital twin concept, this role is fulfilled by the digital shadow. The digital shadow is defined as follows:

Definition 10 (Digital Shadow). A digital shadow is the sum of all the data that are gathered by an embedded system from sensing, processing, or actuating. The connection from a physical twin to its digital shadow is automated. Changes on the physical twin are reflected to the digital shadow automatically. Vice versa, the digital shadow does not change the state of the physical twin.
The configuration of the digital shadow for the physical twin, as specified previously, is illustrated in Figure 11. It is important to note that some parts of the physical twin are not depicted in the figure. The digital shadow operates on a server that establishes a network connection to the physical twin, either through a cable or wirelessly. In this example, assume a wireless connection between the physical twin and its digital shadow. As the UML class diagram in Figure 13 shows, many classes from the physical twin can be reused. The transmitter uses the same device driver as the physical twin, the event handlers are equal, and the message types can also be reused. Only the classes for the Monitor and Analyze stages of the MAPE-K model are new. A direct association between the two classes is not required, as they exchange data via an Observer pattern using the event handlers. Software packages that enhance these two classes are ignored in this example.
For data retrieval, the digital shadow employs a connected transmitter. To facilitate transmitter operation, the physical twin's transmitter device driver can be repurposed. All data is then transmitted from the driver to the MAPE-K components. It is worth mentioning that MAPE-K is not an obligatory component of the digital shadow; it is used only for distinguishing representations between CPS, digital shadows, and a digital twin.
Since machines controlled by external computers/servers already exist in the form of CPS, it is essential to clarify the distinction between a digital shadow and a CPS. As illustrated in Figure 12, the digital model holds the same level of importance as Knowledge. However, a CPS does not necessarily have to include a model of the connected machine, and even if it does, this model may not always be up-to-date. For a digital shadow, this is different: in the monitoring stage, all received data automatically updates the digital model.
Another distinction is that a CPS can be used to directly operate the physical object. In contrast, a digital shadow's sole purpose is to monitor the physical twin and provide data for analysis, enabling insight into the received data. Consequently, the Planning and Execution stages of the MAPE-K model are not inherent components of the digital shadow. While they can be incorporated, the automated change of state in the physical object is not a function of the digital shadow.

Fig. 12: A digital shadow realized with the MAPE-K reference model. The Plan and Execution stages are not included, since there is also no data exchange from the Execution stage to the physical twin.

Object-Z Formalization
As the UML class diagram in Figure 13 shows, a direct association between the Monitor and Analyze classes is not required, as they exchange data via an Observer pattern using the event handlers. Software packages that enhance these two classes are again ignored in this example.
A digital shadow specification with Object-Z can be done as follows. The Transmitter and its operation are managed by the corresponding TransmitterDriver, both of which can be reused from the Object-Z formalization provided for the physical twin earlier. Additionally, all exchanged messages and the EventHandler can also be reused. Any status changes occurring in the physical twin are emitted as STATUS events, while all measurements are emitted as MEASUREMENT events. An emitter-producer is responsible for transmitting all consumed events to any registered listener. The most crucial component here is the digitalModel, which is an object of the previously specified EventStateMachine.
All status changes are handled by the handleState function, which reads all STATUS messages from the queue and forwards them to the digital model (state machine) for event processing. Subsequently, the result of the state machine's operation is emitted to all registered listeners. Since measurements do not impact the state machine's state, they are individually read from the queue via the handleMeasurements function and immediately relayed to all registered listeners. One such listener could be a database (part of the Knowledge state) responsible for storing all data.
It is worth noting that the digitalModel could also be a separate process that registers as a listener and consumes the STATUS messages. In this example, the direct reference in the Monitor class was used for better demonstration purposes.
The Analyze stage is also a DTDigitalThreadProcess and can be a (semi-)automated stage of the MAPE-K model in the context of the digital shadow. In this particular example, the Analyze stage serves a singular purpose, which is to verify whether the received state from the physical twin aligns with the state of the digital model or not. The outcomes of this comparison can then be emitted to all registered listeners. One potential listener could be a service responsible for notifying a user if any disparities in states are detected. Nonetheless, independent from the MAPE-K model, the analysis of the monitored events could also be done manually by a user, since no further stage follows.

With these processes, the DigitalShadow schema can be defined. Since the MAPE-K example is only used for a better visualization of the concept, we use a more generic schema definition for the digital shadow:

DigitalShadow
  digitalModel : DigitalModel©
  DThreadProcesses : ℙ DTDigitalThreadProcess
  DTtoPTConnection : DTtoPTConnection©

Please notice that no data is sent from the digital shadow to the physical twin. The DTtoPTConnection solely receives data from the physical twin.
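The Monitor and Analyze stages described above can be sketched as follows. This is a simplified sketch under assumptions: the DigitalModel class is a placeholder that merely records the last reported status (the specification uses the EventStateMachine instead), listeners are plain callbacks, and the snake_case names mirror handleState and handleMeasurements.

```python
from collections import deque


class DigitalModel:
    """Placeholder digital model that records the last reported status."""

    def __init__(self):
        self.state = "STANDBY"

    def process_event(self, status):
        self.state = status
        return self.state


class Monitor:
    def __init__(self, digital_model):
        self.digital_model = digital_model
        self.status_queue = deque()
        self.measurement_queue = deque()
        self.listeners = []  # e.g., a database that stores all data

    def _emit(self, kind, payload):
        for listener in self.listeners:
            listener(kind, payload)

    def handle_state(self):
        """Forward each STATUS message to the digital model, then emit the result."""
        while self.status_queue:
            result = self.digital_model.process_event(self.status_queue.popleft())
            self._emit("STATUS", result)

    def handle_measurements(self):
        """Relay measurements unchanged; they do not affect the model state."""
        while self.measurement_queue:
            self._emit("MEASUREMENT", self.measurement_queue.popleft())


def analyze(reported_state, digital_model):
    """Return True if the state reported by the physical twin matches the model."""
    return reported_state == digital_model.state
```

Note that nothing in this sketch writes back to the physical twin, which matches the one-way data flow of the digital shadow.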

The Digital Twin
After defining and specifying the digital thread and digital shadow, the subsequent step is to comprehensively define the digital twin. The digital twin expands upon the digital shadow by enabling automatic synchronization of all alterations made to the digital model with the corresponding physical twin. This means that any changes made to the physical twin are mirrored in the digital twin, and vice versa. Ultimately, the digital twin evolves into a complete replica of the physical twin. To formulate this definition, we draw upon the digital twin definitions put forth by Saracco [41] and Trauer, Schweigert-Recksiek, Engel, et al. [28]:

Definition 11 (Digital Twin). A digital twin is a digital model of a real entity, the physical twin. It is both a digital shadow reflecting the status/operation of its physical twin, and a digital thread, recording the evolution of the physical twin over time. The digital twin is connected to the physical twin over the entire life cycle for automated bidirectional data exchange, i.e., changes made to the digital twin lead to adapted behavior of the physical twin and vice versa.

Fig. 14: A digital twin realized with the MAPE-K reference model. The status change of the digital model and the corresponding data exchange from the Execution stage to the physical twin is fully automated.
Extending the system utilized in this example results in the addition of an extra communication channel from the digital twin to the physical twin, as illustrated in Figure 15. In the previously shown Figure 11, the digital shadow only facilitates communication from the physical twin to the digital shadow. Now, all modifications within the digital model are also transmitted from the digital twin to the physical twin.
Moreover, the MAPE-K model must be adapted to accommodate the digital twin, as depicted in Figure 14. The Monitor and Analyze stages in this new model are identical to those in the digital shadow, as shown in Figure 12. The Plan stage takes the analysis results and formulates an execution scenario for the Execution stage if changes to the physical twin are necessary. The key distinction from the original MAPE-K reference model lies in the digital twin, where the Execution stage interacts with the digital model. The command is sent to the physical twin only if a positive result is returned. Consequently, the digital model serves as the final control instance, and all incoming and outgoing changes are verified against the digital model.

Object-Z Formalization
The Object-Z formalization of the digital twin can be built upon the digital shadow, incorporating two additional stages of MAPE-K as mentioned previously. First, the Plan class is introduced:

Fig. 15: The digital twin extends the digital shadow such that the communication between physical twin and digital twin is bidirectional. In addition to the communication from the physical twin to the digital twin, all changes in the digital twin are automatically sent to the physical twin.

All results generated during the planning stage are emitted via the Producer emitter. Similar to the other stages, the Plan stage has direct access to the digitalModel. However, in this example, no specific access details are provided.
The primary objective of this stage is to formulate a plan outlining which part of the physical twin's software needs modification and how those modifications should be implemented. This task is executed through the plan function. All incoming data is consumed and subsequently passed to the Planning function. The resulting plan is then emitted to all registered listeners.
The last DTDigitalThreadProcess is the Execute class, which is kept straightforward as well. It receives all plans from the previous stage through the execute function. The commands are validated against the digitalModel, and the outcome is sent to the physical twin. The transmitter producer emits the command as an event to the TransmitterDriver, which subsequently consumes this command and transmits it to the physical twin.

Please note that the concrete implementation of the digital model in this context is not critical. The digital model could exist as a separate process that receives events through consumers and provides responses via producers. Alternatively, it could collect all events from the Execute stage and independently transmit the results to the transmitter. There are numerous ways to realize this concept; however, the fundamental idea remains constant: changes to the digital model automatically trigger changes in the state of the physical twin, without requiring any user intervention.
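The interplay of the Plan and Execute stages with the digital model can be sketched as follows. This is one possible realization under assumptions: the validation rule (a command is accepted only if its period is not negative), the dictionary format of analysis results and commands, and the callable that stands in for the transmitter producer are all hypothetical.

```python
class DigitalModel:
    """Placeholder model: accepts a command only if its period is not negative."""

    def __init__(self):
        self.period = 0

    def validate(self, command):
        if command["period"] < 0:
            return False
        self.period = command["period"]  # the model changes first
        return True


def plan(analysis):
    """Turn an analysis result into a plan, here a list of commands."""
    if analysis["in_sync"]:
        return []
    return [{"period": analysis["expected_period"]}]


class Execute:
    def __init__(self, digital_model, transmitter):
        self.digital_model = digital_model
        self.transmitter = transmitter  # callable standing in for the producer

    def execute(self, commands):
        """Send a command only after the digital model has approved it."""
        sent = []
        for command in commands:
            if self.digital_model.validate(command):
                self.transmitter(command)
                sent.append(command)
        return sent
```

Because every command passes through validate before it reaches the transmitter, the digital model acts as the final control instance in this sketch.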

↾(INIT)
Similar to the digital shadow, we again define a generic schema DigitalTwin without the MAPE-K processes. The schemas DigitalShadow and DigitalTwin look similar in this Object-Z formalization. The main difference is that the digital twin can send state changes automatically to the physical twin.

The Digital Twin Prototype
Today's existing modeling and simulation tools can rapidly create a digital twin of a single component or process, and publish/subscribe architectures allow all messages between processes to be captured and sent to a database or an IoT platform. However, complex Industry 4.0 applications require the integration of multiple sensors and actuators into a larger system, posing a challenge with no simple solution yet. The embedded community still uses various industrial interfaces and communication protocols such as ProfiBus, ProfiNet, ModBus, CANOpen, OPC-UA, or MQTT, to name a few. Some are proprietary, making integration difficult, for instance, ProfiBus and ProfiNet.
Robust software testing for communication protocols is challenging due to the difficulty of emulating or simulating them. Software engineers frequently use mock-up functions in unit tests to avoid the expensive networking exchange of data between processes, allowing them to obtain expected values. However, even robust unit testing with comprehensive edge case coverage is insufficient. Therefore, some approaches use simulation tools that replace the communication protocols between hardware components with software interfaces. For Industry 4.0 applications, both approaches are inadequate, as insufficient testing can jeopardize the safety of human operators. Despite this, simulation tools are crucial for the development of Industry 4.0 applications as a source of data for sensors and actuators.
The software part of the connection can be formalized as shown in the Communication schema. The physical part, however, where the data is sent between Device and DeviceDriver, cannot be replaced in the same way. Hence, the approach still involves real hardware in the development loop. During development and testing, the Connection object is the central piece. Without a counterpart, no command is executed, and no data is exchanged. Thus, engineers always require the hardware connected to the embedded software system they develop and test. Replacing the Connection with a software mock-up to circumvent hardware-in-the-loop (HIL) testing would result in a different Connection object than the one used by the original SensorDriver. Thus, the configuration during development would differ from that of the real counterpart it is deployed on later.

Furthermore, not all communication protocols used in industry are properly mockable. This can be demonstrated by the example of ModBus and OPC-UA applications on the OSI-Model shown in Figure 17. Unlike Ethernet-based communication protocols that implement and cover all layers of the OSI-Model, communication protocols based on serial connections, such as ModBus or CANOpen, are placed on the model's 7th layer, the Application Layer. No additional host layers exist. Sending/receiving data is handled immediately by the Data Link and Physical Layers. This means that the physical hardware handles the necessary actions required for data exchange. Mocking these layers is difficult. On the other hand, communication protocols based on TCP, such as OPC-UA, can easily be mocked by opening a socket on the TCP layer and connecting another device to it. For serial protocols, this is not true. On connection, the driver tries to establish a connection to another device via RS232. As no device is connected, this would fail, and a connection error would be thrown.
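To illustrate why TCP-based protocols are easy to mock, the following sketch opens a socket on the loopback interface and lets a driver-side client connect to it. It is a generic TCP example under assumptions, not an OPC-UA implementation: the function names, the one-shot server, and the canned reply are hypothetical.

```python
import socket
import threading


def start_mock_device(reply):
    """Start a one-shot TCP server on a free loopback port; return (port, thread)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()
        with conn:
            conn.recv(1024)      # read the command sent by the driver
            conn.sendall(reply)  # answer with the canned response
        server.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return port, thread


def query_device(port, command):
    """Driver side: connect, send one command, and return the response."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(command)
        return client.recv(1024)
```

No hardware is involved at any point: the mocked device is just another process (here, a thread) at the other end of the socket, which is exactly what is missing for a serial connection without an attached device.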
Replacing the entire physical twin during development and testing, which includes the hardware interfaces, leads to a fully virtual representation of the physical twin, and engineers do not necessarily need the hardware anymore for development. This is the main difference to the digital twin prototype definitions by Grieves and Vickers [20] and Grieves [39]. We define the digital twin prototype as follows:

Definition 12 (Digital Twin Prototype). A Digital Twin Prototype (DTP) is the software prototype of a physical twin. The configurations are equal, yet the connected sensors/actuators are emulated. To simulate the behavior of the physical twin, the emulators use existing recordings of sensors and actuators. For continuous integration testing, the DTP can be connected to its corresponding digital twin, without the availability of the physical twin.

Object-Z Formalization
To reduce the dependency of the embedded software system on the hardware during development and testing, communication protocols such as RS232 need to stay on the host layers of the OSI-Model without the need to change the original connection properties of a device driver. This circumvents the layers that include the hardware. However, rerouting the connection disconnects the device and its driver. The rerouting only works if another process exists at the other end of the connection. So far, there is none. That is why not only the connection has to be emulated, but also the device. The emulated connection is defined first. The Object-Z formalization for EmulatedConnection is as follows:
The EmulatedConnection object inherits from the abstract Connection class, and thus has all its properties and functions. This is shown on the OSI-Model in Figure 17. The safe way to stay in the host layers is to route all other communication protocols to TCP and from there again back to the original protocol. Hence, the EmulatedConnection does not replace the connection objects of Device and DeviceDriver. Instead, it is an independent additional connection that provides interfaces for a device emulator and a device driver to connect to with their original protocols. The EmulatedConnection then uses TCP and forwards all incoming data via the function EmulateRead and all outgoing data via the function EmulateWrite between the emulated device and device driver.
How can this be realized without reconfiguring the device or device driver? Simply by using tools such as socat (SOcket CAT) [42]. Socat is a command-line utility that allows for bidirectional data transfer between two endpoints, typically over a network or through pipes. It is similar to the more well-known tool netcat, but with support for multiple connection types and protocols (TCP, UDP, SSL, PTY, etc.). With two virtual serial ports (client and server) created via socat for the emulator and the device driver, a connection can be established without the need to change the configuration. In the background, socat forwards the data between the ports via a TCP connection.
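The forwarding role described above can be sketched in Python. This is an illustration under simplifying assumptions, not a replacement for socat: in-process socket pairs stand in for the two virtual serial ports, and a background thread plays the part of the relay between them. All names (forward, open_emulated_connection) and the exchanged bytes are hypothetical.

```python
import select
import socket
import threading


def forward(port_a, port_b):
    """Copy bytes in both directions until either side closes."""
    peers = {port_a: port_b, port_b: port_a}
    while True:
        readable, _, _ = select.select(list(peers), [], [])
        for source in readable:
            data = source.recv(4096)
            if not data:
                return  # one endpoint closed: stop relaying
            peers[source].sendall(data)


def open_emulated_connection():
    """Return (driver_end, emulator_end), joined by a relay thread.

    Assumption: socket pairs replace the virtual serial ports; neither
    endpoint needs to know that a relay sits in between.
    """
    driver_end, relay_a = socket.socketpair()
    relay_b, emulator_end = socket.socketpair()
    threading.Thread(target=forward, args=(relay_a, relay_b), daemon=True).start()
    return driver_end, emulator_end


if __name__ == "__main__":
    driver_end, emulator_end = open_emulated_connection()
    driver_end.sendall(b"READ\n")      # driver writes a command
    print(emulator_end.recv(1024))     # emulator receives it unchanged
    emulator_end.sendall(b"21.5\n")    # emulator answers
    print(driver_end.recv(1024))       # driver receives the answer
```

The design point mirrored here is that the relay is an additional, independent connection: both endpoints keep using their own interface and configuration.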
A device emulator for a sensor could be like the one shown in Figure 18. Similar to the real sensor, the SensorEmulator inherits all properties and functions from the generic Device class. There is only one difference: instead of executing a command and responding with the real result, the emulator uses virtual context for the response. Virtual context can be a list of previously recorded data from the real device or context provided by a simulation. In this example, we assume that the virtual context is data previously recorded with the real device. Formalizing the emulated device and connection with Object-Z requires the definition of another data subtype first. Since the sensor responds to commands with a RESPONSE type, a subtype of RESPONSE named RECORDING can be defined:

RECORDING RESPONSE
The abstract class Emulator inherits all properties and functions from the abstract class Device, and SensorEmulator inherits from Emulator:

Emulator Device
Although it may seem more obvious to inherit from Sensor, the emulator cannot inherit its properties and functions from there. Most devices are a black box for the developer, and vendors only provide a technical manual and support to interact with the device. Thus, an emulator only mimics the behavior of the real counterpart and provides its API with corresponding return values. However, this is enough to replace the real device with the emulator for development and testing. A developer is mostly interested in the connection and data exchange part, not the internal behavior of a connected device. For the sake of abstraction, the Sensor object in this example was kept very simple. That is why the SensorEmulator can also inherit all properties from Emulator and change the ExecuteCommand function to always return RESPONSE objects from the virtualContext set:
The SensorDriver remains as it is and does not need any changes. The communication between an emulator and the SensorDriver can be specified as follows using EmulatedCommunication:
The EmulatedCommunication object now includes an additional Connection object in the form of EmulatedConnection.
The communication from the emulator to the device driver, labeled as ToDrv, is now a composition of the connections from the device to the EmulatedConnection. From there, the data is sent to the device driver, where the EmulatedConnection receives it and forwards it to the connection defined by the device driver. The EmulatedConnection is not part of either the device/emulator or the device driver. Therefore, in this example, the SensorDriver cannot distinguish whether it is connected to a real device or an emulator, which is the goal of our approach.
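The inheritance structure of the emulator can be mirrored in a short Python sketch. It is an illustration only and simplifies the specification in two labeled ways: the driver calls the device directly instead of going through Connection objects, and the virtual context is assumed to be an ordered list of recordings that is replayed cyclically. The attribute names, the READ command, and the recorded values are hypothetical.

```python
from abc import ABC, abstractmethod
from itertools import cycle


class Device(ABC):
    """Generic device with a non-empty set of accepted commands."""

    def __init__(self, command_list):
        if not command_list:
            raise ValueError("commandList must not be empty")
        self.command_list = set(command_list)

    @abstractmethod
    def execute_command(self, command):
        """Execute a command and return a response."""


class Emulator(Device):
    """Abstract emulator: a Device that answers from a virtual context."""

    def __init__(self, command_list, virtual_context):
        super().__init__(command_list)
        if not virtual_context:
            raise ValueError("virtual context must contain recordings")
        # Assumption: recordings are replayed in order and repeated at the end.
        self._recordings = cycle(virtual_context)


class SensorEmulator(Emulator):
    """Inherits from Emulator, not from a concrete Sensor class."""

    def execute_command(self, command):
        if command not in self.command_list:
            raise ValueError(f"unknown command: {command!r}")
        return next(self._recordings)  # always a recorded response


class SensorDriver:
    """Unchanged driver: it only relies on the Device interface."""

    def __init__(self, device):
        self.device = device

    def read(self):
        return self.device.execute_command("READ")


if __name__ == "__main__":
    driver = SensorDriver(SensorEmulator({"READ"}, ["21.5", "21.7"]))
    print([driver.read() for _ in range(3)])
```

Because SensorDriver depends only on the generic Device interface, swapping a real sensor for the emulator requires no change on the driver side in this sketch.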

Summary of the Digital Twin Concept
The relationships between the different concepts are illustrated in the UML diagram in Figure 19. We extended the semi-formal approaches by Yue, Arcaini, and Ali [17]  The special feature of the digital twin prototype is that it is operated by the same Embedded Control System as the physical twin. This software does not even recognize whether physical hardware or emulated hardware is used. Notice that the Digital Model used by the digital twin prototype is a different instance than the Digital Model updated by the Digital Shadow. Advanced Digital Twins can use the Digital Twin Prototype to evaluate "what-if" questions in more realistic scenarios that include the full software stack.

APPLICATION OF THIS CONCEPT
In the following, we present two projects in which the previous definitions and methods have already been applied in real-life contexts.

Field Experiment with Underwater Ocean Observation Systems
The digital twin prototype approach was developed for a network of ocean observation systems and tested during the research cruise AL547 with RV ALKOR (October 20-31, 2020) of the Helmholtz Future Project ARCHES (Autonomous Robotic Networks to Help Modern Societies) [43]. In ARCHES, a consortium of partners from AWI (Alfred-Wegener-Institute Helmholtz Centre for Polar and Marine Research), DLR (German Aerospace Center), KIT (Karlsruhe Institute of Technology), and GEOMAR (Helmholtz Centre for Ocean Research Kiel) developed several digital twin prototypes for ocean observation systems. The major aim of this project was to implement robotic sensing networks that are able to autonomously respond to changes in the environment by adapting their measurement strategy, both in space and in the deep sea. A field report on employing digital twin prototypes in this context is published by Barbie, Pech, Hasselbring, et al. [43].
Five digital twin prototypes of ocean observation systems constructed at AWI and GEOMAR were developed. They vary in construction, payload, and configuration. The distance between AWI and GEOMAR is a few hundred kilometers. Hence, the digital twin prototypes were used to develop the software without a permanent connection to the physical ocean observation systems. The microservices were implemented with ROS and encapsulated in Docker. How the different digital twin prototypes of the ocean observation systems were developed was described by Barbie, Hasselbring, Pech, et al. [44]. A special feature in this project was that the digital twin prototypes were used as digital twins of the physical twins underwater. The fully virtualized embedded software systems showed the state of the physical twins. This way, no extra software to run a digital twin was required.
Furthermore, with digital twin prototypes it was possible to develop and test scenarios before the mission took place. Automated testing is implemented through CI/CD in GitLab. During the mission, all exchanged messages on the physical twin and digital twin were recorded and can now be used to increase the quality of the CI/CD pipelines.

Case Study with Smart Farming Applications
As the digitalization of agricultural processes promotes the use of digital twins for various use cases [45], we also report on a case study that experimented with the digital twin prototype approach for a smart farming application.
The smart farming project SilageControl with a consortium of the Silolytics GmbH (project lead), Blunk GmbH, and Kiel University used digital twins to adopt the digital twin prototype approach for development and maintenance. The major goal of SilageControl is to improve the process of silage making, i.e., the fermentation of grass or corn in silage heaps. In order to avoid mold formation, the harvested crop is compacted by heavyweight tractors. As displayed in Figure 20, these tractors are equipped with a sensor bar, which includes GPS sensors, an inertial measurement unit (IMU), and a LiDAR. In combination, the sensors enable the continuous and accurate representation of the tractor's position / orientation and the shape and volume of the silage heap.
Since silage making is season-dependent, the digital twin prototype approach is used to improve the sensor platform independently of the current season. The first field experiments were conducted from May to October 2022. During this period, sensor data was recorded to further improve the accuracy of physical models and create scenarios for automated testing of future features. Thereby, data gathered by the digital twin improves the digital twin/digital twin prototype and vice versa. A case study with more details about this project was published by Barbie, Hasselbring, and Hansen [46].

CONCLUSION AND FUTURE WORK
Digital twins find applications across all layers in Industry 4.0 scenarios [44]. However, there exists confusion in the definitions of digital models, digital shadows, digital twins, and digital twin prototypes. While many studies attempt to list and categorize these differences, a formal description has been lacking. Therefore, in our Digital Twin concept, we formally specified the various components, ranging from the physical twin to the digital twin, culminating in a fully virtualized digital twin prototype capable of substituting the physical twin during development. To underscore the distinctions among these different facets of the digital twin from a software engineering standpoint, we provide an Object-Z formalization for each component.
We extended the digital twin concept by the Digital Template. A digital template describes the physical twin and is used to build it. It includes the physical twin's Digital Model, describing documents, and the Embedded Control Software.
We have provided real-world application examples to illustrate the practical context. A proof of concept for the formal specifications was demonstrated in a demonstration mission showcasing the viability of digital twins in ocean observation systems [43]. Moreover, we offered insight into how this approach could be employed in the SilageControl smart farming project, which aims to enhance the silage-making process through the development of a sensing platform [46].
The usage of digital twin prototypes transforms the way embedded software systems are developed. By starting with the emulation of hardware sensor by sensor, actuator by actuator, and communication protocol by communication protocol, the development of embedded software systems becomes an iterative process. Furthermore, the integration of a fully operational digital twin prototype heralds a shift towards collaborative efforts between engineers and domain experts, regardless of their physical location or connection to the hardware.
Besides reducing the time that is needed for testing by switching from HIL to SIL testing with digital twin prototypes, this approach also avoids expenses for redundant hardware and paves the way for more efficient development workflows that are otherwise difficult to implement for embedded software systems. Digital twins become a key enabler for fully automated integration testing of embedded software systems in CI/CD pipelines. While building, testing, and releasing software is possible for embedded software just like in other fields of software engineering, integration testing with hardware interaction is expensive, due to the HIL testing, and is often done manually. Thus, the integration tests are a bottleneck in the verification and validation activities and, hence, in the release of new software. With proper integration testing, however, developers increase the robustness of the embedded software systems. This may even embrace Industrial DevOps methods in the embedded field [3].
In summary, digital twins have the potential to enhance the quality of embedded software systems, concurrently reducing costs and accelerating development speed. These benefits align with the challenges cited by both Ebert [6] and Ozkaya [5], who identified the challenges to achieve quality while managing costs and efficiency.
Nevertheless, the digital twin community still has a lot of homework to do. The lack of a consensual definition of digital twins leaves a lot of room for interpretation of what a digital twin is. Instead of introducing abstract approaches that are described using an attached case study, researchers should focus more on formal approaches to demonstrate and distinguish different approaches. This may still lead to many different digital twin definitions, but at least the community is able to consolidate similar approaches and has a starting point to discuss differences, flaws, or benefits of different approaches. With the introduction of virtualization tools such as Docker and open platforms such as GitHub, the distribution of code and tools to replicate results of a research study or experiment with an approach became easy and has no costs attached.
The validation of research results and the reproducibility of experiments are integral aspects of good scientific practice [47]. However, replicating the conducted field experiments from our ARCHES demonstration mission or the SilageControl case study using similar hardware can be quite expensive. To facilitate independent replication of the digital twin prototype approach by engineers and other researchers, we have developed a digital twin prototype using cost-effective hardware, specifically a PiCar-X by SunFounder [48]. This digital twin prototype is based on the ARCHES Digital Twin Framework [49] and is publicly available on GitHub [50]. More comprehensive details about the PiCar-X digital twin prototype will be presented in a separate publication.
depicts the relationships. Physical twin and digital twin exchange data via the PT-To-DT-Connection and DT-To-PT-Connection.

Fig. 4: The deployment diagram of an embedded system comprising a sensor, a data transmitter, and the embedded control system to which both are connected. The sensor is connected via RS232 and the transmitter via Ethernet.

A Device comprises a Connection object and a set of accepted commands (commandList). The Connection object is responsible for managing data exchange between a Device and a DeviceDriver. The ExecuteCommand function represents the execution of a task following the transmission of a command to the Device. It expects a COMMAND object sent by the DeviceDriver and returns a RESPONSE object. The Read and Write functions make use of the corresponding functions provided by the contained Connection:
Device
  ↾(INIT, Send, Receive, commandList)
  connection : ↓Connection©
  commandList : ℙ COMMAND
  connection ∉ Connection
  #commandList > 0
Producer
  ↾(INIT, event, Emit)
  EventHandler
  Emit
    occuredEvent? : ↓DATA
    eventToEmit! : ↓DATA
    eventToEmit! = occuredEvent?
A Consumer registers via the Observe to an EVENT and only listens to the emitted events and handles them in a queue. The Consume function always returns the first element in the queue:
Consumer
  ↾(INIT, event, queue, Observe, Consume)
  EventHandler
  queue : ℙ ↓DATA
  INIT
    queue = ∅
  Observe
    ∆(queue)
    item? : ↓DATA
In this particular example, Sensor and SensorDriver are interconnected using an RS232Connection. The outcome of an executed command is categorized as a RESPONSE, which can represent either a MEASUREMENT or a STATUS object. The remaining functions within these specific classes remain consistent with those in the abstract parent classes Device and DeviceDriver:
Sensor
  ↾(INIT, Send, Receive, commandList)
  Device
  connection : RS232Connection©
  ExecuteCommand
    command? : COMMAND
    result! : RESPONSE
    command? ∈ commandList
A SensorDriver inherits the EventHandlers from its parent class:
SensorDriver
  ↾(INIT, Send, Receive, commandList)
  DeviceDriver
  connection : RS232Connection©

Fig. 6: A state machine of the embedded control system formalized for the physical twin.
EventStateMachine
  ↾(INIT, ProcessEvent, state)
  state : ↓STATE
  period : ℤ
  INIT
    period = 0
  ProcessEvent
    ∆(state)
    newEvent? : COMMAND
    newState! : ↓STATE
    state′ = newState!
It is important to note that, at this stage, the EventStateMachine has no connection to the physical twin. All modifications and updates are made manually, and there is no automatic synchronization between the digital model and physical twin. The schema for the digital model then includes the state machine:
DigitalModel
  ↾(INIT, ProcessEvent)
  stateMachine : EventStateMachine©
  INIT
    stateMachine.INIT
  ProcessEvent = stateMachine.ProcessEvent

Fig. 11: The digital shadow is deployed separately from the physical twin. The automated communication is unidirectional from the physical twin to the digital shadow. Status changes and all other data are sent by the physical twin and received by the digital shadow via transmitters. The digital shadow can reuse the transmitter driver from the physical twin. The logic inside the digital shadow is based on the MAPE-K model.
is reduced to the two new classes for the Monitor and Analyze stages.All other classes and relationships are identical to the UML class diagram of the physical twin in Figure 5 on Page 7.

Fig. 13: Reduced UML class diagram of the digital shadow.The MAPE-K stages Monitor and Analyze are included, all other classes and relationships are identical to the UML class diagram of the physical twin in Figure 5 on Page 7.

Fig. 16: UML class diagram of the digital twin, including only the MAPE-K relevant classes Monitor, Analyze, Plan, Execute, and the EventHandlers used for data exchange.All other classes are identical to the UML class diagram of the digital shadow in Figure 13.
This class is also a DTDigitalThreadProcess and includes a Consumer component to receive data from the Analyze stage.

Fig. 18: UML component diagrams for sensor and emulator components. The real SensorComponent in (a) can be replaced by an EmulatedSensorComponent (b), and the SensorDriver (c) cannot distinguish whether it is connected to the real sensor in (a) or the emulated one in (b).

Fig. 19: Relationships between physical twin, digital model, digital template, digital shadow, digital twin, and digital twin prototype.
(a) Sensor bar in lab environment (b) Sensor bar mounted on a tractor

Fig. 20: Sensor bar which monitors the process of silage making.
Assume for this example that the DeviceDriver fully implements all interactions with the Device and hence, the commandList for both instances is equal. The Receive and Send functions in this class also utilize the Connection's Read and Write functions. Any further implementations beyond this scope are not relevant to our specification. Data exchange between different processes, such as the DeviceDriver and the ControlLogic, occurs through EventHandlers:
Similar to the Device class, the DeviceDriver class also contains a Connection object, a set of commands, a set of known behaviors, and a function that maps a behavior to the corresponding command that can be sent to the Device:
Similar to the SensorDriver, the TransmitterDriver represents only a data relay between device and control logic:
Communication
  device : Sensor
  driver : SensorDriver
  ∀ x : device.commandList • x ∈ driver.commandList
  ∀ x : driver.commandList • x ∈ device.commandList
  ReadFromDevice = device.Send ∥ driver.Receive
  ReadFromDriver = driver.Send ∥ device.Receive
The Transmitter class is akin to the Sensor class in many ways. It handles incoming commands and provides responses in return. However, since the Transmitter is an actuator, it does not return measurements but instead sends data using another communication protocol, such as LoRaWAN. It is important to note that this communication differs from the Communication schema described earlier. Additionally, the Connection object solely represents the connection between the Device and DeviceDriver and does not pertain to the communication between two transmitters:
Becker, Bibow, Dalibor, et al. [18] for the digital twin the digital shadow. A Physical Twin performs actions using real Devices in a Physical Environment. The Physical Environment is not a real class, but the real-world context in which the Device operates. Changing behaviors lead to changes in the current state of the physical twin. Hence, the physical twin updates its state and sends the change of state via the Digital Thread, which was named Twinning in Yue, Arcaini, and Ali [17], to the Digital Shadow. In contrast to the formalization by Yue, Arcaini, and Ali [17], the physical twin is not directly connected to the digital twin, but via the Digital Shadow, which is included by the digital twin. In our Object-Z formalization of the digital shadow and digital twin, we illustrated the difference utilizing the MAPE-K model and showed that the digital shadow does not send any data to the physical twin. All state changes are received by the digital shadow, which then changes the Digital Model. Only the Digital Twin updates state changes similar to the change of state of the Physical Twin. Instead of physical processes, the digital twin uses the Digital Model, which operates in a Virtual Environment, to change the physical twin's state. During the development phase, the Digital Twin Prototype can replace the physical twin. A digital twin prototype executes commands on Emulated Hardware in a Virtual Environment. The Virtual Environment should mirror the real world, which can be realized via a Simulation. To describe and construct the Physical Twin, its Digital Template can be used, since it includes the Digital Model and the Embedded Control Software.