State Perception and Prediction of Digital Twin Based on Proxy Model

The maintenance of critical components plays a crucial role in ensuring the stable operation of equipment and minimizing damage caused by functional errors. However, traditional operation and maintenance (O&M) modes suffer from problems such as reliance on empirical judgment, lack of data support, insufficient preventive maintenance, and inadequate collaborative management. A viable way to address these issues is to adopt more intelligent O&M modes. Based on the characteristics of digital twin technology, such as virtual interaction and real-time feedback, a digital twin framework for the maintenance of critical equipment components is proposed, providing a new approach for the practical application of digital twins in intelligent maintenance processes. The framework consists of two key components: the digital twin maintenance model and the proxy model. The process of establishing the digital twin model is elaborated in detail, and a mechanism that integrates digital twin technology and the proxy model is proposed, along with a prediction process based on the fusion of simulation and monitoring data. Finally, based on the modeling process and the proxy model, a visualization interface for intelligent maintenance of components is built using relevant engineering software.


I. INTRODUCTION
During the use of mechanical components, damage and even failure can occur due to external and internal factors, making it difficult to maintain a stable operating state over a long period of time [1]. As components are used over an extended period, the operating environment becomes increasingly complex, posing new challenges for equipment diagnosis and maintenance [2], [3], [4]. Currently, the maintenance of mechanical components is mainly based on after-the-fact maintenance and limited experience-based predictive maintenance, which limits the accuracy and
With the emergence of industrial transformation strategies such as Germany's Industry 4.0 and China's Made in China 2025, as well as the development of big data, cloud computing, and sensor technologies, digital twin technology has been widely applied [10]. Braatz R. [11] proposed a novel probabilistic fault detection and identification method that adopts a newly developed deep learning approach using Bayesian recurrent neural networks (BRNNs) with variational dropout. Liu et al. [12] developed a lightweight digital twin model using multi-fidelity surrogate (MFS) modeling to monitor the structural health of a crane boom in real time, ensuring its operational capability under design load capacity. Chen et al. [13] constructed a digital twin-driven residual effective life prediction model based on an implicit semi-Markov model to predict the remaining effective life of equipment. Song et al. [14] proposed a solution called "algorithm-measurement fusion" that combines mechanistic modeling and measured data to meet the timeliness and accuracy requirements of digital twin models and to construct a digital twin framework for "form-essence integration" of major equipment. M.G. Kapteyn [15] proposed a method that combines a component-based reduced-order model library with Bayesian state estimation to create a data-driven digital twin.
Ierapetritou M [16] reviewed recent advances in surrogate models for problems in modeling, feasibility analysis, and optimization; two frequently used surrogates, radial basis functions and Kriging, were tested on a variety of test problems. Xie et al. [17] described the development of an AR-supported automated environmental anomaly detection and fault isolation method to assist facility managers in addressing problems that affect building occupants' thermal comfort. Glaessgen [18] integrated ultrahigh-fidelity simulation with the vehicle's on-board integrated vehicle health management system, maintenance history, and all available historical and fleet data to mirror the life of its flying twin and enable unprecedented levels of safety and reliability.
Complex simulations alone are still insufficient to meet engineering requirements: high-precision simulations that reflect the real-time operating details of components may take several hours to complete. Relying solely on high-precision simulations is therefore impractical [19]. An alternative solution that has garnered much attention is to use proxy modeling to capture the main features of the original model. This technique approximates input-output relationships at reduced computational cost, providing better real-time feedback and reliability predictions that reflect the actual operating conditions of physical entities, thereby assisting in equipment component diagnosis and maintenance [20].
In summary, the concept of digital twins has provided a new approach to the maintenance and management of critical components. This paper emphasizes the advantages of proxy modeling in state perception and prediction of digital twin models. The main contributions include: (1) proposing a general framework and implementation roadmap for real-time perception and prediction of component operational status based on digital twin and proxy modeling, (2) utilizing engineering software such as ANSYS, PyCharm, and Unity to visualize operational results for efficient and convenient operation.

II. DIGITAL TWIN FRAMEWORK FOR EQUIPMENT COMPONENT MAINTENANCE
Mechanical equipment components have long operating cycles and harsh working environments [21]. The virtual interaction and real-time feedback features of digital twins are beneficial for ensuring the efficiency and reliability of critical equipment components. In response to the characteristics and requirements of mechanical equipment in actual use, we focused on key issues such as state monitoring and fault prediction for critical equipment components, and established a multidimensional digital twin model for maintenance management [22]:
$$M_{DT} = (C_{PE}, C_{VE}, C_{CN}, C_{DD}, C_{PD}) \quad (1)$$
In equation (1), $M_{DT}$ represents the digital twin model used for component monitoring and diagnosis, $C_{PE}$ represents the physical entity of the component, $C_{VE}$ represents the virtual component model, $C_{CN}$ represents the connection between the physical entity and the virtual model, $C_{DD}$ represents real-time monitoring data obtained from sensors and other equipment used in the component, and $C_{PD}$ represents predicted data obtained through proxy models.
A digital twin model is a virtual, digital counterpart of a real-world physical object or process that reflects the state and behavior of the real world by collecting, processing, and analyzing large amounts of data. A proxy model is a mathematical or physical model used to build a digital twin; it simplifies real-world physical processes or phenomena and describes them with a set of equations or rules. Proxy models can help researchers model and analyze the real world to quickly and accurately predict behavior and outcomes in different situations. Twin data are the data associated with digital twin models, including data used to build and train the models, data used to validate and test them, and data used to monitor and update them. The digital twin framework for component maintenance and operation management is shown in Figure 1.
The framework is based on the physical entity of the equipment components and its 3D model, combined with emerging technologies such as intelligent sensors and machine learning, and utilizes proxy models to construct multidimensional virtual models of geometry, physics, behavior, and rules. The digital twin maintenance model and the physical component entity interact with each other in real time, and the data is classified and stored on the twin data platform. By analyzing the data, the operating status of the equipment components can be diagnosed and predicted. State prediction is an important means of keeping equipment components within a good operating range. Proxy models are widely used in state prediction and can be used for fault diagnosis, performance optimization, and improving system robustness [23]. By learning from known monitoring and simulation data, future states of equipment components can be predicted.
Based on the information contained in the twin maintenance model, the framework realizes the diagnosis of equipment component faults and equipment prediction driven by proxy model algorithms. Relevant maintenance strategies are specified according to actual situations to achieve intelligent maintenance of equipment and ensure that equipment components are in a healthy state [24], [25]. The proposed framework can improve the informatization level of equipment maintenance, solve various defects of traditional operation and maintenance, and achieve cost reduction and efficiency improvement.

III. ESTABLISHMENT OF DIGITAL TWIN MODEL
A. CONSTRUCTION OF VIRTUAL MODEL
The I-beam is a common structural steel section that can be used in the construction of bridges and overpasses, as well as in the manufacturing of machinery and equipment. Considering the significant impact of the state of I-beams on overall equipment operation under different working conditions, this study focuses on the construction method of its digital twin model, in order to better achieve the full life cycle energy efficiency evaluation of the equipment [26]. According to the characteristics of I-beams, this study divides the digital twin model of the I-beam into four parts: geometric model, physical model, behavioral model, and rule model [27]. The geometric model describes the geometric characteristics of the I-beam. Software such as SolidWorks and 3ds Max can be used to draw the geometric model of the object [6]. The geometric model of the I-beam is drawn based on parameters such as flange height, leg width, and web thickness. The physical model adds physical attributes to the geometric model, such as stress, modal, and deformation characteristics, through various simulations. Based on the geometric model, the finite element mesh is generated as shown in Figure 2; the resulting mesh has 65,212 nodes and 14,350 elements. The behavioral model is built on the physical model, with behavioral information obtained from the MES system. According to actual parameter information, the behavioral model of the response process is constructed. For the I-beam model, the actual working conditions mainly concern the stress of each part after bearing a certain load. Based on this, ANSYS is used to add the corresponding process constraints for simulation, and the analysis results are shown in Figure 3. Based on historical data and prior knowledge, the rule model of the component is constructed to reflect the true rules governing the operating state of the I-beam model.

B. TWIN DATA OF I-BEAM STEEL
Twin data is the driving force behind digital twins, and its collection, storage, and analysis must be completed under certain requirements. The twin data of I-beam steel is divided into six parts: physical entity-related data, virtual model-related data, proxy model data, fusion data, connection data, and domain knowledge data, as shown in Figure 4. In certain usage environments, the physical entity data of I-beam steel cannot be collected, and during simulation, accuracy cannot be guaranteed due to the interference of uncertain factors. Therefore, real-time monitoring data and simulation data need to be combined so that they complement each other. To meet the real-time requirements of digital twin services, a real-time transmission channel for the data of each part of the digital twin needs to be established. Installing full-range sensors and wireless transmission devices on the physical I-beam steel enables reliable and stable data collection. Real-time data interaction between the real world and the virtual world can be achieved between the engineering software for virtual models and the digital twin service platform.
As the system runs, data and information continuously accumulate during the equipment usage process. Data fusion is a cyclic process of accumulation, updating, and improvement. By constructing iterative rules and mechanisms, new data and new strengths can be absorbed based on the original data to ensure that the result of data fusion remains advanced and more informative. Through the synchronous operation and interaction of the physical I-beam steel entity and the virtual model, functions such as state awareness and fault diagnosis for the physical I-beam steel entity can be achieved through the comparison of physical and simulation states, fusion analysis of physical and simulation data, and virtual model validation.

C. INTERACTIVE CONNECTIVITY
The connection between physical steel and virtual model: Physical equipment fitted with a series of sensors and wireless transmission devices, using protocols such as Bluetooth, Wi-Fi, and NFC, sends the monitoring data collected by strain gauges and pressure sensors to the receiving platform for storage, thus achieving accurate mapping of the digital twin model to the physical steel entity.
Exchange of digital twin data: Steel twin data comes from physical entities, twin models, environmental parameters, historical experience, and other fused data. Due to the different representations of data from different sources and different collection sequences, it is difficult to share fused data. Therefore, it is necessary to break the information silos and achieve twin data exchange and sharing. Steel twin data is usually stored using databases and cloud services, and physical steel data is uploaded and twin data is called using protocols such as ZigBee, 5G, and Socket.
External service interaction: Due to the increasing user demand and pursuit of high-quality efficient services, the single-purpose functions and attributes of digital twin services are limited [28]. Therefore, it is necessary to establish interaction-related mechanisms to achieve the interaction and connection of digital twin services, including intelligent operation and maintenance, health management, decision optimization, and other service interactions. VOLUME 11, 2023

IV. THE REAL-TIME STATE PERCEPTION DRIVEN BY PROXY MODELS
A. MODELS INTEGRATION
Model integration refers to combining several different models together to obtain more accurate and robust prediction results. In practical applications, model integration is a very important technique in the field of machine learning because a single model often cannot solve all problems, and model integration can improve the overall prediction by exploiting the advantages of different models.
The effectiveness of model integration depends on the diversity among the integrated models. Diversity can be achieved by using different algorithms, different features, different hyperparameters, etc. Diversity can improve the predictive power of a model, but it can also increase the variability between models, leading to an increase in model complexity.
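As a small illustration of why combining diverse models can help, the following sketch averages three synthetic predictors. The data, biases, and noise levels are made up for the example and are not taken from this study; simple averaging is only one of several possible integration schemes.

```python
import random

random.seed(0)

# Synthetic "true" response (hypothetical, for illustration only).
truth = [0.1 * i for i in range(200)]

def noisy_model(bias, noise):
    """Simulate one imperfect model: truth plus a bias and independent noise."""
    return [t + bias + random.gauss(0.0, noise) for t in truth]

# Three diverse models with different systematic errors.
preds = [noisy_model(0.2, 0.5), noisy_model(-0.2, 0.5), noisy_model(0.0, 0.5)]

# Integrate the models by averaging their predictions point by point.
ensemble = [sum(p) / len(p) for p in zip(*preds)]

def mse(pred):
    """Mean squared error against the synthetic truth."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)

individual_mse = [mse(p) for p in preds]
ensemble_mse = mse(ensemble)
```

Because the errors of the three models are independent and their biases partly cancel, the averaged prediction has a lower error than any single model in this setup; with strongly correlated models the gain would be much smaller.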

B. KNN ALGORITHM
The KNN algorithm (K-Nearest Neighbor Algorithm) is a basic classification and regression method, which is a nonparametric machine learning algorithm. The basic idea of the KNN algorithm is to classify or regress new data points based on the distance between samples, that is, to classify the new data point into the category of the k nearest known data points. The implementation steps of the KNN algorithm are as follows: (1) Determine the value of k, that is, select k nearest neighbors; (2) Calculate the distance between the new data point and the known data points; (3) Take the k nearest known data points as neighbors of the new data point; (4) Classify or regress the new data point based on the categories of the k neighbors.
In classification problems, the predicted result is usually the most frequent category among the neighbors, that is, the voting method. In regression problems, the predicted result is usually the average of the target values of the neighbors. The advantages of the KNN algorithm are that it is simple, easy to understand, easy to implement, and suitable for multi-class problems, although distance-based neighbors become less informative as the dimensionality of the data grows. When the deformation and equivalent stress of the I-beam under different working conditions are computed with the finite element software ANSYS, the mesh size determines both the speed of calculation and the accuracy of the solution. However, the digital twin model requires real-time interaction and iterative feedback. To meet these requirements, the solution results for a mesh size of 5 mm are exported, as shown in Table 1, together with the node coordinates for a mesh size of 12 mm. The KNN algorithm is then used to obtain the solution results at the 12 mm mesh nodes, as shown in Table 2, and these results serve as input to the RBF proxy model.
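The transfer of results from a fine mesh to coarse-mesh nodes can be sketched with KNN regression as follows. This is a minimal illustration rather than the implementation used in this study: the node coordinates, stress values, and the choice k = 2 are hypothetical.

```python
import math

def knn_regress(query, points, values, k=3):
    """Average the values of the k known points nearest to `query`."""
    # Pair each known point's Euclidean distance to the query with its value.
    dists = sorted((math.dist(query, p), v) for p, v in zip(points, values))
    nearest = dists[:k]
    return sum(v for _, v in nearest) / len(nearest)

# Hypothetical fine-mesh nodes (x, y, z in mm) with equivalent stress (MPa).
fine_nodes = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (15, 0, 0), (20, 0, 0)]
fine_stress = [10.0, 12.0, 14.0, 16.0, 18.0]

# Hypothetical coarse-mesh nodes that need a stress value.
coarse_nodes = [(0, 0, 0), (12, 0, 0)]
coarse_stress = [knn_regress(q, fine_nodes, fine_stress, k=2)
                 for q in coarse_nodes]
```

For the node at (12, 0, 0), the two nearest fine-mesh nodes are those at x = 10 and x = 15, so the estimate is the mean of 14.0 and 16.0. A distance-weighted average would be a natural refinement when the neighbors are unevenly spaced.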

C. RBF PROXY MODEL
The RBF (Radial Basis Function) proxy model has good nonlinear approximation and generalization capabilities, and can perform well even with small datasets. It is widely used in areas such as function approximation, classification, and regression. RBF neural networks are artificial neural networks that use radial basis functions as activation functions, and their output is a linear combination of the radial basis functions of the input and the weight coefficients. A radial function ϕ(x) depends only on the distance from a fixed point c, that is, ϕ(x) = ϕ(∥x − c∥), so the function values are the same for all points that are equidistant from c.
There are many common radial functions; the Gaussian function used in this study is:
$$\phi(r) = \exp\left(-\frac{r^{2}}{2\sigma^{2}}\right) \quad (2)$$
The principles of the RBF neural network are illustrated in Figure 5. In this figure, the input layer $X_j$ ($j = 1, 2, 3, \ldots, n$, where $n$ represents the number of sample points) is a vector of design variables corresponding to the j-th sample point. The hidden layer $C_j$ ($j = 1, 2, 3, \ldots, h$, where $h$ represents the number of hidden layer nodes) is a vector of radial basis functions corresponding to the j-th hidden node. The weight vector $w_j$ ($j = 1, 2, 3, \ldots, h$) corresponds to the j-th hidden node. The output layer $Y$ is a vector of target functions obtained by adding the radial basis functions with weights in the hidden layer.
The activation function of the RBF neural network can be expressed as:
$$R(X_j - C_i) = \exp\left(-\frac{\lVert X_j - C_i \rVert^{2}}{2\sigma^{2}}\right) \quad (3)$$
where $X_j$ is the j-th input sample, $C_i$ is the center of the i-th node, and $\sigma$ is the width parameter, which controls the radial range of the node.
From the RBF neural network structure, the network output can be obtained as follows:
$$y_j = \sum_{i=1}^{h} w_{ij} \, R(X_j - C_i) \quad (4)$$
In equation (4), $w_{ij}$ represents the weight of the basis function.
The establishment process of the RBF proxy model generally includes the following steps:
(1) Data collection: collect data for modeling, including input variables and corresponding output variables.
(2) Selection of basis functions: choose a suitable set of basis functions, usually Gaussian functions.
(3) Calculation of distances: calculate the distances between each input vector and other vectors.
(4) Construction of proxy model: use the values of basis functions to construct the proxy model, and determine the coefficients of the basis functions through optimization methods such as least squares.
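The four steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the code used in this study: the Gaussian basis is written with a 2σ² denominator, every training sample is used as a center, the weights are found by solving the least-squares normal equations, and the sample data and the helper names (`gaussian`, `solve`, `fit_rbf`, `predict_rbf`) are hypothetical.

```python
import math

def gaussian(x, c, sigma):
    """Gaussian radial basis: depends only on the distance between x and c."""
    return math.exp(-math.dist(x, c) ** 2 / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def fit_rbf(X, y, centers, sigma):
    """Least-squares weights from the normal equations (Phi^T Phi) w = Phi^T y."""
    # Steps (2)-(3): evaluate each basis function from sample-to-center distances.
    Phi = [[gaussian(x, c, sigma) for c in centers] for x in X]
    h = len(centers)
    A = [[sum(row[i] * row[j] for row in Phi) for j in range(h)] for i in range(h)]
    b = [sum(row[i] * t for row, t in zip(Phi, y)) for i in range(h)]
    return solve(A, b)  # step (4): determine the coefficients

def predict_rbf(x, centers, weights, sigma):
    """Proxy model output: weighted sum of the basis function values."""
    return sum(w * gaussian(x, c, sigma) for w, c in zip(weights, centers))

# Step (1): hypothetical samples, e.g. a load value and the resulting stress.
X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
y = [0.0, 0.8, 0.9, 0.1, -0.8]
sigma = 1.0
weights = fit_rbf(X, y, centers=X, sigma=sigma)
fitted = [predict_rbf(x, X, weights, sigma) for x in X]
```

With one center per sample the fit reproduces the training outputs up to rounding error; choosing fewer centers than samples turns the same code into a smoothing least-squares fit. The width σ is a tuning parameter and would normally be selected by validation.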
When simulation data for the I-beam is obtained in the finite element analysis software, the loaded surface of the I-beam is divided into 6 impression surfaces, and the exported data is used to train the RBF proxy model. A proxy model is established for each node, that is, 4411 proxy models are built, each of which gives the state of the node for the loads applied on the 6 impression surfaces. The deformation and equivalent stress at other positions of the I-beam are calculated by interpolation. The use of proxy models greatly shortens the calculation process in simulation analysis, which is an important guarantee for realizing the characteristics of digital twins, such as real-time monitoring, iteration, and virtual-real interaction.

D. VISUALIZATION
As a widely used real-time 3D authoring platform, Unity3D has a large community of developers and is applied in multiple fields, including industrial digital twins. With Unity, model data, sensor data, or point cloud data can be transmitted and rendered in real time. After physical characteristics and behavior logic are added, simple and abstract models and data can be rendered as photo-realistic real-time scenes and can also be interacted with on multiple platforms in the form of AR/VR/MR, thereby realizing digital twins.
Currently, the traditional way to achieve digital twins using Unity is through communication between the data service and Unity, as shown in Figure 6. First, smart sensors are installed on the manufacturing equipment, and the monitoring data is uploaded in real-time by the sensors. Secondly, there needs to be a receiving service, which can be a simple backend service that receives the data uploaded by the sensors. Next, Unity obtains the data in real-time from the server through the socket method. Finally, using the real-time obtained data, Unity drives the mapped virtual device (which is currently manually modeled) in real-time.
Data transmission between Unity and the Python program run in PyCharm relies on socket communication. A socket is an endpoint for bidirectional communication between processes on different hosts and serves as the programming interface between a single host and the network. When data needs to be sent, the corresponding application process segments the data to fit transmission at the network layer. When a packet is received from the network layer, it is acknowledged, and lost packets are retransmitted after a timeout. On the server side (PyCharm), a socket is created, bound to the local IP address 127.0.0.1 with bind, and set to listen with a maximum of 5 pending connections, waiting for the connection and data from the client side (Unity). The server reads the position of the six points of the I-beam five times per second, calculates the stress results in real time using the RBF proxy model according to the stress conditions, and visualizes the stress results in the Unity interface, as shown in Figure 7.
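The server-side steps described above (create, bind, listen, receive, reply) can be sketched with Python's standard `socket` module. This is a minimal sketch under several assumptions that are not taken from this study: messages are newline-delimited JSON, the port is chosen by the operating system, `predict_stress` is a hypothetical placeholder for the RBF proxy model, and a small Python client stands in for Unity. The five-times-per-second polling is not modeled.

```python
import json
import socket
import threading

HOST = "127.0.0.1"  # local address, as described in the text

def predict_stress(loads):
    """Hypothetical placeholder for the RBF proxy model: a scaled sum."""
    return 0.5 * sum(loads)

def handle_client(conn):
    """Read newline-delimited JSON load vectors and reply with a stress value."""
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        for line in reader:
            loads = json.loads(line)
            reply = json.dumps({"stress": predict_stress(loads)}) + "\n"
            conn.sendall(reply.encode("utf-8"))

def start_server(port=0):
    """Bind, listen with a backlog of 5, and serve one client in the background."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((HOST, port))  # port 0 lets the OS pick a free port
    server.listen(5)

    def accept_one():
        conn, _ = server.accept()
        handle_client(conn)
        server.close()

    threading.Thread(target=accept_one, daemon=True).start()
    return server.getsockname()[1]

# Stand-in for the Unity client: send six load values and read the reply.
port = start_server()
with socket.create_connection((HOST, port)) as client, \
        client.makefile("r", encoding="utf-8") as reader:
    client.sendall(b"[1, 2, 3, 4, 5, 6]\n")
    reply = json.loads(reader.readline())
```

On the Unity side, a C# TCP client would play the role of the Python client shown here. A line-based text format keeps message framing simple over TCP, which delivers a byte stream rather than discrete messages.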

V. CONCLUSION
This article introduces the application of digital twin technology in the diagnosis and health management of key equipment components, as well as the fusion mechanism of digital twin technology and proxy models, achieving intelligent prediction of component operating status.
Based on the characteristics of virtual interaction and real-time feedback of digital twin technology, a digital twin framework for the operation and maintenance of key equipment components is proposed, providing new ideas for the practical application of digital twins in intelligent operation and maintenance processes. In combining digital twin technology with proxy model algorithms, the article first presents the process of building a digital twin model, realizing the integration and visualization of monitoring and simulation data. Then, the proxy model algorithm is applied to analyze and predict the stress status of components during operation and maintenance. Based on digital twin technology and proxy models, a new solution is provided for the development of intelligent operation and maintenance systems and for the application of digital twin technology.
In summary, the application of digital twin technology and machine learning algorithms is an effective way to achieve fault diagnosis and operating status monitoring of key equipment components. However, the application of digital twin technology is still in the exploration stage and has certain limitations. The focus of follow-up research is to establish more accurate digital twin models, which are an important basis for state sensing and prediction. By improving the modeling approach, data quality, and model structure, the prediction accuracy and practicality of digital twin models can be improved. The application of digital twins in intelligent operation and maintenance still has great potential for development. At present, the method proposed in this study remains at the theoretical level; future work will attempt to apply it to actual projects.