Constrained Optimization of Sensor Placement for Nuclear Digital Twins

The deployment of extensive sensor arrays in nuclear reactors is infeasible due to challenging operating conditions and inherent spatial limitations. Strategically placing sensors within defined spatial constraints is essential for the reconstruction of reactor flow fields and the creation of nuclear digital twins. We develop a data-driven technique that incorporates constraints into an optimization framework for sensor placement, with the primary objective of minimizing reconstruction errors under noisy sensor measurements. The proposed greedy algorithm optimizes sensor locations over high-dimensional grids, adhering to user-specified constraints. We demonstrate the efficacy of the optimized sensors by exhaustively computing all feasible configurations for a low-dimensional dynamical system. To validate our methodology, we apply the algorithm to the Out-of-Pile Testing and Instrumentation Transient Water Irradiation System (OPTI-TWIST) prototype capsule, which is electrically heated to emulate the neutronic effect of the nuclear fuel. The TWIST prototype, which will eventually be inserted in the Transient Reactor Test Facility (TREAT) at Idaho National Laboratory (INL), serves as a practical demonstration. The resulting sensor-based temperature reconstruction within OPTI-TWIST demonstrates minimized error, provides probabilistic bounds for noise-induced uncertainty, and establishes a foundation for communication between the digital twin and the experimental facility.


Introduction
Safe and efficient performance of nuclear power plants requires remote monitoring, condition-based maintenance, and real-time control via data streamed from physical processes [1], especially in advanced reactors (e.g., microreactors), fission batteries, small modular reactors, and integrated energy systems. In nuclear systems, sensor capacities and real-time data streaming are severely limited, particularly for critical process responses such as coolant levels, temperature, velocity, pressure, neutron distribution, and power fields. Furthermore, extreme operating conditions, high costs, limited accessibility, and safety regulations all impose significant constraints on sensor placement. The optimization of sensor placement is critical for accurate reconstruction of fields of interest, and must take into consideration not only access and safety constraints but also the underlying physics.
In general, sensor placement optimization is NP-hard and cannot be solved in polynomial time: there are C(n, p) = n!/((n − p)! p!) possible combinations when choosing p sensors from an n-dimensional state. Exploiting the low-dimensional structure inherent to the process physics is crucial for efficient optimization over complex environments. This paper explores constrained optimization of sensor placement tailored for nuclear digital twin paradigms. Empowered by strategically placed and informative sensors, digital twins continuously stream real-time data from physical assets for digitally enabled decision-making, control, risk assessment, and predictive maintenance, as outlined in Figure 1. Digital twin realizations effectively function as virtual sensors that can predict structural lifespans and ensure structural integrity throughout product lifecycles, for example in aircraft [2], [3].
The goal is "an integrated multi-physics, multi-scale, probabilistic simulation" [6], [7] mirroring the lifecycle of a complex system in order to reduce infeasible or costly physical testing. In nuclear applications, digital twinning requires real-time, high-precision simulations featuring online autonomous calibration to real data via machine learning (ML) (e.g., using k-means clustering and artificial neural networks [8]). Evaluating the effect of uncertainty in the digital twin, as simulated by ML models on reactor instrumentation [9], and establishing (through sensors) a real-time two-way connection between the physical and the virtual spaces are crucial considerations for the success of digital twins [10].
Reduced-order models (ROMs) are key enablers of digital twins that compress high-fidelity, high-dimensional simulations into low-dimensional surrogate models with fewer degrees of freedom, significantly reducing the computational burden while still capturing the characteristics of the relevant process [11], [12]. Nuclear applications require this ability to accurately simulate high-dimensional fields with minimal computational resources and without the complexity of full-order models [13], [14]. Projection-based ROMs, which represent high-fidelity physics using low-rank/data-driven modal decompositions, have been widely adopted for modeling fluids and turbulence [15]-[19] and nuclear core composition [20]-[22]. Such modal decompositions are closely related to empirical orthogonal functions for surrogate models in the atmospheric sciences [23], [24], electrodynamics [25], [26], and heat transfer [27], [28], as well as to model reduction of stochastic processes [29] and balanced model reduction for optimal control [30], [31]. ROMs not only provide substantial dimensionality reduction for downstream decision-making and control, but also supply valuable physical information that can be leveraged for optimizing sensor placement.

Figure 1: Digital twins in nuclear power plants. Digital twin frameworks consist of a real/physical space containing physical assets; a virtual/digital space containing computer-aided design (CAD) replicas, simulations, and Artificial Intelligence (AI); a data space; and a decision space [4], [5]; all of which are enabled by sensors (y = Sx + η) providing two-way communication between the virtual (ROMs and simulations) and physical spaces. This digital twin characterizes the lifecycle of the OPTI-TWIST capsule, which is inserted into the TREAT reactor at Idaho National Laboratory (INL) to test fuel compositions.
Optimal sensor placement approaches typically optimize an objective, such as an information criterion [32], [33], over feasible sets of sensor configurations, framing sensor placement as a submodular selection problem [34]. Such problems can be efficiently optimized for hundreds or thousands of candidate locations using convex [33], [35] or greedy submodular optimization approaches [34]. Sensor placement in linear time-invariant systems has been optimized using gradient descent methods with similar computational complexity [36]. However, modern nuclear and fluid simulations have millions of grid points, making such techniques computationally intractable. Leveraging ROMs to optimize sensor placement, by exploiting low-dimensional patterns in the data, drastically reduces the number of sensors required for accurate reconstruction of full fields [37]-[41]. Empirical ROM interpolation methods have been successfully adapted for optimizing sensor placement to minimize reconstruction error [42]-[44], including under greedy cost constraints [45]. However, these methods neither admit hard constraints within reactors nor predict the reconstruction uncertainty of a given sensor configuration under measurement noise.
This work develops a data-driven optimization approach for a nuclear component prototype, incorporating spatial constraints. In certain regions of a reactor, the placement of sensors may be constrained due to limited space availability, specific requirements dictating predetermined sensor locations, restricted areas within the reactor, fixed numbers of sensors within a region, or a minimum allowed proximity between sensors. Our target application is the Out-of-Pile Testing and Instrumentation Transient Water Irradiation System (OPTI-TWIST) prototype, which is electrically heated to emulate the neutronic effect of nuclear fuel. The production version of TWIST will serve as a multi-purpose test rig for surrogate fuel rodlets and will simulate transient loss-of-coolant accident scenarios, assisting in the qualification of an identical irradiation rig for the Idaho National Laboratory (INL) Transient Reactor Test Facility (TREAT).
We adapt data-driven methods based on modal decomposition [44] to enforce these constraints during optimization, and develop placement strategies for full-field reconstruction from sparse, spatially constrained sensor measurements. Our algorithm minimizes the error covariance using D-optimal design criteria, which provide an evaluation metric for a given sensor configuration and corresponding estimates of the reconstruction uncertainty under noisy measurements. Using empirical and theoretical validation, the present work demonstrates the technique to be near optimal via exhaustive enumeration of all feasible sensor configurations for a low-dimensional dynamical system. The optimized sensors under constraints are demonstrated to provide highly accurate reconstructions and uncertainty estimates under noisy measurements, compared to random placements, in high-dimensional 2D heat diffusion as well as OPTI-TWIST steady-state and transient temperature fields with up to 40,510 candidate sensor locations.

Background
This section first describes the need for optimal sensor placement in nuclear digital twinning applications through different stages within product lifecycles and product realizations. Next, we detail the reconstruction of latent flow fields from sparse sensor measurements using reduced-order modeling.

Sensing in nuclear reactors
A nuclear digital twin is a digital CAD replica of a physical counterpart whose complexity can vary from that of an individual fuel rodlet, heat pipe, or nuclear reactor, to that of an integrated energy system utilizing several different energy sources (e.g., wind, solar, and nuclear). Within this digital replica, visual and virtual representations of all the different sensors and actuators of nuclear components are provided [46]. Potential application areas can leverage these digital representations for design and licensing, plant construction, training simulators, predictive operations and maintenance, autonomous operation and control, failure and degradation prediction, the generation of insights from historical plant data, and safety and reliability analyses.
Real-time sensor data streaming through private or Industrial Internet of Things (IoT) communication protocols is indispensable for creating digital twin architectures for nuclear applications (see Figure 1). The sensors provide continuous self-validation of the ML/AI models, which not only reflect the current state of the dynamical system but also predict, in real time, future states of the dynamics. Current sensor technologies in the nuclear field reflect a preference for installing sensors in easily accessible areas. Only a few algorithms have been developed for sensor placement in nuclear reactors, such as the generalized empirical interpolation method [47], reinforcement learning [48], and a directed-graph approach for minimizing postulated faults with maximum imperceptibility [49].
Optimal sensor technologies empower digital twins by enabling the integration of in-field, real-time raw data into the virtual replica of a physical prototype at any point during its product lifecycle [50], which includes the design, manufacturing, service, and retirement stages. Our data-driven sensor placement methodology is designed for optimizing sensors in the design stage. Virtual sensors are converted into physical sensors and validated through experimentation throughout design, manufacturing, and service. If new constraints arise during production, optimal sensing techniques can incorporate them in real time and then suggest the next best set of optimal sensor locations.

Sparse sensing for reconstruction
The core of our work is the reconstruction of latent fields x ∈ R^n from p noise-corrupted sensor measurements y ∈ R^p,

y = Sx + η,    (1)

where η consists of zero-mean, Gaussian independent and identically distributed (i.i.d.) components, and S ∈ R^{p×n} is the desired sensor (measurement) selection operator.
In nuclear applications, the number of measurements p is severely limited relative to the large dimensionality of the latent field. We encode the field dynamics as a linear combination of spatial basis modes ψ_k(ξ) weighted by time-varying coefficients a_k(t),

x(ξ, t) ≈ Σ_{k=1}^{r} a_k(t) ψ_k(ξ).

For each field or full state x at a fixed t, the vector a composed of the r coefficients a_k(t) defines a low-rank embedding of the form

x ≈ Ψ_r a,    (2)

where the modes ψ_k comprise the columns of Ψ_r. This basis, which can be built from spectral or data-driven decomposition methods, is typically chosen so that the embedding dimension is as small as possible, i.e., r ≪ n.
Given this assumption, high-dimensional states can be directly recovered from measurements via the maximum-likelihood estimate of the basis coefficients, â = (SΨ_r)^† y:

x̂ = Ψ_r â = Ψ_r (SΨ_r)^† y,    (3)

known as gappy proper orthogonal decomposition (POD) [29]. The gappy estimator is well posed when the number of sensors equals or exceeds r. Importantly, the inherent compressibility of physical fields enables a drastic reduction in the number of sensors required for high-fidelity reconstruction. The critical enabler for sparse sensing is the fact that nuclear processes are strictly governed by a small set of underlying physics. As we shall see, strategic selection of sensor measurements, informed by the underlying flow physics, allows an extremely small number of deployed sensors to be used even under noise.
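As a concrete illustration, the following minimal NumPy sketch implements the gappy POD estimate of Equation 3; the function and variable names are ours, not from an existing codebase.

```python
import numpy as np

def gappy_pod_reconstruct(Psi_r, sensor_idx, y):
    """Gappy POD: estimate the full state x_hat = Psi_r @ a_hat from p point
    measurements y taken at rows sensor_idx of the basis (Equation 3)."""
    S_Psi = Psi_r[sensor_idx, :]                       # S @ Psi_r: modes sampled at sensors
    a_hat, *_ = np.linalg.lstsq(S_Psi, y, rcond=None)  # a_hat = (S Psi_r)^+ y
    return Psi_r @ a_hat                               # lift coefficients to the full field
```

The least-squares solve is well posed once p ≥ r, matching the sensor budgets used throughout this work.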

Data-driven basis
The data embedding rank dictates the minimum number of sensors required for reconstruction, necessitating a choice of basis with the lowest possible rank. Given full-state data sampled from physics/CFD simulations, X = [x_1 ··· x_m], the proper orthogonal decomposition (POD) [51] provides the minimal-rank approximation to the data,

Ψ_r = argmin_{Ψ ∈ R^{n×r}} ‖X − ΨΨ^T X‖_F,    (4)

where the low-rank embedding is given by the projection of the data onto the orthogonal POD modes, Ψ_r^T X. The solution to (4) is computed using the singular value decomposition of the data matrix, X = UDV*, where the leading r left singular vectors comprise the desired POD modes, Ψ_r = U_r. The singular values (the diagonal entries d_k of D) quantify the decreasing energy contribution of each successive mode and determine the truncation rank. Most physical data have far fewer degrees of freedom than the ambient data dimension, allowing a very small choice of r. The cumulative energy captured by the leading r modes is

E(r) = (Σ_{k=1}^{r} d_k²) / (Σ_{k=1}^{m} d_k²).

In practice, the smallest possible r capturing 90-99% of cumulative energy above the noise threshold is used as the model truncation rank. POD is therefore also the workhorse of projection-based model order reduction, in which governing-equation terms are projected onto POD modes to obtain computationally expedient surrogate models for high-fidelity physics.
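The basis computation reduces to a truncated SVD, as in the short sketch below; we assume the common squared-singular-value convention for the cumulative energy, and the 90-99% threshold is passed as a parameter.

```python
import numpy as np

def pod_basis(X, energy=0.99):
    """Truncated POD basis of snapshot matrix X (n x m) via the SVD.
    Returns (Psi_r, r), with r the smallest rank whose cumulative
    squared-singular-value energy reaches the requested threshold."""
    U, d, _ = np.linalg.svd(X, full_matrices=False)
    cumulative = np.cumsum(d**2) / np.sum(d**2)     # E(r) for r = 1..m
    r = int(np.searchsorted(cumulative, energy)) + 1
    return U[:, :r], r
```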

Methodology
This section describes our methodology for optimizing sensors for reconstruction in constrained settings. First, we describe the ROMs used to set up a linear reconstruction problem for recovering high-dimensional fields from sparse measurements (i.e., sensors). We then characterize our sensor placement optimization objective in terms of reconstruction error. Next, we detail how our greedy algorithm selects the next optimal sensor in terms of a strategic projection operator, and we modify the selection step to enforce the necessary constraints while still maintaining optimality.

Sensor placement for reconstruction
The placement of sensors is defined by a measurement selection operator S ∈ R^{p×n} that optimally recovers the modal mixture a from sensor measurements y. This measurement selection operator encodes point measurements with unit entries in a sparse matrix,

S = [e_{γ1}  e_{γ2}  ···  e_{γp}]^T,

where e_j are canonical basis vectors for R^n, with a unit entry in component j (where a sensor should be placed) and zeros elsewhere. Here, γ = {γ_1, γ_2, ..., γ_p} ⊂ {1, 2, ..., n} denotes the index set of sensor locations with cardinality p. Sensor selection then corresponds to the components of x that were chosen to be measured:

y = Sx = [x_{γ1}  x_{γ2}  ···  x_{γp}]^T.

The selection of sensors is based on the optimal estimation of the entire state vector x ∈ R^n from p experiment outputs y ∈ R^p with additive i.i.d. Gaussian noise,

y = Sx + η.

The values of x at unmeasured locations can be recovered by solving a linear system of equations for the basis coefficients via the Moore-Penrose pseudoinverse of SΨ_r (gappy POD (3)):

x̂ = Ψ_r (SΨ_r)^† y.

The row indices of Ψ_r corresponding to sensor locations effectively condition the matrix inversion, enabling accurate reconstruction of the estimated state x̂.
Optimal design of experiments [52] for estimation problems involves the strategic selection of a set of experiments to gather sufficient information about the domain, enabling accurate predictions for measurements where experiments were not performed. Statistical criteria, such as A-, D-, and E-optimality, are used to select the set which minimizes or maximizes different properties of the Fisher information matrix. Fisher information [52] measures the amount of information a random variable contains about the estimated parameter, such as its true mean or standard deviation. The Fisher information matrix defines covariance matrices associated with maximum-likelihood estimates and is (SΨ_r)^T(SΨ_r) in our case. A-optimal designs minimize the trace of the inverse of the Fisher information matrix, whereas E-optimal designs maximize the minimum eigenvalue of the information matrix. D-optimal designs [53] minimize the generalized variance of the parameter estimates by maximizing the determinant of the Fisher information matrix [54].
Optimal design for gappy estimation involves placing sensors at limited points in the domain to accurately reconstruct flow fields over the entire domain. In contrast to classical optimal design, in which each sensor can be used multiple times out of a set of candidate sensors, candidate sensors can only be used once in the gappy framework. In this setting, design of experiments aims to optimize the sensor selection S to optimize statistics of the estimation error a − â, an r-dimensional random variable with zero mean and covariance

Σ = E[(a − â)(a − â)^T] = β² ((SΨ_r)^T (SΨ_r))^{−1}.    (9)

The eigenvalues λ_i of this covariance matrix characterize the statistical and geometric measures of estimation error "size" [55], shown in Table 1. Generalized variance, defined by det(Σ), characterizes correlations among pairs of variables. When it is large, the variables have little correlation with each other; when it is small, the variables are strongly correlated. On the other hand, average variance, given by tr(Σ), is the sum of the population variances. A-optimal criteria minimize this average variance, while E-optimal criteria minimize the maximal variance of Σ. The variance, which measures the uncertainty in the estimated response, should be small for minimal deviation between estimated and true values [53].

Table 1: Statistical and geometric measures for error covariance [55]

Generalized variance    det(Σ) = Π_i λ_i    area, (hyper)volume
Average variance        tr(Σ) = Σ_i λ_i     linear sum
Maximal variance        λ_max               maximum dispersion
We consider D-optimal design for flow field reconstruction with information matrix (SΨ_r)^T(SΨ_r), which depends on the selected sensors S(γ). The determinant objective maximizes the information volume, given a budget of p sensors. The maximizing sensor set of this criterion is also the maximizer of its logarithm,

γ* = argmax_{γ : |γ|=p} log det((S(γ)Ψ_r)^T (S(γ)Ψ_r)).    (10)

When p = r, Equation 10 is equivalent to maximizing log |det(SΨ_r)|. Direct optimization of this criterion leads to a brute-force combinatorial search. This sensor placement approach builds upon the empirical interpolation method (EIM) [39] and the discrete empirical interpolation method (DEIM) [40], which select the best interpolation points for evaluating nonlinear terms in projection-based reduced-order models. However, these methods do not directly optimize statistics of the error or minimize error covariance. In the next section, we develop a greedy strategy for optimizing sensor selection under constraints, built upon the pivoted QR factorization [41]-[44], and analyze the reconstruction performance with respect to D-optimal design criteria.
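For reference, the D-optimality objective of Equation 10 can be evaluated for any candidate sensor set in a few lines; this is a sketch with illustrative names, using `slogdet` for numerical stability.

```python
import numpy as np

def d_optimal_logdet(Psi_r, sensor_idx):
    """Evaluate log det((S Psi_r)^T (S Psi_r)) for a candidate sensor set,
    given as row indices of the POD basis Psi_r."""
    Theta = Psi_r[sensor_idx, :]                 # Theta = S @ Psi_r
    sign, logdet = np.linalg.slogdet(Theta.T @ Theta)
    return logdet if sign > 0 else -np.inf       # -inf flags a rank-deficient choice

# A- and E-optimality variants act on the same Fisher information matrix M:
#   A: minimize np.trace(np.linalg.inv(M));  E: maximize np.linalg.eigvalsh(M)[0]
```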

Column-pivoted QR decomposition with spatial constraints
The QR factorization with column pivoting decomposes a matrix W ∈ R^{m×n} into a unitary matrix Q, an upper-triangular matrix R, and a column permutation matrix Π, such that WΠ = QR. As described above, each column index of Ψ_r^T corresponds to a single sensor location in the state space. We apply QR pivoting to the transpose of our basis, i.e., W = Ψ_r^T, and use the permutation matrix to store information about the selected sensors. The pivoted QR decomposition is a greedy algorithm for optimizing Equation 10 that, in each iteration, selects a new column pivot (sensor location) with maximal two-norm, then subtracts from every other column vector its orthogonal projection onto the pivot column [41], [42]. This projection is given by a Householder reflector that maps any vector ν to −sign(ν_1)σe_1, where σ = ∥ν∥_2. Householder projections effectively zero out the subdiagonal components of column vectors in each iteration to induce upper-triangular structure in W, constructing R in place. Householder reflectors can be written in the standard form I − 2uu^T, where u has unit norm. Consider the following partial QR factorization at the k-th iteration of the pivoting procedure:

WΠ = Q [ R_11^(k)  R_12^(k) ;  0  R_22^(k) ],

where R_11^(k) ∈ R^{k×k} is upper triangular and Π ∈ R^{n×n} is the permutation matrix containing information about the first k chosen sensors [56]-[58]. In unconstrained QR pivoting, the (k+1)-st iteration selects the column of the submatrix R_22^(k) with the maximal two-norm, then swaps the selected column with the (k+1)-st column while updating the permutation indices. Constraints are integrated within this step by forcing the pivot column index to be selected from the latest set of allowable indices based on the constraints under consideration. The (k+1)-st iteration in constrained optimization selects the pivot column with the largest two-norm, ∥r_22^(k)(i)∥_2, from the constrained/unconstrained set of allowable column indices in R_22^(k). The QR pivoting algorithm results in the following diagonal dominance structure in R:

R_ii² ≥ Σ_{k=i}^{j} R_kj²,  for 1 ≤ i ≤ j ≤ n,

so that the diagonal entries satisfy |R_11| ≥ |R_22| ≥ ··· ≥ |R_rr|. Constraints are imposed in the final r − s steps of pivoting, ensuring that the largest contributing terms in the objective function expansion,

log |det(SΨ_r)| = Σ_{i=1}^{r} log |R_ii|,

remain unaffected. The main driver of this optimization is the point at which constraints are introduced into the pivoting procedure: allowing upper triangularization to proceed normally in the initial iterations maximizes the leading diagonal entries of R, ensuring that domain-specific constraints do not drastically affect the diagonal dominance property, but only the trailing R_ii, which are optimized by choosing the best pivot from the allowable locations. The three types of spatial constraints handled by the algorithm (Figure 2) are:

Region constrained:
This type of constraint arises when we can place either at most s or exactly s sensors in a certain region, while the remaining r − s sensors must be placed outside the constraint region.
• Maximum: This case deals with applications in which the number of sensors in the constraint region should be less than or equal to s. In each iteration, a pivot column (sensor location) is chosen from the set of all columns until s selected pivots lie in the constrained region. Successive pivots with the largest two-norm are then selected from among the unconstrained column indices.
• Exact: This case deals with applications in which the number of sensors in the constraint region should equal s. The algorithm follows the same procedure as the maximum sensor placement case if the number of sensors in the constraint region equals or exceeds s. However, if fewer than s sensors fall in the constrained region, the algorithm forces the deficit of sensors to be placed in the constraint region at the end of the pivoting procedure.

Predetermined:
This type of constraint occurs when a certain number of sensor locations are already specified, and optimized locations for the remaining sensors are desired. The strategy employed selects pivots from among all column indices of R_22^(k) until the iterate k equals r − s, then imposes the selection of the user-specified sensor locations in the final s iterations.

Distance constrained:
This constraint enforces a minimum distance d between selected sensors. Accordingly, the first pivot is the column index of Ψ_r^T with maximal two-norm, the default (unconstrained) base step. The (k+1)-st iterate then selects the pivot column with maximal two-norm from among the remaining column indices of R_22^(k) that are at least distance d away from the previous k selected sensors. This is an adaptive constraint, because the set of allowable sensor indices is updated at each pivoting iteration to remove the d-neighborhood of the k-th sensor.
Although we mainly consider the minimal allowable number of sensors to be p = r, the truncation rank of the basis, additional sensors can be added for redundancy and robustness through the oversampling optimization proposed by Peherstorfer et al. [59]. In summary, the constrained pivoting loop copies W = Ψ_r^T into a working matrix R and, at each iteration k = 1, ..., r, updates the set of allowable pivot indices according to the active constraints (region, predetermined, or distance) before selecting the k-th sensor.
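A compact sketch of the constrained greedy pivoting loop follows. It implements the projection-and-deflation form of column-pivoted QR described above, with constraints supplied as a mask over the candidate locations; the function names and mask interface are illustrative, not the paper's actual implementation.

```python
import numpy as np

def constrained_qr_pivots(Psi_r, p, allowed=None):
    """Greedy constrained column-pivoted selection of p sensor locations.

    Psi_r   : (n, r) POD basis; columns of W = Psi_r.T are candidate sensors.
    allowed : optional callable (k, chosen) -> boolean mask of length n giving
              permissible locations at iteration k (encodes region,
              predetermined, or distance constraints).
    """
    W = Psi_r.T.copy()                          # r x n working matrix
    n = W.shape[1]
    chosen = []
    for k in range(p):
        mask = np.ones(n, dtype=bool) if allowed is None else allowed(k, chosen)
        mask[chosen] = False                    # never reuse a location
        norms = np.linalg.norm(W, axis=0)
        j = int(np.argmax(np.where(mask, norms, -np.inf)))  # best allowed pivot
        chosen.append(j)
        v = W[:, j] / (norms[j] + 1e-30)        # unit vector along the pivot
        W -= np.outer(v, v @ W)                 # deflate: project out pivot direction
    return np.array(chosen)

# Example adaptive constraint: a hypothetical minimum-distance rule for a grid
# whose candidate coordinates are stored in `coords` (n x 2).
def min_distance_rule(coords, d):
    def allowed(k, chosen):
        mask = np.ones(len(coords), dtype=bool)
        for j in chosen:
            mask &= np.linalg.norm(coords - coords[j], axis=1) >= d
        return mask
    return allowed
```

The deflation step plays the role of the Householder update: after each pivot, the remaining column norms equal the norms of the corresponding columns of R_22^(k).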

Uncertainty analysis
Under noisy measurements, errors in estimation are transmitted into reconstruction errors. Geometrically, estimation errors are characterized by (hyper-)ellipsoids in r dimensions whose axes describe these errors. Statistically, the confidence intervals for the estimation of states are characterized by the η-confidence ellipsoid that contains a − â with probability η,

E_η = { z : z^T Σ^{−1} z ≤ F^{−1}(η) },

where Σ is the covariance matrix in Equation 9 and F is the cumulative distribution function of a chi-squared random variable with r degrees of freedom. An important scalar measure of the quality of estimation is the volume of this ellipsoid,

vol(E_η) = (π^{r/2} / Γ(r/2 + 1)) (F^{−1}(η))^{r/2} det(Σ)^{1/2},

where Γ is the gamma function. D-optimal designs minimize the volume of the ellipsoid, which is inversely proportional to the determinant of our information matrix.
A small volume for the η-confidence ellipsoid implies a strong correlation between the estimation errors in each component. Under Gaussian noise assumptions, 3σ bounds computed from the diagonal entries Σ_ii of the covariance matrix measure the uncertainty in predicting the i-th component, establishing error bounds for reconstructing flows from noisy measurements.
In the following, for ease of notation, we use Θ = SΨ_r to denote the mode measurement matrix, so that Θ^T Θ represents the Fisher information. To quantify the uncertainty in each reconstructed component of the state under noisy measurements, we analyze the expected covariance of the full-state fluctuations. Klishin et al. [60] provide the following estimate of the expected state covariance for a regularized gappy estimator, which depends on the covariance of the measurement model:

E[∆x ∆x^T] = Ψ_r Θ^† E[∆y ∆y^T] (Θ^†)^T Ψ_r^T = β² Ψ_r (Θ^T Θ)^{−1} Ψ_r^T,

where E[∆y ∆y^T] = β² I for uncorrelated noise. The standard deviation in the reconstruction of each grid point can be calculated from the diagonal entries of this matrix, providing an uncertainty heatmap of the whole reconstructed domain for a given sensor configuration. Furthermore, model recalibration for digital twinning can be informed by analyzing the distribution of each component of the estimated coefficients â, even when the true coefficients are unavailable. The predicted mean µ_i = E[â_i] is estimated by averaging over the training measurements,

µ ≈ (1/m) Σ_{j=1}^{m} Θ^† y_j,    (19)

where y_j are the noisy measurements with standard deviation β. Similarly, we can estimate the expected covariance in the components of â using the diagonal entries of

T = diag(β² (Θ^T Θ)^{−1}).    (21)

The diagonal part T of the covariance matrix can be used to calculate the standard deviation in the distribution of the estimated POD coefficients.
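The per-grid-point uncertainty heatmap described above can be computed directly from Θ and the noise level β; the following sketch assumes the unregularized gappy estimator.

```python
import numpy as np

def reconstruction_sigma(Psi_r, sensor_idx, beta):
    """Per-component standard deviation of the reconstructed state under
    i.i.d. measurement noise of standard deviation beta (unregularized
    gappy estimator assumed)."""
    Theta = Psi_r[sensor_idx, :]                          # Theta = S Psi_r
    coeff_cov = beta**2 * np.linalg.inv(Theta.T @ Theta)  # covariance of a_hat
    var = np.einsum('ij,jk,ik->i', Psi_r, coeff_cov, Psi_r)  # diag(Psi cov Psi^T)
    return np.sqrt(var)            # sigma_i per grid point; plot 3*sigma as bounds
```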
Together, the predicted standard deviation and mean of the POD coefficients provide statistical metrics for measuring the uncertainty in the reconstruction of the flow field due to measurement noise when true readings are unavailable. In nuclear digital twins, these error bounds can be used to detect the divergence of sensor readings from expected values. Statistics of the error provide a means to "flag" or "signal" recalibration of the digital twin, detect anomalies, and classify erroneous readings.

Results
This section demonstrates the constrained and unconstrained sensor placement algorithm on a randomly generated state-space system, 2D heat diffusion through a thin plate, and the OPTI-TWIST prototype. For the randomized system, all possible sensor placements given a fixed budget of sensors are computed to demonstrate the near optimality of our approach. Next, we investigate the reconstruction of temperature fields in 2D heat diffusion with a constant heat source, a simplified model of the OPTI-TWIST heater. Uncertainty analysis is conducted on noisy measurements for varying signal-to-noise ratios (SNR). Using constrained optimized sensor placement, we reconstruct the flow field inside OPTI-TWIST with minimal error, in comparison to randomly selected sensor locations.

Discrete random state space
We first demonstrate the near optimality of constrained QR pivoting using a low-dimensional linear time-invariant (LTI) system. The dimensionality of this system is small enough to enumerate all possible placements and empirically compare the constrained QR placements with the optimal placements. Consider the discrete-time LTI system

x_{k+1} = A x_k + B u_k,
y_k = C x_k,

with randomly generated system A, measurement C, and actuation B matrices sampled i.i.d. from a normal distribution, n = 25 states, and p = q = 25 randomized measurements and actuators, respectively; x, u, and y represent the state, input, and output vectors. We empirically studied the optimality of our proposed algorithm by computing all possible r = 7 sensor selections, leading to a brute-force search over C(n, r) = 480,700 choices of S. The log determinant of SΨ_r was evaluated for all possible combinations of the seven sensors, then binned into the histogram shown in Figure 3a. This computation is only tractable for a small state dimension: even for n = 100, the brute-force search has O(10^10) complexity. The optimization outcome (determined via Equation 10) for sensors selected using the QR pivoting approach is represented by the solid black line in Figure 3a. This sensor selection is observed to be nearly D-optimal, exceeding 99.74% of all candidate placements. We studied region-constrained pivoting by allowing only s ≤ 2 sensors to be selected from the first s_c = 5 components of the state (the constraint region). A brute-force search across all possible combinations of s sensors in the constraint region (and r − s elsewhere) was carried out, resulting in 155,040 possible combinations of the seven sensors, binned in Figure 3b. Figure 3b compares our first strategy (placing exactly s = 2 sensors in the constraint region in the first two iterations of pivoting; dotted line) with another strategy in which a maximum of two sensors is allowed in the constraint region throughout the pivoting procedure (dashed line). Our exact approach has a log determinant exceeding 99.78% of all possible region-constrained sensor placement options, while the max approach exceeds 99.87% of all possible combinations. Thus, both approaches provide a near-optimal subset of region-constrained sensors for reconstruction.
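The brute-force comparison can be reproduced, for small n, with an exhaustive enumeration such as the following sketch (function names are illustrative):

```python
import numpy as np
from itertools import combinations

def logdet_percentile(Psi_r, qr_idx, r=7):
    """Fraction of all C(n, r) sensor sets whose log-det objective does not
    exceed that of the QR-selected set; tractable only for small n
    (here C(25, 7) = 480,700)."""
    n = Psi_r.shape[0]
    def logdet(idx):
        Theta = Psi_r[list(idx), :]
        sign, ld = np.linalg.slogdet(Theta.T @ Theta)
        return ld if sign > 0 else -np.inf
    qr_val = logdet(qr_idx)
    return float(np.mean([logdet(c) <= qr_val
                          for c in combinations(range(n), r)]))
```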
In predetermined sensor placement, a specified number s of sensors is selected in advance, leaving the algorithm to optimize those that remain. Our strategy runs unconstrained QR pivoting to select the first r − s pivots (sensors), then selects the predetermined sensors in the remaining s pivoting iterations. The log determinant objective evaluated for our optimization strategy (dashed line) is compared against a brute-force search across all candidate placements containing the two predetermined sensors in Figure 3c. Our strategy outperforms 99.99% of all possible placements, exhibiting near-optimal solutions.

Figure 4: Reconstruction error comparison. Proximity between the brute-force optimum and QR-selected sensors for unconstrained, region-constrained, and predetermined sensor placement leads to orders-of-magnitude lower reconstruction error (ϵ ∼ O(10^−15)) compared to random placements. Incorporating constraints results in accuracy comparable to that of the optimal placement (red stars).

These results show that the introduction of constraints results in minimal distance to the true optimum. We analyze the negligible effect of this distance on the relative reconstruction error, ϵ = ‖x − x̂‖_2/‖x‖_2. We compared the reconstruction achieved via each set of constrained QR sensors with sensor placements sampled from the mean of the log det distributions (these represent the most likely sub-optimal sets to be chosen at random). The reconstruction error for each of these randomly placed sensors was then calculated (see the blue violin plots in Figure 4, where the green square and circle reflect the reconstruction error of the QR-selected sensor locations for the different constraint cases). Random sensor placements, whose sub-optimal log det objectives fall between 31 and 32, result in highly inaccurate relative reconstruction errors (ϵ) between 30 and 350. The QR-optimized strategy for unconstrained/constrained sensor placement results in significantly lower reconstruction error, ϵ ∼ O(10^−15). We conclude that the proximity between the brute-force optimum and our greedy placements produces negligible loss in reconstruction performance. Vastly suboptimal random placements illustrate the inverse relationship between the objective function and performance: the lower the log determinant, the higher the reconstruction error. However, this system was randomly generated, and its dynamics do not evolve according to localized features in state space that allow one sensor to capture information about spatially correlated states. Next, we demonstrate the algorithm on a heat diffusion model that allows for spatial and physical interpretation of the resulting sensors.

2D heat flow through a thin plate
Temperature fields are reconstructed via constrained/unconstrained sensor placement in a 2D plate undergoing thermal diffusion from a heat source, based on a simplified model of the OPTI-TWIST diffusion. Temperatures at the boundaries are fixed at 100 °C at x = 0 and 36 °C elsewhere on ∂D. The initial temperature throughout the domain D at t = 0 is also 36 °C. Heat transfer from the heat source over the domain is governed by the heat equation

∂u/∂t = α (∂²u/∂x² + ∂²u/∂y²),

where u is the desired temperature and α = 2 mm²/s is the thermal diffusivity constant. The solution of the partial differential equation is simulated for 1000 time steps using finite differences with ∆x = ∆y = 1 and ∆t = 0.125. We reconstruct the temperature fields and analyze the per-pixel reconstruction uncertainty due to adding i.i.d. Gaussian noise η ∼ N(0, 0.01) to the measurements for the unconstrained, constrained (Figure 5), and predetermined optimized sensor placements (Figure 6). A total of r = 10 sensors is selected for reconstruction, and s = 2 sensors are allowed in the constrained region or are predetermined. Distance constraints impose a Euclidean distance of at least 2 between selected sensors. As in the nuclear OPTI-TWIST example (subsection 4.3), optimized placements favor sensors near the heat source. Constraining sensors away from the heat source results in higher reconstruction errors and uncertainty than unconstrained optimization. In this example, both region-constrained max and exact (s = 2 within x < 10) optimization result in identical sensor placements (Figure 5c), with only two sensors near the heat source. This results in higher error because of the high-energy modal contributions adjacent to the heater (Figure 7c). Removing heater-adjacent sensors results in higher uncertainty, of approximately 10 °C, near the heater (Figure 5f). The distance-constrained sensor placements, which also placed six sensors near the heat source, result in the lowest reconstruction error under constraints (Figure 5d). When predetermined sensor locations are close to the unconstrained optimal locations, as in Figure 6a, the reconstruction errors remain low; however, fixing predetermined sensor locations away from this diffusion wavefront results in a noticeable increase in reconstruction uncertainty in the right half of the domain (Figure 6d), strengthening the case for data-driven placement strategies. The optimized sensors reflect the highest energy amplification in the POD modes, which occurs near the heat source (Figure 7c). The leading POD modes capture this diffusion of heat from the heat source boundary to the rest of the domain (Figure 7c). Approximately 96% of the cumulative energy is captured by the leading two POD modes, while the remaining modes capture only 4%.
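A minimal explicit finite-difference sketch of this setup is shown below; the 50×50 grid size is an assumption (the paper does not state it), and the chosen ∆t sits exactly at the explicit stability limit α∆t(1/∆x² + 1/∆y²) = 1/2.

```python
import numpy as np

def heat_snapshots(nx=50, ny=50, alpha=2.0, dt=0.125, steps=1000):
    """Forward-Euler finite differences for u_t = alpha*(u_xx + u_yy),
    dx = dy = 1, with a 100 C wall at x = 0 and 36 C elsewhere."""
    u = np.full((nx, ny), 36.0)                  # initial condition: 36 C
    u[0, :] = 100.0                              # Dirichlet: hot wall at x = 0
    snaps = []
    for _ in range(steps):
        lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0)
               + np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4.0 * u)
        u = u + alpha * dt * lap                 # explicit update
        u[0, :] = 100.0                          # re-impose all boundaries
        u[-1, :] = 36.0
        u[:, 0] = 36.0
        u[:, -1] = 36.0
        snaps.append(u.ravel().copy())
    return np.column_stack(snaps)                # n x m snapshot matrix X for POD
```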
Gappy POD was used to estimate the rank-10 and rank-20 model coefficients from noisy measurements (a test dataset of 500 snapshots), and the estimation errors were compared with our predicted uncertainty analysis to test the descriptive capability of the different rank truncations. The standard deviation σ_i computed from the diagonal entry Σ_ii of the covariance matrix measures the uncertainty in predicting the i-th component. Approximately 498.5 out of 500 reconstruction errors should lie within 3σ_i of the mean. As more modes are included in the POD approximation, the selected sensors capture more information about the underlying physics of heat diffusion. As seen in Figure 7, the rank-10 POD model fails to capture the underlying physics after t = 100, whereas the rank-20 model is descriptive of the dynamics over a longer time interval, t = 500. With more modes and sensors, the error covariance in each component narrows and the 3σ bounds become tighter. When clean measurements or ground-truth coefficients are unavailable, bounds on the distributions of the estimated â are useful to inform recalibration of digital twins. The uncertainty estimation for any POD coefficient, for example â_0, can be bounded using the predicted standard deviation or 3σ (Equation 21, shown in green in Figure 7d) and mean (Equation 19, shown in red in Figure 7d). As the SNR decreases, the dynamics of heat diffusion are corrupted by noise, resulting in wider distributions of the estimated state. Note that the predicted uncertainty bounds are more accurate at higher noise levels, because the numerical rank approximation error otherwise overwhelms the low noise contribution to the error. In other words, the uncertainty analysis is more accurate under larger ratios of measurement noise to model error.
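The 3σ coverage claim can be checked empirically with a one-line diagnostic; in this sketch, `errors` holds the per-snapshot estimation errors for one coefficient and `sigma` its predicted standard deviation.

```python
import numpy as np

def coverage_fraction(errors, sigma):
    """Fraction of estimation errors inside the predicted 3-sigma band;
    well-calibrated Gaussian bounds give about 0.997 (498.5 of 500)."""
    return float(np.mean(np.abs(errors) <= 3.0 * sigma))
```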
In summary, when the uncertainty in sensor measurements is known, this framework enables estimation of the expected error distribution as a function of measurement noise, as well as study of the growth in error as the sensor noise increases. When sensor uncertainty is unknown, filtering and Bayesian inference techniques may be used for uncertainty quantification with these linear embeddings. The developed algorithm handles reconstruction of flow fields in the presence of constraints and noise with high accuracy. Uncertainty analysis of predicted states plays a key role in detecting erroneous readings in digital twins. In the next example, we reconstruct the temperature field of a gravity-driven, advection-dominated physical system, the OPTI-TWIST prototype.

Steady-state simulation of the OPTI-TWIST prototype
The next example follows the new design paradigm suggested by digital twins.In traditional design practice, modeling and simulation insights are often leveraged at the experimental design stage in order to build physical models and place sensors.However, limitations regarding space, installation, cost, and signal fidelity of the experimental device pose challenges in deploying the desired number of sensors.Our holistic approach integrates experimental constraints, Computational Fluid Dynamics (CFD) simulations, and optimization objectives (reconstruction) in a principled way to optimize the placement of sensors in the design phase of the digital twin.
Here, our sensor placement optimization is demonstrated on the OPTI-TWIST prototype, which is electrically heated to mimic the neutronics effect of TWIST prior to insertion into a reactor at INL. Temperature is the field of interest, and point thermocouples will be used as the sensors. The OPTI-TWIST prototype was designed to simulate the thermal-hydraulic behavior of TWIST during irradiation in the reactor, as well as to measure the effect of loss of coolant on the fuel rodlet temperature. In OPTI-TWIST, the fuel-rod specimen is replaced by an instrumented electric cartridge heater, and loss of coolant is controlled by a quick-opening valve at the bottom of the capsule. To provide the temperature fields necessary to train the sparse sensing algorithm, a simplified 2D CFD model of the OPTI-TWIST geometry was developed using StarCCM+ (Figure 9) [61]. The CFD model accounts for steady-state turbulent natural circulation conditions, including two controlled parameters: heater power (q̇) and outer surface temperature (T_sur). These two controlled parameters were varied while keeping the initial temperature (T_0 = 300 K) and the system pressure (P_sys = 2250 psi) constant throughout. The data comprise 49 steady-state temperature fields resulting from seven heater powers and seven surface temperatures uniformly sampled over 350-650 W and 240-420 K, respectively. The convergence criterion was the maximum liquid temperature, which showed negligible fluctuations after 2000 time steps. First, our optimization is run on the steady-state temperature fields, resulting in the unconstrained optimal placement shown in Figure 8c. Unconstrained optimization selects three sensors near the heater (Figure 8c); however, space restrictions make these heater-adjacent locations experimentally infeasible. Enforcing sensor constraints to lie outside the heater region results in a reconstruction error of ϵ = 0.174, an increase of only 0.006. Figure 8b contrasts these optimized sensor reconstructions with ensembles of randomly placed sensors. Observe that ϵ_unconstrained < ϵ_constrained ≪ ϵ_random; i.e., placing sensors at random locations leads to significantly larger reconstruction errors (Table 2). Gaussian i.i.d. noise η ∼ N(0, 0.01) is added to the measurements to analyze the uncertainty heatmaps for the reconstruction of the true temperature profile. Removing sensors from heater-adjacent locations leads to an increase of approximately 0.5 K in the reconstruction uncertainty close to the heater (Figure 8g) compared to the uncertainty resulting from unconstrained sensor placement (Figure 8f). The reconstruction uncertainty is higher by 40-50 K throughout the domain when sensors are placed randomly (Figure 8e). Stratified contours of reconstruction errors occur where random sensors fail to accurately capture the transition between hot and cold regions (Figure 8b). Thus, randomly placed sensors fail to capture heater-correlated fluctuations and result in higher reconstruction errors and uncertainty.
The leading two POD modes, which carry approximately 99% of the energy content, capture the heat advection from the heat source. Thus, only five sensors, corresponding to the first five POD modes, are required to reconstruct the flow with high accuracy. The cumulative energy content, along with a visualization of the first three POD modes, is given in Figure 10a. These POD modes capture the interfaces between lower and higher temperatures as the advection flow progresses across different operating ranges. Sensors placed at random locations fail to capture the underlying physics of the system, and increasing the number of random sensors does not significantly improve the reconstruction of the flow field. An ensemble of sensors placed at random locations produces large relative reconstruction errors that average approximately 35 as the number of sensors is increased from 5 to 30, as seen in Figure 10b; random placement thus increases the reconstruction error by orders of magnitude relative to optimized placement. The random placement strategy corresponds to data-agnostic sensor placement at best-guess locations for reconstructing temperature fields.
Unconstrained optimization favors locating sensors close to the heat source, due to the richer dynamics there. Imposing sensor constraints within QR results in a near-optimal placement outside this region, with negligible loss of reconstruction performance. Moreover, the error decreases with more optimized sensors (unconstrained or constrained); random placements, however, still suffer from high error even with additional sensors. Therefore, placing or adding sensors without optimizing them with regard to the underlying flow or deployment constraints can introduce large errors into the corresponding digital twins, especially in the presence of noisy sensor measurements, and may even cause sensor damage. Incorporating such considerations prior to setting up a physical experiment enables precise uncertainty quantification and helps validate the predictions of a digital twin.

Transient simulation of the OPTI-TWIST prototype
Analyzing the effect of power transients on reactor core coolant temperature, pressure, and velocity is essential for real-time safety monitoring and control of a nuclear reactor. Here, we optimize sensor placements to capture the dynamics of the heat flow when the heater power is varied as a transient. During transients, it is essential to capture the instant when sensor readings start diverging from predicted metrics in the presence of noise. This can signal the need for model recalibration and can prevent accidents caused by power surges at a nuclear facility. The data are obtained from the 2D CFD model described in subsection 4.3, which runs for 600 s, and comprise 1000 temperature fields sampled every 0.6 s. The heater transient power profile P(t) is a piecewise function of the parameters P_o = 10 W, P_2 = 250 W, t_1 = 200 s, t_2 = 2t_1, and t_3 = 3t_1. Similar to the steady-state temperature profile, the richer dynamics are located in heater-adjacent regions (Figure 12a). The unconstrained layout places a number of sensors near the heater (Figure 12c); however, due to spatial constraints, all sensors must be located away from the heater (Figure 12b). The increase in reconstruction error due to imposing these constraints is as low as 1%. The algorithm is trained on the first 500 time steps and reconstructs the temperature profiles at the last 500 time steps from optimized sensor readings with additive i.i.d. Gaussian noise η ∼ N(0, 0.9). We study the reconstruction uncertainty using the predicted distribution of the estimated coefficients from subsection 3.3. As shown in Figure 11, the rank-10 gappy POD model is insufficient to characterize the transient behavior in the test data, and reconstruction errors grow beyond the established bounds at t = 200 s, whereas the rank-20 model captures the dynamics effectively over the entire time horizon. When information regarding the true state is available, model recalibration can be signaled by the bounds established for the error distribution in the presence of increasing noise (Figure 11d). When true coefficients are unavailable, the bounds established for the distribution of the estimated POD coefficients â_i can be used to flag erroneous readings. This uncertainty estimation of the flow field during transients ensures safe operating conditions during experimentation and testing of nuclear reactors.

Discussion and Outlook
The reconstruction of reactor core flow fields using a limited budget of sensors has emerged as a critical enabler for digital twinning of nuclear assets. However, achieving optimal sensing in high-dimensional real-world systems extends beyond the nuclear industry and encompasses diverse fields such as biology, physics, aviation, and the automotive industry. Engineering systems are subject to different constraints and limitations on sensors, making the selection and optimal placement of sensors under these constraints a crucial aspect of algorithm development. In this study, we have developed strategies for placing sensors to satisfy constraints related to sensor proximity, regional limitations on sensor quantity, and design- or user-prescribed locations. Through these strategies, we have demonstrated the effectiveness of adaptive sensor placement in satisfying constraints while maintaining the optimality and accuracy of the reconstructed responses of interest. Moreover, we have showcased the scalability and broad applicability of the algorithms across a variety of applications and constraints.
Nevertheless, more complex constraints may arise in nuclear, fluid, or aerospace applications in which the capability to achieve flow reconstruction from sparse measurements is indispensable. For instance, in each reactor region the maximum number of allowable sensors is usually design-specific and cannot be exceeded. Moreover, an emerging practice during nuclear fuel tests is the use of distributed sensors (e.g., fiber-optic sensors or multipoint thermocouples). In a fiber Bragg grating (FBG), the refractive index changes along the sensor length and provides distributed measurements. Designing fiber-optic sensors, which act as line sensors with different measurements at each point, is a novel challenge, as line sensors can be topologically shaped along various structures in engineering systems. Optimizing such topologies is one future direction for adaptive sensor placement. Fiber-optic bundles are used for recreating high-quality images in both nuclear engineering and neuroscience; optimizing the locations of these bundles to capture the best-quality images is another interesting research direction.
Furthermore, it might be very costly to place sensors in certain areas of the reactor, due to the need for specially designed sensors capable of withstanding harsh working conditions; other areas may entail spatial constraints. Thus, multi-objective optimization over sensor cost, spatial locations, and predicted dynamics will be essential. Time-dependent dynamics and the study of transients are invaluable in the nuclear field. Sensor placement based on time-dependent data from OPTI-TWIST, together with dynamic mode decomposition or nonlinear embeddings such as autoencoders, can help generalize to new physics scenarios, and will require new uncertainty estimates that can handle the bias inherent to these types of models.
The ultimate goal is to extend the algorithms to inform users of the optimal locations and time steps at which to collect spatiotemporal sparse measurements for reconstructing core flow profiles during power transients. This should naturally evolve into the capability of performing optimal sensor placement for multi-class classification, where the algorithm must select a sensor network capable of predicting, faster than real time, which accident scenario is most likely to occur. Examples of such accident scenarios include loss-of-coolant accidents (LOCA), reactivity-initiated accidents (RIA), and loss of power. These scenarios can be easily realized in a non-destructive fashion within the TWIST prototype by opening a valve or suddenly shutting off the heater power. Data-driven sensing frameworks have the potential to identify sensor maps capable of detecting off-normal conditions, anomalies, and injected signals, enhancing the resilience and security of the physical twin against cyber-attacks.

Figure 2 :
Figure 2: Greedy selection of the next sensor involves choosing the next pivot column of Ψ_r^T from the set of allowable sensor locations specified by the constraint.


Figure 3 :
Figure 3: Enumeration of log det((SΨ_r)^T(SΨ_r)) (x-axis) over all possible placements of 7 out of 25 candidate locations (100,000-500,000 possible placements binned into histograms). The introduction of constraints into QR optimization results in a log determinant that is near optimal (optimum shown in red) for all three types of constraints.

Figure 5 :
Figure 5: Reconstruction of the temperature field from selected sensors, along with the reconstruction uncertainty. Uncertainty heatmaps (e,f,g) correspond to placements/reconstructions (b,c,d), respectively. Reconstructions of the temperature field at t = 1000 under the different constraints demonstrate that constraining sensors far from the heater region results in higher reconstruction error (c,d) and higher uncertainty (f,g) than unconstrained optimization (b,e), which favors sensors adjacent to the heat source.

Figure 6 :
Figure 6: Informed sensor placement lowers reconstruction uncertainty. Two different predetermined sensor layouts (a,b; shown in green) may lead to similar reconstruction errors (a,b) but increased reconstruction uncertainty (d) when sensors are distant from the QR-optimal locations (c).

(a) Rank 10 model. (b) Rank 20 model. (c) Leading POD modes by energy contribution. (d) Increasing uncertainty in the estimation of â_k with decreasing SNR.

Figure 7 :
Figure 7: Analysis of estimation uncertainty due to noisy sensor data. Uncertainty estimation reveals that rank-10 models are not sufficiently descriptive of the dynamics after t = 100 s (a) under noisy measurements. A higher-rank, rank-20 approximation is required (b), despite the leading 10 POD modes capturing 99% of the energy (c). Predicted statistics of the estimated coefficients, such as the 3σ standard deviation (green) and mean (red), effectively bound the estimated â_0 under increasing noise (d).

Figure 8 :
Figure 8: OPTI-TWIST temperature profile reconstructions for different sensor layouts, and estimation uncertainty caused by noisy sensor measurements. Unconstrained optimization places sensors near the heater region (c), resulting in highly accurate reconstruction with ϵ = 0.168 (a); constrained optimized sensors result in comparably high accuracy, ϵ = 0.174 (d). Random sensor placement (b) results in inaccurate reconstructions (ϵ = 25.24) and large estimation uncertainty (e) compared to that of optimized sensor locations (f,g).

Figure 9 :
Figure 9: Geometry and mesh of OPTI-TWIST. The axial symmetry of OPTI-TWIST is exploited by simulating only half of the domain, as the cartridge heater is placed at the center of the capsule. The geometry and mesh reveal richer dynamics near the heater region.
(a) Leading POD modes and energy contribution.

Figure 10 :
Figure 10: An informed trade-off between reconstruction accuracy and the number of sensors is possible for QR-optimized sensors. The leading POD modes capture 99% of the energy content, and just 5 sensors are enough to obtain an accurate reconstruction with ϵ ∼ O(10^−1) (a). QR-selected sensor accuracy increases with the number of sensors, whereas random placements produce orders-of-magnitude larger relative reconstruction errors (b).

(a) Rank 10 model. (b) Rank 20 model. (c) Increasing uncertainty in the estimation of â_k with decreasing SNR. (d) Increase in estimation errors with decreasing SNR.

Figure 11 :
Figure 11: Estimation error analysis for OPTI-TWIST heater transients. Uncertainty estimation reveals that a rank-10 model is not sufficiently descriptive of the dynamics after t = 300 s under sensor noise (a). The rank-20 approximation (b) is valid over the longer 500 s horizon of the test data. As the SNR decreases, the POD coefficient variance increases (c), which propagates into increased uncertainty in the estimation errors (d).

Figure 12 :
Figure 12: Sensor placements superimposed on reconstructions for heater power transients. Heater-adjacent temperature fluctuations result in sensors optimized close to the heater (c) and a correspondingly low reconstruction error, ϵ = 1.026. When constraints are imposed, sensors are placed outside the green constraint region (b), resulting in a higher reconstruction error, ϵ = 2.042.

Table 2 :
Summary of the relative reconstruction error (ϵ) and the optimization criterion (log |det(SΨ_r)|) for the sensor placements shown in Figure 8.