Bayesian Sensor Calibration of a CMOS-Integrated Hall Sensor Against Thermomechanical Cross-Sensitivities

For the first time, Bayesian sensor calibration is used to identify efficient calibration procedures for a sensor cross-sensitive to two parasitic influences. The object under study is a thermomechanically cross-sensitive sensor system for determining the magnetic induction B. The packaged system comprises a Hall sensor, a stress sensor, and a temperature sensor. The three sensor signals are combined in a polynomial sensor response model with 11 parameters to determine B compensated for offset and cross-sensitivities. For the calibration, sensors are exposed to mechanical stress values between 0 and −68 MPa, temperatures between −40 and 100 °C, and B values between −25 and 25 mT. A sample of 35 sensors serves to extract the prior model parameter distribution of their fabrication run. Bayesian experimental design is applied to identify sets of 2–8 optimal calibration conditions under I-optimality and G-optimality. Bayesian inference then yields the posterior model parameter distribution of any uncalibrated sensor from the same run. Any such sensor is thereby turned into a B measuring device with individually quantified accuracy. The method was successfully applied to 15 validation sensors. In the case of I-optimality, the median root-mean-square (rms) values of the ±1σ confidence intervals for the extracted B values were found to be 113–71 µT after near-I-optimal calibrations based on 2–8 measurements. Over the entire range of temperature and mechanical stress and for applied |B| ≤ 25 mT, the corresponding experimentally determined medians of the rms deviations between predicted and applied B values were found to be 89–71 µT. Analogous observations apply to G-optimality. In short, Bayesian calibration made it possible to obtain functional B sensors of known accuracy with significantly fewer calibration measurements than model parameters. This was enabled by prior knowledge collected by the thorough characterization of 35 prior-generating specimens.


I. INTRODUCTION
UNWANTED, yet often unavoidable, parasitic influences affect the operation of many sensors. Uncompensated, such cross-sensitivities cause systematic errors in the values inferred from a sensor system's output signals and thus diminish its accuracy. Temperature is well known to modulate the response of virtually every sensor. It has been shown to impair the stability of Hall sensors [1], [2], [3], [4], pressure sensors [5], [6], [7], and mechanical stress sensors [8], [9], [10]. In fact, many sensors exhibit more than one cross-sensitivity. For example, semiconductor Hall sensors are affected not only by temperature but also by mechanical stress [11]. Volatile organic compound (VOC) sensors used as air quality sensors lack selectivity to individual air compounds [12]. Electronic tongues are cross-sensitive to various components in solutions [13]. Electronic noses have exhibited similar selectivity issues, for instance in classifying water, methanol, and ethanol vapors [14]. Likewise, cross-sensitivities to other gases, such as C₂H₅OH, SO₂, and NO, have been suspected for an ozone-humidity-temperature sensor system with a sensitive WO₃ film [15], as found in [16] for NO₂ sensors relying on this sensitive material. Among devices for physical measurands, mechanical sensors often possess cross-sensitivities to other mechanical constraints. A six-degree-of-freedom force-moment transducer, for example, has exhibited for each load component cross-sensitivities to the other five components in addition to temperature [17], [18], [19]. Similarly, inertial sensors, such as gyroscopes, require compensations against stress, temperature, and the quadrature error [20], [21], [22].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
All these examples highlight the importance of calibration for guaranteeing the cross-sensitivity-free and, thus, accurate operation of sensor systems. When calibrating sensors, well-chosen calibration conditions are applied to each individual device. The outcome of the measurements performed under these conditions makes it possible to model the individual relationship between measurand and output signals. Understandably, it is of interest to keep the number of calibration conditions low [23], [24]. This is particularly true in the context of large production volumes, where calibration is known to be time-consuming and resource-intensive [25], [26], [27], [28], [29], causing up to 50% of overall sensor costs [29]. However, a minimized calibration procedure jeopardizes one's ability to guarantee a specified high sensor accuracy.
The present study takes up the challenge of calibrating a sensor system possessing two cross-sensitivities by applying the method of Bayesian sensor calibration [4]. This method was formulated in general terms in [4] and demonstrated there on a simple model case with a single, thermal cross-sensitivity. Here, the demonstration of its usefulness is expanded to a more demanding case. The object under study is a Hall-stress-temperature sensor system sensitive to thermal drift and mechanical loads, fabricated in complementary metal-oxide-semiconductor (CMOS) technology. Sensor elements for the magnetic field, temperature, and mechanical stress are cointegrated with elaborate analog-to-digital circuitry [30].
Without any doubt, CMOS Hall sensors have already reached a high level of development. 1-D Hall sensors measure the out-of-plane [11] or an in-plane component of the magnetic induction B [11], [31], [32], [33], [34], [35]. 2-D Hall sensors serve to determine angular information by measuring the in-plane components of B [34], [36], while 3-D Hall sensors give access to all three components. Such 3-D Hall sensors have, for example, combined vertical and horizontal Hall plates [37], [38] and were realized as isotropic monolithic devices [39], [40].
Nevertheless, as can be understood from semiconductor transport theory, such B sensors are affected by temperature variations and mechanical stress. Temperature T acts via the temperature-dependent Hall mobility of the charge carriers [11], [41]. The cross-sensitivity to mechanical stress has its origin in piezoresistance and the piezo-Hall effect [11], [42], [43], [44]. In the former, pseudo-Hall signals are caused by shear components of the mechanical stress tensor. In the latter, the magnetic sensitivity of planar Hall plates is affected, e.g., by the sum $s = \sigma_{xx} + \sigma_{yy}$ of the in-plane normal components $\sigma_{xx}$ and $\sigma_{yy}$ of the mechanical stress tensor [45], [46]. Mechanical stresses responsible for both effects are caused by thermomechanical properties of the heterogeneous sensor assemblies [47], [48], [49], [50], [51] and by the swelling of the packaging materials when exposed to moisture [52], [53], [54].
Several approaches have aimed to cure these disturbances at the levels of device operation and system architecture. Integration of temperature sensors has made it possible to effectively compensate thermal output signal drifts [2], [54]. The current spinning technique yields an averaged output signal largely cleared of contributions caused by the shear piezoresistance effect [55], [56], [57], [58], [59], [60], [61]. The cointegration of temperature and stress sensors together with a Hall sensor has enabled the analog compensation of the cross-sensitivities [3], [62], [63]. Alternatively, digital processing of the sensor signals has also yielded Hall sensor signals largely cleared of the parasitic contributions [3], [45], [54], [64], [65]. As stated in [45], accuracies better than 1% for the temperature range required by automotive applications can be achieved only by compensating the long-term drift of the Hall sensitivity associated with mechanical stress. It is noteworthy, however, that none of these previous studies systematically addressed the efficiency of the required calibration procedures or its relation to the predictive accuracy of the compensated sensors.
On a different track, data science concepts, such as machine learning (ML), have recently gained popularity in the field of sensor calibration. Therefore, they give reason to hope that the open questions may be addressed by numerically based methods. Among the most widespread ML methods in sensor calibration are artificial neural networks (ANNs) in the form of multilayer perceptrons (MLPs) [66], [67], [68], convolutional neural networks (CNNs) [69], [70], and fuzzy neural networks (FNNs) [71]. Other approaches have relied on random forests (RFs) [67], [72], [73], Gaussian process regression (GPR) [73], [74], [75], and Bayesian neural networks [76]. These methods have been applied for temperature compensation [66], [67], temporal drift compensation of field-effect transistor sensors [70], and compensating commercial water quality sensors in order to extend the calibration lifetime [69]. ANNs are much appreciated for their effectiveness in classification tasks [77], [78], [79]. An advantage of ML approaches is their ability to handle unknown, potentially complex input-output relations. This often comes at the cost of an intense training effort and the need to determine the predictive accuracy by additional validation data [66], [67], [71] or by cumbersome numerical sampling [76].
Since in the present case, the relationship between the three input quantities (B, s, T) of the sensor system under study and its three output signals is well described by low-order polynomials [80], we opt for the Bayesian approach that has already proven effective in cases with a single, thermal cross-sensitivity [4], [6]. In [4], a Hall-temperature sensor system was calibrated for the range between −30 and 150 °C. This was accomplished by applying between one and three thermal calibration conditions, implying fewer measurements than the seven parameters contained in the sensor model. The root-mean-square (rms) accuracies after these modest calibration procedures were 78, 41, and 34 µT, respectively.
The prerequisite for the effectiveness of the method was the availability of prior information gained from a set of thoroughly characterized sensors termed prior-generating specimens. From their estimated sensor model parameters, one infers the sensor model parameter distribution of the ensemble in the form of a prior mean and a prior covariance matrix. This prior information is combined with the limited evidence provided by the calibration of other sensors about their individual responses. Updated model parameters and an updated covariance matrix of each individual sensor are thereby obtained.
Section II recalls the central mathematical elements of Bayesian sensor calibration. In Section III, we describe the mentioned Hall-stress-temperature multisensor system and the experimental infrastructure and procedures to calibrate it. The results in Section IV demonstrate that these sensors can be effectively compensated against mechanical stress and temperature variations with five or fewer thermomechanical calibration conditions while offering a high sensor accuracy. Then, Section V discusses the results and is followed by the conclusions.

II. BAYESIAN CALIBRATION
Since this work applies the method described in detail in [4], here, we only summarize the most important definitions and results.
We use the term sensor system as a synonym for the CMOS-integrated, packaged Hall-stress-temperature sensor system providing the application and test case of this article. An individual sensor system is considered as a specimen of a sensor ensemble with statistically distributed properties. A specimen or a group of specimens, therefore, constitute samples of the ensemble. We reserve bold symbols for vectors and matrices and roman symbols for scalar values.
In the present case, the magnetic induction B is the measurand of interest. It plays the role of the dependent variable. The output signals of the Hall sensor and the stress and temperature sensing elements of the sensor system are summarized as $v = (V_H, V_S, V_T)$. The components of v provide the independent variables of the Bayesian analysis. The v-dependent measurand B is modeled by $\varphi(v)^\top w$ using a set of basis functions $\varphi(v) = (\varphi_1(v), \ldots, \varphi_M(v))^\top$ and the corresponding model parameters $w = (w_1, \ldots, w_M)^\top$, where $(\cdot)^\top$ denotes the matrix transposition and M is the model dimension. In the present case, M = 11. The goal of optimal calibration is to find w such that B can be inferred from v with the highest possible accuracy. A crucial role in this process is played by the design matrix $\Phi(V)$ defined as [77, Ch. 3]

$$\Phi(V) = \left(\varphi(v_1), \ldots, \varphi(v_N)\right)^\top \tag{1}$$

for any list $V = (v_1, \ldots, v_N)$ of N sensor output signal vectors. The Bayesian sensor calibration method consists of three steps.
1) Prior information about the sensor parameter vectors w of a considered ensemble is gathered by thoroughly characterizing a group of Q so-called prior-generating specimens, which constitute a sample of the ensemble. In the present case, Q = 35. From these data, the so-called prior probability distribution $p_0(w)$ is constructed.
2) The second step consists of two substeps, namely, a Bayesian update and a Bayesian design of experiment.
a) The Bayesian update considers exposing any previously uncalibrated specimen to a small set of N calibration conditions, whereby its output signals V and the corresponding applied magnetic induction values $B = (B_1, \ldots, B_N)^\top$ are collected. From $p_0(w)$, V, and B, one obtains the updated probability distribution $p_1(w)$. This turns the previously uncalibrated specimen into a measuring device that translates any combination of its output signals v into a prediction of the applied B. This inference is achieved by the posterior predictive response $B_1(v, V, B)$, and its accuracy is expressed by the posterior predictive variance $\sigma_1^2(v, V)$.
b) The Bayesian design of experiment then aims to optimize the postcalibration measurement accuracy of previously uncalibrated specimens. It minimizes $\sigma_1^2(v, V)$ subject to a criterion of one's choice. For this purpose, a numerical search is carried out within the range of output signals arising from the range of expected operating conditions of the specimen. As a result, one identifies the combination of sensor output signals $V_{\min}$ achieving minimality.
3) In the third step, previously uncalibrated specimens of the ensemble are calibrated. Any such device is exposed to calibration conditions that are known to produce sensor outputs $V_{\text{cal}}$ near $V_{\min}$. Based on its individual calibration results, $V_{\text{cal}}$, the specimen is turned into a measuring device whose output signals v allow B to be inferred. This inference uses the posterior predictive response $B_1(v, V_{\text{cal}}, B)$, while the accuracy of the inference is quantified by $\sigma_1^2(v, V_{\text{cal}})$.

These steps are now presented in the mathematical detail required by the experimental study in Sections III and IV. First, for prior generation, each prior-generating specimen is exposed to 1110 characterization conditions. For each specimen numbered i = 1, …, Q, one thereby records sensor output signals $V_i = (v_{i,1}, \ldots, v_{i,1110})$ for the applied values $B = (B_1, \ldots, B_{1110})^\top$. Using standard linear regression, the model parameter vector $w_i$ of prior-generating specimen no. i is then obtained by

$$w_i = \left(\Phi(V_i)^\top \Phi(V_i)\right)^{-1} \Phi(V_i)^\top B \tag{2}$$

where the term in front of B denotes the well-known Moore-Penrose pseudoinverse [81] of the design matrix evaluated at $V_i$. From the distribution of the $w_i$ in w space, one derives the prior probability distribution $p_0(w)$ of the ensemble and approximates it by a multivariate normal distribution with mean

$$w_0 = \frac{1}{Q} \sum_{i=1}^{Q} w_i \tag{3}$$

and covariance matrix

$$\Sigma_0 = \frac{1}{Q-1} \sum_{i=1}^{Q} (w_i - w_0)(w_i - w_0)^\top. \tag{4}$$

With $w_0$ and $\Sigma_0$, one is able to infer B from an output signal v of any uncalibrated specimen of the ensemble. This B value is given by

$$B_0(v) = \varphi(v)^\top w_0 \tag{5}$$

while the accuracy of the prediction is quantified by the variance

$$\sigma_0^2(v) = \sigma^2 + \varphi(v)^\top \Sigma_0\, \varphi(v) \tag{6}$$

where $\sigma^2$ denotes the variance of the individual B measurement of thoroughly characterized specimens. The quantities $B_0(v)$ and $\sigma_0^2(v)$ are the mean and variance of the distribution of B values inferrable from output signals v of any uncalibrated specimen, given only the prior information. The prior predictive probability distribution of B is in fact the normal distribution defined by these parameters. This is a consequence of the assumed multivariate normality of $p_0(w)$ and of the Gaussian statistics of the individual measurement.
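The prior-generation step, i.e., the regression (2), the ensemble statistics (3) and (4), and the prior prediction (5) and (6), can be sketched in a few lines of Python. The sketch below is only illustrative: the six-term basis `phi`, the synthetic specimens, and all numerical values are assumptions made for the example and do not correspond to the 11-parameter model or to the measured data of this study. The sample covariance is computed with the unbiased normalization Q − 1.

```python
import numpy as np

def phi(v):
    """Hypothetical 6-term basis in (V_H, V_S, V_T); not the paper's 11-term model."""
    vh, vs, vt = v
    return np.array([1.0, vs, vt, vh, vh * vs, vh * vt])

def design_matrix(V):
    """Stack the basis vectors row by row, as in (1)."""
    return np.array([phi(v) for v in V])

def fit_specimen(V, B):
    """Least-squares parameters of one specimen, equivalent to the pseudoinverse in (2)."""
    w, *_ = np.linalg.lstsq(design_matrix(V), B, rcond=None)
    return w

def prior_from_fits(W):
    """Sample mean (3) and sample covariance (4) of the fitted parameter vectors."""
    return W.mean(axis=0), np.cov(W, rowvar=False, ddof=1)

def prior_predictive(v, w0, Sigma0, sigma):
    """Prior predictive mean (5) and variance (6) for an output signal v."""
    f = phi(v)
    return f @ w0, sigma**2 + f @ Sigma0 @ f

# Synthetic stand-in for the characterization data (all numbers are made up).
rng = np.random.default_rng(0)
Q, n_points = 35, 60
w_center = np.array([0.1, 0.02, -0.03, 1.0, -0.02, 0.15])
fits = []
for _ in range(Q):
    w_true = w_center + 0.01 * rng.standard_normal(6)
    vs = rng.uniform(-1.0, 3.0, n_points)
    vt = rng.uniform(-1.6, 1.8, n_points)
    B = rng.choice([-25.0, 0.0, 25.0], n_points)          # applied field in mT
    # Hall signal that reproduces B exactly under the toy model
    vh = (B - (w_true[0] + w_true[1] * vs + w_true[2] * vt)) / (
        w_true[3] + w_true[4] * vs + w_true[5] * vt)
    fits.append(fit_specimen(np.column_stack([vh, vs, vt]), B))
W = np.array(fits)

w0, Sigma0 = prior_from_fits(W)
B0, var0 = prior_predictive((20.0, 1.0, 0.5), w0, Sigma0, sigma=0.0577)
print("prior prediction (mT):", B0, "+/-", np.sqrt(var0))
```

Using `np.linalg.lstsq` instead of forming the pseudoinverse explicitly is a numerical convenience; both yield the same least-squares solution when the design matrix has full column rank.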
When evidence about a specimen's individual response becomes available in the form of calibration data $V = (v_1, \ldots, v_N)$ and $B = (B_1, \ldots, B_N)^\top$, Bayes' theorem makes it possible to determine the posterior probability distribution $p_1(w)$ valid for the specimen. This is again a multivariate normal distribution, with posterior mean [77, Sec. 3.3.1]

$$w_1(V, B) = \Sigma_1(V) \left(\Sigma_0^{-1} w_0 + \sigma^{-2}\, \Phi(V)^\top B\right) \tag{7}$$

and posterior covariance matrix

$$\Sigma_1(V) = \left(\Sigma_0^{-1} + \sigma^{-2}\, \Phi(V)^\top \Phi(V)\right)^{-1}. \tag{8}$$

Equations (7) and (8) describe the Bayesian updates of $w_0$ and $\Sigma_0$ mediated by the new evidence V and B complementing the evidence available from the prior characterization.
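As a minimal sketch of the update described by (7) and (8), the following Python function combines a Gaussian prior with N calibration measurements, assuming the standard Bayesian linear-regression form of [77, Sec. 3.3.1] with noise variance σ². The three-parameter toy prior, the design matrix, and the "true" specimen parameters are invented for illustration only.

```python
import numpy as np

def bayes_update(w0, Sigma0, Phi, B, sigma):
    """Posterior mean and covariance of w, following (7) and (8)."""
    Sigma0_inv = np.linalg.inv(Sigma0)
    Sigma1 = np.linalg.inv(Sigma0_inv + Phi.T @ Phi / sigma**2)
    w1 = Sigma1 @ (Sigma0_inv @ w0 + Phi.T @ B / sigma**2)
    return w1, Sigma1

# Toy example with M = 3 parameters and N = 2 calibration measurements.
w0 = np.array([0.0, 1.0, 0.1])                  # made-up prior mean
Sigma0 = np.diag([0.04, 0.01, 0.0025])          # made-up prior covariance
sigma = 0.05                                    # measurement standard deviation
w_true = np.array([0.15, 1.08, 0.06])           # hypothetical specimen
Phi = np.array([[1.0, -25.0, 2.0],              # design matrix of the two
                [1.0,  25.0, 2.0]])             # calibration measurements
B = Phi @ w_true                                # noise-free calibration readings

w1, Sigma1 = bayes_update(w0, Sigma0, Phi, B, sigma)
print("posterior mean:", w1)
print("prior std:     ", np.sqrt(np.diag(Sigma0)))
print("posterior std: ", np.sqrt(np.diag(Sigma1)))
```

Even though N = 2 is smaller than the number of parameters, the update is well defined because the prior covariance regularizes the problem; the parameter uncertainties can only shrink or stay equal.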
Similar to the prior case, $w_1(V, B)$ and $\Sigma_1(V)$ enable B to be inferred from output signals v of the specimen that has yielded (V, B) during calibration. This posterior predictive value is

$$B_1(v, V, B) = \varphi(v)^\top w_1(V, B) \tag{9}$$

and the corresponding posterior predictive variance is

$$\sigma_1^2(v, V) = \sigma^2 + \varphi(v)^\top \Sigma_1(V)\, \varphi(v). \tag{10}$$

The 68.3% confidence interval of the posterior prediction is the interval $B_1(v, V, B) \pm \sigma_1(v, V)$.

For the design of experiment, as in [4], we consider two optimization criteria, namely, G-optimality and I-optimality.
The G-optimality criterion [82] considers the objective function

$$f_G(V) = \max_{v \in \Omega} \sigma_1^2(v, V) \tag{11}$$

with optimum defined as

$$V_{\min} = \arg\min_V f_G(V) \tag{12}$$

where Ω denotes the range of output signals v arising from the expected operating conditions. In contrast, I-optimality [82], [83] relies on the objective function

$$f_I(V) = \int_\Omega \sigma_1^2(v, V)\, \mathrm{d}v \tag{13}$$

and identifies the optimum at

$$V_{\min} = \arg\min_V f_I(V). \tag{14}$$

Finally, the Bayesian sensor calibration is ready to be applied to uncalibrated specimens. In this work, such uncalibrated specimens serve to experimentally demonstrate the validity of the outlined method. They are therefore termed validation specimens. Since the optimal $V_{\min}$ lists sensor output signals rather than calibration conditions (B, s, and T), it can be challenging to perform the calibration under the optimal conditions. Nevertheless, by analyzing the responses of the prior-generating specimens, one is able to identify calibration conditions that, within the group of prior-generating specimens, have yielded output signals close to the optimum. When these near-optimal calibration conditions are applied to a validation specimen, they produce the specimen's individual near-optimal output signals $V_{\text{cal}}$. The specimen's individual $w_1(V_{\text{cal}}, B)$ and $\Sigma_1(V_{\text{cal}})$ are then obtained by updating $w_0$ and $\Sigma_0$ according to (7) and (8), respectively. Using (9) and (10), these make it possible to infer B from the validation specimen's measured output signals v using the posterior predictive response $B_1(v, V_{\text{cal}}, B)$ with accuracy quantified by $\sigma_1(v, V_{\text{cal}})$.
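To make the two design criteria concrete, the following sketch evaluates the posterior predictive variance on a grid of output signals and compares two candidate calibration designs. It is an illustration under stated assumptions: the six-term basis, the prior covariance, and the grid are invented, G-optimality is evaluated as the maximum over the grid, and I-optimality as the grid average, which approximates the integral up to a constant factor and therefore has the same minimizer.

```python
import numpy as np

def phi(v):
    """Hypothetical 6-term basis in (V_H, V_S, V_T); not the paper's model."""
    vh, vs, vt = v
    return np.array([1.0, vs, vt, vh, vh * vs, vh * vt])

def posterior_cov(Sigma0, V, sigma):
    """Posterior covariance after calibrating at the signal list V."""
    Phi = np.array([phi(v) for v in V])
    return np.linalg.inv(np.linalg.inv(Sigma0) + Phi.T @ Phi / sigma**2)

def predictive_var(points, Sigma, sigma):
    """Predictive variance sigma^2 + phi^T Sigma phi at each point."""
    F = np.array([phi(v) for v in points])
    return sigma**2 + np.einsum("ij,jk,ik->i", F, Sigma, F)

def f_G(V, Sigma0, sigma, domain):
    """G-criterion: worst-case posterior predictive variance over the domain."""
    return predictive_var(domain, posterior_cov(Sigma0, V, sigma), sigma).max()

def f_I(V, Sigma0, sigma, domain):
    """I-criterion: average posterior predictive variance over the domain."""
    return predictive_var(domain, posterior_cov(Sigma0, V, sigma), sigma).mean()

sigma = 0.0577                                            # mT
Sigma0 = np.diag([0.02, 0.01, 0.01, 1e-4, 1e-5, 1e-5])    # made-up prior covariance
domain = [(vh, vs, vt) for vh in (-25.0, 25.0)
          for vs in np.linspace(-1.0, 3.0, 9)
          for vt in np.linspace(-1.6, 1.8, 9)]

# Two candidate designs, each a pair of measurements at opposite Hall signals
designs = {"corner":   [(-25.0, -1.0, -1.6), (25.0, -1.0, -1.6)],
           "interior": [(-25.0, 2.0, 0.0), (25.0, 2.0, 0.0)]}
prior_max = predictive_var(domain, Sigma0, sigma).max()
scores = {name: (f_G(V, Sigma0, sigma, domain), f_I(V, Sigma0, sigma, domain))
          for name, V in designs.items()}
for name, (g, i) in scores.items():
    print(f"{name}: f_G = {g:.5f}, f_I = {i:.5f} (prior max = {prior_max:.5f})")
```

A design search then amounts to minimizing one of these functions over the admissible calibration signals.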

III. EXPERIMENT
The Bayesian sensor calibration methodology is now applied to the sensor system mentioned above. Details of the system are presented in Section III-A. The experimental setup and data acquisition are then described in Sections III-B and III-C, respectively.

A. Sensor System
The Hall sensor microsystem under study was fabricated in an industrial 180-nm CMOS technology. An optical micrograph of an unpackaged sensor chip in Fig. 1 highlights the Hall sensor, the mechanical stress sensor, and the temperature sensor. The Hall sensor consists of two interconnected Hall plates sensitive to the out-of-plane magnetic induction component B and providing the Hall voltage V H . The respective output signals V S and V T of the stress and temperature sensors reflect s and T near the Hall sensor. Further details about the system and its architecture are reported in [30].
The Hall voltage $V_H$ is described by

$$V_H = S_A(s, T)\, B + V_{\text{off}}(s, T) \tag{15}$$

where $S_A(s, T)$ and $V_{\text{off}}(s, T)$ denote the stress- and temperature-dependent absolute Hall sensitivity and the residual offset voltage at B = 0, respectively.

Fig. 1. Optical micrograph of an unpackaged CMOS sensor chip comprising a pair of n-doped silicon Hall plates, a temperature sensor based on n- and p-doped resistors [2], and a piezoresistive stress sensor (orange dashed line), realized by 16 n- and p-doped resistors placed around the two Hall plates [8]. The stress sensor is electrically connected in a Wheatstone bridge configuration. Output signals are conditioned and digitized by cointegrated analog and digital circuitry [30].
The purpose of the sensor system is to allow B to be inferred from the sensor signals $v = (V_H, V_S, V_T)$. By rearranging (15), this inference is achieved by

$$B = \frac{V_H}{S_A(V_S, V_T)} - B_{\text{off}}(V_S, V_T) \tag{16}$$

where $1/S_A$ and $B_{\text{off}} = V_{\text{off}}/S_A$ denote the inverse Hall sensitivity and the equivalent offset field, respectively. Note that the arguments of $S_A$ and $B_{\text{off}}$ are chosen to be the sensor signals $V_S$ and $V_T$ reflecting the parasitic influences s and T. The stress sensor is designed to be sensitive [8] to mechanical stress exerted on the Hall sensor by external constraints, e.g., compressive forces F acting on the sensor package [80], in addition to thermomechanical loads [48]. Values of s caused by perpendicular forces up to 20 N are expected to be about −68 MPa at 25 °C [3], [80]. Fig. 2 shows the measured values of $1/S_A$ and $B_{\text{off}}$ of a representative specimen as a function of $V_S$ and $V_T$. The data were obtained by varying the temperature from −40 to 100 °C and applying perpendicular compressive forces F to the sensor package between 0 and 20 N, as detailed in Section III-B. The sensor signals $V_H$, $V_S$, and $V_T$ were shifted to reduce their values at the reference conditions B = 0 mT, T = 30 °C, and F = 0 N to zero. In addition, they were rescaled.
Inspection of the data in Fig. 2 shows that in comparison with F = 0 N, the applied forces of 20 N reduce $S_A$ by between 3.6% and 4.5%, depending on temperature. Similarly, at F = 0 N, $S_A$ is reduced from the room temperature value by about 27% at 100 °C and increased by about 42.6% at −40 °C. It turns out that $1/S_A$ is well fit by a polynomial model of degree 4 in $V_S$ and $V_T$, while $B_{\text{off}}$ is well modeled by a second-degree polynomial. An appropriate polynomial model is selected in Section IV-A.
The present study covers 50 specimens assembled as pairs in 25 dual-die TSSOP-16 packages [4], [80]. The specimens are split into two groups: Q = 35 randomly selected specimens serve for the generation of the prior, and the remaining 15 specimens serve for validation.

B. Experimental Setup
Fig. 3 shows a schematic of the experimental setup with a close-up photograph of a sensor system soldered to a printed circuit board (PCB). An air streamer (Dragon Air Streamer, Froilabo, France) connected to the thermal chamber makes it possible to vary T of the specimen hosted by the chamber. An xyz-table serves to align the thermal chamber within a Helmholtz coil, which is used to apply B between −25 and 25 mT. The Helmholtz coil was calibrated using a Tesla meter (Gauss/Tesla Meter Series 8000, F.W. BELL, Milwaukie, OR, USA). The custom-built thermal chamber rests on a motorized test stand (Test Stand ESM303, Mark-10, Copiague, NY, USA). A movable, customized spring load system equipped with a reference force sensor (Mark-10, Series 5 Force Gauge, Copiague, NY, USA) exposes the specimen packages to perpendicular forces F by pushing a rod against their surface. The rod penetrates the thermal chamber through an opening in its ceiling. Under the applied forces, the output signals of the stress sensors cover a range of values comparable to that observed when specimens are exposed to other test procedures such as high-temperature operating lifetime (HTOL) testing. The setup is controlled by a LabVIEW routine.

C. Characterization
The 35 prior-generating specimens were exposed to the following characterization conditions.
1) T was varied from −40 to 100 °C in steps of nominally 10 °C.
2) At each T < 100 °C, F was varied from 0 to 20 N in steps of nominally 5 N, while for T = 100 °C, F was varied in the same steps up to 15 N.
3) At each T and F combination, B was set to −25, 0, and 25 mT.
4) At each condition, five successive readings of the three sensor signals were recorded.
Fig. 4 shows the measurement history of a representative specimen. The first three graphs show T, as measured by the temperature reference sensor, F as measured by the force reference sensor, and the applied B values. The last three graphs show the resulting output signals $V_H$, $V_S$, and $V_T$. Fig. 4(g) shows the range Ω in v space enveloping 97% of all 35 × 1110 data points of the prior-generating specimens.
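The figure of 1110 data points per specimen follows directly from the conditions listed above, as the short enumeration below confirms: 15 temperatures, five force levels (four at 100 °C), three B values, and five readings each.

```python
# Enumerate the characterization conditions described in the text.
temperatures = range(-40, 101, 10)          # deg C, nominal 10 deg C steps
forces = (0, 5, 10, 15, 20)                 # N, nominal 5 N steps
fields = (-25, 0, 25)                       # mT
readings_per_condition = 5

conditions = [(T, F, B)
              for T in temperatures
              for F in forces
              if not (T == 100 and F == 20)  # at 100 deg C, F only goes up to 15 N
              for B in fields]

n_loads = len({(T, F) for T, F, _ in conditions})
n_points = len(conditions) * readings_per_condition
print(n_loads, len(conditions), n_points)
```

The script reports 74 (T, F) load combinations, 222 (T, F, B) conditions, and 1110 recorded data points per specimen.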
For the Bayesian data analysis in Section IV, the output data of each prior-generating specimen are formatted as the list of independent variables $V_i = (v_{i,1}, \ldots, v_{i,1110})$ with i = 1, …, 35. The dependent variable vector of the Bayesian analysis common to all prior-generating specimens forms the column vector $B = (B_1, \ldots, B_{1110})^\top$.

IV. RESULTS
Section IV-A is dedicated to the selection of a polynomial model able to adequately describe the observed sensor responses. Thereafter, we closely follow the procedure laid out in Section II.

A. Model Selection
Model selection is guided by three requirements. First, the model needs to be linear in its parameters w. Second, its complexity described by M should be large enough to allow the experimental data to be accurately fit. Third, M should at the same time be as small as possible to avoid overfitting [84], [85], [86], [87]. With the expectation that response surfaces such as those in Fig. 2 lend themselves to Taylor series expansion, we focus on polynomial models in the variables $V_H$, $V_S$, and $V_T$ of increasing complexity. A selection of 15 such models is proposed in the Appendix. For each model, we proceed as follows. For the prior-generating specimens, we carry out the linear regression of their data $V_i$ with i = 1, …, 35, and B according to (2) and hence obtain the list of parameter vectors $w_i$. The rms deviation $\Delta B_i$ between the data B and the fit function of specimen no. i evaluated at $V_i$, i.e., $\Phi(V_i) w_i$, is then given by

$$\Delta B_i = \frac{|B - \Phi(V_i)\, w_i|}{\sqrt{1110}} \tag{17}$$

where |·| denotes the Euclidean norm. The results are compiled in Fig. 5. For each model, the 35 resulting rms deviations are summarized as a box plot. The box represents the interquartile range (IQR), with the median indicated by the solid line in the box; the whiskers embrace all data points lying within 1.5 IQRs below the first quartile and above the third quartile. Values beyond the whiskers are considered outliers and are plotted as diamond symbols. Overall, models nos. 9 and 15 are found to achieve the best fits, as highlighted also by the inset in Fig. 5. In model no. 9 [see (18)], the first four terms are independent of $V_H$ and are thus well-suited for modeling $B_{\text{off}}$, while the other seven terms are proportional to $V_H$ and thus aptly model $V_H/S_A$. The model dimension M = 11 sets the minimum number of calibration conditions required to determine w of a specimen without prior knowledge. Note that in Fig. 2, the fit surface for $1/S_A$ was obtained with the last seven basis functions in (18), whereas that of $B_{\text{off}}$ relied on the first four.
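The comparison of candidate models by their rms fit deviation can be sketched as follows. The two nested polynomial bases and the synthetic specimen are hypothetical and serve only to illustrate the procedure; they are not among the 15 models of the Appendix.

```python
import numpy as np

# Two hypothetical nested candidate models (not those of the Appendix).
def basis_small(vh, vs, vt):
    return np.column_stack([np.ones_like(vh), vh, vh * vt])

def basis_large(vh, vs, vt):
    return np.column_stack([np.ones_like(vh), vs, vt,
                            vh, vh * vs, vh * vt, vh * vt**2])

def rms_deviation(basis, vh, vs, vt, B):
    """Least-squares fit followed by the rms residual |B - Phi w| / sqrt(n)."""
    Phi = basis(vh, vs, vt)
    w, *_ = np.linalg.lstsq(Phi, B, rcond=None)
    return np.linalg.norm(B - Phi @ w) / np.sqrt(len(B))

# Synthetic specimen: made-up inverse sensitivity and offset field.
rng = np.random.default_rng(1)
n = 300
vs = rng.uniform(-1.0, 3.0, n)
vt = rng.uniform(-1.6, 1.8, n)
B = rng.choice([-25.0, 0.0, 25.0], n)                      # applied field in mT
inv_sens = 1.0 - 0.02 * vs + 0.15 * vt + 0.02 * vt**2      # toy 1/S_A
b_off = 0.1 + 0.02 * vs - 0.03 * vt                        # toy B_off
vh = (B + b_off) / inv_sens + 0.05 * rng.standard_normal(n)  # noisy Hall signal

rms_small = rms_deviation(basis_small, vh, vs, vt, B)
rms_large = rms_deviation(basis_large, vh, vs, vt, B)
print("rms deviation, small model:", rms_small)
print("rms deviation, large model:", rms_large)
```

Because the small basis is contained in the large one, the larger model can never fit worse on the same data; whether the improvement justifies the extra parameters is the overfitting trade-off named in the third requirement.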

B. Prior Generation
The model parameter vectors $w_i$ obtained with model no. 9 for the 35 prior-generating specimens constitute the database for determining $p_0(w)$ of the sensor ensemble, of which they constitute a sample. By applying (3) and (4), we compute the mean $w_0$ and the covariance matrix $\Sigma_0$. Fig. 6(a1) shows $\Sigma_0$ as a heat plot.
From $w_0$ and $\Sigma_0$, using (5) and (6), we next infer the prior predictive mean $B_0(v)$ as a function of $v = (V_H, V_S, V_T)$ and similarly the prior predictive standard deviation $\sigma_0(v)$. The value of σ required in (6) is taken to be 57.7 µT. This value is determined from the 35 × 1110 prior-generating specimen data as $\sigma = \left(\sum_i \Delta B_i^2 / 35\right)^{1/2}$. The prior predictive confidence range of the inferred $B_0(v)$ value, quantified by $\sigma_0(v)$, is plotted in Fig. 6(b1). In fact, the plot shows the values of $\sigma_0(v)$ on the top surface of Ω projected onto the $(V_S, V_T)$-plane. This choice is justified by the observation that, for given $V_S$ and $V_T$, $\sigma_0$ assumes its maximum in the $V_H$-direction either at B = 25 mT or B = −25 mT, i.e., on the top or bottom surfaces of Ω in Fig. 4(g), respectively. Furthermore, the two values of $\sigma_0$ on these two surfaces differ little due to the modest Hall sensor offset. The plot also shows the $(V_S, V_T)$ data provided by the characterization of a representative prior-generating specimen. The white dashed rectangle defines the extent of Ω in the $(V_S, V_T)$-plane. It embraces 97% of the characterization data acquired with the prior-generating specimens for 0 N ≤ F ≤ 20 N and −40 °C ≤ T ≤ 100 °C. It covers the ranges $-1 \leq V_S \leq 3$ and $-1.6 \leq V_T \leq 1.8$.

C. Calibration at Near-Optimal Stress-Temperature Conditions
The next goal is to identify combinations of N calibration conditions such that the corresponding sensor outputs of any uncalibrated specimen minimize its posterior predictive uncertainty ($\sigma_1^2$) with respect to the chosen optimality condition. The search for optimal conditions is carried out within the domain Ω. One thereby ensures that the search reasonably covers the range of operating conditions to which the specimen will later be exposed and which were consequently covered during the characterization of the prior-generating specimens.
We perform the calibration measurements exclusively with B = ±25 mT. This is justified by the fact that for given $(V_S, V_T)$, the relationship between B and $V_H$ is highly linear and thus well determined by a pair of measurements. Moreover, these measurements should ideally lie as far apart as possible in the $V_H$-direction. Within the operating range, this is the case when B = ±25 mT. Therefore, the corresponding v search domain by definition consists of the top and bottom surfaces of Ω. Since the $V_H$ values assumed at B = ±25 mT depend on s and T and thus on $V_S$ and $V_T$, these two surfaces can be parameterized as

$$v_\pm(V_S, V_T) = \left(V_{H,\pm}(V_S, V_T),\, V_S,\, V_T\right) \tag{19}$$

where + and − denote the top and bottom surfaces, respectively, and $V_{H,\pm}$ is the Hall signal at B = ±25 mT. The search is therefore carried out in the 2-D region delimited by $-1 \leq V_S \leq 3$ and $-1.6 \leq V_T \leq 1.8$. One has to be aware that for each combination of $V_S$ and $V_T$ identified as an adequate calibration condition, a pair of calibration measurements is carried out, namely, at B = ±25 mT. Switching B requires only the Helmholtz coil current to be inverted, which is fast and thus efficient.
In what follows, the procedure is illustrated in detail for N = 2. In other words, a single optimal calibration combination $(V_{S\min}, V_{T\min})$ is to be identified. For this purpose, $\sigma_1^2(v, V)$ is obtained using (8) and (10). This then serves to evaluate $f_G(V)$ and $f_I(V)$ [cf. (11) and (13)]. The two objective functions are shown in Fig. 7 as a function of the two variables $V_S$ and $V_T$ of V. Their minima were identified numerically in Python using the SciPy library [89]. The optimal calibration conditions $(V_{S\min}, V_{T\min})$ are shown as gray dots, while the white dashed line delimits the search domain. These values are (3, −0.3) for G-optimality and (2.2, −0.14) for I-optimality.
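A bounded numerical search of this kind can be sketched with `scipy.optimize.minimize`. The example below is a toy under explicit assumptions, not the routine used in this study: the six-term basis, the prior mean `W0`, the prior covariance `SIGMA0`, the use of the prior-mean model to predict $V_H$ at B = ±25 mT, the L-BFGS-B method, and the three starting points are all choices made for illustration. The I-criterion is approximated by a grid average and divided by σ² to give the optimizer a well-scaled objective.

```python
import numpy as np
from scipy.optimize import minimize

SIGMA = 0.0577                                            # mT
W0 = np.array([0.1, 0.02, -0.03, 1.0, -0.02, 0.15])       # made-up prior mean
SIGMA0 = np.diag([0.02, 0.01, 0.01, 1e-4, 1e-5, 1e-5])    # made-up prior covariance
BOUNDS = [(-1.0, 3.0), (-1.6, 1.8)]                       # search region in (V_S, V_T)

def basis(vh, vs, vt):
    """Hypothetical 6-term basis, evaluated for broadcastable signal arrays."""
    vh, vs, vt = np.broadcast_arrays(vh, vs, vt)
    return np.stack([np.ones(vh.shape), vs, vt, vh, vh * vs, vh * vt], axis=-1)

def hall_signal(B, vs, vt):
    """V_H expected at field B according to the prior-mean toy model."""
    return (B - (W0[0] + W0[1] * vs + W0[2] * vt)) / (W0[3] + W0[4] * vs + W0[5] * vt)

# Evaluation points: surfaces at B = -25 and +25 mT over a (V_S, V_T) grid
gs, gt = np.meshgrid(np.linspace(-1.0, 3.0, 9), np.linspace(-1.6, 1.8, 9))
DOMAIN = np.concatenate([basis(hall_signal(b, gs, gt), gs, gt).reshape(-1, 6)
                         for b in (-25.0, 25.0)])

def f_I(x):
    """Grid-averaged posterior predictive variance, in units of sigma^2."""
    vs, vt = x
    b_cal = np.array([-25.0, 25.0])                       # N = 2 measurements
    Phi = basis(hall_signal(b_cal, vs, vt), vs, vt)
    Sigma1 = np.linalg.inv(np.linalg.inv(SIGMA0) + Phi.T @ Phi / SIGMA**2)
    var = SIGMA**2 + np.einsum("ij,jk,ik->i", DOMAIN, Sigma1, DOMAIN)
    return var.mean() / SIGMA**2

# Several starting points reduce the risk of stopping in a local minimum.
starts = [(1.0, 0.1), (-0.5, -1.0), (2.5, 1.0)]
results = [minimize(f_I, x0, method="L-BFGS-B", bounds=BOUNDS) for x0 in starts]
best = min(results, key=lambda r: r.fun)
print("near-I-optimal (V_S, V_T):", best.x, "objective:", best.fun)
```

The G-criterion involves a maximum and is not smooth, so a gradient-free or global search would be the more natural choice there.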
The next task is to identify calibration loads F and T able to elicit response signals (V Scal, V Tcal) near the ideal values (V Smin, V Tmin) from any specimen. By analysis of the prior-generation data and the loads applied there, F = 20 N and T = 20 °C are found to be a reasonable choice for G-optimality, and F = 15 N and T = 20 °C for I-optimality. Under these near-optimal loads, calibration output signals V cal are recorded from a specimen being calibrated. Based on V cal and B cal = (−25 mT, 25 mT) ⊤, one deduces B 1 (v, V cal, B cal) = φ(v) ⊤ w 1 (V cal, B cal) and σ 1 ² (v, V cal) of the sensor, using (7)-(10). Fig. 6(a2) symbolizes the posterior covariance matrix of a representative validation specimen after near-G-optimal calibration at V cal. The corresponding confidence interval, as quantified by σ 1 (v, V cal) on the top surface of the operating domain, is shown in Fig. 6(b2), again as a function of V S and V T. The optimal calibration condition (V Smin, V Tmin) is indicated by the gray dot, while the near-optimal condition is indicated by the black dot. The white dashed border again delimits the search domain. The semitransparent white dots show the sensor output signals (V S, V T) of the representative specimen under the same set of load conditions as applied during the characterization of the prior-generating specimens. These actual output data, including the corresponding V H values for B = −25, 0, and 25 mT, are used to compare actual data with the predictive response B 1 (v, V cal, B cal) and thus to validate the method in Section IV-D. Similar to Fig. 6(b2), Fig. 6(c2) shows σ 1 as inferred from near-I-optimal calibration of the same validation specimen.
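For orientation, the update step can be written in the textbook form of Bayesian linear regression with a Gaussian prior and Gaussian measurement noise. This is a sketch under the assumption that (7)-(10) follow that standard form; the symbols Σ 0 and Σ 1 (prior and posterior covariance), w 0 (prior mean), Φ cal (matrix whose rows are φ(v) ⊤ evaluated at the calibration outputs V cal), and the noise variance σ² are notation introduced here for illustration and may differ from the article's own.

```latex
\begin{align*}
\Sigma_1 &= \left(\Sigma_0^{-1} + \sigma^{-2}\,\Phi_{\mathrm{cal}}^{\top}\Phi_{\mathrm{cal}}\right)^{-1}, \\
w_1 &= \Sigma_1\left(\Sigma_0^{-1} w_0 + \sigma^{-2}\,\Phi_{\mathrm{cal}}^{\top} B_{\mathrm{cal}}\right), \\
B_1(v) &= \varphi(v)^{\top} w_1, \\
\sigma_1^2(v) &= \sigma^2 + \varphi(v)^{\top}\,\Sigma_1\,\varphi(v).
\end{align*}
```

In this form, the predictive variance depends on the calibration outputs V cal through Φ cal but not on B cal, which is consistent with the arguments of σ 1 ² (v, V cal) given in the text.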
Finally, optimal and near-optimal calibration strategies with N > 2 measurements are analyzed. We considered N = 4, 6, and 8, implying combinations of 2, 3, and 4 (F, T) conditions, respectively. At each condition, measurements are carried out with applied B = ±25 mT. Results are shown in the last three columns of Fig. 6. The heat plots in Fig. 6(a3)-(a5) again symbolize the resulting covariance matrix of the updated probability distribution of w. These matrices are obtained for measurements performed at the optimal calibration conditions indicated by the two [Fig. 6(a3)], three [Fig. 6(a4)], and four [Fig. 6(a5)] gray dots in Fig. 6(b3)-(b5), respectively. As for N = 2, these optimal conditions were determined numerically using a search algorithm programmed in Python and taking advantage of the SciPy library [89]. Actual calibration data of a representative validation specimen were obtained under near-optimal conditions indicated by the black dots. These substitute conditions were again selected based on the prior characterization data, as described for N = 2. The contour plots show the resulting (V S, V T)-dependent updated predictive confidence interval defined by σ 1 derived from near-optimal calibration conditions V cal. These are again the values of σ 1 on the top boundary of the operating domain.

D. Validation
To validate the calibration strategies, the 15 validation specimens were subjected to the same characterization procedure as the prior-generating specimens, as described in Section III-C. Per validation specimen, this results in a set of 1110 data triples v = (V H, V S, V T) with the corresponding applied B values. The set of validation data of a representative validation specimen is shown in Fig. 6(b2)-(b5) and (c2)-(c5).
For each validation specimen, we compute the 1110 residuals B 0 − B for N = 0 and B 1 − B for the cases N = 2, 4, 6, and 8 discussed in Section IV-C and for N = 1110. The latter case means that the Bayesian update is performed with all available data.
Accuracies of the validation specimens before and after calibration are shown by orange box plots in Fig. 8. After near-G-optimal calibration, the distribution of the rms residuals of the 15 validation specimens is shown in Fig. 8(a1), while the distribution of the maximum absolute residuals is reported in Fig. 8(a2). Fig. 8(b1) and (b2) summarize the corresponding observed accuracies after near-I-optimal calibrations of the validation specimens. For N = 0, the distributions of the rms and maximum absolute residuals computed for the prior-generating specimens are reported as well, by blue box plots.
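The two accuracy figures reported per specimen can be computed as in the following minimal sketch. The function name and the synthetic data are hypothetical; only the count of 1110 values and the applied B levels of −25, 0, and 25 mT come from the text.

```python
import numpy as np

def residual_metrics(b_pred, b_applied):
    """Return (rms, max_abs) of the residuals b_pred - b_applied."""
    r = np.asarray(b_pred, dtype=float) - np.asarray(b_applied, dtype=float)
    return float(np.sqrt(np.mean(r**2))), float(np.max(np.abs(r)))

# Synthetic stand-in for one specimen: 1110 applied values at B = -25, 0, 25 mT.
rng = np.random.default_rng(0)
b_applied = np.tile([-25.0, 0.0, 25.0], 370)                 # mT
b_pred = b_applied + rng.normal(0.0, 0.07, b_applied.size)   # made-up 70 µT scatter
rms, max_abs = residual_metrics(b_pred, b_applied)
print(f"rms = {1e3 * rms:.0f} µT, max |residual| = {1e3 * max_abs:.0f} µT")
```

Applying such a function to each of the 15 validation specimens and each N would give the samples summarized by the box plots.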
Furthermore, the fraction of residuals lying within the ±σ 1 interval of the inferred B values, i.e., B 0 for N = 0 and B 1 for all other N, has been computed. In its top section, Fig. 9 shows these fractions for G- and I-optimality. Orange bars again represent the validation specimens, whereas blue bars result from the prior-generating specimens.
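A minimal sketch of this coverage computation is given below. The function name and the four-point example are illustrative assumptions, not taken from the article.

```python
import numpy as np

def coverage_fraction(b_pred, sigma_1, b_applied):
    """Fraction of applied B values inside [b_pred - sigma_1, b_pred + sigma_1]."""
    b_pred, sigma_1, b_applied = (np.asarray(a, dtype=float)
                                  for a in (b_pred, sigma_1, b_applied))
    return float(np.mean(np.abs(b_pred - b_applied) <= sigma_1))

# Tiny example: three of the four residuals lie within one sigma.
print(coverage_fraction([0, 0, 0, 0], [1, 1, 1, 1], [0.5, -0.5, 2.0, 1.0]))  # 0.75
```

Here `sigma_1` may vary from point to point, since the predictive standard deviation depends on v.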
In addition, for each specimen and each abovementioned N value, the square roots of f G (V cal) and f I (V cal) are computed using (11) and (13). These quantify the maximum and rms predictive uncertainties resulting from the prior (N = 0) and updated (N = 2, 4, 6, 8, and 1110) knowledge about the specimens.

V. DISCUSSION
For the first time, a Hall-stress-temperature sensor system was calibrated against two cross-sensitivities by means of Bayesian calibration. The calibration was carried out with fewer measurements than model parameters. For a sensor model with M = 11 parameters, the method was demonstrated by calibrations using N = 2, 4, 6, and 8 measurements requiring only one, two, three, and four force-temperature (F, T) pairs as calibration conditions. All identified G- and I-optimal calibration strategies allow an unknown B pervading calibrated sensors to be inferred from the sensor system's output signals V H, V S, and V T with known accuracy. The accuracy increases with N, as shown in Figs. 6, 8, and 9.
Near-G-optimal calibrations lead to a progressive refinement of the covariance matrix with each additional (F, T) calibration condition [cf. Fig. 6(a1)-(a5)]. The refinement of the posterior covariance matrix entails a corresponding shrinking of the uncertainty σ 1 within the operating range of accordingly calibrated specimens and hence allows a more accurate inference of B from the signals. For example, the maxima of σ 1 on the top surface of the operating domain after a single (F, T) calibration under G-optimal [cf. Fig. 6(b2)] and I-optimal [cf. Fig. 6(c2)] conditions are smaller than the minimum of σ 0 before calibration [cf. Fig. 6(b1) and (c1)]. With increasing N, the uncertainty shrinks further, as shown in Fig. 6(b1)-(c5). A salient feature of Bayesian sensor calibration in comparison with nonprobabilistic approaches is that it not only allows B to be inferred from the sensor signals but simultaneously provides the uncertainty of the inferred B value. It does so by providing the standard deviation σ 1 within the operating domain and even beyond its boundaries.
Figs. 6(b2), 6(c2), and 7 show the G- and I-optimal calibration conditions found for a single (F, T) calibration condition as gray dots. It is noteworthy that in both cases the near-optimal calibration within the grid of (F, T) conditions defined in Section III-C is performed at T = 20 °C. Calibration close to room temperature is favorable from an economic point of view. It is therefore of interest whether the near-optimal conditions F = 20 N (G-optimality) and F = 15 N (I-optimality) can be replaced without significant loss by the more economical F = 0 N. This can be assessed using the validation data. We find that the maximum σ 1 increases to about 428 µT from about 283 µT in the near-optimal case. Similarly, the rms σ 1 increases to 160 µT from 113 µT. In conclusion, it is advisable in this case to apply a nonzero force to ensure proper stress compensation.
The posterior accuracy is now evaluated in terms of the residuals of the 15 validation specimens. In the following, the focus is on the rms residuals after near-I-optimal calibrations reported in Fig. 8(b1). Two outliers among the 15 × 6 = 90 rms values, indicated by the diamond symbols, are neglected. The figure shows the distribution of the rms residuals and describes the tradeoff between absolute accuracy and calibration effort as a function of N. Before calibration and based on the prior knowledge alone, the rms residuals were found to be 74-383 µT, corresponding to relative errors of 0.3%-1.5% in comparison with the maximum B value of 25 mT. After calibration at the near-I-optimal single (F, T) condition, the rms residuals were reduced to 65-129 µT, i.e., 0.26%-0.52%. With an increasing number of (F, T) conditions, only small further improvements to 63-89 µT (N = 4), 53-90 µT (N = 6), and 58-99 µT (N = 8) are achieved. It is noteworthy that near-I-optimal calibration with N = 2 significantly reduced the inaccuracy of the specimen appearing as an outlier in Fig. 8(b1) for N = 0; it is now captured by the upper whisker. By a thorough calibration with N = 1110, the residuals are further reduced to 47-63 µT (0.19%-0.25%). This is only a minor further improvement considering the additional effort.
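The conversion between the quoted rms residuals and relative errors is a simple ratio to the 25 mT full-scale value, as the following short check illustrates. The function name is hypothetical; the input values are those quoted above.

```python
def relative_error_percent(rms_uT, full_scale_mT=25.0):
    """Express an rms residual in µT as a percentage of the full-scale B in mT."""
    return 100.0 * rms_uT / (1000.0 * full_scale_mT)

for label, lo, hi in [("N = 0", 74, 383), ("N = 2", 65, 129), ("N = 1110", 47, 63)]:
    print(f"{label}: {relative_error_percent(lo):.2f}% to {relative_error_percent(hi):.2f}%")
```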
In [80], a non-Bayesian calibration of 20 Hall-stress-temperature sensor systems provided an accuracy of 149 µT (0.5%) and 236 µT (0.79%) using N = 570 and N = 6 measurements, respectively, for B values in the range of ±30 mT. A polynomial model comprising six parameters was used to infer the magnetic induction B for temperatures between −40 °C and 125 °C and mechanical stress values comparable to those applied here. In conclusion, the accuracy achieved in the present Bayesian study after calibration with N = 2 exceeds that of the non-Bayesian case in [80] with N = 6 by a factor of 1.5-3.
The continuous reduction of both objective functions, f G and f I, with increasing N is apparent from Fig. 9. It confirms that the Bayesian optimal calibration design is effective and works with smaller numbers of calibration measurements than model parameters. In the following, we discuss the N-dependent uncertainty left by the calibration based on the 15 values of the square root of f I (i.e., the 15 values of the rms σ 1 in the operating domain) for the near-I-optimal calibrations in Fig. 9(b), again neglecting the outliers. For comparison, the rms σ 0 in the operating domain is 267 µT. After a single (F, T) calibration (N = 2), the 15 values are reduced to 112-114 µT. Thereafter, like the residuals (cf. Fig. 8), they are only slightly further reduced to 87-89 µT (N = 4), 73-83 µT (N = 6), and 68-77 µT (N = 8). When all available data (N = 1110) are used, a further reduction to 57.9-58 µT, close to σ = 57.7 µT, is achieved. A similar observation was made in [4]. After 14 Hall-temperature sensor systems were calibrated at two temperatures near the I-optimum, their uncertainty was reduced from 203 µT (rms σ 0) to 41 µT (rms σ 1). In comparison, the more complex model in the present study additionally ensures the stress compensation and therefore demands a more substantial calibration effort in order to achieve a similar accuracy. In [6], similar observations were made regarding the accuracy improvement after a Bayesian calibration of temperature-sensitive pressure sensors. That study used a polynomial model with five model parameters and determined calibration conditions in terms of I-optimality.
The application of near-optimal (F, T) calibration conditions produces sensor signals V cal differing from the identified optimum V min minimizing f G or f I. The optimal f G and f I values are represented by green dots in Fig. 9, while the boxes include the corresponding values obtained with near-optimal V cal values. All experimental values in Fig. 9 closely follow the theoretical optimal values. This indicates that the selected calibration strategies are indeed near-optimal. Nevertheless, how to select a near-optimal (F, T) pair of conditions for a given (V Smin, V Tmin) will likely depend on the required sensor specification in view of its application and may also be subject to the question of cost-effectiveness.
The decision to rely on Q = 35 specimens for the prior generation was taken in view of the complexity of the model with its M = 11 parameters. With MQ = 385 prior-generating data, the information available to determine the symmetric prior covariance matrix was significantly larger than the number of its independent entries, namely, M(M + 1)/2 = 66. Compared to the study in [4], where Q = 14 and M = 7, the ratio of available prior information to independent entries of the prior covariance matrix was chosen here to be even larger, namely, 385/66 ≈ 5.8 in comparison with 98/28 = 3.5, in order to ensure a trustworthy prior. However, the question of how far the multivariate normal distribution described by (3) and (4) is a trustworthy approximation of the underlying multivariate Student distribution [4] remains open at this point and deserves a dedicated thorough study.
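The counting argument can be reproduced in a few lines. The second half of the sketch assumes that (3) and (4) amount to the sample mean and sample covariance of the Q fitted parameter vectors; this is an assumption made for illustration, and the random matrix `W` is a synthetic stand-in for the actual parameter table.

```python
import numpy as np

def prior_information_ratio(m, q):
    """Return (M*Q, M*(M+1)/2, ratio) for M parameters and Q specimens."""
    n_values = m * q
    n_entries = m * (m + 1) // 2
    return n_values, n_entries, n_values / n_entries

print(prior_information_ratio(11, 35))  # (385, 66, 5.83...)
print(prior_information_ratio(7, 14))   # (98, 28, 3.5)

# Synthetic stand-in for the Q x M table of fitted parameter vectors.
rng = np.random.default_rng(1)
W = rng.normal(size=(35, 11))
w0 = W.mean(axis=0)                 # sample mean, shape (11,)
cov0 = np.cov(W, rowvar=False)      # sample covariance (ddof=1), shape (11, 11)
print(cov0.shape, np.linalg.matrix_rank(cov0))
```

With Q well above M, the sample covariance has full rank and can be inverted, which the Bayesian update requires.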
Nevertheless, in the context of this question, it is remarkable that in all calibration cases, as highlighted by the top of Fig. 9, more than 67% of the applied B values lie within the ±σ 1 -interval of the B value inferred from the sensor signals. This is close to 68.3%, the well-known cumulative probability within the ±1σ 1 range of a Gaussian around its mean. We interpret this as evidence that the prior probability distribution of w estimated from the prior-generating specimens using (3) and (4) provides a reasonable picture of the actual w distribution of the sensor ensemble.
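The 68.3% reference value follows from the Gaussian error function and can be checked directly with the Python standard library. The function name is illustrative.

```python
import math

def gaussian_coverage(k=1.0):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

print(f"{100 * gaussian_coverage():.1f}%")  # 68.3%
```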
The formalism described in Section II and applied here to the special case of Hall sensors is applicable whenever the measurand of a sensor system is well modeled by a response function of the form φ(x) ⊤ w linear in the model parameters w, where x denotes the independent variables. There is no fundamental restriction regarding the set of basis functions φ(x). Sensor types that may benefit from Bayesian sensor calibration in the present form include ion-selective chemical sensors [90], [91], inertial sensors [22], [23], [92], and mechanical sensors [5], [8], [10], [17], [18], [93]. By using the method of multivariate Bayesian regression and inference [77], [88], we expect the method to be generalizable to multisensor systems designed to provide values of more than a single measurand. Sensors modeled by more complex response functions nonlinearly involving some model parameters, such as chemical sensors, do not preclude the application of Bayesian methods. However, the mathematics will no longer boil down to matrix calculus and likely entail heavier computations [76], [77], [88]. In these cases and others without available explicit models, ANNs may be helpful [67], [68].

VI. CONCLUSION
In this article, the method of Bayesian sensor calibration was successfully applied to a multisensor system affected by two parasitic sensitivities. Bayesian calibration of the investigated Hall-stress-temperature sensor system guarantees a satisfactory accuracy even when relying on fewer calibration measurements (N = 2, 4, 6, and 8) than model parameters (M = 11). For comparison, a thorough calibration with a set of N = 1110 conditions leads to a median residual of 0.21% referred to B = 25 mT. This is only 0.07 percentage points better than a calibration using six measurements. A second strength of the Bayesian approach to sensor calibration is that it enables the accuracy resulting from calibration to be predicted. The validity of the accuracy predictions was experimentally verified. The accuracy of the validation specimens after parsimonious calibration was indeed found to be as predicted. The ability to predict the accuracy of specimens after calibration distinguishes the Bayesian approach from ANN-based ML algorithms, where the trustworthiness of the trained ANN, instead of being confirmed, is established by testing it using independent data [14], [71], [76], [94], [95].
The successful reduction of calibration conditions while still ensuring competitive accuracy is rooted in the prior distribution of sensor model parameters. A fundamental requirement for establishing such useful prior knowledge is that the specimens belong to an ensemble of sensor systems with a reasonably narrow distribution of response parameters. In the present case, this is ensured at the technology and hardware levels by a commercial standard 180-nm CMOS process of X-Fab Silicon Foundries (Erfurt, Germany) for the fabrication and by a sophisticated sensor design [30]. It is clear that the effort needed to acquire the prior database is intense and represents a weighty factor in the total calibration cost. However, the parsimony of the reduced calibration schemes building upon the prior allows costs to be saved, possibly over entire production volumes. Whether the Bayesian approach is able to offer a net cost saving in some calibration task will depend on aspects extending beyond the limits of purely scientific questions.

Table I lists the 15 sets of polynomial basis functions φ(v) used to model the magnetic induction B as a function of V H, V S, and V T. The models are systematically arranged in five triplets, namely (1, 2, 3), (4, 5, 6), . . . , (13, 14, 15). Each triplet has the same set of basis functions modeling 1/S A and an increasing number of basis functions modeling B off. The first model, no. 1, uses the same basis functions as in [4] and has no polynomial terms in V S. For model 2, the term V S is added to the terms 1 and V T; for model 3, the terms V S and V S V T are added. The same principle applies to all further triplets of basis functions, where the basis functions modeling 1/S A are progressively expanded. For example, the triplet (4, 5, 6) has the additional term V H V S in comparison with the triplet (1, 2, 3), while the terms for modeling B off are the same.