Region-to-Region Kernel Interpolation of Acoustic Transfer Functions Constrained by Physical Properties

A method to interpolate the acoustic transfer function (ATF) between regions using kernel ridge regression (KRR) is proposed. Conventionally, the ATF interpolation problem is strongly restricted and situational, depending on knowledge of environmental conditions while not accounting for source position variation. We derive our interpolation function as the solution of an optimization problem defined on a function space where every element holds the acoustic properties of the ATF. By making the space a reproducing kernel Hilbert space (RKHS), we can guarantee that our problem has a known and unique optimizer. The generality of the formulation of this method enables region-to-region estimations, with variable source and receiver within the assigned bounds. The definition of a RKHS also allows for the use of kernel principal component analysis, thereby efficiently providing greater noise robustness to our interpolation function. Our proposed method is compared with a previously established region-to-region interpolation method in numerical simulations where the advantages of the KRR approach are confirmed, showing lower error and greater stability for higher frequencies.


I. INTRODUCTION
T HE prediction of the variations of a sound signal as it is transmitted between two arbitrary points cannot be performed analytically without prior knowledge of the boundary conditions. Owing to physical phenomena that affect the way sound waves are propagated, the environment may cause changes to the original source signal that must be accounted for. By modeling this effect to be that of a finite impulse Juliano G. C. Ribeiro, Shoichi Koyama, and Hiroshi Saruwatari are with the Graduate School of Information Science and Technology, University of Tokyo, Tokyo 113-8656, Japan (e-mail: ribeiro-juliano@g.ecc.u-tokyo.ac.jp; koyama.shoichi@ieee.org; hiroshi_saruwatari@ipc.i.u-tokyo.ac.jp).
Natsuki Ueno was with the University of Tokyo, Tokyo 192-0397, Japan. He is now with the Tokyo Metropolitan University, Tokyo 192-0397, Japan (e-mail: natsuki.ueno@ieee.org).
Digital Object Identifier 10.1109/TASLP.2022.3201368 response (FIR) filter, we can predict these changes by understanding the impulse response (IR) between the points. This IR is also referred to as the acoustic transfer function (ATF) when in the frequency domain, and it represents the frequency response of the signal recorded at a given receiver point when the source signal was a unit impulse in the time domain. There are many applications of characterizing the acoustics of an environment spatially, such as room acoustic analysis [1], [2], virtual-and augmented-reality systems [3], [4], sound reproduction [5], sound field equalization [6], echo cancellation [7], and speech dereverberation [8].
Most established methods to estimate the ATF spatially are not appropriate for interpolating its values for continuously variable source or receiver position points assuming proper restrictions. As it is a transfer function, one method is to determine its poles and/or zeros, similarly to a standard transfer function of a linear system [9], [10]. However, these models are not properly applicable to the spatial interpolation of the ATF as the relation between the filter coefficients and the source positions is not well understood. Another approach is to interpolate the IR by considering the acoustic field to be a combination of plane waves and then using compressed sensing [11]. In [12], a method of measuring IRs based on a time domain formulation can be used to estimate the IR with high accuracy by utilizing equivalent sources. However, assumptions about the ATF are imposed in [11] and [12] that are not sufficiently general. Moreover, neither method accounts for variable source positions.
The ATF problem is strongly related to understanding sound wave propagation. Once converted to the frequency domain, the function describing the pressure changes caused by a sound wave traveling through the air, now referred to as the sound field, satisfies the Helmholtz equation on the spatial parameters. In the spherical coordinate system, the Helmholtz equation's spherical wave functions can be determined regardless of boundary conditions [13], meaning that the acoustic characteristics of a room can be expressed as constant coefficients of a series expansion of spherical wave functions of the Helmholtz differential operator. In [14], the sound field is estimated by an empirical truncation of the spherical wave function expansion, allowing for spatial interpolation of sound fields.
In a previous work, Samarasinghe et al. proposed a physical model of the ATF capable of representing it continuously for variable source and receiver positions [15]. This method This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ separates the ATF into a direct component, which addresses the direct incidence from the source to the receiver, and a reverberant component, which represents the sound field resulting from interactions with the environment. By considering the direct component to be known and approximating the reverberant component using a finite sum of spherical wave functions, this method can be used to estimate the ATF between variable sources and receivers within regions. However, this model lacks essential properties of the ATF such as reciprocity [16], a property commonly observed in a variety of situations [13], [17]. As the ATF is reciprocal, the enforcement of this essential property is expected to represent real ATF functions. The method is also dependent on reference positions of each region, which are not referenced in the Helmholtz equation at all and introduce another degree of uncertainty.
We propose a method for interpolating ATFs between two regions based on kernel ridge regression (KRR). Instead of truncating the function series, we utilize an infinite dimensional representation of ATFs. A similar approach was used in the sound field estimation problem in [18], [19]. To apply KRR, we define a reproducing kernel Hilbert space (RKHS) that guarantees any estimation holds the physical properties of ATFs. The features of our proposed method are as follows: 1) the reverberant component of the estimated ATF satisfies the homogeneous Helmholtz equation, 2) the estimation does not set a truncation order or an expansion reference position as is required in [15], and 3) the reciprocity of the ATF is taken into consideration.
Preliminary results have been presented in [20]. However, in this paper, we reformulated the RKHS based on the plane wave expansion to make the analysis more straightforward. This formulation is also more representative of the physics of the problem and can be used to expand the concept using directional weighting [21]. In addition, we perform experimental validation with noisy samples, absent from [15], [20], and introduce kernel principal component analysis (KPCA) to increase the robustness against noise.
The remainder of this paper is organized as follows. In Section II, we explain the problem statement and how the interpolation model will be derived. This is followed by Section III, where we explain the properties the ATF is supposed to hold, then define the RKHS model so that the expected properties are always satisfied. Subsequently, in Section IV, we explain the derivation of the kernel ridge regression and kernel principal component analysis procedures, which together compose our proposed method. Then, in Section V, we introduce the truncation-based formulation from [15], which will be compared with our method. This is followed by Section VI, where the simulator conditions are explained, and then the simulation results are discussed. Finally, we conclude in Section VII.

A. Notations
The set of natural numbers, which for our purposes includes 0, is N. The set of real numbers is R, and the set of complex numbers is C. The set of nonnegative real numbers is R + . The application of n consecutive cartesian products of a set A will be A × A × · · · × A = A n . Italic letters denote scalars and scalar functions, lower-case boldface letters denote vectors, and upper-case boldface letters denote matrices. A vector x ∈ C n will have its elements listed as italic letters with indexes, i.e. x = [x 1 x 2 . . . x n ] T . For letter constants, i is the imaginary unit, I is the identity matrix, e = exp(1) is Euler's number, and 0 is the null vector. When the argument of a function is suppressed, we will replace it with a centered dot ·. We assume harmonic time dependence exp(−iωt) with time t and angular frequency ω.
All integrals used will be considered to be Lebesgue integrals. The space of square-integrable functions on the Lebesgue integral defined on the domain D will be referred to as L 2 (D). Functions that only differ in a null measure set will be treated as the same function, as is conventional.
The source and receiver position pairs will be written as r|s, where r is the position of the receiver and s the position of the source. For notational simplicity, the letter q will also be used to refer to a given sound/receiver pair interchangeably whenever convenient to do so.

II. PROBLEM STATEMENT
Let us denominate the space where the measurements will be taken as Ω ⊂ R 3 with stationary acoustic properties. Consider now a pair of positions (s, r) ∈ Ω 2 containing a loudspeaker (acting as a monopole source) and a pressure microphone (acting as an omnidirectional receiver). Therefore, for a given wavenumber k, we define the ATF h(r|s, k), h : Ω 2 × R → C as the function that relates the signals between two points. The wavenumber k is calculated as k = 2πf /c, where f is the frequency and c is the speed of sound. Henceforth, k will be omitted from argument lists for notational simplicity, as the calculations are separated by frequency.
Our objective is to interpolate the ATF in a region-to-region manner, that is, to establish a method of interpolating ATF values freely within assigned boundaries. The ATF problem is closely related to the sound field estimation problem. As such, we represent h as a solution to the inhomogeneous Helmholtz equation with the point source at s as where ∇ 2 r is the Laplacian operator applied only to the receiver coordinates, and δ(r|s) is the Dirac delta function centered at the source point s and evaluated at the position r. The solution for h is represented by the sum of particular and homogeneous solutions [13], which can be regarded as the direct component h D : This assumption is also used in [15], [20]. The proper representation of sound fields is dependent on the size of the analyzed region and the number of measurement points [14], [16]. Therefore, it is not viable to define an ATF interpolation model that works for the entire environment, but rather between regions within it. We identify two regions: the source region Ω S and the receiver region Ω R as illustrated in Fig. 1. In [15], the direct component h D is assumed to result from the direct incidence of the source signal on the receiver point without encountering obstacles. That is, h D is equivalent to a sound field propagated in the free-field, and as such is given by the three-dimensional Green's function associated with the Helmholtz equation G 0 , as shown in [13].
where · 2 is the Euclidean norm. Since the direct component is considered to be known, we aim to approximate the value of h R (r|s) ∀(r, s) ∈ Ω R × Ω S based on multiple ATF measurements. As such, after determining the value of h D , this component is extracted and only h R will be interpolated.
To measure the ATF, we distribute a total of L loudspeakers at the positions {s l } L l=1 in Ω S and M microphones at the positions {r m } M m=1 in Ω R , and we then measure all N = LM ATF values. Similarly to the setup described in [15], [20], the ATF values should be measured by taking a source and a receiver devices and determining their frequency response, then using that frequency response to estimate the required ATF values by varying their positions one by one.
The input data vectors {q n } N n=1 , referring to the collection of source/receiver pairs measured, the error vector e, and measured output data vector y will be given as where e represents the probabilistic added noise, a random variable with normal distribution N (0, 2 I), that is, a Gaussian distribution with mean 0 and variance 2 , which determines the signal-to-noise ratio (SNR). The interpolation function of the reverberant componentĥ R will be derived by solving the following optimization problem: where g ∈ H is a generic function variable, · H is the norm of the feature space H and λ ∈ R + is the regularization constant. This optimization problem has different solutions depending on the configuration of H. Therefore, to derive the interpolation function, it is necessary to define the feature space H so that it permits a unique minimizer for J and holds the relevant acoustic properties of the reverberant component of the ATF.
Having determined the interpolation function for the reverberant component, we can then add back the direct component and obtain the interpolation function for the ATFĥ aŝ

III. PHYSICAL FORMULATION OF ATF
The ATF between the points r ∈ Ω R and s ∈ Ω S is the signal received at r when the signal originating from s is the Dirac delta function in the time domain. It is necessary to consider real world effects that change the signal, such as the reverberation within the enclosure of the environment, making the ATF unpredictable in the general case. In order for our interpolation method to fulfill the necessary properties of the ATF, we must properly define the feature space H.

A. Basic Properties of ATF
We now explain the observed properties of the ATF that will be included in our interpolation model explicitly. Part of our objective is to derive a feature space for the reverberant component, as outlined in Section II. Therefore, we will discuss how these properties impact the characterization of the reverberant component given the known properties of the direct component. The properties are as follows: 1) The reverberant component of the ATF is a solution of the homogeneous Helmholtz equation on the receiver variable. 2) As the Helmholtz operator is invariant to translations and rotations of the coordinate system, the estimation should be invariant to rotations and changes of origin positions.
3) The ATF is reciprocal, meaning that the ATF measured between a source and a receiver point is the same regardless of which is considered the source and which the receiver. The reverberant component is caused by reflections, diffraction, and any other physical phenomena that do not result from a direct incidence from a source to a receiver. Therefore, it is equivalent to a sound field absent of a source, and as such, the reverberant component must satisfy the homogeneous Helmholtz equation on the receiver variable [13], whatever the source point might be, addressing item 1).
Next, we address item 2), that is the invariance of the derivation of the model to rotations and translations. Since the direct component is known and only considers the Euclidean distance between the relative positions of a source and a receiver, it satisfies the second property. Therefore, the reverberant component expression must be invariant to rotations and translations of the coordinate system to guarantee the interpolation function will be.
Item 3), which will be referred to henceforth as the reciprocity, signifies that the ATF originated at a source point s and recorded at a receiver point r is the same as the ATF measured at s had the signal originated from r [13], [16], [17].
Since h and h D are reciprocal, h R must be reciprocal as well. By combining (10) with the reciprocity, we have that the Helmholtz equation must also be satisfied on the source variable.
where ∇ 2 s is the Laplacian differential operator applied only to the source position coordinates.
Owing to the differential property, we know that h R must be bounded, making it appropriate for interpolation. We will utilize this characterization of the reverberant component to define our feature space properly, as well as devise the KRR and KPCA formulations and explain their merits over the estimation used in [15], which will be used for comparison.

B. RKHS Model for Reverberant Components
A RKHS is a specific class of functional Hilbert spaces that contain a reproducing kernel function. This means that, for a Hilbert space represented by the function set H and the inner product ·, · of functions defined on a domain D, there is some kernel function κ : D 2 → C that projects elements of H on the domain points.
In [20], we constructed a RKHS formulation based on a series expansion of wave functions, defining the feature space H and the inner product based on the infinite series. In this paper, however, we will define the RKHS here using an equivalent of a Herglotz wave function [22] for the ease of applicability in RKHS problems [23]. We define the plane wave expansion for the reverberant part of the ATF as where G ∈ L 2 (S 2 × S 2 ) is the equivalent of g in the L 2 domain, and S 2 is the space of R 3 with norm 1. To guarantee the reciprocity of every g, we impose that G must also always be reciprocal. The Herglotz wave function facilitates the creation of our feature space H with a simple inner product [23].
where · is the Euclidean inner product (commonly referred to as the dot product), g 1 , g 2 ∈ H are generic functions, and · represents the complex conjugate operation.
The space given in (15) is always complete on the induced norm of the accompanying inner product as the space of reciprocal L 2 functions is a closed set, and the transformation between both sets preserves the proportionality of the norms: where · L 2 is the norm of the integral space L 2 .
Since the norms are proportional, Cauchy sequences in H correspond directly to reciprocal Cauchy sequences in the L 2 space, inheriting their convergence. We expect an approximation based on elements of H to hold all relevant acoustic properties. This Hilbert space admits the following reproducing kernel function: where j 0 is the spherical Bessel function of order 0. The computation of the integral in (18) can be performed by combining the Rayleigh equation with the orthogonality of the spherical harmonic functions [24], as expanded upon in the appendix. We can guarantee that κ is a reproducing kernel function by applying the inner product and checking if it holds the projection property for a generic function g and generic source/receiver position points s and r, respectively: We can then simplify the above expression as shown in (14).
We have shown that κ is indeed the reproducing kernel of the Hilbert space, making H as constructed a reproducing kernel Hilbert space. This reproducing kernel satisfies the homogeneous Helmholtz equation for any source/variable pair, it is invariant in regards to rotation and translation of the coordinate system, and is reciprocal in regards to the first source/receiver pair input.

IV. INTERPOLATION FUNCTION OF REVERBERANT COMPONENT
RKHSs are widely applied in physics and machine learning applications [23], [25] because of their adaptability and the existence of the representer theorem [26], which guarantees solutions of minimization problems such as (8) with solutions expressed as linear expansions of kernel functions anchored on the domain points of the data set [27]. In other words, the solution to our optimization problem is given aŝ Therefore, optimizing the defined loss is equivalent to finding the complex coefficient vector α = [α 1 α 2 . . . α N ] T ∈ C N for the interpolation function of the reverberant componentĥ R . We will adopt the alternative notation below for ease of operation: where the function vector κ is κ(r|s) = κ(r|s, q 1 ) κ(r|s, q 2 ) . . . κ(r|s, q N )) . (24) Therefore the cost J and the optimization problem (8) can be rewritten as where K is the Gram matrix given as This problem can be solved by minimization on C N . By taking the gradient of J in regards to α, we can see that and the Hermitian matrix of J is given by Since K is a Gram matrix, it is positive-definite, meaning that H(J ) is also positive-definite. Therefore, any critical points of J will be minimum points. To find those minimum points, we find the point with a null gradient: Since K is positive-definite, it is also nonsingular. Thus, the coefficient vector can be computed.
This means J only has one minimizer, meaning it is a global minimizer as well. Therefore, the reverberant component of the interpolation function that optimizes our problem is given bŷ h R (r|s) = κ(r|s)α = κ(r|s)(K + λI) −1 y.
This estimation is referred to as kernel ridge regression [27]. As the interpolation function is a linear combination of kernel functions, it holds the pertinent acoustic properties we embedded when creating the RKHS. Finally, the interpolation function is given by adding back the direct component to the estimations.

A. Application of KPCA for Increasing Noise Robustness
The interpolation functionĥ necessarily satisfies the fundamental properties of the ATF. The addition of noise into the problem is sure to cause performance degradation, and as such, some form of criterion that diminishes its effect was also applied. We opted for the KPCA, an established and widely-used method of dimensionality reduction [28] which requires the data to be projected into a higher dimensional RKHS. As the ATF has already been characterized in terms of a RKHS, it is natural to consider the use of the KPCA for the ATF interpolation problem.
Another property that makes the definition of the RKHS attractive for approximation problems is the characterization of the data and its features. The KPCA is a nonlinear extension of the principal component analysis method, a technique of dimensionality reduction and feature extraction that involves capturing the essential properties of a data set on the basis of an extended notion of the covariance function applied to each pair of samples when the feature space they are in is a RKHS. The KPCA requires the data to be 0-mean, or centered, so the first step in applying it is to make the mean of our data set 0: where x is our centered data and 1 N ∈ C N ×N is defined as After our data have been centered, we calculate the covariance matrix of the data, defined as the matrix of the inner products between every pair of elements of x.
For the standard principal component analysis, the product used is the · product between vector samples of data in R d spaces, d ∈ N. This is somewhat limited when the dimension of the vector space is much smaller than the number of data samples. For the kernel principal component analysis, however, we assume that the measurement matrix y is actually made of elements of a RKHS. Thus, instead of the usual Euclidean inner product, the inner product assumed here is given by the kernel function [29]. If the measurement vector y was already centered, the covariance matrix is straightforwardly the Gram matrix K.
If the data were not centered, the matrix will be given by the centered Gram matrix K : From here, the KPCA and the standard principal component analysis behave identically. We first apply the singular value decomposition (SVD). Since K is a Gram matrix, it is positive-definite and symmetrical, meaning that it has a complete eigenspace, all of its singular values are positive, and the SVD is equivalent to its diagonalization. The same can be said of K except that the singular values are only expected to be nonnegative. Therefore, there is an orthogonal matrix V and a diagonal matrix Σ such that The nonzero elements of Σ form a vector σ, where they are arranged in a descending order. Each singular value σ n is related to a column of V. For a given threshold σ th , to apply the KPCA on our data, we must determine n ≤ N such that σ n > σ th and σ n +1 ≤ σ th . In doing so, we build the new lower-dimensional space V red into which the data will be projected, acting as a noise filter. The columns of V red correspond to the first n eigenvectors. Our filtered signal y is given as Therefore, the average has to be returned to the data postprocessing. Indeed, the choice of a filter carries a considerable risk: if the data are not consistent with the devised physical model, the choice to discard the eigenspace related to the eigenvalues ordered higher than n would mean eliminating an arbitrary portion of the data that may or may not be representative of the whole. Conversely, if the method does work, it shows the power of the physical model given to the ATF by showing the correlation between the data and the expected result.
Currently, the KRR already employs regularization, another form of coping with noisy data sets. Regularization has the effect of shrinking the influence lower-correlation components have on the estimation, while using the KPCA to project the data completely eliminates their contribution to the data. These components tend to be relatively noisier compared to the more highly correlated vectors associated with higher principal values. Therefore, if the noise level is significant, we expect the KPCA to be more effective in filtering noise than the KRR, while for clean signals, their performance should be similar.

V. COMPARISON WITH TRUNCATION-BASED ESTIMATION
We now explain the previously established method that will be used as a standard of comparison. To estimate the acoustic transfer function, that is, to constrain the method by upholding the physical characteristics of the ATF, we must analyze forms of representing such functions. In [15], an interpolation method that only relies on properties of sound fields and does not rely on knowledge of the geometrical or acoustic properties of the environment was proposed.
By expressing the coordinate source and receiver points in spherical coordinates, one can determine the solution to the sound field problem by solving the homogeneous Helmholtz equation, which is a series expansion of spherical wave functions. The positions r and s will be where r c and s c are chosen reference positions in Ω R and Ω S . r and s are the respective radii, θ r and θ s the zenith angles and φ r , and φ s the azimuth angles. The wave functions are products of wave functions for the single variable case.
Since the reverberant component satisfies the Helmholtz equation on both variables, it is possible to express it as a series expansion on coefficients γ μ,μ ν,ν , which are functions of the reference positions: where ϕ μ ν (r) = j ν (kr)Y μ ν (φ r , θ r ) is the spherical wave function of order ν and degree μ, j ν is the spherical Bessel function of order ν, and Y μ ν is the spherical harmonic function of order ν and degree μ. In [20], we have shown this infinite modal expansion can be expressed in terms of a RKHS with the same kernel function.
In [15], the reverberant component is given by truncating (41) at two separate empirical orders N R and N S . These orders are dependent on the frequency and the size of the region. The relation between the number of orders ν ord , the wavenumber k and the radius of the region R is given in [16] as Therefore, the interpolation function is given by estimating the coefficients up to finite orders: This estimation will be referred to as the truncation (TR) hereafter. The method of coefficient extraction is described in detail in [15]. The environment is entirely characterized by the calculated coefficients, meaning that knowledge of the shape and physical properties of the room is not necessary. This method also satisfied the Helmholtz equation on both source and receiver coordinates, as the ATF is expected to. For our application, these components were estimated sequentially using a regularized least-squares criterion in each region [14]. However, this model still has issues that should be addressed. The first is the fact the estimation is dependent on defining centers and orientation for the coordinate system. The Helmholtz operator is invariant in regards to rotations and translations of the coordinate system, however these factors do change the derived truncated expansion. This means the centers of the estimation and the orientation of the coordinate systems could also be seen as hyperparameters of the model, with unpredictable effects. Another issue is that the process is divided in stages, meaning error from the first estimation accumulates with the second estimation. The reciprocity is also not enforced in any way.

VI. NUMERICAL EXPERIMENTS
The proposed methodology was experimented on using the image source method [30]. The experimental conditions were chosen for a projected reverberation time of T 60 = 0.45 s. For that end, we adopted a rectangular room with dimensions of 3.2 m × 4.0 m × 2.7 m in the x, y, and z directions respectively. The speed of sound was c = 340 m/s.
The proposed and truncation-based methods were compared in two different experimental settings: when the regions are separate as shown in Fig. 2, and when they overlap as shown in Fig. 3. In both cases, the regions were represented by spheres of 0. Both the source and receiver arrays used where concentric double-layer spherical shell arrays. The distribution of measurement points in either layer was chosen with the t-design criterion of evaluation [31], [32], as they achieve near uniform sampling [33]. The number of points in a t-design rig is given by a chosen t ∈ N as (t + 1) 2 . The arrays in the source and receiver regions always had the same number of points. This model was adopted due to the fact the TR relies on two consecutive applications of sound field analysis. If the number of points in the source or receiver region is insufficient to represent sound fields for a specific frequency, the estimation is unreliable.
There was one virtual array setting utilized in the experiments: a concentric dual shell array with an internal radius of 0.19 m and an outer radius of 0.20 m centered at the center of the regions. These arrays had a total of M = L = 32 points each, corresponding to t-design arrays of t = 3, giving a total of N = 1024 points from which the coefficients for each method are derived from. These test points were only used to derive the model, not to validate it, as doing so would condition the method to the training data.
In order to avoid spatial aliasing for the TR, we avoided having more total modes for the truncation than the total number of measurement points in each region. As the total number of modes up to an order ν max is (ν max + 1) 2 ≤ 32, we can determine a maximum of N S = N R = √ 32 − 1 = 4 orders. From (42), this means the maximum frequency for which the TR will be considered reliable is f max = N R c/eπr out = 796.28 Hz. We compared both methods up to a maximum frequency of 1000 Hz, beyond this theoretical limit.

A. Evaluation Criteria for Experiments
The proposed methodology for estimation of the ATF in noisy environments was tested by comparing the truncation and the KRR with and without application of the KPCA. We experimented with two different noise levels: 20 dB and 40 dB. The noise used had a complex Gaussian distribution, with mean 0 and variance determined by the SNR. There were four different experimental settings: TR without the KPCA, KRR without the KPCA, TR with the KPCA (TR+KPCA), and KRR with the KPCA (KRR+KPCA). As mentioned previously, the KPCA was applied with a simple filter criterion where the principal values would be the eigenvalues surpassing σ th = . The regularization constant λ for the KRR was given by the SNR as 10 − SNR 10 , under the assumption of a 0 mean complex Gaussian distribution for the noise and a Gaussian prior with correlation matrix given by the Gram matrix for the KRR, which has been shown to be a reasonable model for sound fields in [34]. For the truncation, four hyperparameters were considered: the centers of the coordinate systems and the hyperparameters associated with each region. These hyperparameters were optimized using a standard optimization tool [35] and a leave-one-out cross validation (LOO-CV) criterion with the quadratic error, here adopted for its unbiased nature and closed form solution [36]. The first comparison criterion is an evaluation of the frequency-domain performance of each method, expressed by the normalized mean square error (NMSE) defined in (44). The experimental results for this test can be seen in Fig. 4 NMSE(f ) = 10 log 10 where N = 9025 is the number of test points and the pairs {q n } N n=1 represent the collection of all the test source/receiver points. The test points represent all possible pairs between 95 source points and 95 receiver points picked for model validation, both randomly chosen with a uniform distribution within their respective regions. This validation set is completely separate from the 1024 pairs used to derive the interpolation functions, and none of the sources or receivers in the validation set coincides with any of the source or receivers in the virtual arrays. The frequencies evaluated were 50 sampled frequencies between 0 Hz and 1000 Hz without including either end.
We also evaluated the performance of each method in performing pointwise estimations with the visualized reconstructed pressure distribution from a single source located at s 0 on a regularly spaced 0.72 m × 0.72 m square grid centered in r 0 , for a total of 10201 points and analyze the capability of the method to reproduce that sound field for a frequency of 804 Hz. Since this frequency exceeds f max , the TR pressure distribution was only shown for the cleaner 40 dB SNR case, with application of the KPCA. The KRR estimations are still shown with both levels of noise to better observe how well the application of the KPCA improved the noisier sample. We also analyze the pointwise error of the estimations with the normalized error (NE) defined in (45). NE(r|s) = 10 log 10 |h(r|s) −ĥ(r|s)| 2 |h(r|s)| 2 . (45)

B. Experimental Results
As shown in Fig. 4, the KRR outperformed the TR for every frequency, for both region configurations and the KPCA was successful in mitigating the effects of noise in the samples. Special attention should be given to the 20 dB SNR case, where the TR diverged radically without the KPCA, but was actually able to perform better with it. Particularly for higher frequencies, when the models start to lose interpolation power, the KPCA showed itself to be effective. We conclude that the derived RKHS is an accurate representation of the analyzed data and the KRR is both more apt for lower frequency interpolations and is valid for a higher maximum frequency.
We also evaluated the pointwise interpolation performance of the methods. The attempted reconstructions can be seen in Figs. 5 and 6, and the corresponding pointwise errors are shown in Figs. 7 and 8. The KRR is compared with the best case scenario of the TR, as indicated by the NMSE results: TR+KPCA with 40 dB SNR. In both configurations, the KRR was shown to perform better. After the application of the KPCA, the KRR of the 20 dB SNR case shows notably better performance than the TR selected. This shows that the KRR not only achieves lower mean error, but also the reconstructed field diverges less from the expected on a pointwise basis as well.
Since the chosen method of feature extraction uses a filter, the successful application of the KPCA further validates the KRR  interpolation: even without prior knowledge of the signal itself, projecting the data into the principal vectors resulted in a cleaner sample, showing that the method was indeed mostly filtering out noise and that the enforcement of reciprocity enforces behavior expected to be observed in the data. We infer that the ATF function being approximated is thus an element of the RKHS, or at least close to it, which means the derivations presented in this work are effective and plausible representations of the ATF functions analyzed. Consequently, the enforcement of basic properties of the ATF is effective in extracting properties of the data set and provides greater robustness to noisy samples.

VII. CONCLUSION
We have introduced a kernel ridge regression interpolation method for the acoustic transfer function that is invariant to translations and rotations of the coordinate system, admits variable source and receiver positions, enforces reciprocity on its estimations, and is capable of interpolating ATF values for an arbitrary environment without prior knowledge of its physical or acoustic properties. This is a region-to-region method and is only constrained by the physical properties of the ATF. The derived estimation surpassed the previously established truncationbased method for every frequency. The application of the KPCA to increase noise robustness improves results further, even for the truncation method. The KRR when applied in conjunction with the KPCA was also shown to be capable of remaining more stable than other methods for higher frequencies even when the data has a high SNR.

APPENDIX
We will derive the kernel function in (18) using identities of the spherical harmonic functions. We begin by citing the Rayleigh equation [24] adapted for our case, where one of the vectors has unit norm and the other is multiplied by the wave number k: where r ∈ R 3 is some positional variable, θ r and φ r the zenith and azimuth angles of r in regards to the origin, v ∈ S 2 is a unit vector, θ v and φ v the zenith and azimuth angles of v. Thus, we want to compute the following integral: We also know that the spherical harmonic functions are orthogonal in regards to the above integral [24], and that Y 0 0 (φ v , θ v ) = 1/ √ 4π. Thus: By combining both expressions we have that: meaning the kernel function can then be derived by reorganizing the following integral: and substituting this expression into (18).