An Extension of Multiquadric Method Based on Trend Analysis for Surface Construction

The representation of spatial discrete data is crucial for the data analysis of meteorology, agriculture, geological exploration, and other fields. Among various methods for scattered data interpolation, the multiquadric (MQ) method is the most favorable in the field of surface construction. However, the classical MQ method has accuracy concern on the boundary and is time consuming for large dataset. This study proposes an algorithm by integrating the MQ method with trend surface analysis. In which, the low-order polynomial trend surface equation is first used to model the overall trend. Then, the MQ equation is applied to fit the residual surface after removal the trend from the data. Our implementation can eliminate the distortion in data missing areas by the classical MQ method, and the modeling efficiency can be improved significantly since the local MQ method divide the residual surface into a group of subsurfaces. The accuracy and efficiency of the proposed algorithm are validated on a synthetic model. The performance of the developed algorithm is further examined on the elevation data collected in Tibet and the seabed of a strait in Norway. The results show that with an equivalent resolution, the developed algorithm can be much more efficient than the classical MQ method and well-developed Kriging method.


I. INTRODUCTION
G EO-DATA collection is limited by objective factors, such as terrain and weather, and the collected data points are always irregularly distributed on the plane [1], [2]. The interpolation techniques have been developed to improve the understanding of the structure and geometry of geological terrain [3], [4]. These interpolation algorithms include the trend surface analysis, multiquadric (MQ) method, Kriging method, thin-plate spline, minimum curvature method, inverse distance weighted method, and so on [5], [6], [7]. Typically, the data value is approximated by the polynomial functions of the geographic coordinates of the data points. Once the coefficient matrix is determined from the known data points, the value at any desired point can be approximated analytically. The MQ method was first proposed by Hardy in 1971, who present the basic formula of the MQ equation, and applied it to construct the topography from discrete terrain data [8]. The MQ method has been widely applied to geology, meteorology, hydrology, biology, and other fields [9], [10], [11], [12], because its mathematical meaning agrees with the related physical processes. Comparisons of different interpolation on mathematical surfaces generated from sparse and scattered data have proved that the classical MQ method developed by Hardy in 1971 still performs exceptionally well in terms of the condition number of the resulting equation, efficiency, and accuracy [13], [14]. However, the distortion is a major concern special for data points near the boundary, since the interpolation is subject to insufficient data or an extremely uneven and scattered distribution of data [15], [16].
To alleviate the boundary distortion of MQ surface, we applied a trend surface function to give a basic model. Trend surface analysis method has the highest efficiency, but it leaves features of interest highlighted in the residuals, which result a coarse resolution of local structures. Previous studies have suggested that the function of trend surface interpolation can better reflect the overall characteristics of discrete datasets, while which losses local characteristic information and is unsuitable for modeling too complex discrete data [17], [18]. Therefore, we intend to combine these two methods to construct a spatial mathematical surface, and seek for approximate the spatial fluctuation characteristics of discrete datasets with high accuracy.
The efficiency of the MQ method is also a major concern arises from large datasets. The MQ method is one of the global interpolant method, which depends on all data points to establish its governing equation [19], [20]. Actually, the DOF of the governing equation of the MQ method equals the number of the discrete data, and the time is mainly spend on solving the equation to get the coefficient matric. To accelerate the MQ method, we developed the local MQ method special for modeling of large datasets by dividing the entire area into a group of small subareas.
In this study, we first present the interpolation surface fitting methods based on trend function and MQ function. Then, introduce the hybrid scheme of modeling method based on the MQ surface function combined with trend surface function. Subsequently, the local MQ method is proposed to accelerate the hybrid scheme to model large datasets by dividing a large area into a group of small subareas. After that, the accuracy and efficiency of the hybrid method is validated by datasets generated by a test function. Finally, the algorithm is applied to elevation data in Tibet and seabed terrain data to assess their practice performance.

II. METHODOLOGY
Given M distinct scattered points, the bivariate interpolation problem is to find a function f : D ⊂ R 2 → R that interpolates these data satisfying f (x i , y i ) = z i . In this section, we will illustrate the polynomial trend surface analysis and MQ interpolation technique.
1) Trend Surface Function: The trend function T (x , y) is applied to match the observed quantities at discrete points (x i , y i , z i ), i = 1, 2, · · · M , and the trend surface model is considered as a polynomial function of geographic coordinates where, x and y are the average plane coordinates of the M discrete points along the x -and y-axis, respectively. n is the order of the trend surface, and the larger value of n presents a higher accuracy expectation for the data-intensive area [21]. However, the violent fluctuation happens when a large n is used on the data-sparse area, which will result in artificial errors in the data missing area. c ij is the trend surface coefficient, and its number N = (n + 1)(n + 2)/2. Equation (1) can be represented as a system of the linear equation as where, Z is derived by defining T (x i , y i ) = z i . Typically, the number of discrete points is much larger than that of trend surface coefficients. Therefore, (2) is an overdetermined system, which should be solved in the sense of least squares by minimizing AC − Z 2 . Then, we can get The method of Gaussian elimination with complete pivoting is employed to solve (3) to get the trend surface coefficients C. Subsequently, the trend surface can be analytically approximated by following (1).

2) MQ Function:
The MQ function Q(x , y) used to fits the data points can be presented as where, w i is the conicoid coefficient, s is the shape parameter used to adjust the smoothness of the quadric surface [22]. Hardy (1971) with d i is the distance between the i th data point and its nearest neighbor [23]. Franke (1982) gives the s selection empirically as s = 1.25D/ √ M , with D is the diameter of the minimum circle that encloses all data points [19]. However, its value is related to the number and distribution of data points, as well as the fluctuation and observation accuracy of the dataset, and we define s = 0 .2D/ √ M according to practical applications. Defining Q(x i , y i ) = z i the (4) can be presented as the matrix form as Solving (5) yields the MQ coefficients W, and then the MQ surface will be expressed analytically by (4). Here, we should note that the number of elements of conicoid coefficient W is much larger than that of trend surface coefficient C. The elements of matrix B relate to the plane distance between the data points, which is determined by the location of the data points. It is obvious that the more evenly distributed of the data points, the smaller the condition of the matrix. Therefore, the damped least square method is preferred to solve (5) to guarantee the accuracy of W, since the extremely uneven distribution of data points will result in the ill condition of matrix B. Based on the damped least square method, (5) is transformed into a regularized formulation as where, I is the identity matrix, λ is the damping factor. Since the storage requirements are M 2 and the computational complexity is of the order of M 3 to solve the linear system, this method is inefficient for large values of M .

III. NUMERICAL TECHNIQUE
In the hybrid scheme, the MQ function and trend function are employed to delineate the local and overall features of the data points, respectively. The combination of these two methods, resulting in overall MQ surface integrated with trend surface (GMQT) and local MQ surface integrated with trend surface (LMQT), the details following.
1) GMQT Method: The overall MQ surface implies all data points are used to construct the MQ function, and the surface function consists of the trend function and MQ functions, which can be expressed as (4).
2) LMQT Method: There will be an efficiency challenge to the GMQT method due to the large degree of freedom (DOF) of the resulting system of the MQ equation. The LMQT method divides the entire area into NK subareas and the local MQ surface is employed on each subarea. Then, the entire surface can be expressed as where, the trend function T (x , y) is identical to Section 2.1. Q k (x , y) is the local MQ function to fit the data points (x i , y i , z i − T (x i , y i )) locate in the k th, k = 1, 2, . . . , NK subarea. The selection of subareas plays a key role in the LMQT algorithm, and integration of the trend function and local MQ function is given in the following. a) Search for the plane coordinates of the collected data points (x i , y i ), and define the entire area by x ∈ [x min , x max ] and y ∈ [y min , y max ], as indicted by Fig. 1. Then use the side length as d (usually much smaller than the average dot pitch) to discretize the area into gridded data points (x l , y l ), l = 1, 2, . . . , NL. Fig. 1 shows the plane coordinate of the collected data points and gridded data points by solid circle and rectangle, respectively. b) Calculate the coefficient of the trend surface in order n (1, 2, or 3) and formulate the trend surface function (9), according to the collected data points (x i , y i ), i = 1, 2, . . . , M . Then the trend value on gridded data points (x l , y l ), l = 1, 2, . . . , NL can be obtained analytically by (10) T c) Assemble triangulation network using the plane coordinates of the collected data points (x i , y i ), i = 1, 2, . . . , M , which resulting NK Delaunay triangles. Then, search the gridded data points enveloped by each triangle, as shown in Fig. 1. d) Determine the k th subareas to establish the local MQ function. There are NK subareas each consisting of a center triangle and its neighboring triangles, and the local MQ function is employed to calculate the values of the gridded data points enveloped in the center triangle. As the central triangle, its neighbor triangles are searched recursively outward (for example, the recursive depth is 3), and the vertices of the neighboring triangles are used to construct the local MQ surface function p(x t , y t ), t = 1, 2, . . . , NT , as shown in Fig. 2. e) The steps of integration of trend surface function and local MQ surface function in the k th subarea are given in the following.
Step 1: Search the collected data points (x t , y t ), t = 1, 2, . . . , NT in the k th subarea, and calculate the differential between its true value and its trend value, according to the following: Step 2: Treat p(x t , y t , δz t ), t = 1, 2, . . . , NT as collected local surface, and establish the MQ equation according to Section 2.2. Then the local MQ function can be obtained by Step 3: Calculate the trend surface value and local MQ value of the gridded data points (x l , y l ), l = 1, 2, . . . , NL in the k th triangle, according to (10) and (12), respectively. The constructed surface on the k th triangle can be presented by f) Repeat D and E until k = NK . Then, the local surfaces make up of the entire area surface.

IV. NUMERICAL EXPERIMENTS
In this section, the accuracy and efficiency of the developed algorithm is validated by a synthetic model. Then, the GMQT method and LMQT method are applied to construct the topography of areas in Tibet and seabed of a strait in Norway, respectively. The classical MQ method and well-developed Kriging method is treated as control to assess the accuracy and efficiency.

1) Accuracy and Efficiency Validation on a Synthetic Model:
In order to verify the accuracy of the GMQT method and LMQT method, we choose the synthetic model defined by [17]. The analytical geometry is according to (14), and analytical surface in the domain [0, 1] × [0, 1] is presented as Fig. 3(a) We designed three groups of scattered data with 36 (with 6 points evenly sampled with 0.2 step length in both x-and y-direction), 121 (with 11 points evenly sampled with 0.1 step length in both x-and y-direction), and 441 (with 21 points evenly sampled with 0.05 step length in both x-and y-direction) discrete points to reconstruct the surface defined by 10 201 grided data (with 101 points evenly sampled with 0.01 step length in both x-and y-direction). The details of these three groups of discrete points for modeling test are indicated in Fig. 3(b)-(d).
The accuracy and efficiency of the GMQT and LMQT are validated by the classical MQ method and Kriging method. Figs. 4-6 show the absolute error by the MQ, Kriging, GMQT, and LMQT on the 10 201 grided points modeling from 36, 121, and 441 discrete points, respectively. Table I further count the average error and maximum error by different methods. As shown in Figs. 4-6, the maximum absolute error of these four methods occurs at the convex hull of the conical surface. As indicated in Table I, both the average error and maximum error decrease with the increase number of the original sampling data, which agrees with truth that the interpolation accuracy is positive related with the density of collected data. The MQ, GMQT, and LMQT has similar absolute errors, while the Kriging has the largest average error and maximum error.
These experiments of different methods are completed on a PC with a main frequency of 4 GHz. Fig. 7 shows the time cost to construct the surface defined on 10 201 grided points by different  methods. As shown in Fig. 7, the time consumed by the LMQT is linear increased with the number of discrete points. The LMQT spend more time than MQ, GMQT, and Kriging with 36 and 121 discrete points, while it became much more efficient than other methods as the number of discrete points increase to 441.
As shown in Figs. 4-7 and Table I, the numerical experiments on the synthetic model illustrate that both the developed GMQT and LMQT perform with high accuracy, the LMQT can be much more efficient than the classical MQ method and Kriging method on large scattered datasets.
2) Application of the GMQT Method: The elevation data collected from a certain place in Tibet is selected to examine performance of the GMQT method. As shown in Fig. 8(a), there   The results of the classical MQ and Kriging method indicated by Fig. 8(b) and (f), respectively, are employed for comparison purpose. As shown in Fig. 8(b), the classical MQ method has obvious distortion in the data missing area near the boundary. As shown in Fig. 8(c)-(f), both the GMQT method and the Kriging method can suppress the distortion in the missing data region, and the consistency of the results of these two methods indicates their correctness. The comparison between Fig. 8(c)-(f) shows that the trend functions of different orders have little influence on the accuracy of the GMQT method. As shown in Table II  .82] m. A total of 9.09 × 10 5 grided points are defined to construct the seabed terrain, and the area is discretized by grids with uniform 64.2553 and 64.3172 m intervals along xand y-axis, respectively. Considering the GMQT method cannot be completed due to the limitation of computer memory, the LMQT method is preferred to calculate the elevation defined on 9.09 × 10 5 grided points. Fig. 9(a) and (b) shows the mapping results of the 9.09 × 10 5 data points by the LMQT method and Kriging method, respectively. There is no significant difference between these two methods; both constructed seabed terrains reflect the ups and downs of seabed topography with high quality. However, in terms of time consumption, the LMQT method only takes 20.8 s, while the well-developed Kriging method (adopted by the commercial software surfer 18) takes 404 s. The numerical experiments on a large set of elevation data of a strait in Norway demonstrate that the developed LMQT method is more efficient than the well-developed Kriging method by commercial software.

V. CONCLUSION
We developed an algorithm based on the integration of classical MQ method and trend analysis for surface construction.
The GMQT method employs an extra trend surface function to provides basic model for the classical MQ method, the LMQT method further divides the entire survey area into many subareas, and the DOF of each resulted governing equation is much smaller than that of the GMQT method. The synthetic model shows that our proposed method has equal accuracy to the classical MQ method and Kriging method, and LMQT shows the efficiency advantage on large sets of scattered points. The application on the elevation data collected from Tibet illustrates that the GMQT method can suppress the distortion of the classical MQ method on the data missing area, especially for that near the boundary. The surface construction on the seabed terrain demonstrates that the LMQT method is more efficient than present commercial software. Both the synthetic model and practical expertise show that the developed method can efficiently construct the surface with high accuracy.