Combined Adjustment Pipeline for Improved Global Geopositioning Accuracy of Optical Satellite Imagery With the Aid of SAR and GLAS

Owing to the widespread availability of multiple remote sensing methods, regions are often covered by satellite observations from multiple sources. However, the geometric positioning accuracy of optical satellites, which are the most common data source, is generally insufficient. An intuitive way to improve accuracy is to integrate the respective accuracy advantages of multisource information. Previous articles have used synthetic aperture radar (SAR) and geoscience laser altimeter system (GLAS) data to improve geopositioning accuracy. However, the accuracy and applicability of existing combination methods are unsatisfactory because of various restrictions. In this article, a pipeline with automatic extraction of tie points and combined adjustment was designed based on nonstereo SAR images and GLAS data, considering both the high precision and wide distribution of multisource data. Experiments using real satellite data (ZY-3, GF-7, GF-3, and ICESat-2) and test areas covering complex landforms demonstrated the feasibility and effectiveness of the proposed pipeline. The geopositioning accuracy in three experimental areas reached 3.16, 3.36, and 3.17 m in the horizontal direction and 1.45, 1.25, and 1.28 m in the vertical direction. Our pipeline not only achieves accuracy close to the nominal accuracy of the heterogeneous reference data but can also be extended to global high-precision geopositioning without ground control.


I. INTRODUCTION
Earth observation satellites are widely used in a variety of fields to obtain geospatial information because of their advantages of fast data acquisition, low cost, and lack of regional restrictions [1]. In particular, abundant global multisource observation data have been obtained by satellites launched in recent years around the world, including optical satellites, synthetic aperture radar (SAR), and the geoscience laser altimeter system (GLAS). Examples include the Quickbird, Worldview optical series, and ICESat/GLAS series in the United States, Spot and Pleiades optical series in France, TerraSAR-X series in Germany, Advanced Land Observing Satellite Series in Japan, and Gaofen optical/SAR series satellites in China. These satellites have transformed global mapping from traditional single-source data to combined multisource data [2]. Among these multisource data, optical satellite images offer rich detail but have poor geopositioning accuracy without ground control, because this accuracy is limited by attitude determination accuracy. For instance, the geopositioning error of the ZY-3 satellite is about 10 m in the horizontal direction and 5 m in the vertical direction [3]. However, the clear texture and high stereo matching performance of ZY-3 make it suitable for three-dimensional (3-D) mapping, modeling, and other applications. In contrast, some satellite types have good initial geopositioning accuracy, but other limitations make it difficult to extract useful information from them directly. For SAR satellite images, geopositioning is based on the range-Doppler principle, which is not significantly affected by attitude determination. As such, SAR usually has high uncontrolled geopositioning accuracy. For example, an uncontrolled horizontal geopositioning error of 3 m (CE90) was observed for the GF-3 satellite after calibration [4]. However, substantial noise within SAR imagery makes matching and 3-D geopositioning difficult. The GLAS can accurately and directly acquire the terrain height through time-delay ranging (e.g., the vertical accuracy of the ICESat-2 satellite is better than 1 m [5]). This situation presents a significant opportunity for an integrated approach to multisource satellite observation data. Such an approach would help improve the geopositioning accuracy of optical satellite images using SAR and GLAS data.
Fig. 1. [caption start not recovered] ... is produced when projecting the image-space coordinates of the virtual control points (VCPs). This reduces the planar geopositioning accuracy of the SAR reference, similar to the situation for optical images. H is the altitude of the satellite orbit, θ is the angle of the ray, and δ is the similar error on optical satellite images.
Previous articles on the combination of multisource satellite data have focused on converting multisource data into virtual control points (VCPs) of optical satellite images, with GLAS data used as elevation control points (ECPs) and SAR data used as planar control points. These approaches can be divided into three categories as follows.
1) Combined adjustment of optical satellites and GLAS data.
Li et al. [6], Jin et al. [7], and Lin et al. [8] filtered the laser altimetry data of ICESat-2 based on topographic features (such as slope and gradient). The remaining laser points were used to generate ECPs. The geopositioning accuracy of the optical images was improved by adjusting with these ECPs. However, such methods can only improve the accuracy of optical images in the vertical direction, and not in the horizontal direction.
2) Combined adjustment of optical satellites and stereo SAR images.
Jiao et al. [9], [10] and Zhu et al. [11] obtained tie points between optical and SAR data through cross-modal matching. The VCPs were generated by the space intersection (triangulation) of these tie points on the stereo SAR images. After adjusting with these VCPs, the optical-satellite geopositioning accuracy was effectively improved. However, these methods rely completely on stereo SAR data. In China, for example, stereo SAR data are scarce because there are no on-orbit stereo SAR satellites, and only off-orbit stereo images can be obtained [12]. Therefore, the applicability of this method is restricted by the difficulty in obtaining stereo SAR images in the target region.
3) Combined adjustment of optical satellites, nonstereo SAR, and digital elevation model (DEM) data.
Fan et al. [13] obtained the elevation value of SAR imagery points, and then used corresponding Shuttle Radar Topography Mission (SRTM) [14] data to calculate their geographical coordinates and generate VCPs of optical imagery. With the help of SRTM, Zhang et al. [15] produced a SAR orthophoto of the Hubei region of China, and obtained plane control points (CPs) through cross-modal matching [16]. These CPs were used to improve the geopositioning accuracy of the optical images. Both methods used a DEM as a reference in the vertical direction to support SAR-generated plane CPs. As Fig. 1 shows, their accuracy is therefore limited by the accuracy of the DEM used (e.g., the horizontal and vertical accuracies of SRTM are 20 m (CE90) and 16 m (LE90) [17]).
In this article, we aimed to develop a novel multisource combined geopositioning pipeline using 1) multisource data with high accuracy and 2) data that are widely available for global geopositioning. Thus, the pipeline is based on nonstereo SAR and GLAS data with high precision (compared with optical satellites) and broad accessibility. However, planar SAR (i.e., no stereo intersection) imagery lacks a reference in the vertical direction: there are no stereo SAR images for space intersection and no DEM from which to obtain elevation values. To address these issues, we used GLAS to improve the overall elevation accuracy, as in [6], and regarded the result as the vertical reference of the SAR image. After generating high-precision CPs for optical images, the geopositioning accuracy of optical images was improved through iterative combined adjustment. We evaluated the performance of the pipeline using real data from four satellites (ZY-3, GF-7, GF-3, and ICESat-2) over three experimental areas covering a variety of terrains. For the three areas, the combined geopositioning accuracy of optical images reached 3.16, 3.36, and 3.17 m in the horizontal direction, and 1.45, 1.25, and 1.28 m in the vertical direction. These results confirm the global high-precision geopositioning capability of our pipeline.
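To make the DEM limitation concrete, the following minimal sketch estimates how an elevation error shifts the horizontal position obtained from a single SAR image. It uses the common first-order, flat-terrain approximation for side-looking radar (ground-range shift ≈ Δh / tan θ, with θ the incidence angle); this relation is an assumption standing in for the expression behind Fig. 1, which is not recoverable here, and the 35° incidence angle is a hypothetical value, not one reported in the article.

```python
import math


def sar_horizontal_shift(dh_m: float, incidence_deg: float) -> float:
    """Approximate ground-range shift (m) caused by an elevation error dh_m.

    First-order, flat-terrain approximation: shift = dh / tan(incidence).
    """
    return dh_m / math.tan(math.radians(incidence_deg))


# SRTM nominal vertical accuracy (16 m, LE90) versus a GLAS-like 1 m reference.
# The 35 degree incidence angle is a hypothetical illustration value.
shift_srtm = sar_horizontal_shift(16.0, 35.0)
shift_glas = sar_horizontal_shift(1.0, 35.0)
print(f"SRTM-referenced shift: {shift_srtm:.1f} m")
print(f"GLAS-referenced shift: {shift_glas:.1f} m")
```

Under these assumptions, a 16 m elevation error maps to a horizontal shift of roughly 23 m, whereas a 1 m elevation error maps to well under 2 m, which is why a better vertical reference matters for SAR-derived plane CPs.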

A. Pipeline Overview
Our combined adjustment pipeline was developed to improve the geometric performance of optical satellite images by integrating multisource satellite data with higher geopositioning accuracy. In our approach, rational polynomial coefficients (RPCs) of optical satellite images with initial errors are adjusted by 3-D CPs with favorable reliability and distribution generated from SAR and GLAS data.
Two types of CPs for optical satellite images are produced from multisource tie points: ECPs and plane CPs. Although these CPs contain geometric information from the SAR and GLAS data, they are not "traditional" CPs because errors are introduced when they are generated. The error of the ECPs arises from the initial planar error of the optical satellite image; the error of the plane CPs arises from the lack of a vertical reference in the absence of a stereo SAR image pair.
An iterative method was used to reduce these errors and obtain CPs with high accuracy. First, we extracted GLAS data to form ECPs and improved the absolute accuracy of the optical images in the vertical direction by removing the bias of the RPCs. The tie points between the optical and SAR images were then identified using a cross-modal matching algorithm [25]. The vertical ground coordinates of the CPs were triangulated from the optical-image points of the optical-SAR tie points, using the stereo optical images with improved elevation accuracy. The horizontal ground coordinates of the CPs were then obtained by projection with the RPCs of the SAR images at these vertical coordinates. Finally, all CPs, ECPs, and tie points (TPs) were iteratively adjusted to improve the geopositioning accuracy of the optical images. The pipeline framework is illustrated in Fig. 2.

1) Automatic Identification of Multisource Tie Points:
Multisource tie points identified in multisource satellite data provide the observations for, and are the premise of, the combined adjustment. They are usually obtained by a matching algorithm and selected according to certain criteria. In our pipeline, we used the scale-invariant feature transform (SIFT) [18], [19] and random sample consensus (RANSAC) [20], [21] to extract and filter optical-optical tie points. The other two types of multisource tie points, optical-GLAS and optical-SAR tie points, were obtained as follows.
a) Optical-GLAS tie points: Given the laser point vector $P_k$, its projection onto optical image $i$ is
$$p_{i,k} = \mathrm{RPC}(P_k) \qquad (1)$$
where $p_{i,k}$ is the $k$th image-space point measured on image $i$, corresponding to the $k$th laser point, and $\mathrm{RPC}(\cdot)$ describes the transformation from the object-space coordinate to the image-space coordinate in the RPC domain. We call these projected points optical-GLAS tie points.
b) Optical-SAR tie points: Gaussian pyramid features of oriented gradients [25] were selected for cross-modal matching, and geometric verification was used to eliminate false matches. However, because cross-modal matching is less robust than same-modal matching, there were many outliers. Furthermore, owing to the large coverage area and different imaging modes of the cross-modal satellite images, entire images did not satisfy a single geometric transformation (affine or homography). This made it difficult to eliminate these outliers directly with RANSAC [20]. Fortunately, we noted that a geometric transformation was maintained within local image regions, which can be approximated by an affine camera model [26], [27]. Consequently, we divided the images into blocks and used a block RANSAC method, based on an affine transformation, to eliminate the mismatches within each block.
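The block RANSAC idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the block size, pixel threshold, iteration count, and minimum number of matches per block are all hypothetical parameters, and blocks are formed by a simple regular grid over the coordinates of the first image.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares 2-D affine transform (3x2 matrix) mapping src to dst."""
    A = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return params


def ransac_affine(src, dst, thresh=1.5, iters=200, rng=None):
    """Return a boolean inlier mask for matches consistent with one affine model."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = len(src)
    best = np.zeros(n, dtype=bool)
    if n < 3:
        return best
    A = np.hstack([src, np.ones((n, 1))])
    for _ in range(iters):
        idx = rng.choice(n, 3, replace=False)      # minimal sample for an affine model
        params = fit_affine(src[idx], dst[idx])
        err = np.linalg.norm(A @ params - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return best


def block_ransac(src, dst, block=1000.0, thresh=1.5, min_pts=6):
    """Run affine RANSAC independently inside each grid block of the first image."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    keep = np.zeros(len(src), dtype=bool)
    cells = np.floor(src / block).astype(int)      # grid cell index of every match
    for cell in np.unique(cells, axis=0):
        idx = np.flatnonzero(np.all(cells == cell, axis=1))
        if len(idx) < min_pts:
            continue                               # too few matches to verify; drop them
        keep[idx] = ransac_affine(src[idx], dst[idx], thresh)
    return keep
```

Running RANSAC per block trades a global model that the whole scene cannot satisfy for several local affine models, at the cost of discarding matches in sparsely populated blocks.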
2) Elevation Accuracy Improvement by Combination With GLAS Data: To generate ECPs, we adopted a pipeline similar to that in [6]. First, laser points with high risk were eliminated using a terrain-based method that restricts the slope and roughness of the area they cover (details in [6]). Second, one image [such as the nadir camera (NAD) image] of the stereo optical images was selected as the seed image. Then, optical-GLAS tie points were projected onto this image using (1), and the corresponding points in the other images were subsequently obtained by least-squares matching. For the ECPs, the ground coordinates were the laser point vectors, and the measured image points were the matched corresponding points.
Given the ECPs and optical-optical tie points, the error equation based on affine bias compensation in the image space of the RPCs [22]-[24] can be formed and solved. The specific error equation and details are given in the first two rows of (2); only $v_1$ and $v_2$ need to be adjusted. After introducing the GLAS data and performing the combined adjustment, the vertical geopositioning accuracy of the optical satellite images was improved.
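The image-space affine bias compensation referred to above is commonly written as Δs = a0 + a1·s + a2·l and Δl = b0 + b1·s + b2·l, where (s, l) are the sample and line coordinates projected by the original RPCs. The sketch below estimates these six parameters by ordinary least squares under a simplifying assumption: ground coordinates are treated as known and fixed, whereas the full adjustment also corrects tie-point object coordinates. Function and variable names are illustrative.

```python
import numpy as np


def fit_affine_bias(projected, measured):
    """Estimate affine corrections so that projected + correction ~= measured.

    projected, measured: (n, 2) arrays of (sample, line) image coordinates.
    Returns a (3, 2) array: column 0 holds a0, a1, a2; column 1 holds b0, b1, b2.
    """
    projected = np.asarray(projected, float)
    measured = np.asarray(measured, float)
    A = np.column_stack([np.ones(len(projected)), projected])   # rows of [1, s, l]
    coeffs, *_ = np.linalg.lstsq(A, measured - projected, rcond=None)
    return coeffs


def apply_affine_bias(projected, coeffs):
    """Apply the estimated affine correction to RPC-projected coordinates."""
    projected = np.asarray(projected, float)
    A = np.column_stack([np.ones(len(projected)), projected])
    return projected + A @ coeffs
```

With only ECPs available, the constant terms a0 and b0 typically absorb most of the bias; the linear terms matter more for long strips, which is a general property of this model rather than a finding of the article.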
3) Generation of Control Points: The CPs comprise ground coordinates and measured image coordinates. In our pipeline, the ground coordinates should carry the high horizontal geopositioning accuracy of the SAR images and the high vertical geopositioning accuracy of the GLAS data. Thus, the vertical ground coordinates of the CPs were triangulated from the optical-image points of the optical-SAR tie points, using the RPCs of the optical satellite images with improved vertical geopositioning accuracy. The horizontal ground coordinates of the CPs were then obtained by projection, using these vertical ground coordinates, the SAR-image points of the optical-SAR tie points, and the RPCs of the SAR images. The optical-image points of the optical-SAR tie points serve as the measured image coordinates of the CPs. At this point, CPs with high accuracy have been generated from the SAR and GLAS data, and they are used in the combined adjustment to improve the geopositioning accuracy of the optical satellite images.
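The horizontal step described above amounts to inverting a ground-to-image model at a known height. A minimal sketch of that inversion is given below, using Newton iterations with a numerical Jacobian. Here `forward` is a hypothetical stand-in for the SAR RPC projection (any callable mapping longitude, latitude, and height to sample and line), and `toy_forward` is an invented linear model used only to show the call pattern; neither is taken from the article.

```python
import numpy as np


def image_to_ground(forward, s, l, h, lon0, lat0, iters=20, step=1e-5, tol=1e-9):
    """Find (lon, lat) at height h such that forward(lon, lat, h) == (s, l)."""
    x = np.array([lon0, lat0], float)
    target = np.array([s, l], float)
    for _ in range(iters):
        f0 = np.asarray(forward(x[0], x[1], h), float)
        # Numerical Jacobian d(s, l) / d(lon, lat) by forward differences.
        J = np.empty((2, 2))
        J[:, 0] = (np.asarray(forward(x[0] + step, x[1], h), float) - f0) / step
        J[:, 1] = (np.asarray(forward(x[0], x[1] + step, h), float) - f0) / step
        dx = np.linalg.solve(J, target - f0)
        x += dx
        if np.max(np.abs(dx)) < tol:
            break
    return x[0], x[1]


def toy_forward(lon, lat, h):
    """Invented linear sensor model, for illustration only (not a real RPC)."""
    return (1000.0 * (lon - 113.0) + 0.5 * h,
            -1200.0 * (lat - 34.0) + 0.1 * h)


# Height taken from the optical triangulation, image point taken from the SAR image.
lon, lat = image_to_ground(toy_forward, 250.0, 300.0, 100.0, 113.0, 34.0)
```

Because the height enters the inversion directly, any error in the triangulated height propagates into the recovered longitude and latitude, which is the dependency the pipeline later reduces by iteration.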

C. Combined Adjustment
Because the CPs derived from GLAS and SAR are not real ground CPs, we used them as weighted observations. All observations are regarded as tie points, and the RPC compensation parameters and the object-space coordinates are corrected at the same time. The object-space coordinates of the ECPs and CPs are subject to additional constraints.
Given the multisource tie points, ECPs, and CPs, the error equation can be linearized as follows: [equation missing in source] (4), where s and l are the image-space coordinates calculated from the RPCs of the optical satellite images and compensated by the affine parameters.
The observations $v_2$ and $v_3$ represent the constraints on the object-space coordinates of the CPs. $L_2$ and $L_3$ are zero vectors, which means that the corrected ground coordinates of these CPs should stay as close as possible to their initial ground coordinates. $B_2 = [0\ 0\ 1]^T$ and $B_3 = [1\ 1\ 1]^T$ are the coefficient matrices of the ECPs and CPs. $P_1$, $P_2$, and $P_3$ are the weights of the corresponding observations. They determine the contribution of each type of observation and are key to the combined adjustment. The weight $P_1$ is usually set to unit weight because these observations come from the same optical registration. The weight $P_2$ can be set to $1/\sigma_h^2$, where $\sigma_h$ is the nominal vertical geopositioning accuracy of the GLAS data. The weight $P_3$ can be set to $1/(\sigma_\varphi^2 + \sigma_\lambda^2 + \sigma_h^2)$, where $\sigma_\varphi$ and $\sigma_\lambda$ are the longitude and latitude uncertainties obtained by combining $\sigma_h$ with the nominal horizontal geopositioning accuracy of the SAR data. Such a weighting strategy ensures that we make full use of the expected geopositioning accuracy of the heterogeneous control data.
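The weighting rule above can be written out directly. The sketch below assumes that all three uncertainties are expressed in the same unit (metres); how the horizontal uncertainties are derived from the SAR nominal accuracy and the elevation uncertainty is not specified here, so they are simply passed in as inputs. The example values (1 m vertical, 3 m horizontal) echo the nominal accuracies quoted in the text but are otherwise illustrative.

```python
def observation_weights(sigma_h, sigma_phi, sigma_lam):
    """Return (P1, P2, P3) for tie-point, ECP, and CP observations.

    sigma_h:   nominal vertical accuracy of the GLAS data
    sigma_phi, sigma_lam: horizontal uncertainties of the SAR-derived CPs
    All inputs are assumed to share one unit (e.g., metres).
    """
    p1 = 1.0                                               # unit weight for optical tie points
    p2 = 1.0 / sigma_h ** 2                                # elevation constraint of ECPs
    p3 = 1.0 / (sigma_phi ** 2 + sigma_lam ** 2 + sigma_h ** 2)  # 3-D constraint of CPs
    return p1, p2, p3


# Illustrative values only: 1 m vertical, 3 m in each horizontal component.
p1, p2, p3 = observation_weights(1.0, 3.0, 3.0)
print(p1, p2, p3)
```

With these inputs the CP constraint receives a much smaller weight than the ECP constraint, reflecting that the SAR-derived horizontal coordinates are less certain than the laser elevations.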
Finally, to obtain a robust solution of (2), the spectral correction least-squares method [28] was used. M-estimation [29] with the Huber method [30] was used to suppress gross errors.
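As an illustration of how Huber M-estimation suppresses gross errors, the sketch below solves a generic linear system by iteratively reweighted least squares (IRLS). It is not the spectral correction least-squares solver used in the article, which is not reproduced here; the tuning constant 1.345 and the MAD-based scale estimate are conventional choices assumed for this example.

```python
import numpy as np


def huber_irls(A, y, k=1.345, iters=50, tol=1e-10):
    """Solve A x ~= y with Huber weights via iteratively reweighted least squares."""
    A = np.asarray(A, float)
    y = np.asarray(y, float)
    x, *_ = np.linalg.lstsq(A, y, rcond=None)          # ordinary least-squares start
    for _ in range(iters):
        r = y - A @ x
        # Robust scale from the median absolute deviation of the residuals.
        scale = max(1.4826 * np.median(np.abs(r - np.median(r))), 1e-12)
        u = np.abs(r) / scale
        # Huber weights: 1 inside the threshold, k/|u| outside it.
        w = np.where(u <= k, 1.0, k / np.maximum(u, 1e-300))
        sw = np.sqrt(w)
        x_new, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(x_new - x)) < tol
        x = x_new
        if converged:
            break
    return x
```

Observations with large standardized residuals keep a bounded influence on the solution instead of being rejected outright, which suits control points whose errors are uncertain rather than clearly wrong.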

D. Iterative Refinement
It should be noted that the ECPs generated by the above steps contain a vertical error. This is because the optical-GLAS tie points (Section II-B-1) are projected directly, without coregistration between the optical satellite images and the GLAS data. Consequently, the initial optical geopositioning error leads to a vertical error $H_{err}$ (Fig. 3). This error depends on the local terrain at the projected ground position (the error over undulating terrain is greater than the error over flat terrain, $H_{errF}$). Thus, the final vertical geopositioning accuracy after the combined adjustment with GLAS was reduced. Combined with the analysis in Fig. 1, this vertical error led to a planar geopositioning error in the CPs generated in Section II-B, because the horizontal ground coordinates of the CPs depend on the vertical ground coordinates obtained by space intersection with the GLAS-improved RPCs. This reduced the accuracy of the final combined adjustment.
To reduce this error, we used iterative refinement (Fig. 4). Specifically, after the first combined adjustment, in which the SAR data participated, the horizontal accuracy of the optical satellite images was significantly improved (even if it did not reach the optimal level). The laser points were then reprojected with the improved RPCs to obtain ECPs with higher horizontal accuracy and thus smaller elevation error (Fig. 3). Consequently, the horizontal ground geopositioning accuracy of the CPs was improved by reprojection with the SAR images. Subsequently, the combined adjustment was repeated to improve the final geopositioning accuracy. This process was iterated until the final accuracy converged (generally within 2-5 iterations).
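The control flow of the refinement loop can be sketched generically. In the sketch below, `step` is a hypothetical callable standing for one full pass (reproject the laser points, regenerate the CPs, rerun the combined adjustment) that returns an accuracy measure such as a residual RMSE; the tolerance, the iteration cap of five, and the toy contraction used in the example are assumptions for illustration and do not reproduce any result from the article.

```python
from typing import Callable, Tuple


def iterate_refinement(step: Callable[[float], float], rmse0: float,
                       tol: float = 0.01, max_iter: int = 5) -> Tuple[float, int]:
    """Repeat `step` until the accuracy measure changes by less than `tol`.

    Returns the last accuracy value and the number of passes performed.
    """
    rmse = rmse0
    for i in range(1, max_iter + 1):
        new_rmse = step(rmse)
        if abs(new_rmse - rmse) < tol:
            return new_rmse, i
        rmse = new_rmse
    return rmse, max_iter


# Toy stand-in for one refinement pass: the error contracts toward a 3 m floor.
final_rmse, n_passes = iterate_refinement(lambda r: 3.0 + 0.3 * (r - 3.0),
                                          rmse0=8.0, tol=0.1)
```

Stopping on the change between passes, rather than on an absolute target, matches the description of iterating until the accuracy converges.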

A. Data Description
To test and validate the performance and applicability of our pipeline, multisource satellite data from ZY-3, GF-7, GF-3, and ICESat-2 were collected over the Songshan and Taihang areas in China. The Songshan area comprises various landform types, including mountains, urban development, plains, and farmland, while the Taihang area is located in the Taihang Mountains and consists of mountainous terrain. The ZY-3 optical satellite carries three-line-array stereo cameras. The ground sampling distance (GSD) of its forward (FWD) and backward (BWD) cameras is 3.5 m, and the GSD of the NAD camera is 2.1 m. The FWD and BWD cameras of GF-7 can perform stereo imaging with a GSD of 0.7 m. The GF-3 images were acquired in the ultra-fine strip mode; their horizontal geopositioning accuracy without ground control is better than 3 m [4]. GLAS data were obtained from the ICESat-2 ATL08 product. The diameter of the laser footprint is 17 m, the along-track interval is 0.7 m, and the vertical accuracy is better than 1 m [5].
Independent checkpoints in each test area were identified to quantify the geopositioning accuracy of the combined adjustment. Specifically, the checkpoints in the Songshan (ZY-3) and Songshan (GF-7) areas were obtained by high-precision matching and manual point picking against benchmark data (a digital orthophoto map (DOM) and a DEM) that meet the 1:5000 scale; the ground coordinates of the checkpoints in the Taihang Mountain area were surveyed global positioning system points, with horizontal and vertical accuracies better than 0.2 m.
Hereafter, A and B represent the Songshan area and C represents the Taihang area (see Table I and Fig. 5 for details of survey area).

B. Result of Tie Points Identification
We identified the tie points between optical-optical, optical-GLAS, and optical-SAR. An overview of the multisource tie points is listed in Table II, and their distributions are shown in Fig. 6.
The identified multisource tie points were well distributed in each study area (Fig. 6), which supports the precision of the combined adjustment in Section II-C; the details of the CPs are shown in Fig. 7. Specifically, owing to the rich texture and small gray-level differences between optical satellite images, the optical-optical tie points were evenly and densely distributed in each survey area (blue points in Fig. 6). Only the steep mountains in survey area C showed poor local matching. Thus, there was interior consistency among the satellite images in the combined adjustment. The optical-GLAS tie points were evenly distributed along the laser strips (orange crosses in Fig. 6) and provided good vertical control.
In contrast, optical-SAR multisource matching differed from homologous matching and included geographic prediction and template matching. The initial matches (blue lines in Fig. 8) did not conform to a single geometric transformation, so gross errors could not be easily eliminated. However, after the block RANSAC elimination in Section II-B, the correct matches were well retained (red lines in Fig. 8).

C. Combined Adjustment Performance
To evaluate the geopositioning performance of our pipeline, seven methods were designed for comparison as follows:
1) Direct intersection of the original optical images without adjustment.
2) Optical image adjustment only.
3) Combined adjustment of optical and GLAS data.
4) SRTM-assisted optical and SAR combined adjustment.
5) SRTM-assisted optical, SAR, and GLAS combined adjustment.
6) Our combined adjustment method for optical, SAR, and GLAS without refinement.
7) Our whole pipeline including iterative refinement.
Among these methods, Method I tests the initial geopositioning accuracy of optical imagery without ground control; Method II tests the elevation accuracy after adjustment of the optical images only; Method III tests the elevation accuracy improved by the participation of GLAS data only; Method IV tests the improvement of plane accuracy by combined adjustment with SAR only (with SRTM as the elevation reference); Method V tests the performance of the existing method (SRTM-assisted optical, SAR, and GLAS combined adjustment); Method VI tests the use of the GLAS-improved elevation as the SAR elevation reference, without iterative RPC refinement; and Method VII tests the entire proposed pipeline.
The minimum error (MinE), maximum error (MaxE), and root mean square error (RMSE) were used to quantify accuracy. The results of the experiments in different study areas are listed in Table III.
1) Performance of the Existing Methods: When no multisource data participated (Methods I and II), the results show that a large horizontal error remained after optical image-only adjustment; for example, for Method II the error was 6.99 m (A), 7.93 m (B), and 4.98 m (C). This implies that projecting GLAS points onto the optical image introduces a vertical error because of these horizontal errors. We addressed this using GLAS point reprojection and iterative refinement (Section II-D).
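The three error statistics reported in Table III (MinE, MaxE, and RMSE) can be computed from per-checkpoint errors as in the sketch below. It assumes that MinE and MaxE are taken over error magnitudes and that the horizontal error of a checkpoint is the planar distance between the computed and reference positions; both are common conventions but are assumptions here.

```python
import math


def horizontal_error(dx: float, dy: float) -> float:
    """Planar distance between computed and reference checkpoint positions."""
    return math.hypot(dx, dy)


def error_stats(errors):
    """Return (MinE, MaxE, RMSE) for a sequence of per-checkpoint errors."""
    magnitudes = [abs(e) for e in errors]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return min(magnitudes), max(magnitudes), rmse


# Illustrative vertical errors in metres (made-up values, not from Table III).
min_e, max_e, rmse = error_stats([3.0, -4.0])
```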
After introducing GLAS data (Method III), the results show that even if ECPs are only being projected instead of being matched, the combined adjustment between the optical satellite and GLAS data can significantly improve the vertical geopositioning accuracy of optical satellite images. This is the premise of the pipeline. If the vertical accuracy cannot be improved in the case of only GLAS data and no stereo SAR, we lose the elevation reference and cannot obtain the horizontal ground coordinates from the SAR images (Section II-B-3).
After introducing SAR data (Method IV), the SRTM data were used as the elevation reference for the SAR images. Owing to the insufficient accuracy of SRTM (16 m [17]), there is a large coordinate error in the horizontal direction of the CPs. Therefore, the improvement in horizontal geopositioning accuracy with Method IV is limited (from 7.59 to 6.43 m for area A and from 7.29 to 7.07 m for area B). In area C, where there are numerous mountains and the initial horizontal accuracy of the optical satellite images was good, the accuracy was even reduced after combined adjustment (from 3.5 to 7.65 m).
Method V introduced SAR and GLAS data, but SRTM was still used as the SAR elevation reference according to existing methods [13], [15]. The results show that the horizontal accuracy improved from 7.87 to 6.43 m for area A and from 9.28 to 7.34 m for area B; thus, the improvement was less than that for Methods VI and VII. In area C, the accuracy was reduced because of poor DEM accuracy owing to the complex terrain, consistent with the findings for Method IV. Moreover, GLAS data were used only after terrain screening, limiting the improvement in elevation. In summary, this method cannot effectively use the accuracy advantage of the SAR data.
To further investigate errors caused by the insufficient accuracy of SRTM DEMs, we designed an additional experiment (Fig. 9). The true CPs were generated from the reference DEM and DOM by automatic matching or manual selection. Fig. 9 shows an error of several meters between the elevation value read from SRTM and the true value [blue line in Fig. 9(a)]; this error is propagated through the calculation of the SAR horizontal coordinates (Fig. 1), resulting in a large in-plane error [red arrows in Fig. 9(b), where arrow length represents error magnitude]. These plane errors reduced the geopositioning accuracy of the combined adjustment. Thus, obtaining a better elevation reference for SAR is an important aspect of the joint optical and SAR adjustment, and is also the core of our pipeline.
2) Performance of Our Pipeline: Using Method III (Table III), the vertical geopositioning accuracy improved to 2.3 m in area A, 1.41 m in area B, and 1.37 m in area C after the optical-GLAS combined adjustment. Under this condition, the method described in Section II-B-3 was used to generate CPs for combined adjustment. Without RPC refinement, the geopositioning accuracy (Method VI) improved to 3.59 m for area A, 4.36 m for area B, and 3.38 m for area C in the horizontal direction; and 1.83 m in area A, 1.26 m in area B, and 1.35 m in area C in the vertical direction. These results confirm the feasibility of the proposed method, which improves the overall vertical accuracy of the optical images by introducing GLAS data and then provides elevation data for the SAR images to generate CPs with better horizontal accuracy.
After the first combined adjustment, the horizontal accuracy of the optical satellite images improved to approximately 4 m (Table III; Method VI). At this point, the elevation error of the initial ECPs can be reduced by reprojection (Section II-D) and further suppressed through iterative refinement. Fig. 9(a) shows that the vertical accuracy of the ECPs in the three iterations was significantly better than that of the elevation values obtained from SRTM. The error after the third iteration was reduced to approximately 1 m (close to the ICESat-2 nominal accuracy).
The reduction in the error of the elevation reference also improved the horizontal accuracy of the CPs. Fig. 9(b) (blue arrows) indicates the horizontal accuracy of the CPs generated using SAR images. In the last iteration, the errors converged to approximately 3 m (close to the GF-3 nominal accuracy), which is less than the error caused by using SRTM as the elevation reference. Overall, the accuracy of this algorithm is better than that of the SRTM-assisted optical, SAR, and GLAS combination [13], [15]; the proposed pipeline thus compares favorably with existing methods, permitting better accuracy and wider applicability.
The final horizontal and vertical geopositioning accuracies of each area (Method VII) were 3.16 and 1.45 m (for area A), 3.36 and 1.17 m (for area B), and 3.17 and 1.28 m (for area C), respectively, close to the nominal accuracy of the multisource reference data. This accuracy is better than that of Method IV (combination of SRTM-assisted optical and SAR data) [13], [15]. This demonstrates that the information in the multisource data is fully utilized. In terms of applicability, there was only one SAR image in survey area A, and although there were two SAR images in survey areas B and C, they did not form stereo image pairs. As such, stereo SAR and optical combination methods [9]-[11] could not be used. Nevertheless, the method proposed here worked well in all experimental regions and offers global applicability.

IV. CONCLUSION
In this article, we developed a combined adjustment pipeline that integrates the accuracy advantages of SAR and laser data to improve the geopositioning accuracy of optical satellite images. Based on theoretical and experimental validation, the following conclusions can be drawn. Our pipeline can effectively improve the geopositioning accuracy of optical satellite images without the use of ground control. When tested using real satellite data from three study areas containing various landforms, the geopositioning accuracy reached 3.16, 3.36, and 3.17 m in the horizontal direction, and 1.45, 1.17, and 1.28 m in the vertical direction (Table III). This accuracy is close to the nominal accuracy of the heterogeneous reference data (GF-3: 3 m and ICESat-2: 0.75 m), which demonstrates that our pipeline makes full use of the precision information in the multisource data. Finally, compared with existing pipelines, our pipeline achieves better accuracy and wider applicability; moreover, it is not limited by the accuracy of an auxiliary DEM [13], [15] and does not require stereo SAR images [9]-[11]. The proposed pipeline offers the potential for application in global high-precision geopositioning.