A New RD-RFM Stereo Geolocation Model for 3D Geo-Information Reconstruction of SAR-Optical Satellite Image Pairs

Nowadays, large volumes of Earth-observation data are acquired from many kinds of platforms, and data from synthetic aperture radar (SAR) sensors and optical sensors account for the vast majority. These data have been widely used in fields such as three-dimensional (3D) geo-information reconstruction and object recognition. Researchers have investigated the suitability of the rational function model (RFM) for 3D geo-information production with stereo SAR-optical images. Unlike optical remote sensing data, however, the RFM parameters of satellite SAR images are not provided to users in most cases, which increases the workload of users who apply RFM-based SAR-optical stereo geolocation. Moreover, the fitting accuracy of the RFM to the Range Doppler (RD) model, which is regarded as the rigorous sensor model for SAR imagery, does not always meet the requirements. Therefore, a new RD-RFM stereo geolocation model, which integrates the RD model of SAR images with the RFM of optical images, is proposed for 3D geo-information reconstruction with SAR-optical stereo image pairs. Experiments are conducted on datasets from the Gaofen-3 (GF-3) SAR satellite and the Gaofen-2 (GF-2) optical satellite covering urban and mountainous areas in Xinyi City and Dengfeng City, Henan Province, China, respectively. The results indicate that the proposed model achieves the best performance across different terrain and different convergence angles of the stereo SAR-optical image pairs. Compared with the traditional stereo geolocation model based on the RFM for both SAR and optical data, the proposed model is more concise and efficient for producing 3D geo-information from stereo SAR-optical image pairs, and it is easy for users to implement.


I. INTRODUCTION
With the improvement in the quality and quantity of Earth-observation data from space-borne satellites, many researchers have investigated their potential uses, among which three-dimensional (3D) geo-information is a key product. Traditionally, 3D information is derived from optical or synthetic aperture radar (SAR) satellite stereo images based on photogrammetry or radargrammetry, which can reach relatively high performance when precise orbit information is provided [1], [2]. However, 3D geo-information derived with photogrammetry or radargrammetry alone also suffers from different weaknesses. For example, optical images are influenced by weather and environmental conditions, and SAR images are harder to interpret.
The associate editor coordinating the review of this manuscript and approving it for publication was Stefania Bonafoni.
Therefore, SAR-optical stereogrammetry and data fusion for 3D geo-information reconstruction have been investigated by many researchers since the 1980s [3]-[5]. However, the absolute geolocation accuracy obtained was relatively poor, limited by the low resolution of the remote sensing datasets. Later, using very-high-resolution datasets from airborne sensors, Wenger et al. [6] achieved meter-level accuracy of building height by integrating InSAR data with aerial orthophotos. In 2015, Zhang et al. used multisensor datasets from GeoEye-1 and TerraSAR-X for 3D building reconstruction, and the accuracy reached meter level based on a stereo mapping model that combines an angle model for optical images with the range-Doppler (RD) model for SAR images [7], [8]. Recently, Qiu et al. proposed a new strategy for automatic SAR-optical stereogrammetry with space-borne imagery from WorldView-2, TerraSAR-X and MEMPHIS sensors, and the reconstruction accuracy also reached the meter domain in urban areas [9]. Unlike for space-borne SAR sensors, however, orbit parameters for optical satellite images are not provided to users in most cases, which makes it impossible to develop rigorous sensor models. In addition, establishing these rigorous models requires many parameters, which makes them rather complicated.
As a replacement for rigorous sensor models, the rational function model (RFM) was proposed for the geometric processing of remote sensing images [10], [11]. Using 80 rational polynomial coefficients (RPCs) and 10 normalization parameters, the RFM can fit the rigorous sensor models of both SAR and optical imagery. Benefiting from the simplicity and suitability of the RFM, SAR and optical satellite imagery can be integrated easily for the production of 3D geo-information. In 2018, Bagheri et al. proposed a new framework for SAR-optical stereogrammetry based on the RFM in urban areas [12]. With that framework, the geolocation accuracy of the produced point cloud is about 2 m over experimental urban areas in Munich and Berlin. The application of the RFM can thus simplify the 3D reconstruction problem in urban areas. However, RPCs of SAR images are not provided by image vendors, except for the Gaofen-3 satellite. The production of RPCs therefore increases the workload of users who apply this model. Moreover, the suitability of the RFM for SAR images in mountainous areas is relatively poor, which introduces extra fitting errors during the reconstruction of 3D geo-information [13]. Therefore, we propose a new RD-RFM model for SAR-optical stereogrammetry that is applicable in both mountainous and urban areas. In this model, all parameters, including the orbit information of the SAR satellite images and the RPCs of the optical satellite imagery, come from supplementary files provided by image vendors. The performance of the proposed model is verified using SAR imagery from the Gaofen-3 (GF-3) satellite and optical imagery from the Gaofen-2 (GF-2) satellite.
Launched in 2016, the GF-3 satellite has been operating for more than 3 years. Benefiting from high resolution and multiple imaging modes, SAR images obtained from GF-3 have been widely applied in many fields, such as object recognition and resource monitoring [14]. The geometric performance of GF-3 SAR images has been investigated by many researchers. In 2017, Liu et al. verified that a systematic geolocation accuracy within 3 m can be achieved based on precise orbit information [15]. After that, Wang et al. proposed an integrated orientation model for GF-3 stereo image pairs. Using sparse ground control points (GCPs) and image pairs with large convergence angles, the 3D geolocation accuracy can reach about 1.5 m in planimetry and 0.6 m in height [16]. Moreover, a cross-calibration model based on the RD model was proposed to improve the geometric accuracy of GF-3 SAR images, which can also achieve relatively high performance on datasets with very high resolution [17]. In 2019, Lu et al. developed a new model for 3D building reconstruction using GF-3 images, and the results indicated great performance in building reconstruction [18]. Therefore, the geometric performance of GF-3 can reach meter level with the help of supplementary orbit parameters.
The GF-2 satellite, in turn, is China's first civilian optical satellite with a resolution better than 1 m [19]. Two cameras are mounted on the platform: a panchromatic camera with a resolution of 1 m and a multispectral camera with a resolution of 4 m. The side-viewing angle can be adjusted freely within ±35°, and the revisit period is 5 days. As with some world-class optical satellites, image vendors provide RPCs for GF-2 images rather than precise orbit information. Therefore, the geometric performance can only be verified and calibrated based on the RFM [20]. In 2018, Yao et al. proposed a multi-observation model to improve the geolocation accuracy of GF-2 imagery without GCPs. The results indicated that the planimetric geolocation accuracy of the GF-2 dataset can be improved from about 80 m to 25 m [21]. However, the accuracy after calibration without GCPs is still insufficient. With the help of GCPs, the geolocation accuracy of GF-2 images can reach sub-meter level for further application [19].

II. METHODOLOGY
A. OVERVIEW
To obtain 3D geo-information from SAR-optical stereo imagery, we propose a new RD-RFM stereo geolocation model for the reconstruction process. First, the geometric performance of the SAR images is improved with precise orbit information and a suitable calibration model for the internal and atmospheric propagation delays. Second, tie points extracted from the calibrated SAR images are treated as virtual GCPs (VCPs) to correct the bias of the RPCs of the corresponding optical images. After the geometric correction of the optical images, the RD-RFM model is established by reprojecting the parameters of the SAR images from the geodetic Cartesian coordinate system to the geodetic coordinate system. After this reprojection, the 3D geodetic coordinates can be calculated with the RD-RFM model. Fig. 1 shows the flowchart of the proposed method.

B. RANGE DOPPLER MODEL
The Range Doppler model is the rigorous sensor model that follows from the imaging principle of SAR sensors [13]. For a single SAR image, Eq. (1) establishes the geometric relationship between an image space coordinate and the corresponding object space coordinate. In Eq. (1), (X, Y, Z) is the position vector of the target point, and (X_S, Y_S, Z_S) and (V_x, V_y, V_z) are the position and velocity vectors of the satellite platform in a geodetic Cartesian coordinate system. λ is the wavelength and f_d the Doppler center frequency of the SAR sensor, R is the distance between the target point and the SAR sensor, and h is the target height above the surface of the earth. R_e and R_p are the semi-major and semi-minor axes of the earth, related through the flattening factor f by Eq. (2) [13]. According to the Range Doppler model, the geolocation accuracy of SAR imagery is influenced by the satellite orbit error, the systematic time delay error, the atmospheric propagation delay error, the Doppler frequency error and the target elevation error [22].
VOLUME 8, 2020
1) Satellite Orbit Error: The satellite orbit error can be divided into the position error and the velocity error of the SAR sensor, each of which includes components in the along-track, cross-track and radial directions. Position errors in the cross-track and radial directions cause geolocation errors in the range direction, and position errors in the along-track direction lead to geolocation errors in the azimuth direction. The influence of the satellite velocity error can likewise be separated into these three directions. The velocity error results in a Doppler frequency error and a geolocation error in the azimuth direction, while the error it causes in the range direction can be ignored [15].
2) Systematic Time Delay Error: The systematic time delay error is mainly composed of the measurement error of the internal propagation delay, which results in a geolocation error in the range direction. The resulting geolocation error changes with the combination of pulse width and bandwidth [23]. For the GF-3 SAR satellite, the internal propagation delay error is relatively stable for a fixed combination of pulse width and bandwidth, as shown in Fig. 2. Across different polarization modes, the deviation of the internal propagation delay is within 3 ns.
3) Atmospheric Propagation Delay Error: The atmospheric propagation delay error depends on many components of the atmosphere and on the emission frequency of the radar signal, and it also leads to a distance measurement error in the range direction. Usually, a static calibration model is applied to calibrate the atmospheric propagation delay error [15].
4) Target Elevation Error: The target elevation error leads to a geolocation error in the range direction because of the side-viewing geometry of SAR sensors. Supposing the target elevation error is Δh and the incident angle is θ, the resulting geolocation error Δr in the range direction is given by Eq. (3).
According to the previous analysis, the geolocation accuracy of SAR imagery is mainly influenced by the satellite position and velocity errors, the internal and atmospheric propagation delay errors and the target elevation error. A dual-frequency GPS receiver is installed on the GF-3 SAR platform, so the measurement accuracy of the satellite position and velocity can reach 0.05 m and 0.05 mm/s, respectively. A static calibration model is used for the geometric correction of the internal and atmospheric propagation delay errors, which meets both the accuracy and efficiency requirements. The target elevation error can be compensated using a suitable earth model.
After calibration, GF-3 imagery can reach very high geolocation accuracy [16]. Applying Eq. (1) to both images of a stereo SAR image pair gives the stereo geolocation equations of a target point, Eq. (4). By linearizing Eq. (4), the 3D position (X, Y, Z) of the target point can be calculated using a least-squares method.
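As an illustration of the range and Doppler conditions described in this subsection, the following minimal sketch evaluates the slant range R and the Doppler center frequency f_d for a stationary target. The function name `range_doppler`, the sign convention for f_d (positive when the sensor approaches the target) and all numerical values are assumptions made for this example; they are not taken from the paper or from GF-3 product files.

```python
import math


def range_doppler(target, sat_pos, sat_vel, wavelength):
    """Return (slant range R in m, Doppler center frequency f_d in Hz).

    All vectors are in the same geodetic Cartesian frame and the target is
    treated as stationary in that frame.
    """
    # Line-of-sight vector from the satellite to the target.
    los = [t - s for t, s in zip(target, sat_pos)]
    slant_range = math.sqrt(sum(d * d for d in los))
    # Assumed sign convention: f_d > 0 when the range is decreasing.
    f_d = 2.0 * sum(d * v for d, v in zip(los, sat_vel)) / (wavelength * slant_range)
    return slant_range, f_d


# Hypothetical geometry: the target lies exactly broadside to the velocity vector.
sat_pos = (7.0e6, 0.0, 0.0)
sat_vel = (0.0, 7.5e3, 0.0)
wavelength = 0.0555  # roughly C-band, illustrative value only
R, f_d = range_doppler((6.378e6, 0.0, 0.0), sat_pos, sat_vel, wavelength)
print(R, f_d)
```

In a geolocation solver, the differences between these computed values and the measured slant range and Doppler center frequency would serve as two of the residuals per SAR image.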

C. RATIONAL FUNCTION MODEL
The rigorous sensor model of optical satellite images is given by the collinearity equations, Eq. (5), in which (x, y) and (X, Y, Z) are the target coordinates in image space and object space, (x_0, y_0, f) are the interior orientation parameters representing the principal point and the focal length of the sensor, (X_S, Y_S, Z_S) is the sensor position vector, and (a_1, a_2, ...) are the elements of the rotation matrix composed of the sensor attitude parameters. The sensor position and attitude parameters are also known as the exterior orientation parameters. Correcting optical satellite imagery with the collinearity equations usually requires adjusting all exterior orientation parameters at the same time, which is hard to solve. Moreover, the on-orbit parameters of optical satellites are not provided to users in most cases. Instead, the RFM, which fits the collinearity equations well, is introduced for the geometric processing of optical remote sensing imagery [10]. Usually, 80 polynomial coefficients are used to represent the correspondence between image space coordinates and object space coordinates, as in Eq. (6), where (x, y) is the normalized image space coordinate and (X, Y, Z) denotes the normalized longitude, latitude and height in object space. Num_S, Den_S, Num_L and Den_L are third-order polynomials built from the 80 rational polynomial coefficients (RPCs) a_i, b_i, c_i and d_i (i = 0, 1, 2, ..., 19). In addition, 10 more parameters are used to normalize the image space and object space coordinates.
However, owing to the measurement errors of the interior and exterior orientation parameters, bias can also be found in the fitted RPCs. Therefore, an affine transformation model is applied for bias compensation of the RPCs, in which M_x and M_y are the residual parameters, r and c are the extracted image space coordinates, and a_0, a_S, a_L and b_0, b_S, b_L are the affine transformation parameters (ATPs).
With GCPs included, the geo-correction model can be written in matrix form, where A is the coefficient matrix relating (M_x, M_y) to a_0, a_S, a_L and b_0, b_S, b_L, Res is the residual vector and x is the unknown vector.
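The affine bias compensation can be sketched as a small least-squares problem. The example below assumes that the bias in each image axis is modeled as a_0 + a_S·s + a_L·l (and b_0 + b_S·s + b_L·l), where (s, l) is the sample and line position projected by the RFM and the observations are the differences between measured and projected positions. This parameterization, the helper names `solve_linear` and `fit_affine_bias`, and the sample numbers are assumptions for illustration and may differ from the exact form used in the paper.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        acc = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - acc) / aug[row][row]
    return x


def fit_affine_bias(projected, measured):
    """Fit (a0, aS, aL) and (b0, bS, bL) from projected and measured (s, l) pairs."""
    rows = [(1.0, s, l) for s, l in projected]
    # Normal matrix A^T A, shared by both image axes.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for axis in (0, 1):  # 0: sample differences, 1: line differences
        diff = [m[axis] - p[axis] for m, p in zip(measured, projected)]
        atd = [sum(r[i] * d for r, d in zip(rows, diff)) for i in range(3)]
        params.append(solve_linear(ata, atd))
    return params[0], params[1]


# Hypothetical control points: RFM-projected positions and their measured positions.
projected = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0), (1000.0, 1000.0), (500.0, 300.0)]
measured = [(s + 2.0 + 0.001 * s - 0.002 * l, l - 3.0 + 0.0005 * s + 0.0015 * l)
            for s, l in projected]
a_params, b_params = fit_affine_bias(projected, measured)
print(a_params, b_params)
```

At least three well-distributed points are needed to determine the six parameters; with more points the fit averages out measurement noise.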

D. RD-RFM STEREO GEOLOCATION MODEL
The 3D geolocation of an object can be solved using stereo image pairs, and the approaches differ in the models applied. Traditionally, the stereogrammetry model for SAR-optical stereo imagery is developed based on the RFM with RPCs. For this purpose, the user must first produce the RPCs of the SAR satellite images. Usually, the RPCs can be calculated with a terrain-dependent or a terrain-independent approach [24]. Because the terrain-dependent method needs many well-distributed GCPs, which costs considerable money and manpower, the terrain-independent method is widely used to produce RPCs for SAR images, as shown in Fig. 3. VCPs distributed on a regular grid are generated at different heights. With the help of the rigorous sensor model, these object space coordinates are reprojected to image space. Based on the resulting point sets in image space, the RPCs are estimated using a least-squares method.
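The first step of the terrain-independent approach, generating gridded VCPs on several elevation layers and deriving normalization parameters from them, can be sketched as follows. The grid extents, the point counts and the choice of offset and scale (mapping each coordinate into [-1, 1]) are assumptions for illustration; the projection through the RD model and the least-squares fit of the RPCs are not shown.

```python
def linspace(lo, hi, n):
    """Return n evenly spaced values from lo to hi (inclusive)."""
    if n == 1:
        return [0.5 * (lo + hi)]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def virtual_control_grid(lon_range, lat_range, h_range, n_lon, n_lat, n_layers):
    """Build (lon, lat, h) virtual control points on n_layers elevation layers."""
    return [
        (lon, lat, h)
        for h in linspace(h_range[0], h_range[1], n_layers)
        for lat in linspace(lat_range[0], lat_range[1], n_lat)
        for lon in linspace(lon_range[0], lon_range[1], n_lon)
    ]


def offset_and_scale(values):
    """Offset and scale that map the values into [-1, 1] (assumed convention)."""
    lo, hi = min(values), max(values)
    return 0.5 * (lo + hi), 0.5 * (hi - lo)


# Hypothetical scene extent in degrees and meters.
grid = virtual_control_grid((112.9, 113.2), (34.3, 34.6), (100.0, 1500.0), 10, 10, 5)
lon_off, lon_scale = offset_and_scale([p[0] for p in grid])
normalized_lon = [(p[0] - lon_off) / lon_scale for p in grid]
print(len(grid), lon_off, lon_scale)
```

The elevation range should cover the full relief of the scene, since the fitted RPCs are only reliable inside the volume spanned by the VCPs.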
After the RPCs are generated, both SAR and optical imagery can be processed with the RFM. Using a traditional space intersection method [25], the normal equation can be formed, and the partial derivatives it requires are easy to derive from the RPCs. The object space coordinates can then be calculated with a traditional least-squares method.
Although the RFM is simple, users must first generate the RPCs of each SAR image. Moreover, the accuracy of the fitted RPCs for SAR imagery is influenced by the terrain. Therefore, a new RD-RFM stereo geolocation model, which is more suitable for reconstructing 3D geo-information from SAR-optical stereo image pairs, is proposed. The normal equation of the RD-RFM stereo geolocation model, Eq. (13), is derived by integrating the RD model and the RFM. As demonstrated previously, the parameters of the RD model and the RFM are expressed in different coordinate systems. Therefore, we first transform the parameters of the RD model from the geodetic Cartesian coordinate system to the geodetic coordinate system. Usually, the transformation is given as:

$$X = (N + H)\cos\varphi\cos\lambda, \qquad Y = (N + H)\cos\varphi\sin\lambda, \qquad Z = \left[N(1 - e^2) + H\right]\sin\varphi$$

where (X, Y, Z) are the coordinates in the geodetic Cartesian coordinate system, (φ, λ) are the latitude and longitude of the object space point in radians, and H is the height in the geodetic coordinate system, with

$$\varphi = B \cdot \frac{\pi}{180}, \qquad \lambda = L \cdot \frac{\pi}{180} \tag{16}$$

e is the first eccentricity of the WGS-84 ellipsoid, and N is the radius of curvature given by

$$N = \frac{a}{\sqrt{1 - e^2\sin^2\varphi}} \tag{17}$$

where a is the semi-major axis of the WGS-84 ellipsoid.

The partial derivatives of the RFM are easy to calculate. The partial derivatives of the RD model with respect to B, L and H are given by Eq. (18), and the partial derivatives from (X, Y, Z) to (φ, λ, H) are given by Eq. (19). Substituting Eq. (18) and Eq. (19) for the coefficients obtained from the RD model in Eq. (13), the partial derivatives of the RD model are transformed from the geodetic Cartesian coordinate system to the geodetic coordinate system, which makes them consistent with the coefficients from the RFM. Therefore, the RD-RFM stereo geolocation model can be written in vector form, and a traditional least-squares method can solve this problem efficiently.

... mode, and the nominal resolution is 1 m and 3 m, respectively. The resolution of GF-2 optical images is 0.81 m. Detailed information about the experimental dataset is listed in Table 1.
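The coordinate transformation used by the RD-RFM model in Section II-D, together with the partial derivatives of (X, Y, Z) with respect to (φ, λ, H), can be sketched as follows. This is a minimal illustration using the standard WGS-84 constants and the textbook closed-form derivatives; the function names are hypothetical, and the expressions are not copied from the paper's equations.

```python
import math

WGS84_A = 6378137.0                   # semi-major axis (m)
WGS84_F = 1.0 / 298.257223563         # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)  # first eccentricity squared


def geodetic_to_cartesian(lat_deg, lon_deg, height):
    """Convert geodetic latitude B, longitude L (degrees) and height H (m) to X, Y, Z."""
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(phi) ** 2)
    x = (n + height) * math.cos(phi) * math.cos(lam)
    y = (n + height) * math.cos(phi) * math.sin(lam)
    z = (n * (1.0 - WGS84_E2) + height) * math.sin(phi)
    return x, y, z


def cartesian_jacobian(lat_deg, lon_deg, height):
    """Return the 3x3 matrix d(X, Y, Z) / d(phi, lambda, H), angles in radians."""
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    w2 = 1.0 - WGS84_E2 * math.sin(phi) ** 2
    n = WGS84_A / math.sqrt(w2)                   # prime-vertical radius
    m = WGS84_A * (1.0 - WGS84_E2) / w2 ** 1.5    # meridian radius
    sp, cp, sl, cl = math.sin(phi), math.cos(phi), math.sin(lam), math.cos(lam)
    return [
        [-(m + height) * sp * cl, -(n + height) * cp * sl, cp * cl],
        [-(m + height) * sp * sl, (n + height) * cp * cl, cp * sl],
        [(m + height) * cp, 0.0, sp],
    ]


# Hypothetical point; the values are illustrative only.
print(geodetic_to_cartesian(34.45, 113.05, 500.0))
print(cartesian_jacobian(34.45, 113.05, 500.0))
```

Multiplying the derivatives of the RD equations with respect to (X, Y, Z) by this matrix gives, through the chain rule, derivatives with respect to (φ, λ, H), which is the form needed to combine them with the RFM coefficients.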

B. GEOLOCATION ACCURACY IMPROVEMENT OF SAR AND OPTICAL IMAGERY
Before the proposed RD-RFM stereo geolocation model is verified, the SAR and optical images are geometrically calibrated. For SAR images from the GF-3 satellite, the geolocation accuracy can be improved with the provided precise orbit information and a set of suitable calibration models. Table 2 gives the root-mean-square error (RMSE) of the absolute geolocation accuracy over all check points of the GF-3 SAR images in the different areas for the different models. The geolocation accuracy of the fitted RFM is always lower than that of the traditional RD model, especially in the mountainous area around Dengfeng City. After calibration, the geolocation accuracy of the SAR images reaches pixel level in both experimental areas. Therefore, points extracted from the calibrated SAR images can be used as VCPs for the geometric correction of the optical images. Fig. 6 illustrates the geometric consistency between the SAR and optical images before and after correction. The consistency is very poor before correction, with offsets that can reach dozens of pixels. After geometric correction, pixel-level consistency is obtained in both experimental areas, as shown in Fig. 6 (b) and (d).

TABLE 2. Geometric performance of GF-3 SAR images before and after calibration with different models (pixels).
After calibration, the RPCs of the SAR images are produced with a terrain-independent method. The fitting accuracy of the RFM to the RD model for the SAR images is shown in Table 3. In both the urban and mountainous areas, the fitting accuracy of the RFM to the RD model reaches an acceptable level. However, the maximum fitting error varies with the terrain. In the mountainous area, the maximum fitting error reaches about 2.5 pixels for SAR images, indicating low fitting accuracy and consistency of the RFM.
For the space intersection of stereo image pairs, an iterative method is applied to obtain the final results. The initial value is important for the convergence of the intersection process. Therefore, the central coordinates of the experimental SAR images are used as the initial value. For the RFM, LONG_OFF, LAT_OFF and HEIGHT_OFF in the RPCs are taken as the central coordinates. For the RD-RFM model, the same coefficients of the optical image covering the same experimental area can also be used as the initial value. With a credible initial value, the space intersection process converges quickly.
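The iterative least-squares scheme behind such a space intersection can be sketched generically. The example below is a plain Gauss-Newton loop with a finite-difference Jacobian, started from a supplied initial value; the residual function here is a toy range-intersection problem standing in for the linearized stereo equations, and every name and number in it is hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        acc = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - acc) / aug[row][row]
    return x


def gauss_newton(residual_fn, x0, step=1e-4, tol=1e-8, max_iter=50):
    """Iteratively minimize the sum of squared residuals starting from x0."""
    x = list(x0)
    n = len(x)
    for _ in range(max_iter):
        r = residual_fn(x)
        # Forward-difference Jacobian, stored column by column.
        cols = []
        for j in range(n):
            xp = list(x)
            xp[j] += step
            cols.append([(a - b) / step for a, b in zip(residual_fn(xp), r)])
        # Normal equations (J^T J) dx = -J^T r.
        jtj = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(n)]
               for i in range(n)]
        jtr = [-sum(a * b for a, b in zip(cols[i], r)) for i in range(n)]
        dx = solve_linear(jtj, jtr)
        x = [xi + di for xi, di in zip(x, dx)]
        if math.sqrt(sum(d * d for d in dx)) < tol:
            break
    return x


# Toy problem: recover a 3D point from its distances to four known positions.
stations = [(1000.0, 0.0, 0.0), (0.0, 1000.0, 0.0),
            (0.0, 0.0, 1000.0), (1000.0, 1000.0, 1000.0)]
truth = (100.0, 200.0, 50.0)
ranges = [math.dist(truth, s) for s in stations]


def residuals(x):
    return [math.dist(x, s) - d for s, d in zip(stations, ranges)]


estimate = gauss_newton(residuals, x0=(0.0, 0.0, 0.0))
print(estimate)
```

For the stereo models discussed here, the residual function would instead stack the two RFM equations of the optical image with the RFM or RD equations of the SAR image, and the initial value would come from the offset coefficients mentioned above.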

C. 3D STEREO GEOLOCATION VERIFICATION IN DENGFENG CITY
In the mountainous area around Dengfeng City, 3 check points are used to verify the geolocation accuracy of a SAR-optical stereo image pair with a resolution better than 1 m. The stereo geolocation accuracy obtained with the different models is listed in Table 4. A stereo image pair with a large convergence angle of 40.08° was tested in this experiment.
The 3D geolocation error of each check point and the RMSE over all check points are calculated. The first three rows in Table 4 give the stereo geolocation accuracy of the traditional RFM in the mountainous area with different terrain. As demonstrated previously, the RFM has a relatively poor fitting accuracy to the rigorous RD model for SAR images in mountainous areas. Therefore, the 3D geolocation accuracy obtained from the traditional RFM is poorer than that of our proposed RD-RFM model, and the RMSE shows the difference in 3D geolocation accuracy between the models.
Moreover, with the traditional RFM the 3D geolocation accuracy decreases greatly as the terrain height increases, which also reflects the fitting accuracy of the RPCs to the RD model of the SAR images. In comparison, with our proposed RD-RFM stereo geolocation model the geolocation accuracy remains stable as the terrain height changes. Therefore, the proposed RD-RFM model performs better than the traditional RFM for the stereo geolocation of SAR-optical image pairs covering mountainous areas, and the degradation of the RFM caused by its low fitting accuracy cannot be ignored in mountainous areas.
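The per-point errors and the RMSE used in this evaluation can be computed as in the following minimal sketch. It assumes that each check point error is expressed as (east, north, height) differences in meters and that the planimetric error is the horizontal distance; the sample values are hypothetical and are not the values of Table 4.

```python
import math


def rmse(values):
    """Root-mean-square of a sequence of errors."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def summarize_errors(errors):
    """Return planimetric and height RMSE for (d_east, d_north, d_height) tuples."""
    planimetric = [math.hypot(de, dn) for de, dn, _ in errors]
    height = [dh for _, _, dh in errors]
    return {"planimetry": rmse(planimetric), "height": rmse(height)}


# Hypothetical check point errors in meters.
check_point_errors = [(3.0, 4.0, 1.0), (3.0, -4.0, -1.0)]
print(summarize_errors(check_point_errors))
```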

D. 3D STEREO GEOLOCATION VERIFICATION IN XINYI CITY
In the urban area around Xinyi City, 2 optical images and 1 SAR image are used to verify the stereo geolocation accuracy under different convergence angles. The 2 optical images were acquired with different side-viewing angles and directions, so the influence of the convergence angle on the SAR-optical stereo geolocation accuracy is also analyzed. Table 5 lists the 3D stereo geolocation accuracy obtained from the SAR-optical stereo image pair with a convergence angle of about 45°. Five check points are used in this experiment.
In Table 5, the results of the traditional RFM-based 3D stereo geolocation model are relatively stable as the terrain height changes in this urban area. In this flat area, the geometric performance of the traditional RFM would be expected to be better than in the mountainous area. However, the degradation of the registration accuracy between the SAR and optical images caused by the low resolution of the SAR image also influences the 3D geolocation accuracy. Therefore, the geolocation accuracy based on the RFM in the urban area is sometimes lower than that in the mountainous area, owing to errors introduced during the reprojection and the production of the RPCs. In comparison, the 3D geolocation accuracy obtained with the RD-RFM model is more acceptable and reaches sub-pixel level for the SAR image.
In addition, Table 6 lists the 3D geolocation accuracy of a SAR-optical stereo image pair with a narrow baseline. The convergence angle between the optical image and the SAR image is about 10°. With the decrease of the convergence angle, the geometric performance of both stereo geolocation models degrades. With the traditional RFM, the geolocation error in planimetry varies from about 2.21 m to 5.51 m, and the height error varies from -9.23 m to 7.66 m. In comparison, the performance of our proposed RD-RFM stereo geolocation model is more acceptable. The largest difference in geolocation accuracy is about 2.5 m in planimetry and 7.07 m in height, which is much better than the results obtained from the traditional RFM. Fig. 7 illustrates the 3D geolocation error distribution for the datasets covering different areas and processed with different models. Fig. 7 ... the mountainous area and urban area. Moreover, the RD-RFM model can also obtain a relatively high 3D geolocation accuracy for a stereo image pair with a small convergence angle. Therefore, our proposed model provides a more intuitive and efficient way to produce 3D geo-information from SAR-optical image pairs.

IV. CONCLUSION
In this study, we propose a new RD-RFM stereo geolocation model for the reconstruction of accurate 3D geo-information from SAR-optical stereo image pairs. Based on geometrically calibrated SAR images, VCPs are extracted and applied for the geometric correction of the optical images. Considering the difference between the RD model and the RFM, the parameters of the RD model are reprojected to a geodetic coordinate system. The reprojected RD model of the SAR images is then applied together with the RFM of the optical images, which forms the RD-RFM model for 3D geo-information reconstruction. For comparison, a traditional RFM for SAR-optical stereogrammetry is applied based on the produced RPCs of the SAR images. Three stereo image pairs covering an urban area and a mountainous area in Henan Province, China, are used in our experiments, and the results indicate that the proposed RD-RFM model achieves the best performance across different terrain. In the mountainous area, the proposed model reaches within 2 m in planimetry and about 2 m in height, while the results in the urban area reach about 1.6 m in planimetry and 1 m in height. Moreover, the RD-RFM model also performs better than the traditional RFM-based stereo geolocation model on stereo SAR-optical images with a small convergence angle. Because it does not require the transformation from the RD model to the RFM, the proposed RD-RFM stereo geolocation model is simpler and provides a more intuitive and efficient way for users to derive 3D geo-information from SAR-optical stereo image pairs.