An Automatic and Accurate Method for Marking Ground Control Points in Unmanned Aerial Vehicle Photogrammetry

Owing to the rapid development of unmanned aerial vehicle (UAV) technology and various photogrammetric software, UAV photogrammetry projects are becoming increasingly automated. However, marking ground control points (GCPs) in current UAV surveys still generally needs to be manually completed, which brings the problem of inefficiency and human error. Based on the characteristics of UAV photogrammetry, a novel type of circular coded target with its identification and decoding algorithm is proposed to realize an automatic and accurate approach for marking GCPs. UAV survey experiments validate the feasibility of the proposed method, which has comparative advantages in efficiency, robustness, and accuracy over traditional targets. Additionally, we conducted experiments to discuss the effects of projection size and viewing angle, number of coded bits, and environmental conditions on the proposed method. The results show that it can achieve robust identification and accurate positioning even under challenging conditions, and a smaller number of coded bits is recommended for better robustness.


I. INTRODUCTION
IN THE past decades, because of the rapid development of unmanned aerial vehicle (UAV) technology, UAV photogrammetry has been widely used in several industries including surveying and mapping [1], [2], geology [3], [4], [5], geographic information system (GIS) [6], [7], and urban and rural planning [8], [9], among others. Some scholars have also used it in archeology [10], [11], ancient building conservation [12], deformation monitoring [13], [14], and large-scale engineering safety [15]. UAV photogrammetry has the characteristics of high efficiency, low cost, automation [16], [17], and distinctive spatial and temporal resolution compared with other remote sensing technologies [18], [19], [20]. It adapts well to complex terrain and offers the flexibility to choose different UAV types for different needs [21], [22]. Without ground control points (GCPs), a UAV photogrammetry project is geo-referenced exclusively using the UAV's global navigation satellite system (GNSS) receiver. In the case of GNSS single-point positioning, any point in the project will typically lie within approximately 1-10 m of its real-world location. This accuracy can be improved to the centimeter level if the UAV is equipped with GNSS real-time kinematics (RTK) and the RTK works well [23], [24]. This degree of accuracy may be more than sufficient for 3-D modeling projects or environmental monitoring that only requires relative accuracy. However, for projects that require more accurately geo-referenced outputs, including land surveying, earthwork, and topographic surveys, among others, a certain number of GCPs need to be deployed in the survey area [25], [26], [27]. The absolute or relative coordinates of these GCPs are commonly measured with GNSS RTK [28], [29] and total stations [30], [31], among others. Presently, the frequently used targets in UAV surveys include X-shaped, L-shaped, and O-shaped targets [32], [33], and the corner points of ground objects with obvious features are also widely used as GCPs [34]. Notably, the accuracy of UAV survey projects depends on many factors, including typical parameters of the camera, RTK and other measuring equipment, route planning, environmental conditions, spatial distribution and absolute accuracy of GCPs, and accuracy of pinpointing the exact target center, among others [35], [36], [37], [38].
Most studies on GCPs have focused on their applications or the impact of their deployment and distribution on aerial survey results. Al-Halbouni et al. [39] proposed that GCPs placed at the survey area's outskirts could provide more accurate topographic data. Agüera-Vega et al. [40] investigated the effect of the number of GCPs on the accuracy of digital surface models. Florinsky et al. [41] studied the relationship of GCPs and the model accuracy differences between altitude and horizontal direction. Martínez-Carricondo et al. [23] evaluated the spatial accuracy of UAV orthophoto and topography when processed with direct geo-referencing or GCPs. However, few studies have been performed on the imaging conditions of GCPs, the efficiency and accuracy of marking GCPs, and their effects on other processes involving GCPs.
GCPs need to be placed in suitable locations based on the image acquisition geometry to cope with terrain constraints [25], [26]. Following the acquisition of images from UAV surveys, the image coordinates of GCPs in the corresponding photos must be obtained so that the GCPs can be used in subsequent data processing and analyses, such as aerial triangulation and dense matching [42], [43]. As photogrammetric software becomes more capable and more automated, manually marking GCPs has become an unfavorable factor affecting data processing efficiency and automation [44], [45]. Moreover, manual marking introduces human error, which is not conducive to the subsequent data processing. Thus, it is important to realize an automatic and accurate method for marking GCPs that is friendly to different user groups and adapts to the application scenes of UAV photogrammetry.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Coded targets have been frequently applied in the field of close-range photogrammetry. Points of interest on the surface of the close-range object can be identified and positioned with high accuracy using coded targets and metric cameras [46], [47]. Coded targets can be roughly classified into dot-dispersing, centripetal, and color-coded targets, among others [48]. They usually have unique appearance features and can be automatically identified by computer vision methods, and the coding information they carry can be decoded as their unique identity numbers [49], [50]. These characteristics allow them to be used as GCPs in UAV photogrammetry. Taking Schneider's coded target (hereafter referred to as SNDRT) as an example (see Fig. 1) [51], most of the related studies focused on its design and applications in close-range photogrammetry. A few scholars also used SNDRT in UAV surveys. These studies indicate the potential of this approach but also expose the problems of migrating SNDRT directly to UAV applications [28], [30]. Compared with close-range photogrammetry, the environmental and lighting conditions of UAV photogrammetry are more complex, and the projections of targets are smaller [47], [52]. Accordingly, the size of traditional coded targets has to be increased if they are directly applied to UAV surveys; otherwise, there will be unfavorable constraints on the flight height and the viewing angle.
Herein, a novel centripetal circular coded target (PX circular coded target, hereafter referred to as PXCCT) was designed [see Fig. 1(a)], and its identification and decoding algorithm was developed based on the characteristics of UAV applications. Together, PXCCT and the algorithm realize a robust, reliable, and automated marking method for UAV photogrammetry projects. UAV survey experiments were conducted to verify the performance of PXCCT with SNDRT and cross target as control groups. Furthermore, a series of experiments were conducted to discuss the differences in robustness between PXCCT and SNDRT in UAV applications, explore the effect of the coded bits on their recognition and decoding, and verify the adaptability of PXCCT to challenging conditions.

A. Coded Target Design
The design of coded targets should be aimed at application purpose and scenario, taking into account other factors, including coding capacity, structure, size, reliability, and uniqueness of identification [48], [49], [50], [53]. There are several types of coded targets, among which the SNDRT is a typical representative of circular coded targets. SNDRT is primarily applied in close-range photogrammetry with the advantages of a stable structure, simplicity, compactness, and reliability [51]. UAV photogrammetry projects have a larger field of view than close-range photogrammetry and typically have to shoot photos while moving. Furthermore, the environmental and lighting conditions of UAV surveys are more complex. Most cameras onboard UAVs are nonmetric or consumer-grade digital cameras, which exhibit much greater magnitudes of instability and distortion [54]. Therefore, the improved coded target PXCCT is proposed to address these challenges and achieve automatic and accurate GCP marking in UAV surveys.
From outside to inside, PXCCT consists of a white background, a large black circle, a black and white band (coded band), and a central crosshair. These white and black parts can be reversed if necessary. Compared to the SNDRT, PXCCT eliminates the small circle in the center and adds a large circle on the outside. The orange part in Fig. 1(a) shows the segmentation of the coded band (12-b), and the blue part indicates the proportional relationship of the bands. When the targets are imaged in UAV surveys, the black area tends to shrink, and the white area tends to expand. Consequently, appropriate adjustments to the width of the bands based on the color can help optimize the effect of identification and decoding. Herein, the proportional relationship of a:b:c:d=7:3:7:3 is proposed to construct PXCCT [see Fig. 1(a)].
The coded bits of PXCCT are determined by the number of the coded strips. The coded band in Fig. 1(a) is divided into 12 strips, which means the coded bits of this PXCCT are 12 b. White strips represent 1 and black strips represent 0. By arranging the strips clockwise from any starting point, a 12-b binary sequence is obtained. The minimum value of this sequence is generated through circular shift and converted to decimal as the decoded value of this PXCCT.
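The decoding rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the function name `decode_bits` and the example sequence are assumptions introduced here.

```python
def decode_bits(bits):
    """Return the smallest decimal value over all circular shifts of a bit list.

    bits: list of 0/1 values read clockwise around the coded band
          (white strip = 1, black strip = 0), from any starting strip.
    """
    t = len(bits)
    best = None
    for shift in range(t):
        # Rotate the sequence so that reading starts at a different strip.
        rotated = bits[shift:] + bits[:shift]
        value = int("".join(str(b) for b in rotated), 2)
        if best is None or value < best:
            best = value
    return best


# Hypothetical 12-b band read from two different starting strips.
reading_a = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1]
reading_b = reading_a[5:] + reading_a[:5]
same_code = decode_bits(reading_a) == decode_bits(reading_b)  # True
```

Because the minimum over all rotations is taken, the decoded value does not depend on where the reading starts, which is why the starting point can be arbitrary.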

B. Image Preprocessing and Rough Screening
The original images from UAV fieldwork are typically heavily influenced by noise due to changing environmental and lighting conditions. Therefore, the images need to be preprocessed before obtaining contours that meet the features of coded targets. First, a three-channel color image was converted to a single-channel gray image by the weighted average formula [55]

$$G_s = 0.299R + 0.587G + 0.114B \tag{1}$$

where $G_s$ is the grayscale value of a gray image pixel, and $R$, $G$, and $B$ are the intensity values of the RGB image pixel. If the image adopts another color system, the appropriate weighted average formula should be selected on a case-by-case basis. Subsequently, Gaussian filtering [56] and erosion [57] were applied to the gray image to suppress its noise and white area expansion, as well as to ensure that the image's edges are as distinct as possible (see Fig. 2). Edge detection was performed on the preprocessed image with the Canny operator [58] to obtain a binary image with strong edge information. The Suzuki operator [59] was then used to extract contours from the binary image, which can yield up to hundreds of thousands of contours. Many of these contours contradict the elliptical features of the targets: their perimeter is too long or too short, or their roundness is too large or too small. Therefore, rough screening in this article was performed based on the perimeter and roundness of the contours

$$r = \frac{4\pi S}{l^2} \tag{2}$$

where $r$ is the roundness of the contour, $S$ is the area of the contour, and $l$ is the perimeter of the contour. When the contour is a perfect circle, the theoretical value of its roundness is 1.
Other objects' contours in the background area can be effectively filtered out by adjusting the contour perimeter and roundness thresholds based on the practical situation.
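The rough screening step can be illustrated with a small pure-Python sketch. It assumes that roundness is computed as 4πS/l², which equals 1 for a perfect circle as stated above, and that each contour is given as a closed polygon of (x, y) points. The function names and the threshold values in the example are hypothetical and would need tuning for real imagery.

```python
import math


def contour_area(points):
    """Polygon area by the shoelace formula (points form a closed contour)."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0


def contour_perimeter(points):
    """Sum of the segment lengths around the closed contour."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def roundness(points):
    """Roundness r = 4*pi*S / l^2 (assumed form; 1 for a perfect circle)."""
    l = contour_perimeter(points)
    if l == 0:
        return 0.0
    return 4.0 * math.pi * contour_area(points) / (l * l)


def rough_screen(contours, min_perimeter, max_perimeter,
                 min_roundness, max_roundness):
    """Keep contours whose perimeter and roundness fall inside the thresholds."""
    kept = []
    for c in contours:
        l = contour_perimeter(c)
        r = roundness(c)
        if min_perimeter <= l <= max_perimeter and min_roundness <= r <= max_roundness:
            kept.append(c)
    return kept


# Hypothetical contours: a near-circle of radius 20 and a thin 100 x 2 strip.
circle = [(20 * math.cos(2 * math.pi * k / 360),
           20 * math.sin(2 * math.pi * k / 360)) for k in range(360)]
strip = [(0, 0), (100, 0), (100, 2), (0, 2)]
survivors = rough_screen([circle, strip], 50, 300, 0.8, 1.2)  # keeps only the circle
```

The strip has an acceptable perimeter but a roundness far below 1, so it is rejected, while the near-circle passes both tests.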

C. Ellipse Fitting and Fine Screening
The remaining contours after rough screening contain contours of PXCCTs and some approximately elliptical contours of similar size. Herein, the grayscale statistical analysis (hereafter referred to as GSA) method was used for further fine screening and for obtaining the image coordinates of PXCCT. First, least squares ellipse fitting was performed on the contour [60]. The image coordinates of the points on the target contour are noted as $(x_i, y_i)$ and the total number of points is indicated as $N$; subsequently, the general equation of an ellipse is expressed as follows:

$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0. \tag{3}$$

Then, the objective function is expressed as follows:

$$f(A, B, C, D, E, F) = \sum_{i=1}^{N} \left(Ax_i^2 + Bx_iy_i + Cy_i^2 + Dx_i + Ey_i + F\right)^2. \tag{4}$$

The objective function was minimized using $4AC - B^2 = 1$ as an additional constraint to obtain five ellipse parameters.
The rectangular area in which the ellipse of the target contour is located can be estimated from the ellipse parameters. To apply a uniform standard for fine screening and subsequent decoding, the ellipse was mapped to a standard circle by performing a projection transform on this rectangular area [61]

$$\begin{bmatrix} x'_i \\ y'_i \\ 1 \end{bmatrix} = H \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}$$

where $(x_i, y_i)$ are the points in the rectangular area before transformation, $(x'_i, y'_i)$ are the points after transformation, both written in homogeneous coordinates, and $H$ is the projection transform matrix (see Fig. 2). After the projection transform, the rectangular area was converted into a square area inscribed with a standard circle. The size of the square area was scaled to 200 × 200 pixels.
Using the GSA method, the contours were finely screened. The target area was processed with the Otsu method [62] to obtain the binary image after unifying its size. Subsequently, we started from the center of the area and took a circular sampling outward pixel by pixel. The binary value of every single pixel-wide circle was averaged and arranged from inside to outside along the radius direction. This way, the target area's radial grayscale distribution was obtained. If the target area contains the projection of the coded target, its radial grayscale distribution should have obvious regularity based on the design standard of PXCCT. There should be two regions (δ 1 , δ 3 ) with theoretical average grayscale values of 0, a region (δ 2 ) with average grayscale values between 0 and 1, and a region (δ 4 ) with theoretical average grayscale values of 1 (see Fig. 3). Objects that do not have similar grayscale distribution features can be filtered out by setting a threshold for the average grayscale values of each region. The robustness of PXCCT was effectively improved using the GSA method to deal with different scenarios of UAV photogrammetry.
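A minimal sketch of the radial grayscale distribution follows. It assumes the target area has already been binarized into a square list of lists (0 = black, 1 = white). The synthetic image, the region boundaries, and the thresholds are hypothetical stand-ins rather than the PXCCT proportions, and a small 40 × 40 area is used instead of 200 × 200 to keep the example short.

```python
import math


def radial_profile(binary, n_samples=360):
    """Mean binary value on concentric circles, from the center outward."""
    size = len(binary)
    c = (size - 1) / 2.0
    profile = []
    for r in range(size // 2):
        total = 0
        for k in range(n_samples):
            theta = 2.0 * math.pi * k / n_samples
            # Nearest-pixel sampling, clamped to the image bounds.
            col = min(max(int(round(c + r * math.cos(theta))), 0), size - 1)
            row = min(max(int(round(c + r * math.sin(theta))), 0), size - 1)
            total += binary[row][col]
        profile.append(total / n_samples)
    return profile


def matches_pattern(profile, regions):
    """Check that each radial region's mean lies inside its allowed range.

    regions: list of (start_radius, stop_radius_exclusive, low, high).
    """
    for start, stop, low, high in regions:
        ring = profile[start:stop]
        if not ring:
            return False
        mean = sum(ring) / len(ring)
        if not (low <= mean <= high):
            return False
    return True


def synthetic_target(size=40):
    """Hypothetical test image: black core, half-white band, black ring, white background."""
    c = (size - 1) / 2.0
    img = []
    for row in range(size):
        line = []
        for col in range(size):
            d = math.hypot(col - c, row - c)
            if d < 5 or 10 <= d < 15:
                line.append(0)                     # black core and black ring
            elif d < 10:
                line.append(1 if row > c else 0)   # band: half white, half black
            else:
                line.append(1)                     # white background
        img.append(line)
    return img


# Hypothetical regions: two near-black zones, one mixed zone, one near-white zone.
REGIONS = [(1, 4, 0.0, 0.1), (6, 9, 0.1, 0.9), (11, 14, 0.0, 0.1), (16, 19, 0.9, 1.0)]
profile = radial_profile(synthetic_target())
is_target = matches_pattern(profile, REGIONS)  # True for the synthetic image
```

An image without this black, mixed, black, white sequence along the radius, such as a uniformly white patch, fails the first region test and is filtered out.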

D. Center Positioning and Decoding
Almost all contours of objects other than the coded targets in the background were filtered out after rough and fine screening. The Devernay algorithm [63] was used to extract the subpixel edges of ellipse contours. Writing the general equation of the ellipse (3) as $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$, we can obtain

$$x_c = \frac{BE - 2CD}{4AC - B^2}, \qquad y_c = \frac{BD - 2AE}{4AC - B^2}$$

where $x_c$ and $y_c$ are the coordinates of the ellipse centers, which are the image coordinates of the exact centers of coded targets. For the projections that are considered to match the features of PXCCT, the region $\delta_2$ of the radial grayscale distribution shows the location of the coded band. The number of coded bits is defined as $t$, and the edge of a strip in the coded band is considered to be the start position. A $t$-bit binary sequence was obtained after one cycle of sampling on a scanning trajectory, in which a sample was taken every $360°/t$ along the circumferential direction. Another round of sampling was performed with a new start point moved 1° along the ring from the end point of the previous scanning. By repeating the sampling this way, $360/t$ scanning trajectories were generated and $360/t$ binary sequences were obtained (see Fig. 4).
The decimal number of a binary sequence was computed t times through circular shift, moving the first value of the sequence to the end each time. The smallest decimal number was selected as the decoded value of this binary sequence. Overall 360/t decoded values were obtained by performing this process on all binary sequences. If the decoded value with the highest frequency appears x times in total and the threshold is T , the decoded value was considered to be the code of this PXCCT when x · 360/t > T .
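The voting step can be sketched as follows. The acceptance test used here, the share of scanning trajectories that agree with the most frequent decoded value compared against a threshold, is an assumption made for illustration and may differ from the exact criterion in the text. The names `min_rotation_value` and `vote_code` and the example data are hypothetical.

```python
from collections import Counter


def min_rotation_value(bits):
    """Smallest decimal value over all circular shifts of a bit list."""
    values = []
    for shift in range(len(bits)):
        rotated = bits[shift:] + bits[:shift]
        values.append(int("".join(str(b) for b in rotated), 2))
    return min(values)


def vote_code(sequences, threshold):
    """Return the most frequent decoded value, or None if support is too weak.

    sequences: one bit list per scanning trajectory (360/t of them).
    threshold: minimum share of trajectories that must agree (assumed rule).
    """
    values = [min_rotation_value(s) for s in sequences]
    value, count = Counter(values).most_common(1)[0]
    share = count / len(values)
    return value if share > threshold else None


# Hypothetical 12-b target: 30 trajectories, 25 consistent and 5 corrupted.
base = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1]
sequences = [base[k % 12:] + base[:k % 12] for k in range(25)] + [[1] * 12] * 5
accepted = vote_code(sequences, 0.6)   # decoded value of `base`
rejected = vote_code(sequences, 0.9)   # None, since only 25 of 30 agree
```

Scanning many trajectories and voting makes the decoded value tolerant of a few corrupted readings near strip edges.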

A. Materials and Methods
To verify the feasibility of PXCCT in UAV photogrammetry, UAV survey experiments were conducted with SNDRT and cross targets as the control groups to compare the robustness, efficiency, accuracy, and other performances between PXCCT and traditional coded or noncoded targets. A construction site located in Wuhan with an area of approximately 18 725 m² served as the survey area. The average slope of this area was approximately 12.09°, and the elevation range was approximately 11.13 m. A few shrubs grew to the south, a small grassland lay to the east, and sycamore trees stood to the north. The exposed surface of the other areas had no vegetation cover (see Fig. 5).
Triplets of a PXCCT, SNDRT, and cross target with the size of 1.2 × 0.4 m were used to set three types of targets under the same conditions (see Fig. 6). The size of each subtarget is 0.4 × 0.4 m. The triplets are made of matte foam board, which has the advantages of low price, easy portability, and stable structure. Overall, 14 targets were distributed in the survey area following the principle of uniformity and stability (see Fig. 5). The subtargets were fully flat to avoid being affected by other factors, including wind and occlusion. Tianyu CTS-632R10 total station with a prism was used to accurately measure the relative coordinates of each subtarget [see Fig. 7(b)]. The detailed parameters of the total station are presented in Table I.
Phantom 4 RTK, released by DJI in June 2018, was applied to acquire the orthophotos [see Fig. 7(a)]. The Phantom 4 RTK is a small quad-rotor high-precision aerial survey UAV with a centimeter-level navigation and positioning system and a high-performance imaging system. The detailed parameters are presented in Table II.
For this investigation, route planning was performed based on the longitude and latitude range of the survey area with 70%-80% overlap. Four sets of routes with different AGL (above ground level) flying heights (30, 35, 40, and 45 m) were generated. The route with a flying height of 30 m is visualized in Fig. 8. The number of photos acquired from the four routes was 215, 158, 146, and 106, respectively.

B. Photogrammetric Processing
The proposed algorithm was used to identify and pinpoint PXCCTs in the nadir photography, whereas Agisoft Metashape Professional was used to identify and pinpoint SNDRTs and cross targets. Agisoft Metashape is a commercial photogrammetric software package that performs photogrammetric processing on digital images to generate 3-D spatial data, and it is widely used in surveying and mapping, GIS, cultural heritage protection, and other fields [64]. Agisoft Metashape provides an automatic marking function for SNDRT. However, this function is intended for close-range photographic surveys. Due to the characteristics of UAV surveys, the target size of SNDRT has to be increased to maintain the projection size, which increases the cost and instability of its layout. There are many false-negative (missed targets) and false-positive (misidentified objects) results in practical applications, which means that the advantages of high efficiency and precision of the coded target dissipate. Metashape's automatic marking function for cross targets requires an aerial triangulation (align photos) first. Because the cross target is a noncoded target, if the UAV is not equipped with an airborne GNSS or the tolerance is not properly set, there will be many false-negative or false-positive results that can easily lead to marking confusion (see Fig. 9). Therefore, performing aerial triangulation with automatically marked cross targets to compare accuracy with the other two target types is difficult. Its false-negative and false-positive results were thus counted only for reference. Additionally, cross targets were manually marked and encoded to compare the efficiency and accuracy with the other two types of targets.
TABLE I TOTAL STATION PARAMETERS
The accuracy of UAV survey projects is affected by many factors, including the UAV camera's performance, spatial distribution of GCPs, environmental conditions, object coordinate accuracy of GCPs, and accuracy of marking, among others. These factors interact with each other and determine the accuracy of aerial triangulation. The images of the routes with various flying heights were combined with the marking results of three types of targets. By selecting five of the 14 evenly distributed targets as GCPs and the other 9 as checkpoints, overall, 12 times of data processing were performed by Agisoft Metashape. After the aerial triangulation, the "Optimize" tool was used to reduce the effect of camera distortion. The impact of marking results on the aerial triangulation was evaluated by comparing the object coordinate errors and reprojection errors of checkpoints.
Three indicators were defined to assess the performance of PXCCT: correct rate C, recall rate R, and decoding rate D. Assume that there are N targets to be identified in a photo and that n1 targets are detected by the recognition algorithm. If verification shows that n2 of them are indeed targets to be identified, the number of false-positive results is n3 = n1 − n2. If n4 of the n2 correctly identified targets are decoded correctly, the number of incorrect decoding results is n5 = n2 − n4. Consequently, the correct rate is C = n2/n1, the recall rate is R = n2/N, and the decoding rate is D = n4/N. The correct rate reflects the ability of the identification algorithm to correctly identify the corresponding targets. A low correct rate means that many objects other than the targets are erroneously recognized as targets, which will greatly affect the reliability of automatic marking results. The recall rate reflects the identification algorithm's ability to detect the corresponding targets. A low recall rate means that some targets are unidentified. The decoding rate reflects the ability of the decoding algorithm to appropriately parse the corresponding targets. A low decoding rate means that some targets' decoding results do not match their design codes. In practical applications, all three indicators should be close to 100% to ensure the reliability and practicability of the automatic marking results.
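The three indicators follow directly from the counts defined above. The sketch below is illustrative only; the function name `marking_rates` and the example counts are hypothetical and are not taken from the experiments.

```python
def marking_rates(n_targets, n_detected, n_true, n_decoded):
    """Compute correct rate C, recall rate R, and decoding rate D.

    n_targets:  N, targets that should be identified in the photo
    n_detected: n1, detections returned by the recognition algorithm
    n_true:     n2, detections verified as real targets
    n_decoded:  n4, real targets whose decoded value matches the design code
    """
    if n_targets <= 0:
        raise ValueError("N must be positive")
    if not (0 <= n_decoded <= n_true <= n_detected) or n_true > n_targets:
        raise ValueError("counts must satisfy n4 <= n2 <= n1 and n2 <= N")
    correct = n_true / n_detected if n_detected else 0.0   # C = n2 / n1
    recall = n_true / n_targets                            # R = n2 / N
    decoding = n_decoded / n_targets                       # D = n4 / N
    return {"C": correct, "R": recall, "D": decoding}


# Hypothetical photo: 14 targets, 15 detections, 13 real, 12 decoded correctly.
rates = marking_rates(14, 15, 13, 12)
false_positives = 15 - 13   # n3 = n1 - n2
wrong_decodings = 13 - 12   # n5 = n2 - n4
```

Note that D is normalized by N rather than by n2, so a missed target lowers both the recall rate and the decoding rate.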

C. Result
The automatic marking performance of three target types in four experiments is presented in Table III. In all experiments, PXCCT maintained consistently high correct, recall, and decoding rates. The marking result of SNDRT in the experiment with an AGL flying height of 30 m was barely sufficient for the calculation of aerial triangulation. The recall rate of SNDRT dropped to 14.23% at a flying height of 35 m, and no SNDRT was identified when the flying height was 40 or 45 m. The imaging of the No. 19 target, which was laid out on an uphill road with a large viewing angle in the nadir photography, with a flying height of 45 m, is depicted in Fig. 10(a). Because of the large viewing angle, the SNDRT subtarget of the No. 19 target could not be automatically identified in any of the four experiments. In contrast, the PXCCT subtargets were all correctly identified and decoded. The ability to perform stable and reliable automatic marking under demanding conditions, such as complex terrain and variable lighting, is one of the necessary conditions for applying coded targets to UAV photogrammetry.
When the AGL flying height was 30 m, the number of cross targets automatically identified by Agisoft Metashape exceeded the design number of 14. Many of the detections lay outside the survey area, which reflects the many false-negative and false-positive results. For instance, the leaves in Fig. 9 were misidentified as cross targets because they are shaped like an X in the images. Therefore, cross targets were marked manually for subsequent aerial triangulation. However, in some images it is difficult to distinguish the center of the cross target and to control the marking error by empirical judgment because of the interference of camera distortion and expansion of the white area [see Fig. 10(b)].
Conversely, the proposed marking method for PXCCT pinpoints the center coordinates based on contour extraction and least squares ellipse fitting, which are less affected by noise, distortion, and empirical judgment and are beneficial to guarantee the accuracy of marking GCPs.
In the experiments with flying heights of 35 m and above, the recall rate of SNDRT was too low to analyze the object coordinate errors and reprojection errors. Therefore, the experiment with a flying height of 30 m was selected to compare and analyze the efficiency and accuracy of three target types. As presented in Table IV, no significant difference exists in the aerial triangulation accuracy of PXCCT and SNDRT, both of which can meet the accuracy requirements of most UAV applications. The accuracy of PXCCT is slightly higher than that of the cross target in terms of object coordinate error and 40.34% higher in terms of reprojection error. The proposed algorithm took 31 s to mark the PXCCTs, comparable to the 27 s that Agisoft Metashape took to mark the SNDRTs. Manually marking the cross targets was extremely time-consuming and took 1620 s. The aerial triangulation accuracy could not be compared for the experiments with flying heights of 35, 40, and 45 m because of the insufficient identification results of SNDRT. Considering the robustness, reliability, efficiency, and accuracy of the marking results, PXCCT is regarded as the most suitable for this investigation among three target types. Thus, the proposed method can significantly reduce the time of marking GCPs and improve data processing automation.

IV. DISCUSSION
This article proposes a new coded target PXCCT and its identification and decoding algorithm for marking GCPs. UAV survey experiments were conducted with SNDRT and cross target as control groups. The results show that PXCCT has the advantages of automatic marking and takes less time than cross target, which relies on manual marking and correction. For scenarios in which many GCPs are planned in UAV photogrammetry, such as when the practitioner's drones carry only a standard navigation GNSS receiver, automatically marking GCPs saves time and labor in office work. Compared to noncoded targets or feature objects commonly used as GCPs, the efficiency and automation of data processing can be improved. The accuracy of marking is comparable to that of SNDRT and higher than that of cross targets. This demonstrates that PXCCT inherits the property of coded targets for high-precision marking in close-range photogrammetry when applied to UAV photogrammetry as GCPs. In the four experiments with different AGL flying heights, PXCCT maintained very high correct, recall, and decoding rates. The recall rates of SNDRT in the experiments with flying heights of 35 m and above were insufficient to provide enough GCPs for subsequent data processing. The SNDRTs with large viewing angles could not be automatically identified, implying that the robustness of PXCCT is significantly improved compared with the traditional coded targets when applied to UAV surveys.
TABLE IV ACCURACY AND EFFICIENCY COMPARISON OF THREE TARGET TYPES
TABLE V RELATIONSHIP BETWEEN CODED BITS AND CODING CAPACITY
The AGL flying heights and the viewing angles had limited variation in the above experiment. Due to better robustness, the metrics of PXCCT did not decrease significantly with increasing flying height. Few studies have investigated the effect of projection size, viewing angle [25], and number of coded bits on the identification of GCPs. Therefore, three sets of experiments were performed to discuss further the performance of PXCCT under demanding conditions and different scenarios and the influence of coded bits on detection.

A. Robustness
Experiments were conducted with SNDRT as the control group to explore the robustness of PXCCT to projection size and viewing angle. Agisoft Metashape was used to identify SNDRT, and the proposed method was used to identify PXCCT. The study site selected for this investigation was the Friendship Square of Wuhan University. This square has a flat surface for setting up targets and a wide field of view. The sunny weather on the experiment day was conducive to UAV camera imaging. In total, 32 targets of each type were prepared to ensure that PXCCT and SNDRT were observed under the same projection size and viewing angle. The size of each target was 0.3 × 0.3 m and the material was nonwoven fabric. The layout of the targets is shown in Fig. 11, with 16 PXCCTs each for regions A and C, and 16 SNDRTs each for regions B and D. The targets were fully flattened and fixed to avoid distractions, including occlusion and deformation.
The number of coded bits determines the coding capacity of PXCCT based on the design principle (see Table V). The association between the number of coded bits and the coding capacity of SNDRT provided by Agisoft Metashape is also summarized in Table V. While large numbers of coded targets are used in close-range photogrammetry, far fewer GCPs are used in UAV surveys, so a smaller number of coded bits can be selected instead [46], [65]. Agisoft Metashape has the automatic marking function for 12-b, 14-b, 16-b, and 20-b SNDRT. In this investigation, the minimum number of coded bits provided by Metashape, i.e., 12 b, was selected to make two types of targets for the comparative experiment. In this case, PXCCT has a coding capacity of 350, which exceeds the number of GCPs needed for almost all UAV surveys.
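The capacity of 350 quoted for 12 b is consistent with counting the bit sequences that remain distinct under circular shift while excluding the all-black and all-white bands. That exclusion is an assumption inferred from the number, not something stated in the text, and the function name `coding_capacity` is hypothetical. A brute-force sketch:

```python
def coding_capacity(t):
    """Count t-bit codes that are distinct under circular shift.

    Assumption: uniform bands (all 0s or all 1s) are excluded because they
    carry no strip edges to decode.
    """
    mask = (1 << t) - 1
    codes = set()
    for value in range(1, mask):          # skips 0 (all black) and mask (all white)
        best = value
        rotated = value
        for _ in range(t - 1):
            # Rotate left by one bit within a t-bit word.
            rotated = ((rotated << 1) & mask) | (rotated >> (t - 1))
            best = min(best, rotated)
        codes.add(best)                   # canonical form = minimum rotation
    return len(codes)


capacity_12 = coding_capacity(12)  # 350, matching the capacity stated for 12 b
```

The same function can be evaluated for other bit counts to see how quickly the capacity shrinks as strips are removed.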
The unit pixel size of the UAV camera is noted as s (m), the focal length is f (m), and the effective frame is l × w (pixel). When the AGL flying height is h (m) in the nadir photography, the ground sampling distance (GSD) is g (m), and the coverage is L × W (m). If the size of the target is d × d (m), the projection size of the target in the center of the frame is x × x (pixel). The following equations can be presented:

$$g = \frac{s\,h}{f}, \qquad L = l\,g, \qquad W = w\,g, \qquad x = \frac{d}{g}.$$

Particularly, for DJI Phantom 4 RTK, s = 2.41 μm, f = 8.8 mm, l = 5472, and w = 3648 (see Table II). The projection size of targets was varied by controlling the flying height of the UAV. The minimum flying height was 10.43 m, corresponding to the projection size of 105 pixels for the target in the center of the orthophoto. The maximum flying height was 109.5 m, corresponding to the projection size of 10 pixels. Each target type was sampled 192 times at each height [see Fig. 12(a)]. The viewing angle was varied by controlling the UAV to fly on a spherical surface with a radius of 15 m. The maximum pitch angle of the UAV was 84°, corresponding to the viewing angle of 7° in the center of images from the near-nadir flights. The minimum pitch angle was 12°, corresponding to the viewing angle of 78° in the center of images from the oblique flights. Each target type was sampled 256 times at each angle [see Fig. 12(b)]. The route planning is shown in Fig. 12, and DJI Phantom 4 RTK was used to conduct the experiment.
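The relation between flying height and projection size can be checked numerically. The sketch below assumes the usual pinhole relations for nadir imaging (g = s·h/f, coverage = frame size × g, x = d/g), which reproduce the projection sizes quoted above for the Phantom 4 RTK parameters. The function names are hypothetical.

```python
def gsd(pixel_size_m, focal_length_m, height_m):
    """Ground sampling distance g = s * h / f, in meters per pixel."""
    return pixel_size_m * height_m / focal_length_m


def coverage(gsd_m, frame_l_px, frame_w_px):
    """Ground coverage (L, W) in meters for a frame of l x w pixels."""
    return frame_l_px * gsd_m, frame_w_px * gsd_m


def projection_size(target_size_m, gsd_m):
    """Projected side length x = d / g of a target, in pixels."""
    return target_size_m / gsd_m


# Camera parameters quoted in the text for DJI Phantom 4 RTK.
S, F, L_PX, W_PX = 2.41e-6, 8.8e-3, 5472, 3648
D = 0.3  # target side length in meters

x_low = projection_size(D, gsd(S, F, 10.43))    # about 105 pixels
x_high = projection_size(D, gsd(S, F, 109.5))   # about 10 pixels
ground_l, ground_w = coverage(gsd(S, F, 10.43), L_PX, W_PX)
```

Inverting the same relation, h = x_target · 0 is not needed here: the height that yields a desired projection size x is simply h = d·f/(s·x), which is how a flight height can be chosen for a given target size.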
After detecting PXCCT and SNDRT using the proposed method and Agisoft Metashape, respectively, the number of recognition, correct recognition, and correct decoding results in each photo were counted. Correct, recall, and decoding rates of PXCCT and SNDRT were subsequently calculated with projection size or viewing angle as variables.
In experiments with projection size as the variable, the recall rates of PXCCT and SNDRT were the first to fall below 100% among the three indicators. The following analysis therefore focused on the recall rate to compare the robustness of two target types. As shown in Fig. 13, Agisoft Metashape can ensure that the recall rate is >99% when the SNDRT is projected to be ≥47 pixels, at which time the flying height is 23.30 m, the coverage is approximately 814.34 m², and the GSD is 0.63 cm. As the projection size decreases, the recall rate of SNDRT rapidly approaches 0. The proposed recognition algorithm for PXCCT guarantees a recall rate of >99% when the target is projected to be ≥37 pixels. In this case, the flying height is 29.59 m, the coverage is approximately 1313.35 m², and the GSD is 0.81 cm, which is a 27.0% increase in flying height compared with the control group.
Experiments with viewing angle as the variable have similar characteristics, with recall rate being the most important metric for the robustness of PXCCT and SNDRT. As shown in Fig. 14, Metashape can guarantee a recall rate of >99% for SNDRT when the viewing angle is ≤35°. The proposed algorithm for PXCCT can ensure that the recall rate is >99% when the viewing angle is ≤56°. There is a 60.0% improvement in the pitch angle range compared to that recorded in the control group. Dealing with flat viewing angles is important for improving the robustness of targets in oblique photography.
Fig. 19. Changes in the number of contours during recognition. δ1, δ2, δ3, δ4, and δ5 correspond to scenes of cement, brick, mud, gravel, and grass, respectively.
The identification and decoding of PXCCT by the proposed algorithm under demanding conditions is shown in Fig. 15. Even at small projection sizes and flat viewing angles, PXCCT can still be identified and decoded reliably, and accurate coordinates of the GCPs can be obtained. This implies that PXCCT can deal with both vertical (nadir or near-nadir) and oblique photography and ensure the reliability of the marking results.
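The flying heights and relative improvements quoted in this subsection can be cross-checked by inverting the pinhole relation x = d·f/(s·h) for h. The sketch below is only a consistency check under stated assumptions: the target size d = 0.30 m is inferred from the reported numbers, not quoted in the text, and the helper names are hypothetical.

```python
def max_flying_height(x_min_px, d=0.30, s=2.41e-6, f=8.8e-3):
    """Highest AGL height (m) at which a d x d (m) target still spans
    x_min_px pixels at the frame center (nadir view, pinhole model)."""
    return d * f / (s * x_min_px)


def relative_gain_percent(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new / baseline - 1.0) * 100.0


if __name__ == "__main__":
    h_sndrt = max_flying_height(47)   # SNDRT limit: 47 pixels
    h_pxcct = max_flying_height(37)   # PXCCT limit: 37 pixels
    print(f"SNDRT: {h_sndrt:.2f} m, PXCCT: {h_pxcct:.2f} m")
    print(f"height gain: {relative_gain_percent(h_pxcct, h_sndrt):.1f}%")
    print(f"angle gain:  {relative_gain_percent(56, 35):.1f}%")
```

Under these assumptions the computed heights agree with the reported 23.30 m and 29.59 m to within a few centimeters, and the 27.0% and 60.0% figures correspond to the ratios of the flying heights (equivalently, 47/37) and of the viewing angle limits (56/35), respectively.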

B. Coded Bits
The number of GCPs in UAV photogrammetry is usually much smaller than that in close-range photogrammetry. The PXCCT and SNDRT used in the above experiments have 12 coded bits. In this case, PXCCT has a coding capacity of 350, which exceeds the number of GCPs likely to be used in UAV surveys. The number of coded bits of a circular coded target is theoretically determined by the number of strips in the coded band, and varying the number of strips may also affect identification and decoding. Additional experiments were therefore performed on 8-b and 10-b PXCCT to ascertain the relationship between the number of coded bits and the robustness of PXCCT. Other conditions, including the size of the targets and the route planning, were consistent with the comparative experiment on the 12-b coded targets.
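One way to see where a capacity of 350 for 12 bits could come from is to count the binary codes that remain distinct under cyclic rotation of the coded band. The sketch below is an assumption about the coding rule, not the authors' stated definition: codes that are rotations of each other are treated as the same target, and the all-zero and all-one bands are excluded as unusable. The function names are hypothetical.

```python
def canonical_code(code: int, n_bits: int) -> int:
    """Smallest integer among all cyclic rotations of an n-bit code."""
    mask = (1 << n_bits) - 1
    best = code
    for _ in range(n_bits - 1):
        # Rotate left by one bit within n_bits.
        code = ((code << 1) & mask) | (code >> (n_bits - 1))
        best = min(best, code)
    return best


def coding_capacity(n_bits: int, exclude_uniform: bool = True) -> int:
    """Number of rotation-distinct codes for a circular band of n_bits."""
    codes = {canonical_code(c, n_bits) for c in range(1 << n_bits)}
    if exclude_uniform:
        # Assumed rule: all-zero and all-one bands carry no usable pattern.
        codes -= {0, (1 << n_bits) - 1}
    return len(codes)


if __name__ == "__main__":
    for n in (8, 10, 12):
        print(n, coding_capacity(n))
```

Under this assumed rule the 12-bit count is 350, matching the capacity quoted above. The same rule would give 106 codes for 10 bits and 34 for 8 bits; these are derived values, not figures reported by the authors.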
The findings of the experiments are shown in Figs. 16 and 17. The correct, recall, and decoding rates of the 10-b PXCCT are guaranteed to be >99% when the projection size is at least 35 pixels and the viewing angle is at most 64°. For the 8-b PXCCT under the same conditions, the minimum projection size is 25 pixels and the maximum viewing angle is 67°. In this case, the AGL flying height is 43.8 m, the coverage is approximately 2877.66 m², and the GSD is 1.20 cm. The results show that reducing the number of coded bits further improves the robustness of PXCCT and enhances its applicability in UAV surveys. Therefore, as long as the coding capacity allows, a smaller number of coded bits should be selected for more robust marking results.

C. Environment
The environmental and lighting conditions in UAV photogrammetry, which are more complex and changeable than those in close-range photogrammetry, challenge the robustness and reliability of the proposed method by affecting contour extraction and screening. To verify the adaptability of PXCCT to complex and variable conditions, experiments were conducted under different lighting scenes (sunny, cloudy, shadow occlusion, etc.), different environment scenes (cement, brick, mud, gravel, grass, etc.), and different disturbance conditions (slight mud, water, dust, etc.) (see Fig. 18). In each scene, 25 targets with a size of 0.15 × 0.15 m were set up, and the DJI Phantom 4 RTK was used to take 48 photos from various angles, giving 1200 samples per scene.
The number of contours in all environment scenes before and after the two contour screenings was counted and is visualized in Fig. 19. The number of contours obtained by the Suzuki algorithm was approximately 100 000. After rough screening based on perimeter and roundness, the number of contours quickly dropped to approximately 100. After fine screening using the GSA method, 25 contours remained, which is the number of PXCCTs in each photo. The results indicate that the contour screening strategy in this study can significantly improve the robustness of recognition and decoding and enable the application of PXCCT in UAV surveys.
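The rough screening step can be illustrated with a small filter on contour perimeter and roundness. This is a minimal sketch under assumptions, not the authors' implementation: roundness is taken as the common isoperimetric ratio 4πA/P², contours are plain lists of (x, y) vertices, and the threshold values are hypothetical placeholders that would need tuning (oblique views turn circles into ellipses, so the roundness bound cannot be too strict). The GSA fine screening is not reproduced here.

```python
import math


def perimeter(pts):
    """Length of the closed polygon through pts."""
    return sum(math.dist(pts[i], pts[(i + 1) % len(pts)])
               for i in range(len(pts)))


def area(pts):
    """Absolute polygon area via the shoelace formula."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def roundness(pts):
    """Isoperimetric ratio 4*pi*A/P^2: 1 for a circle, lower otherwise."""
    p = perimeter(pts)
    return 4.0 * math.pi * area(pts) / (p * p) if p > 0 else 0.0


def rough_screen(contours, min_perimeter=30.0, max_perimeter=2000.0,
                 min_roundness=0.6):
    """Keep contours whose size and shape are plausible for a circular target.
    All three thresholds are hypothetical values chosen for illustration."""
    return [c for c in contours
            if min_perimeter <= perimeter(c) <= max_perimeter
            and roundness(c) >= min_roundness]


def circle(cx, cy, r, n=64):
    """Helper that samples n points on a circle, for demonstration only."""
    return [(cx + r * math.cos(2 * math.pi * k / n),
             cy + r * math.sin(2 * math.pi * k / n)) for k in range(n)]


if __name__ == "__main__":
    candidates = [circle(50, 50, 20),                      # plausible target
                  [(0, 0), (100, 0), (100, 2), (0, 2)],    # thin strip
                  circle(10, 10, 2)]                       # too small
    print(len(rough_screen(candidates)))
```

A cheap geometric filter of this kind removes most background contours before any costlier test is applied, which is consistent with the drop from roughly 100 000 to roughly 100 contours reported above.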
The correct, recall, and decoding rates of PXCCT under different scenes are summarized in Table VI. The recognition and decoding result under half-shadow and half-light with brick as the background is shown in Fig. 18(c). Some PXCCTs in the grass scene were not identified because a target was contaminated with large muddy stains. The unidentified PXCCTs in the other scenes were caused by the wind blowing the foam boards over. Overall, PXCCT performs well under different environmental and lighting conditions, ensuring adaptability to the many potential scenarios of UAV surveys.

V. CONCLUSION
Presently, marking GCPs in the workflow of UAV photogrammetry projects still generally needs to be completed manually. This causes problems of low efficiency and human error, which negatively impact the subsequent data processing. This article proposes a new PXCCT and a corresponding recognition and decoding algorithm to automatically mark GCPs in UAV surveys. Through the rough and fine strategies for contour screening, the robustness and adaptability of the method were improved while its efficiency was maintained.
The results of the UAV survey experiment show that the proposed method can efficiently and accurately detect PXCCT in UAV images without relying on the internal parameters of the images or the positioning and orientation system (POS) information of the UAV. The recognition and decoding results of PXCCT indicate that its robustness is significantly higher than that of traditional coded targets. Accurate image coordinates can still be obtained by the proposed method when it is difficult to distinguish the center of traditional noncoded targets due to camera distortion, an excessively large viewing angle, or an excessively small projection size. The proposed method adapts well to the various environmental and lighting scenes that may occur in UAV surveys. These results show that the proposed method meets the requirements of UAV photogrammetry projects. It can reduce the time and labor costs of practitioners whose drones only have standard navigation GNSS, and it enables automatic and accurate marking of GCPs, especially in scenarios that require high precision, such as dam safety and natural landform mapping. Additionally, a comparison of the robustness of PXCCT with different numbers of coded bits shows that a smaller number of coded bits is recommended when the coding capacity allows.
In close-range photogrammetry, object coordinates of points of interest can be obtained with micron-level accuracy through a combination of factors, such as coded targets and metric cameras [46], [66]. Our future research will focus on how to take full advantage of high-precision GCP marking and on how marking accuracy affects the quality of photogrammetric products, such as digital elevation models and digital orthophoto maps.