Research on Static Vision-Based Target Localization for Astronaut Assistant Robots

To address the problem of localizing and grasping targets with an astronaut assistant robot, which is developed to assist crew members in completing specific tasks, and using a self-built visual system as an experimental platform, this paper proposes two improved static target localization methods based on binocular vision and the traditional localization method: (a) a localization method based on region of interest matching and (b) a localization method based on a pixel coordinate offset. Under the same conditions, the traditional binocular vision localization method and the two improved localization methods are used to locate a spherical target. The experimental results are compared and analysed, and the effectiveness of the improved methods is verified by two performance indicators: the localization accuracy and the localization time efficiency. The experimental results show that, to a certain extent, the two improved localization methods can improve the localization efficiency of the vision system and positively affect the performance and flexibility of astronaut assistant robots.


I. INTRODUCTION
In space exploration, astronaut assistant robots are playing an increasingly important role in assisting astronauts to complete designated work and can also replace astronauts in some dangerous tasks [1]. To complete these tasks, the robot should be able to effectively perceive the environment [2] and determine the location of a target. Here, visual technology has a significant advantage in that it can be used as the "eye" of the astronaut assistant robot to realize the localization function and provide location information [3].
In recent years, with the growing demand for intelligent systems in all walks of life, machine vision has developed rapidly and has gradually been applied to various fields [4] [5]. For example, the application of machine vision to vehicle detection in the transportation industry [6] can improve the reliability of advanced driver assistance systems (ADASs). In the aerospace field, machine vision and neural networks are combined for robot gesture recognition and body pose estimation [7] so that the robot can perform corresponding actions and realize human-computer interaction. In industry, machine vision is applied to pointer-type instrument detection [8] to help workers perform calibration tasks for certain instruments, which can solve the problem of the low efficiency and subjective error caused by manual calibration and can improve the measurement accuracy. Machine vision plays a particularly important role in many of the above applications.
At present, to overcome the shortcomings of traditional monocular vision [9], researchers are studying multi-eye vision systems and the corresponding measurement methods, and have obtained many meaningful results. In [10], a measurement method for non-cooperative spacecraft based on binocular vision was proposed. Although this method has high accuracy, the measurement process is slow because its stereo matching is very time-consuming. In another study [11], a binocular vision target localization method based on a point cloud was presented. This method generates a point cloud by combining the calibration result with the disparity map obtained by stereo matching, and performs noise reduction on the point cloud. A three-dimensional reconstruction of the target is achieved within an appropriate distance [12], but the reconstruction takes a long time, and a large amount of randomly distributed noise is generated together with the point cloud; even after denoising, this noise is likely to have a considerable impact on the localization accuracy [13]. Another work [14] introduced a method for extracting three-dimensional navigation lines of field roads by binocular vision. In this method, the roads are first distinguished from the surrounding environment by enhancing the recognition of image shadows [15] and by information fusion [16]. Second, the stereo matching image primitives are extracted [17]; a three-dimensional measurement is then realized after stereo matching, and finally the autonomous navigation line of the road is obtained [18]. The matching primitive extraction in this method is very complicated, and the stereo matching requires a long time. The experimental results of the method satisfy the accuracy requirements, but the error is still relatively large. In [19] and [20], three-eye and four-eye visual techniques were employed to achieve the target measurement. In [19], a three-eye camera model with a variable baseline was designed, in which the three cameras form three binocular subsystems. The depth information could be obtained from different angles, and the baseline distance could be adjusted according to different measurement targets [21]. Finally, the size of the measurement target was obtained by comprehensively processing the three binocular subsystems, and the method had good precision and stability. In [20], a four-dimensional reconstruction of a target was realized by the four-eye visual measurement system (FMS). Compared with the traditional binocular vision system, the FMS can form six binocular vision systems to increase the matching constraints and reduce the mismatch probability, which improves the accuracy of both the stereo matching and the measurement. However, the hardware and software of the three-eye and four-eye vision systems are complex, time-consuming, costly, and difficult to implement.
To solve the above problems and meet the localization requirements of astronaut assistant robots, two improved binocular vision localization methods are proposed in this paper: (a) a localization method based on region of interest matching, which uses the ROI to remove the image background and determine the image analysis range, and (b) a localization method based on a pixel coordinate offset, where the pixel coordinates of the target are obtained through the ROI and a coordinate transformation is realized by the pixel coordinate offset. Both methods utilize the ROI. They overcome the inability of the traditional monocular vision method to obtain depth information effectively, can effectively obtain three-dimensional information of the measurement target, and can improve the measurement accuracy while avoiding the high hardware and software complexity and the high time consumption of multi-eye vision systems. These methods improve the localization efficiency of the astronaut assistant robot vision system.
The structure of the rest of the paper is as follows: Chapter 2 introduces the structure of the visual localization system. Chapter 3 introduces two improved localization methods based on the traditional localization method. Chapter 4 uses the above methods for localization experiments, and the results for the localization accuracy and time efficiency are compared and analysed. Finally, Chapter 5 presents the conclusions and future work directions.

II. THE COMPOSITION OF THE VISUAL LOCALIZATION SYSTEM
The astronaut assistant robot is the AAR-2 [22] developed by the Shenyang Institute of Automation, Chinese Academy of Sciences. The structure of the system is shown in Figure 1. Part a is the visual system experimental platform, and part b is the astronaut assistant robot system. The experiments in this paper are carried out on platform a, and the visual platform is then transplanted to system b to provide a visual localization function for the robot. The designed vision system provides location information for AAR-2, enabling the grasping system to complete the grasping task. The method uses two industrial cameras to form a binocular vision system for the localization measurement. Compared with a monocular vision system, a binocular vision system can better obtain the depth information of a measurement target in space [23] and obtain the three-dimensional coordinates of the target in the world coordinate system. The main components of the vision system are as follows: two Basler gigabit network industrial cameras, two Kowa 12 mm fixed-focus lenses, two C-mount interfaces between the cameras and lenses, and a computer with an Intel Core i5-4460 CPU, 4 GB of RAM, and a 64-bit Win7 operating system.

FIGURE 1. Schematic diagram of the AAR-2 system
The operation flow of the binocular vision system is shown in Figure 2. First, the camera driver and software development kit (SDK) are installed on the computer, and the camera IP addresses are configured. The binocular camera is then controlled to acquire images, and image processing such as filter denoising, image enhancement and feature extraction is performed. Finally, the image processing result is input into the binocular localization scheme, and the localization result of the target in the world coordinate system is obtained.

III. INTRODUCTION TO THE PRINCIPLE OF TWO IMPROVED BINOCULAR LOCALIZATION METHODS
In this section, we introduce the binocular localization methods used in the visual system, i.e., the traditional localization method and the two improved localization methods, including their principles and related derivations, which lay the foundation for the subsequent experimental verification and result analysis. For the purpose of this paper, the binocular localization method has the following three main requirements: 1) ensure the real-time performance of the binocular localization method; 2) satisfy the accuracy requirements of the binocular localization method and ensure the measurement accuracy; 3) ensure the ability to locate measurement targets against a more complex background.

A. Implementation of the traditional localization method
To localize a measurement target in space and obtain its three-dimensional coordinates, the general flow of the binocular localization method [24] is shown in Figure 3. The methods used in this paper to implement the various parts of this flow are described in 1) to 4).

1) CAMERA OFFLINE CALIBRATION
First, this paper performs a camera calibration of the binocular vision system and establishes a correspondence between the image coordinate system and the world coordinate system, as shown in equation (1). After the calibration images are acquired, the left and right cameras are first calibrated individually to obtain the inner parameter matrices $M_1$ and $M_2$ and the outer parameter matrices $R_1$, $T_1$, $R_2$, and $T_2$ of each camera. These results are then used to calibrate the binocular camera and obtain its structural parameters, namely, the rotation matrix R and the translation matrix T. The process is an offline calibration and is completed by the Zhang Zhengyou calibration method [25].
$$s\,p = M\,[R \quad T]\,P \qquad (1)$$
where $s$ is a scale factor, $p$ is the image coordinate vector, $P$ is the world coordinate vector, $R$ and $T$ are the rotation matrix and translation matrix from the world coordinate system to the image coordinate system, respectively, and $M$ is the inner parameter matrix, which can be represented by equation (2). $R_1$ and $T_1$ are the rotation and translation matrices of the left camera, respectively, and $R_2$ and $T_2$ are the rotation and translation matrices of the right camera, respectively.
Stereo calibration images of the binocular vision system are shown in Figure 4. Table 2 shows the structural parameters of the binocular vision system, i.e., the relative positional relationship between the binocular camera coordinate systems, namely, the rotation matrix R and the translation matrix T. After the stereo calibration is completed, the positional relationship between the binocular camera and the calibration plate can be plotted, as shown in Figure 5, and the calibration error map of the binocular vision system can be obtained, as shown in Figure 6. Figure 5 shows how the position of the calibration plate changes relative to the binocular cameras within a certain range of the cameras.
Figure 5 shows that the relative distance between the two cameras is approximately 100 mm, and the absolute value of the first element of the translation matrix T in the calibration result is approximately 97.89 mm, a difference of only 2.11 mm.
From Figure 6, it can be seen that the minimum and maximum values of the calibration error are 0.35 pixels and 0.73 pixels, respectively, and the average error is 0.50 pixels, which satisfies the calibration requirements.
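The step from the two single-camera calibrations to the structural parameters R and T can be illustrated with a minimal sketch. It assumes the common convention that each camera maps a world point as X_c = R_i X_w + T_i; the function name `stereo_extrinsics` and the numeric extrinsics are hypothetical and are not the paper's calibration output (only the 97.89 mm and 100 mm figures come from the text).

```python
import numpy as np

def stereo_extrinsics(R1, T1, R2, T2):
    """Pose of the right camera relative to the left camera.

    Assumes X_c1 = R1 @ X_w + T1 and X_c2 = R2 @ X_w + T2,
    so that X_c2 = R @ X_c1 + T for the returned R and T.
    """
    R1, R2 = np.asarray(R1, float), np.asarray(R2, float)
    T1 = np.asarray(T1, float).reshape(3)
    T2 = np.asarray(T2, float).reshape(3)
    R = R2 @ R1.T      # the inverse of a rotation matrix is its transpose
    T = T2 - R @ T1
    return R, T

# Hypothetical extrinsics: both cameras look along the same direction and
# the right camera is shifted along the x-axis.
R, T = stereo_extrinsics(np.eye(3), [0.0, 0.0, 500.0],
                         np.eye(3), [-97.89, 0.0, 500.0])

baseline = abs(T[0])            # first element of T, in mm
deviation = 100.0 - baseline    # gap to the nominal 100 mm camera spacing
print(baseline, deviation)
```

Comparing the first element of T with the physically measured camera spacing, as the text does, is a quick sanity check on the stereo calibration.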

2) ORIGINAL IMAGE ACQUISITION AND STEREO CORRECTION
In general, circular targets are easy to detect and identify. Therefore, a suitable circular marker is usually designed and attached at an appropriate position on the measurement target to facilitate target localization. With such a marker, the measurement target can have any shape, which makes the approach more universal. It is therefore meaningful to use a spherical target as the measurement object for measurement and localization.
After the camera calibration is completed, spherical target images are acquired by the binocular vision system (a ping-pong ball is used as the spherical target, and a water cup is added to increase the complexity of the image background); one set of images is shown in Figure 7. Image processing is then performed, which involves denoising and image enhancement. Denoising removes the various forms of noise introduced during image acquisition, and image enhancement increases the contrast of the image to make the measurement target clearer. This processing is indispensable for reducing subsequent measurement errors and improving the accuracy. In practical applications, stereo correction [26] requires that the optical axes of the binocular cameras be strictly parallel, the acquired images be completely coplanar, and the pixel rows be aligned [27]. Therefore, distortion correction is also performed in the stereo correction process [28], which reduces the distortion of the image and makes it as close as possible to the undistorted image; this is beneficial for the row alignment of the pixels. However, because of the placement of the binocular cameras, it is almost impossible to physically achieve completely coplanar and precisely aligned imaging planes. For this reason, this paper uses the Bouguet epipolar line correction algorithm to correct the binocular images while correcting the distortion, so that the binocular images are mathematically aligned to the same observation plane instead of being physically aligned. The mathematical model of the Bouguet epipolar correction algorithm [29] is as follows:

(a) Decompose the rotation matrix $R$ of the right camera relative to the left camera into two composite matrices $r_l$ and $r_r$ so that each camera performs part of the rotation. After this step, the two cameras image in parallel, but the corresponding pixel rows of the two images are not yet aligned.

(b) Construct the transformation matrix $R_{rect}$ from the translation matrix $T$ of the right camera relative to the left camera so that the baseline and the imaging plane are parallel. $R_{rect}$ is constructed as follows. First, construct $e_1$. The transformation matrix moves the epipole of the left image to infinity, which makes the epipolar lines horizontal. The final image plane should be parallel to the line connecting the origins of the left and right camera coordinate systems, so the translation vector $T$ between the cameras can be taken as the direction of the left epipole. The vector $T$ is normalized to the unit vector $e_1$:
$$e_1 = \frac{T}{\|T\|}$$
Second, construct $e_2$. $e_2$ is required to be orthogonal to $e_1$; the direction vector $e_0 = (0, 0, 1)$ of the optical axis is selected as an auxiliary vector, and its outer product with $e_1$ is normalized to obtain the unit vector $e_2$:
$$e_2 = \frac{e_0 \times e_1}{\|e_0 \times e_1\|}$$
Then construct $e_3$. $e_3$ is required to be orthogonal to $e_1$ and $e_2$. After constructing $e_1$ and $e_2$, $e_3$ can be obtained by the outer product of $e_1$ and $e_2$:
$$e_3 = e_1 \times e_2$$
A schematic diagram of the stereo correction process is shown in Figure 8. After the stereo correction is completed, the final result is obtained by image cropping, as shown in Figure 9. At this time, the left and right images of the binocular vision system are coplanar, and the pixel rows are aligned, which reduces the search range of the subsequent stereo matching from a two-dimensional search to a one-dimensional search [30] and filters out the non-matching points. In this way, the point matching a pixel in one image can be searched for directly along the corresponding pixel row of the other image, which improves the stereo matching efficiency.

In binocular measurement, two-dimensional images are acquired through the two cameras, and the parallax information is then obtained by finding the correspondence between the pixels of the two images. Furthermore, a disparity map of the two images can be obtained. Finally, the coordinates in the three-dimensional world coordinate system corresponding to the pixel points in the two-dimensional image can be obtained by using the triangulation principle and the disparity information [32].
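The construction of the rectifying transformation from the translation vector described in step (b) can be sketched as follows. This is a minimal illustration under the assumption that $e_2$ is the normalized outer product of $e_0 = (0, 0, 1)$ with $e_1$ and that the three unit vectors are stacked as rows; the function name `rectifying_rotation` and the sample translation vector are hypothetical.

```python
import numpy as np

def rectifying_rotation(T):
    """Build a rotation whose rows are e1, e2, e3 derived from the baseline T."""
    T = np.asarray(T, dtype=float).reshape(3)
    e1 = T / np.linalg.norm(T)                 # direction of the baseline
    e0 = np.array([0.0, 0.0, 1.0])             # optical-axis direction
    c = np.cross(e0, e1)
    n = np.linalg.norm(c)
    if n < 1e-12:
        raise ValueError("baseline is parallel to the optical axis")
    e2 = c / n                                 # orthogonal to e1 and e0
    e3 = np.cross(e1, e2)                      # completes the right-handed basis
    return np.vstack([e1, e2, e3])

# Hypothetical translation vector in mm (mostly along the x-axis).
T_example = np.array([-97.89, 1.5, -0.8])
R_rect = rectifying_rotation(T_example)
print(R_rect @ T_example)   # the baseline is mapped onto the x-axis
```

After this rotation the baseline has no component along the second and third axes, which is what makes the epipolar lines horizontal.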

FIGURE 9. Chart of the stereo correction result
On the basis of the stereo correction, the pixel rows of the two images are already aligned; when stereo matching is performed, the two corresponding pixel points can therefore be searched for in the same pixel row, and the corresponding disparity value is obtained. A schematic diagram of the matching process is shown in Figure 10, where boxes 1 to 5 of the left image correspond to boxes 1 to 5 of the right image, respectively. Starting from box position 5 in the left image, box positions 1 to 5 in the same pixel row of the right image are searched until the corresponding box position is found. It is therefore not necessary to search the entire image for matching points; it is only necessary to search one by one within the corresponding pixel row. To a certain extent, this approach reduces the possibility of mismatching and improves the accuracy and speed of stereo matching. After the stereo correction, the row coordinates $v_1$ and $v_2$ of corresponding points are the same, and the following relationship is obtained: $v_1 = v_2$. Stereo matching is then performed, $(u_1, v_1)$ and $(u_2, v_2)$ are obtained as a pair of matching points, and the disparity value can be expressed by equation (10), where $d$ is the disparity value between the matching point pair and the scale term is the physical size of the pixel.
At this point, the disparity value of one pair of matching points has been obtained. The disparity values of the other matching point pairs can be obtained in the same way, and the disparity map built from this disparity information is used as the basis for the measurement.
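The one-dimensional search along an aligned pixel row can be illustrated with a small sketch. It uses a plain sum-of-absolute-differences (SAD) window comparison, which is a deliberately simple stand-in and not the SGBM algorithm used in this paper; the function name `match_along_row`, the window size, and the synthetic images are assumptions made for the example.

```python
import numpy as np

def match_along_row(left, right, row, col, half=2, max_disp=16):
    """Return the disparity d (in pixels) whose window in the right image
    best matches the window centred at (row, col) in the left image.
    Only the same pixel row is searched, as allowed by row alignment."""
    patch = left[row - half:row + half + 1, col - half:col + half + 1].astype(float)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp + 1):
        c = col - d                      # candidate column in the right image
        if c - half < 0:
            break                        # window would leave the image
        cand = right[row - half:row + half + 1, c - half:c + half + 1].astype(float)
        cost = np.abs(patch - cand).sum()
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

# Synthetic rectified pair: the right image is the left image shifted
# 5 pixels to the left, so every point has a disparity of 5 pixels.
rng = np.random.default_rng(0)
left_img = rng.integers(0, 256, size=(32, 64))
right_img = np.roll(left_img, -5, axis=1)

disparity = match_along_row(left_img, right_img, row=10, col=30)
print(disparity)
```

The search visits at most `max_disp + 1` windows per pixel instead of every window in the image, which is the saving that row alignment provides.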
For the stereo matching problem, this paper uses a semiglobal block matching (SGBM) algorithm to achieve the stereo matching of the binocular images [33], thus obtaining the disparity map of the two images, as shown in Figure 11. As a result, we can obtain the depth information of the target. With the world coordinate system and the target point $P$ defined in the imaging model, a series of similar triangles is obtained, and the principle of triangulation [35] can be derived. The three-dimensional coordinates of the target point can be obtained from equations (11), (12) and (13). In this way, the three-dimensional coordinates of the measurement target in the world coordinate system are obtained; the specific measurement results are given in the experimental results and analysis section.
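How a disparity value turns into three-dimensional coordinates can be sketched with the standard parallel-axis triangulation relations. This is an illustration under assumptions: a rectified pair with equal focal lengths, the focal length and disparity both expressed in pixels, and coordinates expressed in the left camera frame. It is not necessarily the exact form of equations (11) to (13), and the function name and numeric values are hypothetical.

```python
def triangulate_from_disparity(u, v, d, f, baseline, u0, v0):
    """Recover (X, Y, Z) in the left camera frame from a pixel (u, v)
    and its disparity d, using similar triangles.

    f        focal length in pixels (assumed equal for both cameras)
    baseline distance between the optical centres (sets the output unit)
    u0, v0   principal point in pixels
    """
    if d <= 0:
        raise ValueError("disparity must be positive")
    Z = f * baseline / d          # depth is inversely proportional to disparity
    X = (u - u0) * Z / f
    Y = (v - v0) * Z / f
    return X, Y, Z

# Hypothetical values: 1000 px focal length, 100 mm baseline.
X, Y, Z = triangulate_from_disparity(u=1234.0, v=1004.0, d=50.0,
                                     f=1000.0, baseline=100.0,
                                     u0=1224.0, v0=1024.0)
print(X, Y, Z)
```

Because depth is inversely proportional to disparity, a fixed disparity error of one pixel costs more depth accuracy for distant targets than for near ones.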

B. Implementation of the improved localization method
In the visual system, under certain conditions, the traditional binocular localization method often takes a long time to complete the localization task. The main reason is that the stereo matching of the binocular images in the traditional method is very time-consuming: the number of matching points is large, which results in a slow matching speed. Shortening the stereo matching time and speeding up the stereo matching are the keys to solving this problem, and the improved binocular localization methods described below are built on these two ideas.

1) LOCALIZATION METHOD BASED ON REGION OF INTEREST MATCHING
In image processing and analysis, we usually focus on the range of the target in the image, that is, the region of interest (ROI), rather than the whole image. If the ROI is introduced into the binocular vision localization method and used to specify the range of the region to be analysed in the image, the image processing accuracy is increased to a certain extent, the processing time is reduced, and the efficiency of the image analysis is improved.
Inspired by the above points, a method that merges the stereo matching part of the traditional binocular localization method with the ROI is proposed, and a schematic diagram of the matching process is shown in Figure 13. First, the target range is determined in the pixel coordinate systems of the left and right original images, leaving a certain margin; the ROI is obtained by the rectangular extraction method, and the extraction regions of the two images are defined as ROI1 and ROI2. Second, two new blank images with the same size as the original images are created and stored as matrices. Third, the ROI is copied to the corresponding ROI1 and ROI2 in the new images by loading the image mask, and the ROI images with the background removed are obtained. Finally, the new ROI images are stereo-matched to obtain the disparity information, and the three-dimensional information of the measurement target can be obtained by using the triangulation principle.
At this point, the ROI matching process can be described by three functional relationships: the ROI extraction equation, the ROI image mask equation, and the ROI image matching equation. The disparity map of the binocular images obtained with the localization method using region of interest matching is shown in Figure 14.

FIGURE 14. Binocular image ROI disparity map
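The background-removal steps of the ROI matching method (rectangular extraction, blank image of the same size, copy through the mask) can be sketched as follows. This is a minimal NumPy illustration; the function name `keep_roi`, the rectangle parameters, and the small test image are hypothetical, and the paper's own extraction, mask, and matching equations are not reproduced here.

```python
import numpy as np

def keep_roi(image, x, y, w, h):
    """Return an image of the same size as `image` in which everything
    outside the rectangle (x, y, w, h) is set to zero."""
    mask = np.zeros(image.shape[:2], dtype=bool)
    mask[y:y + h, x:x + w] = True          # rectangular ROI mask
    out = np.zeros_like(image)             # blank image, same size as the original
    out[mask] = image[mask]                # copy only the ROI pixels
    return out

# Hypothetical 6 x 8 grey image with non-zero values everywhere.
img = np.arange(1, 49).reshape(6, 8)
roi_img = keep_roi(img, x=2, y=1, w=3, h=2)
print(roi_img)
```

Because the output keeps the original resolution, the pixel coordinates inside the ROI are unchanged, so the calibration parameters remain valid for the subsequent stereo matching.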

2) LOCALIZATION METHOD BASED ON A PIXEL COORDINATE OFFSET
It can be seen from the analysis that the abovementioned localization method based on region of interest matching accelerates stereo matching and improves its efficiency to some extent, but its impact on the accuracy does not seem to be appreciable. Therefore, based on the ROI employed in the above method, a localization method based on a pixel coordinate offset is proposed, the principle of which is shown in Figure 10. First, we determine the approximate range of the measurement target in the pixel coordinate systems of the left and right images, determine the area where the target is located by using the ROI, and record the pixel coordinates of the upper-left corner vertex of the rectangular area in the left image, the pixel coordinates of the upper-left corner vertex of the rectangular area in the right image, and the pixel coordinates of the target in the ROI pixel coordinate systems. Then, the coordinate system is transformed: the target coordinates in each ROI pixel coordinate system are converted to the pixel coordinate system of the corresponding original image by means of the recorded offset. Finally, combined with the imaging model, the three-dimensional information of the spherical target in the world coordinate system is obtained by the least square method. The reconstruction principle used to achieve a three-dimensional reconstruction of the target point is shown in Figure 15. From the imaging model of the camera [36], equations (18) and (19) can be obtained, which describe the imaging relationship of a point in the world coordinate system in the left and right camera image coordinate systems. The two equations can be rearranged to eliminate Z, and the pixel coordinates of the target are then introduced into them. Equation (20) written in matrix form is:

$$MX = N \qquad (22)$$
The world coordinates of the measurement target obtained by the least square method [37] are:

$$X = (M^{T}M)^{-1}M^{T}N$$
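The two steps of the pixel coordinate offset method, shifting ROI coordinates back to the original image and solving the overdetermined system by least squares, can be sketched together. This is an illustration under assumptions: each camera is described by a 3 x 4 projection matrix, the depth factor is eliminated from the two image equations of each camera to give four linear equations in the three unknowns, and all names (`roi_to_image`, `triangulate_least_squares`), matrices, and ROI origins are hypothetical.

```python
import numpy as np

def roi_to_image(uv_roi, roi_origin):
    """Shift a pixel found inside an ROI back to original-image coordinates
    by adding the upper-left corner of the ROI (the pixel coordinate offset)."""
    return uv_roi[0] + roi_origin[0], uv_roi[1] + roi_origin[1]

def triangulate_least_squares(P_left, P_right, uv_left, uv_right):
    """Solve M X = N in the least-squares sense for the 3-D point X."""
    rows = []
    for P, (u, v) in ((np.asarray(P_left, float), uv_left),
                      (np.asarray(P_right, float), uv_right)):
        rows.append(u * P[2] - P[0])     # depth factor eliminated, u equation
        rows.append(v * P[2] - P[1])     # depth factor eliminated, v equation
    A = np.vstack(rows)                  # 4 x 4: coefficients of [X, Y, Z, 1]
    M, N = A[:, :3], -A[:, 3]
    X, *_ = np.linalg.lstsq(M, N, rcond=None)
    return X

def project(P, X):
    """Pinhole projection, used here only to create synthetic test pixels."""
    x = P @ np.append(X, 1.0)
    return x[0] / x[2], x[1] / x[2]

# Hypothetical rig: identical intrinsics, right camera shifted by 100 mm.
K = np.array([[1000.0, 0.0, 1224.0], [0.0, 1000.0, 1024.0], [0.0, 0.0, 1.0]])
P_left = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_right = K @ np.hstack([np.eye(3), np.array([[-100.0], [0.0], [0.0]])])

X_true = np.array([20.0, -10.0, 1500.0])
origin_left, origin_right = (1100, 900), (1050, 900)   # ROI upper-left corners

# Pretend the target centre was detected inside each ROI.
ul, vl = project(P_left, X_true)
ur, vr = project(P_right, X_true)
uv_left_roi = (ul - origin_left[0], vl - origin_left[1])
uv_right_roi = (ur - origin_right[0], vr - origin_right[1])

X_est = triangulate_least_squares(
    P_left, P_right,
    roi_to_image(uv_left_roi, origin_left),
    roi_to_image(uv_right_roi, origin_right),
)
print(X_est)
```

Only one pixel pair per target enters the solver, so no dense disparity map is needed, which is consistent with the time saving discussed for this method.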

IV. EXPERIMENTAL RESULTS AND ANALYSIS
The main problem addressed in this paper is the limited accuracy of the measurement results and the large time consumption of binocular visual localization for a large field of view and a complex image background. The traditional binocular localization method can localize a spherical target, but it can be further improved in terms of localization accuracy and time efficiency. This section uses the binocular vision system described in Chapter 2 to verify, by experiments under the same conditions, the traditional binocular vision localization method and the two improved localization methods introduced in this paper. We then present the experimental results and a comparative analysis. The localization methods are evaluated based on two performance indexes: the localization accuracy and the localization time efficiency.

A. Localization accuracy
For the localization accuracy, a ping-pong ball is used as the spherical measurement target and is placed in the measurement field of view for the experimental verification. The ball is positioned horizontally at 10 different positions, and the distance between adjacent positions is set to 22.5 mm. A set of binocular images is then collected for the spherical target at each of the 10 positions. The resolution of each image is 2448 × 2048 pixels, and some of the images collected for the localization experiments are shown in Figure 16. First, the binocular cameras are calibrated, the conversion of the measurement target from world coordinates to image coordinates is obtained, and the measurement model is established. The traditional localization method (T-LM), the localization method based on region of interest matching (ROI-LM) and the localization method based on the pixel coordinate offset (PCO-LM) are then applied to perform the localization calculation. Table 3 lists the final measurement results of the localization experiment, i.e., the three-dimensional coordinates of the target in the world coordinate system obtained with the three localization methods and the corresponding errors. Figure 17 shows that, in terms of the overall distribution of the error, the traditional localization method and the localization method based on region of interest matching exhibit large error fluctuations, ranging from 3.99 mm to 12.48 mm and from 3.71 mm to 11.69 mm, respectively. The localization method based on the pixel coordinate offset exhibits a small error fluctuation, within the range of 1.50 mm to 6.33 mm, and is relatively stable; it also has the smallest measurement error. At the 4th, 5th and 6th measurement points, the measurement error of the traditional localization method is lower than that of the localization method based on region of interest matching, and at the 6th measurement point, the measurement error of the traditional localization method is almost the same as that of the localization method based on the pixel coordinate offset. The reason for this result may be that the stereo matching works better at these points and a more accurate disparity value is obtained, thereby improving the accuracy of the measurement. As shown in Figure 18, the average errors of the three methods are 8.63 mm, 8.36 mm, and 3.62 mm, respectively. The error of the traditional localization method is slightly larger than that of the localization method based on region of interest matching, and the errors of the two methods are relatively similar. The localization method based on the pixel coordinate offset exhibits a smaller error: its accuracy is improved by 56.8% compared with the accuracy of the traditional localization method, and its localization accuracy is significantly better than that of the other two methods. The localization result within the measurement range is relatively stable, which can meet the measurement accuracy requirements of the astronaut assistant robot binocular vision system.

B. Localization time efficiency
To verify that the improved binocular localization methods in this paper can effectively improve the localization efficiency for a spherical measurement target, shorten the localization time to some extent, and improve the overall performance of the system, we use the three methods to locate the ping-pong ball at the above 10 positions and record the time consumption. The time consumption of the three localization methods is shown in Figure 19. In theory, the fastest of the three methods should be the localization method based on the pixel coordinate offset, followed by the localization method based on region of interest matching and finally the traditional localization method. This order results from the fact that, compared with the traditional localization method, the localization method based on the pixel coordinate offset saves a lot of time because the pixel coordinates of the spherical target are determined directly after the ROI is extracted from the image and are then converted to the original image coordinates by the coordinate transformation. The localization method based on region of interest matching reduces the range of the stereo matching regions and the background complexity. It can be seen from the above analysis that the localization method based on the pixel coordinate offset has obvious advantages in terms of time efficiency. In a larger measurement field of view, the size of the ROI is unchanged, and the noise is random. Compared with the traditional localization method and the localization method based on region of interest matching, the measurement efficiency of this method is considerably higher, which can meet the measurement efficiency requirements of the astronaut assistant robot binocular vision system.

V. CONCLUSION
In summary, this paper mainly solves the localization problem of the binocular vision system of an astronaut assistant robot. Based on the traditional localization method, two improved binocular localization methods are proposed: a localization method based on region of interest matching and a localization method based on a pixel coordinate offset. The innovations and contributions of this paper are as follows: (a) We develop a binocular vision experiment platform for an astronaut assistant robot, which is conducive to the verification and implementation of the methods. (b) The localization method based on region of interest matching achieves the localization by using the ROI to remove the complex background of the image and to determine the stereo matching region, which improves the localization accuracy and reduces the time consumption to some extent. (c) On the basis of the ROI, the localization method based on the pixel coordinate offset uses the ROI to determine the search range and extract the target pixel coordinates, and the coordinate system is transformed by means of the pixel coordinate offset to complete the localization. This method has a high localization accuracy and low time consumption. The experimental results show that our methods can effectively improve the localization accuracy, reduce the time consumption, improve the localization efficiency, and meet the localization requirements. In addition, the localization method based on the pixel coordinate offset is especially prominent in terms of localization accuracy and time efficiency. Therefore, this method is applied to the astronaut assistant robot binocular vision system to provide position information.
Future work will mainly include (a) optimizing the localization methods of this paper and (b) identifying, tracking and locating dynamic targets, with a focus on the localization accuracy and efficiency, and performing an experimental verification with dynamic targets.

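FIGURE 2. The configuration and processing flow of the binocular vision system

In the inner parameter matrix of equation (2), $M = \begin{bmatrix} f_x & \gamma & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}$, $f_x$ and $f_y$ represent the focal lengths normalized along the x-axis and the y-axis, respectively, $\gamma$ is the tilt factor, and $u_0$ and $v_0$ are the coordinates of the image principal point. The structural parameters (the rotation matrix R and translation matrix T) of the binocular cameras can be obtained by equation (3):
$$R = R_2 R_1^{-1}, \qquad T = T_2 - R_2 R_1^{-1} T_1 \qquad (3)$$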

FIGURE 4. Calibration images of the binocular vision system

FIGURE 6. Calibration error map of the binocular vision system

Finally, a transformation matrix $R_{rect}$ is obtained, which moves the epipole of the left camera image to infinity, causing the image to rotate and the epipolar lines to become horizontal. $R_{rect}$ consists of $e_1$, $e_2$ and $e_3$:
$$R_{rect} = \begin{bmatrix} e_1^{T} \\ e_2^{T} \\ e_3^{T} \end{bmatrix}$$
(c) The final correction matrices are obtained by multiplying $R_{rect}$ by the composite matrices $r_l$ and $r_r$ of $R$, which achieves the pixel row alignment of the images. The correction matrices are as follows:
$$R_l' = R_{rect}\, r_l, \qquad R_r' = R_{rect}\, r_r$$

FIGURE 8. Schematic diagram of the stereo correction process

3) STEREO MATCHING
In binocular vision measurement, stereo matching is a difficult and key problem. Stereo matching technologies can be divided into three categories [31]: feature-based stereo matching technology, region-based stereo matching technology and phase-based stereo matching technology.

FIGURE 10. Schematic diagram of the matching process

It can be assumed that the pixel coordinates of a pair of corresponding points in the left and right images in Figure 10 are $(u_1, v_1)$ and $(u_2, v_2)$, respectively; after the stereo correction is completed, the imaging point of the same target point in the world coordinate system in the left camera is at the right position of the image, whereas the imaging point in the right camera is at

FIGURE 11. Disparity map of the binocular images

physical coordinate systems of the left and right cameras, respectively. $f_1$ and $f_2$ are the focal lengths of the left and right cameras, respectively, the baseline length is $T_x$, the segments $PO_{c1}$ and $PO_{c2}$ are projected onto the horizontal plane of the optical axis, and the corresponding points for points $P$, $P_1$, and $P_2$ are $P'$, $P_1'$, and $P_2'$

FIGURE 12. Schematic diagram of the imaging principle of the binocular vision system

FIGURE 13. Schematic diagram of the ROI matching process

Second, the ROIs in the left and right images are extracted as ROI11 and ROI22, respectively, to reduce the search domain. The target is detected by the Hough transform, which determines the pixel coordinates of the spherical target.

parameter values of the Z direction in the left and right camera coordinate systems.

FIGURE 15. Schematic diagram of the localization method of the pixel coordinate offset

FIGURE 16. Part of the target images

FIGURE 19. Comparison of the time consumption of the three localization methods

TABLE I. PARAMETERS OF THE BINOCULAR VISION SYSTEM

FIGURE 7. Example of the acquired images
The extraction result is used as the input of the ROI image mask equation, and a mask image with the same resolution as the original image is obtained. Then, the mask