A Corner Detection Method for Conventional Light Field Camera by Jointly Using Line-Features

Micro-lens array (MLA) based light field cameras have been commercialized and have found a wide range of applications in recent years. To promote these applications, especially 3D reconstruction, it is crucial to calibrate light field cameras. Among the calibration steps, it is of great importance to detect corners in checkerboard images and establish the point-to-point correspondence between the checkerboard and its images. However, for conventional light field cameras, almost all existing algorithms detect corners in the sub-aperture images instead of the raw images. In this paper, a corner detection algorithm that recognizes corners in the raw images of conventional light field cameras is proposed. First, template matching is used to detect the 3D lines in the image space of the main lens. Then, the locations of the corners are obtained by calculating the intersections of the 3D line segments and reprojecting them to the raw image. Experimental results demonstrate that the algorithm achieves good performance on actual datasets with blurred, low-resolution micro lens images.


I. INTRODUCTION
Light field photography has advanced considerably in recent years, since it can capture both the spatial and angular information of light rays at the same time. Various kinds of light field cameras have been proposed [1]-[5]. Among them, micro-lens array (MLA) based light field cameras are the most widely used [4], [5]. According to the functionality of the MLA in the imaging path, MLA based light field cameras can be divided into two types, i.e., conventional light field cameras and focused light field cameras, and both have been commercialized [6], [7].
Light field technology has wide applications in multiple fields, for instance, depth estimation [8]-[10], light field super-resolution [11], digital refocusing [12], synthetic aperture imaging [13], [14] and visual simultaneous localization and mapping (VSLAM) [15], [16]. To promote these applications, especially 3D reconstruction, it is fundamental to calibrate light field cameras. Camera calibration, which aims to establish the correspondence between pixels on the sensor and the rays outside the camera, relies on a series of photos of a checkerboard with different poses. Corner detection in checkerboard images is the basis of the whole calibration algorithm; its performance directly influences that of the calibration algorithm, and it is the key to the whole calibration process. (The associate editor coordinating the review of this manuscript and approving it for publication was Gangyi Jiang.)
In this paper, a corner detection algorithm is proposed for checkerboard images of conventional light field cameras. Taking raw images as input, the algorithm finds the locations of the 3D corner points of the checkerboard image in the image space of the main lens by calculating intersections of pairs of 3D line segments. After projecting these 3D corner points onto the micro-lens images, the coordinates of the 2D corner points in the raw images are acquired. Our code is publicly available. Our main contributions are as follows: 1) We propose a joint detection algorithm that recognizes 2D line features in micro-lens images with the help of 3D line segments in the image space of the main lens. Owing to the joint use of multiple micro-lens images, this line detection algorithm achieves results that are more accurate and robust than independent line detection in each micro-lens image. Besides, the parameters of the 3D line segments are obtained simultaneously.
2) We propose a corner detection algorithm suitable for raw images of conventional light field cameras. The locations of corners in the raw images are acquired by calculating the intersections of 3D line segments and reprojecting the intersections to the raw image. By doing so, the difficulty of directly detecting corners in micro-lens images, which are of extremely low resolution, is avoided. Experimental results demonstrate that the algorithm is accurate and works well on blurred actual datasets.

II. RELATED WORKS
Dansereau et al. [17] were the first to deliver an end-to-end geometric calibration method for conventional light field cameras. Sub-aperture images were first extracted from raw images, and corners in all sub-aperture images were detected as the input of the subsequent calibration process. Reference [18] introduced a generic multi-projection-center model for the calibration of light field cameras, and their algorithm can be applied to both conventional and focused light field cameras. However, for conventional light field cameras, sub-aperture images still have to be extracted before corner detection. Reference [19] presented a two-step calibration scheme for light field cameras, in which intrinsic parameters with different properties were estimated in each step. Intrinsic parameters relating to the main lens and extrinsic parameters were first determined using the sub-aperture image at the center view. Then disparities between different sub-apertures were determined using the corner detection results of the other sub-aperture images and were further utilized to estimate the remaining intrinsic parameters. All of the above algorithms recognize corners in sub-aperture images; however, the resolution of sub-aperture images is far lower than that of the raw image. Take the Lytro Illum, a high-end hand-held conventional light field camera, as an example: the resolution of its raw image is 7728 × 5368, while the resolution of the sub-aperture images extracted from raw images is only 625 × 434, which negatively affects the accuracy of corner detection. Moreover, the final detection result cannot strictly guarantee that the detected corners of different sub-aperture images lie on a single parallax line in the epipolar plane image (EPI), since corners in different sub-aperture images are detected separately. Consequently, when calculating the disparity of a corner, a line fitting method [19] has to be adopted.
Besides, there are some algorithms that detect corners directly in raw images. Noury et al. [20] proposed a corner detection algorithm consisting of two steps. They first classified micro lens images according to their characteristics, and then detected subpixel corner locations in the micro lens images containing corners using a pattern registration method. Nousias et al. [21] estimated the locations of corners in micro lens images by calculating the intersection point of two potential saddle axes. However, the two methods mentioned above are designed only for focused light field cameras, which have clear micro lens images. For conventional light field cameras, the resolution of micro lens images is far lower than that of focused light field cameras. For instance, the resolution of the micro lens images of the Lytro Illum, a typical conventional light field camera, is only 15 × 15, while that of the Raytrix R5, a typical focused light field camera, reaches 30 × 30. Moreover, in conventional light field cameras each micro lens splits light rather than focusing it, which makes each micro lens image more blurred than that of a focused light field camera. Thus, low resolution and blurred micro lens images, which are inherent characteristics of conventional light field cameras, become the largest challenge for detecting corners in their raw images. To avoid directly detecting corners, [22] presented a calibration algorithm using ''line features''. They took high resolution raw images as input, rather than sub-aperture images with relatively low resolution. Instead of detecting micro lens images with corners, they detected micro lens images with ''line features'', i.e., the 2D images of the 3D line segments connecting two adjacent corners.
However, this method neither directly exports corner locations in the raw image nor uses corner correspondences to calibrate cameras. Instead, it solves the intrinsic parameters of light field cameras by establishing the relationship between line features and points on 3D line segments. Inspired by the line feature detection method for raw images in [22], we further improve their work. Our algorithm not only boosts the accuracy of the line feature detection results, but also exports the corner locations in the raw image.

III. PREPARATIONS
In the optical path model of this algorithm, the main lens and the micro lens array are regarded as a thin lens and a pinhole array, respectively. To better express the algorithm, three coordinate systems are defined in this paper.
The first is the ''world coordinate system'', the origin of which is set at the center of the micro lens array. The Z axis is perpendicular to the micro lens array and points out of the camera. The X and Y axes are located on the plane of the micro lens array (any two perpendicular lines on this plane). As shown in Figure 1, the distance between the micro lens array and the sensor is d, and the distance between the main lens and the sensor is D. (X_0, Y_0, Z_0) represents the origin of the image coordinate system of raw images in the world coordinate system. (X_corner, Y_corner, Z_corner) represents the image point of a checkerboard corner in the image space of the main lens. (X^i_mla, Y^i_mla, Z^i_mla) represents the center of the i-th micro lens.
The second is the ''image coordinate system of raw images''. For each pixel, its image coordinate denotes its pixel index in the raw image. The origin of this coordinate system is at the top left corner of the raw image, and its coordinate in the world coordinate system is denoted as (X_0, Y_0, Z_0).

The third is the ''micro lens image coordinate system''. Every micro lens image corresponds to a ''micro lens image coordinate system'', the origin of which is set at the center of this micro lens image. The directions of the coordinate axes are the same as those of the world coordinate system. For the i-th micro lens image, suppose the coordinate of its center in the ''image coordinate system of raw images'' is (U^i_c, V^i_c); the pixel pitch d_pix, i.e., the distance between two adjacent pixels, converts these pixel coordinates to metric coordinates in the world coordinate system. After imaging by the i-th micro lens, suppose the 2D projection of the 3D image point (X_corner, Y_corner, Z_corner) in the i-th micro lens image is (U^i_s, V^i_s), expressed in the ''micro lens image coordinate system''. Then (U^i_s, V^i_s) can be calculated by the pinhole projection

λ (U^i_s, V^i_s, 1)^T = K^i [R|T^i] (X_corner, Y_corner, Z_corner, 1)^T,

where K^i is the intrinsic matrix of the pinhole camera corresponding to the i-th micro lens, and [R|T^i] is the transformation matrix from the world coordinate system to this pinhole camera coordinate system, in which R is a unit matrix, i.e., R = I. For brevity, let P^i = K^i [R|T^i] denote the projection matrix from the world coordinate system to the ''micro lens image coordinate system'' of the i-th micro lens.
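As a concrete illustration of this projection model, the sketch below builds P^i for one micro lens under assumed conventions (pinhole at the micro lens center, focal distance d, R = I); the exact matrix entries used in the paper may differ.

```python
import numpy as np

def microlens_projection(d, mla_center):
    """Build the 3x4 projection matrix P^i = K^i [R | T^i] of the pinhole
    camera modeling the i-th micro lens.  Assumptions: R = I, the camera
    center is the micro lens center, and the focal distance is d (the gap
    between the MLA and the sensor)."""
    cx, cy, cz = mla_center
    K = np.diag([d, d, 1.0])                      # intrinsic matrix K^i
    T = -np.array([[cx], [cy], [cz]])             # world origin -> pinhole
    return K @ np.hstack([np.eye(3), T])          # P^i = K^i [I | T^i]

def project(P, X):
    """Map a 3D world point to micro lens image coordinates (U_s, V_s)."""
    x = P @ np.append(np.asarray(X, float), 1.0)  # homogeneous projection
    return x[:2] / x[2]
```

For example, under these conventions a point at (1, 0, 2) seen through a micro lens centered at the world origin with d = 1 projects to (0.5, 0).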
For the sake of clarity, some nouns used in the following sections are explained in advance.
• Checkerboard image: The 3D image behind the main lens, which is re-converged by the light rays emitted by the checkerboard outside the camera, i.e. the 3D image of the checkerboard in the image space of the main lens, as shown in Figure 1.
• 3D line segment: The line segment connecting two adjacent corners in the checkerboard image, which is represented by L, as shown in Figure 1.
• 3D corner point: Corners in the checkerboard image.
• 2D line feature: The 2D image of a 3D line segment in the micro lens image, which is represented by l. The corresponding equation for l is ax + by + c = 0.
• 2D corner point: The projection of the 3D corner point in the micro lens image of the raw image.
• L_H and L_V: Two categories of L, divided according to the angle between the direction of L and the X axis of the world coordinate system. L_H represents horizontal lines whose angle is less than 45 degrees, while L_V represents vertical lines whose angle is more than 45 degrees.

Unless otherwise specified, in the following sections, when the word ''image'' appears alone, it means an image on the camera sensor, and ''image point'' means the projected point of a scene point through a lens.

IV. PROPOSED ALGORITHMS
Inspired by [22], the first procedure of our algorithm is line detection. Different from [22], the goal is not to detect the 2D line features in single micro lens images, but to detect 3D line segments in the checkerboard image. Then the coordinates of the 3D corner points in the checkerboard image are obtained by calculating the intersections of pairs of 3D line segments. Finally, each 3D corner point is re-projected onto the raw image to acquire the coordinates of the 2D corner points in the micro lens images.
In this section, two steps are taken to find the optimal 3D line segments. The first step is to calculate a rough initial solution of the 3D line segment. Then, starting from the initial solution, we generate a series of new 3D line segments by changing the parameters of the line. After re-projecting every 3D line segment to the HorArea or VerArea and matching it with the raw data, the one with the maximum similarity to the raw data is selected as the final refined solution.

A. INITIAL SOLUTION
Taking the horizontal 3D line segment (L_H) as an example, the method in [22] is utilized to establish the HorArea and to detect the 2D line feature in each micro lens image of the HorArea respectively. The detection results, denoted as l_0 ∼ l_k, are taken as the input of the subsequent steps, which calculate the 3D line segment corresponding to these 2D line features.
According to [23], the set of spatial points mapped onto a line by a pinhole camera is a plane. Specifically, for a 2D line feature l_i in the i-th micro lens image, its corresponding plane is determined by

(A_i, B_i, C_i, D_i)^T = (P^i)^T (a_i, b_i, c_i)^T,    (4)

where (A_i, B_i, C_i, D_i) represents the parameters of the plane equation, i.e., A_i X + B_i Y + C_i Z + D_i = 0, and P^i is the projection matrix from the world coordinate system to the i-th micro lens image. Expanding (4) gives the plane parameters explicitly in terms of (a_i, b_i, c_i), the parameters of the line equation of l_i, and (U^i_c, V^i_c), the coordinates of the center of this micro lens image in the raw image. Suppose L represents the 3D line segment corresponding to l_0 ∼ l_k; then the plane Π_i must contain L, since l_i is the image of L in the micro lens image. For every 2D line feature in l_0 ∼ l_k, we can determine a plane in Π_0 ∼ Π_k. By calculating the intersection of these planes, we can get an initial solution of L. This can be achieved by solving the following linear equations:

[A_0 B_0 C_0 D_0; A_1 B_1 C_1 D_1; ...; A_k B_k C_k D_k] (X, Y, Z, 1)^T = 0.    (6)
The solution set of the system, in homogeneous coordinates (X, Y, Z, 1), represents the intersection of the planes Π_0 ∼ Π_k, i.e., the line L. Therefore, the dimension of the null space of the coefficient matrix of the linear equations is 2.
We perform singular value decomposition (SVD) on the coefficient matrix in (6) and obtain U Σ V^T. The two column vectors V_3 and V_4 of V corresponding to the smallest and second-smallest singular values of Σ form a basis of the solution space of (6). After normalizing the last term of V_3 and V_4 to 1, P_1 = (X_1, Y_1, Z_1, 1) and P_2 = (X_2, Y_2, Z_2, 1) are acquired, which are two points on the line L.
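The initial solution above can be sketched as follows: each detected 2D line back-projects to a plane via Π_i = (P^i)^T l_i, the planes are stacked as in (6), and the null space of the stack yields two points on L. Variable names are ours, and the combination step is a guard (our addition) against a null-space basis vector whose last coordinate is zero.

```python
import numpy as np

def line_from_features(proj_matrices, lines2d):
    """Initial 3D line from 2D line features l_0 ~ l_k.

    proj_matrices: list of 3x4 matrices P^i; lines2d: list of (a_i, b_i, c_i).
    Returns two homogeneous points (X, Y, Z, 1) on the line L."""
    # Plane of all 3D points that P^i maps onto l_i: Pi_i = (P^i)^T l_i.
    A = np.array([np.asarray(P, float).T @ np.asarray(l, float)
                  for P, l in zip(proj_matrices, lines2d)])
    # The null space of the stacked plane equations is 2-dimensional: line L.
    _, _, Vt = np.linalg.svd(A)
    v3, v4 = Vt[-2], Vt[-1]
    # Combine the basis vectors so both points have a nonzero last coordinate
    # and can be normalized to the form (X, Y, Z, 1).
    p1, p2 = v3 + v4, v3 - v4
    if abs(p1[3]) < 1e-9 or abs(p2[3]) < 1e-9:
        p1, p2 = v3 + 2.0 * v4, 2.0 * v3 + v4
    return p1 / p1[3], p2 / p2[3]
```

With two distinct micro-lens cameras observing the same 3D line, the two returned points both lie on that line and differ, which is all the refinement step needs as a starting guess.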

B. REFINED SOLUTION
The L calculated by the above method is not accurate due to the blur of the raw images. We further optimize L as follows. First, we change the parameters of L to generate a series of new 3D line segments. Then, every 3D line segment is re-projected to the sensor plane, and the similarity between the projection and the actual raw data in the HorArea or VerArea is calculated. The most similar one is regarded as the final solution of L.
First of all, we show the parameterization of a 3D line segment. Taking L_H as an example, representing it by the two points P^H_1 = (X^H_1, Y^H_1, Z^H_1) and P^H_2 = (X^H_2, Y^H_2, Z^H_2) can be regarded as a two-plane parameterization (TPP) [24], [25]: the two planes are located at X = X^H_1 and X = X^H_2 respectively, and the free parameters are the Y and Z coordinates of the two points. In this section, a micro lens is selected according to (7), where r · d_pix / d equals the tangent of half of the angle of the FOV (field of view) of a micro lens. Equation (7) means selecting a micro lens as far as possible from the orthographic projection of P^H_1 on the MLA plane, while ensuring that it contains a 2D projection of P^H_1 in its micro lens image. The motivation of this strategy is that, when a fixed offset is added to the value of Z^H_1 or Y^H_1 of P^H_1 (thus generating a new P^H_1), the farther the micro lens is from the orthographic projection of P^H_1, the more the 2D projection of the newly generated P^H_1 in this micro lens image changes.

Step 2
Suppose p̃_1 is the 2D projection of P^H_1 through the micro lens whose center is (X̃_mla, Ỹ_mla); its micro lens image coordinate is computed by the projection in (3). A series of new values of Y^H_1 and Z^H_1 is then obtained by adding offsets to them, producing new candidate points P^H_1. Then, the same operation is applied to Y^H_2 and Z^H_2, and a series of new 3D line segments is generated.
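The candidate-generation step can be sketched as below: small offsets are added to the free TPP parameters (Y, Z) of both endpoints, while X stays fixed by the two-plane parameterization. The offset grid is our assumption; the paper does not specify its size or spacing here.

```python
import numpy as np
from itertools import product

def candidate_segments(p1, p2, offsets=(-0.1, 0.0, 0.1)):
    """Generate new 3D line segments from an initial one by perturbing the
    free parameters (Y, Z) of both endpoints P^H_1 = p1 and P^H_2 = p2.
    Returns a list of (new_p1, new_p2) pairs, including the unperturbed
    original segment."""
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    cands = []
    for dy1, dz1, dy2, dz2 in product(offsets, repeat=4):
        cands.append((p1 + np.array([0.0, dy1, dz1]),
                      p2 + np.array([0.0, dy2, dz2])))
    return cands
```

With three offsets per parameter this produces 3^4 = 81 candidate segments; a finer grid trades runtime for accuracy.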

2) Projection to HorArea.
Suppose one of the newly generated 3D line segments is L*, and P*_1 = (X*_1, Y*_1, Z*_1, 1) and P*_2 = (X*_2, Y*_2, Z*_2, 1) are two points on L*. The 2D projections of P*_1 and P*_2 on the micro lens image whose center is (U^i_c, V^i_c) in the HorArea are p̃*_1 = P^i P*_1 and p̃*_2 = P^i P*_2, where P^i is the projection matrix of the i-th micro lens in the HorArea. After acquiring p̃*_1 and p̃*_2, (14) can be used to calculate the equation of the 2D line feature that is the projection of L* in this micro lens image.
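The 2D line through the two projected points can be obtained with the standard homogeneous cross product, which we assume is what (14) evaluates; a minimal sketch:

```python
import numpy as np

def line_through(p1, p2):
    """Coefficients (a, b, c) of the 2D line a*x + b*y + c = 0 through two
    image points, using the homogeneous identity l = p1 x p2."""
    h1 = np.append(np.asarray(p1, float), 1.0)
    h2 = np.append(np.asarray(p2, float), 1.0)
    return np.cross(h1, h2)
```

Both input points satisfy the returned equation by construction, since l · p1 = l · p2 = 0 for the cross product of two homogeneous points.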
Here (a*, b*, c*) are the coefficients of the equation of this 2D line feature in this micro lens image.

3) Calculating similarity.
We use the normalized cross-correlation (NCC) to calculate the similarity between the 2D line feature l* in a micro lens image and the actual raw data. The sum of the NCCs of the 2D line features of L* over all the micro lens images of the HorArea is taken as the similarity between the 3D line segment L* and the actual raw data, denoted NCC(L*).
The L* with the largest NCC(L*), i.e., the one most similar to the actual raw data, is chosen as the final refined solution of the 3D line segment.
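A sketch of this scoring step, assuming the NCC is computed between a rendered patch of the 2D line feature and the corresponding raw patch and then summed over the micro lens images of the area (function and variable names are ours):

```python
import numpy as np

def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized patches."""
    a = np.asarray(patch_a, float).ravel()
    b = np.asarray(patch_b, float).ravel()
    a = a - a.mean()                       # zero-mean both patches
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def segment_score(rendered_patches, raw_patches):
    """NCC(L*): sum of per-micro-lens NCCs between the rendered 2D line
    features of a candidate L* and the actual raw data."""
    return sum(ncc(r, w) for r, w in zip(rendered_patches, raw_patches))
```

Because the NCC is invariant to affine intensity changes, the score is robust to the vignetting-like brightness differences between micro lens images.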
After the horizontal and vertical 3D line segments L_H and L_V are both refined, their intersection, denoted P_corner, can be calculated by the following steps.
Since P_corner is on L_H, we have

P_corner = P^H_1 + λ_1 (P^H_2 − P^H_1),    (16)

where λ_1 is an unknown constant. Similarly, since P_corner is on L_V, we have

P_corner = P^V_1 + λ_2 (P^V_2 − P^V_1),    (17)

where λ_2 is an unknown constant. Combining these two equations, we obtain

P^H_1 + λ_1 (P^H_2 − P^H_1) = P^V_1 + λ_2 (P^V_2 − P^V_1),    (18)

and λ_1 and λ_2 can be calculated by solving (18). Then the coordinates of P_corner can be determined by (16) or (17).
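These steps can be sketched as follows. Since two estimated 3D lines are rarely exactly coplanar, (18) is solved in the least-squares sense (three equations, two unknowns), and the midpoint of the two nearest points is returned; the midpoint choice is our assumption, not stated in the text.

```python
import numpy as np

def corner_from_lines(p1h, p2h, p1v, p2v):
    """Solve (18) for lambda_1, lambda_2 in the least-squares sense and
    evaluate (16)/(17); inputs are the 3D endpoints of L_H and L_V."""
    p1h, p2h = np.asarray(p1h, float), np.asarray(p2h, float)
    p1v, p2v = np.asarray(p1v, float), np.asarray(p2v, float)
    # (18): p1h + l1*(p2h - p1h) = p1v + l2*(p2v - p1v)
    A = np.column_stack([p2h - p1h, -(p2v - p1v)])   # 3x2 coefficient matrix
    b = p1v - p1h
    (l1, l2), *_ = np.linalg.lstsq(A, b, rcond=None)
    c_h = p1h + l1 * (p2h - p1h)                     # (16)
    c_v = p1v + l2 * (p2v - p1v)                     # (17)
    return (c_h + c_v) / 2.0                         # midpoint if lines are skew
```

For exactly intersecting lines the two evaluations coincide; for skew lines the result is the point minimizing the distance to both.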
The coordinates of the 2D projection of P_corner in a micro lens image whose center is (U^i_c, V^i_c) can be calculated by (3). We traverse all micro lens images and keep only those in which the distance between the projection point and the micro lens image center is less than the radius of the micro lens image; these projections constitute the final corner detection result.
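The final filtering step might look like the following sketch (the mapping from micro lens index to projection is an assumed data structure):

```python
import math

def visible_projections(projections, radius):
    """Keep only the micro lens images where the 2D projection of P_corner
    lies inside the micro lens image, i.e. its distance to the image center
    (the origin of the micro lens image coordinates) is below the radius."""
    return {i: (u, v) for i, (u, v) in projections.items()
            if math.hypot(u, v) < radius}
```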

V. EXPERIMENTAL RESULTS
To evaluate the effectiveness of the proposed algorithm, we compare its results on four different datasets. The first two are dataset A and dataset E from [17]. They were captured by a Lytro camera, and their checkerboard sizes are a 19 × 19 grid of 3.61 mm cells and an 8 × 6 grid of 35.1 × 35.0 mm cells, respectively. These two datasets are termed Dataset A and Dataset B in this paper. The third comes from [22]; it was captured by a Lytro Illum camera with a checkerboard of a 9 × 6 grid of 26.25 mm cells, and is denoted as Dataset C. The last dataset was captured by ourselves using a Lytro Illum camera, with a checkerboard of a 7 × 11 grid of 29.92 mm cells. After photographing, we used the MATLAB toolbox LFToolbox V0.4 by Dansereau et al. [17] to generate raw images, and then used white images to remove vignetting. This dataset is denoted as Dataset D.
Images in different datasets bring different degrees of difficulty for feature detection. Figure 3(a) and Figure 3(b) show two images from Dataset A and Dataset D respectively. It can be seen that, compared with the image from Dataset A, the image from Dataset D has a longer span for a single 3D line segment, and there are more 2D line features in the micro lens images of the HorArea or VerArea. It can also be noticed that the image from Dataset A is blurrier and has micro lens images of lower resolution than Dataset D, which brings a greater challenge to the detection of 3D line segments.
To visually demonstrate the improvement of the refined solution over the initial solution, before comparing the final corner detection results, the 2D line features and 3D line segments calculated by the initial solution and the refined solution on two datasets (Dataset A and Dataset D) are compared in Figure 5. Figure 5(a) and Figure 5(c) show a chunk of a raw image from Dataset A, which is blurrier and of lower resolution; it is termed chunk 1. Figure 5(b) and Figure 5(d) show a chunk of a raw image from Dataset D, which is clearer and of higher resolution; it is termed chunk 2.
From these two figures, it can be seen that the overall distribution of the 2D line features of the initial solution is somewhat irregular, even a little messy in Dataset A (see Figure 5(a)), since they are detected independently in each micro lens image. In contrast, the overall distribution of the 2D line features of the refined solution is regular, since they are projections of the same 3D line segment in different micro lens images. Moreover, although the 2D line features of the initial solution and the refined solution both match the raw data very well, the 3D line segments calculated from them differ greatly, as shown in Figure 6(a) and Figure 6(c). This is because the initial solution locally matches each micro lens image best, while the refined solution globally matches the whole chunk best. We then compare the final corner detection results of the proposed algorithm (both the initial solution and the refined solution) with other competitors. To our knowledge, for conventional light field cameras there are no other algorithms that directly detect corners in the raw image, so we choose two classic corner detection algorithms widely used for images of traditional cameras, ''Harris'' and ''MinEigen'', as the competitors. Methods based on sub-aperture images are not included in this section, since they cannot directly give the locations of the corners in the raw image.
In the four datasets, 291 3D corners in total are used to evaluate each algorithm. Among them, 217 3D corners are from Dataset A, corresponding to 3105 2D corners in 3105 micro lens images; 24 3D corners are from Dataset B, corresponding to 313 2D corners in 313 micro lens images; 11 3D corners are from Dataset C, corresponding to 475 2D corners in 475 micro lens images; and 39 3D corners are from Dataset D, corresponding to 1436 2D corners in 1436 micro lens images. Since the true corner locations in the raw images are not known for actual data, we invited eight people to mark the corner locations manually, and the average of their results is taken as the ground truth. So, with the four methods mentioned above (Harris, MinEigen, the initial solution of the proposed algorithm, and the refined solution of the proposed algorithm) and the ground truth, there are five detection results in total in the experiment.
For the Harris and MinEigen algorithms, directly applying them to the raw images of a light field camera brings some additional trouble. The first problem is ''false corners'': areas between two adjacent micro lens images are often falsely identified as corners, as shown in Figure 4. Another problem is ''missed detections'': corners located near the edge of a micro lens image, or whose intensity gradient is not large enough, often fail to be detected, as shown in Figure 4. We count the total number of detected 2D corners for each algorithm, termed ''detected corners'' in Table 1, and we also calculate the ratio of the ''detected corners'' of each algorithm to the number of ground truth corners, termed the ''detection ratio'' in Table 1.
Finally, to measure the quantitative ''detection error'' of each algorithm, the deviation of every detected corner from its ground truth is calculated. The mean, maximum, and minimum deviations over all corners are shown in Table 2. Besides, in order to make a fair comparison of the detection error of each algorithm, the ''false corners'' and ''missed detections'' are eliminated manually for the Harris and MinEigen algorithms.
From Table 2, it can be noticed that, even after eliminating the ''false corners'' and ''missed detections'', the detection errors of the Harris and MinEigen algorithms are still large. More specifically, for Dataset D, which is clearer, the differences between Harris or MinEigen and the proposed algorithm are not so large. For Dataset A and Dataset B, which are blurrier, the detection errors of Harris and MinEigen become larger, while the proposed algorithm still achieves good performance, which indicates its robustness. And owing to the further optimization, the refined solution achieves an even better result than the initial solution.

VI. CONCLUSION AND FUTURE WORK
In this paper, 3D line segments of the checkerboard image in the image space of the main lens are detected by the initial solution first and then further optimized. The coordinates of 2D corners on the raw image are obtained by calculating the intersections of 3D line segments and re-projecting them to the raw data. Experimental results demonstrate that the corner detection algorithm is accurate and reliable in actual datasets with blurred and low resolution micro lens images. Our future work will focus on how to extend the corner detection algorithm from checkerboard images to natural images.