Blur-Countering Keypoint Detection via Eigenvalue Asymmetry

Well-known corner- or local-extremum-based detectors such as FAST and DoG have achieved noticeable success. However, detecting keypoints in the presence of blur remains an unresolved issue. Various kinds of blur (e.g., motion blur, out-of-focus blur, and space-variant blur) remarkably increase the challenges of keypoint detection, and as a result those methods have limited performance. To settle this issue, we propose a blur-countering method for detecting valid keypoints under various types and degrees of blur. Specifically, we first present a distance metric for derivative distributions that preserves the distinctiveness of patch pairs well under blur. We then model asymmetry by utilizing the difference of squared eigenvalues based on this distance metric. To make the detector scale-robust, we also extend it to scale space. The proposed detector is efficient, as its main computational cost is the square of derivatives at each pixel. Extensive visual and quantitative results show that our method outperforms current approaches under different types and degrees of blur. Without any parallelization, our implementation\footnote{We will make our code publicly available upon acceptance.} achieves real-time performance for low-resolution images (e.g., $320\times240$ pixels).


Introduction
Keypoint detection, a fundamental technique in computer vision, has gained extensive attention in recent decades. It plays an important role in various applications such as image retrieval [1,2], image stitching [3,4], and object recognition [5,6]. It typically requires finding pixels or blobs that are supposed to be invariant to photometric or geometric variations. Most existing methods attempt to improve robustness against photometric variations from two different directions: methods utilizing sharp features, and data-driven approaches. Nevertheless, both are limited in the presence of image blur. Regarding the former class, although corner-based detectors such as Harris [7] can be robust to illumination changes, the intersection of two edges may be smoothed out by image blur and lose its distinctiveness. A more popular way is to find local extrema over a scale space generated by Gaussians of different sizes [8,9,10]. However, in the case of motion blur, where illuminance changes are integrated along a specific direction over time, the positions of local extrema are not guaranteed to stay the same, because the directional averaging may change the local distribution of intensities. For the latter category, one can cast keypoint detection as a classification problem and collect training sets that indirectly determine the type of feature detector [11,12,13,14]. A large number of patches with good keypoints can be collected and annotated for unblurred images. However, a definition of "good keypoints" involving both unblurred and blurred patches would become too ambiguous to supervise.
Image blur is pervasive in videos and images, mainly because of fast motion during the exposure time and imperfect auto-focus systems. It destroys sharp features such as edges, corners, and local extrema. As shown in Fig. 1(a) and (b), under the influence of rotational blur the intensities change drastically and the conventional method fails. Detecting corresponding keypoints under image blur, which has been sparsely treated so far, suffers from the following challenges: (1) Runtime performance. It is of critical importance because keypoint detection is usually the first step in many real-time applications such as SLAM [15]. (2) Image degradation. Besides losing the sharp features mentioned above, strong motion blur can introduce additional features as a result of "stretching" pixels along the motion direction. For instance, the rotational blur in Fig. 1

Related Work
We only review previous techniques that are most related to our work.
Interested readers are referred to the survey papers [18,19] for a more comprehensive review of state-of-the-art keypoint detectors. Unlike previous symmetry-based methods, we measure the degree of asymmetry by averaging distances between derivative distributions radially, which avoids estimating a hypothetical axis of asymmetry.
Low self-similarity [24,25,26] is a classical idea which exploits the fact that the changes of intensity should be high in all directions around a highly distinctive corner. Following this, the basic scheme is to design a corner response function which evaluates the "cornerness" by calculating the changes of intensity over pixels/patches. It has a close relationship with the idea of symmetry/asymmetry, as a region with a high degree of asymmetry generally exhibits low self-similarity. It should be noted that self-similarity does not take spatial information into account, whereas the measurement of symmetric/asymmetric structure does.
Eigenvalues have been used for modeling keypoint detectors. Harris et al. [7] treat an area as a corner when both eigenvalues of the covariance matrix are large and positive. Shi et al. [27] claim that an area can be regarded as a good feature candidate when the smaller eigenvalue is sufficiently large to meet the noise criterion. In this work, we employ eigenvalues to measure the distance between derivative distributions, which in essence estimates the asymmetry and keypoint likelihood.

Proposed method
In this section, we design a keypoint metric that maintains the order of keypoint scores measured throughout the image consistently between blurred and unblurred images. In the ideal case, the $TopN$ keypoints from an unblurred image and from its blurred version should be the same and correspond to each other.

Distance between Derivative Distributions
Instead of estimating the similarity between two patches with raw intensities or gradient histograms, we propose to measure the similarity between their derivative distributions. A derivative distribution refers to the distribution of image derivatives, with the horizontal axis representing the horizontal derivative $I_x$ and the vertical axis representing the vertical derivative $I_y$, as shown in Fig. 2.
This can be treated as a problem of distance measurement between two point sets $p$ and $q$, where each element of the two sets consists of the derivatives $I_x$ and $I_y$. Each distance calculated from two point sets contributes to the final asymmetry score, which will be introduced later. Various methods offer such a distance measurement, e.g., BBS [28] and EMD [29]. However, they all require computing point-to-point distances, which can be computationally expensive in our task. In the case of BBS, assuming $|p| = |q| = s$ and $d$ tests, the computational complexity for scoring asymmetry over an image of size $|I| = L$ is $O(dLs^2)$. Since the eigenvalues measure the variances along the eigenvectors of a point set, the shapes of $p$ and $q$ can be modeled by two ellipses with the eigenvalues as the lengths of the semi-axes. We then define the distance between $p$ and $q$ as
$$d(p, q) = \left| \left( \lambda^p_{max} + \lambda^p_{min} \right) - \left( \lambda^q_{max} + \lambda^q_{min} \right) \right|, \eqno(1)$$
where $\lambda_{max}$ represents the length of the semi-major axis and $\lambda_{min}$ the length of the semi-minor axis, calculated from the covariance matrices of $p$ and $q$, respectively.
To investigate the usefulness of this distance metric, we first analyze the sum of the semi-major and semi-minor axes. The perimeter of an ellipse parameterized by $\lambda_{max}$ and $\lambda_{min}$ can be represented by
$$P = \pi \left( \lambda_{max} + \lambda_{min} \right) \sum_{n=0}^{\infty} \left( C^{1/2}_{n} \right)^2 h^n, \quad h = \left( \frac{\lambda_{max} - \lambda_{min}}{\lambda_{max} + \lambda_{min}} \right)^2, \eqno(2)$$
where $C$ is the combination function. Obviously, $\lambda_{max} + \lambda_{min}$ dominates the perimeter, which indicates that Eq. (1) calculates the distance mostly based on the difference of perimeters and ignores the uniformity of the data distribution. The distance metric is translation- and rotation-invariant: applying a translation or rotation to the data points does not change the distance. This characteristic helps to tolerate, to some extent, changes of the distribution's shape caused by noise: with the perimeter fixed, the shape of the ellipse can vary within a tolerance range. The drawback is also obvious: Eq. (1) cannot distinguish some shapes, such as corners from edges, since only the overall spread of the distribution is evaluated. However, since the proposed detector (Section 3.2) does not depend on specific geometric shapes like corners or edges, this is not a practical issue in this work. Fig. 3(a) shows patch pairs whose visual similarity decreases due to the edges introduced by the white strips; the patches are also rotated by 90 degrees clockwise, which rotates the derivative distributions while leaving the eigenvalues unchanged. A proper distance measurement is supposed to quantify this visual difference while remaining invariant to rotation. Fig. 3(b) shows the distances scored by BBS [28] as the visual similarity between the patches in Fig. 3(a) decreases, and Fig. 3(c) shows the distances between the patches' derivative distributions calculated with $d(p, q)$.
In this example, BBS fails to evaluate the visual similarity properly for the rotated patches. On the other hand, in Fig. 3(c), although the overall scale of the distances decreases due to motion blur, the magnitude relationship between any two patch sets is preserved: $d(p, q)$ is capable of measuring the visual difference properly in both blurred and unblurred situations.
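As a side check, the dominance of $\lambda_{max} + \lambda_{min}$ in the ellipse-perimeter series can be verified numerically. The snippet below is an illustrative sketch, not part of the detector; `ellipse_perimeter` evaluates the Gauss–Kummer series truncated at a fixed number of terms.

```python
import math

def ellipse_perimeter(a, b, terms=20):
    """Gauss-Kummer series for the perimeter of an ellipse with semi-axes
    a and b: P = pi*(a+b) * sum_n binom(1/2, n)^2 * h^n,
    where h = ((a-b)/(a+b))^2."""
    h = ((a - b) / (a + b)) ** 2
    total, coef = 0.0, 1.0            # coef holds binom(1/2, n)
    for n in range(terms):
        if n > 0:
            coef *= (0.5 - (n - 1)) / n
        total += coef ** 2 * h ** n
    return math.pi * (a + b) * total

# For a circle the series collapses to the exact circumference, and even
# for a 2:1 ellipse, pi*(a+b) captures most of the perimeter.
print(ellipse_perimeter(1.0, 1.0))                    # 2*pi
print(ellipse_perimeter(2.0, 1.0) / (math.pi * 3.0))  # close to 1
```

For a 2:1 ellipse the correction factor is under 3%, supporting the claim that Eq. (1) is essentially a difference of perimeters.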
With regard to keypoint detection, calculating the eigenvalues of covariance matrices over numerous image patches can be highly time-consuming. To mitigate this issue, the covariance matrix $\Sigma$ can be simplified by further assuming that the means of $I_x$ and $I_y$ over $p$ and $q$ are zero:
$$\Sigma = \begin{bmatrix} \overline{I_x^2} & \overline{I_x I_y} \\ \overline{I_x I_y} & \overline{I_y^2} \end{bmatrix}. \eqno(3)$$
As $\lambda_{max} + \lambda_{min} = \mathrm{tr}(\Sigma)$, $d(p, q)$ can be further rewritten as
$$d(p, q) = \left| \left( \overline{I_x^2}^{\,p} + \overline{I_y^2}^{\,p} \right) - \left( \overline{I_x^2}^{\,q} + \overline{I_y^2}^{\,q} \right) \right|, \eqno(4)$$
where $\overline{I_x^2}^{\,p}$ represents the expectation of $I_x^2$ with respect to patch $p$, and the other variables are defined in a similar manner. Note that in Eq. (4) only derivatives need to be calculated to estimate the distance between two patches' derivative distributions, so it can be computed efficiently. Greater values of $d(p, q)$ contribute more to the asymmetry. Based on the above observations, we explain how to use $d(p, q)$ to model asymmetry in Section 3.2.
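Under the zero-mean assumption, Eq. (4) reduces to comparing mean squared derivatives. The sketch below is a minimal illustration; the function and argument names are our own choices, not the paper's code:

```python
import numpy as np

def patch_distance(p_ix, p_iy, q_ix, q_iy):
    """Eq. (4)-style distance: with zero-mean derivatives, the sum of the
    covariance eigenvalues equals the trace, i.e. mean(Ix^2) + mean(Iy^2),
    so only squared derivatives are needed."""
    tp = np.mean(p_ix ** 2) + np.mean(p_iy ** 2)   # trace for patch p
    tq = np.mean(q_ix ** 2) + np.mean(q_iy ** 2)   # trace for patch q
    return abs(tp - tq)
```

Note that no eigendecomposition is performed; this is exactly why the detector's cost reduces to squaring derivatives at each pixel.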

Eigenvalue Asymmetry (EAS)
The degradation caused by image blur decreases the intensity variations of the original image, thus introducing more uniform regions, which naturally have a high degree of symmetry. Taking this into consideration, we measure asymmetry instead of symmetry. We claim that regions with larger $d(p, q)$ in different directions maintain better distinctiveness than other regions. A radius parameter $r$ is introduced to test the asymmetry at pixel $I(i, j)$ radially. Specifically,
$$\left( P^{i,j,r}, Q^{i,j,r} \right) = R(I, i, j, r), \eqno(5)$$
where $R$ defines a function that generates two patch sets of the same size $N$ with respect to $r$. $P^{i,j,r}_n \in P^{i,j,r}$ and $Q^{i,j,r}_n \in Q^{i,j,r}$ ($n \in [1, N]$) are spatially symmetric across the coordinates $(i, j)$, representing image patches from the two sets, respectively. By averaging the distance defined in Eq. (4) over the pairs of patches $(P^{i,j,r}_n, Q^{i,j,r}_n)$, the metric for evaluating the asymmetry can be defined as follows:
$$EAS(i, j, r) = \frac{1}{N} \sum_{n=1}^{N} d\left( P^{i,j,r}_n, Q^{i,j,r}_n \right). \eqno(6)$$
Theoretically, $N$ equals $\pi r$, and a higher EAS score means larger asymmetry.
By thresholding EAS, we can detect the top-scored pixels as keypoints. Since we estimate asymmetry radially, EAS is also rotation-invariant.
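A minimal NumPy sketch of Eq. (6) is given below. The sampling scheme (`n_pairs` patch centers on a half circle, each mirrored through $(i, j)$) and the patch half-size `h` are illustrative choices, not the exact scheme of the implementation:

```python
import numpy as np

def patch_trace(Ix, Iy, ci, cj, h):
    """Mean of Ix^2 + Iy^2 over the (2h+1)x(2h+1) patch centered at (ci, cj)."""
    sl = (slice(ci - h, ci + h + 1), slice(cj - h, cj + h + 1))
    return np.mean(Ix[sl] ** 2) + np.mean(Iy[sl] ** 2)

def eas_score(Ix, Iy, i, j, r, h=1, n_pairs=8):
    """EAS at pixel (i, j): average the Eq. (4)-style distance over pairs of
    patches that are point-symmetric across (i, j) on a circle of radius r."""
    score = 0.0
    for k in range(n_pairs):
        t = np.pi * k / n_pairs          # half circle; mirroring completes the pair
        di = int(round(r * np.sin(t)))
        dj = int(round(r * np.cos(t)))
        tp = patch_trace(Ix, Iy, i + di, j + dj, h)
        tq = patch_trace(Ix, Iy, i - di, j - dj, h)
        score += abs(tp - tq)
    return score / n_pairs
```

A uniform region scores exactly zero (every mirrored pair has equal trace), while a one-sided edge near $(i, j)$ breaks the symmetry and raises the score.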

Scale Space
To deal with heavy blur as well as image scaling, the proposed detector must be able to evaluate each pixel in scale space. We use the kernel described in [30] to generate the octaves of a Gaussian image pyramid with a scaling factor of 0.5 (i.e., the image size is halved from each octave to the next). We use the concept of a blob to describe a keypoint, with a changeable radius $r$ (equivalent to the radius in Eq. (6)); the resulting octaves are sufficient to make keypoints distinctive. In Fig. 4, we show each octave's EAS score map. Basically, EAS tends to give higher scores to pixels near edges, where higher gradient variation can usually be observed. It is worth pointing out that as the scale level increases, the maps become similar to each other from the third octave onward, despite encountering very different degrees of blur.
This demonstrates that our EAS is robust to blur.
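The octave construction above can be sketched as follows. We use `scipy.ndimage.gaussian_filter` with an illustrative `sigma` rather than the specific kernel of [30]:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_octaves(img, n_octaves=4, sigma=1.0):
    """Gaussian image pyramid with a scaling factor of 0.5 per octave:
    each level is blurred and then downsampled by 2 in each dimension."""
    octaves = [np.asarray(img, dtype=float)]
    for _ in range(n_octaves - 1):
        blurred = gaussian_filter(octaves[-1], sigma)
        octaves.append(blurred[::2, ::2])    # halve height and width
    return octaves
```

The EAS score map is then computed per octave, and keypoint radii are mapped back to the original image resolution.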

Implementation Details
In this section, we introduce the implementation details in order to improve reproducibility.
EAS with partial distributions. Heavy noise can enlarge the shape of the derivative distribution. To alleviate this issue, our implementation computes EAS over partial derivative distributions only.
Non-maximum suppression. After calculating EAS in each octave, we represent each keypoint with a blob of radius $r$ centered at $(i, j)$ in the original image. We then select a subset of keypoints that are locally strong in each octave (e.g., within a 3×3 neighborhood), allowing keypoints with the same $(i, j)$ but different $r$ to appear at the same time.
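The suppression step can be sketched as a strict 3×3 local-maximum test. This is a simplified illustration; the actual implementation additionally keeps coincident keypoints of different radii across octaves:

```python
import numpy as np

def nms_3x3(score):
    """Keep only pixels that are strict maxima of their 3x3 neighborhood.
    Border pixels and tied maxima are discarded for simplicity."""
    H, W = score.shape
    keep = np.zeros((H, W), dtype=bool)
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            nb = score[i - 1:i + 2, j - 1:j + 2]
            center = score[i, j]
            if center == nb.max() and (nb == center).sum() == 1:
                keep[i, j] = True
    return keep
```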

Experimental Results
In this section, we compare our proposed method with state-of-the-art detectors quantitatively and qualitatively. We validate our method in a variety of scenarios, including space-invariant blur, space-variant blur, complex blur, and affine transformations.

Experimental Setting
It is difficult to obtain ground truth for blurred real data. To make fair and accurate comparisons focused on keypoint detection, we use images from the widely known dataset [31] and an additional 512 × 512 "Lena" image to generate synthetic test data.
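For reference, linear motion blur of length $l$ and angle $\theta$ can be synthesized by convolving the image with a line-shaped PSF. The rasterization below is a simple sketch of this idea (not the blur generator of [35]):

```python
import numpy as np

def linear_motion_psf(length, theta):
    """Normalized PSF of linear motion blur: a line segment of the given
    length and angle (radians), rasterized onto an odd-sized square grid."""
    size = int(length) | 1                    # force an odd kernel size
    psf = np.zeros((size, size))
    c = size // 2
    for t in np.linspace(-(length - 1) / 2, (length - 1) / 2, 4 * size):
        i = int(round(c + t * np.sin(theta)))
        j = int(round(c + t * np.cos(theta)))
        psf[i, j] = 1.0
    return psf / psf.sum()
```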
As suggested by previous works [32,33], the repeatability rate is used for evaluation. Two keypoints from two images are viewed as "corresponding points" only if they are located at the same relative position after accounting for the geometric transformation. $N_c$ denotes the number of corresponding points. The performance of a detector is usually sensitive to threshold tuning, which may make comparisons unfair. In our experiments, the $TopN$ keypoints (with the highest confidence) in each image are therefore selected for evaluation after ranking all candidate keypoints. We define the repeatability as $N_c / TopN$. From Fig. 6 to Fig. 11, $TopN$ is fixed to 500.
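This evaluation protocol can be sketched as follows; `transform` maps image-A coordinates into image B, and the 2-pixel tolerance `tol` is our own illustrative choice:

```python
import numpy as np

def repeatability(kps_a, kps_b, transform, top_n, tol=2.0):
    """Repeatability = N_c / TopN. kps_a and kps_b are (N, 2) arrays of
    keypoints ordered by confidence; a pair corresponds if, after warping,
    the keypoints lie within tol pixels of each other (greedy matching)."""
    a = np.asarray(kps_a, dtype=float)[:top_n]
    b = np.asarray(kps_b, dtype=float)[:top_n]
    warped = np.array([transform(p) for p in a])
    used = np.zeros(len(b), dtype=bool)
    n_c = 0
    for p in warped:
        dist = np.linalg.norm(b - p, axis=1)
        dist[used] = np.inf                   # each keypoint matched at most once
        k = int(np.argmin(dist))
        if dist[k] <= tol:
            used[k] = True
            n_c += 1
    return n_c / top_n
```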
State-of-the-art detectors are selected for comparison, including Fast-Hessian from SURF [8], DoG from SIFT [10], the Harris corner detector [7], FAST [13], the minimum-eigenvalue detector [27], and BRISK [34], all with the same number of octaves. The metric score threshold of each method is relaxed to generate sufficient candidates, and the $TopN$ of them are selected for evaluation. Affine region detectors are not compared, as we aim to analyze performance under image blur rather than geometric transformations. In Section 4.2, Fast-Hessian is chosen for comparison because of its superiority stated in [8].

Fig. 6 to Fig. 11 show the qualitative results. (1) In Fig. 9, a smaller $N_c$ is detected by EAS. One possible reason is that the padded image produces additional boundaries on both sides that have strong EAS responses. Besides, EAS may be inappropriate for the shearing operation, which compresses pixels and changes the spatial relationships between them.
(2) In Fig. 7, although EAS detects more keypoints in blurred regions compared with Fast-Hessian, the amounts are unbalanced between blurred and unblurred regions. We suspect that the EAS scores calculated from unblurred regions are usually higher because of steeper gradients.

To further demonstrate the robustness of our method, we show results under relatively more realistic blur. Specifically, we adopt the method described in [35] to generate general motion blur with random motion trajectories, PSFs, and sensor noise depending on the exposure time. From Fig. 12 to Fig. 17, (a) shows the random motion trajectory, and (b) shows four types of generated PSFs.

Fig. 18 and Fig. 19 show the quantitative results with respect to Gaussian blur and linear motion blur, with the x axis representing $TopN$ and the y axis representing $N_c$. Specifically, the results over "3 images × 2 types of blur × 5 degrees of blur × 5 $TopN$ × 7 methods" are studied. The overall trend is that $N_c$ increases with $TopN$, unless the degree of blur exceeds the ability of the method. Also, for all methods, $N_c$ decreases as the degree of blur increases. EAS is observed to be robust against increasing blur. Another interesting finding is that for slightly blurred images, $N_c$ of EAS increases linearly with $TopN$. In general, with heavy blur the curves are likely to converge as $TopN$ increases.
In Fig. 18, the parameter $\sigma$ of the Gaussian varies from 1 to 9. For all images, conventional methods can only detect a few valid keypoints when $\sigma$ is greater than 5. Although the curve of EAS converges to a certain limit and moves to the lower right as $\sigma$ increases, our EAS achieves 37.2% average repeatability, which is much higher than Fast-Hessian (10.4%). In Fig. 19, the parameter $l$ of linear motion blur varies from 5 to 25, and $\theta = 0, \pi/4, \pi/2$ is applied to "Graffiti", "Lena", and "Boat", respectively. Overall, EAS obtains 42.3% average repeatability, which soundly outperforms Fast-Hessian (11.6%). The processing times shown in Table 2 are measured with a pure MATLAB implementation (without parallelization or optimization) on a desktop computer with an i7-7700 CPU@3.60GHz and 32.0GB RAM.

Conclusions
In this paper, we presented a metric for measuring the distance between two derivative distributions, based on the difference of the sums of eigenvalues. Furthermore, EAS, which measures asymmetry, is proposed for keypoint detection. It is robust in detecting corresponding points under various types and degrees of blur. Extensive experiments demonstrate that our EAS outperforms state-of-the-art methods in the presence of image blur.
Despite its robustness, our method still has a few limitations. It is likely to detect more keypoints in unblurred regions when both unblurred and blurred regions exist. One potential way to solve this problem is to use supervised signals (e.g., classification of blurred/unblurred regions). Intensity variations around the detected corresponding points can be small, which makes description by gradient-based local feature descriptors difficult. As future work, we would like to develop effective scale search methods for EAS and design blur-countering feature descriptors for real-world applications.

Figure 19: Quantitative results on images with motion blur. Blur becomes more severe from left to right. The first to third rows are respectively Graffiti, Lena, and Boat.