HSI-MSER: Hyperspectral Image Registration Algorithm Based on MSER and SIFT

Image alignment is an essential task in many applications of hyperspectral remote sensing images. Before any processing, the images must be registered. Maximally stable extremal regions (MSER) is a feature detection algorithm that extracts regions by thresholding the image at different grey levels. These extremal regions are invariant to image transformations, making them ideal for registration. The scale-invariant feature transform (SIFT) is a well-known keypoint detector and descriptor based on the construction of a Gaussian scale-space. This article presents a hyperspectral remote sensing image registration method based on MSER for feature detection and SIFT for feature description. It efficiently exploits the information contained in the different spectral bands to improve the image alignment. The experimental results over nine hyperspectral images show that the proposed method achieves a higher number of correct registration cases using fewer computational resources than other hyperspectral registration methods. Results are evaluated in terms of accuracy of the registration and also in terms of execution time.


I. INTRODUCTION
Currently, hyperspectral images (HSIs) from remote sensing are widely available thanks to the increased availability of sensors. This allows us to obtain images of the same region of the Earth taken from different viewpoints at different times. The series of remote-sensing images are used in applications where it is essential to compare, study, or find differences between images. Automatic change detection [1], environmental monitoring [2], or super-resolution image creation [3], among others, are applications in which registration is a fundamental prior task [4]. The images must be previously aligned in order to work with them afterwards. Image registration algorithms can be classified into area-based and feature-based methods [4]. Methods in the first group work directly with image intensity, e.g., Fourier transform [5] and mutual information [6], while those in the second group, feature-based methods, look for information at a higher level, e.g., at the level of regions, lines, or points [7]-[10]. This property makes feature-based methods more suitable for images with illumination changes, which is the case of remote sensing HSIs of the Earth's surface. For these images, the atmospheric conditions usually vary from one capture to another.
Generally, feature-based methods consist of the following four stages: feature detection, feature description, feature matching, and image transformation [11]. These methods rely on extracting the same features in the images to be registered. Knowing a number of corresponding features, an image transformation that aligns one image with respect to the other can be calculated.
Maximally stable extremal regions (MSER) [7] is a feature-based method for region detection in images that can be used for extracting the features needed for a later registration process. This method extracts regions, called extremal regions (ERs), by thresholding the image at different grey levels and according to a stability criterion. If MSER is applied to a pair of images, the extracted and matched regions of both images can be used to compute an image transformation to register them. MSER is resilient to changes of scale, rotation, translation, and illumination conditions. Other well-known feature detector methods in the literature are features from accelerated segment test (FAST) [8] and speeded up robust features (SURF) [9]. Both build a scale-space where scale-invariant points are detected. But the most popular feature detector and descriptor algorithm is the scale-invariant feature transform (SIFT) [10]. SIFT extracts keypoints from a multiresolution pyramid of the images created by performing Gaussian convolutions and interpolations. Its descriptor stands out for being highly distinctive and invariant to illumination and distortion changes, making it a widely used method in the literature [12], [13].
The literature indicates that MSER and SIFT are two of the best region detector [14] and descriptor algorithms [15], respectively. They are used not only for image registration [16] but also in many other applications such as object recognition [17], [18], image retrieval [19], [20], and robot localization [21], [22], among others. MSER is also often used in the literature with the local affine frames (LAF) descriptor [18], [23].
SIFT and MSER-based methods have been previously proposed for image registration [24]-[27]. However, the exploitation of the whole spectral information of HSIs in order to improve the accuracy of the registration process has not been explored and analysed in depth. To the best of the authors' knowledge, the publications that used SIFT and MSER to register HSIs, as will be detailed in the following, separately reduce each of the images to a single band. New algorithms that deal with the spectral information in HSIs efficiently, from the point of view of both registration accuracy and computational cost, need to be designed [28], [29]. The use of spectral information also allows registering images that cannot be registered considering only one band. A band selection method is required for selecting those bands that are most relevant for registration, i.e., it should avoid bands with redundant or low-quality information. Another feature of previous works is that they validate the algorithms considering only a few scale factors and rotation angles. Unlike them, in this article, an exhaustive evaluation of the registration algorithms is carried out considering different hyperspectral datasets (urban, rural, crops, and nature scenes) and a wide variety of registration parameters (scale factors and rotation angles).
One example of a method that treats the multispectral image as a one-band image, and that is evaluated on only one pair of images, is proposed in [24]. The authors propose a method to register a pair of multispectral and visible spectrum images using the keypoints detected and described by SIFT and the regions independently detected and described by MSER and LAF, respectively. Guo et al. [25] presented a multispectral remote sensing algorithm based on MSER for region detection. The regions are described twice, using the SIFT descriptor and using a shape descriptor based on the Fourier transform. The method is evaluated on a SPOT2 multispectral and a SPOT2 pan image with different resolution. Moreover, the rotational invariance is not extensively tested, as the only angle difference considered is 30°. No exploitation of spectral information is performed. Zhang et al. [26] proposed a multisensor registration method that combines MSER and SIFT, and takes into account the number of matches used and their distribution in order to improve the final transformation. For the experimental analysis, they register two bands of Landsat multispectral images, and a panchromatic image with respect to a RADARSAT SAR image. The first pair has translation changes, and the second has a different scale factor and rotation angle. Liu et al. [27] presented edge-enhanced MSER, a method for multisensor image registration also based on MSER and SIFT. Before detecting and describing the regions, an edge enhancement is applied to the image pairs to obtain more stable matches. The method is evaluated on a pair of images with a scale factor of 0.7× and a rotation angle of 35°. As in previous works, no use of the information from the different bands is detailed.
In this work, a registration method for HSIs based on MSER for region detection and SIFT for region description is presented. It exploits the spectral information available in the HSIs by performing feature detection and description in several preselected bands of the images to be registered and by incorporating spectral information into the descriptor. The algorithm was designed to deal with extreme situations in terms of scale factor and rotation angle.
The main contributions of this article are the following.
1) An efficient registration method that adapts MSER and SIFT to exploit the spectral information available in HSIs is proposed. The method exploits spectral information in two ways. First, regions are extracted in several bands. Second, the regions are described with a descriptor composed of a spatial and a spectral part, which helps to discard false matches.
2) The proposal includes a band selection method specifically designed for the HSI registration problem. It selects bands according to their entropy and wavelength, and achieves better results than other band selection methods in the literature.
3) An exhaustive histogram-based search is used to estimate the registration parameters. All the possible combinations between the matched regions are taken into account. The method selects the best transformation considering all candidates.
4) An in-depth evaluation of the method is carried out. The evaluation is performed over nine pairs of HSIs taken by different sensors at different dates. Moreover, the set of images is extended by applying a wide range of scale factors and rotation angles.
The rest of this article is organized as follows. Section II describes the different stages of the proposed method; the results are discussed in Section III. Finally, Section IV concludes this article.

II. HYPERSPECTRAL MSER
In this section, we present hyperspectral MSER (HSI-MSER), a registration method to align two hyperspectral remote sensing images based on MSER as region detector followed by SIFT as feature descriptor. The method exploits the spectral information available in the images. First, the standard versions of MSER and SIFT are described. Then, the proposed method is presented.

A. Maximally Stable Extremal Region
MSER is a method for region detection in greyscale images [7]. It has been successfully applied to a large number of applications such as image recognition, tracking, and image registration [30], [31]. The algorithm extracts a number of regions called MSERs by thresholding the image at different grey levels and according to a stability criterion. An ER is a region in which all pixels within the ER have higher intensity values (for the bright ERs) or lower intensity values (for the dark ERs) than all the pixels on the outer boundary of the region. The outer boundary of a region is defined as the set of pixels that meet two conditions: being adjacent to one, two, or three pixels in the region and not belonging to the region. An ER is considered stable (an MSER) when it does not change substantially as the grey level threshold is varied [7].
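The ER definition above can be illustrated with a minimal sketch. It only tests the intensity condition for a given set of pixels; the 4-neighbourhood used for the outer boundary, the function names, and the nested-list image layout are assumptions made for illustration, and neither connectivity nor the stability criterion is checked.

```python
def outer_boundary(region, height, width):
    """Pixels outside `region` that touch it (4-neighbourhood is an assumption)."""
    boundary = set()
    for row, col in region:
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n_row, n_col = row + d_row, col + d_col
            inside_image = 0 <= n_row < height and 0 <= n_col < width
            if inside_image and (n_row, n_col) not in region:
                boundary.add((n_row, n_col))
    return boundary


def is_extremal_region(image, region, bright=True):
    """Check the ER intensity condition for a set of (row, col) pixels.

    Bright ER: every region pixel is brighter than every outer-boundary pixel.
    Dark ER: every region pixel is darker than every outer-boundary pixel.
    """
    region = set(region)
    if not region:
        raise ValueError("region must contain at least one pixel")
    boundary = outer_boundary(region, len(image), len(image[0]))
    if not boundary:
        return True  # the region covers the whole image
    inside = [image[r][c] for r, c in region]
    outside = [image[r][c] for r, c in boundary]
    if bright:
        return min(inside) > max(outside)
    return max(inside) < min(outside)
```

For a bright blob surrounded by darker pixels the function returns True, whereas a subset of that blob whose boundary includes other bright blob pixels is rejected.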
MSER presents two properties that make it ideal for image registration [14]. First, linear or affine transformations of the image intensities do not affect the extracted ERs because the regions depend only on the ordering of pixel intensities, which is preserved under these monotonic transformations. Second, a set of regions is preserved after applying geometric and photometric changes because an ER will continue to be an ER after these transformations.
A reference implementation of MSER can be found in [32] although it only considers RGB images in the range [0,255]. In this work, the original range of the input HSIs is considered and the algorithm is extended to an arbitrary number of bands.

B. SIFT Descriptor
The SIFT is one of the most popular feature detectors and descriptors. In this work, we use the descriptor part to calculate the description of each region previously detected by MSER.
The steps to compute the SIFT descriptor of each region are the following. First, the dominant angles are calculated for each region to achieve invariance to image rotation. An orientation histogram with 36 bins covering 360° is created from the gradient orientations within the surface of each region. Then, it is weighted by gradient magnitude and by a Gaussian-weighted circular window. The highest peak in the histogram is selected as the dominant orientation. Moreover, any peak above 80% of it is also taken into account. That means that we will have regions with the same location but different orientations. The next step is the descriptor construction. First, an area of size 16 × 16 pixels centred around each region is selected. The area is divided into subareas of 4 × 4 pixels. For each subarea, an orientation histogram with 8 bins is created. Finally, a 128-parameter descriptor (16 subareas × 8 bins) for each region is generated from this set of weighted histograms. To reduce the influence of boundary effects, brightness and illumination changes, the descriptor values are thresholded and normalized to unit length.
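The orientation assignment step can be sketched as follows. This is a simplified illustration, not the reference SIFT implementation: the Gaussian window is omitted, orientations are reported as bin centres without interpolation, and the function name and the input format (a list of orientation/magnitude pairs) are assumptions.

```python
def dominant_orientations(gradients, num_bins=36, peak_ratio=0.8):
    """Return the dominant orientations (degrees) of a region.

    `gradients` is an iterable of (orientation_deg, magnitude) pairs.
    Each gradient votes into a 36-bin histogram weighted by its magnitude;
    the highest peak and every local peak reaching 80% of it are returned.
    """
    bin_width = 360.0 / num_bins
    histogram = [0.0] * num_bins
    for angle, magnitude in gradients:
        index = int((angle % 360.0) // bin_width) % num_bins
        histogram[index] += magnitude

    highest = max(histogram)
    if highest == 0.0:
        return []

    orientations = []
    for index, value in enumerate(histogram):
        left = histogram[index - 1]                 # wraps around at index 0
        right = histogram[(index + 1) % num_bins]
        is_local_peak = value > left and value >= right
        if is_local_peak and value >= peak_ratio * highest:
            orientations.append((index + 0.5) * bin_width)  # bin centre
    return orientations
```

When two peaks are returned, the same region would yield two descriptors, one per orientation, as stated in the text.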

C. HSI-MSER
In this section, we present HSI-MSER, a registration algorithm for HSIs based on MSER and SIFT. HSI-MSER seeks to find a similarity transformation that successfully aligns two hyperspectral remote sensing images. One of the images is called the reference image. The other, the image that we want to register with respect to the reference image, is called the target image. The method consists of the following six stages: band selection, region extraction, region description, region matching, band combination, and registration. A schematic of the proposed algorithm can be seen in Fig. 1.

1) Band Selection:
It is common that contiguous bands do not differ in relevant information for the registration process. Band selection reduces the computational cost with respect to considering all the bands of the image while keeping the relevant information. For this reason, in the first stage, the most relevant spectral bands of the reference and target images are selected.
Out of the methods available in the literature for band selection of HSIs, principal component analysis (PCA) [33], BandClust [34], and Ward's linkage strategy using mutual information (WaluMI) [35] were evaluated. All perform the selection by considering each HSI individually. The entropy-based selection method used in this work considers both images of the dataset jointly. The method consists in selecting the N bands with the highest entropy but separated by at least D consecutive bands. As the bands forming an HSI are ordered by wavelength, D is the minimum number of bands between each pair of selected ones. This ensures that the selected bands differ in both entropy and wavelength. We call this method entropy-based band selection (EBS).
First, the entropy of each band is calculated for both images, i.e., two entropy values are obtained for each band, and the minimum value of each pair is assigned to the band. Then, the bands are ordered according to decreasing values of entropy. The first band selected is the one with the maximum entropy. The next band selected will be the next with the highest entropy but separated by, at least, a distance of D bands, as indicated earlier. This step is repeated until N bands are selected. If it is not possible to find a band that fulfils this condition, D is reduced by one band and the process is restarted.
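The selection procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions: the entropy is taken to be the Shannon entropy of the grey-level histogram, the separation test is read as an index difference of at least D between any two selected bands, and all function and parameter names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy (bits) of a flat sequence of grey levels (assumed measure)."""
    total = len(pixels)
    counts = Counter(pixels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_band_selection(entropy_ref, entropy_tgt, n_bands, min_dist):
    """Select band indices following the EBS description.

    `entropy_ref` and `entropy_tgt` hold one entropy value per band for the
    reference and target images. Returns (selected_bands, distance_used).
    """
    if n_bands <= 0 or not entropy_ref:
        return [], min_dist
    # Each band is scored with the minimum entropy of the two images.
    joint = [min(a, b) for a, b in zip(entropy_ref, entropy_tgt)]
    order = sorted(range(len(joint)), key=lambda b: joint[b], reverse=True)
    n_bands = min(n_bands, len(joint))

    dist = min_dist
    while dist > 0:
        selected = []
        for band in order:  # decreasing entropy
            if all(abs(band - s) >= dist for s in selected):
                selected.append(band)
                if len(selected) == n_bands:
                    return selected, dist
        dist -= 1  # relax the separation and restart
    return order[:n_bands], 0
```

With the configuration reported later in the article, the call would be `entropy_band_selection(entropy_ref, entropy_tgt, 8, 20)`.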
2) Region Extraction: The second stage consists in extracting regions from the HSIs. The region extraction is applied to each band selected in the previous stage.
Let I be an HSI consisting of H × W pixels indexed by the variable x and B spectral bands. Let $B_b(x)$ be the grey level value of a pixel x in the selected spectral band b. Let also $L = [\min_{x \in I} B_b(x), \max_{x \in I} B_b(x)]$ be the grey level range in band b. The extracted regions in band b for the greyscale level $l \in L$ are transformed into ellipses given by

$$\mu_l = \frac{1}{|R_l|} \sum_{x \in R_l} x, \qquad \Sigma_l = \frac{1}{|R_l|} \sum_{x \in R_l} (x - \mu_l)(x - \mu_l)^{\top},$$

where $\mu_l$ and $\Sigma_l$ are the mean and variance of the pixels composing the region, and $R_l$ is an ER detected in this band for the greyscale level l [36]. The aim of this stage is to detect a large number of common structures in both HSIs that will then be used to calculate the transformation to align the images. HSI-MSER is specifically designed to deal with spectral information because some structures are only perceptible in some bands.
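As an illustration of the ellipse fitting, the following sketch computes the mean and the 2 × 2 covariance of the pixel coordinates of a region and derives the ellipse axes from the covariance eigenvalues. The function names are hypothetical, and taking the semi-axes as one standard deviation along each principal direction is an assumption; the text does not state the scaling used.

```python
import math


def region_to_ellipse(pixels):
    """Mean and covariance of the (x, y) coordinates of a region's pixels."""
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels) / n
    syy = sum((y - mean_y) ** 2 for _, y in pixels) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
    return (mean_x, mean_y), ((sxx, sxy), (sxy, syy))


def ellipse_axes(covariance):
    """Semi-axes and orientation (radians) from a 2x2 symmetric covariance."""
    (sxx, sxy), (_, syy) = covariance
    half_trace = (sxx + syy) / 2.0
    delta = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    major = math.sqrt(half_trace + delta)             # largest eigenvalue
    minor = math.sqrt(max(half_trace - delta, 0.0))   # guard against rounding
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return major, minor, angle
```

The centre returned by `region_to_ellipse` corresponds to the region coordinates used in the description stage.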
3) Region Description: In the third stage, the extracted regions are described. The algorithm is designed to register HSIs. This requires that the descriptor is made up of spatial and spectral information in order to achieve better alignments.
The SIFT descriptor is used for the spatial part. For each ellipse, the coordinates of its centre are considered as the coordinates of the region. The SIFT descriptor is computed on the surface of the region bounded by the ellipse as explained in Section II-B.
This spatial descriptor is enriched by a spectral part, in particular, the spectral signature of the keypoint, which is defined as the pixel vector of the centre of the ellipse since the regions extracted by MSER are homogeneous. Both parts are concatenated to form a descriptor that takes into account both the spatial and the spectral information. The descriptor is a vector made up of 128 components for the spatial part plus the number of bands of the original HSIs as the spectral part.
4) Region Matching: Then, in the fourth stage, the regions of each pair of bands (one band for each HSI) are matched independently, i.e., without taking into account the regions of the other pairs of selected bands. Although the bands are matched independently, the spectral information of the other bands has been taken into account. In particular, the Euclidean distance is used for the spatial part of the descriptor, and the cosine similarity for the spectral part.
The process for matching the regions of both images consists in calculating the distances between their descriptors. Given a region in the reference image, the best candidate match in the target image is the one with the closest distance. However, some regions are not detected in the target image, which means that we will get a false match. Therefore, a method is needed to discard false matches.
The method for region matching consists of two steps. First, the Euclidean distances between each region of the reference band and all of the regions of the target band are computed. The Euclidean distances are calculated on the spatial part of the descriptor, i.e., the SIFT descriptor. Given a region in the reference image, the region closest to it in the target image is considered a possible match if the ratio between the distances to the two closest regions is smaller than a threshold $D_{\text{spatial}}$. Second, to finally be considered a match, the cosine similarity between the spectral signatures of the centres of the regions must be higher than $S_{\text{spectral}}$. The spectral information allows discarding false matches in this second step.
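The two-step matching can be sketched as follows. This is an illustrative brute-force version: the data layout (each region as a pair of SIFT descriptor and spectral signature) and the function names are assumptions, and the default thresholds are the values the article reports for the spatial ratio and the spectral similarity.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def match_regions(reference, target, d_spatial=0.7, s_spectral=0.95):
    """Match regions of one reference band against one target band.

    Each region is a (sift_descriptor, spectral_signature) pair.
    Returns a list of (reference_index, target_index) matches.
    """
    matches = []
    if len(target) < 2:
        return matches  # the ratio test needs two candidates
    for i, (sift_ref, spectrum_ref) in enumerate(reference):
        # Step 1: ratio test on the spatial (SIFT) part.
        distances = sorted(
            (euclidean(sift_ref, sift_tgt), j)
            for j, (sift_tgt, _) in enumerate(target)
        )
        (best_dist, best), (second_dist, _) = distances[0], distances[1]
        if second_dist == 0.0 or best_dist / second_dist >= d_spatial:
            continue
        # Step 2: spectral check on the signatures of the region centres.
        if cosine_similarity(spectrum_ref, target[best][1]) > s_spectral:
            matches.append((i, best))
    return matches
```

A candidate that passes the ratio test but whose spectral signature differs from that of the reference region is discarded in the second step.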
$D_{\text{spatial}}$ and $S_{\text{spectral}}$ were experimentally set at 0.7 and 0.95, respectively. These are the tradeoff values that achieve good results in terms of number of successfully registered cases for the whole dataset with a moderate computational cost.
5) Band Combination: As some regions are only detected in some bands, all matched regions extracted from the selected bands are joined in the fifth stage, i.e., all the regions extracted from the different bands are considered in the same pool.
Thus, regions that are only present in some bands are used to compute the transformation, i.e., all the spectral information is considered together.
6) Registration: Finally, in the sixth stage, an exhaustive histogram-based search is performed to register the images. The method computes a possible transformation for each combination of two matched regions. A selection is then carried out based on all the rotation angles and scale factors obtained.
The procedure is as follows. First, a scale factor, a rotation angle, and translation parameters are computed from each combination of two matched regions, as can be seen in Fig. 1. Second, a histogram with the rotation values is calculated. As we want a robust method against rotations, the 360° have been divided into 72 bins, i.e., the bin size is 5°. It was selected based on experiments and following the recommendations by [37]. This allows obtaining bins with a considerable number of elements (transformations), higher accuracy in terms of rotation angle for a first estimation, and a well-defined peak. Bin sizes of 2.5° and 10° were also considered, obtaining worse results in terms of number of successfully registered cases.
Moreover, a 2.5° overlap between bins has been defined. The overlap of 2.5° allows having a flexible boundary between bins, so each angle could contribute to different bins; for example, an angle of 7.5° contributes to the bins centred at 5° and 10°.
Once the histogram is built, the elements (transformations) of the bin of highest frequency are sorted by the scale factor to obtain the median, which is a measure that is more robust to outliers than the mean. Finally, the scale factor ρ, rotation angle θ, and translation parameters (x, y) of the median element are selected to register the HSIs.
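The whole registration stage can be sketched as follows. It is a simplified illustration rather than the authors' implementation: the transformation is assumed to map target coordinates onto reference coordinates, bins are assumed to be centred at multiples of the bin size, and the overlap is interpreted as extending each bin by half the overlap on each side. The function names are hypothetical.

```python
import math
from itertools import combinations


def similarity_from_two_matches(match_a, match_b):
    """Scale, angle (degrees), and translation from two (ref, tgt) point pairs."""
    (ref_a, tgt_a), (ref_b, tgt_b) = match_a, match_b
    v_ref = (ref_b[0] - ref_a[0], ref_b[1] - ref_a[1])
    v_tgt = (tgt_b[0] - tgt_a[0], tgt_b[1] - tgt_a[1])
    len_ref, len_tgt = math.hypot(*v_ref), math.hypot(*v_tgt)
    if len_ref == 0.0 or len_tgt == 0.0:
        return None  # coincident points give no usable transformation
    scale = len_ref / len_tgt
    angle = math.atan2(v_ref[1], v_ref[0]) - math.atan2(v_tgt[1], v_tgt[0])
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    tx = ref_a[0] - scale * (cos_a * tgt_a[0] - sin_a * tgt_a[1])
    ty = ref_a[1] - scale * (sin_a * tgt_a[0] + cos_a * tgt_a[1])
    return scale, math.degrees(angle) % 360.0, tx, ty


def estimate_registration(matches, bin_size=5.0, overlap=2.5):
    """Histogram-based search over all pairs of matched regions."""
    candidates = []
    for match_a, match_b in combinations(matches, 2):
        candidate = similarity_from_two_matches(match_a, match_b)
        if candidate is not None:
            candidates.append(candidate)

    n_bins = int(round(360.0 / bin_size))
    half_width = bin_size / 2.0 + overlap / 2.0  # assumed overlap handling
    bins = [[] for _ in range(n_bins)]
    for candidate in candidates:
        for k in range(n_bins):
            diff = abs(candidate[1] - k * bin_size) % 360.0
            if min(diff, 360.0 - diff) <= half_width:  # circular distance
                bins[k].append(candidate)

    best_bin = max(bins, key=len)
    if not best_bin:
        return None
    best_bin.sort(key=lambda c: c[0])      # sort by scale factor
    return best_bin[len(best_bin) // 2]    # median element
```

Under this reading of the overlap, an angle of 7.5° falls within 3.75° of both 5° and 10° and therefore votes in both bins, in line with the example given in the text.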

III. RESULTS
In this section, the results obtained by the HSI-MSER method using different hyperspectral remote sensing images are presented. First, the experimental conditions and test images are described in Section III-A. In Section III-B, the proposed band selection method is compared to other methods in the literature. Then, in Section III-C, an analysis exploiting different numbers of bands in the first stage of the algorithm (see Fig. 1) is carried out. In Section III-D, HSI-MSER is compared to other hyperspectral registration algorithms in the literature in terms of number of successfully registered cases and registration accuracy. Finally, computation times are analysed in Section III-E.

A. Experimental Conditions and Dataset
This section presents the experimental conditions and test images as well as some experimental results. The experiments were carried out on a PC with a quad-core Intel i7-4790 CPU at 3.60 GHz and 24 GB of RAM. The code was written in C and compiled with gcc and g++ 7.5.0 under Ubuntu 18.04.
The test procedure consists in registering one image, called reference image, with respect to a second image, called target image, which presents changes of scale, rotation, and translation. The evaluation of the proposed method was performed over nine hyperspectral remote-sensing scenes [38] that can be divided into two groups: images frequently used in the literature, for which only one image is available, and pairs of images taken by the airborne visible/infrared imaging spectrometer (AVIRIS) sensor at different dates.
The first group contains scenes of rural places and cities. The target image, the image we want to align with respect to the original, is generated by scaling and rotating the original images (called reference images). In this way, we can investigate all the registration details in controlled conditions. The generation of the target images will be explained in more detail later. A colour composition of these images is presented in Fig. 2. In the case of the images in the second group, these parameters are applied to the most recent image, as mentioned above. In all cases, the target images are trimmed on the central region to keep the same size as the original images. The test consists in registering each target image (the generated ones) with respect to the reference image.
The registration algorithm obtains angle, scale, and translation parameters as output. The registration is considered correct if the parameters obtained by the algorithm are the same as the original values.

B. Evaluation of Band Selection Methods
In this section, the evaluation of different band selection methods in the first stage of the proposed algorithm is presented. EBS is compared to PCA [33], BandClust [34], and WaluMI [35].
PCA is a well-known dimensionality reduction method. It generates a new set of linearly uncorrelated variables where the first few retain most of the data variation [33]. The idea is to eliminate data redundancy while preserving relevant information.
BandClust is an unsupervised recursive binary band-splitting algorithm [34]. It iteratively splits a band interval into two disjoint contiguous sets based upon a criterion of minimization of the mutual information, i.e., the method automatically determines the optimal number of bands. Finally, the bands of each set are averaged.
WaluMI performs a hierarchical clustering based on Ward's linkage method [35]. It groups bands to minimize the intracluster variance and maximize the intercluster variance. The distance used is based on the mutual information between each pair of bands. In the end, WaluMI chooses the most representative band of each cluster.
Table I summarizes the cases that were correctly registered for each scene using a single band, randomly selected for each image, or a set of bands extracted by these methods. As explained in Section III-A, the registration is considered correct if the parameters obtained by the algorithm are the same as the original values. In the case of the band selection method used in the proposal, EBS, two parameters must be fixed: the number of bands to be selected N and the minimum distance between the selected bands D. N is set to 8, while D is set to 20, as we want to select bands with different wavelengths to keep all the relevant information. This configuration is called EBS 8. The same number of bands has been selected for the state-of-the-art methods, with the exception of BandClust, in which the method itself determines the optimal number of bands. The last row of Table I shows the average number of scalings, i.e., the sum of the number of scales per scene that were correctly registered for all angles divided by the number of scenes.
As shown in Table I, better results are obtained when the spectral information is exploited by considering several bands. This allows detecting features that are only present in some bands. Fig. 5 illustrates this statement. It shows an example of matching for two pairs of bands selected by EBS for the Jasper Ridge images. It can be observed that some features are only present and detected in some spectral bands.
PCA is the exception to the rule. It provides worse results than using only one band. The reason is that PCA applies different transformations to the reference image and to the target images obtaining 8 different principal components (PCs) for each one. This results in a small number of common regions.
The results in Table I show that using the proposed band selection method, EBS, 20.86 cases are correctly registered on average (for all the scenes), more than twice the number of cases achieved using only one band (9.71 cases).

C. Results Exploiting Different Numbers of Bands
As explained in Section II-C, HSI-MSER exploits spectral information in two ways. First, by searching for different regions in selected bands, and second, by incorporating the spectral information into the descriptor. In this section, we analyze the effect of selecting a different number of bands in the first stage, i.e., different values for N in EBS will be evaluated. D is set to 20 as in the previous section.
The test procedure is as explained in the previous section. A total of 40 scale factors from 1/9× to 16.5× and 72 rotation angles from 0° to 360° are applied to the target image. Table II summarizes the cases that were correctly registered for each scene by exploiting different numbers of bands selected by EBS, from 2 to 16 bands in steps of 2. It shows that the more bands we use, the better the results. HSI-MSER using 16 bands selected by EBS provides the best results on average, correctly registering 21.22 cases as compared to 11.78 cases registered using 2 bands. For EBS 8, the number of cases correctly registered is 19.00, which is very close in quality to the results obtained by EBS 16 but with lower computational cost. For that reason, we chose EBS 8 as the default configuration for the proposed method. The computational cost will be evaluated in Section III-E.
[Table II note: The range indicates the scales successfully registered for the 72 angles. The numbers in parentheses summarize the number of scales that were correctly registered for all angles. If an angle is incorrectly registered, the whole scale factor is considered incorrect, i.e., this case is not included in the table. The registration is considered correct if the parameters obtained by the algorithm are the same as the original values.]
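The counting rule behind the figures in Tables I and II (a scale factor counts only if every angle is registered correctly, and the average is taken over scenes) can be sketched as follows. The data layout, a dictionary from (scale, angle) to a Boolean outcome, and the function names are assumptions made for illustration.

```python
def scales_registered_for_all_angles(results):
    """Scale factors for which every tested angle was registered correctly.

    `results` maps (scale_factor, angle) to True or False.
    """
    all_ok = {}
    for (scale, _angle), correct in results.items():
        all_ok[scale] = all_ok.get(scale, True) and correct
    return sorted(scale for scale, ok in all_ok.items() if ok)


def average_registered_cases(results_per_scene):
    """Average number of fully registered scale factors over all scenes."""
    counts = [
        len(scales_registered_for_all_angles(results))
        for results in results_per_scene.values()
    ]
    return sum(counts) / len(counts) if counts else 0.0
```

A single failed angle removes the whole scale factor from the count, which is why the reported averages are conservative.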

D. Comparison to Other Methods in the Literature
In this section, the proposed method is compared to other hyperspectral registration algorithms in the literature: the hyperspectral Fourier-Mellin (HYFM) [41], and the hyperspectral SURF (HSI-SURF) algorithm [42]. Both algorithms exploit 8 spectral bands to register two HSIs. The comparison is made in terms of range of successfully registered cases for each scene, number of matches, number of correct matches, RMSE, registration error, and computational time.
HYFM is an area-based method, which performs a PCA to reduce the dimensionality and extracts 8 PCs for each HSI [41]. One log-polar grid for each pair of PCs is computed using the adaptable multilayer fractional Fourier transform. The different log-polar grids are combined to integrate the information from the different PCs. The highest peaks in the combined log-polar grid are examined to determine the scaling, rotation, and translation parameters.
HSI-SURF is a feature-based method [42]. It extracts keypoints in 8 selected bands for each image. The method is based on the SURF [9] algorithm as keypoint detector and descriptor, and considers the spectral information of the images in the band selection, keypoint description, and keypoint matching stages. It uses a band selection method based on entropy, as in HSI-MSER.
A comparison between HYFM, HSI-SURF, and HSI-MSER regarding the number of successfully registered cases is presented in Table III for the same number of extracted bands. Feature-based methods (HSI-SURF and HSI-MSER) achieve better results than the area-based method (HYFM) because of their resilience to illumination and intensity changes introduced by noise. HSI-MSER is the method that correctly registers more cases on average, specifically, 19.00 cases. The most notable improvement occurs in the case of the second group of images, for which almost twice the scale factors are correctly registered, for example, for Jasper Ridge up to 7.0× compared to 3.0× or 4.0× for HYFM and HSI-SURF, respectively.
The registration accuracy must also be evaluated. Extracting control points manually for this evaluation is a time-consuming task that depends on the user decision. As an alternative, the regions extracted by MSER are considered as control points, as proposed in [16]. The reference registration parameters are applied to the matched regions detected in the target image to calculate how much they differ from the regions in the reference image. The reference registration parameters can be seen in Table IV. The original scale factors and angular rotations in the AVIRIS database for each image were used as a starting point [43]. Then, an expert refined these scale factors and rotation angles by hand and obtained the translation parameters to be considered as the correct values.
These experiments are carried out on the second group of test images presented in Section III-A. No additional scale factor, rotation angle, or translation parameters are applied, only the original transformations already presented, because the images were taken on different flights and dates.
Table V compares the number of matches for each scene using the HSI-SURF and HSI-MSER methods. The first row displays the number of keypoint matches extracted in the case of HSI-SURF and the number of region matches in the case of HSI-MSER. The number of keypoint or region matches found by a method does not influence the quality of the registration, but it does affect the computation time. In the second row, the number of correct matches is shown. A match is considered incorrect if the error, measured as the Euclidean distance between the coordinates of the features obtained for the reference image and the corresponding ones for the target image after applying the correct transformation, is higher than 2 pixels. In the case of HSI-SURF, the features are the coordinates of the keypoints, while in the case of HSI-MSER, the features are the coordinates of the centres of the extracted regions.
Table VI shows the number of matches actually used to compute the final registration parameters. These matches are a subset of those shown in Table V. Both HSI-SURF and HSI-MSER use the exhaustive search method explained in Section II-C to compute the registration parameters. It takes into account all the possible combinations between the matched features. Then, a histogram representing all obtained angles of rotation is created. The bin with the maximum number of elements is selected. The final registration parameters are obtained from the median of the scale factors of the parameters of this bin.
Two of the most frequently used error measures in the literature are also presented in Table VI: RMSE [44] and registration error. The registration error is computed as the average Euclidean distance between the reference-image features and the target-image features used to compute the final registration parameters, the latter after the application of the reference transformation shown in Table IV. It is measured in pixels.
Let $r_i$ represent the coordinates of the centre of region $i$ in the reference image, $t_i$ the coordinates of the centre of its matched region in the target image after the application of the reference transformation, and $M$ the number of matches. The registration error is then
$$\text{registration error} = \frac{1}{M} \sum_{i=1}^{M} \lVert r_i - t_i \rVert_2 .$$
In the case of HSI-SURF, the features are the coordinates of the keypoints.
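The error measures can be illustrated with a short Python sketch. The registration error follows the definition in the text (mean Euclidean distance over the matches). The RMSE shown here is the usual root of the mean squared distance, which is an assumption, since the exact definition is the one given in [44]. The helper `count_correct` applies the 2-pixel criterion used for the correct matches of Table V. All function names are hypothetical, and the target points are assumed to be already mapped with the reference transformation.

```python
import math


def registration_error(ref_points, mapped_target_points):
    """Mean Euclidean distance (pixels) between matched feature coordinates."""
    dists = [math.dist(r, t) for r, t in zip(ref_points, mapped_target_points)]
    return sum(dists) / len(dists)


def rmse(ref_points, mapped_target_points):
    """Root of the mean squared Euclidean distance (common definition, assumed here)."""
    sq = [math.dist(r, t) ** 2 for r, t in zip(ref_points, mapped_target_points)]
    return math.sqrt(sum(sq) / len(sq))


def count_correct(ref_points, mapped_target_points, threshold=2.0):
    """Number of matches whose error does not exceed the threshold (2 pixels in the text)."""
    return sum(
        1
        for r, t in zip(ref_points, mapped_target_points)
        if math.dist(r, t) <= threshold
    )
```

Since the RMSE squares each distance before averaging, a single badly localized match raises it more than it raises the mean-distance registration error.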
As shown in Table VI, HSI-MSER achieves smaller errors than HSI-SURF in all scenes except Santa Barbara Front, in which one of the selected regions has an RMSE of 2 pixels. The high error values obtained by both methods in the Crown Point scene are related to the large distortion present in these images. Aligning this scene better would require more degrees of freedom, i.e., an affine or nonrigid transformation.

E. Computation Times
Time performance is crucial in real-time applications or when large datasets must be processed. In this section, we present an analysis of the computation times of HYFM, HSI-MSER, and HSI-SURF. Table VII shows the CPU execution times for each scene, considering the last common scale successfully registered by all methods in Table III. It also includes the HSI-MSER version that exploits 16 bands selected by EBS, called HSI-MSER 16 in Table VII.
The lowest average execution time, 26.31 s, is achieved by HSI-MSER exploiting 8 bands, which makes it the least computationally expensive method on average. Although HYFM and HSI-SURF obtain better times than HSI-MSER for smaller images (Pavia University, Indian Pines, and Salinas), HSI-MSER is more efficient for larger images. The reason is that, as shown in Table V, HSI-SURF needs a larger number of features than HSI-MSER to register the HSIs. This larger number of features increases the computational cost of the matching stage.
Thanks to the additional bands it exploits, HSI-MSER 16 obtains better registration precision than any other method, as seen in Table II. Its execution time is twice as long as that of HSI-MSER 8, but still lower than those of HYFM and HSI-SURF.

IV. CONCLUSION
In this article, HSI-MSER, a feature-based method for registering pairs of hyperspectral remote sensing images, is presented. In particular, the method uses MSER to detect regions and the SIFT descriptor to describe them. To improve the image alignment, the method exploits the spectral information available in the images by detecting features in several preselected bands as well as by including spectral information in the descriptor.
The proposed algorithm is evaluated for a wide variety of scale and rotation parameters and compared in terms of registration precision with two other methods from the literature that also exploit spectral information, HSI-SURF and HYFM. Nine HSIs taken by the AVIRIS and ROSIS sensors are used to evaluate the method. They include urban and rural scenes with changes in different spatial structures and in illumination.
Our proposal achieves competitive results when compared to HSI-SURF and HYFM in terms of registration precision and execution time, especially on larger images. Thanks to the exploitation of the spectral information, the method achieves correct alignment for scale factors of up to 15.0× and a registration time of 26.31 s on average when 8 bands are exploited. In the case of 16 bands, successful alignments of up to 16.5× and an execution time of 51.51 s on average are achieved.
Álvaro Ordóñez received the B.S. degree in computer science and the M.S. degree in big data technologies in 2015 and 2016, respectively, from the University of Santiago de Compostela, Santiago, Spain, where he is currently working toward the Ph.D. degree in computer science.
He is currently an Assistant Researcher of Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Santiago de Compostela, Spain. His main research interests include image analysis and processing, parallel algorithms, and big data technologies.
Álvaro Acción received the B.S. degree in computer science and the M.S. degree in big data technologies in 2014 and 2017, respectively, from the University of Santiago de Compostela, Santiago de Compostela, Spain, where he is currently working toward the Ph.D. degree in computer science and classification of hyperspectral remote sensing images.
He is currently an Assistant Researcher of Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Santiago de Compostela, Spain. His main research interests include image analysis and processing.
Francisco Argüello received the B.S. and Ph.D. degrees in physics from the University of Santiago de Compostela, Santiago, Spain, in 1988 and 1992, respectively.