Introduction
Registration is a key image processing component in brain image studies. Automatic brain image registration methods predominantly rely on information-theoretic similarity measures to avoid the time-consuming and subjective process of manually extracting and aligning landmarks or features. Mutual information (MI), based on Shannon's definition of entropy, is a widely utilized similarity measure for intermodal and/or intersubject 3D brain image registration. MI was originally introduced for image registration by Viola [1] and Maes [2]. Despite its widespread use, it has been shown that the use of MI can result in misregistrations and there is room for improvement [3]–[5].
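For reference, the conventional Shannon MI between two images is typically estimated from their joint intensity histogram. The following is a minimal NumPy sketch; the bin count and estimator details are our own choices and are not prescribed by [1], [2]:

```python
import numpy as np

def mutual_information(x, y, bins=32):
    """Estimate I(X;Y) = sum p(x,y) log( p(x,y) / (p(x) p(y)) )
    from the joint intensity histogram of two images (in nats)."""
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = joint / joint.sum()          # joint pdf estimate
    px = pxy.sum(axis=1)[:, None]      # marginal of x
    py = pxy.sum(axis=0)[None, :]      # marginal of y
    nz = pxy > 0                       # skip empty bins (0 log 0 = 0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px * py)[nz])))
```

An identical image pair maximizes MI at roughly the marginal entropy of the image, while statistically independent images yield values near zero.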
MI computation has been conventionally done based on a global spatial independency assumption over the entire image. This underlying assumption means that there is no statistical relationship among neighboring voxels which is strongly violated in most medical images; this shortcoming was recognized soon after MI was introduced. Studholme et al. [3] attempted to enhance the effectiveness of MI by incorporating spatial dependency into its computation, but the method resulted in limited improvement. Since then, researchers have tried to find new ways by which inter-voxel dependency can be taken into consideration toward computing MI.
Conceptually, the conventional MI provides image similarity based on a single voxel correspondence. However, MI with spatial dependency takes into consideration correspondences of multiple adjacent voxels. Spatially dependent MI is thus more robust to image degradation and consequently provides more accurate image registration. This advantage is the main motivation behind the attempts to incorporate spatial dependency into the computation of MI, but these attempts need to overcome the dimensionality problem when computing similarity across multiple spatially dependent voxels.
Fig. 1(a) illustrates the configuration of spatially dependent voxels in the lowest possible order of the neighboring structure. Even for the lowest order neighboring structure, the high dimensionality of the problem prevents any direct approach from obtaining a tractable solution. The volume of recent works on spatially dependent similarity measures (listed in Section II) is indicative of the keen interest among researchers in the brain imaging field to incorporate spatial information into useful similarity measures such as MI. However, most of the spatially dependent similarity measures, reviewed in Section II, are either an ad-hoc combination of MI with various image features to capture the image spatial information or a heuristic use of different definitions of entropy instead of the conventional Shannon entropy.
Fig. 1. Nearest neighbor voxel configuration and joint distribution for (a) anisotropic and (b) isotropic random fields.
The recently introduced spatial mutual information (SMI) in [6] provides a method for computing spatially dependent MI while addressing the dimensionality problem by applying the Markovianity constraint. A measure that comes closest to SMI is the second-order MI introduced by Rueckert et al. [4].
In Section III, we first describe the existing artifact for translational misregistration in the recently introduced similarity measure SMI. Then, we introduce our method for extending this similarity measure to 3D brain image registration, making SMI a viable alternative to MI. We also demonstrate that this extension removes the aforementioned artifact for translational misregistration. In Section IV, we compare the effectiveness of the proposed 3D SMI with MI, SOMI, and SMI similarity measures for 3D brain image registration using simulated T1 and T2 weighted images while applying different levels of image degradation. Finally, we discuss future avenues of this work and conclude the paper in Section V.
Spatially Dependent Similarity Measures
The aim of this section is to review some of the existing attempts to incorporate spatial dependency into the computation of mutual information, or other information theoretic-based similarity measures. The recent upsurge in the volume of research in spatially dependent similarity measures indicates the importance of this work in the field, and the immediate need for a more robust and effective spatially dependent similarity measure. Fig. 1(a) shows the structure of the nearest neighboring voxels for a pair of 3-dimensional images. Even though the nearest neighbor structure is the lowest order in the neighboring structure of the Markov random fields, it results in a 14-dimensional joint probability, which is required for the computation of MI with spatial dependency. Such a high dimensionality makes any direct approach to compute the joint distribution an intractable problem. The methods reviewed in this section are mainly simplification approaches applied to the joint distribution to make the computation of MI tractable. We begin with homogeneity and isotropy as the most common simplifying assumptions.
Second Order Mutual Information (SOMI) might be considered the most straightforward extension of MI to incorporate spatial dependency under both homogeneity and isotropy assumptions. It involves the use of co-occurrence, or Aura, matrices to estimate the four-dimensional joint probability density function (pdf) of an image pair [4]. This measure is given by

SOMI = \sum_{x,x'\in\chi} \sum_{y,y'\in\chi} p_{X,Y}(x,x',y,y') \log \frac{p_{X,Y}(x,x',y,y')}{p_X(x,x')\, p_Y(y,y')}    (1)
Unfortunately, in practice the four-dimensional joint histogram for estimating this pdf is sparsely populated: the number of available voxel samples is far too small relative to the number of histogram bins to yield a reliable estimate.
In addition, SOMI is an isotropic measure, meaning that there is no sense of directionality in the adjacent voxels. In other words, all six voxels in the nearest neighboring structure are treated the same and without direction. As shown in Fig. 1, the 14-dimensional joint distribution for a pair of anisotropic fields is simplified to a 4-dimensional joint distribution for a pair of isotropic fields. This simplification is essential for the validity of (1), but it reduces the sensitivity of the similarity measure to changes along different directions.
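The estimator of (1) can be sketched as follows. For brevity, only a single right-neighbor offset is used to form the co-occurrences (the full isotropic measure would accumulate pairs over all nearest-neighbor directions), and the intensity quantization scheme is our own choice:

```python
import numpy as np

def somi(x, y, bins=4):
    """Second-order MI of (1) from a 4D co-occurrence histogram over
    (voxel, right-neighbor) intensity pairs of both images (in nats)."""
    # quantize intensities into `bins` levels
    q = lambda im: np.minimum((im - im.min()) / (np.ptp(im) + 1e-12) * bins,
                              bins - 1).astype(int)
    qx, qy = q(x), q(y)
    # (x, x', y, y') tuples for every voxel / right-neighbor pair
    tup = (qx[:, :-1], qx[:, 1:], qy[:, :-1], qy[:, 1:])
    joint = np.zeros((bins,) * 4)
    np.add.at(joint, tuple(t.ravel() for t in tup), 1)
    pxy = joint / joint.sum()              # p(x, x', y, y')
    px = pxy.sum(axis=(2, 3))              # p(x, x')
    py = pxy.sum(axis=(0, 1))              # p(y, y')
    nz = pxy > 0
    denom = px[:, :, None, None] * py[None, None, :, :]
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / denom[nz])))
```

Even in this toy form the sparsity problem is visible: with `bins` intensity levels the histogram has `bins**4` cells, so the bin count must be kept very small relative to the number of voxel pairs.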
Gradient Mutual Information (GMI) is a spatial similarity measure formed by combining MI with a gradient measure [9]. GMI is formulated as follows:

GMI = G(X,Y)\, I(X,Y)

G(X,Y) = \sum_{(x,y)\in(\mathbf{X},\mathbf{Y})} w(\alpha_{x,y}(\sigma)) \min(|\nabla x(\sigma)|, |\nabla y(\sigma)|)

w(\alpha) = \frac{\cos(2\alpha)+1}{2} \quad \& \quad \alpha_{x,y}(\sigma) = \arccos \frac{\nabla x(\sigma) \cdot \nabla y(\sigma)}{|\nabla x(\sigma)|\,|\nabla y(\sigma)|}    (2)
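A sketch of (2) in NumPy follows. The gradients are taken with `np.gradient` and the Gaussian scale σ of [9] is omitted for brevity; the inline MI helper and bin count are our own implementation choices:

```python
import numpy as np

def _mi(x, y, bins=32):
    """Conventional MI from a joint intensity histogram (in nats)."""
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(1)[:, None], p.sum(0)[None, :]
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (px * py)[nz])))

def gradient_mi(x, y, bins=32):
    """GMI of (2): the gradient alignment term G(X, Y) weighting MI."""
    gxr, gxc = np.gradient(x.astype(float))
    gyr, gyc = np.gradient(y.astype(float))
    mx, my = np.hypot(gxr, gxc), np.hypot(gyr, gyc)
    # angle between corresponding gradient vectors
    cos_a = np.clip((gxr * gyr + gxc * gyc) / (mx * my + 1e-12), -1.0, 1.0)
    w = (np.cos(2 * np.arccos(cos_a)) + 1) / 2   # favors (anti)parallel gradients
    g = float(np.sum(w * np.minimum(mx, my)))
    return g * _mi(x, y, bins)
```

The weight w(α) peaks at α = 0 and α = π, so gradients that are parallel or antiparallel (as happens across modalities with inverted contrast) both contribute fully, while orthogonal gradients contribute nothing.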
The maximum distance gradient magnitude for capturing image spatial information is another approach discussed in [10]. In this approach, MI was obtained from a 4-dimensional joint histogram of two images and corresponding maximum distance gradient magnitudes. However, the lack of available data samples to fill the histogram bins still poses a challenge.
Shen et al. [11] developed a similarity measure that determines image similarities based on an attribute vector for each voxel, including gray matter, white matter, and cerebrospinal fluid interfaces. This similarity measure was specifically devised for inter-modal, inter-subject magnetic resonance brain image registration and requires the segmentation of different tissue types.
Mutual information of regions was introduced by Russakoff et al. [12]. In this approach, a vector of intensity values is created for every voxel in the image. The components of these vectors involve the respective intensity values of the neighboring voxels. These vectors form a matrix in which the rows are assumed to be normally distributed. Thus, the entropy is computed directly from the determinant of the covariance matrix. The entropy of the multivariate normal distribution is given by

H(\mathbf{Z}) = \frac{1}{2}\log|\Sigma_{\mathbf{Z}}| + \frac{d}{2}\log(2\pi e)    (3)

where \Sigma_{\mathbf{Z}} is the covariance matrix and d is the dimension of the vectors.
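A sketch of (3), with Σ_Z estimated from the rows of the vector matrix (the sample-covariance estimator is our own choice):

```python
import numpy as np

def gaussian_entropy(z):
    """Entropy (3) of d-dimensional samples under a multivariate normal
    assumption: H(Z) = 0.5*log|Sigma_Z| + (d/2)*log(2*pi*e), in nats.
    z: (n_samples, d) matrix, e.g. one neighborhood-intensity vector per row."""
    z = np.asarray(z, dtype=float)
    d = z.shape[1]
    cov = np.atleast_2d(np.cov(z, rowvar=False))   # sample covariance Sigma_Z
    _, logdet = np.linalg.slogdet(cov)             # stable log-determinant
    return float(0.5 * logdet + 0.5 * d * np.log(2 * np.pi * np.e))
```

For a standard bivariate normal (Σ = I, d = 2) this gives log(2πe) ≈ 2.84 nats; the appeal of the approach is that estimating a d×d covariance needs far fewer samples than populating a d-dimensional histogram.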
In [14], the image joint histogram was computed along corresponding points on random lines instead of the usual grid pattern. Since the random lines align along any orientation, the spatial information is captured to some degree. However, the outcome depends strictly on the number of points on the lines. Even with a minimal number (2 points), a 4-dimensional histogram is required to be populated which leads to the same scarcity or dimensionality problem as mentioned earlier.
There have also been a number of attempts to compute image spatial mutual information based on multi-feature mutual information [15]–[17]. In these approaches, different features of an image are used to capture the image spatial information instead of incorporating neighboring voxel information. These approaches still suffer from the dimensionality problem. One solution is to consider the feature probability distribution as normal and obtain the joint entropy directly from the covariance matrix by (3). A reliable estimation of the normal distribution using the covariance matrix requires far fewer samples, yet the error in estimating the feature distribution this way has not been studied. Feature extraction is another issue when using these methods, since this process is often done in an ad hoc manner and there is no systematic way of obtaining the most appropriate features.
Another similarity measure, called
The definition of Quantitative-Qualitative MI (QMI) was introduced in [20] and further developed in [21]. QMI is created by adding a utility coefficient into the formulation of the conventional MI as follows:

QMI = \sum_{x\in\chi} \sum_{y\in\chi} U(x,y)\, p_{X,Y}(x,y) \log \frac{p_{X,Y}(x,y)}{p_X(x)\, p_Y(y)}    (4)
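Since (4) differs from conventional MI only by the per-bin weight U(x, y), it can be sketched by weighting each histogram bin's MI contribution; passing U as a bins×bins array is our own interface choice:

```python
import numpy as np

def qmi(x, y, utility, bins=32):
    """Quantitative-qualitative MI of (4): each histogram bin's MI
    contribution is weighted by the utility U(x, y).
    utility: (bins, bins) array; U == 1 everywhere recovers ordinary MI."""
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(1)[:, None], pxy.sum(0)[None, :]
    nz = pxy > 0                      # skip empty bins
    terms = pxy[nz] * np.log(pxy[nz] / (px * py)[nz])
    return float(np.sum(utility[nz] * terms))
```

Note that QMI is linear in U: scaling the utility scales the measure, so only the relative weighting of intensity pairs matters for optimization.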
There have also been attempts to use different definitions of entropy instead of the classical Shannon entropy [23], as well as other information theoretic measures such as the Kullback–Leibler distance [24], entropy correlation coefficients [25] and normalized mutual information [26]. However, the concept of incorporating spatial information is not taken into consideration. For instance, the so-called Jumarie entropy has been used in [27] to define a similarity measure. The expression for the joint entropy in that method resembles the normalized entropy of the absolute difference image. Yet the spatial information is still not taken into consideration, thus these measures are omitted from further review. Next we describe the spatial mutual information definition and its extension to 3D images.
Spatial Mutual Information
A. Existing Definition for 2D Image
SOMI was the first systematic attempt to incorporate image voxel spatial dependency into the computation of image mutual information, based on two simplifying assumptions: homogeneity and isotropy [4]. Despite the inefficiency introduced by the isotropy assumption, the dimensionality of the problem still prevented its full utilization in brain image registration. Markov processes, on the other hand, have facilitated handling high-dimensionality problems under anisotropic conditions. For example, there is a well-established approach using Markov random fields (MRFs) in image modeling [28]. In [6], we made the first attempt to compute image spatial information under the MRF constraint, which is a more relaxed constraint than the independency constraint. In this approach, a causal MRF model, called Quadrilateral Markov Random Field (QMRF), was used to compute image spatial information under the definition of Shannon entropy. The spatial entropy defined in [6] was further simplified for a homogeneous but anisotropic QMRF in [29] for nearest neighboring structures as per the following equation:

H(\mathbf{X}) = mn\,(H(X,X_u) + H(X,X_l) - H(X)) - \frac{mn}{2}(H(X_u,X_l) + H(X_u,X_r))    (5)
SMI = -mn\,H(X,Y) + \frac{mn}{2}\{H(X,Y_u) + H(X_u,Y) + H(X,Y_l) + H(X_l,Y)\} - \frac{mn}{4}\{H(X_u,Y_l) + H(X_l,Y_u) + H(X_u,Y_r) + H(X_r,Y_u)\}    (6)
Fig. 2. Illustration of sample 2D joint entropies in the definition of spatial mutual information.
An observation made in (5) and (6) is that they do not contain all the possible joint entropies of the cliques in the first order MRF, as shown in Fig. 2. For instance, the joint entropies involving the down and right neighbors, such as H(X, Y_d) and H(X, Y_r), do not appear in (6).
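A direct estimate of (6) can be obtained from shifted copies of the two images. In the sketch below, the overall mn factor is dropped (it is constant for a fixed image size), the last cross term is taken with a plus sign by symmetry with the other cross terms, and the bin count and one-voxel border cropping are our own implementation choices:

```python
import numpy as np

def _h2(a, b, bins=16):
    """Shannon joint entropy H(A, B) from a 2D histogram (nats)."""
    p, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = p / p.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def smi_2d(x, y, bins=16):
    """2D spatial MI per (6), up to the constant image-size factor mn.
    X_u, X_l, X_r denote the up/left/right-shifted views of X; a one-voxel
    border is cropped so all views align."""
    c = lambda im: im[1:-1, 1:-1]   # center
    u = lambda im: im[:-2, 1:-1]    # up neighbor
    l = lambda im: im[1:-1, :-2]    # left neighbor
    r = lambda im: im[1:-1, 2:]     # right neighbor
    H = lambda a, b: _h2(a, b, bins)
    return (-H(c(x), c(y))
            + 0.5  * (H(c(x), u(y)) + H(u(x), c(y)) + H(c(x), l(y)) + H(l(x), c(y)))
            - 0.25 * (H(u(x), l(y)) + H(l(x), u(y)) + H(u(x), r(y)) + H(r(x), u(y))))
```

As with conventional MI, a well-aligned image pair gives a clearly larger value than a pair of independent images, but here every term is a tractable 2D joint entropy rather than a high-dimensional one.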
Next, we state the drawback associated with SMI when it is used as a similarity measure in 3D brain image registration, and present a solution for it.
B. Artifact of SMI as Similarity Measure
The formulation of SMI in terms of two-dimensional joint entropies is made possible in (6) by the two conditional independency assumptions given in the following equation:

Y \perp X_l \mid (X, Y_l) \quad \& \quad X \perp Y_l \mid (Y, X_l)    (7)
Fig. 3. SMI curves computed between a simulated T1 brain image and its translationally misregistered version over (a) axial, (b) sagittal, and (c) coronal slices, and (d) the corresponding SMI_{3D} curve.
The above problem is addressed in [6] by using the absolute value of SMI. However, this approach introduces two local optimum points for translational misregistrations, causing difficulty in the optimization process of registration. An alternative solution is proposed next by extending the definition of SMI to 3D brain image volumes, with the added benefit of eliminating the translational artifact. The 3D SMI, hereafter called SMI_{3D}, is presented in the following subsection.
C. SMI for 3D Brain Images
Mathematical computation of SMI for pairs of 3D images requires a theoretical expansion of the definition of QMRF to include 3D random fields, an expansion which does not exist at this time. In this work, we have considered a different approach to incorporate the 3D spatial information into the computation of SMI. Fig. 4 provides an illustration of the proposed approach. In this figure, a brain volume (top left) is translated, and the corresponding axial, sagittal, and coronal cross-sections all change as a result of the translation. SMI is therefore computed over the three cross-sectional views, and the proposed 3D measure is defined as their product:

SMI_{3D} = SMI_a \times SMI_s \times SMI_c    (8)

where SMI_a, SMI_s, and SMI_c denote the spatial mutual information computed over the axial, sagittal, and coronal cross-sections, respectively.
Fig. 4. Illustration of changes in the cross-sectional slices of a typical 3D brain image for a translational misregistration.
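A sketch of (8) follows: the 2D SMI of (6) is evaluated on the three orthogonal cross-sections and the results are multiplied. Using only the central slice of each orientation is our own simplification for brevity (one could equally combine SMI over all slices of each orientation), as are the bin count and helper names:

```python
import numpy as np

def _h2(a, b, bins=8):
    """Shannon joint entropy from a 2D histogram (nats)."""
    p, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = p / p.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def _smi(x, y, bins=8):
    """2D spatial MI of (6), constant mn factor dropped."""
    c = lambda im: im[1:-1, 1:-1]; u = lambda im: im[:-2, 1:-1]
    l = lambda im: im[1:-1, :-2];  r = lambda im: im[1:-1, 2:]
    H = lambda a, b: _h2(a, b, bins)
    return (-H(c(x), c(y))
            + 0.5  * (H(c(x), u(y)) + H(u(x), c(y)) + H(c(x), l(y)) + H(l(x), c(y)))
            - 0.25 * (H(u(x), l(y)) + H(l(x), u(y)) + H(u(x), r(y)) + H(r(x), u(y))))

def smi_3d(vol_x, vol_y, bins=8):
    """SMI_3D of (8): product of 2D SMI over the three orthogonal
    cross-sections (central slice of each orientation)."""
    i, j, k = (s // 2 for s in vol_x.shape)
    smi_a = _smi(vol_x[i, :, :], vol_y[i, :, :], bins)   # axial
    smi_s = _smi(vol_x[:, j, :], vol_y[:, j, :], bins)   # sagittal
    smi_c = _smi(vol_x[:, :, k], vol_y[:, :, k], bins)   # coronal
    return smi_a * smi_s * smi_c
```

Because a translation along any axis perturbs at least two of the three cross-sections, the product penalizes translational misregistration in every direction, which is the intuition behind the artifact removal discussed above.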
Finally, it should be added that in the case of 3D images all the existing cliques in the neighboring structure of the first order QMRF need to be included. Even though these additional cliques would not change the final outcome for the spatial mutual information of a homogeneous random field (as described in [29]), they are all taken into account here, leading to the following expression for the SMI of each cross-section:

SMI = -mn\,H(X,Y) + \frac{mn}{4}\{H(X,Y_u) + H(X_u,Y) + H(X,Y_d) + H(X_d,Y) + H(X,Y_l) + H(X_l,Y) + H(X,Y_r) + H(X_r,Y)\} - \frac{mn}{8}\{H(X_u,Y_l) + H(X_l,Y_u) + H(X_d,Y_l) + H(X_l,Y_d) + H(X_u,Y_r) + H(X_r,Y_u) + H(X_d,Y_r) + H(X_r,Y_d)\}    (9)
Experimental Results and Discussion
In this section, we used simulated 3D brain MRI scans to examine the effectiveness of SMI_{3D} as a similarity measure for 3D brain image registration.
A. Data
In order to evaluate and compare the performance of the similarity measures, we used simulated T1 and T2 weighted brain MRI scans.
B. Evaluation
We first examined the elimination of the translation misregistration artifact in SMI when using the new measure SMI_{3D}.
Using bilinear interpolation, we generated translated versions of the T1 scan with a step size of 0.1 mm along one of the principal axes of the volume.
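Such sub-voxel translations can be generated with linear interpolation along the translation axis; a minimal sketch follows (`scipy.ndimage.shift` with `order=1` would be the off-the-shelf equivalent; the edge-replication behavior and function name here are our own choices):

```python
import numpy as np

def shift_linear(vol, dz):
    """Translate a volume along axis 0 by a sub-voxel amount dz (in voxels)
    using linear interpolation, with edge replication at the borders."""
    n = vol.shape[0]
    src = np.arange(n) - dz                          # source sampling locations
    i0 = np.clip(np.floor(src).astype(int), 0, n - 1)
    i1 = np.clip(i0 + 1, 0, n - 1)
    w = np.clip(src - i0, 0.0, 1.0)[:, None, None]   # interpolation weights
    return (1 - w) * vol[i0] + w * vol[i1]

# misregistered versions at 0.1-voxel steps, e.g.:
# shifted = [shift_linear(vol, 0.1 * k) for k in range(1, 11)]
```

A similarity curve is then traced by evaluating the chosen measure between the original volume and each shifted version.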
Fig. 3(d) shows the corresponding SMI_{3D} curve.
Fig. 3 shows that the translational artifact was removed by using SMI_{3D}.
Next we examined the effectiveness of SMI_{3D} for 3D brain image registration under different levels of image degradation.
We tested the effectiveness of the four different similarity measures, MI, SOMI, SMI, and SMI_{3D}, for 3D brain image registration.
While the experiments here have been limited to simulated images, due to our ability to control the noise level and our access to a gold standard, one can clearly see the superiority of SMI_{3D} over the other similarity measures.
Finally, it is important to note that the product of three different 2D spatial mutual information mathematically does not give another mutual information. The proposed measure is merely a similarity measure which combines the three existing spatial information for three different cross-sections of a 3D image. The fully characterized 3D mutual information requires the extension of QMRF to 3D random fields, and the derivation of SMI from such fields.
Conclusion
This paper has described the importance of incorporating spatial information into the computation of image mutual information and reviewed previous attempts at computing mutual information with spatial dependency. It was shown that the recently defined spatial mutual information exhibits an artifact in the case of translational misregistration, which was remedied by the proposed spatial mutual information for 3D brain images, SMI_{3D}. The proposed similarity measure not only addresses the shortcoming associated with translational misregistration, but also captures 3D, instead of 2D, brain image spatial dependency. The effectiveness of SMI_{3D} was demonstrated through 3D registration experiments on simulated T1 and T2 weighted brain images under different levels of image degradation.