Bayesian Target Detection Algorithms for Solid Subpixel Targets in Hyperspectral Images

We investigate the use of Bayesian methods for hyperspectral subpixel target detection, where the uncertainty associated with the target fill factor is “probabilized” by a suitable prior. Specifically, we present a general framework for Bayesian target detection by employing different models for the background distribution, comparing different choices for the Bayesian prior, and investigating different numerical schemes for evaluating the Bayesian integral. The Bayesian methods are furthermore compared to their generalized likelihood ratio test (GLRT)-based counterparts. Experiments performed over real hyperspectral imagery, with both real and implanted subpixel targets, show that incorporating prior knowledge by means of nonuniform priors emphasizing smaller target fill factors outperforms usage of the “noninformative” uniform prior and enhances Bayes performance beyond the GLRT, a result observed for both parametric and nonparametric background models. We find that even “rough” priors can successfully leverage the context-based information by emphasizing target sizes that are of most interest. We further observe that the Gauss–Legendre numerical integration scheme provides efficient integral approximation while maintaining the desirable admissibility property of Bayesian methods.


I. INTRODUCTION
SEARCHING a hyperspectral image for a target based on its spectral signature is a data processing task with many applications, ranging from forestry and mineral prospecting to pollution monitoring and public safety. With the richness of information content offered by hundreds of spectral bands at every image pixel, hyperspectral imaging has shown great potential in material discrimination. In many applications, the targets to be searched for are subpixel, which is to say that they are smaller than the pixel size on the ground. Detecting subpixel targets means handling a spectral signal including not only the target component but also the signal component pertinent to the background, which is composed of the nontarget materials in the given pixel [1]. Among the various approaches proposed to deal with subpixel targets, this article focuses on decision theory-based statistical algorithms for which the likelihood ratio test (LRT) [2] between the probability density functions (pdfs) of the pixel conditioned on the two competing (target present and target absent) hypotheses is applied at each image pixel. By modeling the spectral variability of a pixel containing a subpixel target with the replacement target model (RTM) [1], [3], the target-present pdf can be expressed as a function of the background pdf and the target signature, and the resulting LRT becomes a composite hypothesis testing problem depending on the target fill factor (or abundance), which is the key parameter of the RTM. The most widely employed approach to composite hypothesis testing is to replace the unknown target fill factor with its maximum likelihood (ML) estimate, thus obtaining the generalized LRT (GLRT) [2]. Other approaches exist, such as penalized LRT and clairvoyant fusion [4], [5], [6], which result in a "weighted GLRT" benefiting from a certain degree of design flexibility.
Unlike these GLRT-based detectors, the Bayesian detectors integrate the likelihood over the range of fill-factor values by weighting it with a suitable prior pdf for the fill factor itself. An important advantage of Bayesian detectors is that they are guaranteed to be admissible [7]. This is an attractive benefit: if a detector is admissible, then no other detector is uniformly more powerful, that is to say, for a given false alarm rate (FAR), no other detector has a higher detection rate (DR) for all target fill factors. While this does not mean an admissible detector is unambiguously optimal, it does mean that no other detector is unambiguously better. Thus, when searching for a detector that is well-suited to a particular operational scenario, it is desirable to restrict attention to the class of admissible detectors.
Although the potential of the Bayesian approach for detecting targets in hyperspectral images was acknowledged several years ago [8], [9], Bayesian detectors are not widely employed in the hyperspectral detection literature. One reason may be the difficulty in finding closed-form solutions (mostly due to the integration they require); another may be that the flexibility to choose the prior turns out to be a double-edged sword: it is rarely obvious which prior should be associated with a given problem. One issue is the very interpretation of the prior. The traditional interpretation is that it represents the distribution of fill factors that one expects to see, a priori, in the imagery to be searched. A slightly different view is provided in [7], where the prior is seen as a means to express the importance that is attached to the various values of the fill factor. Despite this difficulty, we consider the flexibility of the Bayesian detector to be more of a blessing than a curse; combining that with the built-in admissibility property, we believe the Bayesian detection of subpixel targets merits further investigation.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In recent preliminary studies [10], [11], [12], we began exploring the potential of Bayesian detection in hyperspectral imagery: a basic implementation of a numerical solution to Bayesian detection with uniform prior was introduced in [10]; by employing a specific prior with fixed parameters along with certain simplifying assumptions, an approximate closed-form Bayesian detector was derived in [11]; and the utility of a single delta-function prior was investigated in [12]. This article follows the thread outlined by those studies by presenting a more general and structured framework for Bayesian target detection: 1) by enabling the use of arbitrary background models (as demonstrated by experiments using nontrivial parametric and nonparametric background models); 2) by enabling the flexibility to use arbitrary prior pdfs (as demonstrated by the use of a uniform, a nonuniform, and even a strictly "improper" prior); 3) by investigating and comparing the effectiveness of two different numerical integration schemes; 4) by presenting experimental results over both synthetic and real scenarios featuring rare targets; and 5) by comparing detection performance of Bayesian detectors with their more conventional GLRT-based counterparts. We find that Bayesian detectors represent a valid alternative to the more widely employed GLRT, offering not only admissibility but also flexibility with respect to the users' knowledge and interest about the scenario, and we observe that this flexibility can result in improved performance.
The rest of this article is organized as follows. Section II explains the Bayesian approach to solid subpixel target detection, with examples of two Bayesian target detectors. Section III describes the experiments and discusses their results. Finally, Section IV summarizes the main outcomes and contributions of this work.

II. BAYESIAN TARGET DETECTORS FOR SUBPIXEL TARGETS
A. Detection of Solid Subpixel Targets
The LRT between the target-present ($H_1$) and the target-absent ($H_0$, i.e., just background) pdfs is expressed as follows:
$$\Lambda(\mathbf{x}) = \frac{f_{\mathbf{x}|H_1}(\mathbf{x}, \mathbf{t})}{f_{\mathbf{x}|H_0}(\mathbf{x})} \tag{1}$$
where $\mathbf{x} \in \mathbb{R}^d$ is the generic pixel (with $d$ being the spectral dimension) and $\mathbf{t} \in \mathbb{R}^d$ is the target spectral signature; $\mathbf{t}$ and $\mathbf{x}$ are assumed to be radiometrically compatible. Both conditional pdfs are unknown and need to be estimated from the data. Because targets are rare in the scene, the number of available realizations of $\mathbf{x}|H_1$ is generally insufficient to estimate $f_{\mathbf{x}|H_1}(\mathbf{x}, \mathbf{t})$. Thus, the target-present pdf is generally expressed as a function of the background pdf and the target spectral signature $\mathbf{t}$ by means of target spectral variability models, which specify how the observed pixel depends on the background pixel in combination with the target signature and its abundance [3], [13]. One widely employed spectral variability model that is physically consistent with the phenomenology of subpixel targets is the replacement target model (RTM) [1], [3], which splits the pixel into two components, the pixel fraction where the target obscures the background and the residual background-only fraction, as follows:
$$\mathbf{x} = \psi(\mathbf{z}) = (1-\alpha)\,\mathbf{z} + \alpha\,\mathbf{t} \tag{2}$$
where $\mathbf{z} \in \mathbb{R}^d$ is the generic background spectrum and $\alpha \in (0, 1]$ denotes the target fill factor, which quantifies the fraction of the pixel occupied by the target itself. By the change of variable $\mathbf{x} = \psi(\mathbf{z})$, we obtain
$$\Lambda(\mathbf{x}; \alpha) = \frac{(1-\alpha)^{-d}\, f_{\mathbf{z}}\!\left(\dfrac{\mathbf{x}-\alpha\mathbf{t}}{1-\alpha}\right)}{f_{\mathbf{z}}(\mathbf{x})} \tag{3}$$
If the fill factor $\alpha$ were known, the detector in (3) (known as clairvoyant) would provide optimal detection [2]. In real scenarios, $\alpha$ is not known and we need to solve the composite hypothesis testing problem and produce a detector that does not depend on $\alpha$.
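
The clairvoyant statistic described above can be sketched in a few lines. The following is only an illustration under stated assumptions: the function names are hypothetical, the background model is passed in as an arbitrary log-pdf, and the toy background used in the usage lines (a standard Gaussian) is a stand-in rather than one of the models studied in this article. Working with logarithms avoids overflow in the $(1-\alpha)^{-d}$ factor when $d$ is large.

```python
import math

def rtm_implied_background(x, t, alpha):
    """Invert the RTM x = (1 - alpha) * z + alpha * t to recover z."""
    return [(xi - alpha * ti) / (1.0 - alpha) for xi, ti in zip(x, t)]

def clairvoyant_log_lrt(x, t, alpha, log_fz):
    """Log of the clairvoyant LRT for a known fill factor 0 < alpha < 1.

    log_fz is any callable returning the log background pdf of a spectrum.
    """
    d = len(x)
    z = rtm_implied_background(x, t, alpha)
    return -d * math.log(1.0 - alpha) + log_fz(z) - log_fz(x)

def log_std_normal(z):
    """Toy background model (assumption): zero-mean, unit-variance Gaussian."""
    return -0.5 * sum(v * v for v in z) - 0.5 * len(z) * math.log(2.0 * math.pi)

# Usage: a background pixel and the same pixel with a target implanted at alpha = 0.3.
t = [3.0, 3.0]
background_pixel = [0.1, -0.2]
target_pixel = [(1 - 0.3) * b + 0.3 * s for b, s in zip(background_pixel, t)]
score_target = clairvoyant_log_lrt(target_pixel, t, 0.3, log_std_normal)
score_background = clairvoyant_log_lrt(background_pixel, t, 0.3, log_std_normal)
print(score_target, score_background)
```

In this toy setting the pixel that actually contains the target receives the larger score, which is the behavior the LRT is designed to produce.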

B. Bayesian Approach to Subpixel Target Detection
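A Bayesian LRT detector can be derived from the composite hypothesis testing in (3) by posing a prior pdf $f_\alpha(\alpha)$ for $\alpha$ and marginalizing $\alpha$ out as follows:
$$\Lambda_{\mathrm{Bayes}}(\mathbf{x}) = \frac{\displaystyle\int_0^1 (1-\alpha)^{-d}\, f_{\mathbf{z}}\!\left(\dfrac{\mathbf{x}-\alpha\mathbf{t}}{1-\alpha}\right) f_\alpha(\alpha)\, d\alpha}{f_{\mathbf{z}}(\mathbf{x})} \tag{4}$$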
In order to fully specify this detector, not only the prior $f_\alpha(\alpha)$ needs to be specified but also a model for the background pdf $f_{\mathbf{z}}(\cdot)$. Here, we specifically investigate two Bayesian detectors by leveraging two very well-known background models, namely, a long-tailed (parametric) distribution and a data-driven kernel-based (nonparametric) model.
1) Parametric Bayesian Target Detector: Invoking parametric models means assuming a specific parametric form for the background pdf and replacing the parameters with their estimates made from the available data. Long-tailed distributions have been shown to be particularly suitable for modeling hyperspectral backgrounds [14] and have been successfully employed within the GLRT approach to detect rare weak targets [15], [16]. For the multivariate t distribution with mean vector $\boldsymbol{\mu}$, covariance matrix $\mathbf{C}$, and number of degrees of freedom $\nu$, the background pdf is given as follows:
$$f_{\mathbf{z}}(\mathbf{z}) = c\,|\mathbf{C}|^{-1/2}\left[1 + \frac{(\mathbf{z}-\boldsymbol{\mu})^{T}\mathbf{C}^{-1}(\mathbf{z}-\boldsymbol{\mu})}{\nu-2}\right]^{-(\nu+d)/2} \tag{5}$$
where $c$ is a constant (depending on $\nu$ and $d$).
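
As an illustration of the long-tailed background model, the sketch below evaluates the log-pdf of a multivariate t distribution parameterized by its mean, its covariance matrix (not the scale matrix), and $\nu > 2$ degrees of freedom. It assumes NumPy is available; the function name `mvt_logpdf` is hypothetical, and the normalizing constant used here, $c = \Gamma((\nu+d)/2) / \{\Gamma(\nu/2)\,[(\nu-2)\pi]^{d/2}\}$, is the standard one for this parameterization rather than something taken from the article.

```python
import math
import numpy as np

def mvt_logpdf(Z, mu, C, nu):
    """Log-pdf of a multivariate t with mean mu, covariance C, and nu > 2 dof.

    Z may be a single spectrum of shape (d,) or a batch of shape (n, d).
    """
    Z = np.atleast_2d(np.asarray(Z, dtype=float))
    d = Z.shape[1]
    L = np.linalg.cholesky(C)                  # C = L L^T
    Y = np.linalg.solve(L, (Z - mu).T)         # whitened residuals, shape (d, n)
    maha = np.sum(Y * Y, axis=0)               # squared Mahalanobis distances
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    log_c = (math.lgamma(0.5 * (nu + d)) - math.lgamma(0.5 * nu)
             - 0.5 * d * math.log((nu - 2.0) * math.pi))
    return log_c - 0.5 * log_det - 0.5 * (nu + d) * np.log1p(maha / (nu - 2.0))

# Usage on a toy 2-band background (all values are illustrative).
mu_toy = np.zeros(2)
C_toy = np.array([[2.0, 0.3], [0.3, 1.0]])
print(mvt_logpdf([[0.0, 0.0], [4.0, -3.0]], mu_toy, C_toy, nu=5.0))
```

Because the constant and the determinant do not depend on the pixel, they cancel in a likelihood ratio built from this pdf; they are kept here only so that the function is a properly normalized density.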
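In practice, the parameters ($\boldsymbol{\mu}$, $\mathbf{C}$, and $\nu$) are not known and need to be estimated from the available data, i.e., from a sufficiently large set $\{\mathbf{z}_n\}_{n=1}^{N}$ of target-free secondary data, assumed to be identically distributed as $\mathbf{z}$. Due to the target rarity assumption, all image pixels are generally employed for this purpose. The parametric Bayesian detector based on the multivariate t distribution can thus be written as
$$\Lambda^{\mathrm{RTM(P)}}_{\mathrm{Bayes}}(\mathbf{x}) = \int_0^1 (1-\alpha)^{-d}\left[\frac{\hat{\nu}-2+\delta_\alpha(\mathbf{x})}{\hat{\nu}-2+\delta_0(\mathbf{x})}\right]^{-(\hat{\nu}+d)/2} f_\alpha(\alpha)\, d\alpha \tag{6}$$
with
$$\delta_\alpha(\mathbf{x}) = \left(\frac{\mathbf{x}-\alpha\mathbf{t}}{1-\alpha}-\hat{\boldsymbol{\mu}}\right)^{T}\hat{\mathbf{C}}^{-1}\left(\frac{\mathbf{x}-\alpha\mathbf{t}}{1-\alpha}-\hat{\boldsymbol{\mu}}\right)$$
where $\hat{\boldsymbol{\mu}}$, $\hat{\mathbf{C}}$, and $\hat{\nu}$ denote the parameter estimates.
2) Nonparametric Bayesian Target Detector: The nonparametric approach does not rely upon a specific distributional form for the background pdf but directly estimates it in a data-driven fashion. Here, we consider the variable-bandwidth kernel density estimator (VKDE) [17], [18], which has been successfully employed within the LRT for detecting rare targets and spectral anomalies in hyperspectral images [19], [20]. More specifically, the background pdf is obtained as follows:
$$\hat{f}_{\mathbf{z}}(\mathbf{z}) = \frac{1}{N}\sum_{n=1}^{N} \frac{1}{r_k(\mathbf{z}_n)^{d}}\;\kappa\!\left[\frac{\mathbf{z}-\mathbf{z}_n}{r_k(\mathbf{z}_n)}\right] \tag{7}$$
where $\kappa[\cdot]$ is a kernel function whose bandwidth $r_k(\mathbf{z}_n)$ depends on the local data density in the feature space and is specifically evaluated at each $\mathbf{z}_n$ as its Euclidean distance to its $k$th nearest neighbor (k-NN). The k-NN-based bandwidth selection allows for an automated tuning of the kernel smoothing, with resulting performance weakly dependent on the free parameter $k$ [17], [18].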
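The nonparametric Bayesian detector based on the VKDE can thus be written as
$$\Lambda^{\mathrm{RTM(NP)}}_{\mathrm{Bayes}}(\mathbf{x}) = \frac{\displaystyle\int_0^1 (1-\alpha)^{-d}\sum_{n=1}^{N} r_k(\mathbf{z}_n)^{-d}\,\kappa\!\left[\frac{(\mathbf{x}-\alpha\mathbf{t})/(1-\alpha)-\mathbf{z}_n}{r_k(\mathbf{z}_n)}\right] f_\alpha(\alpha)\, d\alpha}{\displaystyle\sum_{n=1}^{N} r_k(\mathbf{z}_n)^{-d}\,\kappa\!\left[\frac{\mathbf{x}-\mathbf{z}_n}{r_k(\mathbf{z}_n)}\right]} \tag{8}$$
3) Prior PDFs: Choosing the prior is a classic (and sometimes even controversial) issue in Bayesian statistics. Ideally, prior choice should be driven by a priori knowledge and, traditionally, it is set based on the best guess for what the unknown parameter distribution is likely to be. For instance, if targets are expected to be much smaller than a pixel, one should design a prior that puts more weight on small fill factors.
In principle, if no prior knowledge is available, a noninformative prior should be used, that is, a prior that is as "flat" as possible [2]. Because of the bounded domain of the fill factor $\alpha$, the most natural choice for a noninformative prior is the uniform pdf.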
In [11], the prior was chosen to simplify the mathematical derivation of an approximated closed-form solution, thus trading analytical tractability for the user's freedom to set the prior based on the actual available knowledge. Here, by contrast, we do not impose any constraint on the prior, giving up on the derivation of (even approximated) closed-form solutions in favor of providing the user with the maximum flexibility in designing the prior. In fact, we believe that even when the available knowledge may be vague, scarce, or somehow not readily expressible as a pdf, a properly set "rough" prior would provide benefits relative to usage of a noninformative prior. In addition, for detection problems, an alternative interpretation is possible [7], where prior choice should not be limited to specify how likely some given fill-factor values are expected to be, but should also specify how important those values of fill factor are to the user. In this case, for example, putting more weight on small fill factors aims at enhanced DRs at those small fill factors, possibly at the expense of reduced DRs at larger fill factors.
In this work, we will leverage the context-based available information while at the same time directing the detection toward smaller target sizes. In doing so, besides proper priors, we also allow for "improper" priors (not finite or with infinite integral), as long as they can produce posterior pdfs with bounded integrals.
4) Integration Schemes: Two numerical integration schemes are considered in this article, the midpoint (MP) rule and Gauss–Legendre (GL) quadrature [21], to numerically approximate the integral at the numerator of the Bayesian detectors. For simplicity, let us write $q(\alpha)$ as the integrand function, and observe that both rules can be expressed in the form
$$\int_0^1 q(\alpha)\, d\alpha \approx \sum_{i=1}^{n_{\mathrm{pts}}} w_i\, q(\alpha_i) \tag{9}$$
but with different choices for $w_i$ and $\alpha_i$. Since the computation in (9) is dominated by the $n_{\mathrm{pts}}$ evaluations of the integrand, the cost depends on $n_{\mathrm{pts}}$, but not on whether the MP or GL rule is employed. According to the MP rule, the integration interval is divided into equally wide subintervals identified by their $n_{\mathrm{pts}}$ uniformly spaced midpoints $\{\alpha_i^{(\mathrm{MP})} = i/n_{\mathrm{pts}} - 1/(2n_{\mathrm{pts}})\}_{i=1}^{n_{\mathrm{pts}}}$, and the integral is numerically approximated by the sum of the areas of $n_{\mathrm{pts}}$ rectangles of width $w_i^{(\mathrm{MP})} = 1/n_{\mathrm{pts}}$ and height $q(\alpha_i^{(\mathrm{MP})})$.
Approximation of an integral with the GL rule means, instead, adopting a nonuniform spacing of the $n_{\mathrm{pts}}$ points where the function is evaluated. Specifically, the positioning of the sampling points within the integration domain is given by
$$\alpha_i^{(\mathrm{GL})} = \frac{\xi_i + 1}{2}, \qquad i = 1, \ldots, n_{\mathrm{pts}}$$
where $\{\xi_i\}_{i=1}^{n_{\mathrm{pts}}}$ are the roots of the $n_{\mathrm{pts}}$-degree Legendre polynomial $L_{n_{\mathrm{pts}}}(\xi)$, and the weights are given by
$$w_i^{(\mathrm{GL})} = \frac{1}{(1-\xi_i^2)\,\left[L'_{n_{\mathrm{pts}}}(\xi_i)\right]^2}$$
The roots $\{\xi_i\}_{i=1}^{n_{\mathrm{pts}}}$ can be found with iterative methods (e.g., the Newton–Raphson method), but both roots and weights are only calculated once (and have been extensively tabulated; e.g., [21, Table 25.2]), whereas the integrand evaluations are performed for every pixel.
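
The two rules can be sketched as follows. This is a minimal pure-Python illustration, not the implementation used in the experiments: the function names are hypothetical, the Legendre roots are found by Newton–Raphson iteration from a standard cosine initial guess, and the nodes and weights are mapped from $[-1, 1]$ to the fill-factor interval $[0, 1]$ (which halves the usual GL weights).

```python
import math

def legendre_and_derivative(n, xi):
    """Evaluate L_n(xi) and L_n'(xi) for n >= 1 and |xi| < 1 (three-term recurrence)."""
    p_prev, p = 1.0, xi
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * xi * p - (k - 1) * p_prev) / k
    dp = n * (xi * p - p_prev) / (xi * xi - 1.0)
    return p, dp

def gauss_legendre_01(n_pts, tol=1e-15, max_iter=100):
    """GL nodes and weights on [0, 1], roots found by Newton-Raphson."""
    pairs = []
    for i in range(1, n_pts + 1):
        xi = math.cos(math.pi * (i - 0.25) / (n_pts + 0.5))  # initial guess
        for _ in range(max_iter):
            p, dp = legendre_and_derivative(n_pts, xi)
            step = p / dp
            xi -= step
            if abs(step) < tol:
                break
        _, dp = legendre_and_derivative(n_pts, xi)
        pairs.append((0.5 * (xi + 1.0), 1.0 / ((1.0 - xi * xi) * dp * dp)))
    pairs.sort()
    return [a for a, _ in pairs], [w for _, w in pairs]

def midpoint_01(n_pts):
    """MP nodes and weights on [0, 1]."""
    return [(i - 0.5) / n_pts for i in range(1, n_pts + 1)], [1.0 / n_pts] * n_pts

def quadrature(q, nodes, weights):
    """Weighted sum approximating the integral of q over [0, 1]."""
    return sum(w * q(a) for a, w in zip(nodes, weights))

# Usage: compare both rules on a smooth integrand with a known integral (1/3).
gl_nodes, gl_weights = gauss_legendre_01(6)
mp_nodes, mp_weights = midpoint_01(6)
print(quadrature(lambda a: a * a, gl_nodes, gl_weights))
print(quadrature(lambda a: a * a, mp_nodes, mp_weights))
```

With the same number of integrand evaluations, the GL rule is exact for polynomials up to degree $2n_{\mathrm{pts}} - 1$, whereas the MP rule is exact only for linear integrands.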
It should be noted that approximating the integral at the numerator of a Bayesian detector with either the MP or the GL rule does not impair the admissibility of the detector itself. In fact, approximating the integral with a sum over discrete terms is equivalent to a Bayesian detector with a different prior (namely, a weighted delta-comb function) and, thus, the admissibility property is maintained.
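
The delta-comb view can be made concrete with a short sketch. The code below is an illustration under stated assumptions: the names are hypothetical, the background log-pdf is an arbitrary callable (a toy standard Gaussian in the usage lines), and the sum is accumulated with the log-sum-exp trick because the $(1-\alpha)^{-d}$ factor can be very large for realistic $d$. A grid-based GLRT counterpart, which takes the peak over the same sampling points instead of the weighted sum, is included for comparison.

```python
import math

def _node_terms(x, t, log_fz, nodes):
    """Log of (1 - a)^(-d) * f_z((x - a t) / (1 - a)) at each fill-factor node a."""
    d = len(x)
    terms = []
    for a in nodes:
        z = [(xi - a * ti) / (1.0 - a) for xi, ti in zip(x, t)]
        terms.append(-d * math.log(1.0 - a) + log_fz(z))
    return terms

def bayes_log_lrt(x, t, log_fz, nodes, weights, prior=lambda a: 1.0):
    """Log Bayesian LRT with the integral replaced by a weighted delta comb."""
    logs = [math.log(w * prior(a)) + v
            for a, w, v in zip(nodes, weights, _node_terms(x, t, log_fz, nodes))]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs)) - log_fz(x)

def glrt_log_lrt(x, t, log_fz, nodes):
    """Grid-based GLRT: peak of the clairvoyant log-LRT over the same nodes."""
    return max(_node_terms(x, t, log_fz, nodes)) - log_fz(x)

def log_std_normal(z):
    """Toy background model (assumption): standard Gaussian."""
    return -0.5 * sum(v * v for v in z) - 0.5 * len(z) * math.log(2.0 * math.pi)

# Usage with six midpoint nodes and the uniform prior (illustrative values).
nodes = [(i - 0.5) / 6 for i in range(1, 7)]
weights = [1.0 / 6] * 6
t = [3.0, 3.0]
bkg_pixel = [0.1, -0.2]
tgt_pixel = [0.7 * b + 0.3 * s for b, s in zip(bkg_pixel, t)]
print(bayes_log_lrt(tgt_pixel, t, log_std_normal, nodes, weights))
print(bayes_log_lrt(bkg_pixel, t, log_std_normal, nodes, weights))
```

When the comb amplitudes sum to one, the Bayesian statistic is a weighted average of the per-node likelihood ratios and therefore never exceeds the grid-based GLRT statistic for the same pixel; the two detectors differ in how they rank pixels, not in their scale.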

III. EXPERIMENTAL RESULTS
This section describes the experimental design and discusses the experimental results obtained.
A. Experimental Design
1) Algorithmic Comparison: The Bayesian detectors derived in this article from the RTM are compared in this section to their RTM-based GLRT counterparts. For the parametric background model in (6), the GLRT counterpart is a closed-form expression, given by the elliptically contoured finite target matched filter (EC-FTMF) [16], [22]. For the nonparametric background in (8), the RTM-based NP-GLRT is not solvable in closed form [19], but a numerical solution was implemented here, using the same sampling points $\{\alpha_i^{(*)}\}_{i=1}^{n_{\mathrm{pts}}}$ (with $*$ representing either the MP or GL rule) that were used for the numerical integration in the Bayesian algorithms. In the experiments with controlled fill factors (see Sections III-A4.a and III-B1), the clairvoyant versions of both parametric and nonparametric algorithms were introduced in the comparison as references. While a large variety of detectors based on different approaches have been proposed in the literature (e.g., [23], [24], [25], and [26]), in order to keep focus, we have restricted our attention to statistical LRT RTM-based detectors. As mentioned, the goal of this article is to investigate Bayesian methods as a robust, flexible, and admissible alternative to GLRT. The comparison of Bayes methods to other detectors will be the subject of future work.
2) Methodology Settings: For the parametric Bayesian detector, ML estimates $(\hat{\boldsymbol{\mu}}, \hat{\mathbf{C}})$ were used for $\boldsymbol{\mu}$ and $\mathbf{C}$, whereas a simple strategy based on the method of moments was employed for $\nu$. Specifically, as done in [27], the estimate $\hat{\nu}$ was obtained from a ratio between moments of the EC-t distribution, with the parameter $p$ chosen to be equal to one.
For the nonparametric detectors, because the VKDE in (8) employs a spherically symmetric kernel, the data were linearly transformed to have unit variances in all spectral directions before detector application, as suggested in [28] and in the KDE literature [17]. The kernel function was taken as the Epanechnikov kernel function, which exhibits several desirable properties [17]. The number $k$ of NNs in VKDE was chosen to satisfy $N^{1/3} < k \ll N^{1/2}$, as recommended in [19].
With regard to the integration to be computed in both Bayesian detectors, we employed the GL quadrature rule with a number of points $n_{\mathrm{pts}} = 6$. The resulting points were also employed to sample the likelihood function in the NP-GLRT detector. Investigations into the integration strategies are addressed in detail in Sections III-A4.c and III-B3.
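
A compact sketch of the nonparametric background model described above is given below. It assumes NumPy is available, uses hypothetical function names, and makes two labeled assumptions: the linear transform is implemented as a full whitening (one way to obtain unit variance in every direction), and the Epanechnikov kernel is normalized as $\kappa(\mathbf{u}) = c_d\,(1 - \|\mathbf{u}\|^2)$ for $\|\mathbf{u}\| \le 1$ with $c_d = (d+2)/(2V_d)$, where $V_d$ is the volume of the unit $d$-ball. For realistic $d$, the $r^{-d}$ factors should be handled in the log domain; this sketch skips that for brevity.

```python
import math
import numpy as np

def whiten(Z):
    """Linearly transform samples (rows of Z) to identity covariance."""
    Zc = Z - Z.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(Zc, rowvar=False))
    return Zc @ (evecs / np.sqrt(evals))

def knn_bandwidths(Z, k):
    """Euclidean distance from each sample to its k-th nearest neighbor."""
    sq = np.sum(Z * Z, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    return np.sqrt(np.sort(d2, axis=1)[:, k])  # column 0 is the sample itself

def vkde_pdf(z, Z, r):
    """Variable-bandwidth KDE with an Epanechnikov kernel, evaluated at z."""
    N, d = Z.shape
    unit_ball = math.pi ** (0.5 * d) / math.gamma(0.5 * d + 1.0)
    c_d = (d + 2.0) / (2.0 * unit_ball)
    u2 = np.sum((z - Z) ** 2, axis=1) / r ** 2
    kernel = c_d * np.clip(1.0 - u2, 0.0, None)
    return np.mean(kernel / r ** d)

# Usage on toy 3-band data (illustrative values only).
rng = np.random.default_rng(0)
Z_toy = whiten(rng.normal(size=(500, 3)) @ np.array([[2.0, 0.5, 0.0],
                                                     [0.0, 1.0, 0.3],
                                                     [0.0, 0.0, 0.2]]))
r_toy = knn_bandwidths(Z_toy, k=15)
print(vkde_pdf(np.zeros(3), Z_toy, r_toy))
```

The pairwise-distance matrix in `knn_bandwidths` grows as $N^2$, so a full image would call for a tree-based or chunked neighbor search instead.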
For the prior pdf functions for the Bayesian detectors, the uniform prior was employed in all experiments, allowing the comparison to be carried out at the "integral" versus "peak" level (see Sections III-A4.a and III-B1) and at the integration scheme level (see Sections III-A4.c and III-B3). In another experiment (see Sections III-A4.b and III-B2), Bayes flexibility with respect to context-based a priori information was explored by testing two further priors, both suitable to model a subpixel target detection scenario.
The first nonuniform prior function tested was a beta pdf [21]
$$f_\alpha(\alpha) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\,\alpha^{a-1}(1-\alpha)^{b-1}$$
where $a$ and $b$ are the positive shape parameters and $\Gamma(\cdot)$ is the gamma function. The beta pdf naturally models subpixel target detection scenarios because its support is $[0, 1]$. By adjusting the shape parameters, a variety of configurations may be obtained, acting not only on the mean value $E\{\alpha\} = a/(a+b)$ but also on the entire pdf shape.
We also tested the following prior, which follows a power law and puts more weight on small fill factors:
$$f_\alpha(\alpha) \propto \frac{1}{\alpha^{m}}$$
where $m > 0$ tunes the subpixel weighting. Note that even though this would be an improper prior over a continuous interval that included $\alpha = 0$, our finite-sample approach is not strictly improper since we use a finite sum of delta functions whose amplitudes scale like $1/\alpha^{m}$. Because the finite sum avoids $\alpha = 0$, the actual prior has a bounded integral.
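
To illustrate how such priors act once the integral is discretized, the sketch below evaluates a beta pdf and a power-law weighting at a set of sampling points and converts them into normalized delta-comb amplitudes. The function names and the parameter values in the usage lines are illustrative assumptions. Normalizing the amplitudes only rescales the detection statistic by a constant, so it does not change the ranking of pixels.

```python
import math

def beta_prior(alpha, a, b):
    """Beta pdf with positive shape parameters a and b, for 0 < alpha < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(alpha)
                    + (b - 1.0) * math.log1p(-alpha))

def power_law_prior(alpha, m):
    """Unnormalized power-law weighting 1 / alpha^m, for alpha > 0."""
    return alpha ** (-m)

def comb_amplitudes(nodes, weights, prior):
    """Normalized amplitudes of the delta comb implied by a quadrature rule."""
    raw = [w * prior(a) for a, w in zip(nodes, weights)]
    total = sum(raw)
    return [v / total for v in raw]

# Usage with six midpoint nodes (illustrative shape parameters).
nodes = [(i - 0.5) / 6 for i in range(1, 7)]
weights = [1.0 / 6] * 6
amp_beta = comb_amplitudes(nodes, weights, lambda a: beta_prior(a, 0.5, 50.0))
amp_power = comb_amplitudes(nodes, weights, lambda a: power_law_prior(a, 50.0))
print(amp_beta)
print(amp_power)
```

Because none of the sampling points sits at $\alpha = 0$, the power-law amplitudes are all finite and their sum is bounded, which is the sense in which the discretized prior is not strictly improper.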
3) Performance Measures: Target detection performance is typically measured by making reference to two basic quantities that leverage ground-truth data of target positions and are evaluated by thresholding the detection test statistic, namely, the DR and the FAR. DR is the ratio between the number of target pixels exceeding the threshold in the detection statistic (correct detections) and the total number of target pixels in the scene (the higher the better). FAR is the ratio between the number of nontarget (background) pixels exceeding the threshold (false detections) and the total number of background pixels (the lower the better). In our experiments, we built the empirical receiver operating characteristic (ROC) curves by plotting the DR versus the FAR, both evaluated on the detection statistic while varying the detection threshold. ROC curves embody the fundamental tradeoff in target detection performance, i.e., DR and FAR cannot both be optimized, and provide an overall perspective on detection performance.
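
The DR and FAR definitions above translate directly into code. The sketch below is a minimal pure-Python illustration with hypothetical function names: it builds the empirical ROC curve from two lists of detection statistics, reads off the smallest FAR at which a requested DR is reached, and computes the area under the empirical curve. The trapezoidal sum used for the area coincides with the stair-step area whenever no target and background pixel share the same statistic value.

```python
from bisect import bisect_left

def empirical_roc(target_scores, background_scores):
    """Return (far, dr) lists obtained by sweeping the threshold downward."""
    tgt = sorted(target_scores)
    bkg = sorted(background_scores)
    thresholds = sorted(set(tgt) | set(bkg), reverse=True)
    far, dr = [0.0], [0.0]
    for th in thresholds:
        # A pixel is declared a detection when its statistic is >= th.
        dr.append((len(tgt) - bisect_left(tgt, th)) / len(tgt))
        far.append((len(bkg) - bisect_left(bkg, th)) / len(bkg))
    return far, dr

def far_at_dr(far, dr, target_dr):
    """Smallest FAR at which the detection rate reaches target_dr."""
    return min(f for f, d in zip(far, dr) if d >= target_dr)

def area_under_roc(far, dr):
    """Area under the empirical ROC curve (trapezoidal sum)."""
    return sum((far[i] - far[i - 1]) * (dr[i] + dr[i - 1]) / 2.0
               for i in range(1, len(far)))

# Usage on toy statistics (illustrative values).
far, dr = empirical_roc([0.9, 0.8, 0.3], [0.7, 0.2, 0.1, 0.05])
print(far_at_dr(far, dr, 0.9), area_under_roc(far, dr))
```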
We also extracted scalar-valued summary performance measures from the ROC curves, specifically the FAR evaluated for several given values of DR (hereinafter, FAR@DR) and the area under the ROC curve (AUC), computed in two different ways. The empirical ROC curve, with DR and FAR plotted for every possible threshold, is typically a stair-step shape (this is strictly true if all pixels have distinct detection statistic values), and our standard AUC was evaluated as the area under this stair-step curve. It has been noted in [29], however, that ROC curves should theoretically be "convex-cap," so we also report convex-AUC, which is the area under the upper convex hull of the empirical ROC curve. While AUC and some of its variants are widely used in the target detection literature, we consider the FAR-based statistics as more informative for scenarios in which the actual targets are relatively rare since it is the low-FAR region of the ROC curve that is more important in that case. Besides ROC curves and the scalar-valued summary performance metrics, we also provide images of detection statistics as well as their histograms, so as to provide effective visual evidence of algorithm behavior.
Fig. 1. Scheme of the matched-pair method for target implanting. A matched pair of images is used: the first acts as a background image (embodying the null hypothesis $H_0$), whereas the second (representing the alternative hypothesis $H_1$) is a replica of the first image but with targets implanted in each pixel with a given fill factor $\alpha_{\mathrm{ff}}$. The detection algorithm is applied to the pair of images. A ROC curve is obtained by thresholding the histograms of the corresponding detection statistics.
4) Subpixel Target Detection Experiments: Three types of experiments were carried out, which are described in the following subsections. The first set of experiments featured a real hyperspectral image and a subpixel target detection scenario reproduced with controlled fill factors. Then, we further validated the methods on a real hyperspectral image encompassing two ground-truth subpixel target detection scenarios. Finally, the third set of experiments was aimed at providing insights about the Bayes integration scheme. All three sets of experiments were carried out on hyperspectral data fully available to the scientific community.
a) Experiments with controlled fill factor: In the first experiment, we reproduced a subpixel target detection scenario with controlled fill factors by means of the matched-pair method for target implanting [30], [31]. This allows the algorithms to be tested on a real hyperspectral image but in a controlled environment that enables a large number of target pixels to be tested and, thus, statistically reliable ROC curves to be drawn. As shown in Fig. 1, the original hyperspectral image is used as "background image," producing N instances of x|H 0 ; then, a second image is obtained by implanting a given target spectrum in each of the N image pixels with a user-specified fill factor α ff , thus obtaining N instances of the test pixel under the alternative "target-present" hypothesis. By running the desired set of algorithms on both images and thresholding the histograms of the corresponding test statistics, ROC curves can be drawn.
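
The matched-pair procedure can be sketched as follows, assuming the implanting follows the replacement model (each implanted pixel is $(1-\alpha_{\mathrm{ff}})$ times the original pixel plus $\alpha_{\mathrm{ff}}$ times the target spectrum). The function names are hypothetical, and the detector in the usage lines is a simple dot product with the target, used only as a stand-in for the detectors studied here.

```python
def implant_target(image_pixels, target, alpha_ff):
    """Return the H1 image: every pixel mixed with the target at fill factor alpha_ff."""
    return [
        [(1.0 - alpha_ff) * z_b + alpha_ff * t_b for z_b, t_b in zip(pixel, target)]
        for pixel in image_pixels
    ]

def matched_pair_scores(image_pixels, target, alpha_ff, detector):
    """Apply the same detector to the background image and to its implanted replica."""
    h0_scores = [detector(p) for p in image_pixels]
    h1_scores = [detector(p) for p in implant_target(image_pixels, target, alpha_ff)]
    return h0_scores, h1_scores

# Usage on a toy two-pixel, two-band "image" (illustrative values).
toy_image = [[1.0, 2.0], [3.0, 4.0]]
toy_target = [10.0, 10.0]
toy_detector = lambda p: sum(a * b for a, b in zip(p, toy_target))  # stand-in detector
h0, h1 = matched_pair_scores(toy_image, toy_target, 0.5, toy_detector)
print(h0, h1)
```

Since every pixel contributes one background score and one target score, the two score lists have the same length $N$ and can be thresholded jointly to draw a ROC curve.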
The hyperspectral image employed in this experiment is shown in Fig. 2(a), and the spectrum used for implanting, which was measured over a green wooden panel [not present in the image of Fig. 2(a)], is plotted in Fig. 2(b). Both image and target spectrum were taken from the SHARE 2012 collection campaign data [32], [33] acquired over Avon, NY, USA, the characteristics of which are summarized in Table I. Fig. 2(c) plots the target spectrum superimposed on spectra extracted from randomly selected image pixels.
The image consists of 250 × 419 pixels and, based on recommendations from previous works extensively examining these data [34], [35], a subset of 50 bands in the 218-643-nm range was employed for processing.
As noted earlier, the uniform prior was employed in the Bayesian algorithms for this experiment. Numerical integration was conducted with the GL rule with $n_{\mathrm{pts}} = 6$. The number of NNs employed for VKDE was $k = 100$, which satisfies the practical recommendation mentioned above.
b) Real subpixel target detection scenario: The SHARE 2012 subpixel target detection experiment [32], [34] was leveraged for further validation of the methods. We make specific reference to the hyperspectral image acquired over Avon, NY, USA, where about 100 wooden panels (each 20" by 12") were deployed in two sets of 50 panels, with each set arranged in such a way that the target fill factor was (much) lower than 20%. The first set of targets consisted of green wooden panels arranged on a patch of grass, whereas the other set featured yellow wooden panels arranged on a basketball court. The hyperspectral image with the two sets of targets is shown in Fig. 3(a), and ground-truth photographs of the green and yellow targets are shown in Fig. 3(b) and (c), respectively, while Fig. 3(d) and (e) shows the target reflectance spectra. The hyperspectral image consisted of 150 × 400 pixels, and the same band subset as in the previous experiment [34], [35] was employed for processing. The image characteristics are summarized in Table I. All three priors described above were employed in this experiment. Context-based a priori information was leveraged to select prior parameters. Specifically, the beta pdf shape parameters were selected as $a = 0.5$ and $b = d$, whereas $m = d$ was taken for the power-law prior; both choices were made to craft pdfs strongly favoring subpixel scenarios. Use of these priors allowed us to inject our context-based a priori knowledge of target size into the Bayesian approach.
Numerical integration was performed with the GL rule applied with $n_{\mathrm{pts}} = 6$ (see Section III-B3).
The number of NNs in VKDE was taken as in the previous experiment ($k = 100$), which is consistent with the aforementioned practical recommendation. The performance is examined by means of ROC curves, FAR at given DRs, and AUC metrics.

c) Examining different rules for the integration scheme:
In this set of experiments, we examined two strategies for numerically estimating the Bayesian integral [i.e., the numerator in (4)]. To this end, we used the same data as in the first experiment, specifically the matched image pair obtained with a fill factor of $\alpha_{\mathrm{ff}} = 0.05$. Bayesian algorithms were applied with the uniform prior, and $k = 100$ was selected for the VKDE.
Both MP and GL rules were examined in terms of integral approximation capability and impact on the detection performance. Specifically, we evaluated numerical integration with both rules and for both parametric and nonparametric methods for $n_{\mathrm{pts}} \in \{2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 50\}$. The approximation capability was evaluated by examining log-integral curves, plotted with respect to $n_{\mathrm{pts}}$, and evaluated over 150 random pixels extracted from both images of the pair. Here, we do not seek the most accurate approximation possible but an approximation good enough to enable computational efficiency (low $n_{\mathrm{pts}}$) and good detection performance. For the latter, we used the FAR obtained for a DR of 0.5 (FAR@DR=0.5) evaluated over the image pair and plotted against $n_{\mathrm{pts}}$.

B. Results Discussion
Here, we present and discuss the results for the three sets of experiments.
1) Results for the Experiments With Controlled Fill Factor: ROC curves for the matched-pair experiments are shown in Fig. 4(a) and (b) for parametric and nonparametric algorithms, respectively, where clairvoyant versions of the algorithms are also displayed. ROC curves are here parameterized with respect to the target fill factor α ff .
The graphs show that the parametric methods exhibit, in the low-FAR region of greatest interest, a larger difference in performance between Bayes (with uniform prior) and GLRT, in favor of the latter (e.g., for $\alpha_{\mathrm{ff}} = 0.075$ and DR = 0.8, the GLRT provided a FAR around one order of magnitude lower than that of Bayes). In contrast, the nonparametric approach shows very similar performance for the Bayes (with uniform prior) and GLRT methods.
As expected, the clairvoyant detector (see footnote 1) acted as a reference, mostly representing an upper bound on the performance for a given $\alpha_{\mathrm{ff}}$ value and background modeling approach. Although comparing parametric versus nonparametric approaches is not the aim of this work, the clairvoyant curves show that for the smallest $\alpha_{\mathrm{ff}}$ values, the parametric background model was better than the nonparametric model at low FAR, whereas the nonparametric model performed better at high DR. However, for the operational GLRT and Bayes algorithms, the parametric methods generally gave better performance on these data.
Summary metrics of FAR values obtained at some DR values, specifically FAR@DR={0.7, 0.8, 0.9}, are reported in Table II. FAR@DR metrics provide snapshots of the ROC curves for values of the detection threshold such that DR = {0.7, 0.8, 0.9} and confirm the behavior observed in the curves. Specifically, boldface FAR@DR values in the table represent those FAR@DR values for which (for the same background model and α ff ) the difference between GLRT and Bayes is at least an order of magnitude. This occurs for the parametric methods in favor of GLRT, mostly for the central α ff values, whereas for the nonparametric case, GLRT and Bayes provide more similar results. AUC statistics are also reported in Table II, which summarizes the overall algorithm performance throughout the entire ROC curve. Boldface AUC values represent the highest AUC values between GLRT and Bayes for a given background model and target fill factor α ff . While ROC curves and FAR@DR metrics reveal a generally better behavior of GLRT in the low FAR region, AUC values show that, in most cases and especially for the parametric case, Bayesian algorithms exhibit better overall performance, though the difference is small.
From this first set of experiments, the Bayes algorithm with uniform prior provided generally good performance, comparable to the GLRT. It should be noted that for this experiment, we have only employed the uniform prior and, thus, we have not fully explored the potential for leveraging a priori knowledge through the prior pdfs. In the following section, prior pdfs are adopted.
2) Detection Performance Validation Results for the Real Subpixel Target Detection Scenario: Before examining quantitative performance for the real subpixel target detection experiments, we first illustrate the algorithm performance by showing detection statistic images for both green (see Fig. 5) and yellow (see Fig. 6) target panels for (a) nonparametric methods and (b) parametric methods. To ease visual comparison and interpretation of the images, we rescaled the detection statistics using a log-ranked transform. Specifically, each rescaled statistic shows, for each image pixel $z$, the transformed value $-\log_{10}\{R(z)/N\}$, where $R(z)$ is the ranking that the generic detector achieved at pixel $z$ compared to all other image pixels, and $N$ is the number of pixels in the image. More directly, a pixel value $\sigma$ of the rescaled statistic means that only a fraction $10^{-\sigma}$ of the image pixels are at least as strong as it is. Because the transform is monotonic, it has no effect on the performance of the statistic (the ROC curves associated with a statistic and a log-ranked transform of the statistic are identical), but it makes it easier to visualize performance (since only the strongest pixels will be dark) and to compare the performance (since the two statistics being compared are scaled the same way).

Log-ranked detection statistics for the green targets with nonparametric methods [see Fig. 5(a)] provide immediate visual evidence of the benefits of leveraging the full potential of the Bayesian approach by crafting a suitable prior: while the targets stand out well in all the detection statistic images, more false alarms can be seen in the statistics of GLRT and uniform Bayes than in those of nonuniform Bayes. For parametric methods on green targets, on the other hand, log-ranked statistics exhibit fewer visual differences [see Fig. 5(b)]. For the yellow targets detected with nonparametric methods shown in Fig. 6(a), where the targets stand out comparably well in all statistics, fewer false alarms are evident in nonuniform Bayes statistics. Also, for the yellow targets detected with parametric methods shown in Fig. 6(b), Bayesian statistics are very similar to one another, whereas the GLRT statistic is apparently less sensitive to background structures but also shows targets that do not stand out as well as in the Bayesian statistics.

Figure caption (log-ranked detection statistic images): Ground truth is depicted on the left; target pixels are black, background pixels are light, and agnostic pixels (which are not counted as target or background because they are so close to the targets that the ground truth for these pixels is less certain) are gray. Log-ranked statistics correspond to a rescaling of detection statistics to a common distribution. The highest pixel in the detection statistic (ranking = 1) maps into $\log_{10}(N/1) \approx 5.02$, where $N = 250 \times 419$ is the number of pixels in the image. The tenth highest pixel in the detection statistic maps into $\log_{10}(N/10) \approx 4.02$, and the 100th pixel maps into $\log_{10}(N/100) \approx 3.02$. The weakest pixel (with the smallest detection statistic: ranking = $N$) maps into $\log_{10}(N/N) = 0$.

Footnote 1: In a small number of cases, the clairvoyant detector was slightly outperformed by another detector, even though the clairvoyant is in principle the optimal detector. But this principle strictly applies only if the data distribution exactly follows the background model. For real data, which we used here, this model may only approximate the true background distribution.
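The log-ranked rescaling described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `log_ranked`, the arbitrary tie-breaking, and the synthetic Gaussian image standing in for a detector output are all assumptions made for the example.

```python
import numpy as np

def log_ranked(stat):
    """Map a detection-statistic image to -log10(rank / N).

    rank = 1 for the strongest pixel and rank = N for the weakest.
    Ties are broken by sort order, which is an assumption of this sketch.
    """
    flat = np.asarray(stat, dtype=float).ravel()
    n = flat.size
    order = np.argsort(-flat, kind="stable")   # pixel indices, strongest first
    ranks = np.empty(n, dtype=float)
    ranks[order] = np.arange(1, n + 1)         # rank 1 = strongest pixel
    return (-np.log10(ranks / n)).reshape(np.shape(stat))

# Synthetic stand-in for a detector output with the image size quoted in the caption.
rng = np.random.default_rng(0)
demo = rng.normal(size=(250, 419))
rescaled = log_ranked(demo)
print(rescaled.max(), rescaled.min())          # strongest and weakest pixel values
```

Because only the ranks enter the transform, any two detectors rescaled this way share the same value distribution, which is what makes their images directly comparable.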
The quantitative performance can be observed in terms of the ROC curves plotted in Fig. 7 for the green [(a) and (b)] and yellow targets [(c) and (d)]. Results for the parametric algorithms are shown in Fig. 7(a) and (c), while those for nonparametric algorithms are shown in Fig. 7(b) and (d). As most of the panels show, using a nonuniform prior with the Bayesian algorithm improves the performance regardless of the background modeling approach. Except for the detection of the green panels with parametric algorithms [see Fig. 7(a)], where all methods provide similarly good performance, the Bayesian method with nonuniform priors provided the best performance across the two types of targets and background modeling approaches. The uniform prior led to poorer performance than either of the nonuniform priors. These results are confirmed by the AUC and convex-AUC values reported in Table III for the various algorithms and the two kinds of targets. In most cases, AUC values (of both kinds) for the Bayesian algorithms with nonuniform priors are notably higher than those of the GLRT and of Bayes with a uniform prior.
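For readers who want to reproduce this kind of summary, the following sketch computes an empirical ROC curve and its plain AUC from per-pixel detection scores and a ground-truth mask. The function names, the synthetic Gaussian scores, and the assumption of untied scores are illustrative choices; the convex-AUC variant reported in Table III is not covered here.

```python
import numpy as np

def roc_curve(scores, is_target):
    """Return (FAR, DR) arrays swept over all thresholds, from (0, 0) to (1, 1)."""
    scores = np.asarray(scores, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    order = np.argsort(-scores, kind="stable")   # strongest pixels first
    hits = np.cumsum(is_target[order])           # targets detected so far
    falses = np.cumsum(~is_target[order])        # false alarms so far
    dr = np.concatenate(([0.0], hits / is_target.sum()))
    far = np.concatenate(([0.0], falses / (~is_target).sum()))
    return far, dr

def auc(far, dr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(far) * (dr[1:] + dr[:-1]) / 2.0))

# Synthetic example: 5000 background scores and 50 shifted target scores.
rng = np.random.default_rng(1)
scores = np.concatenate([rng.normal(0.0, 1.0, 5000), rng.normal(2.0, 1.0, 50)])
labels = np.concatenate([np.zeros(5000, bool), np.ones(50, bool)])
far, dr = roc_curve(scores, labels)
print("AUC =", auc(far, dr))
```

Agnostic pixels, which the ground truth excludes from both classes, would simply be dropped from `scores` and `labels` before calling these functions.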
It should be noted that the two nonuniform priors provided nearly identical performance. This is not surprising, for two main reasons: first, the same type of context-based a priori information (namely, that small fill factors were expected) was exploited to craft both prior pdfs; second, more than the $n_{\text{pts}} = 6$ points used for numerical integration are likely needed to resolve the differences between two pdfs that both weight low $\alpha$ values so strongly.
More importantly, this outcome shows that Bayes is worth applying even if the "optimal" prior cannot be derived: as long as the available a priori information is injected into the process, even "rough" priors expressing the importance associated with small fill factors can benefit the final performance.
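To make the role of the prior in a six-point quadrature concrete, the sketch below evaluates two priors that favor small fill factors at the Gauss-Legendre nodes on $[0, 1]$ and reports the relative weight each node receives. The shape parameters (`a`, `b`, `p`) are hypothetical placeholders, not the values used in the experiments, and the priors are left unnormalized because only relative weights are compared.

```python
import numpy as np

def gl_nodes_unit_interval(n_pts):
    """Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, 1]."""
    x, w = np.polynomial.legendre.leggauss(n_pts)
    return 0.5 * (x + 1.0), 0.5 * w

def beta_prior(alpha, a=1.0, b=10.0):
    """Unnormalized beta density (hypothetical a, b); favors small alpha when b > a."""
    return alpha ** (a - 1.0) * (1.0 - alpha) ** (b - 1.0)

def power_law_prior(alpha, p=1.0):
    """Improper power-law prior alpha**(-p) (hypothetical p); finite at the GL
    nodes because those nodes never include alpha = 0."""
    return alpha ** (-p)

alpha, w = gl_nodes_unit_interval(6)
masses = {}
for name, prior in (("beta", beta_prior), ("power-law", power_law_prior)):
    mass = w * prior(alpha)
    masses[name] = mass / mass.sum()     # relative weight given to each node
    print(name, np.round(masses[name], 3))
```

With only six nodes, both priors put most of their relative weight on the node nearest $\alpha = 0$, which is one way to see why two differently shaped priors can yield similar detectors at small $n_{\text{pts}}$.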
3) Insights Into the Integration Scheme: Results of experiments exploring insights into the integration scheme for Bayesian algorithms are shown in Fig. 8(a)-(f).
Average log-integral curves are shown in Fig. 8(a)-(d), where the average natural logarithm of the integral at the numerator of RTM Bayes in (6) is plotted against the number of points $n_{\text{pts}}$. The integrals are averaged over 150 randomly selected pixels extracted from the background $H_0$ image [see Fig. 8 …]. […] Results for the parametric algorithms are shown in Fig. 8(a) and (c), while those for nonparametric algorithms are shown in Fig. 8(b) and (d). The MP and GL rules are identified by the blue and red colors, respectively.
As can be observed, the values of the integrals for parametric and nonparametric algorithms are different; this is not a serious concern, however, because it is the ratio of likelihoods that defines the detectors. What really matters here is that the trends of the curves are similar. More specifically, regardless of the background modeling approach, the MP and GL integration schemes both produced integrals that converged to the same value, and did so for both the background and the target-implanted images. In general, the GL strategy converged faster than MP, and this advantage was more pronounced in the parametric case. Also, regardless of the scheme, the convergence was faster for the nonparametric algorithms. All log-integral curves exhibited a mostly monotonic increase with $n_{\text{pts}}$, although some slight oscillations can be observed for small $n_{\text{pts}}$ in the GL-based schemes, due to the nonuniform point spacing that changes with $n_{\text{pts}}$.
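The convergence behavior discussed above can be reproduced on a toy problem. The sketch below assumes that MP denotes the composite midpoint rule and uses a hypothetical smooth integrand, $e^{-5\alpha}$ on $[0, 1]$, as a stand-in for the likelihood-times-prior integrand in (6); neither choice is taken from the paper.

```python
import numpy as np

def midpoint_rule(f, n_pts):
    """Composite midpoint rule on [0, 1] with n_pts equal subintervals."""
    alpha = (np.arange(n_pts) + 0.5) / n_pts
    return float(np.sum(f(alpha)) / n_pts)

def gauss_legendre_rule(f, n_pts):
    """n_pts-point Gauss-Legendre rule mapped from [-1, 1] to [0, 1]."""
    x, w = np.polynomial.legendre.leggauss(n_pts)
    return float(np.sum(0.5 * w * f(0.5 * (x + 1.0))))

def toy_integrand(alpha):
    # Hypothetical stand-in for likelihood times prior; not the paper's integrand.
    return np.exp(-5.0 * alpha)

exact = (1.0 - np.exp(-5.0)) / 5.0       # closed-form value of the toy integral
for n_pts in (2, 4, 6, 8):
    mp = midpoint_rule(toy_integrand, n_pts)
    gl = gauss_legendre_rule(toy_integrand, n_pts)
    print(n_pts, np.log(mp), np.log(gl), np.log(exact))
```

Both rules evaluate the integrand only at interior points of $[0, 1]$ and use positive weights, so each approximation remains a positively weighted sum of likelihoods.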
As noted earlier, log integrals should be examined together with detection performance curves, such as those shown in Fig. 8(e) and (f). Here, the negative base-ten logarithm of the FAR obtained at a DR of 0.5, $-\log_{10}(\mathrm{FAR@DR}=0.5)$, evaluated from all the pixels in the matched image pair, is plotted versus $n_{\text{pts}}$ as a summary performance measure (the higher the better) for parametric [see Fig. 8(e)] and nonparametric [see Fig. 8(f)] Bayesian algorithms. Here too, both the MP- and GL-based integration schemes led to curves converging to the same $-\log_{10}(\mathrm{FAR@DR}=0.5)$ value. In general, the nonparametric algorithms exhibited a faster convergence than the parametric ones. Overall, the GL-based scheme converged faster than MP, with $n_{\text{pts}} = 6$ almost reaching the convergence value for both nonparametric and parametric cases. It should also be noted that the GL-based scheme allowed for higher $-\log_{10}(\mathrm{FAR@DR}=0.5)$ values than those obtained with the MP-based scheme.
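The summary measure used here can be computed directly from detection scores and a target mask. The sketch below is one plausible reading of the metric: the threshold is set at the loosest value that still detects the required fraction of target pixels, and a FAR of zero is reported as infinity. The function name and the small hand-built example are assumptions for illustration.

```python
import numpy as np

def neg_log10_far_at_dr(scores, is_target, dr=0.5):
    """Return -log10 of the false alarm rate at the given detection rate."""
    scores = np.asarray(scores, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    tgt = np.sort(scores[is_target])[::-1]           # target scores, strongest first
    k = max(1, int(np.ceil(dr * tgt.size)))          # number of targets to detect
    threshold = tgt[k - 1]                           # loosest threshold reaching that DR
    bkg = scores[~is_target]
    far = np.count_nonzero(bkg >= threshold) / bkg.size
    return float(-np.log10(far)) if far > 0 else float("inf")

# Hand-built example: 100 background scores (0..99) and 4 target scores.
bkg_scores = np.arange(100, dtype=float)
tgt_scores = np.array([89.5, 10.5, 95.5, 50.5])
scores = np.concatenate([bkg_scores, tgt_scores])
labels = np.concatenate([np.zeros(100, bool), np.ones(4, bool)])
print(neg_log10_far_at_dr(scores, labels))           # 10 of 100 background pixels exceed 89.5
```

Higher values mean fewer false alarms at the same detection rate, which matches the "higher is better" reading of Fig. 8(e) and (f).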
Accounting for both log-integral and performance curves, we decided to adopt the GL-based scheme throughout the experiments; with its faster convergence and generally lower FAR, it allowed better performance to be obtained with smaller $n_{\text{pts}}$ values than the MP-based scheme. We chose $n_{\text{pts}} = 6$ as a tradeoff between performance and computational efficiency.

IV. CONCLUSION
Bayesian algorithms were developed and explored for solid subpixel target detection in hyperspectral imagery. These algorithms provide admissible alternatives to GLRT-based solutions to the composite hypothesis testing problem. To fully benefit from the flexibility offered by the Bayesian approach, we have employed numerical integration schemes that maintain the admissibility property but do not depend on simplifying or approximating the likelihood expression, and do not limit the choice of prior based on considerations of closed-form integrability. This framework gives the user freedom to choose the prior, based either upon available a priori contextual information or upon whatever relevance the user chooses to attach to given target sizes.
We have specifically developed Bayesian detectors based on the RTM for a parametric (elliptically contoured multivariate t distribution) and a nonparametric (kernel-based data-driven distribution) background model. Two strategies for numerical integration have been analyzed and compared in terms of both their accuracy in approximating the integral and their performance in detecting the targets.
Experimental results over real hyperspectral images featuring both synthetic and real subpixel target detection scenarios have shown that a suitable prior pdf indeed boosts the Bayesian detection performance. Whereas noninformative (i.e., uniform) priors provided Bayesian performance similar to that of the GLRT, injecting context-based information through the priors considerably improved the performance, and to a greater extent with the nonparametric approach. Very similar performance improvements were obtained using quite different prior pdfs (a proper beta-function prior and an "improper" power-law prior), showing that even rough expressions of prior information are still effective and can outperform a noninformative prior. This flexibility of Bayesian algorithms from a target detection performance perspective provides impetus for further research. It is our view that substantial benefits may be obtained by suitably engineering prior functions to optimize the desired performance criteria.