SECTION I

Digital holography can be used to record both the phase and amplitude distribution associated with the light diffracted by a 3-D object with conventional image sensors [1]. Different interferometric techniques such as off-axis interferometry, phase-shifting interferometry, or even Gabor holography have allowed holographic methods to be extended to the digital domain when combined with CCD or CMOS cameras [2], [3], [4], [5], [6]. An advantage of digital recording is that it permits the application of digital processing techniques, which is of great importance in 3-D imaging and 3-D image processing [7], [8], [9]. In particular, digital holography has been applied very efficiently to 3-D object recognition [10]. In a different context, photon counting techniques have been widely used for imaging applications under photon-starved conditions. Research fields such as night vision, information security, radiology, and stellar imaging, to cite just a few, have benefited from photon counting [11], [12], [13]. These imaging approaches have been made possible by new sensitive receivers that can detect single photons by high-gain techniques. Images are therefore recorded one photo-count at a time. Some of these methods have been extended to 3-D imaging and 3-D object recognition, mainly by applying photon counting to integral imaging [14], [15], [16], [17], [18]. Recently, some digital holographic techniques have been developed in the photon counting regime with remarkable results, but only for the reconstruction of 2-D images [19], [20]. In [19], phase-shifting digital holograms of low-resolution 2-D images are recorded with ultraweak illumination and successfully reconstructed. In [20], holograms are captured with parallel phase-shifting techniques and the technique is tested thoroughly on simulated holograms of 2-D images.
In this paper, we present results of the performance analysis of a technique for 3-D object recognition using digital holography in the photon counting domain. The digital holograms are calculated by applying phase-shifting techniques to a set of interferograms. These interferograms are obtained experimentally and photon counting conditions are simulated. The 3-D recognition ability of our system is analyzed as a function of the total number of photons by using a maximum-likelihood (ML) approach adapted to one-class classification problems. The likelihood is modeled assuming that the data of the class to be recognized follow a Gaussian distribution, and the centroid of this Gaussian is taken to be the one with the highest value in a mixture of two Gaussians. The behavior of the system is studied in terms of the 3-D position of the reference 3-D object. Section 2 describes the technique used to obtain the holograms and briefly comments on the propagation in the Fresnel approximation. Section 3 describes the generation of the photon counting images, and Section 4 the classification strategy using an ML approach and the one-class classification characteristics of the problem. Section 5 presents and discusses the recognition results under photon counting conditions. Conclusions are presented in Section 6.

SECTION II

A phase-shifting interferometer is used to record the Fresnel digital hologram of a 3-D scene containing one or several objects. The interferometer is based on a Mach–Zehnder architecture [see Fig. 1(a)]. The object beam illuminates the 3-D input object, and the reference beam forms an on-axis interference pattern with the light diffracted by the object onto the CCD camera. A set of four interferograms $I_{p}({\bf r})$, with $p = 1, 2, 3, 4$, is recorded, each adding a different constant phase delay between the signal and the reference beam, where ${\bf r}$ is the vector denoting the transversal coordinates. To this end, the reference beam travels through a phase shifter, consisting of two rotating retarder plates, which modulates the phase of the reference beam with the phase shifts $\Delta\varphi_{1} = 0$, $\Delta\varphi_{2} = -\pi/2$, $\Delta\varphi_{3} = -\pi$, $\Delta\varphi_{4} = -(3\pi/2)$. If we denote by $G({\bf r})$ the complex amplitude distribution of the light field diffracted by the object at the output plane, i.e., our digital hologram, then the measurements made by the CCD camera can be written as $$I_{p}({\bf r}) = \left\vert G({\bf r}) + R \cdot e^{i(\varphi_{0} + \Delta\varphi_{p})}\right\vert^{2}\eqno{\hbox{(1)}}$$ where we have assumed a constant reference amplitude $R$ and phase $\varphi_{0}$. From this equation, it can be shown that the hologram $G({\bf r})$ can be evaluated by the following operation: $$G({\bf r}) = {1 \over 4} \cdot \left \{I_{1}({\bf r}) - I_{3}({\bf r}) + i \cdot \left[I_{2}({\bf r}) - I_{4}({\bf r})\right]\right\}\eqno{\hbox{(2)}}$$ The resulting complex hologram $G({\bf r})$ allows numerical reconstruction of the complex amplitude distribution $O({\bf r}, z)$ generated by the 3-D object at planes located at a distance $z$ from the sensor. The reconstruction can be obtained by computing a discrete Fresnel integral or by using the propagation transfer function method, i.e.,
$$O({\bf r}_{i}, z) = F^{-1}\left(F\left[G({\bf r}_{i})\right] \cdot \exp\left\{-i\pi\lambda z\left[{u^{2} \over (\Delta xN_{x})^{2}} + {v^{2} \over (\Delta yN_{y})^{2}}\right]\right\}\right)\eqno{\hbox{(3)}}$$ where $F$ denotes the fast Fourier transform, $(u, v)$ are discrete spatial frequency variables, ${\bf r}_{i}$ denotes the discrete transversal spatial position in both the CCD plane and the output plane, $N_{x}$ and $N_{y}$ are the number of samples in the $x$ and $y$ directions, and $\lambda$ is the wavelength of the light source. Note that negative values of $z$ are to be used to simulate backward propagation. In this approach, the resolution at the output plane is the same for any propagation distance $z$, and is given by the resolution at the input plane, i.e., the pixel size $(\Delta x, \Delta y)$ of the CCD sensor. Fig. 1(b) shows a grey-scale visualization of the Fresnel reconstructed scene of the die (in particular, of $\vert O({\bf r}_{i}, z)\vert^{2}$) for a distance of $z = -345\ \hbox{mm}$ from the object to the camera (see Section 5 for further details).
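As an illustration of Eqs. (2) and (3), the following is a minimal NumPy sketch, not the code used for the experiments. The function names, the synthetic random test field, and the values chosen for $R$ and $\varphi_{0}$ are assumptions made for this example only. The frequency variables $(u, v)$ are taken to be the integer FFT indices, which is one reading of Eq. (3). The final check compares magnitudes only, because the overall constant factor and the sign convention of the recovered phase are not fixed here.

```python
import numpy as np

def hologram_from_interferograms(i1, i2, i3, i4):
    """Four-step phase-shifting combination, following Eq. (2)."""
    return 0.25 * ((i1 - i3) + 1j * (i2 - i4))

def fresnel_propagate(g, z, wavelength, dx, dy):
    """Transfer-function propagation, following Eq. (3).

    g          : complex hologram, shape (Ny, Nx)
    z          : propagation distance in metres (negative = backward)
    wavelength : wavelength in metres
    dx, dy     : pixel pitch in metres
    """
    ny, nx = g.shape
    # Integer frequency indices in FFT order (assumed meaning of u, v).
    u = np.fft.fftfreq(nx) * nx
    v = np.fft.fftfreq(ny) * ny
    uu, vv = np.meshgrid(u, v)  # both of shape (Ny, Nx)
    transfer = np.exp(-1j * np.pi * wavelength * z *
                      (uu**2 / (dx * nx)**2 + vv**2 / (dy * ny)**2))
    return np.fft.ifft2(np.fft.fft2(g) * transfer)

# Synthetic example (hypothetical field, not the die hologram).
rng = np.random.default_rng(0)
g_true = rng.normal(size=(64, 64)) + 1j * rng.normal(size=(64, 64))
R, phi0 = 2.0, 0.3  # assumed reference amplitude and phase
shifts = [0.0, -np.pi / 2, -np.pi, -3 * np.pi / 2]
frames = [np.abs(g_true + R * np.exp(1j * (phi0 + s)))**2 for s in shifts]

g = hologram_from_interferograms(*frames)
o = fresnel_propagate(g, z=-0.345, wavelength=514.5e-9, dx=9e-6, dy=9e-6)
irradiance = np.abs(o)**2  # quantity displayed in the reconstructions

# The recovered magnitude equals R * |G| for this four-step scheme.
print(np.allclose(np.abs(g), R * np.abs(g_true)))
```

Because the transfer function has unit modulus, propagating by $z$ and then by $-z$ returns the input field, which gives a simple sanity check for any implementation of Eq. (3).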

SECTION III

The statistical model for a photon counting detector is assumed to be a Poisson distribution because the average number of photons per pixel is low. The probability of counting $k$ photons in a time interval $\tau$ can be shown to be Poisson distributed [21]. In particular, the probability distribution follows the equation: $$P_{d}(k; {\bf r}, \tau) = {\left[a({\bf r})\tau\right]^{k}e^{-a({\bf r})\tau} \over k!}, \qquad k = 0, 1, 2, \ldots\eqno{\hbox{(4)}}$$ where $k$ is the number of photons counted by a detector centered on a position vector ${\bf r}$ during a time interval $\tau$, and $a({\bf r})$ is the rate parameter. The mean number of photon counts is given by: $$n_{p}({\bf r}) = a({\bf r})\tau.\eqno{\hbox{(5)}}$$ Photon-counting images can be simulated from irradiance images because the recorded irradiance on a pixel is related to the mean number of photons that arrive at that pixel. In our experiments, a CCD camera records the irradiance distribution of phase-shifted interferograms as shown in Fig. 1(a). The mean number of photons at a pixel at position ${\bf r}_{i}$ is given by [21]: $$n_{p}({\bf r}_{i}) = {N_{p}I_{p}({\bf r}_{i}) \over \sum_{j = 1}^{N_{T}}I_{p}({\bf r}_{j})}.\eqno{\hbox{(6)}}$$ In the previous equation, $I_{p}$ is the irradiance, $N_{T}$ is the total number of pixels, and $N_{p}$ is the expected number of photons in the image, which is varied in our experiments in order to generate reconstructed images at different photon counting levels. Therefore, following Eq. (6), a photon counting version of each interferogram recorded with the optical system in Fig. 1(a) can be simulated by normalizing the interferogram by its total irradiance, multiplying the normalized image by $N_{p}$, and drawing a Poisson-distributed count for each pixel according to Eq. (4). From these four photon counting interferograms, a photon counting hologram can be generated using Eq. (2).
As in conventional holographic techniques, this photon counting hologram can be used to reconstruct the object at planes orthogonal to the output plane by using Eq. (3). This will allow us to apply statistical pattern recognition techniques, as described in Section 4.
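The simulation procedure of Eqs. (4) and (6) can be sketched as follows. This is a minimal example under stated assumptions: the function name, the random test irradiance, and the 256 × 256 image size are hypothetical and chosen only to keep the example small.

```python
import numpy as np

def photon_counting_image(irradiance, n_photons, rng):
    """Simulate a photon-counting version of an irradiance image.

    irradiance : non-negative array (one recorded interferogram)
    n_photons  : expected total number of photons in the image (N_p)
    rng        : numpy.random.Generator used for the Poisson draws
    """
    irradiance = np.asarray(irradiance, dtype=float)
    # Eq. (6): mean photon count per pixel, normalized so the means sum to N_p.
    mean_counts = n_photons * irradiance / irradiance.sum()
    # Eq. (4): one Poisson-distributed count per pixel.
    return rng.poisson(mean_counts)

# Hypothetical example: a random positive "interferogram".
rng = np.random.default_rng(42)
interferogram = rng.uniform(0.0, 1.0, size=(256, 256))
n_p = 1e5

counts = photon_counting_image(interferogram, n_p, rng)
print(counts.sum(), counts.mean())  # total is close to N_p, mean is N_p / N_T
```

Applying this function to each of the four interferograms and combining the results with Eq. (2) gives a photon counting hologram in the sense described above.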

SECTION IV

The approach we will use to perform object recognition in the photon counting domain is similar to that applied in [16] and [22]. In [16], the authors use an ML approach and consider a series of hypotheses, where each hypothesis corresponds to a different object class in a scene. However, there are two important differences between the approach in [16] and the approach we follow here:

- In our case, we consider that there is only one class in the scene, which makes the process different from the case of two or more classes. In fact, the problem we are facing may be characterized as the so-called one-class classification problem in the field of pattern recognition [23]. In a multi-class classification scenario, we may design a classification strategy able to adapt to a series of pre-established classes (types of objects). However, in a one-class classification scenario there is reliable information only about the target class.
- In [16], it is assumed that the conditional density of the number of counts for the $i$th pixel in the image follows a Poisson distribution. In our case, instead of modeling the distribution of photon counts, we model the values of the set of pixels that belong to the target class object, assuming that they follow a Gaussian distribution. In fact, these pixel values can be statistically modeled as a mixture of Gaussians, taking into account that Gaussian mixtures are used as generic probability density estimators.

Therefore, let us consider the following target class likelihood equation: $$Pr({\bf o}_{i}\vert H) \sim \left({1 \over C}\right) \cdot e^{-{1 \over 2}\left({{\bf o}_{i} - \mu \over \sigma}\right)^{2}}\eqno{\hbox{(7)}}$$ where ${\bf o}_{i} = \vert O({\bf r}_{i}, z = \hbox{constant})\vert^{2}$, $C$ is the Gaussian normalization constant, i.e., $C = \sqrt{2\pi}\sigma$, and $\mu$ is the center of the 1-D Gaussian distribution corresponding to the histogram of the target object class. $H$ represents the hypothesis that the target class is present in the scene. We will not make any assumption about noise distribution, apart from the fact that the pixel values will be Gaussian distributed and their likelihood may be modeled following a Gaussian distribution as well. The likelihood function of the reconstructed scene under hypothesis $H$ will be given by $${\cal L}({\bf R}\vert H) = \prod_{i = 1}^{M}Pr({\bf o}_{i}\vert H)\eqno{\hbox{(8)}}$$ where in our case the product over $i$ considers the pixels inside the window of the size of a mask representing a model of the object in the image, and $M$ is the number of pixels in the mask. In an object and background disjoint model we will consider the approach that $Pr({\bf o}_{i}\vert H)$ can be written as [16] $$Pr({\bf o}_{i}\vert H) = Pr({\bf o}_{i}\vert H)^{w_{i}} \cdot Pr({\bf o}_{i}\vert H)^{1 - w_{i}}\eqno{\hbox{(9)}}$$ where $w_{i}$ is the window corresponding to the object of the target class, i.e., it is 1 inside the target object class support and zero elsewhere. Under the assumption that the second factor is irrelevant to the recognition problem (when the noise on the target is small, see [16], [22]), we finally have $Pr({\bf o}_{i}\vert H) \propto Pr({\bf o}_{i} \vert H)^{w_{i}}$. Thus, the log-likelihood for Eq. (8), taking into account the previous equation, becomes: $\log[{\cal L}({\bf R}\vert H)] = \sum_{i = 1}^{M}w_{i} \cdot \log[Pr({\bf o}_{i}\vert H)]$.
After some algebra, we finally arrive at $$\log\left[{\cal L}({\bf R}\vert H)\right] = \sum_{i = 1}^{M}\left[w_{i} \cdot \log\left({1 \over C}\right)\right] - \sum_{i = 1}^{M}\left[w_{i} \cdot {1 \over 2} \cdot\left({{\bf o}_{i} - \mu \over \sigma}\right)^{2}\right].\eqno{\hbox{(10)}}$$ The maximization of Eq. (10) over the mask location in the image will establish where the target class object is likely to be located. In order to finally decide if the maximum of $\log [{\cal L}({\bf R}\vert H)]$ found in an image belongs to a target location, a one-class classification approach is used [24], where we consider the target class hypothesis $H$ and the complementary class of any other possible object $\overline{H}$ in the image. Therefore, in order to decide if a certain location corresponds to a target object class hypothesis, by the Bayes rule, it should satisfy the maximum a posteriori (MAP) criterion, that is: ${\cal L}({\bf R}\vert H) \cdot {\cal L}(H)\ > \ {\cal L}({\bf R}\vert\overline{H})\cdot {\cal L}(\overline{H})$. Since we usually have little or no knowledge about the complementary class hypothesis, which could be anything that may appear in the image background, we could assume that the density ${\cal L}({\bf R}\vert\overline{H})$ should be smaller in regions where ${\cal L}({\bf R}\vert H)$ is large. Thus, we could write ${\cal L}({\bf R}\vert\overline{H}) \propto F({\cal L}({\bf R}\vert H))$, where $F$ is considered a monotonically decreasing function [24], and thus: ${\cal L}({\bf R}\vert H)\ >\ P^{-1}\{{\cal L}(H)/{\cal L}(\overline{H})\}$, where $P(\xi) \equiv F(\xi)/\xi$. Since the right-hand side depends only on the priors, a constant threshold $\theta$ can be defined in order to decide whether a given location that has a maximum $\log[{\cal L}({\bf R}\vert H)]$ in the image belongs to a target object class or not.
This threshold is defined as linearly proportional to the standard deviation of the log-likelihood of the target class hypothesis for each $N_{p}$ value, that is, $\theta(N_{p}) = \beta \cdot \sigma(N_{p})$, where $\sigma(N_{p})$ is the standard deviation (over the number of repetitions of the experiment) of the log-likelihood measure for target class objects at the photon counting regime $N_{p}$ and $\beta$ is a proportionality constant. Let $\log[{\cal L}({\bf R}\vert H)]$ be the expected log-likelihood of a target object class for a given photon counting regime, $N_{p}$, which can be measured and learned in advance from an object sample image. If we have a new image without previous knowledge about whether the target object is present or not, the classification rule $D({\cal L}^{\prime}({\bf R}\vert H))$ to decide whether or not the maximum of the measured log-likelihood $\log [{\cal L}^{\prime}({\bf R}\vert H)]$ in an image belongs to the target object class is defined as $$D\left({\cal L}^{\prime}({\bf R}\vert H)\right) = \cases{1, & if $\log\left[{\cal L}({\bf R}\vert H)\right] - \vert\theta(N_{p})\vert\ <\ \log\left[{\cal L}^{\prime}({\bf R}\vert H)\right]$ \cr 0, & otherwise.}\eqno{\hbox{(11)}}$$ Applying Eq. (11), we can decide whether the maximum of the log-likelihood can be associated with the target class or not, even when a very low number of photons is used to generate the holograms.
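The windowed log-likelihood and the decision rule of Eq. (11) can be sketched as follows. This is a minimal example under stated assumptions: the log-likelihood is written as the weighted Gaussian log-density that follows from Eqs. (7) and (8), the window is swept by brute force, and the function names, the synthetic image, and the numerical values are hypothetical.

```python
import numpy as np

def log_likelihood(o, w, mu, sigma):
    """Weighted Gaussian log-likelihood of one window.

    o     : reconstructed irradiance values inside the window
    w     : mask, 1 inside the target support and 0 elsewhere
    mu    : centroid of the target-class Gaussian
    sigma : standard deviation of the target-class Gaussian
    """
    c = np.sqrt(2.0 * np.pi) * sigma
    return np.sum(w * np.log(1.0 / c)) - np.sum(w * 0.5 * ((o - mu) / sigma)**2)

def log_likelihood_map(image, w, mu, sigma):
    """Evaluate the log-likelihood with the mask at every valid position."""
    h, wd = w.shape
    rows = image.shape[0] - h + 1
    cols = image.shape[1] - wd + 1
    out = np.empty((rows, cols))
    for r in range(rows):
        for c in range(cols):
            out[r, c] = log_likelihood(image[r:r + h, c:c + wd], w, mu, sigma)
    return out

def decide(expected_ll, measured_ll, sigma_ll, beta=1.0):
    """Decision rule of Eq. (11) with threshold theta = beta * sigma_ll."""
    theta = beta * sigma_ll
    return 1 if expected_ll - abs(theta) < measured_ll else 0

# Hypothetical scene: dim background with a bright 8 x 8 target at (10, 15).
rng = np.random.default_rng(1)
scene = rng.normal(0.1, 0.02, size=(40, 50))
scene[10:18, 15:23] = rng.normal(1.0, 0.05, size=(8, 8))

ll_map = log_likelihood_map(scene, np.ones((8, 8)), mu=1.0, sigma=0.05)
peak = np.unravel_index(np.argmax(ll_map), ll_map.shape)
print(peak, decide(expected_ll=ll_map.max(), measured_ll=ll_map.max(), sigma_ll=5.0))
```

For large images and masks, the brute-force sweep is slow. The sums over the window can be computed with correlations instead, but the loop keeps the correspondence with the equations explicit.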

SECTION V

In the optical set-up used to create the holograms [Fig. 1(a)], an Argon laser with wavelength $\lambda = 514.5\ \hbox{nm}$ was used. The wave plates were a $\lambda/2$ and a $\lambda/4$ phase retarder adapted to that wavelength. The detector was a 4.2-megapixel (2048 × 2048) CCD with pixel size $\Delta x \times \Delta y = 9 \times 9\ \mu\hbox{m}^{2}$ capturing 16-bit images. The reference object was a cubic die with a lateral size of 4.6 mm. The center of the die was located at a distance of $d_{1} = -345\ \hbox{mm}$ from the output plane. In order to detect the die using the ML criterion for one-class classification problems, a mask $w_{i}$ is created centered on the die. Using Eqs. (4) and (6), a photon counting version of each interferogram $I_{p}, p = 1, \ldots, 4$ is created. The corresponding photon counting version of the hologram is obtained by applying Eq. (2). The hologram is propagated to a distance $z = -345\ \hbox{mm}$ using Eq. (3). Fig. 2(a) shows the reconstruction given by the hologram at a distance of $z = -345\ \hbox{mm}$ when using $N_{p} = 4 \times 10^{6}$ photons for each one of the four interferograms $I_{p}, p = 1, \ldots, 4$. The whole image size is 2048 × 2048. Therefore, this corresponds to 0.95 photons per pixel at the interferogram level. Such a high photon level is used only to help visualize the object under photon counting conditions. Fig. 2(b) shows the histogram of the pixel value distribution for a window of size 653 × 857 centered on the die. Fig. 2(b) also shows a mixture of two Gaussians fitted to this distribution. The Gaussian with the highest centroid is the one we assume models the target object. The Gaussian with the lower centroid value may be considered the background included in the generation of the support function. The use of Gaussians and mixtures of Gaussians is widely accepted when solving one-class classification problems [23].
It is important to note that this distribution is different from that obtained when photon counting methods are directly applied to images, such as in photon counting integral imaging. This difference is due to the propagation process between our photon counting hologram and the reconstructed images. Equation (10) determines the likelihood corresponding to the window centered on a specific pixel and with a size of 653 × 857. The value of the maximum of the log-likelihood for a particular $N_{p}$ value is then selected after the window has swept through the entire image. We select the centroid, $\mu$, with the highest value and its standard deviation, $\sigma$, in order to use them for the target class hypothesis in Eq. (10). Fig. 3(a)–(d) show the log-likelihood per pixel for $N_{p} = \{5 \times 10^{4}, 1 \times 10^{5}, 4 \times 10^{5}, 1 \times 10^{6}\}\ \hbox{photons}$. For $1 \times 10^{5}\ \hbox{photons}$, this means 0.024 photons per pixel for the whole 2048 × 2048 image. A reconstructed version of the die scene has been overlaid in order to help visualize whether the position of the maximum in each case is correct or not. This reconstructed version of the die is obtained without applying photon counting. As we can see, the maximum is correctly located for cases (b)–(d), but not for (a). Fig. 4(a) shows the log-likelihood mean curve and standard deviation for the pixel whose value is maximum (continuous red line) and for a pixel that is confidently part of the background (dotted green line), for a set of 10 image repetitions generated for each $N_{p}$ value in the range $N_{p} = [100, 4 \times 10^{5}]\ \hbox{photons}$. The log-likelihood curve corresponding to the die class will be used as the expected log-likelihood $\log[{\cal L}({\bf R}\vert H)]$ in Eq. (11) to identify the presence of the target object class. We also analyzed the capability of the proposed strategy to detect the presence of the die for different reconstruction distances.
We varied the reconstruction distance $z$ from $z = -265\ \hbox{mm}$ to $z = -375\ \hbox{mm}$, with $\delta z = -10\ \hbox{mm}$, for $N_{p} = 1 \times 10^{5}\ \hbox{photons}$. Fig. 4(b) shows a 3-D plot with the position of the maximum of the log-likelihood for the 12 reconstruction distances used, for $N_{p} = 1 \times 10^{5}\ \hbox{photons}$. The position of the maximum for the case of $z = -345\ \hbox{mm}$ is used to generate a blue dotted line against which the positions for the other reconstruction distances can be compared. For each circle in the plot, the reconstruction distance $z$ is also indicated. As we can see, the center estimated by the maximum of the log-likelihood shifts away from the blue dotted line, which means that it deviates from the correct value. Fig. 5 shows the 3-D shape of the log-likelihood function for three different reconstruction distances, for $N_{p} = 1.5 \times 10^{5}\ \hbox{photons}$. As we can see, the maximum spreads over a larger range of pixels in Fig. 5(a) and (c), which is in agreement with the fact that the die goes out of focus as the reconstruction distance departs from the in-focus distance (i.e., $z = -345\ \hbox{mm}$). To overcome this effect, we could use image processing algorithms to determine that the object is out of focus and thus to determine when the maximum of the log-likelihood will not be accurate. In order to associate a classification error with our methodology, we changed the position of the reconstructed die for the $N_{p}$ range of interest. To generate a transversal shift of the object $\delta{\bf r} = (\delta x, \delta y)$, we reconstructed the hologram with a tilted plane wave. This is performed by multiplying the complex hologram distribution $G({\bf r})$ by a tilted plane wave factor $\exp\{i2\pi({\bf r}_{i} \cdot \delta{\bf r})\}$ in Eq. (3), where the ⋅ symbol refers to the dot product.
Taking this into account, we shifted the reconstruction of the die to 30 different locations following a 5 × 6 grid covering the whole image size. The axial position was fixed at a distance of $z = -345\ \hbox{mm}$. The difference, in pixels, between these grid coordinates and those measured by using the maximum of the log-likelihood was assessed for each position in the grid. The detected maximum of the log-likelihood in each image was classified according to the decision rule introduced in Eq. (11), and only those maxima identified as target objects were used. Table I shows the mean $(\overline{\delta x}, \overline{\delta y})$ and standard deviation $(\varepsilon(\overline{\delta x}), \varepsilon(\overline{\delta y}))$ over 30 positions of the difference between the considered correct die center and the maximum of the log-likelihood, for three photon counting regimes: $N_{p} = \{8 \times 10^{4}, 1 \times 10^{5}, 1.5 \times 10^{5}\}\ \hbox{photons}$. A value of $\beta = 1$ was selected for the application of the decision rule in the three $N_{p}$ cases. Table I also shows in parentheses, for the first column, the number of photons per pixel ($N_{T}$ represents the total number of pixels in the image) and, for the remaining columns, the relative error in percentage with respect to the size of the support function in each $(x, y)$ direction.
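The selection of $\mu$ and $\sigma$ from a mixture of two Gaussians, as described earlier in this section, can be sketched as follows. The text does not state how the mixture in Fig. 2(b) was fitted, so the expectation-maximization (EM) procedure used here is an assumption, as are the function names and the synthetic pixel values.

```python
import numpy as np

def fit_two_gaussians(x, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture with plain EM (assumed method)."""
    x = np.asarray(x, dtype=float).ravel()
    # Initial guesses: quartiles for the means, global spread for the widths.
    mu = np.array([np.percentile(x, 25), np.percentile(x, 75)])
    sigma = np.array([x.std(), x.std()])
    weight = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        pdf = (weight * np.exp(-0.5 * ((x[:, None] - mu) / sigma)**2)
               / (np.sqrt(2.0 * np.pi) * sigma))
        resp = pdf / np.maximum(pdf.sum(axis=1, keepdims=True), 1e-300)
        # M-step: update weights, means, and standard deviations.
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu)**2).sum(axis=0) / nk
        sigma = np.sqrt(np.maximum(var, 1e-24))
        weight = nk / x.size
    return weight, mu, sigma

def target_gaussian(x):
    """Return (mu, sigma) of the component with the highest centroid."""
    _, mu, sigma = fit_two_gaussians(x)
    k = int(np.argmax(mu))
    return mu[k], sigma[k]

# Hypothetical pixel values: background near 0.2, target near 1.0.
rng = np.random.default_rng(0)
pixels = np.concatenate([rng.normal(0.2, 0.05, 3000),
                         rng.normal(1.0, 0.10, 2000)])
mu_t, sigma_t = target_gaussian(pixels)
print(mu_t, sigma_t)  # values to be used as mu and sigma in Eq. (10)
```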

SECTION VI

We have shown a method to recognize 3-D objects in a 3-D input scene by using digital holography in the photon counting domain. The optical system used for recording the digital hologram was a phase-shifting interferometer based on a Mach–Zehnder architecture. The photon counting simulations were applied directly to each one of the four interferograms recorded by the optical system. These photon-counting interferograms were then combined by using a phase-shifting algorithm to generate the new hologram. The amplitude distribution of the 3-D object under weak illumination was finally reconstructed from the photon-counting hologram by simulating diffraction in the Fresnel approximation.

To detect the presence of the target object in the input scene under photon counting conditions, we applied an ML approach under a hypothesis that models the distribution of this class as a mixture of two Gaussians, where one of the Gaussians models the target class object and the other the background included in the defined support function. We must stress that this problem is different from that of other methods that apply photon counting techniques directly to the images to be processed. In our case, the hologram is created under photon counting conditions, but the images are obtained by reconstructing the hologram at a certain distance, and therefore the data distribution changes.

We analyzed the behavior of the log-likelihood as the reconstruction distance varied, concluding that this function spreads out as the reconstruction distance departs from the “in-focus” distance. We also analyzed the recognition capability of our strategy for different positions of the object and under three different photon counting regimes. Because of the statistical nature of our approach, these results are generalizable to other real or virtual objects independently of their position in the scene.

In summary, we have described a method for 3-D object recognition that uses holograms obtained under photon counting conditions to reconstruct the object. Our results show that the objects can be recognized and their position determined for a very low number of photons. This paves the way for its use in discriminating target objects in application fields such as holographic microscopy under photon counting conditions, as well as in biomedical applications where low illumination conditions are necessary.
