A Miniaturized and Intelligent Lensless Holographic Imaging System With Auto-Focusing and Deep Learning-Based Object Detection for Label-Free Cell Classification

Cell detection and classification is a key technique for disease diagnosis, but conventional methods such as optical microscopy and flow cytometry have limitations in terms of field-of-view (FOV), throughput, cost, size, and operational complexity. Lensless holographic imaging is a promising alternative that offers a large FOV, rich information content, and a simple structure; however, its cell detection and classification performance still needs improvement. In this paper, we propose an intelligent cell detection system based on lensless holographic imaging and deep learning. Our system uses unstained cells suspended in solution as samples and employs a threshold segmentation-based auto-focusing algorithm to determine the optimal focusing distance for each imaging session. We also use a deep learning-based object detection network to classify different types of cells directly from the focused holographic images, without the need for cell segmentation. We demonstrated the performance of our system on four cell detection tasks: tumor cells vs. polystyrene microspheres (77.6% accuracy), different tumor cell types (80.1% accuracy), red blood cells vs. white blood cells (78.1% accuracy), and white blood cell subtypes (88.0% accuracy). These results show that our system achieves high accuracy with label-free, portable, intelligent, and fast cell detection capabilities, giving it potential applications in the miniaturized cell detection field.

The detection of cells can provide valuable information for disease diagnosis, prognosis, and treatment evaluation. For instance, the detection of blood cells can reveal the proportion and morphology of different types of blood cells, such as red blood cells, neutrophils, monocytes, and platelets [1]. Similarly, the detection of circulating tumor cells can serve as a biomarker for cancer metastasis and staging. Conventional methods for cell detection mainly rely on optical microscopy or flow cytometry. Optical microscopy, a widely used technique, involves staining and magnifying cell samples on slides and manually inspecting them under a microscope. However, it has several limitations, such as low throughput, a small field-of-view (FOV), static sample input, and human bias [2]. In contrast, flow cytometry is considered the gold standard for cell detection, since it can automatically analyze millions of cells per second based on their optical properties, such as fluorescence or scattering. Flow cytometry has advantages over optical microscopy in terms of high throughput, high efficiency, and single-cell resolution. However, it also has drawbacks, such as large size, high cost, complex operation, loss of spatial information, and destruction of cell samples [3]. As a result, flow cytometry is primarily suitable for centralized medical facilities, such as hospitals or research institutes, rather than for point-of-care settings or resource-limited areas. Given the growing demand for point-of-care health monitoring, conventional cell detection techniques based on large-scale facilities are not sufficient, and there is an urgent need to develop miniaturized and portable cell detection techniques.
In recent years, much research effort has been devoted to miniaturizing and simplifying cell detection devices [4], [5], and lensless imaging technology was developed from the conventional optical microscopy technique [6]. The obstacle to miniaturizing optical microscopy lies in its bulky optical components (mainly lenses) and the stringent requirement for optical alignment. The essence of lensless imaging technology is to remove the optical lenses to relax the alignment requirement, and to integrate high-resolution digital image sensors and digital image processing techniques to achieve imaging quality comparable to optical microscopy. Image detection and classification techniques are then combined to complete the cell detection task [7]. Since lensless imaging systems discard most of the optical components, their overall structure is very compact. And because there is no FOV reduction caused by lens magnification, their FOV can be large, so the quantity and efficiency of cell detection can be greatly improved.
Lensless imaging systems mainly use three imaging techniques: the lensless shadow imaging technique [8], [9], [10], the lensless fluorescence imaging technique [11], [12], [13], and the lensless holographic imaging technique [14], [15], [16]. The lensless shadow imaging technique is a simple and direct method that records the shadows of samples on a digital image sensor using a spatially limited light source. However, this method can only capture low-quality images with amplitude information (mainly contour information) of samples. The lensless fluorescence imaging technique is similar, but it uses excitation light to stimulate fluorescent markers on samples and filters out the excitation light to eliminate interference. This method can distinguish different cells by their fluorescence signals, but its resolution is much lower than that of other lensless imaging techniques. The lensless holographic imaging technique uses partially coherent light as a light source instead of the spatially limited light source used by the other methods. The light diffracted by the samples interferes with the directly transmitted light to form a holographic image on a sensor array. This method records both the amplitude and the phase information of samples, reflecting their internal structures and optical properties. Moreover, it can achieve high-resolution, large-FOV imaging through computational techniques such as image reconstruction and super-resolution. Therefore, the lensless holographic imaging technique has unique advantages over the other lensless imaging techniques in terms of image quality, resolution, and versatility.
In recent years, to improve the intelligence of lensless holographic imaging systems, they have been combined with deep learning technologies [17], [18]. In lensless holographic imaging systems, deep learning is primarily used to improve imaging resolution, replace image reconstruction steps, and perform cell classification based on holographic images. Although deep learning increases computing resource consumption, it replaces the function of corresponding hardware structures and simplifies the entire system. Early researchers sought to replace the image reconstruction process with neural networks, using trained deep-learning models to convert holographic images and effectively improve recovery efficiency. Rivenson et al. proposed a deep convolutional neural network (CNN) for the image reconstruction part of a lensless holographic imaging system [19]. The results of this work showed that CNNs can not only fully achieve image restoration but also reach higher signal-to-noise ratios than traditional restoration algorithms. Applying deep learning classification to lensless holographic imaging, Vercruysse et al. proposed a method that uses microfluidic channels to isolate single cells and high-speed cameras to capture their holograms [20]. They then constructed a three-class white blood cell dataset based on holographic images and achieved classification of mononuclear cells and neutrophils. Based on the characteristics of lensless holographic imaging, we propose to combine it with deep learning-based object detection to achieve accurate and efficient cell detection. The advantages and innovations of this combination are as follows. First, holographic images have a large FOV and can capture a large number of cells in a single image; object detection can be performed directly on the whole image without segmenting it into small patches, which fully exploits the large-FOV advantage of holographic images and reduces computation time. Second, holographic images contain both amplitude and phase information of cells, reflecting their internal structures and optical properties, so deep learning can extract rich features from both and fuse them to improve classification performance.
In this study, we propose a miniaturized and intelligent lensless cell detection system that uses lensless holographic imaging as the imaging method and combines it with deep learning-based object detection to achieve rapid and accurate detection and classification of cells. We propose a threshold segmentation-based auto-focusing algorithm suited to the characteristics of the test samples and demonstrate its effectiveness. We also constructed four datasets: tumor cells vs. polystyrene microspheres, multiple types of tumor cells, red blood cells vs. white blood cells, and mononuclear cells vs. neutrophils. Finally, we present the test results of the object detection models trained on these four datasets; the average cell classification accuracy reached 80.9%.

A. System Overview
Fig. 1(a) shows the general working principle of lensless on-chip digital holographic microscopy, which comprises an incoherent light source, a pinhole, a sample plane, and a detector plane. Generally, the diameter of the pinhole is about 50∼100 μm, the distance from the pinhole to the sample plane (Z1) is about 2∼5 cm, and the distance from the sample plane to the detector plane (Z2) is less than 1 mm [21], [22]. The centers of these three components should be vertically aligned as precisely as possible. Based on this principle, Fig. 1(b) and (c) show the structure of the proposed lensless on-chip digital holographic microscopy system. The system consists of a yellow LED light source (OSRAM LY E65F-DAEB-46-1, λ = 590 nm) with a 300-μm pinhole as a partially coherent light source, a coverslip (about 140∼170 μm thick) as the sample plane, and a CMOS image sensor (Sony IMX219PQH5-C, 1.12-μm pixel size) as the detector plane. A Raspberry Pi 4 Model B is employed as the controller.
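A useful property of this geometry is that the sample sits almost on the sensor, so the system operates at near-unit magnification and the FOV is essentially the sensor's full active area. The following back-of-the-envelope check uses example Z1 and Z2 values inside the stated ranges, and assumes the standard 3280 × 2464 full-resolution pixel array of the IMX219 (a figure not stated in this paper):

```python
# Unit-magnification geometry of lensless on-chip holography: the effective
# magnification is M = (Z1 + Z2) / Z1, so with Z2 << Z1 the FOV is essentially
# the sensor's active area.
Z1 = 3e-2    # pinhole-to-sample distance: 3 cm, inside the stated 2-5 cm range
Z2 = 0.5e-3  # sample-to-sensor distance: 0.5 mm, below the stated 1 mm bound
M = (Z1 + Z2) / Z1

pixel = 1.12e-6          # IMX219 pixel pitch (m), from the text
cols, rows = 3280, 2464  # assumed full-resolution pixel array of the IMX219
fov_mm2 = (cols * pixel * 1e3) * (rows * pixel * 1e3)
print(f"M ~ {M:.3f}, FOV ~ {fov_mm2:.1f} mm^2")  # M ~ 1.017, FOV ~ 10.1 mm^2
```

An FOV of roughly 10 mm² is one to two orders of magnitude larger than that of a typical high-magnification microscope objective, which is the basis of the throughput advantage claimed above.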
The workflow of this system is as follows. First, the unstained cell sample solution is imaged using the proposed system to obtain a holographic cell image. Then, the holographic image is cropped into sub-images and focused using the auto-focusing algorithm proposed in this work. Finally, the images are input to the deep learning-based object detection model trained in this work to obtain the classification results.

B. Sample Preparation
The human peripheral blood samples and the Caki renal cancer cells, HepG2 human liver cancer cells, SW480 human colon cancer cells, A549 human lung cancer cells, and A2780 ovarian cancer cells were sourced from the Translational Medicine Institute, School of Medicine, Zhejiang University. The experiments conducted with these cells comply with the relevant biological cell experiment requirements and were approved by the Ethics Committee of Hangzhou Dianzi University.
1) Tumor Cells and Polystyrene Microspheres: Since the tumor cells and polystyrene microspheres (15 μm in diameter) are pure solution samples, we only need to dilute them with PBS to 4 × 10⁴/mL (a suitable sample concentration for lensless holographic imaging). For the microspheres, we also add Tween 20 to increase their surface activity, because they are hydrophobic and tend to stick together.
2) Red Blood Cells and Isolation of White Blood Cells: The following steps describe how to prepare a red blood cell solution for lensless holographic imaging: (1) Shake human whole blood well on an oscillator to prevent the serum and blood cells from separating. (2) Dilute the whole blood 100,000-fold with PBS to obtain a red blood cell concentration of about 4 × 10⁴/mL, which is more suitable for lensless holographic imaging than the original concentration of about 4∼5.5 × 10⁹/mL. Since white blood cells number roughly 1/1000 of red blood cells in whole blood, the diluted solution can be considered a high-purity red blood cell solution. (3) Use an optical microscope to check the activity and integrity of the red blood cells in the solution; observe whether the cells are round, flat, and full. (4) Stain some of the cell solution with AO fluorescent dye (which only stains the nucleus) and check with a fluorescence microscope to confirm the purity of the red blood cells. Since red blood cells have no nucleus, a pure solution will show no fluorescence. (5) Perform experiments within 12 hours after preparing the solution, as red blood cells have a short survival time in PBS.
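As a quick sanity check of the dilution arithmetic in step (2) (taking the lower bound, 4 × 10⁹/mL, of the stated whole-blood concentration):

```python
# Check the red-blood-cell dilution: whole blood at ~4-5.5e9 RBC/mL diluted
# 100,000-fold lands at the target ~4e4 cells/mL stated in the text.
rbc_whole = 4e9          # lower bound of the stated whole-blood RBC count (/mL)
dilution = 100_000
rbc_diluted = rbc_whole / dilution
print(rbc_diluted)       # 40000.0

# WBCs are roughly 1/1000 of RBCs, so the same dilution leaves only ~40 WBC/mL,
# negligible relative to the RBC count in the same volume.
wbc_diluted = rbc_diluted / 1000
```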
The following steps describe how to prepare a white blood cell solution: (1) Add 2 mL of fresh whole blood to a test tube and mix with 6 mL of red blood cell lysis solution (R1010, Solarbio). Shake well and place on ice. (2) After 10 minutes, shake well again and keep on ice for another 10 minutes. (3) Centrifuge the mixture at 450 × g for 10 minutes and discard the supernatant. (4) Resuspend the cell pellet in 4 mL of cell lysis solution, shake well, and place on ice for 5 minutes. (5) Centrifuge the mixture at 450 × g for 10 minutes and discard the supernatant. (6) Resuspend the cell pellet (white blood cells) in 2 mL of PBS and shake well on an oscillator. This yields a white blood cell concentration of about 4∼10 × 10⁶/mL. (7) Dilute the cell solution with PBS to a concentration of about 10⁴/mL, which is suitable for holographic imaging. (8) Use an optical microscope to check the activity of the white blood cells. (9) Store the cell solution in an ice-water mixture after use.
3) Isolation of Mononuclear Cells and Neutrophils: These two types of cells were obtained using a human peripheral blood neutrophil isolation reagent (Solarbio, human peripheral blood neutrophil isolation kit), and their purity was verified by AO staining and fluorescence microscopy. The isolation reagent is a combination of reagents containing reagent A and reagent C. The following is the procedure for obtaining high-purity mononuclear cell and neutrophil solutions using these two reagents. (1) Add 4 mL of reagent A to a clean 15-mL centrifuge tube, then slowly add 2 mL of reagent C, stacking it on top of reagent A to form a gradient interface (reagents A and C are immiscible). (2) Use a Pasteur pipette to draw 4 mL of fresh whole blood and carefully layer it on top of reagent C to form a three-layer gradient liquid surface. (3) Place the sample in a horizontal-axis centrifuge and centrifuge at 550 × g for 20 minutes. (4) After centrifugation, two ring-shaped, milky-white cell layers will appear in the centrifuge tube: the mononuclear cell layer and the neutrophil layer. The other layers are the plasma layer, reagent C, reagent A, and the red blood cell layer.

C. Lensless Holography Imaging Principle
When light passes through a sample, it undergoes reflection, refraction, and transmission. Assuming that the object generates an object light field U_obj in response to the illumination, the light field just behind the sample plane is the superposition of the reference light U_ref and the object light, that is:

U_Z1 = U_ref + U_obj. (1)

Here, the position of the pinhole is taken as Z = 0 with downward as the positive direction, and U_Z represents the light field at a distance Z. Since Z2 is very small, the propagation of the light field can be regarded as plane-wave propagation. According to Rayleigh-Sommerfeld diffraction theory, the transfer function of a plane light wave propagating a distance d is:

H_d(f_x, f_y) = exp[ j (2πd/λ) √(1 − (λf_x)² − (λf_y)²) ], (2)

where H_d represents the spatial transfer function for a propagation distance d, f_x and f_y represent the spatial frequencies of the light field in the x and y directions, respectively, and λ is the wavelength of the light. Combining (1) and (2) with angular spectrum theory [23], we obtain the light field reaching the detection plane:

U_(Z1+Z2) = F⁻¹{ F{U_Z1} · H_Z2 }, (3)

where F{} and F⁻¹{} represent the Fourier transform and its inverse, respectively. At this point, the light field is a complex field (including amplitude and phase information), but the image sensor can only record intensity. Therefore, the final recorded hologram is the intensity I_s of the formed light field, that is:

I_s = | U_(Z1+Z2) |². (4)

Equations (1)-(4) indicate that the final hologram is an intensity map of the light field formed by the superposition of the object light and the reference light; through the interference terms, this intensity contains not only the amplitude information of the sample but also its phase information.
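As a minimal numerical sketch of this forward model (the grid size and the absorbing "cell" object are hypothetical; the wavelength, pixel pitch, and Z2 follow the values given in this paper), the angular-spectrum propagation and hologram formation could look like:

```python
import numpy as np

def angular_spectrum_propagate(u, d, wavelength, dx):
    """Propagate a complex field u over a distance d via the angular spectrum:
    U_{z+d} = IFFT{ FFT{U_z} * H_d }, with the transfer function
    H_d(fx, fy) = exp(j*2*pi*d/lam * sqrt(1 - (lam*fx)^2 - (lam*fy)^2)).
    Evanescent components (sqrt argument < 0) are set to zero."""
    ny, nx = u.shape
    FX, FY = np.meshgrid(np.fft.fftfreq(nx, dx), np.fft.fftfreq(ny, dx))
    arg = 1 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    H = np.where(arg > 0,
                 np.exp(2j * np.pi * d / wavelength * np.sqrt(np.maximum(arg, 0))),
                 0)
    return np.fft.ifft2(np.fft.fft2(u) * H)

# Toy forward model: a unit reference wave plus a small absorbing "cell",
# propagated Z2 = 0.5 mm to the sensor; the sensor records intensity only.
wavelength, dx, Z2 = 590e-9, 1.12e-6, 500e-6  # LED wavelength, pixel pitch, Z2
u_sample = np.ones((256, 256), dtype=complex)
u_sample[124:132, 124:132] = 0.2              # hypothetical absorbing object
hologram = np.abs(angular_spectrum_propagate(u_sample, Z2, wavelength, dx)) ** 2
```

The interference fringes in `hologram` are what the CMOS sensor records; numerically propagating with a negative distance implements the back-propagation used for refocusing in the next subsection.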

D. Auto-Focusing
The object detection network of this system takes a focused hologram as its input. However, due to the 3D structure and semi-transparent nature of the target samples, traditional auto-focusing algorithms from the image processing field do not perform optimally in this system. Therefore, we propose an auto-focusing algorithm based on threshold segmentation that takes the characteristics of the target samples into account. The main flow of this algorithm is shown in Fig. 2(a). First, a rough estimate of Z2 of the imaging system is made. Then, back-propagated images are generated at candidate distances Z taken from a neighborhood of this rough value with a certain step, and threshold segmentation is applied to each image to count the number of pixels S that do not belong to the real image. The Z at which S reaches its maximum is taken as the accurate Z2.
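A toy sketch of this auto-focusing loop is given below. It is not the authors' implementation: the segmentation here uses Otsu's method as an assumed threshold rule, and the sample is a hypothetical absorbing square, but it illustrates the core idea that the background (non-real-image) pixel count S peaks where the real image is most compact, i.e., at the correct Z2:

```python
import numpy as np

def propagate(u, d, wavelength=590e-9, dx=1.12e-6):
    # Angular-spectrum propagation; a negative d back-propagates toward the sample.
    ny, nx = u.shape
    FX, FY = np.meshgrid(np.fft.fftfreq(nx, dx), np.fft.fftfreq(ny, dx))
    arg = np.maximum(1 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2, 0)
    H = np.exp(2j * np.pi * d / wavelength * np.sqrt(arg))
    return np.fft.ifft2(np.fft.fft2(u) * H)

def otsu_threshold(img, bins=256):
    # Otsu's method: pick the threshold maximizing between-class variance.
    hist, edges = np.histogram(img.ravel(), bins=bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0, m = np.cumsum(p), np.cumsum(p * centers)
    w1, mt = 1 - w0, np.cumsum(p * centers)[-1]
    valid = (w0 > 1e-12) & (w1 > 1e-12)
    var_b = np.zeros(bins)
    var_b[valid] = (mt * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[np.argmax(var_b)]

def autofocus(hologram, z_candidates):
    # For each candidate Z: back-propagate, segment the (dark) real image by
    # thresholding, and count the background pixels S. The accurate Z2 is the
    # candidate where S is maximal (the real image is most compact there).
    best_z, best_s = None, -1
    for z in z_candidates:
        amp = np.abs(propagate(np.sqrt(hologram), -z))
        s = int(np.sum(amp > otsu_threshold(amp)))
        if s > best_s:
            best_s, best_z = s, z
    return best_z

# Synthesize a hologram of a hypothetical absorbing object at a known Z2,
# then check that the sweep recovers a distance close to it.
u0 = np.ones((256, 256), dtype=complex)
u0[124:132, 124:132] = 0.2
true_z2 = 300e-6
holo = np.abs(propagate(u0, true_z2)) ** 2
z_found = autofocus(holo, np.arange(100e-6, 500e-6, 20e-6))
```

In the real system the rough Z2 estimate narrows the sweep range, and the step size (2 μm in the USAF experiment below) sets the focusing resolution.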
We used a USAF resolution test board as a sample to test the effectiveness of the proposed auto-focusing algorithm. The testing process was as follows. We used the holographic imaging system constructed in this paper to image the USAF resolution test board and obtain its hologram. A part of the hologram was cropped to reduce computational complexity. Then, we numerically back-propagated the hologram in 2 μm increments and saved the back-propagated images within the range of 0-600 μm. Next, we visually identified the clearest image and compared the focal distance obtained by the proposed auto-focusing algorithm with that of the human-confirmed focused image to verify the effectiveness of the algorithm.
Fig. 2(b)-(d) shows the experimental results using the USAF resolution test board as a sample. In Fig. 2(c), the maximum is obtained at Z2 = 346 μm, and the corresponding back-propagated image is a focused real image. In practice, the objects captured by our system are generally small, translucent, low-contrast objects such as blood cells and algae. Applying the auto-focusing algorithm directly to such samples would hardly produce good results, since it is difficult to distinguish the real image by threshold segmentation. Therefore, it is recommended to use a high-contrast, sharp object (such as the USAF resolution test board) for auto-focusing to obtain the accurate Z2 of the system before imaging the experimental samples.

E. Deep Learning-Based Object Detection
Object detection is a fundamental task in computer vision that aims to identify and locate objects of interest in an image or video. It has many practical applications in fields such as security, surveillance, autonomous driving, medical imaging, and robotics. The basic principle of object detection is to use a model that learns to recognize and localize objects from a large amount of labeled data, and then apply the model to new images or videos to make predictions. Deep learning-based object detection models use deep neural networks as the core component to extract features and perform classification and regression on the input data [24]. Deep neural networks are composed of multiple layers of artificial neurons that can learn complex, hierarchical representations of the data through training. Deep learning-based object detection models can achieve high accuracy and robustness in various scenarios, but they also face challenges such as computational complexity, data imbalance, and domain adaptation [25].
In this paper, we focus on deep learning-based object detection methods, which can be divided into two main categories: single-stage methods and two-stage methods. Single-stage methods predict the class and bounding box of objects directly from the input image, while two-stage methods first generate candidate regions and then classify and refine them. Single-stage methods usually have faster inference speed and lower computational cost than two-stage methods, but at the cost of lower accuracy [26]. In our system, we use YOLOv5 as the object detection network.
YOLOv5 is a single-stage deep learning object detection network with high real-time performance and low computational cost [27]. It consists of a family of compound-scaled models trained on the COCO dataset and can perform detection at three different scales with anchor boxes. It also supports features such as test-time augmentation (TTA), model ensembling, hyperparameter evolution, and export to different formats. Compared with object detection networks such as Faster R-CNN, which have slower inference speed and higher memory consumption due to their two-stage pipeline, YOLOv5 is more suitable for detecting cells in holograms [28]. In our experiments, the cell detection accuracy of the models trained with YOLOv5 exceeds 80%. Fig. 3 shows the detection results of YOLOv5 on holograms.
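When evaluating such a detector, predicted boxes are matched to ground-truth boxes by intersection-over-union (IoU). The following generic sketch (not the authors' evaluation code; the 0.5 IoU threshold is a common default rather than a value stated here, and the class names are hypothetical) shows how true and false positives can be counted:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def match_detections(preds, truths, iou_thr=0.5):
    """Greedy matching: a prediction is a true positive when it overlaps an
    unmatched ground-truth box of the same class with IoU >= iou_thr."""
    tp, matched = 0, set()
    for p_box, p_cls in preds:
        for i, (t_box, t_cls) in enumerate(truths):
            if i not in matched and p_cls == t_cls and iou(p_box, t_box) >= iou_thr:
                tp += 1
                matched.add(i)
                break
    return tp, len(preds) - tp, len(truths) - tp  # TP, FP, FN
```

For example, a predicted "rbc" box at (0, 0, 10, 10) matched against a ground-truth "rbc" box at (1, 1, 11, 11) has IoU ≈ 0.68 and counts as a true positive at the 0.5 threshold.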

F. Dataset Construction
The process of constructing a holographic dataset is shown in Fig. 4(a): (1) prepare a cell solution with a suitable concentration; (2) use this system to take holographic images; (3) apply auto-focusing to the holographic images to obtain focused images.

Group A demonstrates that the model can detect microspheres and cells with F1 scores of 92.1% and 81.4%, respectively, on the test set. This indicates that the model can recognize the biological features that distinguish tumor cells from microspheres, such as translucency and the nucleus. The overall accuracy of the model is 77.6%, which verifies the ability of the YOLOv5 object detection network to extract and recognize features from holographic cell images. Group B shows that the model can recognize Caki and SW480 cells with F1 scores of 88.7% and 89.9%, respectively, which are higher than the F1 score of 85% for A2780 cells. This suggests that the model has a relatively consistent recognition ability across the three cell types and a higher overall accuracy (80.1%) than group A.
Group C reveals that the model can detect red blood cells and white blood cells with F1 scores of 83.7% and 88.6%, respectively, indicating a stronger detection ability for white blood cells. The lower F1 score for red blood cells may be due to their discoid (cake-like) shape, which causes a large variation in apparent plane shape at different angles and leads to some side-facing red blood cells being classified as background. The overall accuracy of the model is 78.1%, which verifies the system's ability to detect basic human health indicators. Group D exhibits superior performance on all evaluation metrics in both the validation set and the test set. On the test set, the model achieves an accuracy of 88.0% and F1 scores of 93.6% and 93.2% for mononuclear cells and neutrophils, respectively. These results suggest that the model has a good detection ability for white blood cell subtypes. The detection of white blood cell subtypes (group D) is of great significance for this system: the distribution (proportion and quantity) of white blood cell subtypes can reflect various physiological information (diseases, health status, etc.) about the human body [29]. The proposed system is a miniaturized intelligent cell detection system based on lensless holographic imaging, with the advantages of simplicity, portability, and easy operation while detecting white blood cell subtypes. It can serve as an important tool for instant, intelligent medical diagnosis in medically underserved areas.
In this paper, we use the focused holographic images as the direct input to the deep learning recognition network, instead of the reconstructed images commonly used in traditional cell detection systems based on lensless holographic imaging (Table I shows the effect of using focused holograms). This avoids the computationally intensive image reconstruction process. Reconstruction does not add information to the image, and the classification ability of the object detection network mainly depends on whether the input images carry enough information to distinguish different types of holographic cell images. The average accuracy of the four groups of object detection experiments in this section is 80.9%; the residual errors may be attributed to two factors: 1) some faint circular outlines in the hologram (cells at other heights, outside the focal plane, mapped onto the current focal plane) are recognized by the network model as cells, although they are labeled as background in the manually annotated datasets; and 2) the information in the holographic images formed under the current imaging conditions is insufficient.
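For reference, the precision, recall, accuracy, and F1 metrics reported above follow the standard definitions; a minimal sketch with hypothetical counts (not the paper's actual confusion-matrix entries):

```python
def detection_metrics(tp, fp, fn):
    """Standard per-class metrics from true-positive, false-positive, and
    false-negative counts: precision = TP/(TP+FP), recall = TP/(TP+FN),
    and F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts for one cell class (not taken from the paper's results):
p, r, f1 = detection_metrics(tp=90, fp=10, fn=10)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")  # all three are 0.900
```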

IV. CONCLUSION
In this paper, we proposed a miniaturized and intelligent lensless holographic imaging system with auto-focusing and deep learning-based object detection for label-free cell classification. The system uses a simple and portable structure to image unstained cell samples in solution and employs a threshold segmentation-based auto-focusing algorithm to obtain focused holographic images. We also used a deep learning object detection network (YOLOv5) to classify different types of cells from the focused holographic images. We demonstrated the system's performance on four cell detection tasks: tumor cells vs. polystyrene microspheres, different tumor cell types, red blood cells vs. white blood cells, and white blood cell subtypes. The system achieved an average accuracy of 80.9% over the four test groups, with label-free, portable, intelligent, and fast cell detection capabilities, and has potential applications in the miniaturized cell detection field.
The main contributions and innovations of this paper are as follows:
• We proposed a smart cell detection system based on lensless holographic imaging and deep learning, which can overcome the limitations of conventional methods such as optical microscopy and flow cytometry in terms of field-of-view, throughput, cost, size, and operation.
• We proposed an auto-focusing algorithm based on threshold segmentation that suits the characteristics of the cell samples in our system. The algorithm can determine the optimal focusing distance for each imaging session and obtain focused holographic images.
• We used the YOLOv5 framework to train and test models on the four holographic cell datasets we constructed, which were acquired from cell samples of different types and sizes at a concentration of about 2 × 10⁴/mL. The YOLOv5 framework can directly locate and detect targets on the entire image, which combines well with the large field of view of lensless holographic imaging.
• We evaluated the performance of our system on four cell detection tasks and obtained high-accuracy results. The system can recognize the biological features of different types of cells, such as translucency, nucleus, shape, and size. It can also detect white blood cell subtypes, which is of great significance for instant intelligent medical diagnosis.

Fig. 2 .
Fig. 2. (a) Workflow of the auto-focusing. (b)-(d) Results of auto-focus processing to find an accurate Z2 using the USAF resolution test board. (b) The more distinct part of the holographic interference pattern is cropped from the original hologram for auto-focusing; (c) sum curve of non-real-image pixels after target separation by threshold segmentation at different back-propagation distances; the maximum is obtained at Z2 = 346 μm, which is the focus point; (d) back-propagated images and their threshold segmentation results at different distances taken from the back-propagation process.

Fig. 3 .
Fig. 3. Hologram input and prediction output of the YOLOv5 network.

Fig. 5 .
Fig. 5. Distribution of loss values during the training process of the four object detection models. (a)-(d) Loss curves for the models trained on tumor cells vs. microspheres, multiple types of tumor cells, red blood cells vs. white blood cells, and mononuclear cells vs. neutrophils, respectively.

Fig. 6 .
Fig. 6. Results of the detection experiments. (a)-(d) Test-set results for tumor cells vs. microspheres, multiple types of tumor cells, red blood cells vs. white blood cells, and mononuclear cells vs. neutrophils, respectively. Each group contains the confusion matrix, precision, recall, accuracy, and F1 score.

TABLE I: DIFFERENT EFFECTS OF FOCUSED HOLOGRAM AND RAW HOLOGRAM AS INPUT FOR YOLOV5