Direct Cell Counting Using Macro-Scale Smartphone Images of Cell Aggregates

The field of bioengineering depends on technologies for stable cell culture. Conventionally, every process involved in cell culture has been performed manually, so the culture efficiency and stability can vary between trials or depending on the technician. Among these processes, cell counting is particularly important because cell density affects cell function. Conventional cell counting techniques for cell number estimation are inefficient and unstable because they involve the manual work of collecting a sample of the cell suspension. Thus, a cell counting method that is not susceptible to human error is needed. In this study, we present a novel cell counting method based on smartphone imaging and convolutional neural network-based image processing. Cells are aggregated by centrifuging in a tube and then imaged using a smartphone. The image is transferred to a server, and the cell number is predicted using convolutional neural networks on the server. All processes are performed by a custom-developed smartphone-compatible web app. Compared with the conventional method using a hemocytometer, our method yields more stable cell counting. Furthermore, the time and labor required for cell counting are significantly reduced. Our new method could potentially replace conventional cell counting techniques and thus enhance the stability and efficiency of bioengineering studies that require cell culture.


I. INTRODUCTION
Studies in bioengineering, including tissue engineering, regenerative medicine, organ-on-a-chip, and biomedicine studies [1], [2], require stable and effective cell culture methods. Cell culture processes involve seeding, detaching, and reseeding cells. The density of cultured cells is known to affect their function and proliferation rate, so the number of seeded cells must be measured and adjusted before seeding [3]-[5]. Conventionally, every process involved in cell culture is performed manually. However, manual steps present a risk of contamination and result in variable culture efficiency and stability between trials and depending on the technician [6]. In the interest of standardizing these processes, several techniques have been developed, such as cell patterning, detaching, collecting, and harvesting [7]-[12]. Cell counting technologies, which are key to realizing stable and effective cell culture, have been developed, but they require dedicated devices or labor-intensive processes [13], [14].
The associate editor coordinating the review of this manuscript and approving it for publication was Rajeswari Sundararajan.
The most widely used conventional cell counting method consists of several steps, as shown in Fig. 1a. First, a sample of the cell suspension is collected from the suspension containing all cells. Second, the sample is loaded into a hemocytometer mounted on a microscope. Finally, the cell density of the sample is measured by a technician and used to calculate the number of cells in the suspension, assuming that the cell density of the sample is equal to that of the suspension. This manual process is tedious and time consuming, increasing the burden on technicians. Because the counted cells cannot be reused, this process wastes cells that may be difficult to obtain. Moreover, the estimation of cell density in the suspension is susceptible to human error (for example, when counting cells on the microscope) and depends on the skill of the technician; therefore, the results tend to vary from technician to technician. The added steps also increase the risk of contamination.
Several cell counting methods based on cutting-edge technologies, such as image analysis, electronics, and optics, have been proposed [15], [16]. Most of these methods are performed with a sample of cells from the cell suspension. Of these methods, image analysis-based techniques have been most commonly implemented in practical applications. An automated cell counter designed to decrease the technician burden and risk of miscounting has been developed and commercialized [17]. The cell suspension is introduced into a dedicated chamber, and the device automatically measures the cell density in the sample. While this tool does reduce the time and labor required for cell counting, the measurement may still be susceptible to error because it requires manual sampling of the cell suspension. Another image analysis-based method counts cells from microscopic images of cells adhered to a culture surface [18]. Manual sampling is not required with this approach. However, because the density of cultured cells is not homogeneous, the accuracy of the predicted cell number is influenced by the cell seeding techniques used by technicians [19]. In contrast, optics-based methods can be used to measure the number of cells in the entire cell suspension, but they require dedicated and costly devices and labor-intensive procedures and can be used only for cell suspensions in which the approximate cell number is known [16]. Thus, a novel cell counting method that is more practical and robust against error is still required.
As mentioned above, almost all previously developed methods for cell counting require sampling of the cell suspension, which introduces many issues. Although several attempts have been made to replace the entire manual cell culture process with a fully automated system, no method of automated cell counting has been applied in practice [20]. To address the limitations of conventional approaches, the cell number should be measured directly in a suspension to prevent technician error and variation due to technician skill and thus obtain consistent results. Furthermore, decreasing the number of steps involved in the process is helpful for reducing the time cost and risk of contamination.
Hence, we propose a technique involving the use of macro-scale images showing all cells in a suspension. Because single cells are too small to be observed or counted in a macro-scale image, we focused on the only moment at which cells can be observed even with the naked eye: immediately after centrifugation, when the cells are aggregated at the bottom of the tube. Skillful and experienced technicians are sometimes able to predict the number of aggregated cells after centrifugation, which demonstrates the potential for estimating the cell number at this stage. Deep learning methods, especially convolutional neural networks (CNNs), have been successful in various computer vision tasks such as classification, segmentation, and regression [21]-[25]. Thus, we used deep learning with CNNs to predict the cell number from macro-scale images of aggregated cells.
On the basis of this approach, we developed a novel cell counting method in which an image of aggregated cells is captured by a smartphone and transferred to a server by a smartphone web application. The CNN on the server then predicts the cell number (Fig. 1b). The use of ubiquitous smartphone technology offers fast computing, easy connectivity to servers, and a user-friendly interface [26]-[30].
The proposed method is simple and relatively quick, and provides superior accuracy and consistency for cell counting compared with the conventional hemocytometer-based method.

II. METHODS
A. OVERVIEW OF THE PROPOSED METHOD
Our method consists of three elements: image capture of aggregated cells, a web application for smartphones, and a CNN. First, the aggregated cells in the tube are imaged from two directions using a smartphone. Second, the smartphone-based web application transfers these images from the smartphone to the server. Third, the CNN predicts the number of cells from the transferred images. The predicted cell number is transferred back to the smartphone via the web application. Therefore, in our method, the input is the centrifuge tube containing aggregated cells, and the output is the estimated cell number (Fig. 1b).

B. IMAGE CAPTURE
In general, training a CNN to make robust inferences from images captured under various imaging conditions requires datasets containing a large number of images captured under a range of such conditions [29]. To minimize the number of training images required, we fixed the imaging conditions by using a jig to hold the centrifuge tube and smartphone, as shown in Fig. 2. With this jig, images can be taken from two directions with the exposure angle, lighting, and distance of the object from the camera kept constant. Figure 2 shows the fabricated jig with a 15-mL centrifuge tube containing a cell suspension. The jig comprises two parts, each fabricated with a 3D printer (BCN3D SIGMA R19, BCN 3D Technologies, Barcelona, Spain). The upper part covers the side of the tube and fixes the tube position relative to the camera. The bottom part has a holder for the smartphone, a space holding a light-emitting diode (LED; LP-LED3SET), and a black wall. The black wall was integrated to shade the tube from direct illumination and thus prevent direct light from reflecting off the surface of the tube, which would occlude the view of the cells. A 15-mL tube (TR2000, Nippon Genetics Co., Ltd., Tokyo, Japan) is set into the jig. The 3D model files of our developed jig are provided in the supplementary information. The smartphone holder fixes the smartphone upside-down at a constant distance of 80.6 mm from the tube. The LED helps to maintain consistent illumination conditions.
To prepare cells for imaging, a cell suspension is centrifuged (H-19α, Kokusan, Saitama, Japan) in a 15-mL tube for 2 min at 370 × g. The cell density in the suspension is measured three times using an automatic cell counter (TC20™ Automated Cell Counter, Bio-Rad, CA, USA) as the reference standard.

C. DESIGN OF THE CNN
We implemented a CNN model that predicts the number of cells from images of aggregated cells taken from two directions (Fig. 3a). Our CNN model comprises consecutive convolutional layers, max pooling layers, and fully-connected layers. The last fully-connected layer consists of one neuron for the objective function (Fig. 3b). Every convolutional layer and all fully-connected layers except for the last are followed by rectified linear units [30]. We applied dropout (rate: 0.5) to the fully-connected layers [31]. Because the CNN model learns a regression task, the mean-squared error was used as the objective function. The details of the hyperparameters of the CNN model are provided in Supplementary Table 1.
VOLUME 8, 2020

D. DESIGN OF THE CELL COUNTING APP
We designed a web application (herein referred to as cell counting app, CCA) to serve as an interface between the smartphone and the CNN model on a server. The CCA consists of three main components: a web-based user interface, a web server for providing an application programming interface (API) for cell counting, and a server for machine learning (Fig. 4). The user interface allows the user to upload two images and displays the counting results. The web server receives the images and carries out the analysis using the trained CNN model. This CNN model was trained by machine learning on the server and transferred to the web server in advance. The web-based CCA algorithm performs the following steps: (1) the images are received; (2) the CNN model carries out the image analysis; and (3) the user interface receives the result of the analysis (i.e., the number of cells).
The user interface and cell counting API were implemented in HTML, CSS, and Python 3.6. We developed separate Django applications to implement the API and user interface independently. The cell counting API was implemented with the Django web development framework because Django is Python-based, which allows the API to interface with the trained CNN model via Python. The web server receives and saves the trained CNN model from the server for machine learning via the cell counting API. The user interface was designed with a single-page application concept using the Django template language with Bootstrap 4. The source code of the CCA is available from https://github.com/funalab/CellCountingApp.
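The three CCA steps listed above (receive the images, run the CNN model, return the cell number) can be sketched in plain Python as follows. This is only an illustration of the control flow, not the code of the published application: the function names, the response layout, and the `dummy_model` stand-in are hypothetical, and the conversion from the network output to a cell number assumes that the output is expressed in units of 10^6 cells, mirroring the ground-truth scaling used for the loss.

```python
def count_cells(image_left, image_right, model):
    """Hypothetical handler for one counting request.

    Step 1: receive the two images taken from the two sides of the jig.
    Step 2: let the (already trained) model analyze them.
    Step 3: return the result in a form the user interface can display.
    """
    if image_left is None or image_right is None:
        raise ValueError("both images are required")
    prediction = model(image_left, image_right)       # step 2
    # Assumption: the model output is in units of 10^6 cells.
    return {"cell_number": round(prediction * 1e6)}   # step 3


def dummy_model(image_left, image_right):
    """Placeholder standing in for the trained CNN; returns a fixed value."""
    return 3.2


response = count_cells(b"left-image-bytes", b"right-image-bytes", dummy_model)
print(response)
```

Keeping the model behind a plain callable, as in this sketch, would let the same handler be reused if the trained model on the web server were replaced.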

E. PREPARATION OF THE TRAINING AND TEST DATASETS
We prepared a training dataset to train the CNN model. We captured images of cells centrifuged and aggregated at the bottom of the conical centrifuge tube using a smartphone (iPhone 8, Apple Inc., Cupertino, CA, USA) set into the jig; images were captured from two directions (Fig. 3a). The number of cells was varied from 1.0 × 10^6 cells to 1.0 × 10^7 cells in increments of 1.0 × 10^6 cells, and 50 image sets were acquired for each cell number. We defined the number of cells measured by the automatic cell counter as the true value. We divided these datasets into five subsets and performed cross-validation.
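With 10 cell numbers and 50 image sets each, the training dataset contains 500 image sets, so five subsets hold 100 image sets each. The sketch below shows one way such a split could be made. The text does not state how the subsets were formed, so the stratification by cell number, the fixed random seed, and the names `make_folds` and `samples` are assumptions for illustration.

```python
import random


def make_folds(samples, n_folds=5, seed=0):
    """Split (cell_number, index) samples into n_folds subsets.

    Assumption: each subset receives the same share of every cell number
    (stratified split); the paper does not specify this.
    """
    rng = random.Random(seed)
    by_label = {}
    for sample in samples:
        by_label.setdefault(sample[0], []).append(sample)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        for i, sample in enumerate(group):
            folds[i % n_folds].append(sample)
    return folds


# 10 nominal cell numbers (1.0e6 to 1.0e7), 50 image sets each.
samples = [(n * 1_000_000, k) for n in range(1, 11) for k in range(50)]
folds = make_folds(samples)

for held_out in range(len(folds)):
    validation = folds[held_out]
    training = [s for i, fold in enumerate(folds) if i != held_out for s in fold]
    print(held_out, len(training), len(validation))
```

In each of the five rounds, one subset serves as validation data and the remaining four are used for training.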
To evaluate the developed method, a test dataset was prepared in the same way as the training dataset. The numbers and the densities of cells in each sample were randomly determined. In addition, cell counting was also performed using a hemocytometer as a conventional cell counting method for comparison; the cell counting was repeated three times with the same sample. The datasets are available from https://github.com/funalab/CellCountingApp.

F. TRAINING PROCEDURE FOR THE CNN MODEL
We implemented and trained the CNN model using Chainer, an open-source software for deep learning [32]. We trained the CNN model with five-fold cross-validation. At each fold, we trained the CNN model for 100 epochs using Adam with mini-batches of five images. For each epoch, we evaluated the loss of the CNN model using validation data. The loss, L, was calculated based on the mean-squared error as follows:
$$L = \frac{1}{N} \sum_{i=1}^{N} (y_i - t_i)^2$$
where N, y_i, and t_i represent the size of the mini-batch, the output for the i-th sample, and the corresponding ground truth multiplied by 10^-6, respectively.
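The mean-squared error loss described above, with the ground truth multiplied by 10^-6, can be written in a few lines of plain Python. This is a minimal sketch rather than the Chainer code used for training; the function name `scaled_mse` and the example mini-batch values are hypothetical.

```python
def scaled_mse(outputs, true_counts, scale=1e-6):
    """Mean-squared error between outputs y and scaled ground truth t.

    outputs     : network outputs for one mini-batch
    true_counts : reference cell numbers (cells), multiplied by `scale`
                  before comparison, as described in the text
    """
    if not outputs or len(outputs) != len(true_counts):
        raise ValueError("outputs and true_counts must be non-empty and equal in length")
    targets = [count * scale for count in true_counts]
    return sum((y - t) ** 2 for y, t in zip(outputs, targets)) / len(outputs)


# Hypothetical mini-batch of five samples (values are illustrative only).
loss = scaled_mse([2.1, 4.9, 7.0, 1.0, 9.8],
                  [2.0e6, 5.0e6, 7.0e6, 1.0e6, 1.0e7])
print(loss)  # approximately 0.012
```

Scaling the targets keeps the loss on the order of 1 or below instead of 10^12, which is the usual motivation for this kind of rescaling in regression.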
In the pre-processing step, a 320 × 320 pixel region of interest (i.e., the cell aggregate) was cropped from the original image (Fig. 3a). Then, data augmentation was performed by adding perturbation with a uniform distribution to the cropped image and randomly flipping the cropped image in the horizontal direction. These data augmentations were performed to prevent overfitting of the CNN model [31]. To ensure robustness against variations in illumination intensity, the pixel values of the cropped input image were normalized to the range of [0, 1] by subtracting the minimum pixel intensity and then dividing all pixel intensities by the difference between the maximum and minimum pixel intensities [33].
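The pre-processing steps above (cropping, perturbation, horizontal flipping, and min-max normalization) can be illustrated with plain Python lists standing in for a grayscale image. The order in which augmentation and normalization are applied, the perturbation amplitude, the flip probability of 0.5, and all function names are assumptions; the text does not specify them.

```python
import random


def crop_roi(image, top, left, size=320):
    """Crop a size x size region of interest from a 2D list of pixel values."""
    return [row[left:left + size] for row in image[top:top + size]]


def augment(image, rng, amplitude=2.0):
    """Add uniform noise in [-amplitude, amplitude] and flip horizontally
    with probability 0.5 (amplitude and probability are assumed values)."""
    noisy = [[p + rng.uniform(-amplitude, amplitude) for p in row] for row in image]
    if rng.random() < 0.5:
        noisy = [row[::-1] for row in noisy]
    return noisy


def min_max_normalize(image):
    """Rescale pixel values to [0, 1]: (p - min) / (max - min)."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Constant image: avoid division by zero (handling is an assumption).
        return [[0.0] * len(row) for row in image]
    return [[(p - lo) / (hi - lo) for p in row] for row in image]


# Toy 4 x 4 image; size=2 keeps the example small (the paper uses 320).
image = [[10, 20, 30, 40],
         [50, 60, 70, 80],
         [90, 100, 110, 120],
         [130, 140, 150, 160]]
roi = crop_roi(image, top=1, left=1, size=2)
normalized = min_max_normalize(roi)
model_input = min_max_normalize(augment(roi, random.Random(0)))
print(roi, normalized)
```

Because the normalization depends only on the minimum and maximum of each image, a uniform change in brightness or contrast leaves the normalized image unchanged.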
To compare the conventional cell counting method with our CNN model, we used the model with the least loss in cross-validation and applied it to the test data.

H. STATISTICAL ANALYSIS
We used the concordance correlation coefficient to evaluate the degree of agreement between the cell numbers determined by the proposed method and the true values [34]. The concordance correlation coefficient quantifies how closely paired measurements fall on the line of identity, with a value of 1 indicating perfect agreement. We also used the F value to evaluate the variance in the cell numbers. The F value is the ratio of the variance in the group means to the mean of the variance within the group. The F test and unpaired Student's t-test were performed to compare the two groups. Values of p < 0.05 were considered statistically significant.
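For reference, the concordance correlation coefficient between paired values x and y can be computed as 2·s_xy / (s_x² + s_y² + (mean_x − mean_y)²). The sketch below assumes the population (1/n) form of the variances and covariance; the text does not state which estimator was used, and the function name and toy values are hypothetical.

```python
def concordance_correlation(x, y):
    """Concordance correlation coefficient of two paired sequences.

    Uses 1/n variances and covariance (an assumption; the estimator
    used in the study is not specified).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Toy values: a constant offset lowers the coefficient even though the
# two sequences are perfectly correlated.
perfect = concordance_correlation([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
offset = concordance_correlation([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(perfect, offset)
```

Unlike the Pearson correlation coefficient, this measure penalizes systematic bias, which is why it suits a comparison against reference values.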

III. RESULTS
A. TRAINING AND VALIDATION OF THE CNN MODEL FOR PREDICTING THE NUMBER OF CELLS
We used the training dataset to train our CNN model by five-fold cross-validation and evaluated the training and validation loss for each epoch (Fig. 5). In every fold, the training and validation loss decreased with each epoch and converged. The mean of the lowest training loss across epochs was 0.021 ± 0.002, and the mean of the lowest validation loss across epochs was 0.006 ± 0.002. We defined the convergence model for each fold as the epoch with the lowest validation loss (Fig. 5, orange plot). The fold-1 model with the lowest loss was used for verification with the test dataset, as described in the next section.
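The selection rule described above (per fold, take the epoch with the lowest validation loss; then take the fold whose best model has the lowest loss) can be sketched as follows. The function names are hypothetical, and the loss values in the example are made up purely to exercise the code; they are not results from this study.

```python
def best_epoch(val_losses):
    """Return (epoch, loss) for the lowest validation loss; epochs are 1-based."""
    index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return index + 1, val_losses[index]


def select_model(val_losses_by_fold):
    """Return (fold, epoch, loss) of the model with the lowest validation loss."""
    best = {fold: best_epoch(losses) for fold, losses in val_losses_by_fold.items()}
    fold = min(best, key=lambda f: best[f][1])
    epoch, loss = best[fold]
    return fold, epoch, loss


# Illustrative validation-loss curves (dummy numbers, not measured data).
history = {
    1: [0.050, 0.012, 0.004, 0.006],
    2: [0.060, 0.020, 0.009, 0.008],
    3: [0.055, 0.015, 0.007, 0.010],
}
selected = select_model(history)
print(selected)
```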

B. EVALUATION OF OUR CNN MODEL
We compared our method with the conventional method in terms of accuracy and stability. Accuracy was defined relative to the true value, while stability was defined as the consistency of the measurements. The concordance correlation coefficient and the F value were used to evaluate accuracy and stability, respectively.
First, the accuracy of each method was evaluated by comparing each measured value with the true value (Fig. 6); here, the number of cells measured by the automatic cell counter was considered as the true value. The concordance correlation coefficients of our method and the conventional method were 0.956 and 0.804, respectively. Because a coefficient closer to 1 indicates closer agreement, this finding indicates that the cell numbers estimated by our method agreed more closely with the true values than those estimated using the conventional method; that is, the proposed method was more accurate.
The stability of each method was evaluated by plotting the cell numbers determined by our method or the conventional method against the true values and then determining the variance of the residuals of a linear fitted line. The variances for our method and the conventional method were 3.12 × 10^5 and 7.23 × 10^5 cells, respectively. The difference between these results was statistically significant (F value = 2.32, p-value of F test = 0.03 < 0.05). This result indicates that our method is more stable than the conventional method because it has a lower variance of residuals (i.e., prediction error).
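As a quick check, the reported F value is consistent with the ratio of the two residual variances given above, with the larger variance in the numerator:

```latex
F = \frac{7.23 \times 10^{5}}{3.12 \times 10^{5}} \approx 2.32
```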

C. DEMONSTRATION OF OUR METHOD
Next, we evaluated the entire proposed method from the viewpoints of the simplicity of the procedure and the time required to conduct the measurement.
To evaluate the simplicity of the proposed method, we compared the procedures of the conventional method and our method. As shown in Fig. 1a, the conventional cell counting method using a hemocytometer involves four steps. After centrifugation, the supernatant is removed, and the cells are diluted in fresh medium to realize an appropriate cell density for counting (step 1). Then, a sample of the cell suspension is introduced into a hemocytometer, which must be washed daily (step 2). The hemocytometer is then mounted on a microscope, and the cell density is visually counted and used to calculate the total cell number (step 3). Finally, the sample is centrifuged once more (step 4) and resuspended in the volume of media needed to realize the desired cell density. In contrast, our method involves only a single step, namely using the CCA, as shown in Fig. 7; the remainder of the process is automated. One of the advantages of our method is that there is no need to maintain a homogeneous cell density. To use the CCA, the aggregated cells in the tube are imaged with a smartphone from the left side of the jig (sub-step 1-1) and again from the right side of the jig (sub-step 1-2). The captured images are then loaded into the appropriate location on the user interface of the CCA, and the 'Counting' button is pressed (sub-step 1-3). The estimated cell number is returned after tens of seconds (sub-step 1-4); the time from pressing the counting button to obtaining the result was 31.080 ± 3.806 s (mean ± standard deviation, n = 10). These findings demonstrate the simplicity and limited burden of the proposed method. Supplementary Movie 1 shows a comparison of the proposed and conventional cell counting methods.
We also measured the time required to carry out these two cell counting methods, as shown in Fig. 8. Our method was faster than the conventional method: while the conventional method (steps 1 through 4) required around 383.7 s, our method (a single step) required around 56.3 s, and this difference was statistically significant. This finding indicates that the proposed method realizes simple and quick cell counting.
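Using the approximate times reported above, the size of the reduction can be worked out directly:

```latex
\frac{383.7\ \text{s}}{56.3\ \text{s}} \approx 6.8,
\qquad
383.7\ \text{s} - 56.3\ \text{s} = 327.4\ \text{s}
```

That is, the proposed method took roughly one-seventh of the time of the conventional method, a saving of about 5.5 min per count.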

IV. DISCUSSION
In this study, we demonstrate a novel automated smartphone-based cell counting method that is more convenient, accurate, and stable than the conventional hemocytometer-based cell counting method. The conventional technique is inconvenient because it involves multiple complicated steps and manual cell counting. The proposed method can significantly reduce the time and number of procedures required for counting cells. The use of smartphone technology with its seamless inputs and outputs of data enabled the rapid response of the proposed method. Furthermore, the required run-time of the proposed method is expected to be reduced in the near future as data transmission speeds increase. A comparison of the proposed and conventional methods shows that our method could reduce the burden on technicians and the risk of contamination. Furthermore, the proposed method is promising not only from the viewpoint of functionality but also in terms of environmental protection; as shown in Supplementary Movie 1, the use of many consumables and washing of the hemocytometer (both of which are considered environmentally unsustainable) are required in the conventional method, but not in the proposed method [35]. Finally, given the ubiquity of smartphones, the method is readily applicable in practice because the jig is the only special device required.
While we found that the accuracy and stability of the proposed method were statistically better compared with the conventional method, the actual cell number cannot be measured precisely. In this study, we defined the number of cells measured by an automatic cell counter as the true value, but there is no practical method to measure the exact cell number. In fact, even though there is a generalized protocol for bioengineering experiments, researchers need to make their own protocols. However, the exact number of cells is generally not of great importance. The estimated cell number derived with conventional methods is unreliable because of the numerous manual steps that are required to measure the cell number, as well as variation resulting from differences in the approach of individual technicians. Thus, the method proposed here is valuable in that it can serve as a consistent standard that is independent of technicians' skills.
Although our method provided superior accuracy and stability, further improvements could be achieved with two modifications (aside from simply increasing the size of the training datasets). The first involves the hardware, namely, modifying the size of the jig to obtain a clearer image with more consistent imaging conditions. Because of the gap between the jig, the tube, and the smartphone, the ROI can shift within the image frame (Supplementary Fig. 1), and the focus of the image can shift (Supplementary Fig. 2); the latter of these issues is more problematic. This could be solved by the use of a larger jig with appropriate dimensions. This modification should yield images better suited to the training dataset, which would thus presumably improve the results of the proposed method. In addition, the CNN model predictions in this study were based on images captured from only two directions, whereas cell aggregation occurs in three dimensions. Accordingly, the accuracy of the results could be further improved by modifying the jig to allow images to be captured from more directions.
The second modification involves the software: the accuracy of the proposed method could be improved by changing the structure of the CNN model. The CNN model used in this study was based on a conventional architecture, VGG16 [36]. Future versions of the CNN model can be supplemented with modules, such as residual modules and attention modules, which have been reported to be effective in improving prediction accuracy [37], [38].
Here, we validated our method with a single cell type, constant centrifugation conditions, and one type of tube. Because cells of different species have different sizes, different centrifuging conditions result in cell aggregates of different sizes, even with a constant cell number [39], [40]. The jig used in this study was developed for a 15-mL tube, and cannot be used with other tube sizes. Thus, while we successfully demonstrated the potential performance of the proposed method, the results are specific to the conditions tested here. For our method to be used with other cell species, datasets must be collected for each cell species, and the CNN model must be trained by these datasets on the server to prepare a tailor-made trained model. These updates would make the present method more robust against conditions such as different cell species and various imaging conditions.
While several cutting-edge cell counting technologies have been proposed, there remains a need for techniques that do not require manual sampling and that can return results quickly for daily cell culture processes. In this study, we demonstrated the value of our novel cell counting method. We believe that our method has tremendous potential for easy cell number measurement in routine applications. To increase the ubiquity of the developed approach using macro-scale images, the CNN model can be incorporated into an automatic cell culture process rather than implementing a web application. The smartphone camera and jig could be replaced with higher quality equipment to improve and stabilize the imaging conditions, and thus facilitate the use of a CNN model with an automatic cell culture system.

V. CONCLUSION
In this paper, we introduced a novel cell counting method for estimating the cell number from images of a tube in which cells have been aggregated by centrifugation. The proposed method exhibited superior accuracy, stability, and convenience compared with the conventional and widely-used hemocytometer-based cell counting method. Our method is expected to be broadly applicable in bioengineering studies, and its implementation is expected to significantly reduce the burden of cell culture processes.

ACKNOWLEDGMENT
(Chikahiro Imashiro and Yuta Tokuoka contributed equally to this work.) The authors would like to thank Dr. Shuichi Kurabayashi of Cygames, Inc., for his valuable advice in the development of our idea. Chikahiro Imashiro and Yuta Tokuoka are JSPS Research Fellows.