Label-Free Normal and Cancer Cells Classification Combining Prony’s Method and Optical Techniques

Label-free methods neither damage cells nor alter their composition and intrinsic characteristics. There is therefore considerable interest in the scientific community in learning more from existing methods and in developing new label-free methods for the detection and classification of cells. Optical measurements are frequently used for cell classification: because different cell types differ in composition, their interaction with light produces different optical absorption and transmission responses. This work combines advances in optical measurement with the Prony technique to enhance the classification of cells based on their measured optical profiles. Six types of cells (HeLa, 293T, normal and cancerous lung cells, and normal and cancerous liver cells) were suspended in their corresponding media, and their transmission characteristics were assessed. After de-embedding the media, the transmission profiles were fitted with a sum of exponentially decaying signals using the Prony algorithm. The optical response of each cell type was then modeled with a set of extracted parameters: amplitude, frequency, phase, and damping factor. These four parameters are related to the coefficients and locations of the poles of each fitted model. A figure of merit (FOM) is introduced whose distribution in the complex z-plane plays a major role in classifying the cell type. Changes in the FOM values arise from differences in the composition and intrinsic characteristics of the cells.


I. INTRODUCTION
Diagnosing diseases such as cancer at an early stage is crucial. At the initial stage, the symptoms of cancer are not apparent. Once cancer spreads, effective treatment becomes onerous and the patient's survival rate is generally very low. More than 90% of women diagnosed with breast cancer at an early stage survive the disease for 10 years, compared to fewer than 20% surviving for 5 years when diagnosed at an advanced stage [1]. Approximately 93% of patients diagnosed with colon cancer at an early stage survive for 5 years, a higher rate than among those diagnosed at a later stage [2]. A similar increase in survival rate is found for other types of cancer when detected early [3]. Much work has therefore been carried out on discriminating normal and cancer cells at an early stage. Various methods and techniques, such as the empirical mode decomposition technique [4], a genetic algorithm [5]-[8], a projection image transformation algorithm [9], a quasi-Newton inverse algorithm [10], and the Prony technique [11], have been utilized for distinguishing normal and cancer tissues or cells.
The associate editor coordinating the review of this manuscript and approving it for publication was Derek Abbott.
Mukhopadhyay et al. used the empirical mode decomposition (EMD) technique to detect cancer at an early stage [4]. The signals obtained by elastic scattering spectroscopy from normal and cancer cells were processed using EMD, which decomposes the optical response into a finite number of band-limited signals known as intrinsic mode functions (IMFs). The area parameter of each IMF obtained for normal and cancerous cervical tissues was used to discriminate the tissues. The results show that the algorithm is efficient in sorting normal and cancerous tissues. However, EMD has limitations in discriminating the components of narrowband signals [5]. Li et al. applied a genetic algorithm (GA) combined with linear discriminant analysis (LDA) as a signal processing technique for the detection of nasopharyngeal cancer [6]. The spectra obtained by surface-enhanced Raman spectroscopy from malignant and benign tissues were analyzed using the GA-LDA method, which searches for the dominant spectral features; alterations in these features between normal and cancer tissues were used to differentiate the tissues. Although the algorithm worked efficiently for cancer tissue discrimination based on feature selection, it has limitations: it has to be run more than 100 times to select the appropriate spectral bands, and the overall accuracy of the diagnostic model is 76.9%. Li et al. used the same GA-LDA technique for detecting bladder cancer [7], with the same drawback: the algorithm is executed more than 100 times, and the Raman variables characterizing bladder cancer are searched in each run. Duraipandian et al. reported the use of a GA along with partial least squares-discriminant analysis (PLS-DA) and double cross-validation (dCV) for feature selection from Raman spectra of normal and cancerous cervical tissues [8].
The results show a diagnostic accuracy of 83% in discriminating cancerous and normal cervical tissues. Franceschini et al. applied a projection image transformation algorithm for processing images of breast tissues [9]. The optical images processed by this technique enhance the features that reveal the inhomogeneity in normal and cancerous tissues. The spatial resolution of the optical method used in that work is 1 cm, but tumors of smaller size can be detected if the images have good optical contrast. Salomatina et al. investigated the optical differences between cancerous and normal skin cells [10]. They used a sphere spectrophotometer to conduct the absorption and transmittance measurements. Optical properties such as the absorption and scattering coefficients of normal and cancerous skin were obtained from the measured quantities using a quasi-Newton inverse algorithm and the Monte Carlo technique. The quasi-Newton inverse algorithm is efficient in that it requires fewer than 10 iterations to reach convergence. The differences in the optical parameters are considered statistically significant if the probability value (p) is less than 0.05, that is, at the 95% confidence level.
Hauer employed the Prony method to determine the modal components of the signal response obtained from a Western U.S. power system [12]. The signal components extracted using the Prony technique, in combination with Fourier techniques and frequency-domain approaches, were used for dynamic modeling of the power system. The results show that the Prony algorithm gave a good fit with a reasonable SNR value for a high-noise signal. The Prony method is also used for finding low-frequency oscillations in power systems. Xiao et al. compared the fast Fourier transform (FFT) and the Prony technique for identifying low-frequency oscillations in power systems and concluded that Prony is a competent technique compared to the FFT [13]. The simulation results show that the technique is efficient for identifying low-frequency oscillations in real grids. Chuang et al. applied Prony analysis to a synthesized signal representing the backscattered signal from radar targets [14]. The Prony algorithm was used to deduce the natural resonances of the targets, which were then used for target detection and discrimination. They concluded that the results obtained through the Prony method in the absence of noise are more reliable than those from a numerical search procedure, and that Prony's method largely avoids the lengthy computation time of numerical search methods. Marple et al. discussed the use of Prony's method to detect and classify acoustic transient signals obtained from subaquatic sonar sensors [15]. The energy component coupled to the pole amplitudes and damping constants of the estimated model is used as a key for transient detection and for extracting the features used in classification. The results show that the technique worked very well even in the presence of noise in the signal.
In biomedical signal processing, the Prony technique is prominently used for the characterization of tumors [16]-[18], for cancer detection [19]-[21], and for power spectrum estimation of DNA sequences [22]. Huo et al. reported the use of the Prony method in an attempt to model breast tumors [16]. The tumor in the breast is represented as a concealed dielectric target. When it is subjected to a short EM pulse, it backscatters a signal that includes complex natural resonances (CNRs), which are equivalent to the poles of the tumor. The Prony method gives the poles and residues from the time-domain backscattered signal. The complex natural resonances can be correlated with the morphology and intrinsic composition of the tumor. Hence, the optical and electrical properties can be used to detect and identify tumors. Li et al. used an approach similar to [16] for characterizing breast tumors based on 2D-FDTD simulation [17]. The time-domain response of the tumors is obtained through FDTD simulation and analyzed using the Prony technique. The results are promising for characterizing breast tumors when used in combination with diagnostic imaging methods such as ultrasound imaging and confocal microwave imaging. Wang et al. discussed the use of the Prony technique for extracting poles from noisy data for tumor characterization [18]. The results show that the poles extracted using the Prony technique were accurate even when the detected signal was mixed with a limited level of noise. Bannis et al. employed the Prony technique for breast cancer tumor detection from a scattered-field electromagnetic (EM) signal [19]. The poles extracted from the scattered EM signal are used as a tool for breast cancer detection.
In another work, studying the effect of the chest wall on breast tumor detection, they used the Prony algorithm for pole extraction [20]. Gale et al. used the Prony method to estimate the parameters of a nuclear magnetic resonance (NMR) signal obtained from blood plasma for the early detection of cancer [21]. Roy and Barman suggested a method to estimate the power spectral density of a DNA sequence. In this approach, the simulation results from Prony's all-pole model efficiently distinguished the coding and noncoding regions of a DNA sequence [22].
This work combines advances in optical measurement with the Prony technique to enhance the label-free classification of cells based on their measured optical profiles. Six kinds of cells (HeLa, 293T, normal and cancerous lung cells, and normal and cancerous liver cells) were suspended in their corresponding media, and their transmission characteristics were collected. It is worth mentioning that the cell lines under investigation (HeLa, 293T, lung, and liver cells) were taken from different organ tissues, whereas the healthy and cancerous lung cell lines were taken from the same organ tissue, as were the healthy and cancerous liver cell lines. The transmittance profiles were then fitted with a sum of decaying exponential signals using the Prony algorithm. A figure of merit (FOM) was introduced whose distribution in the complex z-plane plays a major role in classifying the cell type. The alteration in the FOM values is due to the differences in composition and intrinsic characteristics of the cells.

A. CELL SAMPLE PREPARATION AND CULTURE
The cell lines used in this work were procured per the American Type Culture Collection (ATCC) standard. Each type of cell was cultured in a medium specific to that cell type. The nutritional requirements for growth in vitro differ with the type and features of the cells [23], and this difference also applies to normal and cancerous cells of the same tissue. A summary of the cells used in this work is shown in Table 1.
The culture media and methods for the six types of cells used in this study are detailed below. A humidified atmosphere with 5% carbon dioxide (CO2) at 37 °C was maintained for all the cells.

1) BEAS 2B - NORMAL LUNG CELLS
Per the ATCC guidelines, the culture plates on which the cells were cultured were precoated with a precoating mixture. The mixture used for BEAS 2B cells contains fibronectin (0.01 mg/mL), bovine collagen (0.03 mg/mL) and bovine serum albumin (0.01 mg/mL) diluted in bronchial epithelial basal medium (BEBM). The BEGM bullet kit (Lonza™ Clonetics™), which includes the essential additives for primary culture (the gentamycin/amphotericin was discarded), was used for BEAS 2B cells. Supplements such as penicillin (100 units/mL) and streptomycin (100 mg/mL) were added to the medium. For trypsinization, an EDTA solution (0.53 mM) with 0.5% polyvinylpyrrolidone (PVP) was used.

2) HCC-827 - LUNG CANCER CELLS
The ATCC-recommended medium for culturing HCC-827 lung cancer cells is the Roswell Park Memorial Institute (RPMI) 1640 medium. RPMI-1640 was obtained from HyClone™, US. The medium is suitable for culturing a variety of mammalian leukemic cells. It was supplemented with 10% heat-inactivated fetal bovine serum (FBS). Trypsinization of the cells was done with 0.25% trypsin (0.53 mM EDTA solution).

3) THLE2 - NORMAL LIVER CELLS
The culture plates were precoated with a mixture consisting of 2.9 mg/mL of collagen I, 1 mg/mL of fibronectin, and 1 mg/mL of bovine serum albumin in BEBM. The reagents were procured from Sigma-Aldrich. The growth medium for the THLE2 cells was the Lonza™ Clonetics™ BEGM bullet kit, with the gentamycin/amphotericin and epinephrine discarded, containing epidermal growth factor (EGF) (5 ng/mL), phosphoethanolamine (70 ng/mL), and the other additives in the kit. The medium was supplemented with heat-inactivated FBS (HyClone™, US; 10%) and penicillin-streptomycin (Gibco; 1%). Trypsinization was carried out with 0.5% trypsin (0.53 mM EDTA solution).

4) HEPG2 - LIVER CANCER CELLS
The HEPG2 cancer cells from liver tissue were grown in Dulbecco's modified Eagle's medium (DMEM, HyClone™) in culture plates. The medium was supplemented with 10% FBS (HyClone™, US) and 1% penicillin-streptomycin (Gibco). Per ATCC guidelines, trypsinization for these cells was done using 0.5% trypsin (0.53 mM EDTA solution).

5) 293T - NORMAL KIDNEY CELLS
These normal cells from kidney tissue were cultured in a DMEM (HyClone™) base. The medium was supplemented with 10% FBS and antibiotics such as penicillin-streptomycin and gentamicin.

6) HeLa - CERVICAL CANCER CELLS
According to the ATCC standard, the HeLa cells were cultured in DMEM (HyClone™) with 7% fetal calf serum (FCS) and the antibiotics PenStrep and gentamicin as supplements. The cells were subcultured and trypsinized as per the ATCC protocol. Each type of cell was suspended and cultured separately.

B. SPECTROPHOTOMETER
A light beam from a xenon light source is split into its component monochromatic beams by a diffraction grating [24]. The single-wavelength beam is then divided into two equal-intensity beams. The first is the reference beam, which passes through a cuvette loaded with only the medium. The second passes through a transparent container loaded with cells in the medium. The container is a high-precision cell made of quartz (Suprasil) with a light path of 1 mm and an area of 2 × 2 mm². The spectrometer has an electronic detector that measures the intensities of the light beams. Based on the measured intensities, the transmittance of the cells is determined.
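The last step can be illustrated with a minimal sketch. It assumes the usual double-beam definition, in which the transmittance at each wavelength is the ratio of the sample-beam intensity to the reference-beam intensity; the function name and the detector readings are hypothetical and are not taken from the instrument used here.

```python
def transmittance(sample_intensity, reference_intensity):
    """Per-wavelength transmittance, assumed to be sample / reference intensity."""
    if len(sample_intensity) != len(reference_intensity):
        raise ValueError("both beams must be sampled at the same wavelengths")
    result = []
    for i_sample, i_ref in zip(sample_intensity, reference_intensity):
        if i_ref <= 0:
            raise ValueError("reference intensity must be positive")
        result.append(i_sample / i_ref)
    return result


# Hypothetical detector readings (arbitrary units) at three wavelengths.
reference = [1000.0, 800.0, 500.0]  # beam through the cuvette with medium only
sample = [900.0, 600.0, 450.0]      # beam through the cuvette with cells in medium
print(transmittance(sample, reference))
```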

C. SENSOR AND LIGHT SOURCE
The optical sensor used in this work, a C11708MA mini-spectrometer (Hamamatsu, Japan), converts the variable attenuation or reflectance into signals. The details of the integrated MEMS sensor and the light source used in this work are reported in [25].

III. CURRENT APPROACH
This section summarizes the Prony estimation principle. The measured optical responses can be fitted or modeled with a sum of damped exponential signals as given in (1) [26]:

$$y[n] = \sum_{i=1}^{p} A_i \, e^{(\alpha_i + j 2\pi f_i)\, n T_s + j\theta_i}, \qquad n = 0, 1, \ldots, N-1 \tag{1}$$

where $N$ is the number of samples and $p$ is the order of the fitted model, which equals the total number of damped exponential components in the summation. The least number of exponentials that gives the best fit is considered the optimum order of the fitted model. The complexity of the fitted model increases with the order. The $i$th exponential component has amplitude $A_i$ (same unit as $y[n]$), frequency $f_i$ (Hz), damping factor $\alpha_i$ (per second), and initial phase $\theta_i$ (radians). $T_s$ is the sampling interval between consecutive data samples. Using the Z-transformation, (1) can be expressed as follows:

$$y[n] = \sum_{i=1}^{p} h_i \, z_i^{\,n} \tag{2}$$

where $h_i$ represents the coefficient (magnitude) of the estimated poles and $z_i$ denotes the location of the poles in (2). These parameters can be expressed as:

$$h_i = A_i \, e^{j\theta_i} \tag{3}$$

$$z_i = e^{(\alpha_i + j 2\pi f_i)\, T_s} \tag{4}$$

The sampled data are preprocessed prior to fitting and parameter extraction. The first step in preprocessing is to eliminate noise from the data by smoothing. Smoothing is followed by detrending to remove trends, if any, from the measured sequence. The detrending operation gives a more accurate linear model that best describes the relationship between the input and output signals. Based on the observation of the pole coefficients and locations, a figure of merit (FOM) has been introduced for the discrimination between normal and cancer cells from the same tissue.
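The relation between the four physical parameters and the pole description can be sketched in a few lines. The sketch assumes the common complex-exponential convention, with coefficient h = A·exp(jθ) and pole location z = exp((α + j2πf)·T_s), and samples indexed from n = 0; the function names are illustrative only.

```python
import cmath
import math


def to_pole_residue(A, f, alpha, theta, Ts):
    """Map (amplitude, frequency, damping, phase) to (coefficient h, pole z)."""
    h = A * cmath.exp(1j * theta)
    z = cmath.exp((alpha + 2j * math.pi * f) * Ts)
    return h, z


def from_pole_residue(h, z, Ts):
    """Invert the mapping; f is recovered unambiguously only if |f| < 1/(2*Ts)."""
    A = abs(h)
    theta = cmath.phase(h)
    alpha = math.log(abs(z)) / Ts
    f = cmath.phase(z) / (2 * math.pi * Ts)
    return A, f, alpha, theta


def synthesize(coefficients, poles, N):
    """Evaluate y[n] = sum_i h_i * z_i**n for n = 0 .. N-1."""
    return [sum(h * z ** n for h, z in zip(coefficients, poles)) for n in range(N)]


# A decaying component (alpha < 0) gives a pole inside the unit circle.
h, z = to_pole_residue(A=2.0, f=5.0, alpha=-3.0, theta=0.4, Ts=0.01)
print(abs(z) < 1.0, from_pole_residue(h, z, 0.01))
```

Under this convention a negative damping factor always maps to a pole inside the unit circle, which is a convenient sanity check on any fitted model.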

IV. RESULTS AND DISCUSSION
Six types of cells were utilized in this study to evaluate the detection capability of the proposed approach. Normal and cancer cells of lung and liver were used to demonstrate cell identification. Using a hemocytometer, the cell concentration in each suspension was adjusted to $10^7$ cells per mL with 5% mean error. Each type of cell suspension was then loaded in the experimental setup, and the optical transmittance of the cells was measured over the wavelength range of 640-1050 nm, with a wavelength reproducibility of −0.5 to +0.5 nm and a spectral resolution (FWHM) of at most 20 nm, under constant light conditions. The contributions of the medium and holder were then de-embedded by subtracting the suspension response directly from the response of the control filled with medium only. Figure 1 shows the signal intensities varying with wavelength. As the measured signal exhibits transient behavior, a wavelength-modified Prony algorithm can be applied. Figures 1(a) and (b) depict the measured optical responses superimposed with the Prony-estimated signal for the HeLa and 293T cell lines, respectively. The least number of exponentials that gives the best-fitted model is considered the optimum order of the model. The optimum order (p) was found to be 40, which is the minimum order that provides an excellent fit. A higher order (higher p values) results in redundancy and requires further processing resources. The responses were collected using the experimental setup reported in [25]. It is recommended to apply the same order to both the 293T and HeLa cell suspensions for a fair comparison. Parameters such as the amplitude, frequency, phase and damping factor of the exponentials are extracted from the fitted response of each type of cell. Further information about parameter estimation for exponential sums approximated by the Prony method has been detailed by Jun et al. in [27].
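The search for the smallest adequate order can be sketched as follows. This is a minimal least-squares Prony variant written with NumPy under stated assumptions, not the MATLAB implementation adopted in this work: `prony_fit`, `fit_error`, `minimum_order`, and the relative-error threshold `tol` are hypothetical names and values, and the test signal is synthetic.

```python
import numpy as np


def prony_fit(y, p):
    """Least-squares Prony fit of order p; returns (coefficients h, poles z)."""
    y = np.asarray(y, dtype=complex)
    N = len(y)
    # Step 1: linear prediction, y[n] = -(a1*y[n-1] + ... + ap*y[n-p]).
    M = np.column_stack([y[p - k:N - k] for k in range(1, p + 1)])
    a, *_ = np.linalg.lstsq(M, -y[p:], rcond=None)
    # Step 2: the poles are the roots of z^p + a1*z^(p-1) + ... + ap.
    z = np.roots(np.concatenate(([1.0], a)))
    # Step 3: coefficients from the Vandermonde system y[n] = sum_i h_i * z_i**n.
    V = np.vander(z, N, increasing=True).T
    h, *_ = np.linalg.lstsq(V, y, rcond=None)
    return h, z


def fit_error(y, h, z):
    """Relative L2 error between the data and the reconstructed model."""
    y = np.asarray(y, dtype=complex)
    y_hat = np.vander(z, len(y), increasing=True).T @ h
    return np.linalg.norm(y - y_hat) / np.linalg.norm(y)


def minimum_order(y, max_order, tol=1e-6):
    """Return the smallest order whose relative fit error falls below tol."""
    for p in range(1, max_order + 1):
        h, z = prony_fit(y, p)
        if fit_error(y, h, z) < tol:
            return p, h, z
    raise ValueError("no order up to max_order reached the requested tolerance")


# Synthetic example: two real decaying exponentials.
n = np.arange(40)
y = 2.0 * 0.9 ** n + 1.0 * 0.5 ** n
p, h, z = minimum_order(y, max_order=5)
print(p, np.sort(np.abs(z)))
```

For measured, noisy responses a much looser tolerance than the one used for this noise-free example would be needed, and the chosen order would then be applied to every cell type being compared.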
Furthermore, the extraction of the corresponding transient parameters, such as the order of the signal model, the data window length, the sampling interval and the attenuation factor, is explicitly described in [26]. Figures 2(a), (b), (c) and (d) show the plots of amplitude, damping factor, frequency and phase, respectively, obtained for 293T with a fitting order of 40. The measured data were smoothed using the Savitzky-Golay method [28], which increases the data precision without distorting the signal tendency. The extracted parameters are further processed to obtain the corresponding coefficients and pole locations, which were computed using (3) and (4). The extracted coefficients and pole locations for the HeLa and 293T cell suspensions are illustrated in Fig. 3(a) and (b), respectively.
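The preprocessing chain (smoothing followed by detrending) can be sketched with NumPy alone. The window length, polynomial order, edge padding and linear trend model below are assumptions chosen for illustration; the values actually used in this work are not stated here.

```python
import numpy as np


def savgol_smooth(y, window=11, polyorder=3):
    """Savitzky-Golay smoothing: local least-squares polynomial fit per window."""
    if window % 2 == 0 or polyorder >= window:
        raise ValueError("window must be odd and larger than polyorder")
    half = window // 2
    x = np.arange(-half, half + 1)
    # Row 0 of the pseudo-inverse holds the weights that evaluate the fitted
    # polynomial at the window center (x = 0).
    A = np.vander(x, polyorder + 1, increasing=True)
    weights = np.linalg.pinv(A)[0]
    # Repeat the edge samples so the output has the same length as the input.
    padded = np.pad(np.asarray(y, dtype=float), half, mode="edge")
    return np.convolve(padded, weights[::-1], mode="valid")


def detrend_linear(y):
    """Remove the best-fit straight line from the sequence."""
    y = np.asarray(y, dtype=float)
    n = np.arange(len(y))
    slope, intercept = np.polyfit(n, y, 1)
    return y - (slope * n + intercept)


# Hypothetical noisy response: smooth first, then detrend.
rng = np.random.default_rng(0)
n = np.arange(200)
raw = np.exp(-n / 60.0) + 0.002 * n + 0.01 * rng.standard_normal(n.size)
clean = detrend_linear(savgol_smooth(raw))
print(clean.shape)
```

Because the filter fits a low-order polynomial locally rather than averaging, it leaves slowly varying features largely intact, which is the property relied on when smoothing precedes the Prony fit.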
Rodríguez et al. reviewed Prony's method for signal approximation using MATLAB code [29]. They implemented the classical methods to test both performance and the Prony approximation, and presented the complete theoretical basis of Prony's method together with its piece-by-piece implementation in MATLAB. Rodríguez's algorithms and codes are adopted in this work. As illustrated in Fig. 3(a) and (b), the extracted poles are located within the unit circle of the z-plane. The y-axis represents the imaginary part and the x-axis represents the real part. In the z-plane, the coefficients of HeLa are concentrated more closely around the origin than those of 293T. The distributions of the coefficients and pole locations alone are not sufficient for cell identification. Therefore, a figure of merit (FOM) is introduced for better identification accuracy. The FOM is defined as follows:

$$\mathrm{FOM} = \frac{C(p)}{L(p)} \tag{5}$$

where L(p) and C(p) represent the locations and coefficients of the poles, respectively. The computed FOM is then normalized for each type of cell by its corresponding maximum value. Although the Prony algorithm was developed for modeling signals in the time domain, it can be applied to responses obtained in the frequency domain as well [30]. In his paper, Kumaresan extracted the poles directly from the frequency response using a technique analogous to Prony's [30]. Figure 4 shows the extracted FOM for HeLa and 293T cells. The FOM distribution for HeLa is very close to the center of the unit circle. Significant differences in cell composition between normal and cancer cells have been reported. Because of these differences, the interaction of light with the cells alters their optical absorption and transmission responses.
The modifications of the optical response from normal to cancer cells are explained mainly by morphological changes and by modifications of the physiological and biochemical properties that affect the refractive index, which allow the cells to be differentiated from each other. The pole locations and coefficients are affected accordingly. Empirically, cancer cells exhibit a higher transmittance intensity than normal cells from the same tissue type.
The FOM is inversely proportional to L(p); therefore, pole locations of larger magnitude yield lower FOM values. Moreover, the complex poles are defined as σ ± jω, where σ is the damping coefficient and ω is the resonant pulsation. The damping and resonant pulsation are higher in cancer cells than in normal cells. Therefore, the FOM is smaller for cancer cells than for normal cells.
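The FOM computation can be sketched as below. The sketch assumes that the FOM of each pole is the ratio of its coefficient C(p) to its location L(p), consistent with the inverse proportionality stated above, and that the normalization divides by the largest FOM magnitude of the same cell type; both the function name and the numerical values are hypothetical.

```python
def figure_of_merit(coefficients, locations):
    """Per-pole FOM, assumed to be C(p) / L(p), normalized by its largest magnitude."""
    if len(coefficients) != len(locations) or not coefficients:
        raise ValueError("need one coefficient per pole location")
    # A pole located exactly at the origin would raise ZeroDivisionError here.
    fom = [complex(c) / complex(l) for c, l in zip(coefficients, locations)]
    peak = max(abs(v) for v in fom)
    return [v / peak for v in fom] if peak > 0 else fom


# Hypothetical coefficients and pole locations for one cell type.
C = [0.20 + 0.10j, 0.05 + 0.00j, -0.10 + 0.02j]
L = [0.50 + 0.00j, 0.00 + 0.50j, 0.30 - 0.40j]
for value in figure_of_merit(C, L):
    print(value)
```

With this normalization every FOM value lies on or inside the unit circle, so the FOM distributions of different cell types can be compared on the same z-plane plot.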
Based on these results, it is evident that the coefficients and pole locations vary with cell composition and morphology. The main difference between normal and cancer cells of the same tissue is in terms of composition and morphology. Hence, the proposed FOM is a distinctive parameter that can be used to explore the detection and identification of normal and cancer cells. This is possible when the technique is used only for fitting the frequency-domain response to a sum of damped exponentials and for parameter extraction. The objective here is to make inferences from the obtained parameters and to use them for further processing. These frequency-domain measurements cannot be utilized for the generation of a representative equivalent circuit. Hence, this work claims the validity of using the Prony technique to model a frequency-domain signal, as the extracted parameters are used for making inferences for cell identification. It is worth mentioning that the focus of this work is to classify normal and cancerous cells from the same tissue. Therefore, the FOMs for the normal and cancerous lung and liver suspensions were extracted per the introduced procedure and are depicted in Fig. 5. The distribution of the FOM of cancer cells is closer to the origin of the z-plane than that of the normal cells. Each plotted point represents the average of 15 measurements. The multiple measurements were conducted on different aliquots taken from the same sample suspension, at the same spot. The error bars in the subfigures of Fig. 5 represent the average values along with the maximum and minimum values. The bar along the x-axis is centered on the average of the FOM real part, and its endpoints mark the maximum and minimum values. The bar along the y-axis is centered on the average of the FOM imaginary part, and its endpoints mark the maximum and minimum values.
VOLUME 8, 2020
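The averaging behind each plotted point can be sketched as follows, assuming that the repeated measurements of one FOM point are available as a list of complex numbers; the function name and the sample values are hypothetical.

```python
def summarize_repeats(fom_values):
    """Return (mean, min, max) of the real and imaginary parts of repeated FOM readings."""
    if not fom_values:
        raise ValueError("at least one measurement is required")
    real = [v.real for v in fom_values]
    imag = [v.imag for v in fom_values]
    return {
        "real": (sum(real) / len(real), min(real), max(real)),  # horizontal error bar
        "imag": (sum(imag) / len(imag), min(imag), max(imag)),  # vertical error bar
    }


# Three hypothetical repeats of the same FOM point (15 were used in this work).
print(summarize_repeats([0.10 + 0.20j, 0.30 - 0.20j, 0.20 + 0.30j]))
```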
For further investigation, the FOM distributions of normal and cancer cells have been superimposed, as depicted in Fig. 6. Figure 6(a) superimposes the FOMs of the normal and cancerous lung cells, and Fig. 6(b) superimposes the FOMs of the normal and cancerous liver cell lines. Most of the FOMs corresponding to cancer cells have a positive real part (located in the right-hand side of the z-plane), whereas most of the FOMs corresponding to normal cells have a negative real part (located in the left-hand side of the plot).
The figure of merit (FOM), introduced here for the first time, relates the locations of the poles, L(p), to the coefficients of the poles, C(p). As noted above, significant differences in cell composition between normal and cancer cells have been reported [25]; these differences, together with the morphological, physiological and biochemical changes that affect the refractive index, alter the optical absorption and transmission responses and, in turn, the pole locations and coefficients. Therefore, it is suggested that, within the range −0.5 to +0.5 in the z-plane, if 85% of the FOM values have a negative real part (located in the left-hand side), the cell line under study is considered normal; otherwise, it is considered cancerous. This gives a clear discrimination strategy: optical measurements on the different in vitro normal and cancer cell line models, processed with the developed Prony-based procedure, achieve label-free discrimination between cancer and healthy cells from the same tissue type.
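The suggested rule can be written as a short decision function. Two readings of the rule are assumptions here: the range −0.5 to +0.5 is applied to both the real and the imaginary part of each FOM value, and "85%" is taken to mean at least 85% of the values inside that range. The function name and the example values are hypothetical.

```python
def classify(fom_values, limit=0.5, normal_fraction=0.85):
    """Label a cell line 'normal' or 'cancer' from its normalized FOM values."""
    # Keep only the FOM values inside the square [-limit, +limit] on both axes.
    inside = [v for v in fom_values
              if abs(v.real) <= limit and abs(v.imag) <= limit]
    if not inside:
        raise ValueError("no FOM values fall inside the evaluation range")
    negative = sum(1 for v in inside if v.real < 0)
    return "normal" if negative / len(inside) >= normal_fraction else "cancer"


# Hypothetical FOM sets: mostly left half-plane versus mostly right half-plane.
left_heavy = [complex(-0.1 * k, 0.05) for k in range(1, 5)] * 2 + [-0.2 + 0.1j, 0.1 + 0.0j]
right_heavy = [complex(0.1 * k, -0.05) for k in range(1, 5)] + [-0.1 + 0.2j]
print(classify(left_heavy), classify(right_heavy))
```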

V. CONCLUSION
In summary, this work addressed the classification and discrimination of normal and cancer cells from the same tissues. A label-free method combining Prony estimation theory and optical transmittance measurements was introduced and shown to be effective. The proposed approach was examined using six different cell lines. The measured optical responses of the six types of cells were reconstructed using the Prony algorithm with the same fitting order of 40. Based on the observations, a normalized figure of merit was introduced for identification. According to this figure of merit, the FOMs of the cancer cell lines were distributed closer to the center of the unit circle than those of the normal cell lines from the same tissues (in the case of lung and liver cells). These findings can be considered the foundation stage for cell identification using optical measurements combined with Prony estimation theory.