Reconstruction and Localization of Tumors in Breast Optical Imaging via Convolution Neural Network based on Batch Normalization Layers

Near-infrared (NIR) diffuse optical tomography (DOT) aims to reconstruct images of the optical properties of tissue in order to identify and localize breast tumors. This study develops an efficient and fast DOT reconstruction method based on deep learning (DL). DL has already been applied to DOT, but with limitations such as a small dataset and the reconstruction of absorption coefficients only. We address the limited-dataset problem by generating a large dataset with multiple phantoms and inclusions at various positions. Moreover, a single deep neural network (DNN) model is designed to reconstruct both absorption and scattering coefficients. The proposed DNN models are evaluated on a phantom experimental dataset, where they outperform the Tikhonov regularization (TR) method and other artificial neural networks (ANNs). Furthermore, contrast-and-size detail (CSD) analysis shows that the DNN model with batch normalization layers achieves better spatial resolution than DNN models without batch normalization layers.


I. INTRODUCTION
Optical imaging has become one of the primary preclinical and clinical imaging modalities due to recent developments in photonics and molecular probes. In optical imaging, light diffusion is a fundamental restriction that limits the spatial resolution of deep-tissue imaging. A forward model and an inverse model are used in DOT to produce the tomographic image. It is essential to understand the optical properties of biological tissues to develop DOT models. There are multiple benefits of optical imaging compared to other imaging techniques, such as its non-invasive nature, relatively low cost, and portability (which allows repeatable imaging procedures under differing patient conditions).
Despite significant advances in recent decades, including new technologies, improved temporal and spatial resolution, reduced cost, and broader application, many improvements can still be made, such as more accurate reconstruction and localization and reduced acquisition time and patient discomfort, while increasing clinical throughput and accuracy. Tumors have higher levels of vascularization than the surrounding tissue, which results in different light absorption characteristics; in addition, relative Hb/HbO concentrations can distinguish tumors from background tissue and discriminate between cancers with various rates of activity. Furthermore, the optical properties depend primarily on the type and concentration of hemoglobin in the tissue. The essential optical parameters are the absorption coefficient, the scattering coefficient, and the mean cosine of the scattering phase function; the latter two combine into the reduced scattering coefficient. Continuous-wave (CW), frequency-domain (FD), and time-domain (TD) systems are the three types of systems used in DOT.
Because the inverse problem in DOT is ill-posed and ill-conditioned, the amount of information that current methods can recover from a sample is severely limited. These methods also yield low resolution and are sensitive to noise. Thus, an alternative method is needed to provide faster and more accurate diagnosis when DOT is used to image human breast tissue. In this work, we discuss several deep learning networks for solving inverse problems in DOT. Deep learning has been actively investigated for applications including image classification, computer vision, and regression analysis, where it has achieved state-of-the-art accuracy. The combination of deep learning and DOT, however, is still at an early stage of development: most implementations have used basic tools and simplified scenarios. DL has attracted extensive attention and has had a profound impact on nearly every facet of modern life. Deep CNNs can learn complicated patterns or objects in large data repositories as well as or better than humans, and DL methods have recently been applied to regression tasks and inverse problems.

A) CLASSICAL APPROACHES
For the last two decades, NIR light has been extensively investigated and applied for optical breast imaging [1]. A classical method starts from the 'forward problem', that is, predicting the measured fluence at the detectors given a geometric model of the optical parameters. The background parameters, as well as the positions and characteristics of the sources and detectors, are necessary for studying the propagation and inverse problems of light in diffuse tissue. In addition to regularization, optimization, statistical modeling, and parametric representations, signal processing techniques are employed. Regularization techniques stabilize the inversion of the forward model by mitigating the ill-conditioning caused by the ill-posedness of the inverse problem. Optical imaging is also much less expensive than conventional imaging techniques such as magnetic resonance imaging.
Due to optical scattering, high-resolution modalities using low-energy photons are limited in depth to just a few millimeters, resulting in rapid degeneration of the image quality. Reconstruction of the optical properties of DOT images usually suffers from low spatial resolution owing to the diffusive nature of NIR light in tissues. DOT is a functional technique that estimates the intrinsic biophysical composition of tissues including the concentration of total hemoglobin, oxyhemoglobin, water, and lipids [2]. It is non-invasive, has deep penetration, and does not cause harm to patients during screening in comparison with other classical imaging techniques such as X-ray mammography and ultrasound imaging. It reconstructs tomographic 2-D/3-D images of absorption and scattering coefficients. Diffuse optical imaging (DOI) involves utilizing NIR light between 650nm and 1000nm to image biological tissue in the diffusive regime, assessing the difference between tumor and normal breast tissue blood oxygenation [3].
DOT describes a process of reconstructing a spatial map of optical absorption coefficients (chromophore concentrations related to absorption) and scattering coefficients from fluence measurements, using an analytical model for describing photon propagation [1]. The quality of the results also depends on how the image is acquired and how it is reconstructed [4]. In the nonlinear approach, the inverse problem is regarded as the optimization of an objective function representing the sum-squared difference between the measurements and the diffusion model, plus additional regularization terms representing prior knowledge such as smoothness, sparsity, and total variation [5]. Although some analytical approaches do not need the inversion of the sensitivity matrix [6], their convergence is slow. When the three parameters (refractive index, absorption coefficient, and diffusion coefficient) are all unknown, no unique solution exists to the inverse problem (reconstructing optical coefficients from a forward model), which degrades the image reconstruction [7]. Important parameters for phantom design, i.e., region of interest (ROI), size of inclusion, off-center distance, diameter, contrast, and background, are shown in Figure 1.

B) DEEP LEARNING APPROACHES
A deep learning approach has an advantage over classical methods in that it can learn features directly from raw input data. Furthermore, it can be used to build more sophisticated and more accurate models from complex datasets, which cannot be achieved by a basic regression function alone [8]. DL methods also enhance the images via (a) noise and artifact reduction that results from learning prior information [9] (this is critical for DOT systems with a low number of measurements, where regularization methods do not perform well) and (b) accurate image reconstruction to better recover the optical properties of tissue [10]. DL techniques have been used to tackle the inverse scattering problem [11]. DL has drawn researchers' attention extensively because it can learn complicated patterns in huge datasets of any sort [12]. In addition, it has been applied to medical image reconstruction [13], where it has demonstrated encouraging results.
Deep learning algorithms have been shown to produce better results than conventional inverse-problem solvers in data processing, image segmentation, and image reconstruction. Research in this area is active because high-resolution imaging of mammalian tissues is needed to identify anomalies, with the goal of developing a robust, affordable, and non-invasive system. The DL approach proposed in [14] has been used to recover the absorption coefficient inside a phantom made from propylene with a vertically oriented cylindrical cavity, where the cavity is filled with different acetyl inclusions that have different optical properties.
The DOT breast imaging community has attempted to reconstruct human tissues, but most of the initial efforts focused on improving localization to obtain absorption profiles of the object under study. In recent years, it has become increasingly important to develop mathematical and computational methods for the simultaneous reconstruction of several different quantitative optical features. Yedder et al. [15] applied a DL algorithm to detect abnormalities inside a phantom with 240 measurements and a limited-angle view. Sabir et al. [16] applied convolutional neural networks (CNNs) to experimentally estimate the bulk optical properties of a breast phantom. Feng et al. [17] and Jalalimanesh et al. [18] used single dense layers and convolution layers to reconstruct absorption coefficients, treating the scattering coefficients as constant, for single-phantom measurements.
When light traverses centimeters of tissue, it experiences significant scattering. As a result of this scattering, tissues are difficult to image in terms of structure and function; the light that is re-emerging from the tissue has traveled a complex path and any localizations of absorption or scattering or other optical parameters can be lost. Based on the measured signals, DOT reconstructs the medium's optical properties and the spatial distribution of those properties. Researchers working in the area of medical image analysis have studied the benefits of using DL algorithms in different ways to reconstruct those properties. There have been several studies that have explored how computer vision can be used to analyze the outcomes of medical examinations, such as blood tests, X-rays, ultrasounds, and magnetic resonance imaging (MRI) [19]. In the DL, different tasks are performed, including detecting tumors, localization of tumors, improving spatial resolution, classifying cells, and detecting various diseases. Since the advent of efficient computational infrastructures such as graphics processing units (GPUs) and cloud computing systems, DL has become increasingly important in various fields [3], including image reconstruction in medical diagnosis.
The inverse scattering problem has been addressed with machine learning approaches in a few preliminary works. Kamilov et al. [11] proposed a beam propagation method in which the unknown photon flux is computed using a backward-propagation algorithm, in order to compute a non-linear inverse scattering solution for optical diffraction tomography. Broek and Koch [20] developed an earlier version of the beam propagation method, using a neural network to simulate the dynamical scattering of fast electrons. Evidence has accumulated over the last few years [16,21] that convolutional neural networks (CNNs) can estimate bulk optical properties better and increase imaging speed compared to other existing methods, which suggests the potential of using CNNs in conjunction with optical tomography. In another article [22], methods for propagating light from angled sources in compressed breast tissues characterized by subsurface inhomogeneities were combined with novel inverse problems and deep learning methods. To detect and reconstruct test objects, a deep learning algorithm based on U-Net was used.

C) CONTRIBUTION
The contribution of this paper is four-fold:
• We design two non-identical deep neural networks based on 1D and 2D convolution layers, each followed by batch normalization (BN). Our experiments show that BN has a positive effect on the localization and reconstruction of optical properties.
• To the best of our knowledge, a deep learning network based on batch-normalized convolutional neural network (BNCNN) layers and 1D CNNs has not previously been used in medical image reconstruction, and specifically not in DOT inverse problems.
• Absorption and scattering coefficients are recovered simultaneously by a single model trained on FD data.
• Since most CNN models take image-domain input, we discuss the key challenges of working with sensor-domain raw data instead, as potentially promising directions for further study.

D) ORGANIZATION OF THE PAPER
A complete schematic diagram of the presented work is shown in Figure 2. We first give a brief overview of the raw data (radiance data), followed by data collection (simulation and experimental) and the essential imaging parameters. We then summarize the preprocessing of the network input, either signal or image, and introduce two neural networks according to the type of input data. Finally, we explain how the proposed networks reconstruct the optical properties, i.e., the absorption and scattering coefficients.

III. DATA PREPARATION
This study includes (a) numerical modeling for creating a training dataset and (b) experimental verification for the test.
The former aims to evaluate how well the proposed networks learn from the datasets to recover the absorption and scattering coefficients of defects inside a biological phantom. Computational propagation models are essential for understanding the physics behind DOT and for simulating DOT measurements, and they are also needed to determine the fluence rate corresponding to a particular reconstruction. In practice, collecting data from the forward computation (fluence rate) is the first step in DOT image reconstruction, and the inverse model follows it. Afterward, the collected data are analyzed to uncover useful information and gain a deeper understanding of the data. The latter is conducted to verify the reconstruction, localization, and accuracy of the optical properties obtained by DL. An example of a 2-D circular phantom with an inclusion, discretized by a triangular mesh and showing the light source and detector positions, is given in Figure 5. Reconstructing an image with DNNs requires a large dataset of pairs (X, Y), where X denotes the radiance data and Y denotes the optical properties to be predicted. X is obtained by forward computation based on the finite element method. The radiance data (X) contain two kinds of data in different domains, and for human tissue the reduced scattering coefficients are much higher than the absorption coefficients. In the current study, we employed an in-house coded program [23,24], based on the TR method, to generate a dataset by simulating photon propagation inside circular phantoms with diameters ranging from 60 to 150 mm; the absorption and scattering coefficients of the phantoms are in the ranges 0.005 to 0.03 mm⁻¹ and 0.5 to 3 mm⁻¹, respectively. Each phantom includes one or two inclusions of varying radii. The radius of the inclusions/tumors is randomly selected from the range of 2.0 to 30 mm.
The phantoms are discretized into 8192 triangular elements with 4425 nodes in total for solving the forward model; a small example of a triangular mesh, with 19 nodes and 24 elements, is depicted in Figure 4. Our dataset is sufficiently rich, with over 90 different phantoms ranging in size from 60 to 150 mm and with frequencies from 10 to 100 MHz. We also include homogeneous samples in the dataset so that the trained network is capable of detecting the absence of tumors. Table 1 describes the specifications of the numerical dataset preparation, where T, V, Te, homo, si, di, s-d, partition, $\mu_a$, and $\mu_s'$ denote training, validation, testing, homogeneous, single inclusion, two inclusions, source-detector positions around the phantom, partition based on the number of inclusions, absorption coefficient, and scattering coefficient, respectively. The unstructured finite element mesh is mapped to a regular pixel grid of 2×64×64 to reconstruct the absorption and scattering coefficients. Sixteen measurement points are assumed to be evenly spaced around the circumference of the phantom, and for each of the 16 excitation positions the signal is recorded at 15 detector positions, resulting in a total of 16×15 = 240 amplitude measurements and 16×15 = 240 phase measurements. We generate 20,000 samples in total, of which 15,000, 4,000, and 1,000 are used for the training, validation, and testing sets, respectively. To account for experimental errors and to mimic experimental boundary measurements, the simulation test set is synthesized by adding 40% additive noise to the network input.
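The measurement bookkeeping described above can be sketched in a few lines. This is only an illustration, not the in-house generator of [23,24]: the function names are hypothetical, the 40 mm radius is an arbitrary example value, and the sketch assumes that the 15 detectors for each source are the remaining 15 of the 16 boundary positions.

```python
import math

def fiber_positions(n_fibers=16, radius_mm=40.0):
    """Evenly spaced (x, y) positions on the phantom boundary (radius is an example value)."""
    return [
        (radius_mm * math.cos(2 * math.pi * i / n_fibers),
         radius_mm * math.sin(2 * math.pi * i / n_fibers))
        for i in range(n_fibers)
    ]

def source_detector_pairs(n_fibers=16):
    """All (source, detector) index pairs, assuming the active source is never its own detector."""
    return [(s, d) for s in range(n_fibers) for d in range(n_fibers) if d != s]

def split_sizes(total=20000, n_train=15000, n_val=4000):
    """Training / validation / testing sample counts as stated in the text."""
    return n_train, n_val, total - n_train - n_val

pairs = source_detector_pairs()
n_inputs = 2 * len(pairs)  # amplitude and phase for every pair
print(len(pairs), n_inputs, split_sizes())
```

Each sample therefore carries 240 amplitude and 240 phase values, i.e. 480 network inputs.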

B. EXPERIMENTAL SETUP FOR PHANTOM DATASET (TEST DATASET)
An experimental test dataset (phantom) was assembled from measurement data obtained previously in phantom tests at the laboratory. Twelve samples were chosen, each measured with 16 source locations and 15 detectors (Figure 6). Because tissue is a turbid medium with significant scattering, the light follows a highly complicated path, and as a result of the high scattering level only a few photons are measured. However, the first arriving photons are assumed to have followed a direct path (few scattering events) through the tissue. The experimental input data must be calibrated before they are used to test the trained deep learning model. The calibrated data were computed accordingly, provided that both homogeneous and inhomogeneous data were available for the same phantom case.
Cylindrical (Figure 7) or semi-ellipsoid (Figure 3) phantoms provide an excellent way to mimic the tumor/background contrast in the breast. Cylindrical and semi-ellipsoid tissue-mimicking phantoms with varied inclusions were prepared using silicone as the matrix, mixed with carbon and titanium dioxide powders to adjust the absorption and scattering properties. Further, a calibration phantom was designed and employed with the properties $\mu_a$ = 0.006 mm⁻¹ and $\mu_s'$ = 0.6 mm⁻¹ to simulate human breast tissue. The input data for each sample consist of 2×16×15 floating-point values (two measurement types, normalized log amplitude and normalized phase, for 16 source positions and 15 detector positions). The output is a 64×64 grid of absorption and scattering coefficients. Table 2 lists the parameters of the test-dataset samples that are discussed later in this paper.

IV. METHODOLOGY
We reconstruct optical properties using two non-identical DNNs based on 1D and 2D convolution layers, in which each layer is followed by BN [25]. Absorption and scattering coefficients are recovered simultaneously by a single model trained on FD input data. Since most CNN models take image-domain input, we also discuss key challenges for further study in working with sensor-domain raw data (fluence rate/radiance data/boundary measurements) instead.
In the rest of this section, we focus on iterative and non-iterative computational models for the propagation of light in diffusive tissue, which are useful for DOT imaging, particularly frequency-domain imaging, where the absorption and scattering coefficients are the parameters of interest.

A. ITERATIVE RECONSTRUCTION
Most classical image reconstruction methods are iterative. Because the photon diffusion equation is non-linear, reconstruction generally requires a two-step procedure consisting of a forward step and an inverse step. In general, iterative reconstruction minimizes a cost function with two terms, where the first term measures the fit to the data (radiance) and the second is a regularization term based on prior information. DOT generates a map of spatial optical absorption and scattering coefficients from fluence measurements, using a forward model that describes photon propagation. Several image reconstruction methods have been developed, depending on the light propagation model, measurements, geometry, and optimization scheme [26,27]. Iterative reconstruction methods require accurate modeling of the physics, sensors, and noise statistics of the imaging system, which increases cost and time.
In DOT, the inverse problem requires the optical property distribution $\mu(r)$ to be calculated from the measurement data $\Phi(r)$ and the light source information. The process starts by assigning an initial guess $\mu(r)$, then solving the forward problem to obtain the corresponding computed $\Phi(r)$, and then comparing it with the measured $\Phi(r)$ against some stopping criteria. If those criteria are met, the computation stops and $\mu(r)$ is returned as the result. If they are not, $\mu(r)$ is updated and the process is repeated until the criteria are met. The forward problem can be solved either analytically or numerically. The inverse problem is solved by iteratively updating $\mu(r)$, and this update depends on the comparison between the $\Phi(r)$ computed from the forward solution and the measured $\Phi(r)$. Figure 8 illustrates a complete schematic diagram of the iterative reconstruction process. Generally, the comparison is described by the data-model misfit, which is minimized with respect to the unknown $\mu(r)$ until convergence is reached. This minimization problem can be formulated with either a linear or a non-linear method. An iterative reconstruction method, namely TR, is used to create the training dataset.
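The guess, forward-solve, compare, and update loop described above can be illustrated with a deliberately small toy problem. The sketch below is not the FEM-based solver used in this work: it assumes a hypothetical one-parameter forward model, exp(-mu·d), in place of the diffusion equation, and uses a Tikhonov-damped Gauss-Newton update with an arbitrary regularization value `lam`. All function names and numbers are illustrative.

```python
import math

def forward(mu, distances):
    """Toy forward model (assumption): exponential attenuation over each path length."""
    return [math.exp(-mu * d) for d in distances]

def jacobian(mu, distances):
    """Derivative of the toy forward model with respect to mu."""
    return [-d * math.exp(-mu * d) for d in distances]

def reconstruct(measured, distances, mu0=0.01, lam=1.0, tol=1e-20, max_iter=100):
    """Iteratively update mu until the data-model misfit falls below tol."""
    mu = mu0
    for _ in range(max_iter):
        predicted = forward(mu, distances)                 # forward solve
        residual = [m - p for m, p in zip(measured, predicted)]
        misfit = sum(r * r for r in residual)              # data-model misfit
        if misfit < tol:                                   # stopping criterion
            break
        j = jacobian(mu, distances)
        jtj = sum(x * x for x in j)
        jtr = sum(x * r for x, r in zip(j, residual))
        mu += jtr / (jtj + lam)                            # Tikhonov-damped update
    return mu

distances = [10.0, 20.0, 30.0, 40.0]      # mm, example values
measured = forward(0.02, distances)       # synthetic "measurement" with mu = 0.02
estimate = reconstruct(measured, distances)
print(estimate)
```

In a real DOT solver the scalar `mu` becomes a vector of nodal optical properties and the division becomes a regularized linear solve, which is what makes each iteration expensive.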

1) TIKHONOV METHOD
Among the multiple models that describe the propagation of heavily scattered NIR light in DOT, the diffusion approximation equation is widely used. The Tikhonov method is one of the most important classical methods for reconstructing optical coefficients in DOT. With the diffusion approximation, the relationship between the photon density $\Phi(r)$ and the optical property distribution $\mu(r)$ can be derived from Eq. (1). Biological tissues are very likely to scatter photons in the forward direction, but because of the diversity of scattering events, the scattering direction can also be random. Recovering $\mu(r)$ from measurements of $\Phi(r)$ belongs to the class of problems referred to as inverse problems. The behavior of interactions among a large number of photons in turbid media can be described by the radiative transport equation (RTE), also called the Boltzmann radiative transport equation [28]:

$$\frac{1}{v}\frac{\partial L(r,\Omega,t)}{\partial t} + \Omega\cdot\nabla L(r,\Omega,t) + \mu_t L(r,\Omega,t) = \mu_s' \int_{4\pi} L(r,\Omega',t)\, f(\Omega,\Omega')\, d\Omega' + Q(r,\Omega,t) \tag{1}$$

where $v$ is the speed of photons in the medium, and $L(r,\Omega,t)$ is the radiance (power per unit area and unit solid angle) as a function of position $r$, in the direction $\Omega$ at time $t$. $\mu_t = \mu_s' + \mu_a$ is the optical transport coefficient. Here $f(\Omega,\Omega')$ is the scattering phase function, and $Q(r,\Omega,t)$ is the radiant source function. The left-hand side of Eq. (1) accounts for photons leaving a small element in phase space, and the right-hand side accounts for photons entering it. The radiative transport equation can be simplified based on diffusion theory if the scattering probability is much larger than that of absorption, i.e., $\mu_s' \gg \mu_a$, with an isotropic source.
The complete RTE accurately describes light propagation through tissue, but an analytical solution is available only for a limited number of scenarios. Consequently, an approximation is applied, the diffusion approximation, which is an expansion of the RTE in first-order spherical harmonics. For this approximation, the reduced scattering coefficient is assumed to be much larger than the absorption coefficient. The radiance is then expressed as a weighted sum of the photon fluence rate, which is the integral of the radiance over solid angle, and the current density, which is defined as the net flow of energy per unit area per unit time. After several mathematical manipulations, it is possible to simplify and rewrite the RTE in diffusion form. The diffusion approximation to the transport equation can also represent NIR photon movement in highly scattering media, such as living tissue. Diffusion model data are then derived from the frequency-domain calculation, i.e.,

$$-\nabla\cdot D(r)\,\nabla\Phi(r,\omega) + \left(\mu_a(r) + \frac{i\omega}{v}\right)\Phi(r,\omega) = Q_0(r,\omega)$$

In this case, the isotropic source $Q_0$ at position $r$ delivers light through the turbid medium at source frequency $\omega$. Additionally, $\Phi(r,\omega)$ represents the fluence rate at position $r$ observed at frequency $\omega$, $v$ is the speed of photons in the medium, $\mu_a(r)$ is the optical absorption coefficient, and $D(r)$ denotes the optical diffusion coefficient, defined as $D(r) = 1/[3(\mu_s'(r) + \mu_a(r))]$. A diffusion-based approximation can be used if the scattering probability is much greater than the absorption probability within the medium; that is, the reduced scattering coefficient must be large compared to the absorption coefficient. The reduced scattering coefficient is the equivalent scattering coefficient of a medium in which scattering is uniformly (isotropically) random. This work focuses on the FD system, which uses a modulated laser source (a few MHz to 1 GHz) to irradiate the tissue and measures the amplitude and phase of the diffusing waves. Using the additional information provided by the phase, it is possible to measure both the absorption coefficient and the scattering coefficient simultaneously. By measuring the phase shift (delay) and amplitude decay of the detected signal relative to the incident one, information about the absorption and scattering properties of tissue can be obtained in FD. The use of NIR light for DOT has several disadvantages, the most important of which is the complexity and iterative nature of the inverse problems involved in reconstructing the tomographic image from the obtained data. The section below discusses how inverse problems can be solved efficiently using non-iterative DL networks.
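To make the FD quantities concrete, the following sketch evaluates the amplitude decay and phase delay of a diffuse photon-density wave. It is a minimal illustration under stated assumptions, not the FEM forward model used in this work: it assumes an infinite homogeneous medium, the sign convention $-\nabla\cdot D\nabla\Phi + (\mu_a + i\omega/v)\Phi = Q_0$, a unit point source, and a hypothetical refractive index of 1.4. The optical values in the usage lines are examples only.

```python
import cmath
import math

C_VACUUM_MM_PER_S = 2.99792458e11  # speed of light in vacuum, in mm/s

def diffusion_coefficient(mu_a, mu_s_prime):
    """D = 1 / [3 (mu_s' + mu_a)], in mm when the coefficients are in mm^-1."""
    return 1.0 / (3.0 * (mu_s_prime + mu_a))

def fd_fluence(r_mm, mu_a, mu_s_prime, freq_hz, n=1.4):
    """Complex fluence at distance r from a unit point source (infinite-medium assumption)."""
    v = C_VACUUM_MM_PER_S / n                  # speed of photons in the medium
    d = diffusion_coefficient(mu_a, mu_s_prime)
    omega = 2.0 * math.pi * freq_hz
    k = cmath.sqrt((mu_a + 1j * omega / v) / d)  # complex wavenumber, Re(k) >= 0
    return cmath.exp(-k * r_mm) / (4.0 * math.pi * d * r_mm)

for r in (20.0, 40.0):
    phi = fd_fluence(r, mu_a=0.01, mu_s_prime=1.0, freq_hz=100e6)
    print(r, math.log(abs(phi)), cmath.phase(phi))  # log amplitude, phase in radians
```

The log amplitude falls and the phase lag grows with source-detector distance, and both depend on $\mu_a$ and $\mu_s'$, which is why amplitude and phase together allow the two coefficients to be separated.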

1) BPNN
In previous studies, researchers developed a non-iterative reconstruction method based on back-propagation neural networks (BPNN) to reconstruct DOT images [17]. The BPNN was trained with 22000 samples of a circular phantom with a diameter of 80 mm, an absorption coefficient of 0.01 mm⁻¹, and a reduced scattering coefficient of 1.0 mm⁻¹, with 16 sources and 16 detectors uniformly arranged around its circumference. Inclusion diameters of 6, 8, and 10 mm were used for the single-inclusion case, while a diameter of 8 mm was used for the two-inclusion case. A constant reduced scattering coefficient was assumed for the inclusions and the background.
BPNN is limited to the dataset of one particular phantom design and is not applicable to a more complex dataset containing more than one phantom. Keeping the scattering coefficients constant also makes it simpler to reconstruct images with back-propagation neural networks. In human tissue, the scattering coefficients are much higher than the absorption coefficients ($\mu_s' \gg \mu_a$), so the two properties lie in completely different ranges. Because neural networks learn from data, this makes it challenging to train a single network to recover both properties simultaneously when the input data are complex.

C. PROPOSED DOT RECONSTRUCTION FRAMEWORK
A significant challenge in developing non-iterative DNN models is that the distribution of the inputs to each layer changes during training as the parameters of the previous layers change, which complicates the process. Consequently, training is slower because lower learning rates are required, and the parameter values must be initialized carefully, which makes it challenging to train non-linear models. In this section, deep convolutional neural network models based on 1D and 2D convolution layers are described. The code used for training the neural network is available at https://github.com/Nazish-Murad/1D-2D-BNCNN
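The role of batch normalization in countering this shift in layer-input distributions can be sketched as follows. This is a minimal, framework-free illustration of the training-time BN transform of [25]; it is not taken from the repository above, and the function name, the example batch, and the default `gamma`, `beta`, and `eps` values are assumptions.

```python
import math

def batch_norm(batch, gamma=None, beta=None, eps=1e-5):
    """Normalize each feature over the batch, then apply a learned scale and shift."""
    n = len(batch)
    n_features = len(batch[0])
    gamma = gamma or [1.0] * n_features   # learned scale (identity by default)
    beta = beta or [0.0] * n_features     # learned shift (zero by default)
    means = [sum(row[j] for row in batch) / n for j in range(n_features)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n
        for j in range(n_features)
    ]
    return [
        [
            gamma[j] * (row[j] - means[j]) / math.sqrt(variances[j] + eps) + beta[j]
            for j in range(n_features)
        ]
        for row in batch
    ]

# Two features on very different scales, e.g. a log-amplitude-like and a phase-like value.
batch = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
normalized = batch_norm(batch)
print(normalized)
```

After normalization, both features have approximately zero mean and unit variance over the batch, regardless of their original scale, so later layers see inputs with a stable distribution.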

1) 1D DATA IMAGE RECONSTRUCTION
In DOT problems, the 1D data sequences (amplitude and phase) are collected from the forward model for image reconstruction. Converting the 1D samples to a 2D format is time-consuming. In addition, the original measured signals may be corrupted by fiber attenuation, photodetector saturation, and fiber-subject contact problems, which create wrong features in the 2D maps. Moreover, most simulation examples currently used in DOT imaging are simple ones that do not exercise the generalization capabilities of network models. A 1D-CNN, which is well suited to analyzing sequential sensor data and generalizes well, is therefore appropriate here. To reconstruct DOT images, we propose a 1D-CNN model built from convolutional layers, a pooling layer, BN layers, and fully connected layers. The 1D measurement series of DOT can be fed directly into the 1D-CNN without conversion to a 2D map, which preserves the correlation among the original signals so that critical features can be extracted for image reconstruction. As the collected signals are subject to noise interference, the pooling operation of the CNN preserves the characteristic features and removes redundancy. The input consists of 240 (16×15) values each for phase and amplitude, and the output consists of 4096 (64×64) values each for the absorption and scattering coefficients. The convolutional, BN, and fully connected layers establish the relationship between inputs and outputs. In the convolutional layers, padding = "same" and stride = 1 are adopted. Seven convolutional layers are employed, with kernel sizes of 1×480×64 for the first six layers and 1×480×8 for the final layer; the numbers of kernels are 64 and 8, respectively. The network is illustrated in Figure 10, and the details are given in Table 3.
Considering the boundary measurements, the network input size is 2×240, and considering the unknown absorption and scattering coefficients, the network output size is 2×64×64. With 1-D convolutions, we extract features and their strength from the 1-D data (boundary measurements). Convolutional layers are more memory-efficient than fully connected ones because fewer parameters need to be learned; therefore, the model is less likely to overfit the data (Figure 9). In the numerical study, the network was trained for 50 epochs at a learning rate of 0.0001 with a batch size of 32. Training takes approximately five seconds per epoch (5 × 50 = 250 seconds for full training).
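The effect of "same" padding with a stride of 1 can be checked with a small single-channel 1-D convolution. This is only a sketch of the operation itself, written without a deep learning framework; it follows the common convention of zero padding with any extra pad placed on the right, and the kernel values are arbitrary rather than learned weights from the proposed network.

```python
def conv1d_same(signal, kernel):
    """Single-channel 1-D convolution (cross-correlation) with zero 'same' padding, stride 1."""
    k = len(kernel)
    pad_total = k - 1
    pad_left = pad_total // 2
    pad_right = pad_total - pad_left
    padded = [0.0] * pad_left + list(signal) + [0.0] * pad_right
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]

measurements = [float(i % 15) for i in range(240)]   # placeholder for 240 boundary values
features = conv1d_same(measurements, [0.25, 0.5, 0.25])

n_inputs = 2 * 240          # amplitude and phase
n_outputs = 2 * 64 * 64     # absorption and scattering maps
print(len(features), n_inputs, n_outputs)
```

Because the output length equals the input length, several such layers can be stacked without shrinking the 240-sample measurement series; the mapping from 480 inputs to 8192 outputs is then left to the fully connected layers.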

2) 2D DATA IMAGE RECONSTRUCTION
In the first step, the "noisy" simulated data are supplied to the 2D CNN. Convolutions are performed with seven layers of 3×3 kernels, using "same" padding and a stride of 2×2, to preserve as much of the spatial extent of the input as possible. The convolutional layers all have the same number of filters and are followed by BN layers. Two fully connected layers with 64 and 8192 hidden units follow the convolution-pooling stage. An activation function and a BN layer follow every convolutional and linear layer. The full stack of 2D convolution layers, two fully connected layers, and BN layers is summarized in Table 3. The final outputs are computed using the softplus function. The 2D network model detected the anomalies accurately and showed that acceptable results can be attained using a 2D form of the optical measurements. Figure 11 provides the complete 2D CNN model description.
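A softplus output is a natural fit for optical coefficients because it is always positive. The sketch below shows a numerically stable softplus and how a flat vector of 8192 outputs could be split into two 64×64 maps. The helper names are hypothetical, and the ordering of the two maps (absorption first, then scattering) is an assumption for illustration, not something specified in the text.

```python
import math

def softplus(x):
    """Numerically stable softplus: log(1 + exp(x)), always positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def to_maps(flat, size=64):
    """Split a flat vector of 2*size*size values into two size-by-size grids."""
    n = size * size
    first, second = flat[:n], flat[n:2 * n]
    as_grid = lambda v: [v[r * size:(r + 1) * size] for r in range(size)]
    return as_grid(first), as_grid(second)   # assumed order: absorption, scattering

raw_outputs = [(-1.0) ** i * 0.001 * i for i in range(8192)]   # placeholder pre-activations
positive = [softplus(x) for x in raw_outputs]
mu_a_map, mu_s_map = to_maps(positive)
print(len(mu_a_map), len(mu_a_map[0]), min(positive) > 0.0)
```

For large positive inputs softplus is close to the identity, and for large negative inputs it approaches zero from above, so the reconstructed coefficients can never become negative.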

V. RESULTS
This study examines the effects of DL networks on DOT by training several model architectures. We developed this reconstruction method because classical regularization-based algorithms tend to produce over-smoothed images, which leads to poor image quality and inaccuracies in the reconstruction. The framework utilizes two networks to improve the reconstruction and the localization of tumors in DOT; batch normalization prevents model divergence and facilitates faster convergence through higher learning rates.
The following subsections provide qualitative analysis of reconstructed optical properties and more details for randomly selected samples from simulation and phantom test datasets.

A) PERFORMANCE ANALYSIS
It is challenging to compare reconstruction methods since there are no clearly defined tools to assess the quality of such a non-linear reconstruction problem; however, the relationship between inclusion size and contrast can be assessed. Contrast resolution is defined as a resolution measure based on the contrast of the optical property values of inclusions relative to the background. Each corresponding image can be visualized as a CSD resolution map. Over the region of interest, contrast resolution and size resolution are calculated to analyze the quantitative information of the reconstructed images [29,30]. The analysis is based on accuracy and density/saturation and has the benefit of being readily applied. According to this definition, the contrast resolution compares a sample's optical property values with those of the surrounding background, using the averages of the maxima and minima over all of the specified inclusion zones because some oscillations may occur in these regions. So that it matches the concept of contrast in [31] for assessing the visibility of a structure, this definition is further modified in terms of the contrast C and the intensity I. Moreover, to avoid probable outlier values that act as noise in the images, percentile values were employed instead; for instance, the averaged maxima and minima were replaced by the 90th and the 10th percentiles. Additionally, the size resolution was defined using the RMSE (root mean squared error), which is determined over the whole 2D imaging domain/region of interest between the original (specified) value of inclusions and the baseline (reconstructed) value, to evaluate the resolution of overall inclusion sizes. For the background optical coefficients, the baseline values were utilized. As a precaution against size overestimation, the contrast resolution is included in the size resolution.
During the evaluation process, we combined the two measures of contrast and size into a single numerical CSD measure, which allowed us to compare the results between the three reconstruction methods.
The CSD is used to cope with the drawbacks of contrast-detail analysis alone. The integrated contrast and size resolution evaluates both the contrast precision and the size accuracy of the image reconstruction scheme. It is worth noting that the measure emphasizes accuracy to prevent overestimation.
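Two ingredients of the analysis described in this subsection, the percentile-based replacement of maxima and minima and the RMSE over a 2D region, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the linear-interpolation percentile rule, and the toy values are hypothetical, and the sketch does not reproduce the paper's contrast, size, or combined CSD resolution formulas.

```python
import math


def percentile(values, q):
    """Percentile with linear interpolation; q is in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * q / 100.0
    low = math.floor(rank)
    high = math.ceil(rank)
    fraction = rank - low
    return ordered[low] + (ordered[high] - ordered[low]) * fraction


def robust_extrema(region_values):
    """Outlier-resistant stand-ins for max and min (90th and 10th percentiles)."""
    return percentile(region_values, 90), percentile(region_values, 10)


def rmse(original, reconstructed):
    """Root mean squared error between two 2D maps of equal shape."""
    squared_errors = [
        (o - r) ** 2
        for row_o, row_r in zip(original, reconstructed)
        for o, r in zip(row_o, row_r)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy values inside a hypothetical inclusion zone.
zone = [float(v) for v in range(11)]  # 0.0, 1.0, ..., 10.0
robust_max, robust_min = robust_extrema(zone)

# Toy 2x2 maps standing in for the specified and reconstructed images.
specified = [[1.0, 2.0], [3.0, 4.0]]
recovered = [[1.0, 2.0], [3.0, 6.0]]
error = rmse(specified, recovered)

print(robust_max, robust_min, error)
```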

B) NUMERICAL AND PHANTOM RESULTS
Two random case studies (Table 4) are presented. To evaluate the proposed methods, a breast-like phantom (circular phantoms only) was examined under different conditions to determine how the DL approach improves both resolution and contrast when reconstructing an image. The images were evaluated using the CSD analysis, which characterizes spatial resolution limits, signal-to-noise limits, and the tradeoff between contrast and size of objects. Using a threshold (contrast-to-noise ratio = 0.3) in images, the minimum acceptable noise level is approximated, which can be used to measure the human perception of objects and the hemoglobin concentration. For comparison with a classical method, we use the results of the Tikhonov regularization method [29]; for comparison with a learning method, BPNN is used. For the BPNN method, the same network was applied to our dataset. Figures 12 and 14 are each arranged in two rows (upper row for absorption coefficient recovery and bottom row for scattering coefficient recovery) and five columns, for the case studies with a single inclusion and with two inclusions, respectively. The ground truth images are shown in the first column; the reconstructed images using Tikhonov regularization, 2D CNN, 1D CNN, and BPNN are given in the second, third, fourth, and fifth columns, respectively. Figures 13 and 15 show the tumor localization results as the corresponding 1D circular profiles through the centers of the inclusions and along the x-axis for the absorption (upper row) and scattering (bottom row) coefficients. The green solid lines denote the reference solution/ground truth for all methods. The blue solid lines with triangles represent the 1D CNN method, while the purple solid lines with diamonds represent the 2D CNN. The comparison covers both classical and learning-based methods: black dotted lines and red dash-dot lines denote the Tikhonov regularization and BPNN solutions, respectively. The resolution measures reported are the 2D contrast, initial contrast, 2D size, and contrast-size resolution over the region of interest. The sizes and the actual values of both optical properties for each case are shown in Table 4.

1) PHANTOM CASE M4 STUDY
In this case study, we use a sample from the phantom test dataset containing a 5-mm-diameter inclusion located 10 mm off the center of the phantom (to the left) as an example for discussion. From the second to the last column of Figure 12, we can see that most values of the optical properties recovered using TR merge with the background (smooth edge problem) compared to the ground truth, whereas the optical properties recovered using 1D CNN and 2D CNN match their actual values. Compared with Figure 12 (b) and (e), the results from the proposed networks in Figure 12 (c) and (d) outperform those from TR and BPNN, especially for the one-inclusion case from the experimental dataset (lab/phantom data). A higher contrast was reconstructed using 1D CNN (for both optical properties) and 2D CNN (which captures absorption successfully, while its scattering coefficient shows some edge smoothness problems). Compared to the previous simulation results using one inclusion, 2D CNN has offered more improvement in contrast and size than the other reconstruction algorithms. The corresponding 1D profiles through the centers of the inclusions and along the x-axis are plotted in Figure 13 for comparison of the four schemes. Good agreement is found between 1D CNN and 2D CNN for single-inclusion cases. For the phantom experiment dataset, reconstruction of two inclusions is difficult due to the complex structure of contrast and background and the simplicity of our proposed models, as can be seen in the scattering coefficient plot of 2D CNN presented in Figure 13. BPNN does not perform well (Figure 12). The 1D CNN, however, is more efficient and accurate for localizing tumors in breast imaging when phantom data is considered. The performance analysis based on CSD comparisons for the one inclusion from Figure 12 is listed in Table 5.
The contrast and size resolution for the displayed sample in both schemes are presented in Table 5. It can be observed that the proposed networks produce an excellent resolution. Our experiments show that the 1D CNN has positive effects on the localization and reconstruction of optical properties. The proposed networks were validated against the classical method and a learning method available in the recent literature. The test set results of our networks were compared with those obtained from TR and BPNN and with the reference solutions. Both networks' numerical and phantom results were found to be in good agreement with TR and those learning approaches available in the literature. Figure 14 illustrates the ability to recover images with two inclusions using one iterative and three non-iterative methods.

2) SIMULATION CASE 12238 STUDY
Here, we present a case with two inclusions (sizes given in Table 4), which are reconstructed and localized successfully by the proposed networks without merging into the background (smooth edge problem). Both inclusions are observable and are reconstructed with their centers positioned correctly by both proposed algorithms. It is possible to determine the inclusions' shape and edges using 1D CNN and 2D CNN even when the second inclusion is very small, i.e., 2.87 mm. Both networks are well balanced and preserve the edge smoothness. In contrast, with the Tikhonov regularization-based reconstruction method and BPNN, the scattering and absorption coefficients of inclusions with small radii were underestimated, and the images could not be recovered. For the proposed networks, good agreement with the reference can be seen in the reconstruction of both properties, and both optical coefficients show better localization (as shown in Figure 15). BPNN performs very poorly (a straight line) compared to our proposed networks and TR. This shows that BPNN can only reconstruct simple case studies (absorption coefficient only) and does not perform well for complex data (Figure 14). The DL networks can capture one and two inclusions in the 1D solution profile; however, the 2D CNN captures the edge more sharply and more efficiently. Table 6 lists the CSD results for the DL networks alongside the reference solution, the classical method, and BPNN. The quantitative results are summarized and will be discussed in later sections. Compared with Tikhonov regularization and BPNN, the CSD values of the optical properties obtained using 1D CNN and 2D CNN are significantly higher (contrast-to-noise ratio = 0.3), which shows that 1D CNN and 2D CNN can yield higher reconstruction resolution accuracy.

VI. DISCUSSION
A high level of distortion that results from absorption and scattering makes image reconstruction extremely challenging. Because light is strongly scattered in tissue, the detected image of the underlying structures is blurred. Consequently, it is challenging to acquire quantitative information on structural and functional characteristics with DOT. The location of the reconstructed object is fairly accurate after the inverse problem has been solved and the data preprocessed. It is important to note that the MSE is calculated with respect to the intensity of each pixel within the image. The MSE of the inverse problem is very low due to the background noise and apparent mismatches at the boundary.
Additionally, one should note that although the location is reasonably accurate, the optical property reconstruction is poor for TR and BPNN because the contrast of the reconstructed image differs from that of the ground truth. Various techniques have been proposed in the past to overcome these challenges, with varying success. To quantitatively assess the images reconstructed using the DNN, the two measures (contrast and size resolution) defined in the previous section are calculated over the ROI. The contrast resolution focuses on the difference between the optical properties of the inclusions and those of the background region, whereas the size resolution focuses on the area of the inclusions compared to the total area. When the reconstructed result is identical to the expected one, the resolution value equals one.
Based on the simulation and phantom experiment results, we can say that the optical images reconstructed by a learning reconstruction technique strongly depend on the selection of convolution layers, especially when the inclusions are small. In this study, the absorption and scattering coefficients of breast phantoms were reconstructed using TR, BPNN, and deep neural networks. The neural network architecture for image reconstruction aims to visualize the absorption and scattering coefficient images of breast phantoms and improve their spatial resolution. A fast reconstruction method based on the DL approach is presented to recover the absorption as well as the scattering coefficients of the biological phantom. Here we use a DOT scanning system, as shown in the previous section, with one source and one detector at 16 uniformly distributed locations on a circle, i.e., 240 (16×15) measured data, which makes the instrumentation cost much lower than that of a typical DOT scanning system. The computational complexity of a 1D CNN under similar conditions (the same configuration, network, and hyperparameters) is substantially lower than that of a 2D CNN or compact design. For the same number of layers, a 2D CNN takes longer to train than a 1D CNN.
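The measurement count of 240 can be checked with a short enumeration. The sketch assumes that, for each of the 16 source positions, the detector visits each of the remaining 15 positions; this pairing rule and the variable names are assumptions made for illustration, since the text only states the 16×15 product.

```python
N_POSITIONS = 16  # uniformly distributed locations on the circular boundary

# Angular position of each location, in degrees.
angles_deg = [index * 360.0 / N_POSITIONS for index in range(N_POSITIONS)]

# Assumed pairing rule: the detector never sits at the source position.
source_detector_pairs = [
    (source, detector)
    for source in range(N_POSITIONS)
    for detector in range(N_POSITIONS)
    if detector != source
]

print(len(source_detector_pairs))
print(angles_deg[:3])
```

Each pair yields one amplitude and one phase value, which is consistent with the 2×240 network input described earlier.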

A) CONTRAST SIZE DETAIL ANALYSIS
To assess the imaging performance, a CSD analysis has been performed to determine the minimum detectable contrast level for inclusions. The proposed networks achieve better contrast precision and size accuracy in image reconstruction than TR and BPNN, and a higher number of samples passes the threshold. The deep learning technique outperformed the other approaches on the validation measures and on contrast and size resolution. We find that the proposed algorithms provide significant improvements in image accuracy and quality compared to the TR- and BPNN-based algorithms. Table 5 and Table 6 summarize the calculation of the two measures over a 2D ROI for all images in Figures 12 and 14, respectively. The results for the simulation dataset (Table 6) and the phantom dataset (Table 5) show that contrast and size are significantly improved in the optical property images reconstructed by the 1D/2D CNN.

B) CLASSIC AND LEARNING METHOD COMPARISON
In our work, we also compare the proposed technique with the widely used TR and BPNN [17] under the same conditions. Our results (Figures 12 and 14) show that the suggested approach yields a substantial improvement in accuracy and image quality compared to the TR- and BPNN-based approaches. Numerical simulation experiments confirmed the improved performance of the suggested approach. Tables 5 and 6 present the results of the qualitative analyses conducted to compare the tumor and background and to determine the significance of DL in DOT.
A larger number of samples shows improved resolution (Table 5), which indicates that the recovered images are nearly the same as the ground truth image. Similar results can also be observed in other cells in Table 6. These results show that 1D CNN and 2D CNN outperform TR and BPNN in terms of higher accuracy and better image quality.

C) EFFECT OF NORMALIZATION DURING TRAINING
Batch Normalization (BN) [25] matches the current state of the art on different challenges by reducing the internal covariate shift within the model and speeding up the learning process significantly. BN can be seen as an additional layer, in the same way as fully connected or convolutional layers, that can be inserted into the model architecture. Deep architectures have been notoriously slow to train because of internal covariate shift, which BN layers are designed to reduce. In the case of DOT image reconstruction, the use of batch-normalization layers improves the resolution and localization of lesions. There are several possible causes of covariate shift. One is the difference in distribution between the training and test sets. Another arises during the training phase, when the distribution of activations across layers changes due to the changes in network parameters; this is known as the internal covariate shift. One possibility for a covariate shift in DOI results from different distributions of the input signals (both amplitude and phase) as well as from different sources of training data (simulation with artificial noise added) and testing data (both simulation and phantom experiments). As a result, batch-normalized models achieve better validation and test accuracies across all test datasets.
A CSD analysis is carried out to check the effect of batch normalization layers on the resolution of the reconstructed optical properties of the breast. The effects of normalization on training are investigated by comparing batch-normalized and non-normalized networks. To avoid expensive computations of covariance matrices, we normalized each input feature independently to have zero mean and unit standard deviation at each layer. It can be observed in Table 7 (simulation test samples and experimental test samples) that more samples pass the threshold in the presence of BN layers. Compared with other learning models, including BPNN, our model with batch-normalized layers still successfully predicts most optical characteristics.
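The per-feature normalization described above can be sketched as a training-mode batch normalization step in plain Python. This is a generic illustration of the technique in [25], not the authors' code: the function name, the toy batch, and the default scale (gamma), shift (beta), and eps values are assumptions, and at inference time a framework would normally use running averages instead of batch statistics.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) of a batch to zero mean and unit variance.

    batch is a list of samples, each a list of feature values.
    gamma and beta are the learnable scale and shift, fixed here for illustration.
    """
    n_samples = len(batch)
    n_features = len(batch[0])
    output = [[0.0] * n_features for _ in range(n_samples)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n_samples
        # Biased (population) variance over the mini-batch.
        variance = sum((x - mean) ** 2 for x in column) / n_samples
        scale = gamma / math.sqrt(variance + eps)
        for i in range(n_samples):
            output[i][j] = scale * (column[i] - mean) + beta
    return output


# Toy batch: three samples with two features on very different scales.
batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(batch)

for j in range(2):
    column = [row[j] for row in normalized]
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    print(j, mean, variance)
```

Because each feature is normalized independently, no covariance matrix between features has to be computed, which is the saving referred to in the text.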

VII. CONCLUSION
The first 1D-CNN architecture for solving the inverse problem in DOT, reconstructing both optical properties in a frequency-domain system from 1D data, has been experimentally demonstrated. This paper has developed simple-to-implement deep learning networks with excellent resolution for localizing and reconstructing optical properties in simulations and in the phantom case study. According to our results, the proposed networks can reconstruct good-quality images more effectively than optimization-based methods (TR). It is noteworthy that the method trained on simulated data is also able to successfully reconstruct images from actual experimental (phantom) data. An essential advantage of the proposed approach is that image formation is fast compared to both the iterative Tikhonov regularization method and the non-iterative BPNN reconstruction. It was observed that 1D CNN gives better localization and resolution of tiny inclusions than the 2D CNN in phantom case studies, whereas 2D CNN performs poorly for phantom cases with single and two inclusions.