Highly Efficient Inverse Design of Semiconductor Optical Amplifiers Based on Neural Network Improved Particle Swarm Optimization Algorithm

An artificial-intelligence neural-network-improved particle swarm optimization algorithm is proposed for the inverse design of semiconductor optical amplifiers (SOAs). Seven input parameters, together with the current-gain curve and the saturation output power curve, form the data set, which is generated from the physical model of the SOA. Forecasting performance is improved by contrasting two back-propagation neural network training techniques (Scaled Conjugate Gradient and Levenberg-Marquardt) and two operational settings (Central Processing Unit and Graphics Processing Unit). Higher accuracy is achieved through feedback analysis of the number of neurons and the test error. With the addition of the back-propagation neural network, the fitness of the particle swarm algorithm mostly converges below $2\times 10^{-4}$. The relative difference between the original performance and the inverse prediction is close to 0%, which proves the effectiveness of the parameter extraction. This method exploits neural networks to improve the accuracy and speed of particle swarm optimization for efficient SOA inverse design and multi-solution analysis.

With the development of machine learning, deep neural networks can reveal implicit input-output relationships and thereby facilitate photonic prediction and inverse design [17] in a data-driven manner. In forward predictive design, artificial neural networks have been applied to simulate optical processes [18], analyse response feasibility [19], predict 2D and 3D structures [20], [21], and more. For inverse structure design, tandem neural networks [22], [23], [24] have been used to design multilayer films from target transmission spectra, as well as conventional nanophotonic structures. The combination of neural networks and PSO [17], [25] has been used in the inverse design of lasers, but it is limited in accuracy and is time-consuming. A Back Propagation Neural Network (BPNN) can achieve accurate predictions; combining a BPNN with the particle swarm optimization (PSO) algorithm speeds up computation, saves the time cost of training an inverse network, and enables faster extraction of compact model parameters.
In this work, we present, for the first time, an SOA model that combines the PSO algorithm with an artificial-intelligence neural network. The model automatically extracts parameters from the measured SOA performance curves, greatly enhancing the effectiveness of on-demand SOA design. Sections II and III demonstrate forward prediction and inverse design, respectively, and the conclusion is presented at the end.

II. FORWARD DESIGN
Utilizing forward design techniques, a more compact design model can be achieved for SOAs, making them even more attractive for various applications. The forward design data set is based on a physical model (traveling wave model) that describes the photoelectric process within the SOA. The traveling wave model [16], [26], [27], [28], [29], [30], which combines the optical wave equations, the photon density equation, and the carrier rate equation, is widely used in semiconductor active device simulation and has been demonstrated to provide an accurate simulation of the SOA. Further details regarding the model can be found in the appendix. A simple schematic of SOA is shown in Fig. 1.
In order to extract the characteristic parameters of the SOA, an artificial-intelligence neural network for high-precision prediction is constructed. Seven important parameters, involving structural parameters and material coefficients, are selected from the physical model of the SOA: injection efficiency η_effc (dimensionless), forward reflectance R_1 (dimensionless), backward reflectance R_r (dimensionless), cavity loss (in units of cm⁻¹), optical confinement factor Γ (dimensionless), active-region thickness d (in units of μm) and active-region width w (in units of μm). Although specific, these parameters can be generalized to other situations. They have been selected empirically with a focus on design aspects, and they have a large impact on the performance of a design-on-demand SOA. d and w are the parameters most often considered in the design of an SOA (the cavity length is generally fixed at 300 μm, 500 μm or 1000 μm); Γ and the loss are parameters that strongly influence the device; R_1 and R_r are parameters that inevitably deviate in fabrication (and should be kept as close to zero as possible); and η_effc is the parameter that influences the experiment. The values of the parameters are taken randomly within each specified range as η_effc = [0.
The network is trained with the Scaled Conjugate Gradient (SCG) variant of the conjugate gradient algorithm, which performs well on a wide variety of problems, especially for networks with a large number of weights. For large-scale networks, the SCG algorithm is faster than the Levenberg-Marquardt (LM) algorithm on function approximation problems and has relatively modest memory requirements.
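As a sketch of how such a training set might be generated, the seven parameters can be sampled uniformly within their design ranges. The bounds below are illustrative placeholders only; the paper's exact ranges are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 10_000

# Illustrative sampling ranges -- placeholders, NOT the paper's exact bounds.
ranges = {
    "eta_effc": (0.5, 1.0),   # injection efficiency (dimensionless)
    "R_1":      (0.0, 0.05),  # forward reflectance (dimensionless)
    "R_r":      (0.0, 0.05),  # backward reflectance (dimensionless)
    "loss":     (5.0, 30.0),  # cavity loss (cm^-1)
    "Gamma":    (0.05, 0.5),  # optical confinement factor (dimensionless)
    "d":        (0.05, 0.3),  # active-region thickness (um)
    "w":        (1.0, 5.0),   # active-region width (um)
}

# Draw each parameter uniformly within its range -> (n_samples, 7) input matrix;
# each row is then fed to the SOA physical model to produce the output curves.
X = np.column_stack([rng.uniform(lo, hi, n_samples)
                     for lo, hi in ranges.values()])
print(X.shape)  # (10000, 7)
```

Each sampled row is paired with the curves computed by the travelling-wave model to form one input-output record of the data set.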
As shown in Table I, SCG has slightly higher mean square errors (MSE, i.e., the difference between the NN-predicted curves Ŷ_i and the curves Y_i calculated with the SOA physical model, as defined in (1)) than LM when trained and tested with the same number of neurons, but SCG can reach approximately the same accuracy as, or even exceed, the LM algorithm when the number of neurons is increased. The use of the SCG-BP neural network allows for faster prediction and extraction of parameters.
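The error metric of (1) is the plain mean square error over all output points, which can be written compactly as:

```python
import numpy as np

def mse(y_pred, y_true):
    """Mean square error between the NN-predicted curves (Y-hat) and the
    curves calculated with the SOA physical model, averaged over all
    280 output points, as in Eq. (1)."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2))

# Sanity check: identical curves give zero error.
print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 0.0
```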
The designed network has three hidden layers and takes the seven input parameters described above. According to Fig. 2, the outputs are a G-I curve with 80 data points and two saturation output power curves with 100 data points each, for a total of 280 output values. Before the data sets are stored in the database for neural network training, they undergo a mean-normalization operation to lessen the weight imbalance caused by the magnitude disparities between the output data. Of the 10,000 data sets in total, the first 9,500 are used for training and the last 500 for testing; three hidden layers guarantee that complex feature parameters can be learned (too few layers result in underfitting, too many in overfitting). During training of the forward neural network, the data are allocated to training, validation, and test sets in the proportions 70%, 15%, and 15%, respectively. The dependence of the network on the number of neurons is examined in order to choose the right number for the network topology. As shown in Fig. 3, the MSE and the time taken to predict the G-I and saturation output power curves are calculated for networks with different numbers of neurons. The test MSE changes little beyond 20 neurons, and good results, with lower training error and shorter time, are obtained at 52 neurons.
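The topology described above (7 inputs, three hidden layers of 52 neurons, 280 outputs) and the mean normalization can be sketched with a plain numpy forward pass. The random weights below stand in for the trained SCG-BP parameters, and the tanh activation is an assumption; this is an architectural illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Topology from the text: 7 inputs -> 3 hidden layers of 52 neurons -> 280
# outputs (80-point G-I curve + two 100-point saturation power curves).
layer_sizes = [7, 52, 52, 52, 280]

# Randomly initialised weights stand in for the trained network parameters.
weights = [rng.normal(0, 0.1, (m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    """Forward pass: tanh hidden activations, linear output layer (assumed)."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(a @ W + b)
    return a @ weights[-1] + biases[-1]

def mean_normalize(Y):
    """Mean normalization applied to output curves before training, to reduce
    the weight imbalance from magnitude disparities between outputs."""
    return (Y - Y.mean(axis=0)) / (Y.max(axis=0) - Y.min(axis=0))

x = rng.uniform(size=7)               # one set of the seven SOA parameters
y = forward(x)
Y = rng.uniform(size=(100, 280))      # stand-in for 100 output curves
Yn = mean_normalize(Y)
print(y.shape, Yn.shape)  # (280,) (100, 280)
```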
With 52 neurons per layer, the performance of the Graphics Processing Unit (GPU) and the Central Processing Unit (CPU) is compared when training the network with the same algorithm. As shown in Fig. 4, the CPU takes less time for small amounts of data, while for more than 4,000 data sets the GPU trains faster. To further validate the predictive capability of the designed network, its performance is tested on three sets of sample data taken randomly from the 500-set test dataset (distinct from the training dataset); the network predictions are compared with the physical-model calculations and the results are shown in Fig. 5.
It can be seen that the designed neural network predicts the SOA performance parameters with high accuracy, extracting the non-linear relationships between the multiple input parameters and the corresponding outputs.
The physical model takes 14,049.816 seconds to calculate 500 sets of data (one G-I curve plus the saturated output curves at bias currents of 350 mA and 500 mA), whereas the neural network simulates the same 500 sets in 0.362 seconds, about 38,812 times faster, enabling faster parameter extraction.

III. INVERSE DESIGN
For the inverse design, i.e., extracting the non-linear parameters of the SOA, the parallel PSO algorithm can be used to find local and global optimal solutions [31], [32]. The BPNN is used to optimise the PSO algorithm (BP-PSO) so that it iterates more rapidly while maintaining high accuracy. Moreover, it avoids the single-solution limitation of a cascaded network and enables the search for multiple solutions, so researchers can choose the appropriate solution according to the process capability.
The flowchart and iterative process for BP-PSO are shown in Fig. 6. The first step is initialisation: set the number of particles, the number of iterations, the acceleration factors, and the boundary range of the algorithm. The second step is data generation: the PSO algorithm randomly generates parameters within the given range. The third step is selection of the fitness: the initial fitness of each particle is calculated using, as the fitness function, the mean square error between the output curve predicted by the neural network embedded in the PSO algorithm and the target curve. The fourth step is recording: the optimal point is selected and the local optimum p_best and global optimum g_best are recorded. The fifth step is the constrained update: the velocity and position of each particle are updated according to the velocity and position update formulas, and particles that cross the boundary are constrained back into the search range. The sixth step is calculating the fitness: the fitness of each particle is calculated with the selected fitness function. The seventh step is comparing the records: each particle is compared with its historical optimum, and the better point is selected as the new local optimum p_best and global optimum g_best. The last step is the judgement: determine whether the termination condition (maximum number of iterations or minimum accuracy) has been reached; if yes, output the optimum parameters, if not, return to step five.
A randomly selected set of data for BP-PSO inverse design allows multiple solutions to be obtained for analysis (Table II), i.e. one set of performance-parameter curves can correspond to multiple combinations of structural design parameters; two sets of extracted parameters are listed there for comparison with the original data.
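The steps above can be sketched as a minimal BP-PSO loop. The trained forward BPNN is replaced here by a hypothetical random linear map (`surrogate_nn`) purely so the sketch runs end to end, and the bounds, inertia weight and acceleration factors are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical stand-in for the trained forward BPNN: maps the 7 SOA
# parameters to a 280-point performance curve.
W = rng.normal(size=(7, 280))
def surrogate_nn(params):
    return params @ W

lo, hi = np.zeros(7), np.ones(7)             # illustrative search bounds
target = surrogate_nn(rng.uniform(size=7))   # target performance curve

def fitness(params):
    # Fitness = MSE between the NN-predicted curve and the target curve.
    return np.mean((surrogate_nn(params) - target) ** 2)

n_particles, n_iters = 30, 200
w_in, c1, c2 = 0.7, 1.5, 1.5                 # inertia and acceleration factors

# Steps 1-4: initialise particles, evaluate fitness, record p_best / g_best.
pos = rng.uniform(lo, hi, (n_particles, 7))
vel = np.zeros_like(pos)
p_best = pos.copy()
p_best_fit = np.array([fitness(p) for p in pos])
g_best = p_best[p_best_fit.argmin()].copy()
init_best = p_best_fit.min()                 # kept for comparison

# Steps 5-8: constrained update, re-evaluate, compare records, repeat.
for _ in range(n_iters):
    r1, r2 = rng.uniform(size=(2, n_particles, 7))
    vel = w_in * vel + c1 * r1 * (p_best - pos) + c2 * r2 * (g_best - pos)
    pos = np.clip(pos + vel, lo, hi)         # constrain out-of-bound particles
    fit = np.array([fitness(p) for p in pos])
    improved = fit < p_best_fit
    p_best[improved], p_best_fit[improved] = pos[improved], fit[improved]
    g_best = p_best[p_best_fit.argmin()].copy()

print(p_best_fit.min())  # best fitness found by the search
```

Because the fitness calls the fast neural network rather than the travelling-wave model, each iteration is cheap, and repeating the search from new random seeds yields the multiple candidate solutions discussed below.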
For the inverse design, the high accuracy can be verified by comparing the performance simulated from the predicted parameter sets with the original performance parameters. Here, P1 and P2 from Table II are compared with the original data in Fig. 7. The simulated performances of the parameter sets predicted by the inverse model are close to the target values. Moreover, Fig. 7(a) shows the relative difference between the original and inverse predictions in percentage; the percentage error is essentially close to 0%. Overall, these results provide confidence in utilizing BP-PSO to inversely design SOA structures. For the 25 rounds of the PSO search, Fig. 8 indicates that the BP-PSO search is not generated aimlessly and randomly, but is distributed, with some fluctuation, within a certain range around the original data. The parameters η_effc, loss and Γ are very concentrated, while the w and d distributions are somewhat scattered; in the carrier rate equation of the SOA model, w and d appear as a product, so small changes in either may cause fluctuations in the whole model. BP-PSO can thus perform inverse design with multiple physically meaningful solutions, providing more design possibilities for the researcher.
R_1 and R_r are more scattered, probably owing to the square root taken in the boundary conditions, and they indicate the margin of error within which the manufacturing process can be appropriately relaxed. To show the accuracy of the algorithm clearly, the distribution of the fitness function is plotted: in most of the 25 rounds of the repeated search, the fitness falls below 2 × 10⁻⁴, which shows that high-precision inverse design parameters can be obtained automatically by the BP-PSO algorithm. Compared with the 2 × 10⁻³ accuracy of NN-PSO [17], this represents one order of magnitude higher inverse-design accuracy. The standard deviations (STD) and mean values are listed in Table III, demonstrating that there is no need to manually adjust the extracted parameters and that valid device parameters can be extracted in a short time by BP-PSO, solving the non-linear multiple-solution problem of the SOA.
The means and standard deviations of ten randomly selected test datasets are used to further validate the accuracy of the BP-PSO algorithm in inversely extracting parameters for different datasets, as shown in Fig. 9. The figure shows the means of the seven inversely designed parameters with standard-deviation bars; compared with the corresponding original data (red circles), the parameters η_effc, loss and Γ are predicted more accurately, while the remaining parameters are correspondingly scattered but still provide a valuable reference.
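The summary statistics reported in Table III and Fig. 9 amount to per-parameter means, standard deviations and relative differences over the repeated search rounds. A sketch with hypothetical placeholder data (the `original` vector and the 1% noise level are assumptions, not the paper's values):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical example: 25 repeated BP-PSO rounds each return a 7-parameter
# solution; per-parameter mean and STD summarise how concentrated the
# multiple solutions are around the original data.
original = np.array([0.8, 0.01, 0.01, 15.0, 0.2, 0.12, 2.5])  # placeholder
solutions = original + rng.normal(0, 0.01, size=(25, 7)) * original

means = solutions.mean(axis=0)
stds = solutions.std(axis=0)
rel_diff_pct = 100 * np.abs(means - original) / np.abs(original)
print(rel_diff_pct.max() < 5)  # True: all means within a few percent
```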

IV. CONCLUSION
An optimized PSO algorithm based on a Scaled Conjugate Gradient back-propagation neural network is proposed for the inverse design of SOAs, achieving multi-solution parameter extraction with high accuracy and computational speed. After comparing the Scaled Conjugate Gradient and Levenberg-Marquardt algorithms, as well as the influence of the GPU and CPU environments on the neural network, the SCG algorithm is selected to train the back-propagation neural network in the GPU environment. The neuron-dependence study of the network (7 inputs and 280 outputs) shows that 52 neurons per hidden layer gives the most appropriate training and test MSE. The reliability of the algorithm for predicting the G-I curve and the saturated output power curve is confirmed by performance-parameter extraction based on the SOA travelling-wave model. In the inverse design of the SOA, the fitness of BP-PSO mostly converges below 2 × 10⁻⁴, which proves the effectiveness of the parameter extraction; this is further verified by comparing the means and standard deviations of 10 test data sets.

APPENDIX PHYSICAL MODEL OF A SOA
A ridge-waveguide SOA is simulated using a one-dimensional travelling wave model, as shown in (A.1), where F(z, t) and R(z, t) are the amplitudes of the forward and backward waves respectively, v_g is the group velocity, κ is the mutual coupling coefficient of the forward and backward waves, G is the modal gain, δ is the detuning factor, and S_f(z, t) and S_r(z, t) are the amplitudes of the forward and backward spontaneous emission noise respectively. The boundary conditions for (A.1) are given in (A.2), where F(0, t) and R(l, t) are the amplitudes of the forward light field at the front facet and the backward light field at the rear facet respectively, r_1 and r_2 are the reflectivities at the front and rear facets respectively, E_s(t) is the complex amplitude of the incident signal light and ω_0 is the reference frequency.
The modal gain in (A.1) can be expressed as
G = Γ g_m − α (A.3)
where α is the waveguide loss, g_m is the material gain for bulk material and Γ is the optical confinement factor, defined as the ratio of the overlap of the optical field with the active region to the overall optical field distribution. The material gain can usually be expressed by the empirical formula
g_m = g_p (N − N_0) / (1 + εP) (A.4)
where g_p is the material differential gain, N is the carrier concentration, N_0 is the transparency carrier concentration, ε is the non-linear gain saturation factor and P is the photon density, which can be expressed as in (A.5), where ℏ is the reduced Planck constant, ω is the angular frequency and A is the cross-sectional area of the active region. n_effc is the material equivalent refractive index, expressed in (A.6) in terms of Γ, α_m and g_m, where n_eff0 is the equivalent refractive index in the absence of current injection. The amplitude of the spontaneous emission noise in (A.1) follows a Gaussian distribution, its phase is uniformly distributed between 0 and 2π, and its autocorrelation function is given in (A.7). The carrier rate equation is used here to approximate the carrier variation, since it has little bearing on the precise computation of the optical field distribution in the SOA. It expresses the carrier generation rate due to electron injection, minus the carrier consumption rate of the various recombination processes and the carrier rate consumed in photon generation:
dN/dt = ηI/(eV) − (AN + BN² + CN³) − v_g g_m P (A.8)
where η is the current injection efficiency, I is the injection current, e is the elementary charge, and A, B, C are the coefficients of linear recombination, bimolecular radiative recombination and Auger recombination, respectively. V is the total volume of the active region,
V = w d l (A.9)
where w is the width and d the thickness of the active region.
The SOA gain is calculated as
Gain (dB) = 10 lg(P_out / P_in) (A.10)
Equations (A.1) to (A.10) together form the set of physical models describing the SOA. The SOA requires a high carrier concentration to maintain high gain and large saturation power; it also requires a small optical confinement factor, which allows the evanescent field of the light wave to extend into the P-type cladding. The other parameters of the SOA are listed in Table IV.
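As a numerical illustration of the appendix, the steady-state carrier concentration can be found by setting dN/dt = 0 in the carrier rate equation (with negligible photon density), then the unsaturated material gain and the dB gain of (A.10) follow directly. All numerical constants below are hypothetical placeholders chosen for plausible magnitudes, not the paper's values.

```python
import numpy as np

# Illustrative constants -- NOT the paper's values (SI units).
eta = 0.8                          # injection efficiency
I = 0.35                           # injection current, A (350 mA)
e = 1.602e-19                      # elementary charge, C
w, d, l = 2e-6, 0.1e-6, 500e-6     # active-region width, thickness, length, m
V = w * d * l                      # active-region volume, m^3 (A.9)
A, B, C = 1e8, 1e-16, 3e-41        # recombination coefficients, SI units

# Steady state of (A.8) with P ~ 0:  eta*I/(e*V) = A*N + B*N^2 + C*N^3,
# a cubic in N with exactly one positive real root (Descartes' rule).
rate_in = eta * I / (e * V)
roots = np.roots([C, B, A, -rate_in])
N = max(roots.real)                # the positive real root is the largest

# Unsaturated material gain from the empirical formula (A.4), with P = 0.
g_p, N0 = 3e-20, 1e24              # differential gain (m^2), transparency density (m^-3)
g_m = g_p * (N - N0)

# Gain in dB from (A.10) for an assumed input/output power pair.
P_out, P_in = 0.1, 0.001           # W
gain_dB = 10 * np.log10(P_out / P_in)
print(round(gain_dB, 1))  # 20.0
```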