Machine-Learning Assisted Prediction of Spectral Power Distribution for Full-Spectrum White Light-Emitting Diode

The full-spectrum white light-emitting diode (LED) emits light over a broad wavelength range by mixing the emissions from multiple LED chips and phosphors. With a color rendering index close to that of sunlight and a high color fidelity index, it has great potential for use in healthy lighting, high-resolution displays and plant lighting. The spectral power distribution (SPD) of a light source, which represents its light quality, is dynamically influenced by complex electrical and thermal loadings when the light source operates under usage conditions. Therefore, dynamic prediction of the SPD of the full-spectrum white LED has become a hot but challenging research topic in high-quality lighting design and application. This paper proposes a dynamic SPD prediction method for the full-spectrum white LED by integrating the SPD decomposition approach with an artificial neural network (ANN) based machine learning method. Firstly, the continuous SPDs of a full-spectrum white LED driven by an electrical-thermal loading matrix are discretized into spectral characteristic parameters by multi-peak fitting with the Gaussian model. Then, the Back Propagation (BP) and Genetic Algorithm-Back Propagation (GA-BP) neural networks (NNs) are used to predict the spectral characteristic parameters of LEDs operated under any usage conditions. Finally, the dynamically predicted spectral characteristic parameters are used to reconstruct the SPDs. The results show that: (1) the spectral characteristic parameters obtained by fitting with the Gaussian model can represent the light emitted from the multiple chips and phosphors in a full-spectrum white LED; (2) the prediction errors of both the BP NN and the GA-BP NN can be kept at a low level, that is, the proposed method can achieve highly accurate dynamic SPD prediction for the full-spectrum white LED operated under different mission profiles.


Introduction
With the improvement of living standards, people's requirements on lighting have gradually shifted from environmental protection and energy saving to the pursuit of health and comfort. However, several traditional light-emitting diode (LED) products have a high proportion of short-wavelength blue light and a low proportion of long-wavelength red light, which can lead to visual fatigue and vision loss [1]. Thus, the design of the next generation of LED light sources must not only achieve low cost and high luminous efficiency, but also meet the demands of health, comfort, high color quality, low flicker, high reliability and so on [2]. As natural sunlight is the most suitable and comfortable light for human beings, the full-spectrum white LED has great potential in indoor lighting, medical and human centric lighting, special displays, plant lighting and other fields [3]. Currently, a common design of the full-spectrum white LED is to mix the light emitted from multiple LED chips and phosphors [4]. As a fundamental performance indicator, the spectral power distribution (SPD) of a full-spectrum white LED is complex and always influenced by electrical and thermal loadings [5]. Therefore, achieving dynamic prediction of the SPD to simulate the natural light spectrum has become one of the essential but challenging research topics in future human centric lighting design and application.
The SPD describes the functional relationship between spectral density and wavelength, where the spectral density represents the radiation energy per unit wavelength range. Because many internal and external factors determine the SPD of an LED light source, it is difficult to predict it dynamically with high accuracy. At present, several studies have addressed SPD prediction. For instance, J. C. C. Lo et al. [6] proposed a mathematical model in which the excitation and emission spectra of various phosphors were used as input parameters to predict the emission spectra of LEDs with multiple phosphors, and compared the predictions with experimental results. They found that the proposed model can accurately estimate the emission spectrum of an LED with multiple mixed phosphors. P. Dupuis et al. [7] developed an SPD prediction method using a low-order polynomial function with current as the only input parameter. Moreover, the SPD decomposition approach with statistical models is used to discretize the continuous spectrum of an LED, and the statistical characteristic parameters are extracted to represent the whole information in an SPD. The commonly used statistical models include the Gaussian model, the asymmetric double sigmoidal model, the Lorentz model and so on [8]-[11]. J. J. Fan et al. [12] used the Gaussian and Lorentz models to extract SPD features, and achieved dynamic and accurate prediction of the color coordinates, correlated color temperatures (CCTs) and color rendering indices (CRIs), as well as estimation of the residual life of phosphor-converted white LEDs (PC-wLEDs). C. Qian et al. [13] modeled the SPD of a PC-wLED by superimposing two asymmetric double sigmoidal (Asym2sig) functions. H. T. Chen et al. [14] provided a method to predict the instantaneous changes of CCT and CRI when the power of an LED system is changed, in which the Gaussian function was also used to model the SPD of the LED system. M. H. Chang et al. [15] used the multiple-peak fitting method to extract SPD features for PC-wLEDs and applied principal component analysis (PCA) to reduce the dimensionality of the features, finally shortening the LED qualification test period from 6000 hours to 1200 hours. In general, the statistical characteristic parameters extracted from the SPD decomposition model have proved to be an effective way to discretize the continuous spectrum; however, the multidimensional data mining and processing of the extracted parameters becomes another challenge in SPD prediction.
In general, machine learning (ML) is a set of methods that can be used to learn and detect patterns from input data samples and to use the uncovered patterns for further decision making in prognostics or for predicting future data [16]. ML generally includes supervised learning, unsupervised learning and semi-supervised learning. In supervised learning, a labeled or classified set of input data is used to estimate and predict the resulting output pattern. As a result, it depends on the learning method to discover the grouping of the input data or the desired pattern. As one of the most popular supervised learning methods, the Artificial Neural Network (ANN) has been proven to be an effective tool for data mining and processing. It abstracts the human brain neuron network from the perspective of information processing, establishes a simple model, and forms different networks according to different connection modes [17]. As it has the advantages of function approximation, self-learning, complex classification, associative memory, fast optimization, and the strong robustness and fault tolerance brought by highly parallel distributed information storage [18], [19], the ANN has been widely used in many fields, such as medicine [20], [21], biology [22], [23], physics [24], and even the LED field [25]. For example, K. Y. Lu et al. [26] proposed a lifetime prediction method based on the multi-dimensional back propagation neural network (BP-NN), in which the BP-NN improved by the Adaboost algorithm lowered the life prediction error at the cost of a longer operation time. Aiming at the high cost of reliability prediction and evaluation of high-power LEDs, Yan [27] proposed an intelligent prediction method based on a dynamic neural network. This method was shown to have good extrapolation ability and robustness, and it can successfully predict the life of a high-power LED in a short time with a prediction error of less than 5%. Liu et al. [28] combined the finite element modeling (FEM) analysis of a single heat transfer physics field with the ANN to present a more efficient method for heat dissipation analysis of multi-chip LED light sources. However, the limitation of the BP-NN is that its convergence speed is slow and it easily falls into a local minimum [29]. The Genetic Algorithm (GA) is a computational model that simulates the natural selection and genetic mechanism of Darwin's biological evolution theory. It searches for the optimal solution by simulating the natural evolution process [30], [31]. The GA can be used to optimize the initial weights and thresholds of the neural network to prevent the BP-NN from falling into a local minimum during training. L. Liu et al. [32] used the GA-BP neural network to recognize the alphabet. The results show that the network optimized by the genetic algorithm converges faster and more accurately than the BP-NN. J. W. Gao et al. [33] predicted short-term traffic flow and applied the GA to optimize the weights and thresholds of the BP-NN. The results also show that the new model is superior in convergence speed and prediction accuracy.
Taking into consideration the dynamic change and complexity of the SPDs of a full-spectrum white LED operated under different electrical and thermal loadings, this paper combines the SPD decomposition method with ANN-based machine learning methods (i.e., BP-NN and GA-BP NN) to realize the dynamic prediction of SPDs for a full-spectrum white LED. The remainder of this paper is organized as follows: Section 2 presents the SPD decomposition model and the ANN theory. Section 3 introduces the test sample, experimental setups and collected data used in this study. Section 4 compares and discusses the prediction results obtained by using the BP-NN and the GA-BP NN. Finally, the concluding remarks are presented in Section 5.

The SPD Decomposition Model
Generally, the SPD of a full-spectrum white LED contains many peaks, which can be modeled by a multi-peak function. In the SPD decomposition model proposed in this paper, a summation of symmetric Gaussian functions is chosen to fit and decompose the SPD of a full-spectrum white LED, as expressed in Eq. (1).
In Eq. (1), y_0 represents the initial value, and A, x_c and w are the peak area, peak wavelength and half-wave width of each emitted spectrum, respectively.
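The multi-peak model can be illustrated with a short Python sketch. It assumes that each peak takes the area-normalized Gaussian form commonly used in peak-fitting software, y = y_0 + Σ A_i / (w_i·sqrt(π/2)) · exp(−2(x − x_c,i)² / w_i²), which is consistent with the parameter definitions above but is not confirmed by the text. The function names and the four-peak parameter values are hypothetical and serve only as an illustration.

```python
import math

def gaussian_peak(x, area, center, width):
    """Area-normalized Gaussian peak (assumed form); integrates to `area` over all x."""
    amplitude = area / (width * math.sqrt(math.pi / 2))
    return amplitude * math.exp(-2 * ((x - center) / width) ** 2)

def multi_peak_spd(x, y0, peaks):
    """Baseline y0 plus a sum of Gaussian peaks; `peaks` is a list of (A, x_c, w)."""
    return y0 + sum(gaussian_peak(x, a, c, w) for a, c, w in peaks)

# Hypothetical four-peak parameter set (illustrative values, not measured data).
example_peaks = [
    (12.0, 450.0, 20.0),
    (8.0, 490.0, 25.0),
    (30.0, 560.0, 90.0),
    (15.0, 630.0, 22.0),
]

# Evaluate the model on a 380-780 nm grid in 1 nm steps.
wavelengths = [380 + i for i in range(401)]
spd = [multi_peak_spd(x, 0.0, example_peaks) for x in wavelengths]
```

In this parameterization the SPD of one operating condition is described by 13 numbers (y_0 plus A, x_c and w for each of four peaks), which is what makes the later parameter-by-parameter prediction tractable.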

BP-NN and GA-BP NN
As mentioned before, an ANN is a complex network structure formed by the interconnection of a large number of processing units (neurons), and is an abstraction, simplification and simulation of the structure and operation mechanism of the human brain. Fig. 1 is a structure diagram of a single-hidden-layer neural network with n inputs and m outputs. An ANN does not give the exact relationship between input and output. It only presents the unsteady factors that cause output changes, i.e., non-constant parameters. In this paper, the BP-NN is first used to predict the SPD of a full-spectrum white LED. The BP-NN is a multi-layer feedforward neural network trained by the error back propagation algorithm, and is currently the most widely used ANN. Its basic idea is the gradient descent method, which uses the gradient search technique to minimize the mean square error between the actual output value and the expected output value of the network. However, the BP-NN also has some drawbacks: it easily falls into local minima, and there is no theoretical guidance for choosing the number of network layers and the number of neurons. Currently, there are some improved BP-NN approaches that accelerate the convergence of the network and avoid falling into a local minimum. To optimize the BP-NN, the GA is integrated with the BP-NN in this paper to reduce the blind search in the early stage of the BP-NN algorithm and to find the optimal weights and thresholds. Fig. 2 shows the flowchart of the GA optimization.
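The error back propagation idea described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the tanh hidden activation, the linear output neuron, the per-sample weight update, the learning rate and the toy training data are all assumptions chosen for brevity.

```python
import math
import random

def init_network(n_in, n_hidden, rng):
    """Random initial weights and thresholds (biases) for one hidden layer, one output."""
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": rng.uniform(-0.5, 0.5),
    }

def forward(net, x):
    """Return the hidden activations (tanh) and the linear network output."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["w1"], net["b1"])]
    y = sum(w * hj for w, hj in zip(net["w2"], h)) + net["b2"]
    return h, y

def train_step(net, x, target, lr):
    """One gradient-descent update on the squared error 0.5 * (y - target)^2."""
    h, y = forward(net, x)
    err = y - target
    for j, hj in enumerate(h):
        # Error propagated back through the output weight and the tanh derivative.
        grad_h = err * net["w2"][j] * (1 - hj * hj)
        net["w2"][j] -= lr * err * hj
        net["b1"][j] -= lr * grad_h
        for i, xi in enumerate(x):
            net["w1"][j][i] -= lr * grad_h * xi
    net["b2"] -= lr * err

def mse(net, data):
    """Mean square error of the network over (input, target) pairs."""
    return sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data)

# Toy data: two normalized inputs and a hypothetical smooth target.
rng = random.Random(0)
data = [((t / 4, i / 4), 0.2 + 0.5 * (i / 4) - 0.3 * (t / 4))
        for t in range(5) for i in range(5)]
net = init_network(2, 7, rng)
mse_before = mse(net, data)
for _ in range(200):
    for x, target in data:
        train_step(net, x, target, lr=0.05)
mse_after = mse(net, data)
```

Because the starting point is drawn at random, different seeds can end in different minima, which is the sensitivity that the GA initialization is meant to reduce.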

Test and Data Acquisition
This paper chooses a full-spectrum white LED package with high CRI (Ra = 90∼92, CCT = 4800∼5200 K). It is packaged with a cyan LED chip (I_F,C = 150 mA), a blue LED chip (I_F,B = 150 mA) and a red LED chip (I_F,R = 30 mA, V_F,R = 2.0∼3.0 V), and is coated with a yellow phosphor layer. The cyan chip is connected in series with the blue chip (I_F,CB = 150 mA, V_F,CB = 5.0∼6.0 V). The package size is 5.2 mm × 5.4 mm, as shown in Fig. 3.
In order to realize the dynamic prediction of the SPD of the full-spectrum white LED operated under different electrical and thermal conditions, we first used an integrating sphere to measure the SPD of the test sample driven by an electrical-thermal loading matrix. As shown in Fig. 4, the experimental setup consists of an integrating sphere (Model: EVERFINE HASS20) to collect the SPD data, a DC power supply (Model: KEYSIGHT N5751A) to provide the drive current, and a temperature control platform (Model: EVERFINE CL-200) to control the case temperature of the test sample. The ranges of case temperature and drive current selected for the test sample are shown in Table 1. The selected full-spectrum white LED package needs two drive currents to power the red LED chip and the series-connected cyan-blue LED chips separately, and a synchronous change of the two drive currents is recommended for the test samples. Usually, the operating temperature limit of most LEDs is between 80 and 100°C. In addition, an excessively high temperature and a large drive current may cause serious degradation and even catastrophic failure of LEDs. Considering the rated currents and the actual operating temperature limit, the maximum case temperature was selected as 80°C and the maximum drive currents were set as 40 mA and 200 mA for the red LED chip and the series-connected cyan-blue LED chips, respectively. Fig. 5(a) shows the SPD trend of the test sample operated under different drive currents when the case temperature is controlled at 25°C, where the SPD intensity increases as the drive current rises. Meanwhile, Fig. 5(b) shows the SPD change under different case temperatures when the drive current is fixed at 150 mA, which indicates that the case temperature has a negative impact on the SPD.

SPD Decomposition Results
As described in Section 2, the number of fitted peaks is selected according to the spectral power distribution of the LED concerned. Four fitted peaks are selected for the full-spectrum LED used in this paper. As calculated from Table 1, 372 sets of test conditions are considered in this paper. Firstly, the SPD decomposition model described in Eq. (1) is validated by using the data collected under four sets of test conditions, as shown in Table 2. The SPD decomposition modeling with the Gaussian function fitting is presented in Fig. 6, which includes the original SPD, the four extracted individual spectra, and the cumulative peak-fitting model. The goodness of fit is evaluated by using the coefficient of determination (R²). The maximum value of R² is 1, and the closer R² is to 1, the better the regression line fits the observed values. Here, we fit sixty sets of SPD data and obtain all R² values. As shown in Table 3, all R² values are larger than 0.98, which indicates that the Gaussian-based SPD decomposition model used in this study is appropriate to represent the original SPD of the full-spectrum white LED.
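For reference, the coefficient of determination can be computed as in the following Python sketch, which uses the standard definition R² = 1 − SS_res / SS_tot. The function name and the sample values are hypothetical and are not taken from Table 3.

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and fitted spectral intensities at five wavelengths.
observed = [0.10, 0.40, 0.90, 0.50, 0.20]
fitted = [0.12, 0.38, 0.88, 0.52, 0.19]
fit_quality = r_squared(observed, fitted)
```

A fit would pass the criterion used in this section when `fit_quality` exceeds 0.98.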

SPD Prediction With BP-NN
According to the above analysis, the SPD of the full-spectrum LED package is strongly affected by the case temperature and the drive current. The neural network modeling in this paper is implemented in MATLAB. As the junction temperature (T_j) is not effectively monitored during operation, the case temperature (T) and the drive current (I) are selected as the inputs of the BP-NN model used in this study. The outputs are the characteristic parameters extracted from the Gaussian function fitting, namely y_0 and the A, x_c and w of each fitted peak. A separate BP-NN is used to predict each output parameter, so each network has two input neurons and one output neuron. Eq. (2) is used to calculate the number of hidden-layer neurons of the network.
In Eq. (2), q is the number of hidden-layer neurons, n is the number of input neurons, m is the number of output neurons and a is an adjustment constant between 1 and 10. q = 7 is chosen in this paper. The designed neural network structure is shown in Fig. 7. Next, the 372 sets of collected SPD data are used to validate the BP-NN in SPD prediction: 352 sets are randomly selected as the training set for the network shown in Fig. 7, and the remaining 20 sets are used for testing. The gradient descent method is used to train the neural network. The learning rate is usually 0.01 or 0.1; here, 0.01 is chosen. The maximum number of training iterations is set as 1000. Table 4 shows the input data (case temperature and drive current) of the randomly selected test samples used in prediction. It can be seen from the table that the test samples are well dispersed, covering a variety of case temperatures and drive currents.
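The choice of hidden-layer size can be checked with a short calculation. The block below assumes that Eq. (2) takes the widely used empirical form q = sqrt(n + m) + a; this form is inferred from the symbol definitions given here and is an assumption rather than a quotation of the original equation.

```latex
\begin{align*}
q &= \sqrt{n + m} + a, \qquad 1 \le a \le 10 \\
n = 2,\ m = 1 &\;\Rightarrow\; \sqrt{n + m} = \sqrt{3} \approx 1.73 \\
&\;\Rightarrow\; 2.73 \lesssim q \lesssim 11.73 \\
q = 7 &\;\Leftrightarrow\; a = 7 - \sqrt{3} \approx 5.27
\end{align*}
```

Under this assumption, q = 7 lies near the middle of the admissible range for a network with two inputs and one output.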
Table 4. The Input Data of the Randomly Selected Test Scenarios Used in Prediction
The network is then run to obtain the prediction results. In order to evaluate the prediction accuracy, the percentage of prediction error described in Eq. (3) is used, in which E_p is the percentage of error, R represents the actual characteristic parameters of the Gaussian function and P represents the values predicted by the NN. Fig. 8 shows the absolute percentage of prediction error, which reveals that all maximum prediction errors are below 5% and the averaged values are lower than 1.2%. Except for A_4 and y_0, the average error percentage of the parameters is less than 0.5%. This result demonstrates that the BP-NN can achieve good prediction accuracy for the characteristic parameters of the SPD. In order to illustrate the SPD prediction more intuitively, we plot the measured SPD, the Gaussian function fitting model and the SPD predicted by the BP-NN in Fig. 9, with four sets of conditions as examples. It can be seen that the SPDs predicted by the BP-NN coincide with those modeled by the Gaussian function. To distinguish the two curves more clearly, the SPD modeled by the Gaussian function is drawn in bold. Together with the fitting coefficient R² > 0.98 obtained in Section 4.1, this shows that the BP-NN predicts the SPD of the full-spectrum LED package with high accuracy. In order to further illustrate the accuracy of the SPD prediction, we calculate the Root-Mean-Square Error (RMSE) and the chromaticity difference (Δxy) between the predicted SPD and the actual SPD, which are expressed as Eq. (4) and Eq. (5).

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{j=1}^{n}\left(I_m(\lambda_j) - I_e(\lambda_j)\right)^2}{n}} \quad (4)$$
where I_m(λ_j) is the measured SPD, I_e(λ_j) is the predicted SPD, and n is the number of measurements.
$$\Delta xy = \sqrt{(x_m - x_e)^2 + (y_m - y_e)^2} \quad (5)$$
where x_m, y_m and x_e, y_e are the chromaticity coordinates in the CIE 1931 color space calculated from the measured and predicted SPDs, respectively. As shown in Table 5, the averaged RMSE and chromaticity difference are 6.33 × 10⁻⁵ and 0.0021, respectively, which further confirms the high prediction accuracy of the proposed BP-NN method.
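The three accuracy metrics used in this section can be sketched in Python as follows. The RMSE follows Eq. (4). The chromaticity difference is assumed to be the Euclidean distance between the two (x, y) points in the CIE 1931 diagram, and the percentage error is assumed to be |P − R| / |R| × 100%; both forms are assumptions inferred from the symbol definitions. The function names are hypothetical.

```python
import math

def rmse(measured, predicted):
    """Root-mean-square error between two SPDs sampled at the same wavelengths."""
    n = len(measured)
    return math.sqrt(sum((m - e) ** 2 for m, e in zip(measured, predicted)) / n)

def chromaticity_difference(xy_measured, xy_predicted):
    """Euclidean distance between two CIE 1931 (x, y) points (assumed form)."""
    (xm, ym), (xe, ye) = xy_measured, xy_predicted
    return math.hypot(xm - xe, ym - ye)

def percent_error(actual, predicted):
    """Absolute percentage error of one characteristic parameter (assumed form)."""
    return abs(predicted - actual) / abs(actual) * 100.0

# Hypothetical values for illustration only.
spd_rmse = rmse([0.0, 0.0], [3.0, 4.0])
delta_xy = chromaticity_difference((0.300, 0.300), (0.303, 0.304))
peak_error = percent_error(200.0, 198.0)
```

Converting an SPD into chromaticity coordinates additionally requires the CIE 1931 color matching functions, which are omitted here.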

SPD Prediction With GA-BP NN
In order to improve the SPD prediction accuracy, we integrate the GA with the BP-NN to optimize the weights and thresholds of the BP-NN. In the GA-BP NN, the number of iterations of the GA, the population size and the crossover probability are set as 20, 10 and 0.4, respectively. The network structure is the same as in Fig. 7. The prediction results for the characteristic parameters of the Gaussian model with the GA-BP NN are shown in Fig. 10: the maximum error percentage is 3.3% and the maximum average error percentage is 0.8%. Comparing the prediction results in Fig. 8 and Fig. 10, it can be seen that both the average and the maximum prediction error percentages of the GA-BP NN are lower than those of the BP-NN. This is because the GA-BP NN reduces the blind search at the early stage of the BP-NN and finds the optimal weights and thresholds. However, while the prediction error is reduced, the program running time increases because of the 20 additional iterations of the GA optimization. The running time of the BP-NN is 4 s, while that of the GA-BP NN is 23 s.
Vol. 12, No. 1, February 2020
Table 6. The Prediction RMSE and Chromaticity Difference
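The GA stage can be sketched in plain Python as below, using the settings stated above (20 iterations, population size 10, crossover probability 0.4). Everything else is an assumption made for illustration: the real-coded chromosome, tournament selection with elitism, single-point crossover, Gaussian mutation with probability 0.1, and the stand-in loss function. In a GA-BP setting the loss would be the training error of the network built from the chromosome; a network with 2 inputs, 7 hidden neurons and 1 output has 2·7 + 7 + 7 + 1 = 29 weights and thresholds.

```python
import random

def genetic_search(loss, n_genes, rng, generations=20, pop_size=10,
                   p_cross=0.4, p_mut=0.1, bound=1.0):
    """Minimize loss(chromosome) with a simple real-coded GA.

    Returns the best chromosome and the best loss after each generation.
    """
    pop = [[rng.uniform(-bound, bound) for _ in range(n_genes)]
           for _ in range(pop_size)]
    best = min(pop, key=loss)
    history = [loss(best)]
    for _ in range(generations):
        def pick():
            # Tournament selection: the better of two random individuals.
            a, b = rng.sample(pop, 2)
            return a if loss(a) < loss(b) else b
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            child, mate = pick()[:], pick()
            if n_genes > 1 and rng.random() < p_cross:
                cut = rng.randrange(1, n_genes)  # single-point crossover
                child = child[:cut] + mate[cut:]
            for i in range(n_genes):
                if rng.random() < p_mut:  # Gaussian mutation of one gene
                    child[i] += rng.gauss(0.0, 0.1 * bound)
            children.append(child)
        pop = children
        best = min(pop, key=loss)
        history.append(loss(best))
    return best, history

# Stand-in loss (hypothetical): squared distance of every gene from 0.3.
def stand_in_loss(chromosome):
    return sum((g - 0.3) ** 2 for g in chromosome)

best, history = genetic_search(stand_in_loss, n_genes=29, rng=random.Random(1))
```

The chromosome returned by the search would then be unpacked into the initial weights and thresholds of the BP-NN before gradient-descent training starts, and the extra loss evaluations in each generation account for the longer running time.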
In order to illustrate the SPD prediction more intuitively, we also compare the measured SPD, the Gaussian fitting model and the predicted one by the GA-BP NN in Fig. 11 with four sets of conditions as examples.
Similarly, the RMSE and the chromaticity difference are also calculated. As shown in Table 6, the averaged RMSE and Δxy are 6.26 × 10⁻⁵ and 0.0019, respectively. According to the above analysis, the percentage of error, the root mean square error and the chromaticity difference obtained with the GA-BP NN are all smaller than those obtained with the BP-NN. This indicates that the GA optimization improves the prediction to a certain extent.

Robustness Study of the Methods
To evaluate the general applicability of the proposed methods, a robustness study with two additional cases is presented and discussed in this section. Both cases are based on the BP-NN predictions.

Case 1: Predict the Data Outside the Experimental Measurement:
In order to verify that the method can still predict data outside the range of the experimental training dataset, the following test is performed with the same prediction method as in Section 4.2. The training set covers case temperatures from 25°C to 65°C, a total of 279 sets of training data. The trained network is used to predict the SPDs at case temperatures of 70°C and 75°C, a total of 20 sets of prediction data. The specific input data of the training set and the test set are shown in Tables 7 and 8. Fig. 12 shows the absolute percentage of prediction error, which reveals that all maximum prediction errors are below 5%. Table 9 shows the RMSE and Δxy, from which we can see that the average RMSE and Δxy are 5.50 × 10⁻⁵ and 0.0018, respectively. This result shows that the method can still achieve good prediction for data outside the training data interval.

Case 2: Prediction With a Small Amount of Training Data:
In order to show that the method can still achieve prediction when the amount of training data is small, we choose 48 sets of data. Table 10 shows their drive currents and case temperatures; 6 sets are held out as the test set and the rest are used as the training set. The input data of the test set are shown in Table 11. The prediction method is the same as in Section 4.2. Fig. 13 shows the absolute percentage of prediction error, which reveals that all maximum prediction errors are below 6% and the averaged values are lower than 2.2%. Table 12 shows the RMSE and Δxy, from which we can see that the average RMSE and Δxy are 5.50 × 10⁻⁵ and 0.0018, respectively. This result shows that the method can still achieve prediction when the amount of training data is small. It also shows that the prediction accuracy depends on the nature of the LED and on the number of test samples.

Conclusion
This paper proposes an SPD prediction method for a full-spectrum white LED package by combining the SPD decomposition model with an ANN-based machine learning method. The Gaussian-based SPD decomposition model is first used to discretize the continuous SPD of a full-spectrum white LED package into a set of spectral characteristic parameters. Then, with the case temperature and the drive current as inputs, both the BP-NN and the GA-BP NN models are used to estimate the spectral characteristic parameters, and finally a highly accurate SPD prediction is achieved and validated with different case studies. The prediction results show that: (1) the averaged SPD prediction RMSE and chromaticity difference (Δxy) of both the BP-NN and the GA-BP NN are around 6 × 10⁻⁵ and 0.002, respectively; (2) the robustness study also demonstrates the potential of the proposed method for accurate dynamic SPD prediction of the full-spectrum white LED operated under different mission profiles.

Disclosures
The authors declare no conflicts of interest.