Parameter Extraction of Photovoltaic Models Using a Dynamic Self-Adaptive and Mutual-Comparison Teaching-Learning-Based Optimization

Parameter extraction of solar cell models plays an important role in the simulation, evaluation, control, and optimization of photovoltaic (PV) systems. Although many meta-heuristic algorithms have been proposed to solve the parameter extraction problem, the accuracy and reliability of these algorithms still need to be improved. In this paper, an improved teaching-learning-based optimization (TLBO) is proposed, namely dynamic self-adaptive and mutual-comparison teaching-learning-based optimization (DMTLBO). DMTLBO enhances the basic TLBO by improving its teacher phase and learner phase: (i) In the teacher phase, two differentiated and personalized teaching strategies are proposed according to learners' learning status. In these two strategies, an adaptive state transition weight factor ω and a dynamic gap weight factor β are introduced to reflect the dynamic transformation of the learners' learning state in the actual teaching situation. (ii) In the learner phase, a new learning strategy is proposed, in which each learner communicates with and learns from three different learners who are randomly selected and ranked. To verify the performance of DMTLBO, it is used to extract the parameters of different PV models, namely the single diode model, the double diode model, and three PV modules. For these five PV models, the root mean square error values between the measured data and the data calculated by DMTLBO are 9.8602E-04 ± 2.07E-17, 9.8248E-04 ± 1.53E-06, 2.4251E-03 ± 2.15E-17, 1.7298E-03 ± 5.74E-14, and 1.6601E-02 ± 4.55E-10, respectively. Compared with other optimization algorithms, the experimental results show that DMTLBO provides better or highly competitive convergence speed and extraction accuracy. In addition, the influence of the improved teacher phase and learner phase on DMTLBO and the changing process of the two weight factors ω and β are investigated.


I. INTRODUCTION
The overuse of fossil energy has caused an energy crisis, environmental pollution, and climate change; hence, renewable and distributed energy generation has gradually become a popular research topic that has to go hand-in-hand with energy storage research [1], [2]. Among various renewable energy sources, solar energy has the advantages of long-term existence, cleanliness, safety, huge quantity, and convenient utilization [3], [4]. Therefore, the solar photovoltaic (PV) cell is the most promising alternative energy source. The PV cell is an important part of the PV power generation system. It is necessary to establish an accurate model of the PV cell and to extract the model parameters precisely, which is of great significance for the simulation, design, evaluation, control, and optimization of the PV system [5], [6].
The associate editor coordinating the review of this manuscript and approving it for publication was Xujie Li.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In recent years, scholars have conducted a lot of research on the accurate modeling and parameter extraction of PV models. The parameter extraction problem of the PV model can be transformed into an optimization problem [7]. A large amount of measured current and voltage data shows that the search space is nonlinear and multimodal, which brings great challenges to the modeling and analysis of PV cells [8]. At present, two equivalent circuit models, i.e., the single diode model and the double diode model, are usually used to describe the current-voltage (I-V) characteristics of PV cells. The I-V curve is a macro description of the PV cell, while the parameter model reflects the internal characteristics of the PV cell [9], [10]. Therefore, for a PV power generation system composed of multiple series and parallel PV cells, no matter which model is used, it is necessary to accurately extract the model parameters to describe the I-V relationship of the PV system. To solve this problem, different effective methods have been proposed, which can be mainly divided into two categories: mathematical analysis methods and meta-heuristic methods. Mathematical analysis methods use a series of analytical techniques, such as differential derivation and simplified models, to process the I-V characteristic equation so that the parameter values can be estimated [11]-[15]. However, because the I-V characteristic equation of the PV cell model is a complex transcendental nonlinear function, it is impossible to solve for specific parameters directly through simple calculations.
At present, many mathematical analysis methods, such as normalized current density and voltage [16], Taylor series expansion [17], the Levenberg-Marquardt (LM) algorithm [18], the power law J-V model [19], and the multi-dimensional variant of the Newton-Raphson method [20], have been developed to solve this problem. These methods rely heavily on the mathematical derivation of the objective function and on some selected key points. However, they require some assumptions and simplifications, which may lead to inaccurate solutions. Besides, their objective functions must be differentiable, continuous, and convex, and their performance depends on the choice of initial values [21]-[25]. Therefore, the approximate parameter values obtained by these methods have large errors, and the methods are not suitable for applications with high accuracy requirements.
The meta-heuristic method is a derivative-free optimization method. Compared with the mathematical analysis method, the meta-heuristic method places no restrictions on the objective function and has a great advantage in solving multimodal problems. At present, many meta-heuristic methods have been used to extract PV model parameters. Examples include the genetic algorithm (GA) [24], [26], particle swarm optimization (PSO) [27]-[30], adaptive differential evolution (ADE) [31]-[33], ant lion optimizer (ALO) [34], artificial bee colony (ABC) [35]-[37], cuckoo search (CS) [38], hybrid flower pollination (HFP) [39], harmony search (HS) [40], water cycle (WC) [41], and improved whale optimization (IWO) [42]. Although meta-heuristic algorithms can obtain relatively satisfactory results in the parameter extraction of PV models, most of them have difficulty finding accurate global optimal solutions. Therefore, it is necessary to improve these algorithms based on their characteristics in order to design a highly competitive meta-heuristic algorithm for solving the parameter extraction problem of PV models.
As one of the modern meta-heuristic algorithms, the teaching-learning-based optimization (TLBO) algorithm is a simple yet effective optimization algorithm [43]. It is a population-based approach that uses a population of solutions to search for the global solution. The motivation for developing nature-based algorithms is their ability to solve different optimization problems efficiently and effectively [44]. TLBO simulates two important processes that can improve learners' achievements: teachers' classroom output to learners and learners' learning in traditional classes [45], [46]. The algorithm is mainly divided into two phases: (1) teacher phase: learners learn from teachers; (2) learner phase: learners learn from each other. The TLBO algorithm has a simple principle, is easy to understand, has few algorithm parameters, and achieves high optimization accuracy. It is an effective optimization algorithm for solving complex constrained optimization problems and requires relatively little computation [47]. Therefore, TLBO has developed rapidly and has been successfully applied to many practical problems, including PV model parameter extraction. For example, Ref. [48] introduced an elite replacement strategy to TLBO and showed good performance in solving complex nonlinear optimization functions. Ref. [35] combined TLBO and ABC for solar PV parameter extraction problems. In [50], a modified TLBO algorithm based on generalized opposition was proposed to extract the parameters of the PV model.
Although TLBO has the advantages mentioned above, it also suffers from some problems [50]. In the teacher phase, the search ability is poor, which makes it easy to fall into a local optimum prematurely. In the learner phase, the optimization accuracy is low because there is only one single learning object [42], [51]. Given these shortcomings of the TLBO algorithm, this paper proposes a dynamic self-adaptive and mutual-comparison teaching-learning-based optimization algorithm, namely DMTLBO. In the teacher phase, two differentiated and personalized teaching strategies are proposed according to learners' learning status. In these two strategies, dynamic self-adaptive weight factors ω and β are introduced to enable learners to learn selectively from the teacher while maintaining their own learning status, thereby improving the global searching ability. Confucius said: "Among any three people walking, I will find something to learn for sure." Inspired by this, a new learning strategy is introduced in the learner phase. First, three other different learners are randomly chosen and ranked according to their objective function values. Then, the learner updates its state by communicating with and learning from these three ranked learners, which increases the diversity of the population. To verify the performance of DMTLBO, it is applied to the parameter extraction of the single diode model, the double diode model, and three different PV modules. The simulation results show that, compared with other algorithms, the proposed algorithm can improve the convergence speed and optimization accuracy. The main contributions of this paper are:
• Two improved teaching strategies are proposed by introducing an adaptive state transition weight factor ω and a dynamic gap weight factor β. Learners can select one of these two teaching strategies to update their state according to their learning levels.
• A new learning strategy is proposed. In this strategy, each learner can learn from three other different learners who are randomly selected and ranked, which can improve the algorithm's population diversity and accuracy.
• The influence of these two optimized phases on DMTLBO is investigated. Results show that the absence of either will deteriorate the performance of DMTLBO to a large extent.
• The changing process of the adaptive state transition weight factor ω and the dynamic gap weight factor β is also analyzed.
• Five PV models are employed to verify the performance of DMTLBO. Compared with other optimization algorithms, the DMTLBO algorithm is highly competitive.
The rest of the paper is structured as follows. Section II describes the PV models and the objective functions of the parameter extraction problem. The basic TLBO algorithm is introduced briefly in Section III. Section IV describes in detail the proposed DMTLBO and its application in PV model parameter extraction. Section V presents the analysis and comparison of simulation results. Finally, Section VI concludes the paper and outlines future work.

II. ESTABLISHMENT OF PV MODELS
Single diode and double diode models are currently the most widely used equivalent circuit models. This paper introduces the commonly used single diode five-parameter model and double diode seven-parameter model, as well as the PV module model. Additionally, the objective function used for these models is defined.

A. DOUBLE DIODE MODEL
A PV cell can ideally be viewed as a current source in parallel with a diode. Because of semiconductor impurities and imperfections, a resistor is added in parallel with the current source. In addition, the resistance of the PV cell metal contacts and of the semiconductor bulk material is represented by a series resistance [51], [53]. The equivalent circuit of the double diode model is shown in Figure 1, and the I-V relationship is described as follows [54]-[56]:
$$I = I_{ph} - I_{sd1}\left[\exp\left(\frac{V + I R_s}{n_1 V_t}\right) - 1\right] - I_{sd2}\left[\exp\left(\frac{V + I R_s}{n_2 V_t}\right) - 1\right] - \frac{V + I R_s}{R_{sh}} \tag{1}$$
where $I$ is the cell output current, $V$ is the cell output voltage, $I_{ph}$ is the total current generated by the PV cell, $I_{sd1}$ and $I_{sd2}$ are the currents of the first and second diodes, $n_1$ and $n_2$ are the diode ideality factors, $R_s$ is the series resistance, $R_{sh}$ is the shunt resistance, and $V_t$ is the junction thermal voltage, defined as:
$$V_t = \frac{kT}{q} \tag{2}$$
where $k = 1.3806503 \times 10^{-23}$ J/K is the Boltzmann constant, $q = 1.60217646 \times 10^{-19}$ C is the electron charge, and $T$ is the junction temperature in kelvin. From (1), seven unknown parameters need to be extracted: $I_{ph}$, $I_{sd1}$, $I_{sd2}$, $R_s$, $R_{sh}$, $n_1$, and $n_2$.
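The double diode relationship can be checked numerically with a short sketch. The Python code below is illustrative only: the function names, the parameter values, and the temperature of 300 K are assumptions introduced here, not values from this paper. It computes the thermal voltage from the constants quoted above and evaluates the mismatch between the two sides of (1) at a given point.

```python
import math

K_BOLTZMANN = 1.3806503e-23   # J/K, value quoted in the text
Q_ELECTRON = 1.60217646e-19   # C, value quoted in the text


def thermal_voltage(temp_kelvin):
    """Junction thermal voltage V_t = k*T/q."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON


def double_diode_residual(v, i, params, temp_kelvin):
    """Right-hand side of the double diode equation minus the current I.

    params = (I_ph, I_sd1, I_sd2, R_s, R_sh, n1, n2).
    The residual is zero when (v, i) lies exactly on the model curve.
    """
    i_ph, i_sd1, i_sd2, r_s, r_sh, n1, n2 = params
    v_t = thermal_voltage(temp_kelvin)
    v_j = v + i * r_s  # voltage across the diodes and the shunt branch
    diode1 = i_sd1 * (math.exp(v_j / (n1 * v_t)) - 1.0)
    diode2 = i_sd2 * (math.exp(v_j / (n2 * v_t)) - 1.0)
    return i_ph - diode1 - diode2 - v_j / r_sh - i


# Hypothetical parameter values, chosen only to exercise the functions.
T_EXAMPLE = 300.0  # K, illustrative temperature (an assumption)
example_params = (0.76, 2.0e-7, 5.0e-7, 0.0, 55.0, 1.5, 2.0)

# With R_s = 0 the model is explicit in I, so a point on the curve is easy to build.
v0 = 0.3
vt0 = thermal_voltage(T_EXAMPLE)
i0 = (0.76
      - 2.0e-7 * (math.exp(v0 / (1.5 * vt0)) - 1.0)
      - 5.0e-7 * (math.exp(v0 / (2.0 * vt0)) - 1.0)
      - v0 / 55.0)

print(thermal_voltage(T_EXAMPLE))
print(double_diode_residual(v0, i0, example_params, T_EXAMPLE))
```

Because $I$ appears on both sides of (1), the residual form avoids solving the implicit equation: a measured pair $(V, I)$ is simply substituted into the function.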

Algorithm 1 The Pseudo-Code of Basic TLBO
Input: Control parameters
Output: The optimal solution
Set the iteration counter t = 1;
Initialize the population X randomly;
while the iteration stop condition is not satisfied do
    Calculate $X_{mean}$;
    Select the best learner as $X_{teacher}$;
    // Teacher phase
    …
    Select a random learner $X_j$ ($j \neq i$);
    Generate $X_{i.new}$ with Eq. (14);
    …

B. SINGLE DIODE MODEL
Based on the double diode model, the diffusion current and the recombination current are usually combined, and a single diode ideality factor $n$ is introduced. The equivalent circuit of the single diode model is shown in Figure 2, and the I-V relationship is described as follows [56]-[59]:
$$I = I_{ph} - I_{sd}\left[\exp\left(\frac{V + I R_s}{n V_t}\right) - 1\right] - \frac{V + I R_s}{R_{sh}} \tag{3}$$
From (3), five unknown parameters need to be extracted: $I_{ph}$, $I_{sd}$, $R_s$, $R_{sh}$, and $n$.

Algorithm 2 The Pseudo-Code of DMTLBO
Input: Control parameters: $N_p$, Max_NFE
Output: The optimal solution
Set NFE = 0;
Initialize the population X randomly;
Evaluate the objective function value of each learner;
NFE = NFE + $N_p$;
while NFE < Max_NFE do
    Select the best learner as the teacher $X_{teacher}$;
    Calculate $X_{mean}$ and evaluate its objective function value;
    …
    // Teacher phase
    for i = 1 to $N_p$ do
        Generate ω with Eq. (16);
        Generate β with Eq. (17);
        …
        Select random learners $X_{r1}$, $X_{r2}$, and $X_{r3}$ ($r_1 \neq r_2 \neq r_3$);
        Sort the three learners by fitness value in ascending order;
        Generate µ with Eq. (19);
        Generate $X_{i.new}$ with Eq. (18);
        …

C. PV MODULE
The PV module is a combination of several PV cells connected in series or in parallel. Its equivalent circuit is shown in Figure 3, and the output current of the PV module is given by (4) [58]-[61], where $N_s$ and $N_t$ denote the numbers of PV cells connected in series and in parallel, respectively. As with the single diode model, the PV module has five unknown parameters to be extracted: $I_{ph}$, $I_{sd}$, $R_s$, $R_{sh}$, and $n$.

D. OBJECTIVE FUNCTION
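In this work, the parameter extraction problem of the PV model is transformed into an optimization problem. To solve for the unknown parameters, the above formulas are rewritten in homogeneous form by moving the current $I$ to the right-hand side.
For the single diode model:
$$f(V, I, x) = I_{ph} - I_{sd}\left[\exp\left(\frac{V + I R_s}{n V_t}\right) - 1\right] - \frac{V + I R_s}{R_{sh}} - I, \quad x = \{I_{ph}, I_{sd}, R_s, R_{sh}, n\}$$
For the double diode model:
$$f(V, I, x) = I_{ph} - I_{sd1}\left[\exp\left(\frac{V + I R_s}{n_1 V_t}\right) - 1\right] - I_{sd2}\left[\exp\left(\frac{V + I R_s}{n_2 V_t}\right) - 1\right] - \frac{V + I R_s}{R_{sh}} - I, \quad x = \{I_{ph}, I_{sd1}, I_{sd2}, R_s, R_{sh}, n_1, n_2\}$$
For the PV module, the module-level current equation is rearranged in the same way, with $x = \{I_{ph}, I_{sd}, R_s, R_{sh}, n\}$.
The root mean square error (RMSE) is used as the objective function [62]:
$$RMSE(x) = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(I_{calculated,k} - I_{measured,k}\right)^2} \tag{11}$$
where $I_{calculated}$ is the current calculated by the model from $V_{measured}$, $V_{measured}$ and $I_{measured}$ are the measured data, $N$ denotes the total number of measured current data, and $x$ is a vector containing the parameters to be extracted. From (11), the smaller the RMSE value, the more accurate the extracted parameters.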
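The objective function for the single diode model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the measured current is substituted into the right-hand side of the model so that no implicit equation has to be solved, and the data are synthetic values generated from hypothetical parameters rather than measurements. The function names are introduced here for illustration.

```python
import math

K_BOLTZMANN = 1.3806503e-23  # J/K
Q_ELECTRON = 1.60217646e-19  # C


def single_diode_residual(v, i, params, v_t):
    """Homogeneous form of the single diode equation, f(V, I, x)."""
    i_ph, i_sd, r_s, r_sh, n = params
    v_j = v + i * r_s
    return i_ph - i_sd * (math.exp(v_j / (n * v_t)) - 1.0) - v_j / r_sh - i


def rmse(params, voltages, currents, v_t):
    """Root mean square of the residuals over all (V, I) pairs."""
    total = 0.0
    for v, i in zip(voltages, currents):
        total += single_diode_residual(v, i, params, v_t) ** 2
    return math.sqrt(total / len(voltages))


# Synthetic data (not measurements): R_s = 0 makes the model explicit in I.
v_t = K_BOLTZMANN * 300.0 / Q_ELECTRON  # illustrative temperature of 300 K
true_params = (0.76, 3.0e-7, 0.0, 50.0, 1.5)  # hypothetical values
voltages = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
currents = [
    0.76 - 3.0e-7 * (math.exp(v / (1.5 * v_t)) - 1.0) - v / 50.0
    for v in voltages
]

rmse_true = rmse(true_params, voltages, currents, v_t)
rmse_off = rmse((0.75, 3.0e-7, 0.0, 50.0, 1.5), voltages, currents, v_t)
print(rmse_true, rmse_off)
```

With the generating parameters the RMSE is zero up to rounding, while shifting $I_{ph}$ by 0.01 A shifts every residual by the same amount and raises the RMSE accordingly. An optimizer such as TLBO only needs this scalar function and the search range of each parameter.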

III. THE TEACHING-LEARNING-BASED OPTIMIZATION (TLBO)
TLBO is a population-based meta-heuristic optimization algorithm proposed by Rao and Kalyanker to simulate the interactive process between teachers and learners in traditional teaching classrooms [42]. In TLBO, all learners are regarded as a population, the population size is $N_P$, and the number of subjects of each learner is the problem dimension. The fitness function value $f(X_i)$ of learner $X_i$ is his/her academic achievement. Learners improve their academic performance through two learning processes in the classroom. TLBO divides these two processes into a teacher phase and a learner phase. In the teacher phase, the learner with the best academic performance is selected as the teacher, who teaches the other learners in the classroom to improve the overall level of the class. In the learner phase, learners communicate with and learn from each other to improve themselves. The main pseudo-code of the basic TLBO is illustrated in Algorithm 1.

A. TEACHER PHASE
In the teacher phase, the best learner of the class becomes the teacher $X_{teacher}$, and the other learners improve the overall level of the class by learning from the teacher. For each learner, the updating process is expressed as follows:
$$X_{i.new} = X_i + rand \cdot \left(X_{teacher} - T_F \cdot X_{mean}\right)$$
where $X_{i.new}$ is the updated state of the learner $X_i$; $X_{mean}$ is the mean level of the class; $T_F = \mathrm{round}(1 + rand(0, 1))$ is the teaching factor that decides the degree of change in the mean level; and $rand$ is a random number within the range [0, 1]. The objective function values of $X_{i.new}$ and $X_i$ are then compared: if $X_{i.new}$ is better than $X_i$, $X_i$ is replaced with $X_{i.new}$; otherwise, $X_i$ is left unchanged.
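The teacher phase can be sketched as follows, assuming the standard TLBO update $X_{i.new} = X_i + rand \cdot (X_{teacher} - T_F \cdot X_{mean})$ with a fresh random number per dimension and a minimization problem. The `sphere` objective and all names below are hypothetical stand-ins introduced for illustration; in the PV setting the objective would be the RMSE.

```python
import random


def sphere(x):
    """Hypothetical test objective to be minimized (not a PV model)."""
    return sum(value * value for value in x)


def teacher_phase(population, objective, rng):
    """One pass of the basic TLBO teacher phase with greedy replacement."""
    size = len(population)
    dim = len(population[0])
    fitness = [objective(x) for x in population]
    # The best learner acts as the teacher; the class mean is taken per dimension.
    teacher = population[fitness.index(min(fitness))][:]
    mean = [sum(x[d] for x in population) / size for d in range(dim)]
    for i in range(size):
        t_f = round(1 + rng.random())  # teaching factor, either 1 or 2
        candidate = [
            population[i][d] + rng.random() * (teacher[d] - t_f * mean[d])
            for d in range(dim)
        ]
        cand_fit = objective(candidate)
        if cand_fit < fitness[i]:  # keep the new learner only if it is better
            population[i] = candidate
            fitness[i] = cand_fit
    return population, fitness


rng = random.Random(0)
population = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(10)]
before = [sphere(x) for x in population]
population, fitness = teacher_phase(population, sphere, rng)
print(min(before), min(fitness))
```

The greedy replacement guarantees that no learner becomes worse in a pass, which is why the best objective value of the class never increases from one iteration to the next.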

B. LEARNER PHASE
In the learner phase, a learner randomly selects another learner and improves his/her level through group discussions and formal communications. For each learner, the learning process is expressed as follows:
$$X_{i.new} = \begin{cases} X_i + rand \cdot (X_i - X_j), & \text{if } f(X_i) < f(X_j) \\ X_i + rand \cdot (X_j - X_i), & \text{otherwise} \end{cases} \tag{14}$$
where $X_j$ is different from $X_i$, and $f(X_i)$ and $f(X_j)$ are the objective function values of $X_i$ and $X_j$. If $X_{i.new}$ is better than $X_i$, $X_i$ is replaced with $X_{i.new}$; otherwise, $X_i$ is left unchanged.
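A minimal sketch of the learner phase is given below, assuming the standard TLBO rule in which learner $X_i$ moves away from a worse partner $X_j$ and towards a better one. As before, the `sphere` objective, the per-dimension random numbers, and the function names are illustrative assumptions.

```python
import random


def sphere(x):
    """Hypothetical test objective to be minimized (not a PV model)."""
    return sum(value * value for value in x)


def learner_phase(population, fitness, objective, rng):
    """One pass of the basic TLBO learner phase with greedy replacement."""
    size = len(population)
    dim = len(population[0])
    for i in range(size):
        # Pick a partner X_j that is different from X_i.
        j = rng.choice([k for k in range(size) if k != i])
        if fitness[i] < fitness[j]:
            better, worse = population[i], population[j]
        else:
            better, worse = population[j], population[i]
        # Step from X_i along the direction pointing from the worse to the better learner.
        candidate = [
            population[i][d] + rng.random() * (better[d] - worse[d])
            for d in range(dim)
        ]
        cand_fit = objective(candidate)
        if cand_fit < fitness[i]:
            population[i] = candidate
            fitness[i] = cand_fit
    return population, fitness


rng = random.Random(2)
population = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(10)]
fitness = [sphere(x) for x in population]
before = list(fitness)
population, fitness = learner_phase(population, fitness, sphere, rng)
print(min(before), min(fitness))
```

Each learner interacts with exactly one partner per pass, which is the "single learning object" limitation that motivates the modified learner phase discussed later in the paper.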

IV. OUR APPROACH: DYNAMIC SELF-ADAPTIVE AND MUTUAL-COMPARISON TEACHING-LEARNING-BASED OPTIMIZATION (DMTLBO)
Compared with other similar meta-heuristic algorithms, TLBO shows better efficiency in finding the optimal solution [43], [65], [66]. However, it also has some drawbacks. For example, in the teacher phase, the teacher teaches all learners according to the mean level of the class and ignores the differences among learners in the learning process. This causes premature convergence of the population and inhibits learners' further improvement. In the learner phase, although learners can improve their knowledge level by communicating with and learning from others, the choice of only one single learning object restricts the interaction and the acquisition of effective information. In the actual teaching-learning situation, each learner has different learning and knowledge acceptance abilities at different periods. Developing dynamically personalized learning for different learners plays an extremely important role in improving their knowledge level. Therefore, it is necessary to improve TLBO according to the actual situation and to establish a better teaching-learning method that improves the searching ability.

A. DYNAMIC SELF-ADAPTIVE TEACHER PHASE
In the actual teaching-learning situation, the learners mainly learn from teachers at the beginning to improve the mean level of the class. As their learning ability improves, they can selectively absorb the knowledge taught by teachers while maintaining their own learning status. In this step, the mean is taken as the boundary condition. When the level of a learner is lower than the mean, the updating process is the same as the teacher phase of the original TLBO. When the level is better than the mean, the learner's update mainly depends on two parts: the previous learner $X_i$ and the difference between $X_i$ and $X_{teacher}$. Therefore, two weight factors that decide the degree of influence of these two parts are introduced into the update rule (15) to reflect the realistic teacher phase and to improve teaching and learning efficiency. In (15), ω is the adaptive state transition weight factor that enables the learner to maintain an adaptive learning state transition, and β is the dynamic gap weight factor that controls the speed at which learners approach the teacher.

If $f(X_i) < f(X_{mean})$, the learner has a strong learning ability and can learn selectively according to the gap between himself/herself and the teacher. As the learner gets closer to the teacher, the value of β needs to be decreased to slow down the approach. At the same time, the value of ω ought to be increased to enhance the learner's ability to maintain his/her own learning state. Therefore, the change of these two factors is related to the fitness function values of $X_i$ and $X_{teacher}$, the current number of iterations $t$, and $N_P$. To further improve the teaching and learning efficiency, the adaptive state transition weight factor and the dynamic gap weight factor are defined as functions of the fitness in the teacher phase, as given in (16) and (17), respectively.

B. MUTUAL-COMPARISON LEARNER PHASE
To solve the problems of poor searching ability and a single learning object in the learner phase of the basic TLBO, a new learning strategy is proposed. First, the learner randomly chooses three other different learners, denoted as $X_{r1}$, $X_{r2}$, and $X_{r3}$. Then, the three learners are ranked in ascending order of their objective function values to obtain $X_A$, $X_B$, and $X_C$, where $X_A$ is the best one and $X_C$ is the worst one. Finally, they communicate with and learn from each other. The mutual-comparison learning process is given by (18), where µ is a dynamic scaling factor generated by (19). It can be seen from (18) that the function of µ is to determine the proportion of learning state between the learner and $X_A$. When the objective function value of $X_A$ is smaller, $X_A$ has a better learning level, and the learner learns more from $X_A$. Otherwise, the learner maintains a greater proportion of his/her own learning status. Although $X_B$ and $X_C$ are relatively poor, the increase in the number of learning objects enhances the population diversity of the algorithm. In addition, they also move towards $X_A$, which ensures a common direction for the learners' mutual learning.
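The selection and ranking step of this strategy can be sketched independently of the update rule itself. The Python snippet below only illustrates how three distinct learners other than the current one may be drawn and ordered for a minimization problem; the function name and the fitness values are hypothetical, and the update with µ is deliberately not reproduced here.

```python
import random


def pick_and_rank_three(i, fitness, rng):
    """Return indices (a, b, c) of three distinct learners other than i,
    ordered from best (smallest objective value) to worst."""
    candidates = [k for k in range(len(fitness)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)  # sampling without replacement
    a, b, c = sorted((r1, r2, r3), key=lambda k: fitness[k])
    return a, b, c


rng = random.Random(1)
fitness = [0.9, 0.2, 0.5, 0.7, 0.1, 0.4]  # hypothetical objective values
a, b, c = pick_and_rank_three(0, fitness, rng)
print(a, b, c)
```

Sampling without replacement from the indices other than `i` guarantees that the learner never compares against itself and that $X_A$, $X_B$, and $X_C$ are three different individuals, which requires a population of at least four learners.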

C. MAIN PROCEDURE OF DMTLBO
The main pseudo-code of DMTLBO is illustrated in Algorithm 2, where NFE is the number of objective function evaluations and Max_NFE is the maximum NFE. Figure 4 is the flow chart of DMTLBO. Compared with the basic TLBO, DMTLBO has the following advantages: 1) It has the same structure as TLBO. DMTLBO is still divided into a teacher phase and a learner phase, which makes the algorithm relatively simple and easy to implement. 2) The time complexity of DMTLBO is $O(N_P \cdot t)$, and the space … [54].
For the sake of fair comparison, the search range of each parameter is given in Table 1 [71]-[74]. In addition, DMTLBO is compared with eight advanced meta-heuristic algorithms: RTLBO [75], SATLBO [76], GOTLBO [50], ITLBO [51], LETLBO [77], TLABC [35], SHADE [78], and IJAYA [79]. The parameter settings of all algorithms are as follows: $N_p$ is 50 and Max_NFE is 50,000. These algorithms are implemented in MATLAB 2016b, and each algorithm is run independently 30 times. All the comparison experiments are conducted on a PC with an Intel(R) Core(TM) i7-7700HQ CPU @ 2.80 GHz and 8 GB RAM, under the Windows 10 64-bit OS.
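To relate the evaluation budget to the number of iterations, the following sketch counts how many passes of the main loop fit into Max_NFE. It assumes, as a reading of Algorithm 2 rather than a confirmed detail, that each iteration spends one evaluation on $X_{mean}$ and one evaluation per learner in each of the two phases, and that the budget is checked only at the top of the loop.

```python
def iterations_within_budget(n_p, max_nfe):
    """Count full iterations of a loop that runs while NFE < max_nfe."""
    nfe = n_p                     # initial evaluation of the whole population
    per_iteration = 1 + 2 * n_p   # X_mean + teacher phase + learner phase
    iterations = 0
    while nfe < max_nfe:
        nfe += per_iteration
        iterations += 1
    return iterations, nfe


iters, final_nfe = iterations_within_budget(50, 50000)
print(iters, final_nfe)
```

Under these assumptions, the settings $N_p = 50$ and Max_NFE = 50,000 correspond to roughly five hundred iterations per run, which gives a sense of scale for the horizontal axis of convergence curves.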

A. RESULTS ON THE SINGLE DIODE MODEL
The results of the single diode model are shown in Table 2. The five parameters extracted by DMTLBO are substituted into the objective function to obtain the calculated current data, which are compared with the measured data. The individual absolute error (IAE = $|I_{measured} - I_{calculated}|$) and the reconstructed I-V characteristic curve are shown in Table 3 and Figure 5, respectively. Comparing the IAE values of DMTLBO with those of three other improved TLBO algorithms shows that the data calculated by DMTLBO are highly consistent with the measured data, indicating that the algorithm has good accuracy for the parameter extraction of the single diode model.
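The IAE is straightforward to compute once the calculated currents are available. The snippet below is a small illustration with hypothetical current values (they are not taken from Table 3), and it also shows that the RMSE of the differences is the root mean square of the individual errors.

```python
import math


def individual_absolute_errors(measured, calculated):
    """IAE for each data point: |I_measured - I_calculated|."""
    return [abs(m - c) for m, c in zip(measured, calculated)]


def rmse_from_iae(iae_values):
    """Root mean square of the individual absolute errors."""
    return math.sqrt(sum(e * e for e in iae_values) / len(iae_values))


# Hypothetical currents in amperes, used only to exercise the functions.
measured = [0.7640, 0.7620, 0.7605]
calculated = [0.7641, 0.7627, 0.7600]

iae = individual_absolute_errors(measured, calculated)
print(iae)
print(rmse_from_iae(iae))
```

Reporting the IAE per data point complements the RMSE: a single RMSE value can hide a few poorly fitted points, whereas the IAE column makes them visible.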

B. RESULTS ON THE DOUBLE DIODE MODEL
The proposed DMTLBO is applied to the seven-parameter extraction of the double diode model and compared with the above algorithms. Table 4 shows the parameters extracted by these algorithms. The IAE results and the I-V characteristic curve are shown in Table 5 and Figure 6, respectively. It can be seen that although the number of extracted parameters is increased, the data calculated by DMTLBO are highly consistent with the measured data. This shows that the algorithm also has good accuracy for the parameter extraction of the double diode model.

C. RESULTS ON THE PV MODULES
Tables 6, 7, and 8 present the parameters extracted by the compared algorithms for the Photowatt-PW201 module, the STM6-40/36 module, and the STP6-120/36 module, respectively. The calculated current and the IAE results are given in Tables 9, 10, and 11, and the corresponding I-V characteristic curves are shown in Figures 7, 8, and 9. The results demonstrate again that the data calculated by the proposed DMTLBO for these PV modules fit the measured data closely.

D. COMPARISON OF OBJECTIVE FUNCTION VALUES
To verify the accuracy of each algorithm in the parameter extraction of PV models, this subsection compares the RMSE values of the different algorithms for the different PV models. Because these algorithms are stochastic, the best, worst, mean, and standard deviation of the RMSE values are calculated to evaluate the overall performance of DMTLBO. The statistical results are shown in Table 12, with boldface emphasizing the optimal values. From Table 12, it can be observed that:
• In the single diode model, we provide the reported results of some state-of-the-art approaches for comparison, including RSS [80], RF [81], TSLLS [82], Li et al. [83], and Tong and Pora [72]. Although DMTLBO's minimum RMSE value (9.8602E-04) is worse than the value of RSS, RF, and TSLLS (7.7301E-04), it is better than or comparable to that of the other algorithms, which shows that DMTLBO is strongly competitive. Moreover, DMTLBO, ITLBO, and SHADE are also better than the other algorithms in terms of the worst value and the mean value. It is worth mentioning that the standard deviation obtained by DMTLBO is 2.07E-17, which is significantly smaller than that of the other algorithms. This shows that the algorithm achieves the same RMSE value in almost every run.
• In the double diode model, DMTLBO, ITLBO, and SHADE can get the minimum RMSE value (9.8248E-04). However, DMTLBO is better than the … PV models, and performs better than or comparable to the other compared algorithms.
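The four statistics reported for each algorithm can be computed from the final RMSE of each independent run as sketched below. The run values are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption, since the text does not state which one is used.

```python
import statistics


def summarize_runs(rmse_values):
    """Best, worst, mean, and standard deviation of per-run RMSE values."""
    return {
        "best": min(rmse_values),
        "worst": max(rmse_values),
        "mean": statistics.mean(rmse_values),
        "std": statistics.pstdev(rmse_values),  # assumption: population form
    }


# Hypothetical final RMSE values from three runs (the paper uses 30 runs).
runs = [1.0e-3, 1.2e-3, 0.9e-3]
summary = summarize_runs(runs)
print(summary)

# An algorithm that reaches the same value in every run has zero spread.
stable = summarize_runs([9.0e-4, 9.0e-4, 9.0e-4])
print(stable["std"])
```

A standard deviation close to zero, as in the second call, is what the text describes when it says that an algorithm achieves the same RMSE value in almost every run.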

E. VERIFICATION OF CONVERGENCE PERFORMANCE
In this subsection, the convergence curves of DMTLBO, RTLBO, GOTLBO, SATLBO, and ITLBO for the five PV models are drawn for comparison to verify the convergence of DMTLBO. As shown in Figure 10, DMTLBO and ITLBO both have a relatively fast convergence rate and finally reach the minimum RMSE value in the single diode model. In the double diode model, ITLBO has the fastest convergence rate, but DMTLBO and RTLBO also converge to the optimal RMSE value in the late iterations. In the Photowatt-PW201 module, the convergence performance of DMTLBO, RTLBO, and ITLBO is better than that of GOTLBO and SATLBO. In the STM6-40/36 module and the STP6-120/36 module, although the convergence speed of DMTLBO is slower than that of RTLBO, GOTLBO, and ITLBO at the early stage, DMTLBO converges faster as the iterations go on and surpasses all other algorithms. Overall, DMTLBO has a faster convergence rate and obtains the minimum RMSE value in the five PV models, especially for the single and double diode models. This indicates that DMTLBO can escape local optima and find the global or near-global optimal value.

F. THE BENEFIT OF DMTLBO COMPONENTS
In this paper, we improve both phases of TLBO through the dynamic self-adaptive teacher phase and the mutual-comparison learner phase. To further discuss the superior performance of DMTLBO, the effects of the two improved phases on DMTLBO are evaluated using two variants. One is DTLBO, which is made up of the learner phase of TLBO and the dynamic self-adaptive teacher phase; the other is MTLBO, which is made up of the teacher phase of TLBO and the mutual-comparison learner phase. Table 13 presents the RMSE values of DMTLBO, DTLBO, and MTLBO for the different PV models. MTLBO is consistently better than DTLBO in these five PV models. Although MTLBO obtains values comparable to those of DMTLBO in the single diode model, the Photowatt-PW201 module, and the STP6-120/36 module, it is worse in the double diode model and the STM6-40/36 module. It can be concluded that the mutual-comparison learner phase contributes more to DMTLBO than the dynamic self-adaptive teacher phase. However, the absence of the dynamic self-adaptive teacher phase still deteriorates the performance of DMTLBO to a large extent.

G. CHANGING CURVES OF WEIGHT FACTORS ω AND β
In this subsection, we further verify the performance of DMTLBO by observing the changing curves of the dynamic self-adaptive weight factors ω and β. At the beginning of the iteration, the learner's learning level is low, and it is necessary to approach the teacher quickly to conduct a global search more effectively. As the iteration progresses, the learner's learning level improves, and it becomes necessary to slow down the approach to the teacher and to give more weight to maintaining his/her own learning state, which improves the ability of local search. In (15), ω and β decide the degree of influence of the two parts, so they need to change in opposite directions. Figure 11 shows that, as the iteration increases, β gradually decreases from a maximum value close to 1 to a minimum value close to 0. The change in the value of β controls the speed at which the learner approaches the teacher. In contrast, ω increases gradually from a minimum value close to 0.5 to the maximum value of 1 and then remains stable. This shows that the value of ω is small in the early stage, so the learner learns more from the teacher; when ω increases to 1, the learner is excellent enough to fully maintain his/her own learning state.
After these two factors are introduced in the improved teacher phase, the update of the learner's learning status is more in line with the actual situation. It also improves the abilities of local search and global search and achieves an efficient balance between exploration and exploitation.
… DMTLBO can avoid prematurely falling into the local optimum and find the global optimum more effectively.
(iv) The analysis of the influence of the improved teacher phase and learner phase on DMTLBO indicates that the performance of DMTLBO is enhanced by these two phases. Although the mutual-comparison learner phase contributes more, the absence of the dynamic self-adaptive teacher phase deteriorates the performance of DMTLBO to a large extent.
(v) The changing curves of the dynamic self-adaptive weight factors ω and β show that the introduction of these factors makes the update of the learner's learning status more consistent with the actual teaching-learning situation and achieves a highly efficient balance between exploration and exploitation.

VI. CONCLUSION AND FUTURE WORK
Parameter extraction is critical to the establishment of photovoltaic models. In this paper, a dynamic self-adaptive and mutual-comparison teaching-learning-based optimization (DMTLBO) is proposed. According to the actual teaching-learning situation, the basic TLBO is improved by introducing new teaching and learning strategies. In the teacher phase, two differentiated and personalized teaching strategies are proposed according to learners' learning status, and two dynamic self-adaptive weight factors are introduced to control the dynamic transformation of the learners' new learning state. In the learner phase, the learner communicates with and learns from three different learners who are randomly selected and ranked, which improves the population diversity and learning efficiency. Finally, DMTLBO is applied to the parameter extraction of the single diode model, the double diode model, and three PV modules. The experimental results demonstrate that this algorithm is more accurate and reliable for the parameter extraction of PV models than other optimization algorithms, and that it achieves an overall faster convergence speed. The mutual-comparison learner phase and the dynamic self-adaptive teacher phase cooperate well in improving the performance of DMTLBO. In addition, the introduced dynamic self-adaptive weight factors ω and β make DMTLBO more practical and balance exploration and exploitation effectively.
In future work, we will test the performance of DMTLBO on benchmark functions and constrained problems, and try to further improve its performance on the parameter extraction of more PV models. In addition, since the convergence speed of the algorithm at the early stage of evolution is relatively slow, we will improve its convergence by adding effective local search operators. Finally, we believe that DMTLBO is a highly competitive algorithm and that other researchers can further enhance its performance and apply it to more optimization problems.