Production Wastage Avoidance Using Modified Multi-Objective Teaching Learning Based Optimization Embedded With Refined Learning Scheme

Teaching learning-based optimization (TLBO) is a popular algorithm used to solve various optimization problems. Nevertheless, conventional TLBO and some of its improved variants tend to suffer from premature convergence due to rapid loss of population diversity, especially when handling challenging optimization problems. Furthermore, it is not practical to tackle real-world multiobjective problems using an a priori approach given the frequent changes in customers' requirements. Motivated by these challenges, an improved variant known as Modified Multi-objective Teaching Learning Based Optimization-Refined Learning Scheme (MMTLBO-RLS) is proposed as a posterior approach to solve challenging multiobjective optimization problems, including the prediction of optimum turning parameters for machining polyether ether ketone (PEEK) material. Substantial modifications are introduced in the teacher and learner phases of MMTLBO-RLS to achieve a better balance between exploration and exploitation searches without incurring excessive computational cost. In the modified teacher phase of MMTLBO-RLS, each learner is guided by a unique teacher solution and a unique mean position to search with better diversity. Meanwhile, two new learning strategies are incorporated into the modified learner phase of MMTLBO-RLS, enabling all learners to enhance their knowledge more efficiently based on their learning preferences. A systematic approach is followed to develop the modelling equations required for optimization. The developed algorithm is then employed in single objective as well as multiobjective optimization so that it can cater for different real-world environments. The prediction model reports that a surface roughness of $1.1042~\mu m$ and a material removal rate of $22.8991~cm^3$/minute can be achieved. The predicted results differ from the validation results by less than 2.69% in every optimization case.
MMTLBO-RLS was further benchmarked against seven other meta-heuristic algorithms on the CEC 2009 multiobjective benchmark functions. The superior performance of MMTLBO-RLS indicates that it is not only suitable for industrial use in producing PEEK parts with satisfactory quality and quantity, but is also able to solve other multiobjective optimization problems with competitive performance.


I. INTRODUCTION
Polyether ether ketone (PEEK) is a biomaterial that has superior mechanical properties and high temperature durability.
The associate editor coordinating the review of this manuscript and approving it for publication was Md. Abdur Razzaque.
The ultimate tensile strength of this thermoplastic material is in the range of 90 to 100 MPa, its modulus of elasticity is about 3.6 GPa and its glass transition temperature is about 143 °C to 250 °C. It is preferred in many industrial applications, including the manufacturing of valves, bearings, pistons and seals, as well as biomedical applications. Implants and bone plates made of PEEK are viable alternatives to stainless steel and titanium alloys because they avoid stress shielding. Stress shielding is a major issue in bone plating and is caused by the difference in elastic modulus between the fractured bone and the implant. When the elastic modulus of the implant matches that of the fractured bone, higher stress transfer occurs between them and hence re-fracture of the bone is avoided. Orthopedic implants, bone plates and medical instruments are manufactured by casting, forging, sintering, machining and, more recently, additive manufacturing. These parts require machining operations such as turning, drilling and grinding. The geometries of joint implants, surgical instruments, molds and forging dies vary in shape and are complex as well. Although dimensional accuracy is easy to achieve, surface finish is challenging to ensure. Post-machining with a belt grinder or polishing machine is time consuming and increases the lead time. Achieving the best surface finish inherently during machining is therefore better than applying any post-machining process. Improved surface quality can be achieved by combining engineered optimum cutting parameters with appropriate tool path strategies.
A popular modelling technique known as regression can be leveraged to formulate nonlinear functions that accurately describe the output responses of given processes in terms of their input parameters. The optimal combinations of these input process parameters can then be searched from the solution space of these regression models using various optimization schemes to achieve the best process performance. Compared with traditional mathematical optimization methods (e.g., conic programming, stochastic programming, geometric programming), nature-inspired optimization methods have recently emerged as more promising approaches to handle challenging real-world engineering problems. Nature-inspired optimization algorithms are preferred because they require neither a good estimate of the initial solution nor accurate gradient information of the objective functions.

A. TLBO VARIANTS AND APPLICATIONS

1) SINGLE OBJECTIVE OPTIMIZATION
An engineering problem can be formulated as a single objective or a multi-objective optimization model, depending on the number of response parameters involved. Teaching-learning based optimization (TLBO) is a prevalent optimizer that is inspired by the teaching-learning process of a classroom to find optimal solutions in the solution space. Its popularity stems from the little effort required to adjust algorithm-specific control parameters. A 3D finite element modelling simulation was applied along with TLBO in [1] to optimize the depth of cut, feed rate and cutting speed for minimizing the power consumption of the micro ball-end milling process of D2 steel. TLBO was applied in [2] to determine the best combination of wire feed rate, voltage, current and workpiece thickness in order to individually optimize the penetration, reinforcement and width of the weld in a metal inert gas (MIG) welding process. Surface roughness in plasma arc cutting of AISI D2 steel [3] and in electric discharge machining (EDM) of pure magnesium [4] was minimized by TLBO by searching for the optimal machining parameters. The tool path computation of CNC machining in [5] was formulated as a discrete optimization problem, and a discrete TLBO variant was implemented via parallel computing to determine an optimized path with minimum global distance. Apart from the original TLBO, a substantial number of TLBO variants have also been developed via various enhancement schemes to solve different challenging optimization problems and real-world applications.
Three new initialization schemes inspired by opposition-based learning were proposed for TLBO to generate an initial population of better quality. This improved the convergence speed and the accuracy of the final solution in solving different scheduling and dispatch problems [6]-[8]. A nonlinear inertia weighted TLBO (NIWTLBO) was designed in [9] to adjust the memory rate of learners. It promoted global search at the early stage of the search process and emphasized local search at the later stage. Similarly, a weighted elitist TLBO (WETLBO) was proposed in [10] to search for the best hyperparameters of a support vector machine (SVM) for classifying and diagnosing faulty data collected from a chemical process. A TLBO with population size varying in triangular form (VTTLBO) was introduced in [11] to reduce the computing cost. It used a Gaussian distribution to generate new solutions during the increasing phase of the population size, while a similarity criterion was considered to discard redundant solutions in the decreasing phase. VTTLBO was used to optimize the variables of an artificial neural network (ANN). Inspired by the modern pedagogical concept of intraclass grouping, a fuzzy K-means clustering method was proposed in [12]. A fuzzy grouping learning (FGL) strategy was used to partition the main population of FGLTLBO into different clusters based on the interests and capabilities of learners. The knowledge of each FGLTLBO learner was updated by interacting with the best and mean position vectors of its cluster. A similar variant known as clustered adaptive TLBO (CATLBO) was designed in [13] to determine the optimal generation schedules for a deregulated power market. In contrast to FGLTLBO, the learners in each CATLBO cluster were assigned a unique teacher and were optimized separately to prevent rapid diversity loss.
In [14], different subswarms were randomly created at bottom layers of hierarchical multiswarm cooperative TLBO (HMCTLBO) to enhance its global search ability, whereas the best learners of all subswarms were identified to construct the upper level of hierarchy and evolved with gaussian sampling learning. The randomized regrouping and Latin hypercube sampling were also incorporated in HMCTLBO as other diversity maintenance schemes.
In [15], differential learning TLBO (DLTLBO) was proposed to search for the optimal parameters of a digital infinite impulse response filter. It leveraged interactive learning strategies designed in the teacher phase. Specifically, two candidate solutions were first obtained using neighborhood search and mutation, followed by the construction of a new offspring learner using a crossover process to maintain the population diversity of DLTLBO. In addition, different mutation schemes were introduced as diversity maintenance schemes of TLBO variants to solve various engineering applications such as ANN training [16], optimal configuration of distributed generation units [17] and optimal reactive power dispatch [18]. In [19], combined search schemes of mutation, crossover and a learning phase with self-feedback mechanisms were adopted. An improved TLBO (I-TLBO) based on the historical experiences of the population was applied to solve a heat treating problem in the foundry industry. Recently, an improved TLBO (DI-TLBO) was proposed in [20] to solve multilevel thresholding image segmentation problems by incorporating two new learning strategies in both the teacher and learner phases. Additional mechanisms such as self-feedback learning, mutation and crossover were further utilized to enhance the exploration strength of DI-TLBO. Another improved TLBO (ITLBO) variant was developed in [21] to determine the optimal feature subset of a chronic disease dataset by leveraging the concept of Chebyshev distance in updating the new position of each learner during the teacher phase. Inspired by the benefits of TLBO and simulated annealing (SA), a hybrid algorithm known as TLBOSA was designed in [22] to solve the feature selection problem of a gene expression dataset along with the SVM.
VOLUME 10, 2022
Under the TLBOSA framework, TLBO served as the global search method to guide population searching towards the promising solution regions, whereas SA was used as the local search method to further refine the solutions found. A modified TLBO (MTLBO) was proposed in [23] to optimize the hyperparameters of extreme learning machine (ELM) for enhancing its capability to predict solar power in the short and medium terms. During the teacher phase of MTLBO, the original population was randomly partitioned into several sections and guided by different good performing learners. Meanwhile, the teacher solution was considered in learner phase to enhance the exploitation strength of MTLBO.

2) MULTIOBJECTIVE OPTIMIZATION
Different TLBO variants have been developed to solve multiobjective optimization problems (MOPs) that are prevalent in engineering applications. In contrast to single objective optimization problems (SOPs), MOPs tend to have multiple optimal solutions due to the presence of more than one optimization objective with contradictory characteristics. In the teacher phase of a multi-objective TLBO (MO-TLBO) [24], the least crowded Pareto optimal solution was used as the teacher and the centroid of the external archive was used as the mean position. In θ-multiobjective TLBO (θ-MTLBO) [25], the dynamic economic emission dispatch problem was solved by leveraging a mapping process to convert the original decision variables into the corresponding phase angles. Niching and fuzzy clustering methods were introduced into θ-MTLBO to improve the accuracy and the diversity of its Pareto fronts, respectively. In multiobjective improved TLBO (MO-ITLBO) [26], a multiswarm approach was first incorporated in the teacher phase, in which a unique teacher was assigned to guide each subswarm. Tutorial learning and self-motivated learning were embedded in the teacher and learner phases, respectively, to promote knowledge exchange within the population.
In [27], self-adaptive multi-objective TLBO (SA-MTLBO) was developed to optimize an ethylene cracking furnace. It used a self-adaptive strategy to update the knowledge of each learner in either the teacher phase or the learner phase, and two additional search operators to handle scenarios where the two learners under comparison do not dominate each other. In [28], the quasi-opposition-based learning concept was utilized to generate an initial population with better fitness quality for handling the multiobjective power flow optimization problem. A multiobjective individualized instruction TLBO (INM-TLBO) was proposed in [29]. It applied roulette selection to identify a unique teacher and an interactive peer for each learner in order to search with greater diversity. A hybrid algorithm known as TLBO-PSO was designed in [30] to tackle a multi-objective economic dispatch problem that involved power generation using renewable energy by minimizing the generation emission and cost simultaneously. During the teacher phase of TLBO-PSO, the new position of each learner was updated by considering the difference between the teacher and the population mean as well as the difference between the teacher and the learner itself.
The empirical models of the EDM process for Nimonic 75 superalloy were formulated in [31], where the two conflicting goals of maximizing material removal rate and minimizing surface roughness were combined into one objective function using the weight assignment method and optimized by classical TLBO. The two goals of minimizing surface roughness and maximizing cutting rate in the wire-cut EDM process of Inconel-825 were solved simultaneously with a single objective function in [32] using TLBO. Similarly, three optimization goals in abrasive water jet machining of C360 brass (i.e., minimum surface roughness, maximum material removal rate and maximum hardness) were represented as a single objective function in [33] and solved using classical TLBO. Two response variables, electrode wear ratio and drilling rate, for the electric discharge drilling process of titanium in [34] were obtained using the response surface methodology and converted into a single objective function with grey relational analysis before searching for the best combinations of machining parameters (i.e., peak current, pulse-off time and pulse-on time) using TLBO. A preference-based multiobjective TLBO (PBMOO-TLBO) was proposed in [35] to attain sustainable machining of Ti-6Al-4V alloy with wire-cut EDM via the minimization of surface roughness and the maximization of material removal rate. Multiple single objective functions were constructed via different weight combinations and solved using PBMOO-TLBO to construct a Pareto front.
MOTLBO [36] allowed the teaching factor to take any value between 1 and 2 for solving a machining problem that involved the simultaneous minimization of carbon emission and operating time. An enhanced multiobjective TLBO (EMOTLBO) was developed in [37] for optimizing turning parameters in the machining of Delrin material. In EMOTLBO, roulette wheel selection based on a crowdedness level criterion was used in the teacher phase to select a teacher for each learner, while tournament selection was used to discard redundant archive members during the external archive update. Non-dominated sorting TLBO (NSTLBO), developed in [38], addressed different multi-response machining problems. Enhanced versions of NSTLBO were introduced in [39] and [40] to handle the machining of polytetrafluoroethylene (PTFE) and swept friction stir spot welding of aluminum alloy, respectively. These NSTLBO variants assigned the nearest Pareto solution as the teacher of each learner during the teacher phase. The weighted mean position and self-learning concepts were incorporated to enhance their global search abilities. The differences between conventional and swept friction stir spot welding of aluminum alloy were analyzed in [41] in terms of their failure mode, microstructure and mechanical properties.

B. CHALLENGES OF EXISTING WORKS
Although numerous works related to TLBO have been proposed by different researchers since its inception, some common drawbacks and technical challenges can be observed from these studies. First of all, it is noteworthy that the related works of [1]-[5], [31]-[35] focused on applying the original TLBO to solve different real-world applications, particularly machining optimization problems. Despite having relatively good performance in solving these problems, the original TLBO tends to suffer from drastic performance degradation when dealing with more complex optimization problems with large numbers of local optima in their fitness landscapes. Without any robust mechanisms or modifications to achieve a better balance between exploration and exploitation searches, TLBO learners are highly likely to be misguided towards local or non-optimal solution regions by inferior directional information. This undesirable scenario can result in premature convergence of TLBO when handling more challenging real-world problems, hence delivering poor optimization results.
The idea of multi-population frameworks was adopted by the TLBO variants designed in [11]-[14], [23], [26] to tackle the rapid loss of population diversity in solving more complex optimization problems. In general, these multi-population TLBO variants have better diversity preservation capability than those with a single population because they divide the main population into multiple subpopulations, hence enabling the learners to search different regions of the solution space simultaneously. Although multi-population frameworks can mitigate the premature convergence of TLBO to a certain extent, this approach has some ongoing technical challenges that need to be addressed in order to realize its full potential. A major drawback of multi-population frameworks is the high computational complexity incurred in dividing the original main population into multiple subpopulations and in regrouping them according to criteria such as fitness or distance between solutions. Furthermore, it is not trivial to determine the optimal number of subpopulations and the types of learning strategies to be assigned to each subpopulation in order to solve different types of optimization problems effectively.
The modification of learning strategies is another popular approach used by existing TLBO variants [9], [10], [15]-[17], [19], [20], [22]-[24], [27], [30], [36] to promote their population diversity in solving optimization problems with complex fitness landscapes. Different types of new learning strategies such as neighborhood search, mutation and crossover were introduced into the teacher phase, the learner phase or both learning phases of these TLBO variants in order to enhance their effectiveness in tackling different optimization problems. Several common drawbacks can be observed from the existing TLBO variants with modified learning strategies. For instance, the learners of some TLBO variants [22] still have a high tendency to learn from the same teacher solution and mean position, both of which are constructed using the historically best positions found by the population. Although these historically best positions are useful for accelerating the convergence of learners at the initial optimization stage, they tend to remain unaltered over subsequent iterations at the later stage of optimization, hence resulting in a higher probability of premature convergence. It is also noteworthy that assigning the same mean position, which represents the mainstream knowledge of the population, to all learners contradicts the real-world scenario of teaching and learning because each learner is supposed to have a slightly different perception of the mainstream knowledge of the classroom [42]. In addition, it is observed that the learner phases of some TLBO variants [9], [10], [15], [16], [19], [20], [23], [30], [36]-[39] did not accurately reflect the actual scenario of peer interaction in a classroom. Some of these TLBO variants also only allowed each learner to interact with the same peer learner in all dimensions during the learner phase.
In a real-world scenario, it is more common for a learner to interact with different peer learners to enhance the knowledge of different subjects in order to improve the overall learning efficiency. Meanwhile, some existing TLBO variants have neglected the preferences of different learners in enhancing their knowledge after classes, especially introverted learners who prefer self-learning to interacting with their peers.
Finally, it is noteworthy that some TLBO variants in [30]-[35] were designed as a priori approaches to solve multiobjective optimization problems. In an a priori approach, different weight values are assigned to the multiple objective functions with contradictory goals according to their importance levels in order to construct a single objective function [43]. Despite having simpler mechanisms to tackle multiobjective optimization problems, the a priori approach can only produce a unique optimum solution based on the predefined importance level assigned to each objective function. This approach is not practical in real-world scenarios because process planners might not always know the importance levels of all objective functions in advance [44], especially if customers tend to change their requirements frequently. In this scenario, the optimization process has to be repeated whenever the requirements change, which makes the a priori approach more time and resource consuming. In contrast to the a priori approach, the posterior approach emerges as a more promising way to solve multiobjective optimization problems due to its ability to generate a set of Pareto-optimal solutions in a single optimization process [43]. Referring to the importance levels assigned to all objective functions, the process planner is able to select a unique optimal solution from the Pareto-optimal solutions without having to repeat the optimization process.
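The posterior workflow described above can be illustrated with a minimal Python sketch. The candidate values are hypothetical and are not taken from the experiments in this paper; the second objective is negated so that both objectives are minimized, and a plain normalized weighted sum is assumed as the selection rule. The helper names `dominates`, `pareto_front` and `pick_by_weights` are illustrative only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (f1, f2), both objectives are minimized


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Keep only the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


def pick_by_weights(front: Sequence[Point], w1: float) -> Point:
    """Select one compromise from a stored front; no re-optimization is needed."""
    w2 = 1.0 - w1
    lo = [min(p[k] for p in front) for k in range(2)]
    hi = [max(p[k] for p in front) for k in range(2)]

    def score(p: Point) -> float:
        # Normalize each objective to [0, 1] over the front before weighting.
        norm = [(p[k] - lo[k]) / (hi[k] - lo[k]) if hi[k] > lo[k] else 0.0
                for k in range(2)]
        return w1 * norm[0] + w2 * norm[1]

    return min(front, key=score)


# Hypothetical candidates: (surface roughness, negated material removal rate).
candidates = [(1.0, -10.0), (1.5, -18.0), (2.0, -24.0), (1.8, -15.0), (2.2, -20.0)]
front = pareto_front(candidates)

quality_first = pick_by_weights(front, w1=0.9)   # roughness matters most
quantity_first = pick_by_weights(front, w1=0.1)  # removal rate matters most
```

Changing the weight only repeats the cheap selection step on the stored front, whereas an a priori approach would require a new optimization run for each weight setting.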

C. RESEARCH SIGNIFICANCES AND CONTRIBUTIONS
A multi-response machining model for PEEK material is first formulated in this paper by using the response surface methodology based on the experimental results obtained from the turning of PEEK material. A modified TLBO variant known as Modified Multi-objective Teaching Learning Based Optimization-Refined Learning Scheme (MMTLBO-RLS) is subsequently designed in the current work to search for the optimal combination of turning parameters that maximizes the material removal rate and minimizes the surface roughness of PEEK simultaneously.
From a practical point of view, the proposed MMTLBO-RLS has several desirable characteristics that enable it to solve the proposed multi-response PEEK machining problem and other complex multiobjective optimization problems more effectively. First of all, MMTLBO-RLS is designed as a posterior approach to tackle various types of multiobjective optimization problems. Instead of producing only a unique optimum solution in each run based on the predefined weightage assigned to each objective function, the proposed MMTLBO-RLS is able to generate a set of non-dominated Pareto optimal solutions in a single run when solving multiobjective optimization problems. A unique optimum solution can subsequently be selected from the Pareto optimal solution set using a fuzzy decision maker based on the latest preferences specified by the customers.
Secondly, the proposed MMTLBO-RLS is designed to have excellent diversity preservation capability through the modification of learning strategies in both the teacher and learner phases, instead of relying on multi-population frameworks that tend to be more computationally intensive and less feasible for real-world applications. The modified learning strategies provide the additional solution diversity required to reduce the tendency of the population to be trapped in the local optima of complex multiobjective problems without incurring excessive computational cost.
Thirdly, the proposed MMTLBO-RLS is designed to have better ability in balancing the exploration and exploitation searches of algorithm, therefore it is able to generate the more uniformly distributed Pareto optimal solution set with higher percentages of non-dominated solutions. The learning strategies of MMTLBO-RLS for both teacher and learner phases are further modified and refined to ensure the learning mechanisms of all learners can better represent the modern classroom teaching and learning environments. For instance, the concepts of unique teacher solution and unique mean position are leveraged to guide the search process of each MMTLBO-RLS learner during the modified teacher phase in order to prevent the diversity loss of population. Meanwhile, different learning strategies are introduced for different types of MMTLBO-RLS learners during modified learner phase in order to enhance the learning efficiency of algorithm.
In general, the technical contributions and significance of the current study can be summarized as follows:
• A multi-response PEEK machining problem that aims to simultaneously maximize material removal rate and minimize surface roughness is first formulated using the response surface methodology. An improved TLBO variant called MMTLBO-RLS is subsequently designed as a posterior approach to solve such challenging multiobjective optimization problems.
• For the modified teacher phase of MMTLBO-RLS, a teacher selection scheme and the concept of unique mean position are introduced to preserve the diversity of the population during the search process. In order to fully utilize the useful directional information of promising population members, each MMTLBO-RLS learner is guided by its unique teacher and unique mean position to search in different subregions of the solution space.
• The modified learner phase of MMTLBO-RLS considers the possibility that different learners have their own preferences in updating knowledge. Therefore, two new learning schemes known as self-motivated learning (SML) and interactive adaptive learning (IAL) are incorporated into the proposed MMTLBO-RLS to improve the overall learning efficiency of learners.
• Rigorous performance studies of MMTLBO-RLS are conducted using the proposed multi-response PEEK machining optimization problem and the CEC 2009 multiobjective benchmark functions with different complexity levels. The proposed MMTLBO-RLS is shown to outperform well-established multiobjective optimization algorithms in solving the majority of the tested problems.
The article is organized as follows: Section II focuses on the experimental design and modelling of turning process. The detailed search mechanisms of proposed MMTLBO-RLS are

II. EXPERIMENTAL DESIGN AND MODELLING OF TURNING PROCESS
Design of Experiment is an organized approach to estimate the minimum number of runs or trials to be conducted for collecting experimental datasets. The design factors, or turning control parameters, are the independent parameters involved in the experimental design. The independent variables involved are speed ($V_c$), feed ($f$) and depth of cut ($a_p$). The surface roughness ($R_a$) of the part and the material removal rate (MRR) during turning are the two parameters generally considered for optimization. The quality of the part is assessed by $R_a$, while the quantity or volume of production is assessed by MRR. $R_a$ is the predominant response variable that production industries use in quality control to decide whether to accept or reject a part. MRR is another response variable, which relates to the volume of production and hence to productivity.
To start with, preliminary trial experiments were conducted with different combinations of input parameters, and $R_a$ and MRR were recorded in each experiment. Based on these preliminary investigations, three levels were chosen for each independent variable, as shown in Table 1. The selected three levels of each parameter were then used to complete the Box-Behnken design, which is an independent quadratic design. An $L_{27}$ design matrix was developed from the design of experiments and used in the turning of PEEK rods on a computerized numerical control machine tool center (Sprint 16TC Fanuc 0i T Mate C). A CNMG carbide-tipped insert with an 80° rhombic shape and coolant were used in all experiments. The samples were 20 mm in diameter and 60 mm in length, as shown in Figure 1. The response variables from each run and the time taken to complete each turning operation were measured immediately after the experiment. A Mitutoyo surf test digital meter was employed for measuring $R_a$ of the machined part and Eq. (1) was used to estimate MRR. The experiments were conducted four times for each machining condition and the average of the measurements was considered. The experimental results, along with the respective cutting parameters used in each turning operation, are shown in Table 2.
where MRR is the material removal rate (cm$^3$/minute), $D_0$ is the outside (unmachined) diameter in cm, $D_i$ is the inside (machined) diameter in cm, $L$ is the length of cut (cm) and $T$ is the time taken to cut (minute). It is noted from the experimental results that the best surface finish of 1.02 µm was achieved at the cutting condition of $V_c = 155$ rpm, $f = 0.2$ mm and $a_p = 0.25$ mm, whereas the best material removal rate of 24.38 cm$^3$/minute was achieved at the cutting condition of $V_c = 95$ rpm, $f = 0.6$ mm and $a_p = 0.5$ mm. These two results contradict each other because $R_a$ relates to quality while MRR relates to quantity. The challenge of balancing the contradictory behavior of these two dependent parameters can be addressed using multi-objective optimization. The predicted optimum turning parameters can be applied in production to machine a large number of good quality parts. Such optimization helps production companies to avoid production wastage and hence increase productivity.
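Eq. (1) itself is not reproduced in this text. Given the definitions above, the usual volumetric form $\mathrm{MRR} = \pi (D_0^2 - D_i^2) L / (4T)$ is assumed in the following Python sketch, i.e., the volume of the removed annular layer divided by the cutting time. The function name and the numerical inputs (in particular the cutting time) are hypothetical and serve only to show the unit handling.

```python
import math


def material_removal_rate(d_outer_cm: float, d_inner_cm: float,
                          length_cm: float, time_min: float) -> float:
    """Removed annular volume (cm^3) divided by cutting time (minute).

    Assumed form: MRR = pi * (D0^2 - Di^2) * L / (4 * T).
    """
    if time_min <= 0:
        raise ValueError("time_min must be positive")
    if d_inner_cm > d_outer_cm:
        raise ValueError("machined diameter cannot exceed unmachined diameter")
    removed_volume = math.pi / 4.0 * (d_outer_cm ** 2 - d_inner_cm ** 2) * length_cm
    return removed_volume / time_min


# Hypothetical pass: a 2.0 cm rod turned down to 1.9 cm (0.5 mm radial depth of
# cut) over a 6.0 cm length in an assumed 0.5 minute.
mrr = material_removal_rate(d_outer_cm=2.0, d_inner_cm=1.9, length_cm=6.0, time_min=0.5)
```

Keeping all lengths in cm and the time in minutes yields MRR directly in cm$^3$/minute, the unit used in Table 2.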

A. ANALYSIS OF INDEPENDENT PARAMETERS AND MODELING OF DATASET
To solve the aforementioned problem, the influence of each independent variable on the dependent variables, and the predominant independent variable for each response, were determined through Analysis of Variance (ANOVA). The ANOVA test on the datasets of Table 2 reveals that feed rate is the dominant independent variable for $R_a$, with a contribution of 93.90%. The depth of cut is found to be the predominant parameter for MRR, with a contribution of 47.31%. Figure 2 shows the correlation between the cross product terms for $R_a$ and Figure 3 shows the correlation between the cross product terms for MRR. The significance of feed rate for $R_a$ is evidenced in Figures 2(a) and 2(c). All other linear terms and cross product terms have little or negligible effect on $R_a$.
A second order response surface model (RSM) was developed for each response variable. The second order model considers the linear terms of the independent variables as well as their cross-product terms. The derived RSM equations are: These two optimization functions were further applied in the developed algorithm to obtain the minimum $R_a$ and the maximum MRR. The developed algorithm is detailed in the following section.

III. PROPOSED METHODOLOGY OF OPTIMIZATION

A. OVERVIEW OF TLBO
The TLBO algorithm was first introduced by Rao et al. [45], in which the solution space is searched using the teaching-learning paradigm of a conventional classroom. In the initial state of optimization, a population of $N$ learners is arbitrarily generated. Let $d = 1, \ldots, D$, where $d$ and $D$ are the dimension index and the total number of dimensions, respectively. Each $n$-th learner holds a candidate solution of the problem, $X_n = [X_{n,1}, \ldots, X_{n,d}, \ldots, X_{n,D}]$. The quality of the $n$-th learner in terms of knowledge level is quantified by the objective function value of $X_n$. Notably, the knowledge level of every TLBO learner is enhanced iteratively by the search mechanisms introduced in the teacher and learner phases, as explained below.
In the teacher phase, the position vector of each learner is updated based on the information provided by the best population member (teacher solution $X^{teacher}$), and the mainstream knowledge is disseminated among all solution members. Particularly, a mean position vector $X^{mean}$ computed from the overall population is used to quantify the mainstream knowledge of conventional TLBO:
$$X^{mean} = \frac{1}{N} \sum_{n=1}^{N} X_n \quad (4)$$
If $r_1 \in [0, 1]$ is a random value produced from the uniform distribution and $T_f \in \{1, 2\}$ is a teaching factor used to indicate the significance of mainstream knowledge in guiding the learner during the search, the new solution of every $n$-th learner generated in the teacher phase is
$$X^{new}_n = X_n + r_1 \left( X^{teacher} - T_f X^{mean} \right) \quad (5)$$
In the learner phase, every $n$-th learner interacts with other population members to achieve fitness enhancement. Let the $s$-th peer learner be randomly chosen to interact with the $n$-th learner, where $s \in [1, N]$ and $s \neq n$, and let $r_2 \in [0, 1]$ be a random value produced from the uniform distribution. The learner $X_n$ is attracted towards the randomly selected learner $X_s$, as indicated in Eq. (6), if $X_s$ has superior fitness:
$$X^{new}_n = X_n + r_2 \left( X_s - X_n \right) \quad (6)$$
On the contrary, the learner $X_n$ is discouraged from approaching $X_s$, as in Eq. (7), if this randomly selected peer has an inferior solution:
$$X^{new}_n = X_n + r_2 \left( X_n - X_s \right) \quad (7)$$
If the new solution $X^{new}_n$ found by each $n$-th learner during the teacher phase or learner phase has superior fitness to the original solution $X_n$, a tournament selection scheme is triggered to update $X^{new}_n$ as the current solution of the $n$-th learner. Otherwise, $X^{new}_n$ is discarded. The iterative knowledge enhancement process of learners through the teacher and learner phases is repeated until the algorithm's termination condition is satisfied. Finally, the current best solution of the population for the given problem, $X^{teacher}$, is returned. VOLUME 10, 2022
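The conventional teacher and learner phases described above can be sketched as follows. This is an illustrative single-objective version for a minimization problem, assuming the standard TLBO update forms (move towards the teacher and away from $T_f$ times the mean, then attract or repel relative to one random peer) with greedy replacement; the function name `tlbo_iteration` and the list-of-lists population layout are assumptions made for the example, and at least two learners are required.

```python
import random


def tlbo_iteration(pop, fitness, rng=random):
    """Run one teacher phase and one learner phase in place (minimization)."""
    n_learners, dim = len(pop), len(pop[0])

    # Teacher phase: best learner is the teacher, the mean is the mainstream knowledge.
    teacher = min(pop, key=fitness)
    mean = [sum(x[d] for x in pop) / n_learners for d in range(dim)]
    for n in range(n_learners):
        t_f = rng.choice((1, 2))   # teaching factor
        r1 = rng.random()
        cand = [pop[n][d] + r1 * (teacher[d] - t_f * mean[d]) for d in range(dim)]
        if fitness(cand) < fitness(pop[n]):   # keep only improvements
            pop[n] = cand

    # Learner phase: interact with one randomly chosen peer s != n.
    for n in range(n_learners):
        s = rng.choice([i for i in range(n_learners) if i != n])
        r2 = rng.random()
        if fitness(pop[s]) < fitness(pop[n]):   # peer is better: move towards it
            cand = [pop[n][d] + r2 * (pop[s][d] - pop[n][d]) for d in range(dim)]
        else:                                   # peer is worse: move away from it
            cand = [pop[n][d] + r2 * (pop[n][d] - pop[s][d]) for d in range(dim)]
        if fitness(cand) < fitness(pop[n]):
            pop[n] = cand
    return pop
```

Because a candidate replaces a learner only when it improves the fitness, no learner can get worse from one iteration to the next, which is the greedy behaviour the paragraph above describes.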

B. MODIFIED MULTIOBJECTIVE TLBO WITH REFINED LEARNING SCHEMES (MMTLBO-RLS)
A decisive factor governing the performance of TLBO and its variants on different optimization problems is the underlying mechanism used to balance two contradictory search behaviors, known as exploration and exploitation. Although numerous TLBO variants have been designed in the last decade, their capability to solve engineering optimization problems with varied fitness landscapes remains questionable. This is because the majority of learning mechanisms incorporated into the algorithmic frameworks of these TLBO variants are not sufficiently comprehensive to describe the actual teaching and learning processes observed in real-world scenarios. Under these circumstances, much of the useful directional information carried by the predominant learners in the population might not be fully utilized to search effectively, leaving these TLBO variants prone to drawbacks such as rapid diversity loss or slow convergence.
A new multi-objective TLBO variant, namely MMTLBO-RLS, is therefore proposed in the current research to address the aforementioned weaknesses. Several major modifications are proposed in MMTLBO-RLS to further refine the search mechanisms of the teacher and learner phases, enabling a more realistic emulation of real-world teaching and learning processes. With these search mechanisms, the useful information of predominant learners can be better utilized to search the solution space more effectively and improve the overall optimization performance. Furthermore, an archive controller is included in MMTLBO-RLS as an essential mechanism to handle multiobjective optimization problems (MOPs) with contradictory objectives by effectively managing the newly introduced and redundant archive members.

1) PARETO DOMINANCE
Assume that an MOP has a total of $M$ optimization objectives, and let $f_m(X_n)$ represent the value of the objective function obtained by learner $X_n$ for each $m$-th objective of the problem, where $m = 1, \ldots, M$ and $n = 1, \ldots, N$. In the presence of multiple and contradictory optimization objectives, it is nontrivial to differentiate the quality of solutions by referring only to their objective function values, as is commonly practiced in single objective problems (SOPs). For this reason, Pareto dominance is a useful concept to address this challenge, and some of its fundamental definitions are described as follows.

Definition 1 (The Definition of Pareto Dominance):
Assume that two solution vectors $X_n$ and $X_s$ have the objective function values $f_m(X_n)$ and $f_m(X_s)$, respectively, corresponding to every $m$-th objective. For minimization of all objectives, $X_n$ is considered to dominate $X_s$, i.e., $X_n \prec X_s$, if and only if $\forall i \in \{1, 2, \ldots, M\}: f_i(X_n) \le f_i(X_s)$ and $\exists j \in \{1, 2, \ldots, M\}: f_j(X_n) < f_j(X_s)$.

Definition 2 (The Definition of Pareto Optimality):
A solution vector $X^*$ is identified as a Pareto optimal solution if and only if $\nexists X \in \mathbb{R}^D : X \prec X^*$, i.e., there is no other solution $X$ that dominates $X^*$.

Definition 3 (The Definition of Pareto Optimal Set):
The collection of Pareto optimal solutions produces a Pareto optimal set (PS) if and only if $PS = \{ X_n \in \mathbb{R}^D \mid \nexists X \in \mathbb{R}^D : X \prec X_n \}$.

Definition 4 (The Definition of Pareto Optimal Front):
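The mapping of the Pareto optimal set into objective function space produces a Pareto front (PF) if and only if $PF = \{ (f_1(X_n), \ldots, f_M(X_n)) \mid X_n \in PS \}$, where $f_m(X_n)$ is the value of the $m$-th objective function at $X_n$.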
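The dominance test and the extraction of a non-dominated set can be sketched in a few lines. The example assumes every objective is minimized, so a maximized objective such as MRR is entered with its sign flipped; the function names and the sample objective vectors are hypothetical.

```python
def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (all objectives minimized)."""
    no_worse = all(a <= b for a, b in zip(f_a, f_b))
    strictly_better = any(a < b for a, b in zip(f_a, f_b))
    return no_worse and strictly_better


def non_dominated(front):
    """Return the members of front that no other member dominates."""
    return [
        f for i, f in enumerate(front)
        if not any(dominates(g, f) for j, g in enumerate(front) if j != i)
    ]


# Hypothetical (R_a, -MRR) pairs: roughness is minimized, MRR is maximized.
points = [(1.0, -20.0), (1.2, -24.0), (1.3, -22.0), (1.0, -18.0)]
print(non_dominated(points))
```

In this sample, the third point is dominated by the second and the fourth by the first, so only the first two points survive.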

2) CONSTRUCTION OF EXTERNAL ARCHIVE IN MMTLBO-RLS
The initial population of MMTLBO-RLS is produced by randomly generating the solutions of $N$ learners, i.e., $P = [X_1, \ldots, X_n, \ldots, X_N]$. Let $f_m(X_n)$ be the value of the objective function obtained by the $n$-th learner for each $m$-th optimization objective, with $m = 1, \ldots, M$ and $n = 1, \ldots, N$. Given these objective function values, the Pareto optimal solutions are found based on Definition 1, as explained in the earlier subsection, and then stored in a finite size external archive denoted as $A$. Essentially, the external archive $A$ has an $M$-dimensional objective space, which is partitioned by an adaptive grid approach into multiple equally spaced hypercubes in order to generate uniformly distributed Pareto fronts. Each Pareto optimal solution is inserted into an appropriate hypercube according to its objective values. The useful directional information contained in these Pareto optimal solutions is fully utilized to influence the search trajectories of all MMTLBO-RLS learners.
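The placement of archive members into equally spaced hypercubes can be sketched as below. This is a simplified illustration: the grid bounds are taken directly from the minimum and maximum objective values of the current front, and the number of divisions per objective is a hypothetical parameter; the adaptive grid used in the paper may differ in both respects.

```python
def hypercube_index(f, lower, upper, divisions):
    """Map an objective vector to the tuple of grid-cell indices containing it."""
    index = []
    for value, lo, hi in zip(f, lower, upper):
        width = (hi - lo) / divisions
        cell = 0 if width == 0 else int((value - lo) / width)
        # Clamp so that points on the upper boundary fall into the last cell.
        index.append(min(max(cell, 0), divisions - 1))
    return tuple(index)


def occupied_hypercubes(front, divisions=5):
    """Group member indices of a front by the hypercube that each one occupies."""
    n_obj = len(front[0])
    lower = [min(f[m] for f in front) for m in range(n_obj)]
    upper = [max(f[m] for f in front) for m in range(n_obj)]
    cells = {}
    for i, f in enumerate(front):
        key = hypercube_index(f, lower, upper, divisions)
        cells.setdefault(key, []).append(i)
    return cells


front = [(0.0, 10.0), (1.0, 9.0), (9.0, 1.0), (10.0, 0.0)]
print(occupied_hypercubes(front))
```

The length of each list in the returned dictionary is the density of that occupied hypercube, which is the quantity the teacher selection and archive truncation steps rely on.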

3) PROPOSED MODIFICATIONS IN TEACHER PHASE OF MMTLBO-RLS
In the teacher phase of TLBO, each learner $X_n$ searches for a new solution based on the directional information provided by the best solution vector $X^{teacher}$ and the mean solution vector $X^{mean}$ determined from the population, as indicated in Eq. (5). Nevertheless, the mechanisms used to select the best solution, or teacher solution, tend to be more complicated for MOPs because multiple and contradictory objectives produce a set of equally good Pareto optimal solutions that are all qualified to lead the search process. Apart from the objective function values, the solution density of learners in objective space also needs to be considered to rank the quality of the Pareto optimal solutions found.
Based on the above motivation, a selection scheme is incorporated into the proposed MMTLBO-RLS to determine a unique teacher solution for each learner and thereby achieve more effective searching. Suppose that $X^{teacher}_n$ refers to a teacher solution specifically allocated to guide the $n$-th learner, where $X^{teacher}_n$ is selected from the external archive $A$ based on the density of each hypercube occupied by Pareto optimal solutions. In order to produce a Pareto front with a more uniform distribution, a less occupied hypercube has a higher chance of being chosen to contribute one of its non-dominated solutions as $X^{teacher}_n$ to guide the search process of the $n$-th learner. Let $H$ be the total number of occupied hypercubes found in $A$ and let $\kappa_h$ denote the number of Pareto optimal solutions that appear in each $h$-th occupied hypercube. The probability of each $h$-th occupied hypercube being selected using the roulette-wheel method is defined by Eq. (8). According to [46], [47], the leader selection pressure defined as parameter $\upsilon$ in Eq. (8) has a constant value greater than 1 because it serves as a fitness sharing strategy that reduces the likelihood of the population converging towards crowded regions by penalizing the more populated hypercubes with a lower selection probability. From Eq. (8), the probability of each $h$-th occupied hypercube being chosen to contribute $X^{teacher}_n$ increases with a lower density value $\kappa_h$, and vice versa. A Pareto optimal solution of the selected $h$-th occupied hypercube is then randomly chosen as $X^{teacher}_n$ to guide the $n$-th learner in searching for new solutions. The teacher selection mechanism employed in MMTLBO-RLS is expected to guide all learners to explore the promising solution regions occupied by different teacher solutions, so that it can locate the optimal Pareto front of a given MOP more effectively while preserving population diversity.
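The density-based teacher selection can be sketched as a roulette wheel over the occupied hypercubes. Eq. (8) is not reproduced in this text, so the weighting used here, $\kappa_h^{-\upsilon}$, is only an assumed form that satisfies the stated property (sparser hypercubes receive a higher probability when $\upsilon > 1$); the function names and the dictionary layout are likewise hypothetical.

```python
import random


def select_teacher_cell(cell_counts, pressure=2.0, rng=random):
    """Pick one occupied hypercube; sparser cells are favoured.

    cell_counts maps a hypercube key to its density kappa_h (>= 1).
    ASSUMPTION: the selection weight is kappa_h ** (-pressure).
    """
    cells = list(cell_counts)
    weights = [cell_counts[c] ** (-pressure) for c in cells]
    return rng.choices(cells, weights=weights, k=1)[0]


def select_teacher(cell_members, pressure=2.0, rng=random):
    """Pick a hypercube by roulette wheel, then one of its members at random.

    cell_members maps a hypercube key to the list of archive indices inside it.
    """
    counts = {c: len(m) for c, m in cell_members.items()}
    chosen = select_teacher_cell(counts, pressure, rng)
    return rng.choice(cell_members[chosen])


members = {(0, 4): [0, 1, 2, 3, 4], (4, 0): [5]}
print(select_teacher(members, rng=random.Random(1)))
```

Calling `select_teacher` once per learner yields a separate teacher index for each of them, which is how each learner can end up guided by a different archive member.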
Apart from the mechanisms used to identify a unique teacher for every learner, a more realistic mathematical formulation is also developed in the modified teacher phase of MMTLBO-RLS to address the mainstream knowledge of the classroom, because this is another crucial factor governing the effectiveness of the knowledge transfer process. It is noteworthy that the modelling approach used by conventional TLBO, as shown in Eq. (4), is not sufficiently accurate to portray actual teaching and learning in a classroom because it assumes that all learners are influenced by the same mainstream knowledge, expressed as the mean position vector $X^{mean}$ of the population. Furthermore, information sharing among all learners via the same mainstream knowledge $X^{mean}$ does not help the algorithm to enhance population diversity further. These deficiencies are the main factors restricting the robustness of the original TLBO in dealing with more complex optimization problems such as MOPs. Intuitively, different mean positions should be formulated to provide unique directional information for different learners to adjust their respective search trajectories, given that each learner has a slightly different interpretation of the mainstream knowledge of the classroom. When each learner is guided by the different directional information of its unique teacher and mean position vectors during the teacher phase, the overall population diversity is expected to improve. This enables the algorithm to exhibit better robustness against the misleading information of local optima, hence reducing the likelihood of premature convergence.
Motivated by the aforementioned justifications, an alternative strategy is designed to model the tendency of each MMTLBO-RLS learner to have a different interpretation of the classroom mainstream knowledge. Particularly, a new scheme is proposed to derive the unique mean position of each learner by leveraging the useful search information offered by all existing Pareto optimal solutions. Compared to TLBO, the proposed scheme of calculating a unique mean position for each MMTLBO-RLS learner is anticipated to achieve more effective knowledge enhancement of the overall population by fully utilizing the expertise of the multiple teachers stored in $A$ during the modified teacher phase. Let $r_a \in [0, 1]$ be a random number assigned to the $a$-th Pareto optimal solution $A_a$ stored in the $h$-th occupied hypercube, where $a$ indexes the archived solutions. As shown in Eq. (9), a different random weightage value $r_a$ is assigned to each $a$-th Pareto optimal solution to indicate its unique contribution in formulating the unique mean position $X^{mean}_n$ used to guide each learner.
Referring to $X^{teacher}_n$ and $X^{mean}_n$, a new learning strategy is designed for the modified teacher phase of MMTLBO-RLS to determine the new solution $X^{new}_n$ of each $n$-th learner as follows:
$$X^{new}_n = X_n + r_3 \left( X^{teacher}_n - T_{f1} X_n \right) + r_4 \left( X^{mean}_n - T_{f2} X_n \right) \quad (10)$$
where $r_3, r_4 \in [0, 1]$ are two different random numbers generated from the uniform distribution, and $T_{f1}, T_{f2} \in [1, 2]$ are two different teaching factors generated from the uniform distribution to quantify the degree of influence exerted by the teacher and the mainstream knowledge on a given learner. In contrast to TLBO, the new learning strategy of MMTLBO-RLS in Eq. (10) enables each $n$-th learner to interact directly with its unique teacher and mean position vectors to determine its next position in the search space. The new learning strategy introduced into the modified teacher phase of MMTLBO-RLS is expected to achieve more effective knowledge enhancement for each learner via better utilization of the promising information contributed by the Pareto optimal solutions stored in the external archive $A$. Each MMTLBO-RLS learner is assumed to have a strong tendency to learn from these Pareto optimal solutions if both teaching factors are set as $T_{f1} = T_{f2} = 2$. Otherwise, the learner is considered to have a moderate tendency to interact with these Pareto optimal solutions if both teaching factors $T_{f1}$ and $T_{f2}$ are set as 1.
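The modified teacher-phase update of Eq. (10) can be sketched as follows. The update function follows Eq. (10) term by term, whereas `unique_mean` is only an assumed reading of Eq. (9), which is not reproduced in this text: it takes a randomly weighted, normalized average of the archive positions. All function names are hypothetical.

```python
import random


def unique_mean(archive_positions, rng=random):
    """Randomly weighted average of archive positions.

    ASSUMPTION: a weight r_a in [0, 1] is drawn per archive member and the
    weighted sum is normalized by the sum of the weights.
    """
    weights = [rng.random() for _ in archive_positions]
    total = sum(weights) or 1.0   # guard against an all-zero draw
    dim = len(archive_positions[0])
    return [
        sum(w * a[d] for w, a in zip(weights, archive_positions)) / total
        for d in range(dim)
    ]


def modified_teacher_update(x, teacher, mean, rng=random):
    """Eq. (10): x + r3 * (teacher - Tf1 * x) + r4 * (mean - Tf2 * x)."""
    r3, r4 = rng.random(), rng.random()
    tf1, tf2 = rng.uniform(1.0, 2.0), rng.uniform(1.0, 2.0)
    return [
        x[d] + r3 * (teacher[d] - tf1 * x[d]) + r4 * (mean[d] - tf2 * x[d])
        for d in range(len(x))
    ]


rng = random.Random(3)
archive_positions = [[0.0, 0.0], [2.0, 4.0]]
mean_n = unique_mean(archive_positions, rng)
print(modified_teacher_update([1.0, 1.0], archive_positions[1], mean_n, rng))
```

Drawing fresh weights for every learner is what makes the mean position differ between learners even though the same archive is used.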
Suppose that $f_m(X^{new}_n)$ represents the objective value of $X^{new}_n$ for each $m$-th objective. The value of $f_m(X^{new}_n)$ is evaluated and compared with that of $f_m(X_n)$ for $m = 1, \ldots, M$ using the Pareto dominance concept, as shown in Figure 4. The new solution $X^{new}_n$ is used to update the existing $X_n$ if $X^{new}_n \prec X_n$, as shown in Lines 11 and 12 of Figure 4. Otherwise, the current $X_n$ is retained, as indicated in Lines 13 and 14 of Figure 4. If $X^{new}_n$ and $X_n$ are non-dominated with respect to each other, a coin is flipped to randomly select one of these solutions as the current position of the $n$-th learner in the next iteration, as illustrated in Lines 17 to 20 of Figure 4.
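The replacement rule described above can be sketched as a small function. It assumes all objectives are minimized and that a fair coin decides the mutually non-dominated case; the function names are hypothetical and do not correspond to the line numbering of Figure 4.

```python
import random


def dominates(f_a, f_b):
    """True if f_a Pareto-dominates f_b (all objectives minimized)."""
    return all(a <= b for a, b in zip(f_a, f_b)) and any(
        a < b for a, b in zip(f_a, f_b)
    )


def accept_new_solution(x, f_x, x_new, f_new, rng=random):
    """Return the (position, objectives) pair kept for the next iteration."""
    if dominates(f_new, f_x):
        return x_new, f_new          # new solution dominates: replace
    if dominates(f_x, f_new):
        return x, f_x                # current solution dominates: retain
    # Mutually non-dominated: flip a fair coin.
    return (x_new, f_new) if rng.random() < 0.5 else (x, f_x)
```

The same rule can be reused after the learner phase, since the text applies an identical comparison there.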

4) PROPOSED MODIFICATIONS IN LEARNER PHASE OF MMTLBO-RLS
Similar to the teacher phase, an inaccurate representation of the teaching and learning process is also observed in the learner phase of conventional TLBO. One of the most notable drawbacks is that only a single learning strategy is assigned to all learners for knowledge enhancement during the learner phase. In real-world scenarios, a classroom contains diverse types of learners with their own preferred approaches to seeking new knowledge. To accommodate these different preferences, it is more desirable to incorporate multiple learning schemes with different levels of exploration and exploitation strengths into the learner phase. Another notable deficiency of the learner phase in conventional TLBO is that every learner can interact with only a single peer learner to update its directional information in all dimensions. This learning behavior is not only inefficient, but also misaligned with actual teaching-learning scenarios. Most often, a learner is expected to achieve more effective knowledge exchange, and to be better able to discover useful solution regions that have not been visited before, by interacting with multiple peer learners in solution space. Motivated by these two main deficiencies, two new learning schemes are introduced into the modified learner phase of MMTLBO-RLS.
The first learning scheme introduced into the modified learner phase of MMTLBO-RLS is known as self-motivated learning (SML). This learning strategy aims to emulate the behavior of highly motivated learners who are keen to explore new knowledge in different aspects without relying on assistance from their peers. The proactive learning behavior of SML is regarded as an essential skillset in the modern educational landscape for seeking new knowledge. From the optimization point of view, the stochastic characteristic of SML can promote the exploration strength of MMTLBO-RLS, making it more robust against premature convergence. The search mechanisms of the proposed SML are explained as follows. Suppose that $P^{SML} = 1/D$ represents the probability of an MMTLBO-RLS learner performing SML in the modified learner phase, where $P^{SML} \in [0, 1]$. Let $d_r \in [1, D]$ be a randomly selected dimensional index of the $n$-th learner on which to perform SML. A random perturbation is then applied to the selected $X_{n,d_r}$ to enable it to explore the new information contained in that particular dimension of the solution space. Suppose that $X^{new}_{n,d_r}$ is the perturbed component obtained by the $n$-th learner after performing SML on the selected $X_{n,d_r}$, as given by Eq. (11), where $r_5 \in [-1, 1]$ is a random number produced from the uniform distribution; $X^{new}_{n,d_r}$ refers to the $d_r$-th component of the $n$-th self-motivated learner; and $X^{U}_{n,d_r}$ and $X^{L}_{n,d_r}$ refer to the $d_r$-th dimension of the upper limit and lower limit of the decision variable, respectively. The overall mechanism of the SML scheme adopted by each $n$-th selected learner is presented in Figure 5. Compared with the learner phase of conventional TLBO, the SML scheme introduced in the modified learner phase of MMTLBO-RLS is expected to offer greater exploration strength to the population in a more consistent manner throughout the search process.
For conventional TLBO, the learner phase is only able to demonstrate exploration behavior through Eq. (7), when the learner is repelled from a randomly selected peer learner with worse fitness. When dealing with more complex problems such as MOPs, two compared learners tend to have similar fitness and become non-dominated with respect to each other in the later stages of the optimization process due to population diversity loss. This undesirable scenario can drastically reduce the probability of triggering Eq. (7), suppressing the exploration search and further accelerating the diversity loss of the population. On the other hand, the frequency of triggering Eq. (11), i.e., SML in the modified learner phase of MMTLBO-RLS, can be guaranteed through the proper setting of $P^{SML}$. A sufficient amount of exploration strength can be consistently induced to maintain the population diversity of MMTLBO-RLS throughout the search process, enabling it to handle optimization problems with complex fitness landscapes more robustly.
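The SML step can be sketched as a single-dimension perturbation. Eq. (11) is not reproduced in this text, so the perturbation used here, $X_{n,d_r} + r_5 (X^{U}_{d_r} - X^{L}_{d_r})$ followed by clipping to the bounds, is only an assumed form built from the symbols the text defines; the function names are hypothetical.

```python
import random


def uses_sml(dim, rng=random):
    """Decide whether a learner performs SML, with probability P_SML = 1 / D."""
    return rng.random() < 1.0 / dim


def self_motivated_learning(x, lower, upper, rng=random):
    """Perturb one randomly chosen dimension of x and return a new position.

    ASSUMPTION: the step is r5 * (upper - lower) with r5 uniform in [-1, 1],
    and the result is clipped back into the decision-variable bounds.
    """
    new_x = list(x)                       # leave the original learner untouched
    d_r = rng.randrange(len(x))           # randomly selected dimension index
    r5 = rng.uniform(-1.0, 1.0)
    step = r5 * (upper[d_r] - lower[d_r])
    new_x[d_r] = min(max(x[d_r] + step, lower[d_r]), upper[d_r])
    return new_x


rng = random.Random(4)
print(self_motivated_learning([0.5, 0.2, 0.8], [0.0] * 3, [1.0] * 3, rng))
```

Because the step does not depend on any peer learner, the amount of exploration it injects stays the same even when the population has lost diversity.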
The second learning scheme incorporated into the modified learner phase of MMTLBO-RLS is known as interactive adaptive learning (IAL). Similar to the learner phase of conventional TLBO, the proposed IAL enables knowledge enhancement of a given learner through peer interaction. Nevertheless, some of the mechanisms designed in the proposed IAL are fundamentally different from the learner phase in TLBO. Firstly, the proposed IAL does not restrict every learner to interacting with only one peer learner during the modified learner phase. To model the learner phase more realistically and attain better optimization results, IAL is designed to facilitate information exchange among multiple peers so that the knowledge of a learner in every subject (i.e., dimensional component) is updated more effectively. Furthermore, the proposed IAL is also designed to consider the possibility that learners have different levels of interest in interacting with other peers even though they have opted for IAL. Define $P^{IAL}_n \in [0, 1]$ as a random probability value produced from the uniform distribution to emulate the interest level of each $n$-th learner in performing IAL with other peer learners during the modified learner phase. Different learners can be assigned different $P^{IAL}_n$ values to reflect their different tendencies to learn from other peer learners. Suppose that $X^{new}_n$ represents the new solution of each learner obtained from the proposed IAL scheme. For every $d$-th dimensional component of $X^{new}_n$, denoted as $X^{new}_{n,d}$, a random number $r_6 \in [0, 1]$ is generated from the uniform distribution and then compared with the $P^{IAL}_n$ value assigned to the $n$-th learner. Define $\lambda_n \in [0.5, 1]$ as a randomly generated interactive learning factor for each $n$-th learner that participates in IAL.
If $r_6$ is smaller than $P^{IAL}_n$, the peer learners $X_j$, $X_k$ and $X_l$ are randomly chosen from the population for information exchange with the $n$-th learner in the $d$-th dimension (i.e., subject), where $n \neq j \neq k \neq l$. Otherwise, the $n$-th learner is assumed to prefer retaining its knowledge in the $d$-th dimension by inheriting the original value of $X_{n,d}$ into $X^{new}_{n,d}$. The overall learning mechanism used to update the $d$-th component of each $n$-th learner via the proposed IAL scheme is represented by Eq. (12). Accordingly, an $n$-th learner assigned a higher $P^{IAL}_n$ has a higher likelihood of interacting with multiple peers in updating the directional information of $X^{new}_n$, hence it demonstrates more explorative behavior. On the other hand, an $n$-th learner assigned a lower $P^{IAL}_n$ is more exploitative due to its high tendency to retain the information in most of its dimensional components and then search around its nearby solution regions. The mechanisms used by the proposed IAL scheme to balance the exploration and exploitation searches of learners, by modelling the different interest levels of each learner in interacting with multiple peers when updating each dimensional component, are presented in Figure 6. After obtaining the new position $X^{new}_n$ via the modified learner phase, the associated objective function value $f_m(X^{new}_n)$ of each $m$-th objective is evaluated and compared with the current value of $f_m(X_n)$ for $m = 1, \ldots, M$ using the Pareto dominance concept, as shown in Fig. 3. Similar to the modified teacher phase, the newly obtained $X^{new}_n$ replaces the current $X_n$ when $X^{new}_n \prec X_n$, as shown in Lines 11 and 12 of Figure 6. Otherwise, the current $X_n$ is retained, as shown in Lines 13 and 14. If $X_n$ and $X^{new}_n$ are non-dominated with respect to each other, a coin is flipped to randomly select one of these solutions as the current position of the $n$-th learner in the next iteration, as illustrated in Lines 17 to 20 of Figure 6.
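The per-dimension gating of IAL can be sketched as follows. Eq. (12) is not reproduced in this text, so the way the three peers are combined here, $X_{j,d} + \lambda_n (X_{k,d} - X_{l,d})$, is purely an assumed placeholder; only the gating by $P^{IAL}_n$, the use of three distinct peers and the range of $\lambda_n$ are taken from the description. The function name is hypothetical, and the population must contain at least four learners.

```python
import random


def interactive_adaptive_learning(pop, n, p_ial=None, rng=random):
    """Build a new position for learner n, one dimension at a time."""
    if p_ial is None:
        p_ial = rng.random()              # interest level of learner n
    lam = rng.uniform(0.5, 1.0)           # interactive learning factor
    others = [i for i in range(len(pop)) if i != n]
    new_x = list(pop[n])
    for d in range(len(new_x)):
        if rng.random() < p_ial:
            # Interact with three distinct peers in this dimension.
            j, k, l = rng.sample(others, 3)
            # ASSUMPTION: placeholder combination rule, not the paper's Eq. (12).
            new_x[d] = pop[j][d] + lam * (pop[k][d] - pop[l][d])
        # Otherwise the learner keeps its own knowledge in dimension d.
    return new_x


rng = random.Random(7)
pop = [[rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(6)]
print(interactive_adaptive_learning(pop, 0, rng=rng))
```

A learner with `p_ial` close to 0 keeps almost all of its components, whereas one with `p_ial` close to 1 rebuilds almost every component from its peers, which mirrors the exploitative and explorative behaviors contrasted in the text.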
respect to every $a$-th Pareto optimal solution stored in $A$, where $a = 1, \ldots, |A|$. An archive controller is incorporated into MMTLBO-RLS to manage the new incoming solutions or discard the extra archive members when $A$ is fully occupied. The rules of thumb adopted by the archive controller in managing $A$ are shown in Figure 7 and described below:
• A new solution is rejected by $A$ if it is dominated by one or more archive members.
• A new solution is added into $A$ if it dominates one or more archive members, which are then removed.
• A new solution is added into $A$ if it is non-dominated with respect to all archive members and $A$ is not fully occupied.
• An adaptive grid approach [48] is used to reorganize the segmentation of objective space if the new solution is located outside the hypercubes covered by the existing $A$.
• The redundant archive members with higher solution density are discarded if $A$ is fully occupied.
The procedures used to remove the redundant archive members with higher solution density when the external archive $A$ is fully occupied are explained as follows. Suppose that $H$ is the total number of occupied hypercubes in $A$ and $\kappa_h$ is the number of archive members that exist in each $h$-th occupied hypercube, where $h = 1, \ldots, H$. Let $B_h$ be the probability of each $h$-th occupied hypercube being selected with the roulette-wheel method to discard one of its archive members. Then $B_h$ is defined by Eq. (13), where $\gamma > 1$ is a constant. As shown in Eq. (13), an $h$-th occupied hypercube with a higher density value $\kappa_h$ is assigned a larger selection probability, implying a higher tendency to be chosen to randomly discard one of its archive members. On the other hand, an $h$-th occupied hypercube with a lower density value $\kappa_h$ is assigned a lower selection probability, hence it has a better chance of preserving its archive members.
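The archive rules listed above can be sketched as one update function. This is a simplified illustration under stated assumptions: all objectives are minimized, the hypercube of a member is supplied by a caller-provided function `cell_of` (hypothetical, standing in for the adaptive grid), and the removal weight $\kappa_h^{\gamma}$ is only an assumed form of Eq. (13), which is not reproduced in this text.

```python
import random


def dominates(f_a, f_b):
    """True if f_a Pareto-dominates f_b (all objectives minimized)."""
    return all(a <= b for a, b in zip(f_a, f_b)) and any(
        a < b for a, b in zip(f_a, f_b)
    )


def update_archive(archive, candidate, capacity, cell_of, gamma=2.0, rng=random):
    """Return the archive (list of objective vectors) after offering candidate."""
    # Rule 1: reject a candidate dominated by any archive member.
    if any(dominates(member, candidate) for member in archive):
        return list(archive)
    # Rule 2: remove the members that the candidate dominates, then add it.
    kept = [member for member in archive if not dominates(candidate, member)]
    kept.append(candidate)
    # Rule 5: while over capacity, discard from crowded hypercubes.
    while len(kept) > capacity:
        cells = {}
        for i, member in enumerate(kept):
            cells.setdefault(cell_of(member), []).append(i)
        keys = list(cells)
        # ASSUMPTION: denser cells are chosen with weight kappa_h ** gamma.
        weights = [len(cells[k]) ** gamma for k in keys]
        crowded = rng.choices(keys, weights=weights, k=1)[0]
        kept.pop(rng.choice(cells[crowded]))
    return kept


grid = lambda f: (int(f[0]), int(f[1]))   # hypothetical unit-width grid
archive = [(1.0, 5.0), (2.0, 4.0)]
print(update_archive(archive, (0.5, 4.5), capacity=4, cell_of=grid))
```

Adding the candidate first and truncating afterwards keeps the function short; the third rule (non-dominated candidate, archive not full) is covered by the same path because nothing is removed in that case.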

6) FUZZY DECISION MAKER (FDM)
A set of Pareto optimal solutions is obtained at the end of the optimization process and stored in the external archive $A$. In order to satisfy all optimization goals, the process planner is required to select the most appropriate Pareto optimal solution. The weightage value $w_m$ indicates the relative significance level of each $m$-th objective function, which is referred to by the process planner to identify the most desirable Pareto optimal solution. Meanwhile, $\mu_a$ is defined as the total optimality degree corresponding to each $a$-th Pareto optimal solution of $A$. Considering the relative significance levels of all optimization objectives, an $a$-th Pareto optimal solution with a larger value of $\mu_a$ produces better optimization results when solving MOPs, and vice versa. As shown in Figure 8, the $a$-th Pareto optimal solution with the largest $\mu_a$ value is selected from $A$ as the most desirable Pareto optimal solution $X^{preferred}$.
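The selection of $X^{preferred}$ can be sketched with a commonly used fuzzy decision maker. The membership functions of the paper are not reproduced in this text, so this sketch assumes a linear membership that equals 1 at the best (smallest) value of an objective and 0 at its worst value on the front, combined by the weights $w_m$ and normalized over the archive; all objectives are treated as minimized and the function name is hypothetical.

```python
def fuzzy_decision(front, weights):
    """Return (index of preferred solution, list of optimality degrees mu_a)."""
    n_obj = len(weights)
    f_min = [min(f[m] for f in front) for m in range(n_obj)]
    f_max = [max(f[m] for f in front) for m in range(n_obj)]

    def membership(value, m):
        # ASSUMPTION: linear membership between the worst and best values.
        span = f_max[m] - f_min[m]
        return 1.0 if span == 0 else (f_max[m] - value) / span

    raw = [
        sum(weights[m] * membership(f[m], m) for m in range(n_obj))
        for f in front
    ]
    total = sum(raw) or 1.0
    mu = [r / total for r in raw]
    best = max(range(len(front)), key=mu.__getitem__)
    return best, mu


# Hypothetical (R_a, -MRR) front with equal weights.
front = [(1.0, -20.0), (1.1, -23.0), (1.4, -24.0)]
best, mu = fuzzy_decision(front, (0.5, 0.5))
print(best, [round(v, 3) for v in mu])
```

With equal weights the compromise solution in the middle of this sample front is preferred, whereas a weight vector such as `(1.0, 0.0)` would pick the solution with the lowest roughness.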

7) COMPLETE MMTLBO-RLS
The refined teaching and learning framework employed by MMTLBO-RLS is presented in Figure 9. Accordingly, the maximum number of fitness evaluations (FEs) is utilized as the termination criterion of MMTLBO-RLS, and a counter variable $\gamma$ is defined to trace the FEs consumed. The initial population $P$ with $N$ learners is first randomly produced from the uniform distribution at the beginning of the optimization process. After evaluating the $M$ objective function values of these $N$ learners, all Pareto optimal solutions are identified and stored in an external archive $A$. During the optimization process of MMTLBO-RLS, the new solution $X^{new}_n$ of every $n$-th learner is produced by the new learning schemes incorporated into the modified teacher or learner phases. The Pareto dominance concept is utilized to determine whether the new solution $X^{new}_n$ replaces the current solution $X_n$ in the next generation of MMTLBO-RLS. Furthermore, the archive controller is triggered at the end of each iteration to update the archive members of $A$. All these mechanisms are repeated until $\gamma$ exceeds the maximum FEs. Finally, the most favored Pareto optimal solution $X^{preferred}$ is determined from the external archive $A$ with the assistance of FDM by considering the relative significance levels of all optimization objectives specified by stakeholders.
The essential differences between the proposed algorithm and some previous methods are summarized as follows:
• The works in [9], [10] focused on applying various parameter adaptation strategies to adjust the exploration and exploitation strengths of TLBO variants. These approaches tend to introduce excessive numbers of control parameters that are difficult to tune. The works in [11]-[14], [23], [26] modified the neighborhood structures to adjust the information flow rate within the population. However, these approaches are prone to high computational complexity in dividing the main population into several subswarms. Our proposed work explores the idea of modifying learning strategies to achieve a performance gain for TLBO with lower computational complexity by leveraging the useful directional search information offered by other non-fittest learners.
• Although some existing works, such as those reported in [9], [15], [16], [24], [27], [36]-[38], proposed the modification of learning strategies to improve search performance, their innovations were restricted to a certain learning phase. On the other hand, the proposed MMTLBO-RLS makes substantial modifications to both the teacher phase and the learner phase to further refine its algorithmic framework, hence ensuring more realistic modelling of the teaching and learning process. These modifications enable better utilization of the useful directional information of predominant learners to attain a better balance of exploration and exploitation searches when solving challenging optimization problems.
• Although the works in [9], [10], [15]-[17], [19], [20], [23], [30], [36]-[39] proposed new learning mechanisms in the learner phase of TLBO, most of these modifications neglected the learning preferences of different learners or their interest in interacting with different peers to enhance their knowledge in different subjects. In the proposed algorithm, two different learning schemes with different search characteristics, known as self-motivated learning (SML) and interactive adaptive learning (IAL), are introduced to accommodate the different learning needs of learners in the classroom during the optimization process. Furthermore, the proposed IAL scheme can introduce different levels of exploration and exploitation strengths based on the interest level of a given learner in interacting with its surrounding peers. The inherent mechanisms used by both the SML and IAL schemes to balance the exploration and exploitation searches are expected to help the proposed MMTLBO-RLS solve complex problems more robustly.
• The multiobjective optimization problems reported in [30]-[35] were solved using the prior approach, which can only produce a unique optimum solution in each optimization run. This approach might not be feasible for real-world problems because the requirements of stakeholders might change with time, so the optimization process would need to be repeated multiple times. In contrast, the proposed MMTLBO-RLS is a posterior approach that is able to generate a set of non-dominated Pareto optimal solutions in a single run. A unique optimum solution can then be chosen from the Pareto solution set by using FDM based on the latest requirements of customers. Therefore, the proposed approach is considered more feasible for realistic scenarios that involve constant changes in stakeholders' requirements.

IV. PERFORMANCE STUDY IN MACHINING PROBLEM

A. MULTI-OBJECTIVE OPTIMIZATION
The proposed MMTLBO-RLS was used to optimize the cutting parameters for turning of PEEK material. Considering the importance of both response parameters, equal weightages of $w_1 = w_2 = 0.5$ were used to predict the optimum machining parameters in response to maximum surface finish and maximum material removal rate, where $w_1 + w_2 = 1$. The dominance of the responses in the given solution space was observed. This resulted in the optimum cutting parameters of $V_c$ = 155 m/minute, $f$ = 0.2 mm/rev and $a_p$ = 0.66 mm, with the predicted $R_a$ = 1.1042 µm and MRR = 22.8991 cm$^3$/minute (Figure 10).
The predicted optimum cutting parameters were further used to conduct validation experiments. These resulted in a surface roughness of 1.13 µm and a material removal rate of 22.2832 cm$^3$/minute for the same optimum cutting condition. Table 3 presents the predicted results, the validation experimental results and the corresponding error rates. Referring to the validation results obtained, small deviations of 2.28% in $R_a$ and 2.69% in MRR are observed.
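The reported deviations can be checked with a short calculation. The sketch below computes the absolute percentage deviation between a predicted and a measured value; which of the two serves as the reference in Table 3 is not stated in this text, so the `reference` argument is an assumption. With the values quoted above, the 2.28% figure for $R_a$ is reproduced when the measured value is the reference, and the 2.69% figure for MRR is reproduced when the predicted value is the reference.

```python
def percent_deviation(predicted, measured, reference="measured"):
    """Absolute percentage deviation relative to the chosen reference value."""
    base = measured if reference == "measured" else predicted
    return abs(predicted - measured) / abs(base) * 100.0


# Values quoted in the text: predicted vs. validation experiment.
ra_dev = percent_deviation(1.1042, 1.13, reference="measured")
mrr_dev = percent_deviation(22.8991, 22.2832, reference="predicted")
print(f"R_a deviation: {ra_dev:.2f}%  MRR deviation: {mrr_dev:.2f}%")
```

If the measured MRR is used as the reference instead, the deviation evaluates to about 2.76%, so stating the reference explicitly avoids ambiguity when the error rates are compared.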

B. SINGLE-OBJECTIVE OPTIMIZATION
Furthermore, the capability of MMTLBO-RLS to optimize each individual objective function related to turning of PEEK was investigated separately. Figs. 11(a) and 11(b) present the convergence curves obtained by the proposed MMTLBO-RLS in minimizing $R_a$ and maximizing MRR, respectively. It is observed that MMTLBO-RLS converges rapidly towards the minimum value of $R_a$ and the maximum value of MRR in fewer than 1,000 fitness evaluations, indicating good accuracy and efficiency of the proposed algorithm. The predicted single-response optimum parameters were further validated experimentally. It is also noteworthy that the proposed algorithm identified a better combination of machining parameters than those used in the initial experiments. From these observations, it is concluded that MMTLBO-RLS delivers promising performance in predicting the best cutting parameters for turning of PEEK parts.

V. EVALUATION PERFORMANCE ANALYSIS OF MMTLBO-RLS IN MACHINING OPTIMIZATION
The competence of the MMTLBO-RLS algorithm was evaluated by comparing its performance with that of existing classical and metaheuristic search based optimization algorithms. The seven selected algorithms are: MOPSO [48], NSGA-II [49], MOGWO [50], MOTLBO [36], MO-ITLBO [26], NSTLBO [38] and multi-objective sequential quadratic programming (MOSQP) [51]. The same dataset for turning of PEEK material was used, and the simulation results produced by all optimization methods were evaluated thoroughly using qualitative and quantitative analyses.

A. PERFORMANCE INDICATORS
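Two performance indicators were used to evaluate the capabilities of all compared algorithms in solving the current optimization problem. The first indicator is the coverage operator, which differentiates the quality of two Pareto fronts by counting the dominated solution members. Assume that $F_1^A$ and $F_1^B$ represent the Pareto fronts of two compared algorithms. Then, the coverage of $F_1^A$ over $F_1^B$ is the fraction of members of $F_1^B$ that are dominated by at least one member of $F_1^A$:

$$C(F_1^A, F_1^B) = \frac{\left|\{\, b \in F_1^B : \exists\, a \in F_1^A,\ a \prec b \,\}\right|}{|F_1^B|}$$

In general, $C(F_1^A, F_1^B) \neq 1 - C(F_1^B, F_1^A)$, and these two analysis results should be presented explicitly for the sake of better clarity.

The spacing metric is the other performance indicator adopted, and it is used to measure the diversity of a Pareto front. Let $|F_1|$ refer to the total number of Pareto optimal solutions in $F_1$, and let $d_a$ be the minimum Euclidean distance measured between the $a$-th and any other $b$-th Pareto solution stored in $F_1$, where $f^a$ denotes the objective vector of the $a$-th solution:

$$d_a = \min_{b \neq a} \lVert f^a - f^b \rVert_2$$

The mean Euclidean distance $\bar{d}$ can be determined as:

$$\bar{d} = \frac{1}{|F_1|} \sum_{a=1}^{|F_1|} d_a$$

Referring to the values of $|F_1|$, $\bar{d}$ and $d_a$, the spacing metric S is finally computed as:

$$S = \sqrt{\frac{1}{|F_1| - 1} \sum_{a=1}^{|F_1|} \left(\bar{d} - d_a\right)^2}$$

A smaller value of S indicates more evenly distributed non-dominated solutions in the Pareto front. All the non-dominated solutions in the Pareto front are equidistant from one another when S = 0.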
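
As a minimal sketch of how the two indicators behave, the snippet below computes a coverage value and a spacing value for small hand-made fronts. It assumes the commonly used forms of both metrics (coverage as the fraction of dominated members, and spacing as the sample standard deviation of nearest-neighbor distances with a $|F_1| - 1$ normalization); the exact normalization used in the study is an assumption here, all objectives are taken as minimized, and the function names and data are illustrative.

```python
import math
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def coverage(front_a: List[Sequence[float]], front_b: List[Sequence[float]]) -> float:
    """Fraction of front_b members dominated by at least one member of front_a."""
    dominated = sum(1 for b in front_b if any(dominates(a, b) for a in front_a))
    return dominated / len(front_b)


def spacing(front: List[Sequence[float]]) -> float:
    """Spread of nearest-neighbor Euclidean distances (assumed |F1| - 1 normalization)."""
    d = [
        min(math.dist(p, q) for j, q in enumerate(front) if j != i)
        for i, p in enumerate(front)
    ]
    d_mean = sum(d) / len(d)
    return math.sqrt(sum((d_mean - x) ** 2 for x in d) / (len(d) - 1))


# Hypothetical two-objective fronts.
front_a = [(1.0, 4.0), (2.0, 3.0), (3.0, 2.0), (4.0, 1.0)]
front_b = [(1.5, 4.5), (2.5, 2.5), (4.5, 1.5), (0.5, 5.0)]

c_ab = coverage(front_a, front_b)  # 0.5: two of the four members of front_b are dominated
c_ba = coverage(front_b, front_a)  # 0.0: no member of front_a is dominated
s_a = spacing(front_a)             # close to 0: members of front_a are equally spaced
print(c_ab, c_ba, s_a)
```

The example also shows why both coverage directions are reported: here `c_ab` is 0.5 while `c_ba` is 0, so one value cannot be derived from the other.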

B. PARAMETER SETTINGS OF ALL ALGORITHMS
The multiobjective optimization algorithms NSGA-II, MOTLBO and NSTLBO compare solution qualities by leveraging the concepts of crowding distance and non-dominated sorting. On the other hand, the external archive concept is adopted by MO-ITLBO, MOGWO, MOPSO and the proposed MMTLBO-RLS to preserve the Pareto-optimal solutions. The parameter settings of all compared multiobjective optimization algorithms were taken from the respective articles and are shown in Table 5. For instance, the parameters α and nGrid used to construct the external archives of MOPSO and MOGWO were set to 0.1 and 10, respectively. The probability of triggering the mutation operation in both MOPSO and NSGA-II is $P_{mut} = 1/D$. The inertia weight ω of MOPSO was set to decrease from 0.9 to 0.4, and the two fixed acceleration coefficients $c_1$ and $c_2$ were set to 2.05. In addition, NSGA-II was assigned a crossover probability of $P_{cr} = 1/D$. MO-ITLBO has a subswarm size of nGroup = 4 tuned for its teacher phase to achieve better diversity preservation, and its external archive was constructed using the parameter value ε = 0.007. Teaching factors of $T_f \in \{1, 2\}$ were applied in NSTLBO. In contrast, teaching factors were uniformly distributed between 1 and 2 for the other TLBO variants (i.e., MOTLBO and MO-ITLBO). MOSQP has no algorithm-specific control parameters, and its source code is publicly shared in [51]. Finally, the essential parameter settings of the proposed MMTLBO-RLS were $P_{SML} = 1/D$, α = 0.1, nGrid = 10 and $T_f \in [1, 2]$, as presented in Table 5. Three different population sizes, N = 20, 30 and 40, were considered in the current performance analyses. The external archive size was set equal to the population size, i.e., |A| = N, in MOPSO, MOGWO, MO-ITLBO and MMTLBO-RLS.
To ensure that all multiobjective optimization algorithms are fairly compared, the same termination criterion of 50,000 fitness evaluations was used. An Intel® Core i7-7500 CPU @ 2.70 GHz computer with Matlab 2020a software was used in all simulations. Each algorithm was executed 20 times and the average results were taken for the comparison analysis.
The mean ($\bar{x}$) and standard deviation (σ) of the coverage metrics of MMTLBO-RLS and the seven other peer algorithms for different population sizes are summarized in Table 6.  when N = 20, 30 and 40, respectively. On the other hand, no more than 0.1% of the Pareto optimal solutions generated by the proposed MMTLBO-RLS are dominated by the Pareto fronts of MOTLBO for any population size.
Considering the results shown in Table 6, it can be concluded that the Pareto fronts produced by MMTLBO-RLS are the best, owing to the greater percentage of non-dominated solutions found. The promising quality of the Pareto fronts obtained by MMTLBO-RLS is evidence that the new learning schemes introduced in the teaching-learning process can utilize the promising direction information offered by the predominant learners to achieve more effective searching. This confirms the better use of exploration and exploitation searches through the refined algorithmic framework of MMTLBO-RLS, which enhances its overall search accuracy. Table 7 presents $\bar{x}$ and σ of the spacing metric (S) of the Pareto fronts produced by all algorithms for N = 20, 30 and 40. Notably, MMTLBO-RLS is the best performing algorithm, as it generates the lowest S values for all population sizes. In other words, the Pareto front members produced by MMTLBO-RLS are the most uniformly distributed among all compared multi-objective optimization algorithms. Both NSTLBO and MOSQP are observed to deliver the worst performances in solving the current machining problem. The Pareto fronts obtained from these two algorithms for any population size not only have the highest numbers of dominated solution members but also have the poorest distributions. The replicated non-dominated solutions led to a better C yet the worst S. In contrast, NSGA-II produced more inferior solutions in its Pareto fronts, but it demonstrated better distribution for most population sizes, as indicated by the relatively low $\bar{x}$ value of S obtained. These observations imply that the population of NSGA-II tends to suffer from rapid diversity loss and that most of the Pareto front members obtained are likely to be stuck in local Pareto front regions despite their uniform distribution.
In contrast to the drastic performances of MOTLBO and NSGA-II, the optimization results achieved by the proposed MMTLBO-RLS in solving the multi-response PEEK machining problem are more appealing due to its ability to generate Pareto front members that are not only well distributed but also of good solution quality. The unique teacher selection and weighted mean position concepts incorporated into the modified teacher phase offer additional momentum to each MMTLBO-RLS learner in exploring more diverse areas of the solution space, producing a Pareto front with better distribution. On the other hand, the self-motivated learning and interactive adaptive learning schemes proposed in the modified learner phase help the population members escape from local Pareto front regions and enhance the learning efficiency of the algorithm through more effective information exchange between learners, respectively. Figure 12 compares the Pareto fronts of the other multiobjective optimization algorithms with that of MMTLBO-RLS in handling the multi-response PEEK machining problem when the population size is set as N = 40. In general, the qualitative results in terms of the Pareto fronts presented in Figure 12 show good consistency with the quantitative results in terms of the coverage operator and spacing metric reported in Tables 6 and 7, respectively. Among all compared algorithms, NSTLBO delivers the worst performance in predicting the best combination of process parameters that leads to the optimal machining conditions of PEEK. Notable premature convergence can be observed in NSTLBO, as indicated by its poorly constructed Pareto front that fails to approximate the true Pareto front. Similarly, the Pareto front of MOSQP for the multi-response PEEK machining problem is also poorly constructed.
When compared with the other algorithms, it is evident that the majority of the Pareto front members of MOSQP have not properly converged to the promising solution regions, and this undesirable characteristic results in the high number of dominated solutions and the uneven distribution of the Pareto front of MOSQP. Although the Pareto fronts of NSGA-II and MOGWO better approximate the true Pareto front of the multi-response PEEK machining problem, these two algorithms produce fewer Pareto optimal solutions at the boundary regions of the solution space than MMTLBO-RLS, implying that they have limited exploration strength to discover new solution regions in the search space. While MOPSO, MOTLBO and MO-ITLBO can produce more solution members in the large surface roughness regions, the Pareto front members produced by these algorithms are less uniformly distributed in the lower surface roughness regions than those of MMTLBO-RLS. Similar observations can be made for the low surface roughness regions of the Pareto fronts obtained by MOGWO and NSGA-II. Among all compared algorithms, the proposed MMTLBO-RLS demonstrates the most promising performance because of its ability to generate a Pareto front that is better in terms of both the quality and the diversity of its solution members. The unique teacher and mean position concepts incorporated into the modified teacher phase serve as a diversity enhancement scheme that provides additional momentum for all MMTLBO-RLS learners to search more effectively around the unvisited regions of the search space. Meanwhile, the self-motivated learning incorporated in the modified learner phase enhances the robustness of a learner against misleading search information provided by inferior learners. The interactive adaptive learning promotes more effective searching through rigorous information exchange with multiple peer learners.
Effective use of the exploration and exploitation searches is eventually accomplished through the refined algorithmic framework of MMTLBO-RLS, enabling the proposed algorithm to solve MOPs with better performance.

VI. PERFORMANCE ANALYSIS OF MMTLBO-RLS IN CEC 2009 MULTIOBJECTIVE BENCHMARK FUNCTIONS

A. BENCHMARK FUNCTIONS AND PERFORMANCE METRICS
Apart from multi-response PEEK machining problem, the optimization performances of proposed MMTLBO-RLS are further analyzed with 23 multiobjective benchmark functions with different characteristics designed for CEC 2009 Special Session and Competition [52]. In particular, 13 and 10 out of these 23 CEC 2009 benchmark functions are formulated as the unconstrained (i.e., UF1 to UF13) and constrained (i.e., CF1 to CF10) multiobjective optimization problems, respectively. The benchmark functions of UF1 to UF7 and CF1 to CF7 have two objective functions; UF8 to UF10 and CF8 to CF10 have three objective functions; UF11 to UF13 have five objective functions. As for the ten constrained multiobjective functions, CF1 to CF5 and CF8 to CF10 have one inequality constraint, whereas both of CF6 and CF7 have two inequality constraints. More detailed mathematical formulations of these CEC 2009 multiobjective benchmark functions are provided in [52].
For all CEC 2009 multiobjective benchmark functions, the true Pareto fronts are provided for comparison with the approximated Pareto fronts found by the selected algorithms. Therefore, a performance metric known as inverted generational distance (IGD) [53] is introduced in this section to evaluate the quality of the approximated Pareto front produced by each selected algorithm in terms of accuracy (i.e., closeness to the true Pareto front) and diversity (i.e., uniformity of the non-dominated solution set). Let A be the approximated Pareto front produced by a selected algorithm in solving a given CEC 2009 multiobjective benchmark function whose true Pareto front TP contains a total of |TP| solution members. Let $d(TP_i, A)$ be an operator that finds the shortest Euclidean distance between the i-th member of the true Pareto front TP and the approximated Pareto front A. Then, the IGD value associated with TP and A can be calculated as [53]:

$$IGD(TP, A) = \frac{1}{|TP|} \sum_{i=1}^{|TP|} d(TP_i, A) \qquad (20)$$

Without loss of generality, the accuracy and diversity of the approximated Pareto front A can be measured simultaneously using the IGD value in Eq. (20) if |TP| is large enough to represent the true Pareto front. It is more desirable for an algorithm to produce an approximated Pareto front with a smaller IGD value because it implies that the solution set A is more uniformly distributed and closer to TP.
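
The following minimal sketch evaluates IGD as the average, over the members of the true front, of the shortest Euclidean distance to the approximated front. The three-point "true" front and the approximations are hypothetical and serve only to show that IGD penalizes both a shifted front (poor accuracy) and a front that covers only one region (poor diversity).

```python
import math
from typing import List, Sequence


def igd(true_front: List[Sequence[float]], approx_front: List[Sequence[float]]) -> float:
    """Mean shortest Euclidean distance from each true-front member to the approximation."""
    total = sum(min(math.dist(t, a) for a in approx_front) for t in true_front)
    return total / len(true_front)


# Hypothetical sampled true Pareto front and three approximations of it.
tp = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
a_exact = list(tp)                          # coincides with the true front
a_shifted = [(x + 0.1, y) for x, y in tp]   # well spread but offset by 0.1
a_partial = [(0.0, 1.0)]                    # accurate but covers a single point

igd_exact = igd(tp, a_exact)
igd_shifted = igd(tp, a_shifted)
igd_partial = igd(tp, a_partial)
print(igd_exact, igd_shifted, igd_partial)
```

Although every member of `a_partial` lies exactly on the true front, its IGD is the largest of the three because two of the three true-front members have no nearby approximation.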
A nonparametric statistical analysis known as the Wilcoxon signed rank test [54], [55] is also used to perform a rigorous pairwise comparison between MMTLBO-RLS and its peer algorithms in terms of IGD values. In particular, the Wilcoxon signed rank test was conducted at the significance level of σ = 0.05 and the results are reported as $R^+$, $R^-$, h and p values. $R^+$ and $R^-$ represent the sums of ranks over the benchmark functions on which MMTLBO-RLS outperforms and underperforms its peer algorithm, respectively. The h values of ''+'', ''='' and ''−'' imply that MMTLBO-RLS is statistically better than, not significantly different from and statistically worse than its peer algorithm, respectively. The p value in the Wilcoxon signed rank test indicates the minimum level of significance at which the performance difference between MMTLBO-RLS and its peer algorithm can be detected [54], [55]. If the p value obtained is smaller than the predefined threshold σ, the better results obtained by the better performing algorithm are statistically significant rather than the outcome of random chance.
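
The rank sums $R^+$ and $R^-$ can be sketched as follows. The snippet ranks the absolute paired differences, uses average ranks for tied magnitudes, and discards zero differences; the handling of zero differences is an assumption of this sketch (another common convention splits their ranks evenly between the two sums). The input values are hypothetical IGD means scaled to integers, and the p value is not computed here; a library routine such as `scipy.stats.wilcoxon` can supply it.

```python
from typing import List, Tuple


def signed_rank_sums(proposed: List[float], peer: List[float]) -> Tuple[float, float]:
    """Return (R+, R-) for paired results where a lower value is better.

    A positive difference (peer - proposed) means the proposed algorithm wins.
    Zero differences are discarded (assumed convention).
    """
    diffs = [q - p for p, q in zip(proposed, peer) if q != p]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Extend j over the block of tied absolute differences.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    r_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    r_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return r_plus, r_minus


# Hypothetical mean IGD values (x1000) on five benchmark functions.
proposed_igd = [10, 20, 50, 30, 40]
peer_igd = [30, 20, 40, 70, 90]
r_plus, r_minus = signed_rank_sums(proposed_igd, peer_igd)
print(r_plus, r_minus)  # 9.0 1.0
```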
The control parameters of all compared algorithms are set based on the recommendations provided in their original publications and are summarized in Table 8. Different population sizes are set for all selected algorithms based on the number of objective functions in each benchmark function. According to [52], the population size is set as N = 100 for two-objective problems, N = 150 for three-objective problems and N = 800 for five-objective problems. The same maximum number of fitness evaluations, 300,000, is set for all compared algorithms when solving each CEC 2009 multiobjective benchmark function [52] to ensure the fairness of comparison. It is noteworthy that the original source codes of NSMTLBO, MOMVO, MOFEPSO, MOGOA, MOHS, MOCS and SHAMODE were obtained from their respective authors to ensure that the simulation results produced in the current study are equivalent to those already published. Meanwhile, the simulation results of DMOEA-DD and DECMOSA-SQP are directly extracted from their original papers for fair performance comparison. All compared algorithms are simulated for 30 independent runs on each benchmark function using the same Intel® Core i7-7500 CPU @ 2.70 GHz computer with Matlab 2020a software, and the average results are subsequently calculated for performance comparison and post hoc statistical analyses.

C. COMPARISON OF SIMULATION RESULTS IN CEC 2009 MULTIOBJECTIVE BENCHMARK FUNCTIONS
The optimization performances of the proposed MMTLBO-RLS and its peer algorithms in solving all CEC 2009 multiobjective benchmark functions are evaluated in terms of the mean IGD ($IGD_{mean}$) and standard deviation (SD). All simulation results are presented in Table 9, where the best and second-best results (i.e., the lowest and second lowest $IGD_{mean}$) for each benchmark function are highlighted in boldface and underlined text, respectively. Referring to the $IGD_{mean}$ values, the numbers of benchmark functions on which the proposed MMTLBO-RLS wins over, ties with and loses to a peer algorithm are reported as w, t and l, respectively, so that these performance comparisons can be summarized as w/t/l. A metric of #BR is also introduced to indicate the number of benchmark functions that a compared algorithm solves with the best (i.e., lowest) $IGD_{mean}$ value.
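
The w/t/l summary and the #BR count can be sketched on a small table of mean IGD values. The algorithm labels and numbers below are hypothetical placeholders rather than entries of Table 9, and the sketch assumes that a tie for the lowest value counts towards #BR for every algorithm that attains it.

```python
from typing import Dict, List, Tuple


def win_tie_loss(proposed: List[float], peer: List[float]) -> Tuple[int, int, int]:
    """Count functions where the proposed method has lower, equal, or higher mean IGD."""
    w = sum(1 for p, q in zip(proposed, peer) if p < q)
    t = sum(1 for p, q in zip(proposed, peer) if p == q)
    l = sum(1 for p, q in zip(proposed, peer) if p > q)
    return w, t, l


def best_result_counts(table: Dict[str, List[float]]) -> Dict[str, int]:
    """#BR: number of functions on which each algorithm attains the lowest mean IGD."""
    n_functions = len(next(iter(table.values())))
    counts = {name: 0 for name in table}
    for f in range(n_functions):
        best = min(values[f] for values in table.values())
        for name, values in table.items():
            if values[f] == best:  # ties count for every algorithm (assumption)
                counts[name] += 1
    return counts


# Hypothetical mean IGD values on four benchmark functions.
igd_table = {
    "proposed": [0.02, 0.05, 0.01, 0.04],
    "peer_1": [0.03, 0.05, 0.02, 0.03],
    "peer_2": [0.01, 0.06, 0.03, 0.05],
}
wtl_peer_1 = win_tie_loss(igd_table["proposed"], igd_table["peer_1"])  # (2, 1, 1)
wtl_peer_2 = win_tie_loss(igd_table["proposed"], igd_table["peer_2"])  # (3, 0, 1)
br = best_result_counts(igd_table)
print(wtl_peer_1, wtl_peer_2, br)
```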
On the basis of the simulation results reported in Table 9, MMTLBO-RLS exhibits the most competitive performance among all compared algorithms because it solves the 23 benchmark functions with 11 best and 1 second-best $IGD_{mean}$ values. It is also noteworthy that the proposed MMTLBO-RLS solves all of the unconstrained and constrained three-objective benchmark functions (i.e., UF8 to UF10 and CF8 to CF10) with the best $IGD_{mean}$ values. These compelling results imply that the Pareto fronts produced by MMTLBO-RLS for the majority of the CEC 2009 multiobjective benchmark functions are evenly distributed and closely approximate the corresponding true Pareto fronts. The competitive simulation results produced by MMTLBO-RLS in Table 9 are consistent with the Pareto fronts presented in Fig. 13, where the true Pareto fronts of the selected benchmark functions are plotted as red lines and the approximated Pareto fronts produced by MMTLBO-RLS are indicated with blue diamonds. MMTLBO-RLS is followed by MOFEPSO and NSTLBO as the second and third best performing algorithms, respectively. In particular, MOFEPSO solves the CEC 2009 multiobjective benchmark functions with 5 best and 3 second-best $IGD_{mean}$ values. Despite performing well on the unconstrained five-objective and constrained two-objective benchmark functions, MOFEPSO shows relatively poor performance on the other types of CEC 2009 multiobjective benchmark functions, especially those with three objectives. NSTLBO solves the CEC 2009 multiobjective benchmark functions with 3 best $IGD_{mean}$ values (i.e., UF2, UF4 and CF1) and 5 second-best $IGD_{mean}$ values (i.e., UF5, UF6, UF8, UF11 and CF9).
MOCS is observed to perform relatively well on certain types of problems, producing the best $IGD_{mean}$ values for UF3 and UF13 as well as the second-best $IGD_{mean}$ values for UF1, UF2 and UF4. Although NSMTLBO only produces 1 best $IGD_{mean}$ value, for UF3, Table 9 reveals that it solves another 5 benchmark functions with the second-best $IGD_{mean}$ value. On the other hand, SHAMODE is observed to perform well only on the unconstrained two-objective benchmark functions despite producing 4 best and 2 second-best $IGD_{mean}$ values. Notable performance degradation of SHAMODE is observed when it is used to solve benchmark functions with constraints or with more than two objective functions. Other compared algorithms such as MOMVO, MOGOA, DMOEA-DD, MOHS and MOH are observed to deliver relatively inferior performances when solving the CEC 2009 multiobjective benchmark functions. The majority of the $IGD_{mean}$ values produced by these five peer algorithms are larger (i.e., worse) than those of the proposed MMTLBO-RLS. From Table 9, DECMOSA-SQP is the worst algorithm because it does not solve any CEC 2009 multiobjective benchmark function with the best or second-best value. The proposed MMTLBO-RLS also outperforms DECMOSA-SQP in terms of $IGD_{mean}$ on all benchmark functions. Table 10 presents the pairwise comparison results between MMTLBO-RLS and its ten competitors obtained through the Wilcoxon signed rank test in terms of the sums of ranks (i.e., $R^+$ and $R^-$), p values and h values. The significant performance improvement achieved by the proposed MMTLBO-RLS over NSMTLBO, MOMVO, MOGOA, SHAMODE, DMOEA-DD, DECMOSA-SQP, MOHS, MOCS and NSTLBO is confirmed through the independent pairwise comparisons because the p values reported by the Wilcoxon signed rank test are less than the predefined threshold of σ = 0.05.
Meanwhile, the Wilcoxon signed rank test reveals no significant performance difference between MMTLBO-RLS and MOFEPSO. Referring to the nonparametric statistical analysis results in Table 10, it is concluded that the proposed MMTLBO-RLS outperforms most of its competitors significantly from the statistical point of view when solving the CEC 2009 multiobjective benchmark functions.

VII. DISCUSSIONS
Referring to the extensive simulation studies performed in the previous sections, the proposed MMTLBO-RLS is shown to outperform the other well-established TLBO and non-TLBO based multiobjective optimization algorithms when solving diverse types of multiobjective problems, including the multi-response PEEK machining problem formulated in the current study. In contrast to the selected peer algorithms, which perform well only on certain types of multiobjective optimization problems, the proposed MMTLBO-RLS produces approximated Pareto fronts with better solution quality and more uniformly distributed solution sets for the majority of the multiobjective benchmark functions and the multi-response machining problem. The competitive performance exhibited by MMTLBO-RLS over its competitors can be attributed to the two major modifications introduced, namely: (a) the modified teacher phase and (b) the modified learner phase. The novel mechanisms introduced into both the teacher and learner phases of MMTLBO-RLS enable a more accurate representation of the real-world teaching and learning paradigm and attain a better balance of the exploration and exploitation searches of the algorithm during the optimization process.
For the modified teacher phase of MMTLBO-RLS, two main contributions, namely the selection scheme of a unique teacher and the formulation of a unique mean position for each learner, are introduced to further enhance the diversity level of the population. In particular, the proposed teacher selection scheme enables each learner to select a different Pareto optimal solution stored in an external archive A to guide its search process. This mechanism ensures that the useful information carried by each non-dominated solution in A can be leveraged to locate the true Pareto front of a given multiobjective optimization problem. Meanwhile, the formulation of a unique mean position for each learner by referring to all non-dominated solutions in A allows each learner to adjust its search trajectory in a unique manner by fully utilizing the promising information of the current non-dominated solution set in A. By incorporating these two mechanisms into the modified teacher phase, unique directional information can be generated for each MMTLBO-RLS learner to perform the search process without suffering from rapid loss of population diversity. In other words, the derivation of this unique directional information is effective in preventing the convergence of the MMTLBO-RLS population towards a local Pareto front, which tends to deliver poor optimization results.
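
The idea of a per-learner teacher and a per-learner mean position can be sketched as below. This is only an illustration built on the conventional TLBO teacher update, $X_{new} = X + r\,(X_{teacher} - T_f X_{mean})$; the uniform random choice of the teacher from the archive, the random convex weights used for the mean position, the bound clipping and all names (`modified_teacher_step`, `archive`, `lower`, `upper`) are assumptions of this sketch and not the selection and weighting rules of MMTLBO-RLS.

```python
import random
from typing import List


def modified_teacher_step(
    learner: List[float],
    archive: List[List[float]],
    lower: List[float],
    upper: List[float],
    rng: random.Random,
) -> List[float]:
    """One illustrative teacher-phase move for a single learner."""
    # Assumption: each learner draws its own teacher uniformly from the archive.
    teacher = rng.choice(archive)
    # Assumption: the learner-specific mean is a randomly weighted average of the archive.
    weights = [rng.random() for _ in archive]
    total = sum(weights)
    mean = [
        sum(w * member[d] for w, member in zip(weights, archive)) / total
        for d in range(len(learner))
    ]
    t_f = rng.uniform(1.0, 2.0)  # teaching factor drawn from [1, 2]
    new_position = []
    for d, x in enumerate(learner):
        step = rng.random() * (teacher[d] - t_f * mean[d])
        new_position.append(min(max(x + step, lower[d]), upper[d]))  # clip to bounds
    return new_position


# Hypothetical three-dimensional decision space normalized to [0, 1].
lower_bounds = [0.0, 0.0, 0.0]
upper_bounds = [1.0, 1.0, 1.0]
archive_members = [[0.2, 0.8, 0.5], [0.6, 0.4, 0.1], [0.9, 0.1, 0.7]]
learner_position = [0.5, 0.5, 0.5]

moved = modified_teacher_step(
    learner_position, archive_members, lower_bounds, upper_bounds, random.Random(0)
)
print(moved)
```

Because the teacher and the weights are redrawn for every learner, two learners at the same position generally receive different search directions, which is the diversity effect the paragraph above describes.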
For the modified learner phase of MMTLBO-RLS, another two major contributions, known as self-motivated learning (SML) and interactive adaptive learning (IAL), are introduced to enhance the overall learning efficiency of learners. These modifications are mainly driven by the fact that different learners in a classroom tend to have different learning preferences; hence, different learning schemes need to be incorporated to serve their unique learning needs. This observation is aligned with the perspective of optimization, where learning strategies with different levels of exploration and exploitation strength are needed to tackle optimization problems with different types of fitness landscapes effectively. SML is introduced for learners that prefer to enhance their knowledge via personal effort instead of interacting with their peers. This is achieved by performing a stochastic perturbation on a randomly selected dimension of each learner. Essentially, SML is a learning strategy that promotes exploration search, and it plays an essential role in helping MMTLBO-RLS uncover the unvisited regions of the solution space. This explains the competitive performance of MMTLBO-RLS in identifying the non-dominated solutions located in the boundary regions of the Pareto fronts for most tested problems. On the other hand, IAL allows each learner to interact with multiple peer learners to exchange useful search information. In order to introduce different levels of exploration and exploitation strength in IAL, each learner is modelled to have a different level of interest in interacting with multiple peers when updating each of its dimensional components. For instance, a learner that has a stronger interest in interacting with its peers, and therefore updates most of its dimensional components through IAL, tends to have relatively stronger exploration strength.
On the other hand, some learners behave relatively more exploitatively due to their higher tendency to retain their original information in the majority of dimensional components. These search mechanisms attain a proper balance of the exploration and exploitation searches of the algorithm, enabling the proposed MMTLBO-RLS to solve different types of multiobjective optimization problems with more competitive performance.
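
The two learner-phase behaviors can be sketched as follows. The snippet is a loose illustration of the description above and not the update equations of MMTLBO-RLS: the uniform resampling used for the SML perturbation, the move towards a randomly chosen peer used for IAL, the random interest level, and the use of $1/D$ as the probability of triggering SML are all assumptions of this sketch, as are the function and variable names.

```python
import random
from typing import List


def self_motivated_learning(
    learner: List[float], lower: List[float], upper: List[float], rng: random.Random
) -> List[float]:
    """SML sketch: perturb one randomly selected dimension (assumed uniform resampling)."""
    new_position = list(learner)
    d = rng.randrange(len(new_position))
    new_position[d] = rng.uniform(lower[d], upper[d])
    return new_position


def interactive_adaptive_learning(
    learner: List[float], peers: List[List[float]], interest: float, rng: random.Random
) -> List[float]:
    """IAL sketch: each dimension is updated from a random peer with probability `interest`."""
    new_position = list(learner)
    for d in range(len(new_position)):
        if rng.random() < interest:
            peer = rng.choice(peers)
            # Assumed update: move part of the way towards the chosen peer.
            new_position[d] += rng.random() * (peer[d] - new_position[d])
        # Otherwise the learner retains its original information in this dimension.
    return new_position


def learner_step(
    learner: List[float],
    peers: List[List[float]],
    lower: List[float],
    upper: List[float],
    rng: random.Random,
) -> List[float]:
    """Choose SML with probability 1/D, otherwise IAL with a random interest level."""
    if rng.random() < 1.0 / len(learner):
        return self_motivated_learning(learner, lower, upper, rng)
    return interactive_adaptive_learning(learner, peers, rng.random(), rng)


# Hypothetical learner and peers in a normalized three-dimensional space.
lows, highs = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
student = [0.5, 0.5, 0.5]
classmates = [[0.1, 0.9, 0.3], [0.7, 0.2, 0.8]]

after_sml = self_motivated_learning(student, lows, highs, random.Random(1))
after_ial_low = interactive_adaptive_learning(student, classmates, 0.0, random.Random(1))
after_ial_high = interactive_adaptive_learning(student, classmates, 1.0, random.Random(1))
after_step = learner_step(student, classmates, lows, highs, random.Random(1))
```

With an interest level of 0 the learner keeps every component (purely exploitative in the sense described above), while an interest level of 1 updates every component from its peers.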

VIII. CONCLUSION
One of the main challenges of using TLBO to solve challenging real-world multiobjective optimization problems is the high tendency of premature convergence due to the rapid loss of population diversity. The modification of learning strategies by leveraging useful directional information from different TLBO learners is envisioned as a promising strategy to preserve population diversity without incurring excessive computational cost. A new TLBO variant, MMTLBO-RLS, is designed in this paper to solve various challenging multi-objective optimization problems, including the determination of the optimum turning parameters for machining PEEK parts based on a developed prediction model. The proposed MMTLBO-RLS algorithm predicted the parameters as follows. For improved product quality, $V_c$ = 155 m/minute, f = 0.2 mm/rev and ap = 0.25 mm is the best condition to obtain the minimum surface roughness of $R_a$ = 0.900 µm. For high production volume, $V_c$ = 95 rpm, f = 0.6 mm/rev and ap = 0.6 mm is the best condition to obtain the maximum material removal rate of MRR = 24.7884 cm³/minute. These predicted results deviate from the validation results by less than 1.99%. Meanwhile, when the industry aims for improved quality and quantity together, $V_c$ = 155 m/minute, f = 0.2 mm/rev and ap = 0.66 mm is the best condition to obtain $R_a$ and MRR of 1.1042 µm and 22.8991 cm³/minute, respectively. Comparing the predicted results with the validation results, small errors of 2.28% for $R_a$ and 2.69% for MRR are observed.
The proposed MMTLBO-RLS was further evaluated against its peer algorithms in solving the CEC 2009 multiobjective benchmark functions. It is found that the proposed algorithm demonstrates the best optimization performance because it generates approximated Pareto fronts that are not only uniformly distributed but also closer to the corresponding true Pareto fronts. The performance advantage of MMTLBO-RLS over its peer algorithms when solving the CEC 2009 multiobjective benchmark functions is also verified using non-parametric statistical analyses. Based on the experimental results obtained, it is concluded that the proposed MMTLBO-RLS is suitable for use in industry to produce parts from PEEK material. Turning operations that deliver the required quality and quantity can help companies avoid wastage and reduce the cost of production. Apart from the multi-response PEEK machining problem, the proposed MMTLBO-RLS has also exhibited its robustness in solving other multiobjective optimization problems with different characteristics.
KOON MENG ANG received the B.Eng. degree (Hons.) in mechatronic engineering from UCSI University, Malaysia, in 2019, where he is currently pursuing the Ph.D. degree. His current research interests include the development of modified swarm intelligence algorithms and their applications in solving various optimization problems, such as the training of machine learning models, neural architecture search of deep learning models, and multi-response machining optimization.

SANGEETHA ELANGO (Member, IEEE) received the M.E. degree in computer science engineering from Anna University, India. She is currently pursuing the Ph.D. degree with UTAR University, Malaysia. She has more than ten years of tertiary teaching experience. Her research interests include artificial intelligence and optimization. She is a member of IEEE/RAS Society.