Feature-extraction-based iterated algorithms to solve the unrelated parallel machine problem with periodic maintenance activities

This paper considers an unrelated parallel machine problem with job release times and maintenance activities, in which machines must periodically undergo maintenance because their status deteriorates due to job-induced dirt. The problem is inspired by a wet station for cleaning operations in a semiconductor manufacturing process. The objective is to minimize the makespan. Since the considered problem is proven to be NP-hard, obtaining optimal solutions within a reasonable computational time becomes impractical as the problem grows large. We develop specific feature-extraction procedures to recognize important information in a job sequence and linkage encoding (LE) procedures to generate new job sequences. These two procedures are embedded into an iterated algorithm, called a feature-extraction-based iterated algorithm (FEBIA), to obtain optimal or near-optimal solutions for the considered problem. To examine the performance of the FEBIA, it is compared with two population-based algorithms, the particle swarm optimization (PSO) algorithm and the genetic algorithm (GA), on a large set of test instances. The results reveal that the proposed FEBIA performs better than the two population-based algorithms, demonstrating its potential to solve the unrelated parallel machine problem with periodic maintenance and job release times.


I. INTRODUCTION
This paper considers the joint optimization of job scheduling and maintenance activities for unrelated parallel machines with dynamic job release times. In the literature, the majority of studies on joint job dispatching and maintenance planning examine single-machine configurations [1][2][3] and assume that all jobs are ready to be processed at the same time. However, unrelated parallel machine configurations are commonly encountered in manufacturing systems, such as semiconductor and electronics manufacturing, since a bank of parallel machines can relieve a bottleneck in the production line and improve system throughput. Additionally, assuming that the job release times are identical or equal to zero is not realistic in such dynamic manufacturing situations, especially in a semiconductor manufacturing system.
Semiconductor wafer fabrication consists of hundreds of operations, and the scheduling complexity is increased by specific process requirements such as sequence-dependent setups, machine maintenance, and batch processing. Some researchers have noted that when maintenance planning and production scheduling lack coordination, machines may remain idle waiting for engineering personnel to perform maintenance activities, even though jobs are ready to be processed [4][5]. Low throughput and inefficient system performance consequently occur. This work is inspired by the importance of integrating job scheduling and maintenance for a semiconductor manufacturing process, more specifically, cleaning operations. The cleaning operation is the stage of the semiconductor manufacturing process that removes dirt from wafer surfaces so that the wafers remain in good condition for the next process. Dirt is a catch-all term for any type of residue, such as the particles, organic material, and metal salts remaining in a machine during this cleaning operation. The amount of dirt increases as more (wafer) processing jobs are conducted. Once the amount of dirt exceeds a specific threshold value, it will damage the wafers during processing. Therefore, wet stations (machines) must be stopped and the cleaning agent must be changed periodically to keep the machines in good condition.
Our considered problem can be defined as follows. There are n jobs to be processed on m unrelated parallel machines. Each machine starts in its initial status, i.e., it contains no dirt, and has a positive cleaning time t and a dirt threshold T. In addition, after a cleaning activity, a machine returns to its initial status. Each job j has a positive processing time p_ij, a release time r_j, and an amount of dirt d_ij, where p_ij and d_ij are determined by the assigned machine i. Given a feasible schedule S of jobs, the amount of dirt in each machine must not exceed its dirt threshold. The objective is to find a schedule that minimizes the makespan. According to the standard machine scheduling classification [6], this problem is denoted as Rm | r_j, Σ d_ij ≤ T, fpa | C_max, where fpa in the second field means that the maintenance activity is flexible and occurs periodically [7]. Flexible maintenance is a type of maintenance in which the starting time of each maintenance activity must be determined during the production scheduling process. Various kinds of flexible maintenance activities and their practical requirements have been addressed in the literature. Mosheiov and Sarig [8] considered a single-machine problem where the machine has to complete exactly one maintenance activity prior to a given deadline, and they proposed a pseudopolynomial dynamic programming (DP) method and a heuristic to minimize the total weighted completion time. Lee and Chen [4] considered the parallel machine problem and proposed branch and bound algorithms to minimize the total weighted completion time for two cases: one in which machines can be maintained simultaneously and one in which multiple machines cannot be maintained simultaneously. Levin et al. [9] also considered a parallel-machine problem where all maintenance activities must be performed simultaneously and exactly once. They proved the problem to be NP-hard and provided a pseudopolynomial DP algorithm and an SPT-based algorithm to minimize the total completion time.
Yoo and Lee [10] considered the same problem as Lee and Chen [4] and proved the problems with the objectives of the makespan, the total weighted completion time, the maximum lateness, and the total lateness to be NP-hard. Qi et al. [11] considered another type of flexible maintenance where the machine must be stopped and maintained after working for a given period of time. For the single-machine case, they proposed three simple heuristics and a branch and bound algorithm to minimize the total completion time. Sbihi and Varnier [12] proposed a heuristic and a branch and bound algorithm to minimize the maximum tardiness. Lee et al. [13] considered a parallel machine problem in which the objective was to minimize the total tardiness. They developed a branch and bound algorithm with lower and upper bounding procedures, which can find optimal solutions for problems with up to 20 jobs. Costa et al. [14] considered a parallel machine problem with tool changes, where each machine needs a new tool once the tool service life is reached. The objective was to minimize the total completion time, and they proposed a mixed integer linear programming model and a genetic algorithm (GA) to solve both small- and large-sized problems. Another well-known type of flexible maintenance requires the maintenance to start and finish within a predetermined interval [u, v]. Chen ([2] and [7]) developed mixed binary integer programming (BIP) models and an efficient heuristic algorithm to minimize the mean flow time and the makespan, respectively, for a single-machine problem with this type of flexible maintenance. Low et al. [15] compared the performance of six heuristics for a single-machine problem with the objective of minimizing the makespan and provided calculations of the error bounds. A more realistic ε-almost periodic maintenance was assumed by Xu et al. [5], where the difference between the times of any two consecutive maintenance activities of a machine is within ε, and they proposed an approximation algorithm to minimize the makespan for the parallel-machine problem. Qamhan et al. [16] considered a single-machine problem with time-window periodic maintenance, where the time between two maintenance activities is a fixed interval T and each maintenance activity can start in a time window, i.e., T ± w; additionally, the durations of the maintenance activities are not necessarily equal. For this problem, they proposed a mixed-integer linear programming model and ant colony optimization (ACO) to minimize the number of tardy jobs. Hidri et al. [17] considered a parallel-machine problem with a single robot server, where the availability intervals of the machines and the duration of the PM activity are deterministic and known in advance. For this problem, they developed a lower bound as well as simulated annealing (SA), tabu search (TS), and GA methods to find better solutions under the objective of minimizing the makespan.
In addition to the studies considering flexible maintenance activities, many studies in the literature consider another type of maintenance activity in the scheduling problem. This problem is called a scheduling problem with machine availability constraints, where the starting time and duration of maintenance are assumed to be fixed and known in advance [1,3,18]. Comprehensive reviews of this case are provided by Sanlaville and Schmidt [19], Schmidt [20], and Ma et al. [21]. Some researchers considered both fixed and flexible maintenance activities in their studies. Cui et al. [22] considered a nonpermutation flow shop problem involving two cases of nonavailability intervals. In the first case, the maintenance activity is periodically fixed and known in advance, while in the second case, the machine has to be maintained after working for a period of time. Since the problem is NP-hard in the strong sense, a hybrid incremental genetic algorithm was proposed. It is worth noting that the maintenance activities in the above studies are all related to a time constraint.
Unlike the abovementioned studies, where machine maintenance is related to a time constraint, i.e., given fixed time intervals or a maximum continuous working time, Bock et al. [23] addressed a new model in which maintenance is subject to job-dependent machine deterioration: the maintenance level drops by a certain job-dependent amount. They considered two types of maintenance activities. One is full maintenance, where the maintenance level increases to the maximum maintenance level once the maintenance activity is completed; the other is partial maintenance, where the maintenance level increases to an arbitrary level below the maximum. For this model, they considered the makespan, the total completion time, the maximum lateness, and the number of tardy jobs and provided a complexity analysis of the problems. Grigoriu and Briskorn [24] and Tian et al. [25] extended the study of Bock et al. [23] to parallel machines. Su and Wang [26] introduced another new model where the machine status deteriorates due to job-induced dirt; that is, the machine needs to be maintained before the accumulated dirt reaches a threshold value, and the machine status returns to its initial condition once the maintenance activity is finished. The main difference from the study of Bock et al. [23] is that the maintenance time is fixed and does not depend on the current maintenance level. For this problem, they proposed a mixed binary integer programming model and an effective heuristic to minimize the total absolute deviation of job completion times (TADC). Later, Chen et al. [27] considered the same problem as Su and Wang [26] with the objective of minimizing the total completion time. Pang et al. [28] extended the study of Su and Wang [26] from a static problem with a single objective function to a dynamic problem with a bi-objective function, and they proposed a scatter simulated annealing algorithm to obtain nondominated solutions.
In this paper, we extend the study of Su and Wang [26] from a single machine to unrelated parallel machines with job release times, since the literature on parallel machine problems is still not as extensive as that on single-machine scheduling problems. To the best of our knowledge, only three studies address maintenance constraints and job release times simultaneously [28][29][30]; however, all three consider single-machine problems. Without job release times, when the processing times of all jobs are the same and there is only one machine, the considered problem Rm | r_j, Σ d_ij ≤ T, fpa | C_max reduces to the one-dimensional bin-packing problem, which aims to pack n items of size d_j into a minimum number of bins of capacity T and has been proven to be NP-hard [31]. Since the considered problem contains the bin-packing problem as a special case, it is also NP-hard; hence, we focus on developing a feature-extraction-based iterated algorithm (FEBIA) to efficiently obtain high-quality solutions. The proposed FEBIA is inspired by observations from many evolutionary search algorithms.
In the FEBIA, the information hidden in the individual schedules of past generations is further extracted and used to generate good solutions, instead of being simply discarded. Numerical experiments are conducted to demonstrate the performance of the FEBIA by comparing it with a basic particle swarm optimization (PSO) algorithm and a simple genetic algorithm (GA). The computational results show that the proposed FEBIA gives promising results and outperforms both algorithms on the problem under study. The rest of the paper is organized as follows. Section II presents a mathematical model formally describing the considered problem. Section III presents four feature-extraction matrices to recognize important features of job sequences. Our proposed FEBIA is addressed in Section IV. Section V presents the experimental results. Section VI offers the conclusions and future research directions.

II. MIXED INTEGER PROGRAMMING MODEL
In this section, a mixed integer programming (MIP) model is constructed to describe the characteristics of the considered problem, in which a set of jobs J = {J1, J2, …, Jn} is to be scheduled on m unrelated parallel machines. Preemption is not allowed, and the machines are not available at all times because they must periodically be stopped and cleaned to prevent the amount of dirt in each machine from exceeding its threshold. The parameters and decision variables are listed as follows. Our objective is to minimize the makespan, i.e., Cmax.

C. MODEL
Objective function (1): minimize Cmax.
The objective function (1) minimizes the makespan. Constraint (2) ensures that each job is assigned to exactly one position on one machine. Constraint (3) ensures that at most one job is assigned to any position in the sequence of any machine. Constraint (4) defines the available time for processing jobs at the kth position on machine i. Constraint (5) specifies the job processing time at the kth position on machine i. Constraint (6) ensures that the start time at the kth position on machine i is greater than or equal to the completion time of the previous position. Constraint (7) prevents a job from being processed before its release time. Constraint (8) defines the total remaining dirt at the first position on machine i. Constraints (9) and (10) define the total remaining dirt at the kth position on machine i for positions other than the first. Constraint (11) ensures that the total dirt does not exceed the maximum dirt allowance of each machine. Constraint (12) ensures that at least one job is processed between consecutive maintenance activities. Constraint (13) defines the completion time of job j. Constraint (14) defines the makespan.
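Since the algebraic statements of (1)–(14) are not reproduced here, a plausible sketch of a few representative constraints is given below; the variable names are hypothetical (x_{ijk} = 1 if job j occupies position k on machine i, and C_j is the completion time of job j), so this illustrates the structure of the model rather than the paper's exact formulation.

```latex
\min \; C_{\max} \tag{1}
\sum_{i=1}^{m}\sum_{k=1}^{n} x_{ijk} = 1, \quad \forall j \tag{2}
\sum_{j=1}^{n} x_{ijk} \le 1, \quad \forall i,\, k \tag{3}
C_{\max} \ge C_j, \quad \forall j \tag{14}
```

Constraint (2) forces every job onto exactly one machine position, (3) caps each position at one job, and (14) makes the makespan dominate all completion times.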

III. FEATURE-EXTRACTION MATRICES
In this paper, we develop feature-extraction matrices and construct ten procedures to find heuristic solutions based on these matrices. The purpose of the feature-extraction matrices is to capture the job-position, job-job, job-machine, and machine-job relations from each individual solution and use this information to guide job allocation and generate better schedules. To construct these matrices, we first convert each solution into a corresponding feature value that identifies its quality within the population and then accumulate the feature values in the matrices. The conversion steps from objective values to feature values are described as follows.
• Feature value
Step 1. Decode each individual π with (n+m-1) elements to obtain its makespan Cmax(π), as described in Section IV. To enlarge the disparities among the solutions of individuals, we square the makespan and obtain g = (Cmax(π))^2.
Step 2. Calculate the mean value ḡ and standard deviation σ of all g values in this population, i.e., ḡ = (1/L) Σ_{l=1}^{L} g_l and σ = √((1/L) Σ_{l=1}^{L} (g_l − ḡ)^2), where L is the number of individuals in the population.
Step 3. Normalize g_l, i.e., z_l = (ḡ − g_l)/σ.
Step 4. Obtain the feature value of an individual by cubing z_l, that is, v_l = (z_l)^3.
These conversion steps serve two purposes: (1) our objective is to minimize the makespan, and the conversion turns this into maximizing the feature values, so a higher feature value is preferred; (2) the conversion makes the better and worse individuals more distinguishable in the population. Because the feature values of all individuals are normalized and then enlarged, the best solution lies above zero and the gap between the best and worst solutions is significant. The example with five jobs in Table I is used to generate twenty random individuals. Among them, the fourth individual, marked by a double circle in Fig. 1 and Fig. 2, is the best solution; its makespan and feature value are 16 and 7.03089, respectively. Additionally, the best and worst individuals are much more clearly separated in Fig. 2, where the worst feature value lies far below zero, while the values of the other, less significant individuals cluster close to zero.
Table I. An example with five jobs and two machines.
Once the feature values of all individuals in the population are obtained, we use them to construct four matrices: the job-position, job-job, job-machine, and machine-job matrices. The first two matrices were originally developed by Chou [32] and embedded into a genetic algorithm to successfully solve the single-machine total weighted tardiness scheduling problem. We extend the idea of Chou [32] and construct the job-position, job-job, job-machine, and machine-job matrices using the feature values. The construction steps for the matrices are described as follows.
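As a minimal sketch, the feature-value conversion in Steps 1–4 can be written as follows (the names g, z, and v and the use of the population standard deviation are our assumptions, since the paper's own notation is not preserved here):

```python
import statistics

def feature_values(makespans):
    """Convert a population's makespans into feature values (Steps 1-4)."""
    g = [c ** 2 for c in makespans]            # Step 1: square to enlarge disparities
    mean = statistics.mean(g)                  # Step 2: population mean ...
    sd = statistics.pstdev(g)                  # ... and standard deviation
    z = [(mean - gi) / sd for gi in g]         # Step 3: smaller makespan -> larger z
    return [zi ** 3 for zi in z]               # Step 4: cube to separate best/worst
```

The individual with the smallest makespan receives the largest (positive) feature value, while the worst individual ends up far below zero.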
• Job-position (n+m-1, n+m-1) matrix
The job-position matrix provides signals about the suitability of each position for each job in a job sequence, based on the global feature values. A higher value for job i at position j suggests that this kind of assignment tends to be found in better solutions. The construction of the job-position matrix is described as follows.
Step 1. We initialize two two-dimensional arrays (JP1 and JP2) with n+m-1 rows and n+m-1 columns, where each element in the two arrays is equal to zero.
Step 2. For each selected individual in the population, if job i occupies position j, the value of JP1(i, j) is increased by the individual's feature value, and the value of JP2(i, j) is increased by 1.
Step 3. We repeat Step 2 until all individuals of the population have been selected and we obtain JP1 and JP2 for this population. To demonstrate the working process, a 5-job example of Table I is used. Suppose there are twenty sequences (individuals) whose feature values are obtained. Then, we accumulate the information from these twenty individuals to generate JP1 and JP2 arrays as shown in Fig. 3.
Step 4. The job-position matrix is obtained by dividing JP1 by JP2 element-wise, i.e., JP(i, j) = JP1(i, j)/JP2(i, j). According to Fig. 3, the job-position matrix is obtained as shown in Fig. 4.
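Steps 1–4 can be sketched as follows (job labels run from 1 to n+m-1, positions are 0-based inside the code, and entries with a zero count are set to zero):

```python
def job_position_matrix(population, features, size):
    """Build JP = JP1/JP2 from a population of permutation encodings.

    population : list of permutations of the job labels 1..size
    features   : feature value of each individual (same order)
    size       : n + m - 1 (real jobs plus dummy separator jobs)
    """
    JP1 = [[0.0] * size for _ in range(size)]
    JP2 = [[0] * size for _ in range(size)]
    for perm, f in zip(population, features):
        for pos, job in enumerate(perm):       # job `job` occupies position `pos`
            JP1[job - 1][pos] += f             # Step 2: accumulate the feature value
            JP2[job - 1][pos] += 1             # ... and the occurrence count
    # Step 4: element-wise division; never-observed entries stay at zero
    return [[JP1[i][j] / JP2[i][j] if JP2[i][j] else 0.0
             for j in range(size)] for i in range(size)]
```

For a two-individual population the matrix simply averages, per (job, position) pair, the feature values of the individuals exhibiting that pair.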
It is worth noting that if an element in the JP2 matrix is equal to zero, the division is skipped and the corresponding entry is simply set to zero.
• Job-job (n+m-1, n+m-1) matrix
The precedence information between the jobs in a job sequence is also an important factor in determining an optimal schedule [33]. Thus, we use the job-job matrix to provide signals for any pair of jobs in the job sequence based on the global feature values. The generation of the job-job matrix is similar to that of the job-position matrix and is described in the following.
Step 1. We initialize two two-dimensional arrays (JJ1 and JJ2) with n+m-1 rows and n+m-1 columns, where each element in these arrays is equal to zero.
Step 2. For each selected individual in the population, if job i is before job j, the value of JJ1(i, j) is increased by the individual's feature value and the value of JJ1(j, i) is decreased by it, while the elements JJ2(i, j) and JJ2(j, i) are each increased by 1.
Step 3. Repeat Step 2 until all individuals of the population have been selected and obtain the JJ1 and JJ2 for this population.
Step 4. The job-job matrix is obtained by dividing JJ1 by JJ2 element-wise, i.e., JJ(i, j) = JJ1(i, j)/JJ2(i, j). Using these steps, the JJ1, JJ2, and job-job matrices for the same population are obtained as shown in Fig. 5.
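A sketch of the job-job accumulation is given below. We implement the sign convention implied by the LE07 description (a positive JJ(i, j) signals that job i tends to precede job j in good solutions), so the feature value is added at (i, j) and subtracted at (j, i); this is our interpretation rather than a verbatim transcription of the paper's steps.

```python
def job_job_matrix(population, features, size):
    """Build JJ = JJ1/JJ2 encoding precedence signals between job pairs."""
    JJ1 = [[0.0] * size for _ in range(size)]
    JJ2 = [[0] * size for _ in range(size)]
    for perm, f in zip(population, features):
        for a in range(size):
            for b in range(a + 1, size):
                i, j = perm[a] - 1, perm[b] - 1   # job i precedes job j here
                JJ1[i][j] += f                    # reward the observed order ...
                JJ1[j][i] -= f                    # ... and penalize the reverse
                JJ2[i][j] += 1
                JJ2[j][i] += 1
    return [[JJ1[i][j] / JJ2[i][j] if JJ2[i][j] else 0.0
             for j in range(size)] for i in range(size)]
```

With this convention, JJ(i, j) > 0 means high-feature individuals tended to place job i before job j, and JJ(j, i) is its negation.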
For unrelated parallel-machine scheduling problems, there are two interrelated subproblems: first, one needs to determine which jobs are allocated to which machines; second, one needs to determine the sequence of the jobs allocated to each machine. The above job-position and job-job matrices mainly provide information about the job sequence, whereas information for the first subproblem is provided by the following job-machine and machine-job matrices. The processes are demonstrated in the following.
• Job-machine (n, m) matrix
The job-machine matrix aims to capture which jobs are assigned to which machines. The steps for building the job-machine matrix are similar to those of the job-position matrix and are described below.
Step 1. We initialize two two-dimensional arrays (JM1 and JM2) with n rows and m columns, where each element in these arrays is equal to zero.
Step 2. We construct the JM1 and JM2 arrays similarly to Step 2 of the job-position matrix: if job i is assigned to machine j, the value of JM1(i, j) is increased by the individual's feature value, and the value of JM2(i, j) is increased by 1.
Step 3. Repeat Step 2 until all individuals of the population have been selected and JM1 and JM2 for this population are obtained.
Step 4. The job-machine matrix is obtained by dividing JM1 by JM2 element-wise, i.e., JM(i, j) = JM1(i, j)/JM2(i, j).
For the 5-job example of Table I, the job-machine matrix is obtained as shown in Fig. 6.
• Machine-job (n+1, m) matrix
The machine-job matrix is used to estimate the number of jobs processed on each machine. The steps are described below.
Step 1. We initialize two two-dimensional arrays (MJ1 and MJ2) with n+1 rows and m columns, where the row index starts at zero and each element in these arrays is equal to zero.
Step 2. For each selected individual in the population, if the number of jobs processed on machine j is i, the value of MJ1(i, j) is increased by the individual's feature value, and the value of MJ2(i, j) is increased by 1.
Step 3. We repeat Step 2 until all individuals of the population have been selected and obtain the MJ1 and MJ2 for this population.
Step 4. The machine-job matrix is obtained by dividing MJ1 by MJ2 element-wise, i.e., MJ(i, j) = MJ1(i, j)/MJ2(i, j). Based on the above steps, the machine-job matrix for the same population of 20 individuals is obtained and displayed in Fig. 7.
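The job-machine and machine-job accumulations can be sketched together. The per-individual machine assignment is assumed to come from the decoding step; the dict-based input format here is our own convenience, not the paper's representation.

```python
def _safe_div(num, den):
    """Element-wise division; entries with a zero count become zero."""
    return [[num[i][j] / den[i][j] if den[i][j] else 0.0
             for j in range(len(num[0]))] for i in range(len(num))]

def jm_mj_matrices(assignments, features, n, m):
    """Build the job-machine (n x m) and machine-job ((n+1) x m) matrices.

    assignments : for each individual, a dict mapping job -> machine (1-based)
    """
    JM1 = [[0.0] * m for _ in range(n)]
    JM2 = [[0] * m for _ in range(n)]
    MJ1 = [[0.0] * m for _ in range(n + 1)]
    MJ2 = [[0] * m for _ in range(n + 1)]
    for asg, f in zip(assignments, features):
        counts = [0] * m
        for job, mach in asg.items():
            JM1[job - 1][mach - 1] += f        # job -> machine occurrence
            JM2[job - 1][mach - 1] += 1
            counts[mach - 1] += 1
        for mach, c in enumerate(counts):      # row index = number of jobs
            MJ1[c][mach] += f
            MJ2[c][mach] += 1
    return _safe_div(JM1, JM2), _safe_div(MJ1, MJ2)
```

Note that the machine-job matrix is indexed by job *count* (0..n), matching the zero-based row numbering described above.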
Following the construction steps of each matrix, the computational complexity of building each matrix is O(L · (n + m)), where L is the number of individuals in the population.

IV. THE PROPOSED FEBIA
In this paper, we propose a population-based stochastic search algorithm, called a feature-extraction-based iterated algorithm (FEBIA), where a group of solutions is maintained in each iteration using linkage encoding (LE) procedures based on the above feature-extraction matrices to solve Rm | r_j, Σ d_ij ≤ T, fpa | C_max problems. As in many metaheuristic algorithms, such as genetic algorithms (GAs) [34], particle swarm optimization (PSO) algorithms [35], and simulated annealing (SA) algorithms [36], the individuals of the group in the FEBIA are maintained by certain operators, and each individual is represented by an encoding scheme and then decoded to its corresponding objective value. The encoding scheme makes a solution recognizable, whereas the decoding produces a feasible solution for the algorithms. Before presenting the proposed FEBIA, the encoding and decoding scheme for the Rm | r_j, Σ d_ij ≤ T, fpa | C_max problem is described as follows.

A. ENCODING/DECODING SCHEME
Regarding the encoding scheme, Borovska [37] noted that permutation coding is the best method for ordering problems, and to date, many metaheuristic algorithms have used permutation coding to solve parallel machine scheduling problems in the literature (Vallada Regalado and Ruiz Garcia [38], Borovska [37]). Thus, we also use permutation coding similar to that of Cheng and Gen [39], where an (n+m-1)-element array represents an individual. For the 5-job instance with two machines in Table I, the corresponding solution encoding is 1-3-5-6-2-4, as shown in Fig. 10, in which the dummy job (job 6) is used to separate machines, allocating jobs 1, 3, and 5 to machine 1 and jobs 2 and 4 to machine 2. For our considered problem, three decisions must be made: (1) the allocation of jobs to each machine; (2) the sequence of jobs processed on each machine; and (3) the timing of the cleaning activities on each machine under the limit on the dirt left in each machine. The proposed permutation coding resolves the first two decisions, as shown in Fig. 10. For the third decision, we modify the dynamic programming (DP) method provided by Pang et al. [28]. Additionally, we use the instance of Table I to compare the modified DP method with the full batching method commonly used in the bin-packing problem [40]. The comparison is shown in Fig. 8. In the third decision, keeping the amount of dirt in each machine below the threshold value is a hard constraint. That is, the modified DP method is used to minimize the makespan while the obtained solution satisfies the constraint on the dirt left in each machine; a specific relationship between the solution and the dirt left in the final state was not verified in this paper. Fig. 8 shows that the two batching methods leave different amounts of dirt in the final state of the machine; with the modified DP method, the final solution (i.e., the makespan) is shorter, but more dirt is left in the machine. However, another example in Fig. 9 shows that the dirt left in the machine at the final state can differ even when the makespan is the same. In this paper, we adopt the modified DP method to obtain complete feasible solutions.
For the sake of brevity, for the procedures of the DP method, the readers can refer to the study of Pang et al. [28].
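The machine-allocation part of the decoding can be sketched as follows (the dirt-constrained timing decisions handled by the modified DP method are omitted):

```python
def decode(perm, n):
    """Split an (n+m-1)-element permutation into per-machine job lists.

    Labels 1..n are real jobs; labels n+1, n+2, ... are dummy jobs
    acting as separators between consecutive machines.
    """
    machines = [[]]
    for label in perm:
        if label > n:                 # dummy job: start the next machine
            machines.append([])
        else:
            machines[-1].append(label)
    return machines
```

For the encoding 1-3-5-6-2-4 of Fig. 10 (n = 5), this yields [[1, 3, 5], [2, 4]], i.e., jobs 1, 3, 5 on machine 1 and jobs 2, 4 on machine 2.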

B. LINKAGE ENCODING (LE) PROCEDURES
The LE procedures aim to generate ten individuals for the next iteration of the FEBIA based on the four feature-extraction matrices mentioned above. The ten different LE procedures are described in the following.
• LE01 Procedure
The LE01 procedure employs an assignment heuristic method to obtain a permutation encoding based on the job-position matrix. The steps of LE01 are given in the following.
Step 2. Find the largest value in the job-position matrix and place the corresponding job i into position j in the permutation encoding.
• LE02 Procedure
The LE02 procedure employs the roulette wheel technique to obtain a permutation encoding based on the job-position matrix. The steps of LE02 are given below.
Step 7. Place the selected job l into the selected position j′ in the permutation encoding.
• LE03 Procedure
LE03 is similar to LE02. The difference is that the calculations for the roulette wheel are based on the rows of the job-position matrix. The steps of LE03 are given below.
Step 6. Select the position h such that Q_{h-1} < u ≤ Q_h, where u is the drawn random number and Q_h is the cumulative selection value up to position h. The selected position is denoted as h.
Step 7. Place job ′ into the selected position h in the permutation encoding.
Step 8. Replace all values in row i′ and column h of the job-position matrix with null.
Step 9. Repeat from step 2 until all values in the job-position matrix are null. For the three procedures LE01, LE02, and LE03, the computational complexity is O((n + m)^2), where n is the number of jobs and m is the number of machines.
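The roulette-wheel idea behind LE02/LE03 can be sketched as follows. Because the feature-based matrix entries may be negative, the sketch shifts them into positive weights before spinning the wheel; both the shifting rule and the column-by-column filling order are our assumptions, since the paper's intermediate steps are not fully reproduced here.

```python
import random

def roulette_pick(weights, rng=random):
    """Pick an index with probability proportional to shifted-positive weights."""
    lo = min(weights)
    shifted = [w - lo + 1e-9 for w in weights]   # assumption: shift negatives up
    total = sum(shifted)
    r = rng.uniform(0, total)
    acc = 0.0
    for idx, w in enumerate(shifted):
        acc += w
        if r <= acc:                              # Q_{h-1} < r <= Q_h
            return idx
    return len(weights) - 1

def le02_like(JP, rng=random):
    """Roulette-wheel construction of a permutation from a job-position
    matrix (a sketch of the LE02/LE03 idea, not the paper's exact steps)."""
    size = len(JP)
    unplaced = set(range(size))
    perm = [None] * size
    for pos in range(size):                       # fill positions left to right
        jobs = sorted(unplaced)
        weights = [JP[j][pos] for j in jobs]      # column `pos` of the matrix
        j = jobs[roulette_pick(weights, rng)]
        perm[pos] = j + 1                         # back to 1-based job labels
        unplaced.remove(j)
    return perm
```

When one entry per column dominates, the sketch almost surely reproduces the assignment-heuristic result; with flatter matrices it diversifies the generated encodings.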
The following four procedures (LE04, LE05, LE06, and LE07) are developed based on the job-job matrix.
• LE04 Procedure
The LE04 procedure tries to place the higher-priority jobs early in the permutation encoding. In the job-job matrix, the value of a pair (i, j) indicates how strongly job i tends to precede job j in the permutation encodings of the past population. Therefore, we develop the LE04 method using the information of the job-job matrix. The steps of LE04 are given below.
Step 1. Sum the values of each row of the job-job matrix, i.e., r_i = Σ_{j=1}^{n+m-1} JJ(i, j).
Step 2. Obtain a permutation encoding by sorting the jobs in descending order of their r_i.
• LE05 Procedure
LE05 is also developed to obtain a permutation encoding based on the job-job matrix. The steps of LE05 are given below.
Step 2. Obtain a permutation encoding by sorting the jobs in increasing order of their c_i, where c_i is the column sum of the job-job matrix obtained in Step 1.
• LE06 Procedure
LE04 and LE05 only use a simple index (r_i or c_i, respectively) to generate a permutation encoding. In LE06, both the r_i and c_i indexes are considered. The steps of LE06 are given below.
Step 1. Sum the values of row i and column i of the job-job matrix, i.e., r_i and c_i.
Step 2. Calculate the priority p_i for job i using the formula p_i = (c_i − r_i).
Step 3. Obtain a permutation encoding by sorting the jobs in increasing order of their p_i. For the three procedures LE04, LE05, and LE06, the computational complexity is O((n + m)^2).
• LE07 Procedure
In the job-job matrix, a positive value for a pair (i, j) implies that job i is before job j; on the contrary, job i is after job j if the value is negative. Therefore, we can count the number of successors of each job according to the positive values in the job-job matrix. The steps of LE07 are given below.
Step 2. Based on the job-job matrix, for each remaining row i, count the number n_i of entries JJ(i, j) that are greater than zero, and sum those positive values, i.e., s_i = Σ_{j: JJ(i, j) > 0} JJ(i, j).
Step 3. Choose the row i′ with the largest n_{i′}. If there is a tie, choose the row with the largest s_{i′}; if the tie persists, choose arbitrarily. Place job i′ in position k.
Step 5. If k = (n+m-1), place the unassigned job into the last position and stop the procedure. Otherwise, return to step 2.
• LE08 Procedure
The feature-extraction procedures mentioned above use only a single matrix, such as the job-position or the job-job matrix. We integrate the information of the job-position and job-job matrices to develop LE08, and the steps of LE08 are given below.
Step 1. Calculate the mean μ̂ and standard deviation σ̂ of the values in the job-position matrix. Set a threshold value TH equal to μ̂ + 3σ̂.
Step 2. For each column j, find a value JP(i′, j′) of the job-position matrix that is greater than the threshold value TH. If there is more than one, choose the largest, and place job i′ in position j′ in the permutation encoding.
Step 3. If the permutation encoding is null, randomly generate a permutation encoding. Otherwise, go to step 4.
Step 4. Find the first position j′ occupied by a job in the permutation encoding. If j′ = 1, go to step 5 (forward filling). Otherwise, go to step 9 (first backward filling and then forward filling).
Step 5. Update the job-job matrix by replacing the values of the occupied rows and columns with null.
Step 6. Sum the values of each row of the job-job matrix, i.e., r_i = Σ_{j=1}^{n+m-1} JJ(i, j).
Step 7. Select the job i′ with the largest r_{i′} and put it into the next unoccupied position in the permutation encoding. If there is a tie, choose randomly.
Step 8. Return to step 5 until all successive positions are occupied, and then stop. Step 9. Update the job-job matrix by replacing the values of the occupied rows and columns with null.
Step 10. Sum the values of each row of the job-job matrix, i.e., S_i.
Step 11. Select the job i with the largest S_i and put it into the previous unoccupied position in the permutation encoding. If there is a tie, choose randomly.
Step 12. Return to step 9 until all previous positions are occupied.
Step 13. If one of the successive positions is not occupied, return to step 5; otherwise, stop.

LE09 Procedure
LE09 integrates the information of the job-position, job-machine and machine-job matrices to generate a new permutation encoding. First, the machine-job matrix is used to find the number of jobs processed on each machine. Next, the job-machine matrix is applied to find which jobs are processed on which machine. Finally, the job-position matrix is used to determine the job sequence on each machine. The detailed steps of LE09 are listed below.
Step 1. Set j = 1 and k = 0.
Step 2. According to the machine-job matrix, find the largest value in column j, i.e., machine j. The corresponding row number (i′) is the number of jobs (n_j) processed on machine j, and k is updated as k = (k + i′).
Step 3. Delete column j from the machine-job matrix, and replace the values of the rows i with i > (n − k) with null in the machine-job matrix. Set j = (j + 1).
Step 4. If j ≤ m, return to step 2; otherwise, go to step 5.
Step 6. Choose the n_j jobs with the largest values in column j of the job-machine matrix and put the chosen jobs into the job set of machine j.
Step 10. Construct a sub job-position matrix from the current job-position matrix based on the chosen jobs in the job set of machine j.
Step 11. Apply the assignment heuristic method to determine the job sequence based on the sub job-position matrix and put the jobs into the permutation encoding accordingly.
Step 12. If j < m, add the number (i.e., n + j) at the next position in the permutation encoding, set j = (j + 1), and return to step 10. Otherwise, stop the procedure.

LE10 Procedure
LE10 integrates the information of the job-job, job-machine and machine-job matrices to generate a new permutation encoding. LE10 is similar to LE09; the difference lies in the steps that determine the job sequence. Thus, steps 1 to 8 of LE10 are the same as steps 1 to 8 of LE09.
Steps 1 to 8. Use steps 1 to 8 of LE09.
Step 10. Construct a sub job-job matrix from the current job-job matrix based on the chosen jobs in the job set of machine j.
Step 11. Sum the values of each row of the sub job-job matrix, i.e., S_i.
Step 12. Obtain a permutation encoding by sorting the jobs in decreasing order of their S_i.
Step 13. If j < m, add the number (i.e., n + j) at the next position in the permutation encoding, set j = (j + 1), and return to step 10. Otherwise, stop the procedure. For the four procedures LE07, LE08, LE09, and LE10, the computational complexity is O((n+m)^2).
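As a self-contained illustration, the greedy counting idea behind LE07 — count each row's positive entries (N_i) and their sum (S_i), place the job with the most successors first, then null out its row and column — can be sketched as follows. The function name and the plain-list matrix representation are ours, not from the paper.

```python
def le07_sequence(job_job):
    """Sketch of an LE07-style greedy construction: a positive entry
    v[i][j] in the job-job matrix votes for job i preceding job j.
    Repeatedly place the row with the most (ties: largest-sum)
    positive entries, then null out its row and column."""
    size = len(job_job)
    v = [row[:] for row in job_job]          # work on a copy
    order = []
    remaining = set(range(size))
    while len(remaining) > 1:
        def key(i):
            positives = [x for x in v[i] if x > 0]
            return (len(positives), sum(positives))   # (N_i, S_i)
        best = max(remaining, key=key)        # largest N_i, ties by S_i
        order.append(best)
        remaining.remove(best)
        for j in range(size):                 # "replace occupied rows
            v[best][j] = 0                    #  and columns with null"
            v[j][best] = 0
    order.append(remaining.pop())             # last unassigned job
    return order
```

For instance, a matrix in which row 0 dominates row 1, which dominates row 2, yields the sequence [0, 1, 2].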
Regarding the aforementioned 5-job instance, the four feature-extraction matrices are given earlier in Figs. 4-7; therefore, ten permutation lists can be obtained by the LE procedures, and the corresponding solutions are then obtained by the modified DP method, as illustrated in Table II. All new individuals obtained by the LE procedures are put into the next iteration. Notably, the optimal makespan of the 5-job instance is 15.

Table II. Solutions obtained by the LE procedures for the 5-job instance:

  LE procedure   Matrix used                          Permutation    Makespan
  …              Job-position matrix                  1-3-5-6-4-2    16
  LE03           Job-position matrix                  2-5-3-6-4-1    22
  LE04           Job-job matrix                       1-5-3-4-6-2    21
  LE05           Job-job matrix                       1-4-3-5-6-2    21
  LE06           Job-job matrix                       1-5-3-4-6-2    21
  LE07           Job-job matrix                       1-3-5-6-4-2    16
  LE08           Job-position and job-job matrices    …              …

In Table II, it can be seen that using the LE procedures to generate ten new individuals in each iteration is likely to help the FEBIA search for good individuals, because the optimal or near-optimal solutions are included among those obtained by the LE procedures. On the other hand, Table II also shows that the LE procedures alone would impede the diversity of the population, so that it could become trapped in local optima. To overcome this drawback, we generate other individuals randomly in each iteration of the FEBIA to diversify the population. That is, the L individuals of the population are randomly generated in the initial iteration; in each subsequent iteration, the population consists of the best individual of the current population (i.e., an elite strategy), ten individuals generated by the LE procedures, and the remaining L − 11 individuals, which are randomly generated. In the FEBIA, all new individuals with (n+m-1) elements are manipulated for the next iteration based on the permutation encoding; thus, their corresponding solution schedules are feasible under the DP method. Additionally, in each iteration, each matrix is updated by the equation shown in Fig. 11 for the next iteration. The process continues until the stopping criterion is met. Fig. 11 shows the framework of the proposed FEBIA.
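The population-assembly scheme just described — one elite individual, ten LE-generated individuals, and L − 11 random permutations — can be sketched as follows; the function and variable names are illustrative, not from the paper.

```python
import random

def next_population(current_pop, makespans, le_individuals, L, n_elements):
    """Assemble the next FEBIA population (sketch): keep the best
    individual of the current population (elite strategy), add the
    individuals produced by the LE procedures, and fill the remaining
    slots with random permutations of the (n+m-1) encoding elements."""
    # elite: the individual with the smallest makespan
    best_idx = min(range(len(current_pop)), key=lambda i: makespans[i])
    population = [current_pop[best_idx]] + list(le_individuals)
    # fill up to L with randomly generated permutation encodings
    while len(population) < L:
        perm = list(range(1, n_elements + 1))
        random.shuffle(perm)
        population.append(perm)
    return population
```

With ten LE individuals plus the elite, exactly L − 11 random individuals are generated per iteration.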
It is worth noting that the FEBIA requires almost no parameter tuning, and its simple framework is also easy to apply to other problems.

V. COMPUTATIONAL EXPERIMENT
To examine the efficiency of our proposed algorithm, we test the FEBIA, the basic PSO algorithm and a simple GA on different sets of randomly generated instances.
To obtain the benchmark solutions, the MIP model is solved using IBM ILOG CPLEX Optimization Studio version 12.7.1 on a PC with an Intel Xeon E-2124 3.4-GHz CPU and 32 GB of DRAM, and the time limit of the MIP is set to 1800 seconds. The basic PSO algorithm [41] updates each particle by v_i(t+1) = v_i(t) + c1 r1 (pbest_i − x_i(t)) + c2 r2 (gbest − x_i(t)) and x_i(t+1) = x_i(t) + v_i(t+1), where c1 and c2 are learning factors, usually c1 = c2 = 2 [41,42]. To control the excessive roaming of particles outside the search space, the values of v_i(t) and x_i(t) are limited, i.e., |v_i(t)| ≤ Vmax = 5 and |x_i(t)| ≤ Xmax = 5 in this paper. Regarding the simple GA, to generate a new population, Pr × Ps individuals with the lowest makespan values from the previous generation are automatically copied to the next generation, where Pr and Ps are the reproduction rate and the population size, respectively. Two-point crossover is used to generate the rest of the individuals for the next generation, and swap mutation is performed on the offspring with a probability Pm to make the offspring more diversified. For the three algorithms (PSO, GA, and FEBIA), five replications are performed for each instance. These algorithms are coded in C++ and run on the same personal computer mentioned above.
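The basic PSO update above, with the velocity and position limits of 5, can be sketched per particle as follows; function and parameter names are illustrative.

```python
import random

def pso_update(x, v, pbest, gbest, c1=2.0, c2=2.0, v_max=5.0, x_max=5.0):
    """One basic PSO iteration:
    v(t+1) = v(t) + c1*r1*(pbest - x(t)) + c2*r2*(gbest - x(t)),
    x(t+1) = x(t) + v(t+1),
    with |v| <= v_max and |x| <= x_max to curb excessive roaming."""
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = random.random(), random.random()
        vi_new = vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        vi_new = max(-v_max, min(v_max, vi_new))        # velocity limit
        xi_new = max(-x_max, min(x_max, xi + vi_new))   # position limit
        new_v.append(vi_new)
        new_x.append(xi_new)
    return new_x, new_v
```

Clamping keeps every dimension of the particle inside the [-5, 5] search box regardless of how large the attraction terms become.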
For the GA, we first conduct a preliminary experiment; the candidate parameter values are listed in Table III. For each instance, the minimum, average, and maximum makespan among the 5 replications are obtained, and then the mean values of the minimum, average, and maximum makespan are calculated over the ninety instances of each problem. The comparison results under different parameter values are shown in Figs. 12-16. These results indicate that when the number of jobs is less than 100, the parameter setting (Pr = 0.1, Pc = 0.9, Pm = 0.1) is better; however, when the number of jobs is equal to 100, the parameter setting (Pr = 0.15, Pc = 0.85, Pm = 0.1) performs better. Table IV shows our parameter settings. In this paper, the average relative percentage deviation (ARPD) is adopted as a performance index, and the formula is as follows: ARPD = (1/R) Σ_{r=1}^{R} ((C_r − Best)/Best × 100), where C_r denotes the makespan generated by the PSO, GA, or FEBIA in replication r, Best is the best solution obtained by any of the algorithms or the optimal solution of the MIP model, and R is the number of replications for each instance, i.e., R is equal to five in this paper. In addition, Δ_min, Δ_mean, and Δ_max denote the minimum, mean and maximum of the ARPD values among the 90 instances of each problem. For small-sized problems, all three algorithms (PSO, GA, and FEBIA) consistently obtain optimal solutions in a short time compared to the MIP model, as shown in Table V and Fig. 17.
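The ARPD index follows directly from its definition; a minimal sketch (the function name is ours):

```python
def arpd(makespans, best):
    """Average relative percentage deviation over R replications:
    ARPD = (1/R) * sum_r ((C_r - best) / best * 100),
    where C_r is the makespan of replication r and best is the best
    known (or optimal) solution for the instance."""
    return sum((c - best) / best * 100.0 for c in makespans) / len(makespans)
```

For example, makespans of 110, 105, 100, 100, and 110 against a best solution of 100 give an ARPD of 5.0.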
When the number of jobs is greater than or equal to 20, the solutions obtained by the MIP model within the time limit of 1800 seconds are not guaranteed to be optimal. Table VI shows the Δ_min, Δ_mean, and Δ_max values obtained by each method. Table VI makes it obvious that the FEBIA outperforms the other algorithms in terms of solution quality and stability. The proposed FEBIA found all of the best solutions among the 810 instances, while the number of best solutions obtained by the PSO, GA (0.1, 0.9, 0.1), GA (0.15, 0.85, 0.1) and MIP decreases markedly as the problem size increases. Remark: the symbol "*" indicates that all algorithms were stopped by the time limit of (0.005 × n × m) seconds.

Figure 19. The number of the best solutions found by different algorithms for large-sized problems
Table VII provides the comparison results for large-sized problems; the results reveal that the FEBIA performs considerably better than the other compared algorithms in terms of solution quality and stability. From Tables VI and VII, the GA (0.15, 0.85, 0.1) seems more sensitive to the problem instances than the GA (0.1, 0.9, 0.1). Fig. 19 shows the number of best solutions found by each algorithm. The proposed FEBIA apparently performs well, and the number of better solutions found by the FEBIA slightly increases as the number of jobs increases under the same number of machines. To further show how fast the algorithms converge to a better solution, we use an instance with 100 jobs and two machines and record the best solution found by each algorithm every 0.005 seconds. As revealed in Fig. 20, the proposed FEBIA converges quickly to a better solution. This result confirms that the proposed feature-extraction matrices and LE procedures enable an iterated algorithm to find better-quality solutions and to converge faster. On the whole, the experimental results provide evidence that the proposed FEBIA has significant performance advantages in terms of solution quality and stability. This achievement is mainly due to the inclusion of the LE procedures based on the feature-extraction matrices in the FEBIA, together with randomness when generating new solutions for the next iteration. Moreover, the proposed FEBIA does not require extensive parameter tuning, which is an apparent advantage in reducing the influence of different parameter settings on the solution.

VI. CONCLUSIONS
In this paper, we introduce the unrelated parallel machine scheduling problem with periodic maintenance activities and dynamic job release times, in which the objective is to minimize the makespan. This problem is NP-hard and of considerable practical interest, since it can model many industrial applications, such as the semiconductor manufacturing cleaning operations that motivated this article. Because the considered problem is NP-hard, algorithms that obtain an optimal solution in polynomial time are unlikely to exist. Thus, this paper develops a novel, effective and efficient scheduling algorithm, the FEBIA, for this problem. The FEBIA is similar to population-based algorithms: each individual in the population is represented by a permutation encoding that considers the job allocation and the job sequence simultaneously, and the corresponding solution, i.e., the makespan, is obtained via a dynamic programming algorithm. The two novel parts of the FEBIA are the feature-extraction matrices, which recognize feature values from previously found solutions, and the LE procedures, which find better solutions based on these matrices. Based on the results of a comprehensive experiment, the FEBIA outperformed the MIP model, the PSO and the GA, especially for small to medium problem sizes. Additionally, the solutions of the FEBIA are better than those obtained by the PSO and the GA for large problem sizes under the same computation time. This study concludes that the combination of feature-extraction matrices and LE procedures gives the FEBIA encouraging performance. Moreover, it is worth noting that the advantages of the FEBIA include its simple framework, few parameters to tune and quick convergence, all of which make it a promising and competitive scheduling algorithm.
In summary, the proposed FEBIA is successful at solving the considered problem, a general real-world case that has received little attention in the literature. Our research will continue in the following directions. (1) We will consider combining the feature-extraction matrices and LE procedures with other metaheuristic algorithms, such as the GA or SA. (2) The current FEBIA involves no improvement mechanism, so local search procedures will be included in our algorithm. (3) Other shop environments, such as open shops, flow shops, and job shops, can be studied in future research.