Minimizing the Total Tardiness of a Game Project Considering the Overlap Effect

There has long been a perception that game development lies outside mainstream engineering and that its tardiness brings little or even no harm to the industry. Nowadays, the pendulum of industrial development has swung to the other side. A game project may involve hundreds of developers, thousands of jobs, and costs in the millions. The situation worsens when jobs overlap with one another. Any delay in a large game project can incur a heavy penalty. Consequently, in this study, we propose two scheduling algorithms to reduce the total tardiness of a game project. If the problem size is small, a branch-and-bound algorithm is employed to provide optimal schedules; otherwise, a genetic algorithm is used to generate near-optimal schedules. The experimental results show that the proposed algorithms reduce the total tardiness significantly.


I. INTRODUCTION
Today we can employ multiple machines to process many jobs simultaneously in order to improve time efficiency or cost effectiveness. For example, tardiness can be lowered since jobs are processed concurrently, or makespans can be reduced through multi-machine teamwork. Although multiple machines indeed improve performance compared with a single machine, there are still some issues that need attention. For example, some constraints on these machines need to be respected, or some duplicate work can be eliminated. That is, job scheduling can achieve better performance if these issues are resolved or relieved.
Total tardiness is a common research topic in today's job scheduling. A business should satisfy customers and deliver goods on time whenever possible; i.e., it achieves less tardiness and fewer penalties. If jobs are not finished by their respective due dates, consequences such as fines or losses may result from tardiness. For example, an integrated circuit (IC) producer maximizes on-time delivery or minimizes the total tardiness in order to reduce the related costs [1]. If a job is not completed before its promised delivery date, tardiness occurs and penalties become inevitable. In general,
penalties increase with the magnitude of tardiness. Consequently, schedules that minimize the total tardiness lead to high performance, customer satisfaction, and minimal penalties. For more recent developments in minimizing total tardiness, readers can refer to [2]-[9].
The importance of the total tardiness of a game project should not be underestimated. First, a large game costs a company millions of dollars. For example, each of Call of Duty, GTA, and Star Wars is at least a $10 million project [10]. When developing a game, we must pay a lot for planning, programming, testing, art design, sound effects, and so on. Clearly, any tardiness will force a serious budget amendment. Second, a modern game requires considerable teamwork. For example, the team size of a commercial game can reach hundreds of developers, e.g., designers, artists, programmers, level designers, sound engineers, and testers. Evidently, a slight delay in a critical job processed by an upstream developer may cause a serious domino effect on the succeeding developers; that is, extra wages, hotel expenses, and dining fees become inevitable. Third, a commercial game races against time. Vanhoucke [11] indicated that the total cost of any game project is the sum of the job processing cost and the penalty cost. Note that the latter is a kind of time cost and it is preventable, avoidable, or at least reducible. Therefore, if the tardiness of each job can be eliminated or reduced as much as possible, more costs can be saved. Obviously, minimizing the total tardiness of a game project is no less important than doing so for any project in the aviation, semiconductor, and construction industries.
On the other hand, in practice, some jobs may overlap with each other (i.e., have duplicate contents). Two examples are given. The first example is scene design. Consider two similar scene-design jobs and two undertakers, each dealing with one job. For example, trees of tonal colors and rocks of smooth textures are requested. Due to enterprise ethics or contract constraints, the two undertakers cannot share their work in secret. However, in fact, both jobs have a lot in common. If the two jobs can be assigned to a single undertaker in advance, some unnecessary processing time can be saved and the extra manpower becomes available for other jobs [12]. The second example is road excavation. There are two excavation jobs scheduled in the same construction area and period. Job 1 includes digging, installing drainage pipes, and backfilling; job 2 includes digging, installing gas pipes, and backfilling. If the two jobs are assigned to the same operator and excavation machine, the operations are needed only once. That is, the overall cost can be reduced by eliminating the duplicate digging and backfilling efforts. For more recent research on scheduling overlapping jobs, readers can refer to [5], [6].
Clearly, so many overlapping jobs cannot be optimally scheduled by skilled manual labor, because the overlap relationships between jobs cannot be calculated easily. A simple example follows. There are eight jobs whose contents (called tasks) partly duplicate one another. Assume that each job includes eight tasks on average and that a single job cannot be split across multiple developers. Then the actual problem size is 64 tasks, instead of eight jobs. Consequently, we need to develop more suitable scheduling algorithms to deal with such complex relationships between overlapping jobs.
In this study, a new scheduling problem is presented and two algorithms are proposed for scheduling overlapping jobs. Compared with traditional batch delivery problems and group scheduling problems, this study focuses on dealing with intersecting sets. For example, Karimi and Davoudpour [13] considered a batch delivery problem. In their problem, all the batches are disjoint, and the most important assumption is the vehicle capacity; that is, no overlap relationships need to be maintained. Keshavarz et al. [14] studied a group scheduling problem formulated as a linear programming problem. However, because of the set operations involved, our problem cannot be easily formulated as a linear program and cannot be plainly solved by ordinary commercial software, e.g., MATLAB. On the other hand, this problem is semi-preemptive. Unlike common preemptive scheduling, e.g., [15], we had better assign overlapping jobs (i.e., those with duplicate tasks) to the same developer. These jobs are preemptive from the viewpoint of tasks; however, it is better not to partition the overlapping jobs and assign them to different developers. Consequently, new algorithms are called for. For small problem instances, a branch-and-bound algorithm with dominance rules and a lower bound is developed. For large problem instances, a genetic algorithm equipped with a simulated annealing local search is proposed. The experimental results show that both algorithms achieve high solution quality and rapid execution speed.

II. RELATED WORK
There are two kinds of situations in which two equally sized jobs take different processing times. Owing to practical considerations, the processing time of a job is not always fixed at a default value. Its actual processing time varies with the partial sequence already determined: either the experience obtained from the jobs scheduled ahead of it, or the overlap degree among all the determined jobs. Both are discussed as follows.
In the first situation, a learning effect is considered in job scheduling; that is, human-machine interaction exists. A technician performs similar jobs repeatedly and gains experience. As time passes, the actual processing time of a job becomes shorter than its default value. This phenomenon is known as the learning effect [16]. For a position-based learning model, experience is accumulated only from the completion of independent operations, e.g., setting up machines. In this model, the actual processing time is mainly influenced by the number of completed jobs. Thus, we had better schedule as many small jobs as possible at the beginning; experience then accumulates rapidly and is applied to the later large jobs. For a sum-of-processing-time-based learning model, experience is accumulated at all times; that is, it takes into account the processing times of all jobs processed so far. In this model, the experience mainly comes from complicated and error-prone operations, e.g., offset printing. For more recent developments in scheduling with learning effects, readers can refer to [6], [17]-[23].
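The position-based idea above can be illustrated with a short sketch. It uses a common log-linear form from the scheduling literature, in which a job in position r takes p_j * r^a time units with a learning index a < 0; the exact model and index value in [16] may differ, so both are assumptions here.

```python
# Sketch of a position-based learning model (a common log-linear form
# from the scheduling literature; the exact model in [16] may differ).
# A job scheduled in position r takes p[j] * r**a time, with a < 0.

def actual_processing_times(base_times, a=-0.322):
    """Return actual processing times when jobs run in the given order."""
    return [p * (r ** a) for r, p in enumerate(base_times, start=1)]

# Scheduling small jobs first lets the later large jobs enjoy the
# largest positional discount, so the total actual time is shorter:
small_first = actual_processing_times([1, 2, 8, 9])
large_first = actual_processing_times([9, 8, 2, 1])
print(sum(small_first) < sum(large_first))  # → True
```

This matches the text's advice to place small jobs early: the multiplier r^a shrinks with position, so large jobs benefit most from being scheduled late.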
In the second situation, jobs with duplicate contents are common in the real world. Two computerized examples follow. In a multi-channel broadcast environment, popular data items can be allocated together to reduce the average waiting time [24]-[26]. To take advantage of overlap, we organize popular item sets together and shorten the broadcast program, which further reduces the overall waiting time. Another example is a proxy server. If the contents of some requests overlap, e.g., in a caching proxy, these overlapping contents should be allocated to the same server as much as possible. The server can then improve system performance (e.g., throughput or waiting time) by reusing the contents already retrieved for previous requests. Therefore, users reduce their bandwidth usage and increase their access efficiency.
A game project usually involves both kinds of situations. In general, there are still labor-intensive jobs in game development. Consider that multiple 3D figures are first modeled with some new software, e.g., Blender. Many skills, such as sculpting, rendering, painting, and rigging, need to be learned to complete these jobs [27]. Developers, to an extent, polish their skills through trial and error. With the experience of previous jobs, these skills carry over to subsequent jobs; namely, processing time can be reduced by such invisible experience. On the other hand, there are also duplicate contents between two similar jobs. For example, it takes time to model all military personnel, e.g., generals, knights, archers, and soldiers. However, their badges, uniforms, knives, and boots are identical. If the jobs of modeling the whole military can be assigned to the same developer, some duplicate work can be eliminated and the modeling process accelerated. That is, processing time can be reduced by merging such visible duplicate contents into one batch and assigning it to a single developer. Similarly, we can let another experienced developer process the jobs of modeling all the different civilians. For more research into game development, please refer to [28]-[36].
In light of the above observations, we can see that scheduling overlapping jobs differs from traditional scheduling. However, few scheduling algorithms are designed for game development with such overlapping jobs. Therefore, scheduling overlapping jobs in the game industry is worth further research.

III. PROBLEM DESCRIPTION
In this section, the scheduling problem is defined and an example is illustrated. Moreover, the differences between this problem and traditional scheduling problems are discussed. The comparison will help readers see how the problem improves upon past research.

A. PROBLEM DEFINITION
There are N jobs and M identical developers. Each job J_j is a set of tasks with a due date d_j and a processing time p_j for j = 1, 2, . . . , N. Assume that the processing time of each task is 1. Then p_j can be denoted as |J_j|, i.e., the number of tasks in J_j. If J_j ∩ (∪_{k≠j} J_k) ≠ φ, then J_j overlaps with other jobs. We also assume that a job is the basic scheduling unit and cannot be assigned to multiple developers. For overlapping jobs assigned to the same developer, the duplicate tasks need to be processed only once. Namely, |J_i ∪ J_j| ≤ p_i + p_j = |J_i| + |J_j| holds if J_i and J_j are assigned to the same developer. For a job J_j in a schedule π, the developer processing J_j is denoted as D_j(π), and the processing order of the job performed by this developer is O_j(π). Thus, the completion time of J_j is the number of distinct tasks processed by developer D_j(π) up to and including J_j, i.e., C_j(π) = |∪ J_k| taken over all jobs J_k with D_k(π) = D_j(π) and O_k(π) ≤ O_j(π), and the tardiness of J_j is T_j(π) = max{0, C_j(π) − d_j}. Under the above assumptions, the problem is to minimize the total tardiness, i.e., Minimize F(π) = Σ_{j=1}^{N} T_j(π). An example is presented to illustrate the difference between this problem and traditional ones.
Though there are 14 tasks, we need to process only 10 of them, i.e., |J_1 ∪ J_2 ∪ J_3| = 10, because these jobs overlap. For a schedule π = (1, 2, 3), the completion time of each job is shown in Fig. 1. Since tasks a and b have already been completed in J_1, we can skip them while processing J_2. Clearly, by carefully omitting the duplicate tasks, the objective cost can be reduced below its former value.
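The completion-time rule above can be sketched as follows. The job contents and sizes below are hypothetical, chosen only to match the counts in the text (14 tasks in total, 10 distinct, with tasks a and b shared between J_1 and J_2); the actual contents of Fig. 1 are not reproduced here.

```python
# A minimal sketch of the completion-time rule on a single developer:
# duplicate tasks already processed earlier on the same developer are
# skipped, so C_j equals the size of the union of jobs up to position j.
# Job contents below are hypothetical (14 tasks in total, 10 distinct).

def completion_times(schedule):
    """schedule: list of task sets in processing order (one developer)."""
    done, times = set(), []
    for job in schedule:
        done |= job                # duplicate tasks are processed once only
        times.append(len(done))    # each task takes unit time
    return times

J1 = {"a", "b", "c", "d"}
J2 = {"a", "b", "e", "f"}          # tasks a and b are skipped after J1
J3 = {"e", "f", "g", "h", "i", "j"}
print(completion_times([J1, J2, J3]))  # → [4, 6, 10]
```

Without the skipping rule the completion times would be [4, 8, 14]; the overlap saves four unit-time tasks.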

B. COMPARISON WITH PAST RESEARCH
The proposed problem cannot be solved optimally by simple heuristic algorithms, and most traditional scheduling algorithms are not ideally suited to overlapping jobs. For example, let J_1 = {a, b, c, e} and J_2 = {b, c, e, f}. The processing cost is 5 if we assign them to a single developer; otherwise, the cost is 8 if they are assigned to different developers. From the viewpoint of traditional scheduling, e.g., [37], this is just a partition problem with only 2! results for J_1 and J_2. However, if a finer granularity (i.e., the task) is considered, it becomes a permutation problem with 8! results. The proposed problem thus becomes more difficult, but it achieves lower costs.
We borrow concepts from graph theory to describe overlapping jobs. A closer look reveals that the above two jobs actually form three groups: {a}, {b, c, e}, and {f}. These groups resemble the connected components in graph theory, so the 8 tasks can be viewed as 3 distinct connected components with 3! permutations. Without an analysis of overlapping jobs, traditional scheduling algorithms cannot perform ideally when jobs are highly overlapping. Scheduling for overlapping jobs is therefore called for.
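The grouping described above can be sketched by partitioning the distinct tasks according to which jobs contain them; this is an illustrative reading of the text, not the paper's own code.

```python
# Sketch: partition the distinct tasks of overlapping jobs into groups
# by job membership, mirroring the "connected components" view above.
from collections import defaultdict

def task_groups(jobs):
    """Group distinct tasks by the set of jobs that contain them."""
    membership = defaultdict(set)
    for j, job in enumerate(jobs):
        for task in job:
            membership[task].add(j)
    groups = defaultdict(set)
    for task, owners in membership.items():
        groups[frozenset(owners)].add(task)
    return list(groups.values())

J1 = {"a", "b", "c", "e"}
J2 = {"b", "c", "e", "f"}
print(sorted(map(sorted, task_groups([J1, J2]))))
# → [['a'], ['b', 'c', 'e'], ['f']]
```

The three resulting groups {a}, {b, c, e}, and {f} are exactly the components the text describes, so only 3! orderings of groups, rather than 8! orderings of tasks, need be considered.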

IV. PROPOSED ALGORITHMS
In this section, two algorithms are proposed for this problem. In Section 4.1, a branch-and-bound algorithm (named B&B) is proposed for generating optimal schedules when N is small. Dominance rules and a lower bound are also introduced to accelerate B&B. In Section 4.2, a genetic algorithm (named GA) is proposed for obtaining near-optimal schedules when N is large.

A. BRANCH-AND-BOUND ALGORITHM
First, 14 dominance rules are proposed, and then a lower bound is developed to estimate the remaining cost. Both are used to eliminate unnecessary nodes in a search tree.

B. DOMINANCE RULES
The following dominance rules reduce the scope of a search tree. Before trimming the tree, we define some notation. Suppose that π = (α, i, j, β) and π′ = (α, j, i, β) are two schedules of jobs, where π′ is obtained by interchanging the pairwise adjacent jobs J_i and J_j in π. Moreover, α is a determined partial sequence and β is not. For simplicity, let t_1 denote the completion time of J_i in π, t_2 the completion time of J_j in π′, and t_3 the completion time of the second of the two jobs, which is identical in π and π′ because the union J_i ∪ J_j is the same. Since all the proofs are similar, we provide only the first one.
Proof: Let us observe the cost gain between the two schedules. For job J_i, after the interchange, the incremental tardiness is max{0, t_3 − d_i} − max{0, t_1 − d_i}; for job J_j, it is max{0, t_2 − d_j} − max{0, t_3 − d_j}. Under the stated condition, the sum of the two terms is non-negative, so the interchange cannot improve the schedule and π dominates π′.
Rule 2: If d_i < t_1 and t_2 < d_j < t_1, then π dominates π′.
Rules 3 and 4 show the results of interchanging J_i and J_j, given that J_j is tardy in both π and π′. In Rule 3, J_i is tardy in π′ but not in π. In Rule 4, J_i is tardy in neither π nor π′.
Rule 3: If d_j < t_2 < t_1 < d_i < t_3, then π dominates π′.
Rule 4: If d_i > t_3 and d_j < t_2, then π dominates π′.
Rules 5 and 6 show the results of the interchange when J_i is not tardy in π and J_j is not tardy in π′. In Rule 5, both J_i in π′ and J_j in π are tardy. In Rule 6, J_i is not tardy in π′ but J_j is tardy in π.
Rule 5:
Rule 6: If d_i > t_3 and t_2 < d_j < t_3, then π dominates π′.
Rules 7 and 8 handle a situation in which the interchange of J_i and J_j causes the same tardiness. In such a situation, the job with the earlier due date or the smaller identifier can be scheduled first.
Consider two adjacent developers and each is given a determined partial sequence of jobs. Rule 9 interchanges two adjacent jobs across both developers. If the interchange improves the original tardiness, the new schedule dominates the original one.
The following rules eliminate hopeless or redundant schedules, or help us terminate B&B as early as possible, by investigating π = (α, β) in greater detail. Rule 10 inserts a job from β into α and tests the dominance of the new schedule π′ = (α′, β′). In Rule 11, jobs J_i and J_j have the same processing time and due date, and a new schedule π′ is obtained by interchanging them. Both schedules are equivalent in terms of objective cost, so one redundant schedule can be eliminated.
Rule 10: Let J_e be the ending job on developer D_e(π). If there exists a job J_j ∈ β such that C_e(π) + p_j − d_j ≤ 0, then π′ dominates π.
Rule 11: Let J_i ∈ α and J_j ∈ β be two disjoint jobs. If p_i = p_j, d_i = d_j, and j < i, then π′ dominates π.
Let π = (α, β) be a schedule, where α is a determined partial sequence and β has not been determined yet. Rule 12 considers the case in which β is just a small remaining portion of the jobs, sequenced in EDD order on the last developer; that is, the jobs in β are sorted by due date in ascending order. If this arrangement causes no tardiness, then no further branch-and-bound search below π is needed, and the objective cost is F(π) = Σ_{j∈α} T_j(π).
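The Rule 12 test can be sketched as follows. The sketch ignores overlap savings, which is safe for this check: skipping duplicate tasks can only make jobs finish earlier, so if the EDD sequence is on time without the savings, it is certainly on time with them. The function name and interface are illustrative.

```python
# Sketch of the Rule 12 test: sequence the undetermined jobs beta in
# EDD order after the current load of the last developer; if nothing
# is tardy, the partial schedule need not be branched further.
# Overlap is ignored here (a conservative simplification: skipping
# duplicate tasks can only finish jobs earlier).

def edd_causes_no_tardiness(completion, beta):
    """beta: list of (processing_time, due_date) pairs;
    completion: current completion time of the last developer."""
    t = completion
    for p, d in sorted(beta, key=lambda job: job[1]):  # ascending due date
        t += p
        if t > d:
            return False
    return True

print(edd_causes_no_tardiness(3, [(2, 9), (1, 4), (3, 12)]))  # → True
```

If the test returns True, the branch can be fathomed immediately with cost Σ_{j∈α} T_j(π).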
The following rules eliminate schedules that cannot achieve optimality or that are redundant. Rule 13 states that each developer's capacity is limited [38]: each developer's total processing time cannot exceed the limit. Rule 14 handles the situation of multiple identical optimal schedules. Among them, only the one in which the due dates of the leading jobs assigned to the developers appear in non-descending order is kept.

C. LOWER BOUND
To improve the efficiency of B&B, we propose a lower bound for overlapping jobs. To prove the related lemmas, we borrow the idea of problem transformation from Kondakci et al. [39]. With this idea, we repeatedly transform a problem into a simpler one to obtain a lower bound. There are two main stages, and each stage transforms a problem instance into a simpler one.
The first stage aims to eliminate the influence caused by overlapping tasks. Let INST_0 denote the original problem instance and INST_1 denote the transformed instance in which jobs are all disjoint. Let π = (α, β) be a schedule, where β is an undetermined partial sequence. Suppose there are N_g jobs in β and we sort them in ascending order of due date, i.e., J_(1), J_(2), . . . , J_(N_g). Each job is then replaced by the group of its tasks not contained in any earlier job, i.e., G_g = J_(g) \ (J_(1) ∪ · · · ∪ J_(g−1)). It is clear that the groups G_g are pairwise disjoint for g = 1, 2, . . . , N_g. An example of instance transformation is shown in Fig. 2. Let there be three jobs in β. The original instance INST_0 (i.e., the left side of Fig. 2) includes J_1 = {a, b, c, d}, J_2 = {a, e}, and J_3 = {b, g, h} with d_1 = 4, d_2 = 6, and d_3 = 8. Following the first stage, we have the transformed instance INST_1 (i.e., the right side of Fig. 2): G_1 = {a, b, c, d}, G_2 = {e}, and G_3 = {g, h}.
The second stage reassigns the due dates and processing times of G_1, G_2, . . . , G_{N_g}. The processing times and due dates of the groups are sorted separately in ascending order. For simplicity, let p_(1), p_(2), . . . , p_(N_g) and d_(1), d_(2), . . . , d_(N_g) be the corresponding sorted processing times and due dates, respectively. We reassign p_(1) and d_(1) to G_(1), p_(2) and d_(2) to G_(2), and so on. Then these new processing times and due dates become agreeable, i.e., p_(i) ≤ p_(j) implies d_(i) ≤ d_(j) for all i < j [40]. The final instance INST_2 is obtained. Continuing the example of Fig. 2, we have p_(1) = 1, p_(2) = 2, and p_(3) = 4 with d_(1) = 4, d_(2) = 6, and d_(3) = 8. Let F*_0, F*_1, F*_2 denote the optimal objective costs of the three instances INST_0, INST_1, INST_2, respectively. Lemmas 1 and 2 help us show that F*_0 ≥ F*_1 ≥ F*_2; that is, the above transformations lead to no tardiness gain.
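The two transformation stages can be sketched on the Fig. 2 example. The function names and interfaces are illustrative, not the paper's code; only the input jobs and due dates come from the text.

```python
# Sketch of the two transformation stages for the Fig. 2 example.

def stage1_disjoint_groups(jobs_edd):
    """INST0 -> INST1: jobs sorted by due date become disjoint groups
    G_g = J_(g) minus all tasks of earlier jobs."""
    seen, groups = set(), []
    for job in jobs_edd:
        groups.append(job - seen)
        seen |= job
    return groups

def stage2_agreeable(groups, due_dates):
    """INST1 -> INST2: sort processing times and due dates separately
    and re-pair them, making the instance agreeable."""
    return list(zip(sorted(len(g) for g in groups), sorted(due_dates)))

jobs = [{"a", "b", "c", "d"}, {"a", "e"}, {"b", "g", "h"}]  # d = 4, 6, 8
groups = stage1_disjoint_groups(jobs)
print([sorted(g) for g in groups])
# → [['a', 'b', 'c', 'd'], ['e'], ['g', 'h']]
print(stage2_agreeable(groups, [4, 6, 8]))
# → [(1, 4), (2, 6), (4, 8)]
```

The output reproduces the example: disjoint groups of sizes 4, 1, 2, then agreeable (processing time, due date) pairs (1, 4), (2, 6), and (4, 8).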
Lemma 1: F*_0 ≥ F*_1.
Proof: Consider that an overlapping task is removed from some job J_j in INST_0. The processing time decreases from |J_j| to |J_j| − 1 while d_j remains the same, so the total tardiness stays the same or decreases. Repeating the removal process transforms INST_0 into INST_1. None of these steps causes a tardiness gain, so F*_0 ≥ F*_1 holds. The proof is completed.
Lemma 2: F*_1 ≥ F*_2.
Proof: Consider two adjacent groups G_i and G_j in a schedule π, where G_i precedes G_j. There are three non-agreeable cases for the two groups: 1) p_i ≤ p_j and d_i > d_j; 2) p_i > p_j and d_i ≤ d_j; 3) p_i > p_j and d_i > d_j. For each case, we rebuild the two groups to make them agreeable. Let π′ denote the rebuilt schedule.
For case 1), p_i ≤ p_j and d_i > d_j, let the two groups exchange their due dates. The original tardiness is T = max{0, C_i(π) − d_i} + max{0, C_j(π) − d_j} and the new tardiness is T′ = max{0, C_i(π′) − d_j} + max{0, C_j(π′) − d_i}. Note that p_i ≤ p_j, d_i > d_j, and C_i(π) + p_j = C_j(π). The possible tardiness gains (T′ − T) are listed in Table 1, where a minus sign means a negative tardiness gain. Making the two groups agreeable leads to no tardiness gain.
For case 2), p_i > p_j and d_i ≤ d_j, let the two groups exchange all their tasks. The original tardiness is T = max{0, C_i(π) − d_i} + max{0, C_j(π) − d_j} and the new tardiness is T′ = max{0, C_i(π′) − d_i} + max{0, C_j(π′) − d_j}. Note that C_j(π) = C_j(π′) and C_i(π′) = C_j(π′) − p_i < C_i(π). Again, there is no tardiness gain.
For case 3), p_i > p_j and d_i > d_j, let the two groups exchange all their tasks as well as their due dates. The original tardiness is T = max{0, C_i(π) − d_i} + max{0, C_j(π) − d_j} and the new tardiness is T′ = max{0, C_i(π′) − d_j} + max{0, C_j(π′) − d_i}. Note that C_j(π) = C_j(π′) and C_i(π′) = C_j(π′) − p_i < C_i(π). Once again, the transformation leads to no tardiness gain.
Such a transformation of two groups causes no tardiness gain. Therefore, we can sort all the groups step by step into an agreeable sequence, much as bubble sort does, and the whole sorting process causes no tardiness gain either. Hence F*_1 ≥ F*_2 holds. The proof is completed.
By Theorem 1, we obtain a lower bound for a single developer. We then borrow the concept of load balancing from Azizoglu and Kirca [41] to extend the result to a multi-developer environment; the details are described as follows.
The algorithm for obtaining the lower bound is shown in Fig. 3. The main idea is that groups are assumed to be preemptive. For each group G_g, if a developer that can process G_g in time is idle, we utilize the idle developer first (Steps 3-9). If group G_g has not been finished yet and some developers are idle before d_(g), G_g is divided into several parts and allocated to these idle intervals (Steps 11-16). Note that all the jobs in the original problem are non-preemptive; we split these pseudo jobs (i.e., groups) only while estimating the lower bound. In Step 14, we assign a fractional part of G_g to developer m such that the fraction can be finished punctually at d_(g). In Step 19, if group G_g has still not been finished, we pick a least-loaded developer and assign the remaining part of G_g to him/her. Eventually, we compute the total tardiness as the lower-bound estimate. With Lemmas 1 and 2, Theorem 1 shows the validity of the proposed lower bound, i.e., F*_0 ≥ F*_1 ≥ F*_2 ≥ LB. Later, we employ the lower bound in the B&B algorithm to avoid unnecessary search.
Theorem 1: F*_0 ≥ F*_1 ≥ F*_2 ≥ LB.
Proof: By Lemmas 1 and 2, we have F*_0 ≥ F*_1 ≥ F*_2. The lower bound LB is obtained by transforming INST_2 into a preemptive case. The optimal objective cost of a preemptive case is naturally lower than or equal to that of a non-preemptive case. Hence F*_2 ≥ LB. The proof is completed.
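The preemptive idea behind the bound can be sketched in a greatly simplified form: groups in EDD order are split unit by unit onto the currently least-loaded developer. This omits the exact idle-interval packing of Fig. 3 (Steps 3-19) and is only an illustration of why preemption cannot increase tardiness, not the paper's algorithm.

```python
import heapq

# Greatly simplified sketch of the preemptive relaxation behind LB:
# each unit of a group goes to the least-loaded developer, so a group
# finishes no later than in any non-preemptive schedule of INST_2.
# (The exact idle-interval packing of Fig. 3 is omitted.)

def preemptive_tardiness_bound(groups, M):
    """groups: (processing_time, due_date) pairs in EDD order;
    M: number of identical developers."""
    loads = [0] * M                      # completion time per developer
    heapq.heapify(loads)
    total = 0
    for p, d in groups:
        finish = 0
        for _ in range(p):               # split one unit at a time
            t = heapq.heappop(loads)
            heapq.heappush(loads, t + 1)
            finish = t + 1               # pops are nondecreasing, so this
        total += max(0, finish - d)      # ends as the group's last unit
    return total

# The INST_2 groups from Fig. 2 on two developers finish on time:
print(preemptive_tardiness_bound([(1, 4), (2, 6), (4, 8)], 2))  # → 0
```

With a single developer the same groups would accumulate tardiness, which illustrates how the bound tightens as M decreases.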
With the dominance rules and lower bound, we adopt a depth-first search order to develop the B&B algorithm. The details of the recursive B&B algorithm are shown in Fig. 4. At the beginning, we employ a schedule π 0 obtained by GA+SA (see Section 4.2) as the input schedule and let the initial objective cost F * = F(π 0 ). Then B&B(π 0 ,0) is called. If the input schedule is dominated or the lower bound exceeds the currently lowest objective cost, we block the current search path (Steps 2-7). If B&B is visiting the leaf node of a search path, we record its objective cost if it outperforms the currently best schedule π * (Steps 8-12). Finally, B&B enters the next-level recursion in order to exhaustively search the search tree expanded by π 0 (Steps 14-18).

D. GENETIC ALGORITHM
In this section, a genetic algorithm (GA+SA) is developed for large problem instances. Among today's modern metaheuristics, GA has successfully solved many combinatorial problems. Moreover, this problem requires both intensification and diversification forces at the same time to achieve high solution quality [42], so we choose GA when the problem size is large. The proposed genetic algorithm is also equipped with a local search implemented by simulated annealing (SA), because a plain local search easily converges to a local minimum of a small neighborhood. The parameters of GA are determined by a pilot experiment.
Initialization: We encode each chromosome π_i (i.e., a job schedule) as a sequence of N + M − 1 integers (the job indices 1, 2, . . . , N and M − 1 zeros) in random order. The mth zero is a separator dividing the jobs of developer m from those of developer m + 1. For example, for N = 5 and M = 4, (1, 0, 3, 0′, 5, 0″, 4, 2) is a valid chromosome. Let K be the population size; K such chromosomes are generated.
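The encoding can be sketched as follows; the function names are illustrative. Decoding simply splits the gene sequence at the zeros to recover each developer's job list.

```python
import random

# Sketch of the chromosome encoding: a permutation of N job indices and
# M - 1 zeros; the zeros split the sequence into per-developer job lists.

def random_chromosome(N, M, rng=random):
    genes = list(range(1, N + 1)) + [0] * (M - 1)
    rng.shuffle(genes)
    return genes

def decode(chromosome):
    """Split the gene sequence at the zeros into per-developer schedules."""
    developers, current = [], []
    for gene in chromosome:
        if gene == 0:
            developers.append(current)
            current = []
        else:
            current.append(gene)
    developers.append(current)
    return developers

# The example from the text: N = 5 jobs, M = 4 developers.
print(decode([1, 0, 3, 0, 5, 0, 4, 2]))  # → [[1], [3], [5], [4, 2]]
```

Developer 1 processes job 1, developers 2 and 3 process jobs 3 and 5, and developer 4 processes jobs 4 then 2, matching the example chromosome.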

Selection:
We consider a standard roulette-wheel procedure. The fitness of chromosome π_i is defined as g(π_i) = F_max − F(π_i) + 1, where F(π_i) is the objective cost of π_i and F_max = max{F(π_i)} is the maximum objective cost in the current generation. The probability of selecting chromosome π_i is then q_i = g(π_i) / Σ_{k=1}^{K} g(π_k) for i = 1, . . . , K.

Crossover: In this study, the partially matched crossover (PMX) is implemented [43]. We choose this crossover because it has more diversification ability than a traditional two-cut crossover, where diversification means exploration of the search space [42].
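The selection step above can be sketched directly from the fitness formula; the function name and interface are illustrative.

```python
import random

# Sketch of the roulette-wheel selection: fitness g = F_max - F + 1,
# selection probability proportional to g (lower cost, higher chance).

def select(costs, rng=random):
    """costs: objective costs F(pi_i) of the population.
    Returns the index of the selected chromosome."""
    f_max = max(costs)
    fitness = [f_max - f + 1 for f in costs]
    r = rng.uniform(0, sum(fitness))
    acc = 0.0
    for i, g in enumerate(fitness):
        acc += g
        if r <= acc:
            return i
    return len(costs) - 1

costs = [10, 4, 7]   # fitness 1, 7, 4 → probabilities 1/12, 7/12, 4/12
print(select(costs) in (0, 1, 2))  # → True
```

The "+ 1" term keeps every chromosome selectable even when its cost equals F_max, which would otherwise give it zero fitness.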
Mutation: A mutation based on extraction and insertion is implemented. First, a circular left-shift operation x → y is considered, where x and y are two gene positions in a chromosome. 1) Two distinct positions x and y are randomly selected. 2) If x < y, we circularly left-shift the genes within positions x through y. For example, take the schedule (1, 0, 3, 0′, 5, 0″, 4, 2) and the two random positions x = 2 and y = 5.
3) The schedule is mutated to (1, 3, 0′, 5, 0, 0″, 4, 2). Similarly, if x > y, we circularly left-shift the genes outside positions y through x. Second, for each schedule randomly selected for mutation, a probability z and two positions x and y are randomly drawn. Finally, if z < 0.45, x → y is performed; else if z < 0.9, y → x is performed; otherwise, the two jobs at positions x and y are swapped.
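The shift operation can be sketched on the text's example. The separators 0′ and 0″ are written as 6 and 7 here purely so they stay distinguishable in a list of integers; that renaming is an assumption of the sketch.

```python
# Sketch of the circular left-shift x -> y used by the mutation
# (1-based positions, as in the text).

def left_shift(chromosome, x, y):
    """Circularly left-shift the genes in positions x..y (x < y)."""
    genes = chromosome[:]
    segment = genes[x - 1:y]
    genes[x - 1:y] = segment[1:] + segment[:1]
    return genes

# The example from the text, with 0' and 0'' written as 6 and 7:
schedule = [1, 0, 3, 6, 5, 7, 4, 2]
print(left_shift(schedule, 2, 5))  # → [1, 3, 6, 5, 0, 7, 4, 2]
```

Reading 6 and 7 back as 0′ and 0″, the result is (1, 3, 0′, 5, 0, 0″, 4, 2), exactly the mutated schedule in the text.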
Local search: To avoid premature convergence, a simulated annealing (SA) algorithm is implemented [44]. The details of the local search are shown in Fig. 5. In Step 1, the initial temperature and the iteration number are set. Then we obtain a new schedule by randomly swapping two jobs (Step 3). If the new schedule leads to a lower objective cost, or a random-walk acceptance test succeeds, we replace the original schedule with the new one (Step 5). In Step 6, the temperature is decreased and the iteration counter is increased. Finally, the schedule is returned (Step 7). The detailed genetic algorithm (GA+SA) is shown in Fig. 6. At the beginning, the first population is generated, and the best schedule π+ and its objective cost F+ are recorded (Steps 1-2). Then two chromosomes are selected for crossover, or they survive into the next generation (Steps 5-8). Next, about r_M K chromosomes are mutated and r_L K are improved by local search (Steps 9-12). Finally, each chromosome is evaluated and recorded (Step 14). If the solution quality (i.e., F+) improves, the generation counter is reset (Step 15). GA+SA terminates when one of the stopping criteria is met.
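The SA local search can be sketched as below. The cooling rate, initial temperature, and iteration count are illustrative guesses, not the paper's tuned values, and the toy cost function stands in for the total-tardiness evaluation.

```python
import math
import random

# Sketch of the SA local search of Fig. 5: swap two random genes;
# always accept improvements, accept worse schedules with probability
# exp(-delta / T) while the temperature T cools.
# T, cooling, and iters are illustrative, not the paper's values.

def sa_local_search(schedule, cost, T=100.0, cooling=0.95, iters=200,
                    rng=random):
    best = current = schedule[:]
    best_cost = current_cost = cost(current)
    for _ in range(iters):
        neighbor = current[:]
        i, j = rng.sample(range(len(neighbor)), 2)
        neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
        delta = cost(neighbor) - current_cost
        if delta < 0 or rng.random() < math.exp(-delta / T):
            current, current_cost = neighbor, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        T *= cooling
    return best, best_cost

# Toy cost: distance of the sequence from sorted order.
cost = lambda s: sum(abs(v - i) for i, v in enumerate(s))
improved, c = sa_local_search([4, 2, 0, 3, 1], cost)
print(c <= cost([4, 2, 0, 3, 1]))  # → True
```

Because the best schedule found so far is tracked separately, the returned cost never exceeds the input's cost, which is why the local search can only improve a chromosome.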

V. EXPERIMENTAL RESULTS
The experimental results are divided into three parts. In the first part, various parameter settings are considered to examine the performance of the branch-and-bound algorithm (B&B), GA, and GA+SA (GA with simulated annealing). In the second part, the convergence speed and the solution quality of GA and GA+SA are discussed. In the third part, we perform several sensitivity tests to observe the consequences of changing one parameter step by step. All the related algorithms are implemented in Pascal and executed on an Intel Xeon E3 1230 @ 3.20 GHz with 8 GB RAM in a Windows 7 environment. For each setting, 50 random problem instances are generated, and 50 trials are conducted and recorded. Table 2 lists the parameters used in this section. Parameter N denotes the number of jobs and M the number of developers. For a job J_j, the processing time p_j follows a discrete uniform distribution over [1, 2n − 1], and the due date d_j follows a discrete uniform distribution over [T(1 − τ − R/2)/M, T(1 − τ + R/2)/M], where τ is the tardiness factor, R is the due-date range factor, and T = Σ_{j=1}^{N} p_j. To model the overlap effect, we let parameter r_J denote the average portion of overlapping jobs and r_T the average portion of overlapping tasks in an overlapping job. For each setting, the mean number of overlapping jobs is r_J N, and for each overlapping job, the mean number of overlapping tasks is r_T p_j. These overlapping tasks are generated by random draws from {1, 2, . . . , N}. For GA and GA+SA, the crossover rate, mutation rate, local-search rate, and population size are denoted by r_C, r_M, r_L, and K, respectively. The stopping criteria are: 1) the run time exceeds S_T seconds; 2) no improvement is made during the most recent S_G generations. Moreover, a pilot experiment suggests that r_C = 0.8, r_M = 0.2, and r_L = 0.05 achieve high solution quality within an acceptable run time.
To understand how r_J and r_T control the overlap effect in more detail, consider an example with four overlapping jobs J_1, J_2, J_3, and J_4. Note that there is no disjoint job, so we have r_J = 1.0. Since all the tasks of J_1 overlap with others, its overlap degree is 100%. Similarly, the overlap degrees of J_2, J_3, and J_4 are 50%, 50%, and 25%, respectively. Therefore, we have r_T = 0.56 (= (100% + 50% + 50% + 25%)/4).
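The two parameters can be measured on a concrete instance as sketched below. The job contents are hypothetical, constructed only so that the overlap degrees reproduce the percentages above; the source's own example jobs are not given.

```python
# Sketch of measuring r_J and r_T on an instance. The jobs below are
# hypothetical, built to reproduce the overlap degrees in the text
# (100%, 50%, 50%, 25%).

def overlap_parameters(jobs):
    n = len(jobs)
    r_J = r_T = 0.0
    for i, job in enumerate(jobs):
        others = set().union(*(j for k, j in enumerate(jobs) if k != i))
        shared = job & others          # tasks also appearing in other jobs
        if shared:
            r_J += 1                   # this job overlaps with others
        r_T += len(shared) / len(job)  # this job's overlap degree
    return r_J / n, r_T / n

jobs = [{"a", "b"},                     # both tasks shared: 100%
        {"a", "c", "u1", "u2"},         # 2 of 4 shared: 50%
        {"b", "c", "v1", "v2"},         # 2 of 4 shared: 50%
        {"a", "w1", "w2", "w3"}]        # 1 of 4 shared: 25%
r_J, r_T = overlap_parameters(jobs)
print(r_J, r_T)  # → 1.0 0.5625
```

Every job overlaps with some other job (r_J = 1.0), and the mean overlap degree is 0.5625, which the text rounds to 0.56.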

A. OPTIMAL SOLUTION
A pilot experiment is conducted to determine significant settings. To evaluate the performance of the proposed algorithms, we compare them with a similar genetic algorithm [45]. The relative error percentage (REP) is defined as (F_GA − F_B&B)/F_B&B × 100%, (F_GA+SA − F_B&B)/F_B&B × 100%, or (F_GA(Schaller) − F_B&B)/F_B&B × 100%, where F denotes the corresponding objective cost. The largest REP values occur at τ = 0.25, R = 0.5, r_J = 0.5, and r_T = 0.5. This means that ordinary genetic algorithms are not good at achieving optimality when we have late but medium-ranged due dates and medium-level overlapping jobs. Therefore, we choose the setting τ = 0.75, R = 0.25, r_J = 0.75, r_T = 0.75 for the later B&B experiments and the setting τ = 0.25, R = 0.5, r_J = 0.5, r_T = 0.5 for both genetic algorithms. Table 3 includes 1,250 problem instances (i.e., 27 settings × 50 instances) and compares the performance of B&B, GA, GA+SA, and GA(Schaller) [45] for different problem sizes. The NS column of B&B indicates that the number of nodes exceeds 150,000,000, i.e., the instance is not solvable in reasonable time, and the related statistics are excluded. When N ≥ 12, B&B takes more than 10 minutes to obtain optimal schedules, and the number of nodes easily exceeds a hundred million. The main reason is that at least 12 × 4 tasks must be handled for each problem instance; the complexity of permutation and combination exceeds that of a traditional scheduling problem with 12 jobs. On the other hand, the NA columns of the genetic algorithms count the cases in which GA/GA+SA/GA(Schaller) cannot find an existing zero-cost schedule, or in which B&B is terminated prematurely; such statistics are not valid for calculating REP. The three genetic algorithms are competitive when the problem size is small; however, GA+SA outperforms GA when the problem size increases to 12.
For some settings, GA's REP is up to 4.91% and GA(Schaller)'s REP is up to 21.87%, whereas GA+SA's REP is only 0.83%. Moreover, the small p-values support us in rejecting the null hypotheses. This clearly shows the synergy between SA and GA. Figure 7 shows the effects of r_J and r_T on the number of nodes, run time, and objective cost of B&B. As the overlap degree increases, the number of nodes also increases. On the other hand, as r_J and r_T increase, the objective cost (i.e., the total tardiness) decreases. That is, the proposed algorithm reduces the objective cost more effectively than traditional methods. Table 4 covers 450 problem instances (i.e., 9 settings × 50 instances) and compares two B&B algorithms for N = 8, τ = 0.75, R = 0.25, r_J = 0.75, and r_T = 0.75, where the first B&B algorithm is equipped with the proposed lower bound and the other is not. Since both B&B algorithms always provide optimal solutions, we compare only their execution speeds. From these results, we learn that the average number of tasks in a job (i.e., n) slightly influences the execution performance and that the problem becomes more difficult as the number of developers (i.e., M) increases. The aggregate results show that the lower bound is useful for pruning the search tree in terms of the number of nodes, especially for the setting N = 6 and n = 5, where the ratio of improvement is highest. Here min{F_GA+SA, F_GA, F_GA(Schaller)} > 0 denotes the minimal positive objective cost; if min{F_GA+SA, F_GA, F_GA(Schaller)} = 0, the count in the corresponding NA column is accumulated instead. Note that GA+SA always outperforms GA and GA(Schaller), even in terms of run time. This also shows the effectiveness of the local search implemented by simulated annealing.

C. SENSITIVITY ANALYSIS
In the third part, we provide some managerial insights into tardiness cost. In this subsection, the default settings are m = 3, N = 16, n = 8, τ = 0.5, R = 0.5, r_J = 0.5, and r_T = 0.5. Figure 8 shows the influences of the control parameters on tardiness cost. To observe these effects clearly, we adjust one control parameter at a time. Early due dates (i.e., a large τ) directly lead to rising costs. Conversely, many overlapping jobs (i.e., a large r_J or a large r_T) save some duplicate processing time and slightly reduce tardiness costs. Moreover, a wide range of due dates (i.e., a large R) is also helpful in reducing tardiness. Table 6 shows the results of sensitivity analyses on three input parameters (i.e., p_j, d_j, n). Increasing the number of jobs by 15% causes only 5.03% extra cost, which implies that we can save the processing time of some duplicate tasks. Note that the objective cost is the accumulated tardiness, which grows in O(n^2) in terms of processing time. Due to the overlap effect, a 15% increase in processing time leads to only a 17.55% increase in tardiness cost; the rising costs are effectively suppressed. However, the objective cost depends heavily on due dates: even a tiny change in a due date causes a significant increase in tardiness cost. That is, the positive overlap effect cannot effectively neutralize the negative influence caused by early due dates.
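The accumulated tardiness analyzed above is the standard total-tardiness objective. A minimal sketch, with made-up completion times and due dates, is:

```python
def total_tardiness(completion_times, due_dates):
    # Total tardiness = sum over all jobs of max(0, C_j - d_j),
    # i.e., only jobs finishing after their due dates contribute.
    return sum(max(0, c - d) for c, d in zip(completion_times, due_dates))

# Hypothetical schedule: job j finishes at completion[j] with due date due[j].
completion = [5, 9, 14, 20]
due = [6, 8, 12, 15]  # the last three jobs are late by 1, 2, and 5 units

print(total_tardiness(completion, due))  # 8
```

Shrinking a due date feeds directly into every late job's max(0, C_j − d_j) term, which is why the objective is far more sensitive to due dates than to processing times moderated by the overlap effect.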

VI. CONCLUSION
This paper studies an interesting scheduling problem and makes several contributions. To improve the total tardiness of a game project, we take advantage of the overlap effect. A branch-and-bound algorithm is proposed for providing optimal schedules, and a hybrid genetic algorithm is proposed for generating near-optimal schedules. The experimental results show that the metaheuristic deals with the problem near-optimally even when the number of tasks is as high as 480. In the near future, we will consider some special cases of this problem in the game industry. For example, some overlapping jobs may form a directed graph, i.e., a partially ordered sequence. Alternatively, we can assume the number of overlapping tasks in each job is a power of two. In such special cases, more dominance rules can be developed to improve the scheduling efficiency of a game project.
JEN-YA WANG (Associate Member, IEEE) received the Ph.D. degree in computer science and engineering from National Chung Hsing University, Taiwan, R.O.C., in 2009. He is currently a Professor with the Department of Computer Science and Information Management, Hungkuang University, Taiwan. His research interests include optimization algorithms, database systems, patent search, medical imaging, and artificial intelligence.
MENG-WEI CHEN received the master's degree in computer science and engineering from National Chung Hsing University, Taiwan, R.O.C., in 2013. He is currently an Advanced Software Engineer with Transcend-info, Taiwan. He is also in charge of developing SSD (solid state drive) software.
KUEN-FANG JEA (Member, IEEE) received the Ph.D. degree in computer science and engineering from the University of Michigan, Ann Arbor, MI, USA, in 1989. He is currently a Professor with the Department of Computer Science and Engineering, National Chung Hsing University, Taiwan, R.O.C. His research interests include database technology, data mining, big data analytics, and cloud computing.