Adaptive Abstraction-Level Conversion Framework for Accelerated Discrete-Event Simulation in Smart Semiconductor Manufacturing

Speeding up the simulation of discrete-event wafer-fabrication models is essential for fast decision-making to handle unexpected events in smart semiconductor manufacturing because decision-parameter optimization requires repeated simulation execution based on the current manufacturing situation. In this paper, we present a runtime abstraction-level conversion approach for discrete-event fab models to gain simulation speedup. During the simulation, if the fab’s machine group model reaches a steady state, then the proposed method attempts to substitute this group model with a mean-delay model (MDM) as a high abstraction level model. The MDM abstracts detailed event-driven operations of subcomponents in the group into an average delay based on queuing modeling, which can guarantee acceptable accuracy in predicting the performance of steady-state queuing systems. To detect the steadiness, the proposed abstraction-level converter (ALC) observes the queuing parameters of low-level groups to identify the statistical convergence of each group’s work-in-progress (WIP) level. When a group’s WIP level has converged, the output-to-input couplings between the models are revised to change a wafer-lot process flow from the low-level group to an MDM. When the ALC detects lot-arrival changes or any wafer processing status change (e.g., a machine-down), the high-level model is switched back to its corresponding low-level group model. During high-to-low level conversion, the ALC generates dummy wafer-lot events to re-initialize the machine states. The proposed method was applied to various case studies of wafer-fab systems and achieved simulation speedups up to about 4 times with 0.6 to 8.3% accuracy degradations.


I. INTRODUCTION
For smart manufacturing in wafer-fabrication (wafer-fab) systems, physical manufacturing-execution systems (MESs) collaborate with monitoring and controlling facilities (MCFs) to respond to rapid changes in production plans or machines' statuses. In the smart fab, the MCF collects real-time operation data from monitoring MESs and manages highly accurate fab models through the model-parameter calibration based on the collected data. The fab models' high accuracy guarantees reliable performance expectations, considering the decision candidates of various scenarios when an unexpected circumstance is met. After evaluating the short- or middle-term performance for decision candidates, process engineers in the MCF can efficiently influence the MESs by revising the production plans or dispatching rules based on optimal decision candidate(s).
The associate editor coordinating the review of this manuscript and approving it for publication was Shouguang Wang.
There are several studies on simulation-based optimization (SBO) in fab-production planning and scheduling problems [1]-[5]. For the production-plan decision, the previous studies formulate the planning problem using linear programming (LP, or mixed integer programming) approaches to optimize the release quantities of wafer lots in a period. Specific parameter values (e.g., cycle time and work-in-progress (WIP) inventory level), which form an LP's objective function and constraints, are obtained from wafer-fab simulation results under a given production scenario; the scenario determines the remaining parameter values of the LP's objective function.
For the scheduling optimization, many studies have utilized wafer-fab models to minimize the cycle time (or variance of cycle time, and so on) considering various dispatching rules and lot-release policies. During the optimization, various optimization algorithms are employed, such as genetic algorithm, simulated annealing, ordinal algorithm, and so on. However, the wafer-fab model is designed conventionally by discrete-event (DE) modeling, which enables a process engineer to build a hierarchical fab system using subcomponents whose dynamic and interactional operations are modeled in an event-driven manner [9]-[12].
Some studies have introduced queuing models for the fast evaluation of wafer-fab system performance [6]-[8]. However, queuing models have an accuracy problem because they abstract the production dynamics into static (time-independent) equilibrium equations. Thus, queuing models are limited in considering the dynamic effects: unexpected events (e.g., machine down or wafer-lot rework), lot-release changes (caused by new demand, production completion, or release policy) in high-mix fabs, and lot-priority changes according to target dispatching rules.
For the simulation speedup of DE fab models, some studies utilize the parallel simulation approach using multiple CPU cores [13], [14]. The parallel simulation of DE fab models can speed up a single simulation instance for a design-candidate evaluation. During the SBO, however, when executing multiple independent simulation instances to consider the DE models' stochastic properties or to comply with parallel-optimization algorithms, the parallel simulation can be inefficient because of the data and time synchronization overhead among running models on different CPU cores, compared to running multiple independent simulation instances on multiple cores [15], [16].
For the SBO's fast response in the smart fab, this paper focuses on accelerating single simulation instances of DE fab models without utilizing multiple CPU cores. For the speedup, some studies have developed approximated analytical models of detailed DE fab models [17]-[20]. The abstracted models have evolved in various forms: a constant delay, probability distributions, or exponential/quantile functions. They are trained using simulation results according to a design of experiments (DOE). Similar to queuing modeling, since the abstracted models do not include dynamic parameters, we cannot evaluate changing factors in the middle of simulation; the changes can be caused by unexpected machine breakdown, production plan changes, etc. Moreover, when we handle unexpected situations that are outside the DOE coverage of the trained model, the model should be retrained using simulation results based on a new scenario-dependent DOE. The low flexibility caused by the static abstraction, which requires the model training before SBO, makes it challenging to utilize the models in managing smart fabs' various situations.
To resolve the static abstraction problem, we propose an adaptive abstraction-level change approach, which chooses active models among differently abstracted models representing equivalent target subsystems in runtime. The approach has been mainly applied to agent-based simulation, which adaptively reduces the number of agents by abstracting groups of spatial agents into meta-agents that subsume individual behaviors when the clustered agents act together for a long time [22]-[24]. The proposed approach is a first attempt to apply the conversion approach to manufacturing-system simulation. The abstraction-conversion condition is strictly relevant to steady states of detailed models.
This paper proposes a framework that abstracts detailed event-driven operations of the fab's machine-group DE models into statistical mean-delay models (MDMs) when the group models turn into a steady state. When a divergence condition is encountered in a steady-state group, a DE group model becomes active instead of an MDM to simulate the group's transient state. The MDMs are designed based on the queuing models, which guarantee high accuracy in steady-state queuing systems. We denote the DE fab model (which serves the microscopic dynamics of lot-process flow) as a low-abstraction-level (or low-level) model and the MDM as a high-abstraction-level (high-level) model. We extended our previous method to cover various lot-release and scheduling policies and to provide reliable detection methods of convergence to or divergence from a steady state [21].
In the proposed approach, a machine group's steady state means that the number of waiting jobs (wafer lots) in the group's queue converges. A key component, the abstraction-level converter (ALC), collects the queuing-model parameters of a machine group; the parameters include the inter-arrival and queue-waiting times of a probing DE machine-group model. To prevent a hasty decision of local convergence (or divergence) based on queuing-parameter values in a short time period, the averaged queuing-parameter values within a specific time period are treated as a single sample observation. A hasty decision can increase the number of abstraction-level conversions even in unsteady states. When the number of observations reaches a defined sample size, the ALC invokes a Kolmogorov-Smirnov (KS) test to determine whether two consecutive samples' queuing-parameter distributions become statistically indistinguishable. If the time invariances of inter-arrival and waiting times are detected, the ALC concludes that the group's WIP level has converged to a steady state based on Little's law.
If a convergence condition of a steady state is confirmed, the ALC informs its MDM of the observed average delays. Then, the ALC requests the simulation engine to change the flow of incoming jobs (wafer-lot events) from the low- to the high-level model by adjusting the output-to-input port connections between related models, as shown in Fig. 1. To detect a divergence condition from the WIP steady state, the ALC monitors (1) a statistical arrival-rate change from the converged distribution or (2) any event arrival that influences the machine's process time. When a divergence condition is met, the ALC restores the low-level model by re-directing the incoming wafer-lot flow. To maintain consistency between the two-level models during the high-to-low level conversion, the ALC sends dummy events to the low-level model to mimic the lot-processing workload of the high-level model.
Overall, the contributions of our proposed method can be summarized as follows.
• The proposed dynamic switching abstracts the detailed internal operations and event exchanges of steady-state group models into an observed mean delay during the runtime, which yields simulation speedups of up to about four times with relatively small accuracy reductions.
• We propose a steady-state convergence detection method based on a KS-test and an event-group concept.
In the iterative divergence test, the ALC compares two inter-arrival time distributions: a fixed distribution previously observed at the steady-state detection and current time-varying distribution. Considering the inter-arrival time distribution of a steady state is fixed, we introduce an effective KS-test method utilizing a proposed data management method to reduce the computation overhead.
• To maintain consistency at the high-to-low level conversion, we present the dummy-event concept to mimic the machine and queue busyness of a steady state.
The rest of the paper is organized as follows. Section II describes the proposed multi-level modeling approach. Section III details the WIP-convergence detection and abstraction-level transition. Section IV shows the detailed procedures of MDM and coupling changes. Section V shows the experimental results of the proposed approach, and Section VI concludes the paper.

II. ABSTRACTION-LEVEL CONVERSION APPROACH
In this section, we introduce an overview of the DE fab models. Then, we present a perspective of the model-abstraction conversion based on the WIP-level steadiness and the structure of a framework for the adaptive abstraction-level change in simulation runtime.

A. BACKGROUND OF DISCRETE-EVENT MODELING OF WAFER-FAB MANUFACTURING SYSTEMS
Many modelers have attempted to represent the fab system as a set of event-driven model components whose state changes are triggered by events. The primary model components are inventory models and machine-group models (also called workstation models). Still, other models (e.g., the automatic guided vehicle (AGV) models for wafer delivery) can be included in the fab, depending on the modeler's objectives.
The inventory models release blank wafers into machine groups based on the required demands. There are various release policies, which can be categorized into push-based (open-loop) and pull-based (closed-loop) release policies. The push-based policy has a uniform release rate regardless of any current fab status, so it does not make an adaptive adjustment for a certain disturbance (e.g., machine down). The pull-based policy generates the wafer lots based on the WIP-or workload-related feedback, so that the release rates vary dynamically in response to the disturbances. The examples of pull-based policies are constant work in progress (CONWIP) and workload regulation (WR).
The subcomponents of machine-group models are equipment (or machine) models, operator models, and wafer-lot queues. An equipment model performs a specific process according to a recipe of incoming wafer lots that decides the operational parameters of subprocess steps, the setup time, and so on. The process steps of each machine are modeled considering the sequence of subprocess steps, the number of chambers, and the number of processing wafers (or lots).
During the simulation, wafers are processed through a series of machines in different workstations. For a particular process step in a machine group, waiting wafer lots in a group-queue model can be dispatched to numerous machines. There are multiple dispatching rules to decide the priority between waiting lots. The typical examples of dispatching rules are first in first out (FIFO), shortest processing time (SPT), shortest remaining processing time (SRPT), and fluctuation smoothing policies for mean cycle time (FSMCT), and so on.
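As an illustration of how a dispatching rule ranks waiting lots, the following minimal sketch selects the next lot from a group queue under FIFO or SPT. The `Lot` fields and the `next_lot` function are hypothetical names introduced here for illustration; they are not taken from any fab simulator.

```python
from dataclasses import dataclass


@dataclass
class Lot:
    name: str
    arrival_time: float   # time the lot joined the group queue
    process_time: float   # expected processing time of its next step


def next_lot(queue, rule="FIFO"):
    """Return the waiting lot with the highest priority under the given rule."""
    if rule == "FIFO":
        # First in, first out: the earliest arrival goes first.
        return min(queue, key=lambda lot: lot.arrival_time)
    if rule == "SPT":
        # Shortest processing time: the quickest job goes first.
        return min(queue, key=lambda lot: lot.process_time)
    raise ValueError(f"unsupported rule: {rule}")


queue = [Lot("A", 0.0, 5.0), Lot("B", 1.0, 2.0), Lot("C", 2.0, 3.0)]
fifo_choice = next_lot(queue, "FIFO").name
spt_choice = next_lot(queue, "SPT").name
```

With this queue, FIFO picks lot A (earliest arrival) and SPT picks lot B (shortest processing time), which shows how the same queue content can yield different waiting times per lot.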
In high-mix fabs, the machine groups can be defined in detail for each product, and a machine model can be a member of multiple groups. After finishing a process, wafer lots are loaded to a cassette and transported to another equipment group for a subsequent process. The operator models are reliant upon the human resources to load and unload wafers into machines, to run the machines, and sometimes to perform inspection steps. The modeler can assign different types of operator models, which supervise overall work areas and change the production plans.
Depending on the modeling details of the composed models, various types of events can be employed. A wafer-lot event (e ), among various events, is a key event representing a processing flow and is utilized to exchange wafer lots among machine-group models. The e events can contain the information of the lot name, due date, the number of wafers, a current and required next processing step and the steps' recipes, and so on.
VOLUME 8, 2020

B. PROPOSED ABSTRACTION-LEVEL CONVERSION CONCEPT
During the simulation of DE fab models, wafer-lot events (e ) flow across multiple machine groups according to the required process sequence in their machines. Concerning the level of detail, process engineers can variously design a machine-group model using subcomponent models, such as a group queue, operators, tools, and so on. The subcomponent models can exchange various types of events, including e , for the lot dispatching to available machines, lot loading/unloading process, subchamber operations, and so on.
In a real fab, the machine groups of target wafer-fab systems typically start in a transient state. Then, they gradually reach a stable state by adjusting the lot-release intervals to prevent the escalation of WIP levels in each group's queue. The stable groups can also turn into a transient state when the lot arrival-rate changes are caused by the start or completion of a product manufacturing in a high-mix fab, the machines' or operators' scheduled or unexpected status changes, and so on.
When a group operates in a transient state, the queuing performance measures, such as the time spent in the group, the average number of wafer lots in the group (called a group's WIP), and the machine utilization, vary abruptly over time, whereas in a steady state, the average queuing performance of a group reaches stable equilibrium values under the converged rates of input e arrivals.
DE modeling is known as an effective method to reproduce the behaviors of transient and steady-state machine groups. The proposed approach adopts the dynamic partial replacement of an active steady-state group model in runtime, by changing the DE machine-group model to an abstracted MDM, as shown in Fig. 2. The conversion abstracts the detailed operations and event interactions of the group's subcomponents into a constant delay model. We denote DE machine-group models as low abstraction level models and MDMs as high abstraction level models.
For the detection of a group's steady state, the queuing parameters of the inter-arrival times (1/λ) and waiting times (t w ) of the wafer lots are observed. If the observed 1/λ and t w distributions of two consecutive samples become statistically time-invariant, then we can confirm that the queue length also converges to $\hat{w}$ based on Little's law: $\hat{w} = \hat{\lambda} \cdot \hat{t}_w$, where $\hat{\lambda}$ and $\hat{t}_w$ are the means of the λ and t w observations. The detailed detection method of the steady-state condition (which is the convergence of t w and 1/λ) is discussed in Sec. III-A.
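The Little's-law relation above can be checked numerically with a small sketch. It assumes, for illustration only, that the mean arrival rate is taken as the reciprocal of the mean observed inter-arrival time; the function name and the sample values are hypothetical.

```python
from statistics import mean


def steady_state_wip(inter_arrival_times, waiting_times):
    """Estimate the converged queue length w = lambda * t_w (Little's law)."""
    # Assumption: arrival rate = 1 / (mean inter-arrival time).
    arrival_rate = 1.0 / mean(inter_arrival_times)
    return arrival_rate * mean(waiting_times)


# Lots arrive every 2 time units on average and wait 6 time units on average.
w_hat = steady_state_wip([2.0, 2.0, 2.0, 2.0], [5.0, 7.0, 6.0, 6.0])
```

Here the arrival rate is 0.5 lots per time unit and the mean wait is 6 time units, so about three lots are expected to wait in the queue.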
Following the queuing modeling, the proposed approach considers the overall time spent (or delay, t d ) in the group as the sum of t w and service time (t s ); a lot's t w is the difference between its group-arrival time (t a ) at the queue model and its queue-departure time, and t s is the remaining time before leaving the machine group (t d − t w ). The main portion of a lot's t s is a recipe-dependent wafer processing time in a machine. The processing times of wafer lots following the same recipe do not show significant differences. Because t s has a small variance, if a group's t w distributions of consecutive samples converge, the overall t d is assumed to converge.
The dispatching rules determine the priorities of waiting lots based on their specific parameters. For example, the FIFO-dispatching rule concerns the arrival time, and the SPT rule deals with the machine's process time. Generally, the parameters can be classified into (1) constant parameters (which are assigned before simulation) and (2) variable parameters (which change in runtime), as shown in Fig. 3(a). Lots having the same composite values of the parameters related to the production plan and fabrication process are called lots of the same type.
The sub-parameters of the lot type may be different depending on the target dispatching rule; nevertheless, a recipe is an essential sub-parameter of the lot types. In a steady-state group, the t d (= t w + t s ) distributions can significantly vary depending on the lot type because of the priority difference (affecting t w ) and recipe difference (involved in t s ). Thus, a mean delay is derived by the lot type and is differently applied to input lots based on their lot types.
In a steady-state machine group, the t w of a new wafer lot is highly related to a relative priority rank and the proportion of each type in the group queue. To maintain the steady state of t w , the arrival pattern of input wafer lots should be time invariant, which requires incoming lots' λ by lot type to vary within some statistical margins of the observed λ distribution at the steady-state detection. For the calculation of 1/λ (or λ) by the lot type, a lot's 1/λ is defined as the difference between the lot's arrival time and a same-type lot's previous arrival time. The detailed divergence detection method from a steady-state 1/λ distribution is described in Sec. III-B. The lot proportions by types in a steady-state machine group are represented as shown in Fig. 3.
An event that changes a machine or an operator's mode (such as scheduled machine maintenance or operator's rest) influences the group's lot process capability; in this case, a mean value of t d at a steady state needs a re-observation. We define the mode-change events (which influence the process capability) as e m . Thus, the overall divergence conditions in a steady-state group are (1) the 1/λ deviation from a statistical margin of steady-state distribution, (2) the e m arrival, and (3) an unobserved type of wafer-lot arrival.
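The three divergence conditions can be read as a simple decision function. The sketch below is only an illustration: the function name, argument names, and returned labels are hypothetical, and the CDF distance and its bound are assumed to be supplied by the statistical test described in Sec. III.

```python
def divergence_reason(lot_type, known_types, cdf_distance, d_max, mode_change):
    """Return why a steady-state group must leave the high level, or None."""
    if mode_change:
        # Condition (2): a mode-change event changes the process capability.
        return "mode-change event"
    if lot_type not in known_types:
        # Condition (3): no mean delay was observed for this lot type.
        return "unobserved lot type"
    if cdf_distance > d_max:
        # Condition (1): inter-arrival times left the steady-state margin.
        return "inter-arrival deviation"
    return None


known = {"A", "B"}
still_steady = divergence_reason("A", known, 0.10, 0.25, False)
new_type = divergence_reason("C", known, 0.10, 0.25, False)
```

Any non-`None` result would trigger the high-to-low level conversion for the group.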

C. OVERVIEW OF PROPOSED FRAMEWORK
For the abstraction-level transition, we propose a framework whose structure is illustrated in Fig. 4. In the framework, each machine group consists of a DE machine-group model and an additional pair of an abstraction-level converter (ALC) and a mean-delay model (MDM).
At the low level, the ALC monitors the outgoing e of its low-level DE group model to extract the queuing parameters for a WIP convergence test. When a steady-state convergence test is passed, the ALC informs its MDM about the observed average t d of each lot type. At the high level, the ALC monitors the incoming events of e from the previous machine groups to detect a steady-state divergence condition and relays e to the MDM. When an abstraction level is converted to the low-level, the ALC generates dummy events of e to make workload consistent between the two-level models.
For the framework, we assume that the target DE simulator supports output-to-input port connections between models, which represent event-flowing paths. The model connection via port interfaces ensures the modularity, which helps the modeler to construct various target fab systems using modular models. The abstraction level of a group determines whether an e from the previous machine is delivered to the low-level group model or to the ALC. If the input e port of a low-level group (or an ALC) is disconnected, the model becomes deactivated because of the absence of future e events. When the target DE fab model adopts a pull-based lot release policy, if a DE machine group produces feedback (as a bottleneck group) to an inventory model, the group's ALC generates the feedback (as an event, e fb ) on behalf of the DE group at the high level.
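The effect of revising an output-to-input coupling can be sketched with a toy coupling table. The `Engine` class, the model names, and the port names below are hypothetical and do not correspond to the API of any particular DE simulator.

```python
class Engine:
    """Toy coupling table mapping an output port to one input port."""

    def __init__(self):
        # (source model, output port) -> (destination model, input port)
        self.couplings = {}

    def connect(self, src, out_port, dst, in_port):
        # Connecting an already-coupled output port replaces its destination.
        self.couplings[(src, out_port)] = (dst, in_port)

    def destination(self, src, out_port):
        return self.couplings[(src, out_port)]


engine = Engine()

# Low level: lots leaving group1 enter the detailed DE model of group2.
engine.connect("group1", "lot_out", "group2_de", "lot_in")
low_level_dst = engine.destination("group1", "lot_out")

# Steady state detected in group2: reroute incoming lots to its ALC.
engine.connect("group1", "lot_out", "group2_alc", "lot_in")
high_level_dst = engine.destination("group1", "lot_out")
```

Because the sender only writes to its output port, the upstream model needs no change when the destination is switched; the disconnected model simply stops receiving events.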
To enable the legacy DE machine-group models to operate with ALCs and MDMs, the DE group models need to be revised to support the following requirements.
• For every e departing a current machine group, the following information must be generated: 1/λ, t w , and t d by lot type (see Sec. II-B).
• The ALC and DE machine-group model should share the latest group-arrival time (t a ) by type to derive the 1/λ of next-arriving lot after the abstraction level is changed.
• The DE machine-group model needs to support the dummy-e processing. A dummy e does not differ from a typical e , except that it is discarded after being processed by machines without being propagated to any other group.
• Some mode-change events (e m ) that influence the lot-processing time (such as machine-down/up event) are handled globally based on a certain probability or scheduled plan.
• When a DE machine-group model sends a processing-ended e to the next machine group, the component sends e through the event's output port, without indicating the event's destination explicitly. The simulation engine determines the e 's destination based on the output-to-input port connection information. The ALC can request the engine to revise the connection information in runtime.
• When a DE machine-group model generates the feedback event of e fb to inventory models, the DE group model should provide a query interface for its ALC to observe a state associated with a lot pulling condition (e.g., WIP or workload). At the low-to-high level conversion, the lot-pulling ALC initializes and updates the state to check whether the pulling condition is satisfied and generate the e fb events instead of the low-level model.
When exporting e to the next machine groups, the MDM should revise e for the next groups to receive equivalent data of e (e.g., required processing step) regardless of the sender's abstraction level. When a bottleneck group is required to generate e fb , the group's ALC checks a lot-pulling condition based on the current working status of both the high- and low-level models at the high level. The ALC obtains the low-level model's current working status using the extended query interface.
When some working (or active) machine groups, which are currently processing lots, share a machine, the abstraction levels of those groups need to be synchronized. For example, suppose that a shared machine receives the lots from two groups' queues. If the two working groups' abstraction levels are different, the remaining low-level group can fully occupy the shared low-level machine because the high-level group no longer provides the lots to the machine.
Considering the dynamic e -flow change and other purposes, such as the abstraction-level synchronization among machine-sharing groups and the queuing-parameter sampling for the convergence and divergence test, we defined the system events for the ALC as follows.
Definition 1: The system events of ALC consist of e pc , e as , and e s where
• e pc is an event for the simulation engine to change the output-to-input connection between models,
• e as is an event for the abstraction-level synchronization among machine groups that share machine(s), and
• e s is an event to schedule the queuing-parameter sampling and steady-state convergence (or divergence) test of machine groups.
The detailed detection mechanisms of steady-state convergence or divergence are described in the following section.

III. CONVERGENCE AND DIVERGENCE CONDITIONS FOR STEADY STATE OF MACHINE GROUPS

A. CONVERGENCE DETECTION TOWARD STEADY STATE AT LOW ABSTRACTION LEVEL
At the low level, ALCs receive outgoing events of e from their current groups. They extract the information of 1/λ, t w , t d , and lot-type values from the event. ALCs can collect other information, such as recipe, for the dummy-e generation. ALCs attempt to detect whether the WIP levels in their machine groups reach a steady state based on the queuing-parameter observations using two consecutive samples.
Typical DE fab models use random variables that follow certain distributions to define model's operation parameters, such as a wafer-processing time of tool models or wafer-lifting (or break) time of operator models. Thus, each queuing parameter of lots can vary abruptly during simulation, even in a steady state. To avoid the hasty decision of local convergences (or divergences) using the observed queuing-parameter values of individual lots for a short-time period, which can result in unnecessary abstraction-level transitions, we introduce a concept of the e group, as shown in Fig. 5. ALCs consider arrived events of e within a defined period as the members of an e group. Each e group's queueing parameters are the averages of individual member lots' parameters and derived using the following procedures.
The queuing parameters of arrived e events are stored using a local list (L g ). We denote the 1/λ, t w , and t d of an i-type e (e ,i ) as 1/λ i , t w,i , and t d,i , respectively. When an e arrives at the empty L g , the ALC schedules a sampling task to observe the current e group's queuing parameters by sending a delayed event of e s to itself. When it receives e s at the defined time bound of the e group, the ALC calculates the averaged values of 1/λ, t w , and t d , as well as other parameters: the number of e arrivals by each lot type, the e -group order, and recipe. The other parameters are utilized to derive the mean delays, generate the dummy events of e , and prepare for the divergence test at the high-level conversion. After deriving the current e group's observations by lot types, the ALC stores the observations in the sample list.
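The per-group averaging step can be sketched as follows. This is a minimal illustration under the assumption that each departing event is recorded as a tuple of lot type, inter-arrival time, waiting time, and delay; the function name and record layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize_e_group(records):
    """Average the queuing parameters of one event group by lot type.

    records: list of (lot_type, inter_arrival, t_w, t_d) tuples collected
    in the local list during one sampling period.
    """
    by_type = defaultdict(list)
    for lot_type, inter_arrival, t_w, t_d in records:
        by_type[lot_type].append((inter_arrival, t_w, t_d))
    return {
        lot_type: {
            "count": len(rows),
            "inter_arrival": mean(r[0] for r in rows),
            "t_w": mean(r[1] for r in rows),
            "t_d": mean(r[2] for r in rows),
        }
        for lot_type, rows in by_type.items()
    }


group = summarize_e_group([
    ("A", 2.0, 4.0, 10.0),
    ("A", 4.0, 6.0, 12.0),
    ("B", 3.0, 1.0, 5.0),
])
```

Each call yields one observation per lot type, so short-lived fluctuations of individual lots are smoothed before any statistical test is applied.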
When the number of observed e groups reaches a defined size, the ALC starts a comparison of the current sample's 1/λ and t w distributions with those of the previous sample. In the proposed approach, we employed a two-sample KS test to determine the samples' statistical similarity in terms of means and variances.
The KS test is nonparametric, making no assumptions about distribution types of 1/λ and t w . For the KS test, the ALC should prepare 1/λ and t w cumulative distribution functions (CDFs), using the previous and current samples' observations. For formation into 1/λ or t w CDFs, each sample's observations having different lot types should be sorted. To sort the values of 1/λ (or t w ) of different lot types, we presume that there is precedence between lot types, and the lot type has higher priority than the queueing parameter.
Given the previous and current samples' distributions of 1/λ i (or t w,i ), whose sizes are denoted as n and m, the KS statistic (D n,m ) is calculated using the following equation.
$$D_{n,m} = \sup_{x} \left| F_1(x) - F_2(x) \right|$$
where F 1 and F 2 are the CDFs of the two samples. D n,m is the maximum difference between the two CDFs and has a negative relationship with the p-value; if D n,m becomes zero, the p-value becomes one, and a p-value close to one means that the two distributions statistically match. In the proposed framework, we need to set the required p-value as a guide for the allowable similarity (e.g., 0.8) of the two consecutive samples.
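To make the two-sample KS comparison concrete, the sketch below computes the maximum distance between two empirical CDFs and a distance bound for a given significance level. The bound uses the standard large-sample approximation c(α)·sqrt((n+m)/(n·m)) with c(α) = sqrt(−ln(α/2)/2); this is an assumption for illustration and is not necessarily the derivation used in the cited references. Function names are hypothetical.

```python
import math


def ks_statistic(sample1, sample2):
    """Maximum absolute difference between two empirical CDFs."""
    a, b = sorted(sample1), sorted(sample2)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both CDFs past every observation equal to x.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_critical_value(alpha, n, m):
    """Large-sample bound on the KS distance at significance level alpha."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))


d_same = ks_statistic([1, 2, 3], [1, 2, 3])
d_shift = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
bound = ks_critical_value(0.05, 100, 100)
```

Two samples would be treated as statistically similar when the computed distance stays below the bound. Where SciPy is available, `scipy.stats.ks_2samp` provides the statistic together with a p-value.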
Based on a given p-value and an approximated sample size, which is n·m/(n+m), the derivation methods of a maximum bound of D n,m have been studied in [25], [26]. The allowable maximum D n,m is denoted as d max . The derived values of D n,m less than d max guarantee the required statistical similarity. For example, when the sizes of consecutive samples of
After comparing the two D n,m values of the consecutive λ and t w distributions with d max , the ALC confirms that its machine group runs in a steady state if both values are less than d max . If confirmed, the ALC derives a high-level state (s h ) for the MDM's operation and dummy-e generation at the subsequent high-to-low level conversion. The detailed substates of s h are defined as follows.

B. DIVERGENCE DETECTION AT HIGH ABSTRACTION LEVEL
This section describes the detailed test method for detecting a 1/λ deviation from a statistical margin, which is one of the divergence conditions (as discussed in Sec. II-B). To maintain a group's high level, inter-arrival times of e should remain within the statistical margin of a steady state's 1/λ distribution observed at the low-to-high level transition.
To check the time invariance of 1/λ, whenever an e arrives, the ALC saves the e 's 1/λ i in the local list (L g ) to derive the averaged 1/λ i of a current e group, as shown in Fig. 6. After probing the 1/λ i of the input e , the ALC relays the event to its MDM. When it receives a sampling event of e s at the time bound of the current e group, the ALC inserts the current e -group observations into the current-sample list. Then, the observations of the oldest e group leave to maintain the defined number of e groups, as shown in Fig. 6. The defined e -group number is the same as the number of e -group samplings for the convergence test at the low level. After updating the current sample, the ALC performs a divergence test by examining the current 1/λ i distribution.
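The fixed-size window of event groups can be sketched with a bounded deque. The class name and the representation of one group observation (a single averaged inter-arrival value) are hypothetical simplifications for illustration.

```python
from collections import deque


class CurrentSample:
    """Keeps the most recent event-group observations, oldest first."""

    def __init__(self, size):
        self.groups = deque(maxlen=size)

    def push(self, observation):
        """Add the newest observation; return the one that leaves, if any."""
        full = len(self.groups) == self.groups.maxlen
        leaving = self.groups[0] if full else None
        self.groups.append(observation)  # deque drops the oldest when full
        return leaving


sample = CurrentSample(size=3)
left = [sample.push(x) for x in (2.0, 2.1, 1.9, 2.4)]
window = list(sample.groups)
```

The returned leaving observation and the newly added one are exactly the changes that a subsequent divergence test needs to account for.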
For the divergence test, the ALC manages an additional list (L s ) in runtime, as shown in Fig. 6. This list of L s is proposed to efficiently measure the CDF distance without redundant comparison between each probability of two consecutive samples' CDFs in iterative KS tests. The nodes in the L s are sorted by their 1/λ i values.
During iterative divergence tests, the previous sample, which is the 1/λ i distribution of a steady state as a criterion for determining the current sample, is fixed after the high-level conversion. Whereas, the current sample's 1/λ i CDF is repeatedly changing whenever new observations occur. We denote the 1/λ i CDF of a steady-state sample as F s and the other time-varying current sample as F c . The proposed method focuses on partial CDF distances (|F s (1/λ i ) − F c (1/λ i )|) caused by changed observations instead of comparing all CDF distances.
To trace the CDF-distance change, we define the nodes in L s as follows. Each node of L s is one of two types: a fixed node containing any 1/λ i of the previous steady-state sample, or a removable node containing a λ i value that appears only in the current sample. The current sample's size (m) can change when the numbers of leaving and added observations differ. Using a node's F s (1/λ i ) and c in L c , the CDF distance can be calculated as |F s (1/λ i ) − c /m|. The initial list of L s at the high-level transition and a node-insertion example are illustrated in Fig. 7(a).
Before testing a divergence, the ALC should prepare a temporary list of tuples (λ i , n), where n denotes the change in the number of observations. The tuple list is sorted by λ i and denoted as L c . If a λ i value leaves as an oldest e -group observation, n is set to −1; if it enters as a newest e -group observation, n is set to 1. If a λ i value is included in both the oldest and newest e groups, n is zero. Tuples whose n is zero are discarded without being added to L c , as shown in Fig. 7(b).
After initializing m, d max , L c , and L s , the detailed procedures to detect a steady-state divergence are described in Alg. 1.
During the divergence test, each (λ i , n) node in L c is processed one by one. If n of the current L c node is −1, then the c values of all influenced L s nodes, whose 1/λ i is equal to or greater than the current L c node's 1/λ, decrease by one because c is a cumulative sum. Conversely, when the current L c node's n is 1, the c values of the influenced L s nodes increase by one. The proposed test method traverses each node in L s and revises its c by the assembled n influences (the accumulated change in c).
After updating the c of the current L s node (node 0 ), if the node 0 's CDF distance is greater than d max , the ALC confirms that the machine group has turned into an unsteady state. If the node 0 's c (c 0 ) becomes zero and its flag is true, then the node is removed to speed up node traversal in subsequent divergence tests. When no node is found for a new 1/λ i value, a new node is added based on the previous node's F s (1/λ i ) and c (F s (1/λ − i ) and c − ). If the previous node is not found (when node 0 is the head node of L s ), then F s (1/λ − i ) and c − are zero.
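The core of the divergence test can be illustrated with a simplified sketch. It applies the (value, n) changes to the current sample and then makes one sorted pass, accumulating the cumulative count c and comparing c/m with the fixed steady-state CDF. Unlike the node-based procedure of Alg. 1, this sketch rebuilds the cumulative counts on every call instead of updating only the influenced nodes; the function name and argument layout are assumptions made for illustration.

```python
from bisect import bisect_right
from collections import Counter


def divergence_test(steady, current_counts, changes, d_max):
    """Return True if the current sample has diverged from the steady-state sample.

    steady:         sorted list of steady-state observations (fixed sample)
    current_counts: Counter of observations in the current sample (mutated in place)
    changes:        list of (value, n) tuples with n = +1 (added) or -1 (removed)
    d_max:          maximum tolerated CDF distance
    """
    for value, n in changes:
        current_counts[value] += n
        if current_counts[value] == 0:
            del current_counts[value]  # drop values that left the current sample

    m = sum(current_counts.values())  # current sample size
    if m == 0:
        return False  # nothing to compare (assumed behavior)

    c = 0  # cumulative count of current observations <= key
    for key in sorted(set(steady) | set(current_counts)):
        c += current_counts.get(key, 0)
        f_s = bisect_right(steady, key) / len(steady)
        if abs(f_s - c / m) > d_max:
            return True  # early exit on the first violating node
    return False
```

The early return mirrors the behavior described above: as soon as one node's CDF distance exceeds d max, the group is treated as unsteady and the remaining nodes need not be visited.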

IV. ABSTRACTION-LEVEL CONVERTER AND DYNAMIC PORT-COUPLING CHANGE
This section introduces the detailed operation procedures of ALC (including the dummy-e generation), based on the described steady-state convergence and divergence detection method. We also describe the technique of output-to-input port-connection change between models.

A. ABSTRACTION-LEVEL CONVERTER
The ALC, a key component for the dynamic change in the abstraction levels, is defined as follows.
• s fb is a user-custom state to support a target lot-release policy at the high level;
• {m alc } is the set of other equipment groups' ALCs that share a machine with the current group;
• t idle is a user-defined threshold symbolizing a long time span;
• ψ l2h : X × S a → S a × B s × S h × Y b h is the low-to-high-level conversion function, where X is the input event set {e , e m , e as , e s }, S h is the set of s h , and Y b h is zero, one, or multiple events of e s , e pc , and e as ;
• ψ h2l is the high-to-low-level conversion function, where Y b l is zero, one, or multiple events of e pc , e as , and dummy e ;
• η i /η t is a user-custom function to initialize/terminate a lot-pulling operation of a bottleneck machine group; and
• η fb is a user-custom lot-release feedback generator, where Y fb is zero or one event of e fb .
At the low level, an ALC's s a is either low or ready, and the initial s a is low. The ready state means that the current group runs in a steady state, while an active ALC in {m alc } operates in a low state. An active ALC is one whose equipment group recently received a wafer lot; this is determined from the difference between the current time (t c ) and the latest wafer-lot arrival time (sup{t l,i }), which must be less than t idle . We say that an ALC is idle when the difference between t c and sup{t l,i } is larger than or equal to t idle . For an ALC to enter the high state, its equipment group and the others in {m alc } should operate in steady states or be idle.
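The transition rule above can be written compactly. The following sketch assumes that each peer ALC in {m alc } exposes its state and latest lot-arrival time as a tuple; the function names are hypothetical.

```python
def is_idle(t_c, latest_arrival, t_idle):
    """An ALC is idle when no lot has arrived for at least t_idle."""
    return (t_c - latest_arrival) >= t_idle


def state_after_convergence(t_c, t_idle, peers):
    """State of an ALC whose own group has just been confirmed as steady.

    peers: iterable of (state, latest_arrival) for the ALCs in {m_alc}.
    Returns "high" if every peer is ready or idle, otherwise "ready".
    """
    if all(state == "ready" or is_idle(t_c, latest, t_idle)
           for state, latest in peers):
        return "high"
    return "ready"
```

With an empty {m alc } the function returns "high" immediately, which matches the case of a group that shares no machine with other groups.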
The s h is updated when detecting a steady state (as discussed in Sec. III-A). As mentioned in Sec. II-C, low-level machine-group models and their ALCs are required to share {t l,i } for the 1/λ i derivation after the abstraction-level change. The b s is a state to check whether an e s event for the current e -group sampling was previously generated in runtime. The s fb is a model-specific state to calculate a lot-release condition when the target fab model follows a pull-based lot-release policy. According to the target wafer-fab system and production scenario, the {m alc } is initialized as ∅ or as the ALCs of other machine groups sharing any machine with the current group.
When an event arrives at the ALC at the low level, the ALC calls ψ l2h to handle the input event, which is e , e m , e as , or e s . At the high level, the ALC invokes ψ h2l to serve incoming events of e . At the high-level conversion, when a machine group generates feedback to an inventory model, the machine group's ALC initializes s fb by η i to obtain a current working state from the ALC's low-level model. If a bottleneck machine group follows the CONWIP policy, η i can update the s fb based on the current WIP level of the machine group. Otherwise, if a bottleneck group adopts the WR policy, η i can initialize the s fb based on a steady-state workload (that is, the sum of the remaining processing times of staying lots) of the machine group.
The high-level ALC can produce the feedback events of e fb to an inventory model through the η fb execution, instead of its low-level bottleneck group model. For the CONWIP policy, a bottleneck machine-group model generates an e fb event when a lot is dispatched to a machine. Likewise, based on an arrived e ,i , the group's η fb can be designed to generate an e fb event having the t̂ w,i delay (that is, the waiting time to be dispatched). For the WR policy, a bottleneck machine-group model generates feedback when the current workload becomes less than a defined threshold. Similarly, the η fb for the WR policy can be implemented to predict a future threshold-crossing (FTC) time under the assumption that no further lot events arrive. Then, the η fb schedules a delayed event of e fb for the delivery to a target inventory model at the predicted FTC time. When a lot arrives before the FTC time, the η fb nullifies the previously scheduled e fb and generates a new e fb . The s h should include the information of the predicted FTC time and the scheduled e fb .
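As an illustration of the FTC prediction for the WR policy, the sketch below assumes that, without further arrivals, the workload drains at a constant rate (`drain_rate`, for example the number of busy machines). This drain model and all names are assumptions for illustration; the actual η fb is user-defined.

```python
def predict_ftc_time(t_c, workload, threshold, drain_rate=1.0):
    """Predict when the workload falls to the threshold if no further lots arrive.

    Assumes the workload decreases linearly at drain_rate per unit time.
    """
    if workload <= threshold:
        return t_c  # already at or below the threshold
    return t_c + (workload - threshold) / drain_rate


class FeedbackScheduler:
    """Keeps at most one scheduled e_fb; a new lot arrival replaces it."""

    def __init__(self, threshold, drain_rate=1.0):
        self.threshold = threshold
        self.drain_rate = drain_rate
        self.workload = 0.0
        self.last_update = 0.0
        self.scheduled_ftc = None  # time of the pending e_fb, if any

    def on_lot(self, t_c, processing_time):
        # Drain the workload for the time elapsed since the last update.
        elapsed = t_c - self.last_update
        self.workload = max(0.0, self.workload - elapsed * self.drain_rate)
        self.workload += processing_time
        self.last_update = t_c
        # Nullify the previous e_fb by overwriting it with a new prediction.
        self.scheduled_ftc = predict_ftc_time(
            t_c, self.workload, self.threshold, self.drain_rate)
        return self.scheduled_ftc
```

Overwriting `scheduled_ftc` on every arrival corresponds to nullifying the previously scheduled e fb and generating a new one.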
At the high-to-low level conversion, the ALC executes η t to cancel any previously scheduled events of e fb . In the following sections, we explain the detailed procedures of ψ l2h and ψ h2l .

B. EVENT-HANDLING PROCEDURES AT LOW-LEVEL
Based on the proposed convergence test, the overall event-handling procedures of ψ l2h for high-level conversion are described in Alg. 2.
The ALC receives each event through its associated input port. When s a is low, if an e arrives, the ALC saves e 's queuing and other related parameters using the local list (L g ). Then, the ALC schedules the operation of e -group sampling at the group's time bound through sending e s to itself, if the sampling is not scheduled. The sampling reservation sets the b s as true.
When it receives e s , the ALC derives the e -group observations and adds them to the current sample list. After the insertion at the low state, if the previous and current samples are full (that is, the number of observed e groups in the current sample reaches a user-defined threshold), the ALC performs the KS convergence tests using the 1/λ and t w CDFs of the two samples.
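A two-sample KS comparison of this kind can be sketched as follows. The critical distance uses the standard large-sample approximation c(α)·sqrt((n+m)/(n·m)) with c(α) = sqrt(−ln(α/2)/2); treating the paper's p-value parameter as α in this formula is an assumption, and the function names are hypothetical.

```python
import math
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Maximum distance between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in set(a) | set(b):  # the ECDFs only change at observed values
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d


def ks_critical_distance(n, m, alpha):
    """Asymptotic two-sample KS critical distance for significance level alpha."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) * math.sqrt((n + m) / (n * m))


def is_converged(previous, current, alpha):
    """True if the two consecutive samples are statistically indistinguishable."""
    d_max = ks_critical_distance(len(previous), len(current), alpha)
    return ks_distance(previous, current) <= d_max
```

For two samples of 30 e groups each and α = 0.7, this approximation gives a critical distance of about 0.187, so a larger α makes the convergence test stricter.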
If a steady state is confirmed, the ALC initializes the s h for the mean-delay update and dummy-e generation and L a for the next divergence test. The s a is updated to high or ready depending on the states of the other ALCs in {m alc }. If all ALCs in {m alc } are ready or idle (t c − sup{t l,i } ≥ t idle ), s a is updated to high. The ALC sends e as to the other ALCs in {m alc } to synchronize the abstraction level and informs the MDM of {t d,i }. When the ALC's machine group is a lot-pulling bottleneck group, the ALC calls its η i to initialize the s fb .
For the e -path changes required to activate the high-level model, the ALC exports e pc so that the simulation engine revises the destination of e departing from previous machine groups. The detailed coupling-change mechanism is discussed in Sec. IV-D. For the next divergence test, the ALC maintains the current sample data but clears the previous sample data. If the convergence test fails, the ALC empties the current sample after copying it to the previous one.
When s a is ready, the ALC waits for all other ALCs in {m alc } to operate in steady or idle states. The ready ALC executes the 1/λ divergence test using the 1/λ of the current e group. If the 1/λ is diverged, the ALC sets s a to low.
When it receives e as , if s a is ready, then the ALC changes its abstraction level to high. However, if the ALC's s a is low, the ALC updates the s a to high so that the abstraction levels of the ALC and its machine-sharing ALCs are synchronized to high. If it receives e m , the ALC cleans up the two consecutive samples and clears the scheduled task and data of e -group sampling after setting the s a to low.

C. EVENT-HANDLING PROCEDURES AT HIGH ABSTRACTION LEVEL
At the high level, the ALC attempts to evaluate the WIP-level divergence conditions, which are (1) observing the 1/λ divergence from its steady-state distributions, (2) receiving a new-type e , (3) receiving e m , and (4) receiving e as . If one of the conditions is met, the ALC generates the dummy events of e to create a similar workload that corresponds to the group's steady state before the low-level transition.
At the current simulation time (t c ), when a divergence condition is detected, the number of i-th priority lots in the group ($n^{h}_{\ell,i}$) is predicted based on the elapsed time (t e,i ) between t c and sup{t l,i }, as follows.

$$n^{h}_{\ell,i} = \max\left(0,\ \hat{n}_{\ell,i} - t_{e,i} \cdot \hat{\lambda}_i\right)$$

In a steady state, to maintain $n_{\ell,i}$ at the observed $\hat{n}_{\ell,i}$ (in s h ), the departure rate should remain statistically similar to $\hat{\lambda}_i$, as shown in Fig. 8(a). We also expect that the 1/λ i values of incoming e ,i are similar to the $1/\hat{\lambda}_i$ in the steady state. When a divergence condition is detected at t c , if t e,i (that is, the newest 1/λ) is larger than $1/\hat{\lambda}_i$, then $n^{h}_{\ell,i}$ decreases from $\hat{n}_{\ell,i}$ by the number of times that $1/\hat{\lambda}_i$ has elapsed, as shown in Fig. 8(b). If t e,i reaches zero, $n^{h}_{\ell,i}$ becomes $\hat{n}_{\ell,i}$. After the high-level transition, an ALC's DE group model can still have waiting and processing lots. Based on the interval between t ac (in s h ) and t c , the approximate number of remaining wafer lots in the low-level machine-group model ($n^{l}_{\ell,i}$) can be represented as follows.

$$n^{l}_{\ell,i} = \max\left(0,\ \hat{n}_{\ell,i} - (t_c - t_{ac}) \cdot \hat{\lambda}_i\right)$$

Considering the lots left in the low-level model, the overall number of dummy i-type e ($n^{d}_{\ell,i}$) can be represented as $\lfloor n^{h}_{\ell,i} - n^{l}_{\ell,i} + 0.5 \rfloor$ (rounding to an integer), and the total number of dummy e is $\sum_i n^{d}_{\ell,i}$. Based on the proposed dummy-event generation method, the overall procedures for ψ h2l are described in Alg. 3.
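The dummy-lot count for one lot priority can be computed directly from the two expressions above. In this sketch the argument names are hypothetical, and the final clamp to zero is an added safeguard rather than part of the stated formula.

```python
import math


def dummy_lot_count(n_hat, lam_hat, t_c, t_last_arrival, t_ac):
    """Number of dummy lots to generate for one lot priority.

    n_hat:          steady-state number of lots observed at the high-level transition
    lam_hat:        steady-state arrival (and departure) rate
    t_c:            current simulation time
    t_last_arrival: latest lot-arrival time, so t_e = t_c - t_last_arrival
    t_ac:           time of the low-to-high abstraction-level change
    """
    t_e = t_c - t_last_arrival
    n_high = max(0.0, n_hat - t_e * lam_hat)          # lots expected in the group
    n_low = max(0.0, n_hat - (t_c - t_ac) * lam_hat)  # lots still in the low-level model
    return max(0, math.floor(n_high - n_low + 0.5))   # round to the nearest integer
```

For example, with a steady-state level of 6 lots, a rate of 0.125 lots per minute, t c = 100, a last arrival at 92, and a level change at 76, the group is expected to hold 5 lots while 3 remain in the low-level model, so 2 dummy lots are generated.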
When a low-level idle ALC receives e as from another ALC in {m alc }, the e as -received ALC changes its abstraction level to high to synchronize with other machine sharing ALCs (see Sec. IV-B). If the high-level idle group receives an e , the abstraction level of the group becomes low after relaying the e to the low-level machine-group model.
When a high-level active ALC receives an e , if the type of e is new, the ALC changes its level through the low-level conversion operations (Lines 8 to 13 in Alg. 3). Otherwise, the ALC saves e 's 1/λ value in the list L g to derive the current e group's 1/λ observations. Then, the ALC schedules the e s delivery to itself at the time bound of the current e group for 1/λ sampling and a divergence test, and relays the arrived e to its MDM.
When the ALC's machine group is a lot-pulling bottleneck group, the ALC executes the user-custom η fb function to control the lot release. During the η fb execution, the ALC references and updates the current working amount (s h ) based on the input e ,i and can schedule the e fb event for the delivery to inventory models.
When the ALC receives an e s , the ALC updates the L s (see Sec. III-B) and current sample using the current e group's 1/λ observations, then executes the proposed divergence test.
When the ALC confirms the 1/λ divergence or receives an e m /e as , the ALC performs the low-level conversion operations. If there exists a future event of e s , then the ALC nullifies the event to be deactivated and clears the list of L g .

D. MEAN-DELAY MODEL AND DYNAMIC WAFER-LOT FLOW CHANGE
At the low-to-high level conversion, the ALC informs the MDM of {t d,i }. Instead of a detailed simulation of wafer-fabrication processing, the ALC relays input e events to the MDM. The MDM imitates the data of relayed e events so that they are indistinguishable from the output e events of the corresponding low-level model. Then, the MDM exports the e to the next group after manipulating the event's arrival time so that the next group will receive the event after the mean delay.
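The MDM behavior can be sketched in a few lines. The event fields and class names below are hypothetical, and the mean delays are assumed to be indexed by lot type.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LotEvent:
    lot_id: str
    lot_type: int
    time: float  # time at which the destination model receives the event


class MeanDelayModel:
    """Replaces detailed group processing with a per-type mean delay."""

    def __init__(self, mean_delays):
        self.mean_delays = dict(mean_delays)  # lot type -> mean delay t_d,i

    def update(self, mean_delays):
        """Called by the ALC at each low-to-high level conversion."""
        self.mean_delays = dict(mean_delays)

    def relay(self, event):
        """Return the output event, delayed by the mean delay of its lot type."""
        return replace(event, time=event.time + self.mean_delays[event.lot_type])
```

A lot of type 0 arriving at time 100 with a mean delay of 30 leaves the MDM as an otherwise identical event scheduled for time 130.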
In Sec. II-C, we assume that the target DE simulator supports connection (or coupling) between the model's output and input ports. When a model exports an event through an output, the event is scheduled in the engine's event-list and waits for its turn to be processed, as shown in Fig. 9. The simulation engine delivers the first event in the event list to its destination models to influence the models. For the delivery, the engine characterizes the event's destinations based on the output-port terminals of the event's source model, and the output port's terminals are managed by a map consisting of terminal lists for each output port. From the implementation perspective, when an output-to-input coupling changes, the terminal lists of the output port must be changed accordingly.
In the proposed framework, we employ a coupling handler, an engine module that is loaded before the start of the simulation and can access the terminal lists of the event-source model during runtime. When an e pc is ready to be processed, the engine delivers the event to the coupling handler to achieve the coupling-change requirements of the e pc . The e pc has a higher priority than other types of events, so it is processed first even if other events exist in the event list.
The low-to-high level conversion of an ALC enables the e and e fb output of its low-level machine-group model to be disconnected. In contrast, the e destinations of the previous machine groups (that handle the last processing step of the current group) are switched from the low-level group model to the ALC. The destinations of ALC's e fb output are reset to inventory models.
To change the e - and e fb -paths at the high-level conversion, when the coupling handler receives an e pc event from an ALC, the handler updates the terminal lists of related models' e output ports by
• removing the ALC's low-level machine-group model from the terminal lists of the previous groups' e outputs,
• adding the ALC to the terminal lists of the previous groups' e outputs,
• removing the ALC from the terminal list of its current group's e output,
• removing the low-level machine-group model from the terminal lists of the previous groups' e outputs, and
• emptying the terminal list of the low-level model's e fb output.
For the low-level conversion, when the coupling handler receives an e pc event, the handler restores the terminal lists of related models' e or e fb outputs to the lists that were in place before the high-level transition.
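A coupling handler of this kind can be sketched as a map from (model, output port) to a terminal list, with a snapshot that is restored at the low-level conversion. The class, method, and port names ("e", "e_fb") are hypothetical, and only the main redirections are modeled.

```python
class CouplingHandler:
    """Rewrites output-port terminal lists when an ALC changes its level."""

    def __init__(self, terminals):
        # terminals: {(model_name, port_name): [destination model names]}
        self.terminals = terminals
        self._saved = {}  # ALC name -> snapshot of the lists it changed

    def to_high(self, alc, low_model, prev_groups, inventories):
        keys = [(g, "e") for g in prev_groups]
        keys += [(low_model, "e"), (low_model, "e_fb"), (alc, "e_fb")]
        # Snapshot the affected terminal lists so they can be restored later.
        self._saved[alc] = {k: list(self.terminals.get(k, [])) for k in keys}
        for g in prev_groups:
            dests = [d for d in self.terminals[(g, "e")] if d != low_model]
            if alc not in dests:
                dests.append(alc)  # lots now flow to the ALC (and on to its MDM)
            self.terminals[(g, "e")] = dests
        self.terminals[(low_model, "e")] = []     # disconnect the low-level outputs
        self.terminals[(low_model, "e_fb")] = []
        self.terminals[(alc, "e_fb")] = list(inventories)

    def to_low(self, alc):
        # Restore the terminal lists recorded before the high-level transition.
        self.terminals.update(self._saved.pop(alc))
```

Keeping a per-ALC snapshot makes the low-level conversion a plain restore, independent of how many individual list changes the high-level conversion made.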

V. EXPERIMENTATION
We designed a test model (denoted as the case-1 model) to manifest the relationship between the WIP convergence and the KS-test results. The structure of the fab model is illustrated in Fig. 10. There are five machine groups, each having a wafer-lot queue, a machine, and an operator. The inventory model supplies wafer lots to the first machine group, EG_A, for the productions of two different products. Each product requires a different routing path: the order of EG_A, EG_B, EG_C, and EG_D, or the sequence of EG_A, EG_B, EG_C, and EG_E. Each group has a single machine. The machine of EG_C is a batch-processing unit representing a furnace and has a relatively long processing time. We applied two lot-release policies to the case-1 model: uniform job release and CONWIP.
Under the defined processing time of each machine, when supplying the lot at the uniform rate, a high-rate release from an inventory enables the EG_A and EG_C groups to operate in an unsteady state, in which their group queue lengths increase monotonically (see Fig. 11(a)). In contrast, the EG_B, EG_D, and EG_E groups operate in a steady state. For the KS-test-based convergence check, the number of each sample's probing e groups is defined as 30, and the time period of each e group is 20 minutes. The defined p-value for the d max decision is 0.7.
The increasing queue lengths of the EG_A and EG_C groups affect the waiting time (t w ), which causes higher D n,m values in the t w -convergence test results than those of the other, steady groups, as shown in Fig. 11(c). The other groups are approved as steady-state groups at an early simulation stage, and each model of the groups mainly operates at the high abstraction level. After this approval, the KS test of EG_B is no longer executed until a low-level transition occurs, caused by scheduled machine maintenance (for one hour). During the machine maintenance, due to a number of pending lots in EG_C's queue, EG_C supplies the lots to EG_D and EG_E in the same pattern as before, and EG_C's queue length is reduced. The 1/λ of each group's input e events follows a Poisson distribution having an average interval, so the D n,m values of the 1/λ-convergence-test results are relatively smaller than those of the t w -test results, as shown in Fig. 11(b) and (c).
When releasing wafer lots based on the CONWIP policy, the inventory model generates lots based on the required WIP level (defined as 12) of the bottleneck group of EG_C. During the simulation, the overall queue lengths vary within converged regions, as shown in Fig. 12(a). After ten simulation days, all machine group models except EG_C operate at the high level until the scheduled tool down occurs, as displayed in Fig. 12(b) and (c). The EG_C gathers sample observations after the first test failure (around 19 days), and its high-level model is active after 44 days.
We designed another case (denoted as the case-2 model) employing a majority of machines to evaluate the simulation speedup and accuracy change of the proposed approach, as shown in Fig. 13. The model has four types of machine groups for general process steps: deposition, patterning, etching, and chemical-mechanical polishing. The detailed subprocesses in each machine can differ depending on the lot's product recipe. The machines have different values of processing times according to the recipe and their stochastic properties. We referenced the timing parameters described in James's book [27]. There are five types of products, each requiring a different number of mask layers (such as 4, 8, 12, 16, or 20). As the required layer number increases, the number of times that the previous group must be revisited increases (e.g., deposition -> patterning -> etching -> deposition · · · ).
In these experiments, we consider different types of lot-release policies and dispatching rules, as follows.
The inventory models following the uniform lot-release policy produce blank wafer lots at defined periods with stochastic variations. In the given fab model, the first deposition machine group is the critical bottleneck group, so the first group monitors the WIP levels of each product for the CONWIP or the workload for the WR policy. During the CONWIP simulation, considering the re-entrance, the bottleneck group requests the lot release when a dispatched lot no longer visits the group. In the WR-policy simulation, the bottleneck group measures the current workload considering the remaining processing times at the future re-entrances to prevent the WIP-level divergence.
The machine-group queue model has specific comparator functions for target dispatching rules; the priority group queues contain sorted wafer lots using the comparator function corresponding to an employing dispatching rule. Depending on the recipe, the lot experiences a different processing time in the machines, and each product has a different due date.
To evaluate the accuracy of the proposed method, we defined the accuracy index as the average of each lot's cycle-time difference between low-level-only and multi-level simulations. A lot's cycle time is the difference between the generation time and the completion time of all steps of wafer processing. During the experiments, we utilized the same sampling parameters and p-value as in the case-1 model experiments. We measured the simulation execution times using an experimental machine with an Intel® Xeon® E5-1650 3.6 GHz CPU and 32 GB of memory.
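The two evaluation measures can be sketched as follows. Because the results are reported in percent, this sketch assumes that each lot's cycle-time difference is taken as an absolute value relative to the low-level-only cycle time; that normalization and the function names are assumptions.

```python
def accuracy_index(cycle_low, cycle_multi):
    """Mean relative cycle-time difference per lot, in percent.

    cycle_low / cycle_multi: dicts mapping lot id -> cycle time for the
    low-level-only and multi-level simulations, respectively.
    """
    lots = cycle_low.keys() & cycle_multi.keys()
    diffs = [abs(cycle_multi[k] - cycle_low[k]) / cycle_low[k] for k in lots]
    return 100.0 * sum(diffs) / len(diffs)


def speedup(exec_time_low, exec_time_multi):
    """Execution-time ratio of low-level-only to multi-level simulation."""
    return exec_time_low / exec_time_multi
```

A speedup below 1 indicates that the multi-level simulation ran slower than the low-level-only simulation.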
The overall simulation results of the uniform release policy at various injection rates are displayed in Fig. 14. The simulation results include the execution time, the high-abstraction-level duration ratio of groups (over simulation time), the ALC execution time at the low and high abstraction levels, the number of abstraction-level changes, the number of generated dummy e events, and the overall speedup of the multi-level simulation (compared to the low-level-only simulation). The main overhead of the ALC execution at the low level is the lot sampling and convergence test. The high-level ALC computation overhead is the sum of processing wafer lots based on observed mean delays, the divergence test, and additional lot-pulling executions for the CONWIP and WR policies. Fig. 14(a) shows the simulation results of the uniform lot release at different rates, which produces five wafer lots with exponential distributions having average intervals of 150, 200, 250, and 300 minutes. As the release rate decreases (or the lot-release interval increases), the WIP levels of all machine groups are more likely to vary within converged regions. This helps the machine groups to be confirmed as steady-state groups more easily. The easier steady-state confirmation increases the high-level duration ratio and the execution time of the high-level ALC. Conversely, the execution time of the low-level ALCs decreases. The overhead of the low-level ALC's sampling and test tasks does not significantly degrade the simulation performance because it is small (ranging from 7.5 to 32 milliseconds). When the five-lot release intervals are over 200 for the 4000/5000/6000-lot productions, the overall speedups range from 2.47 to 3.04.
When the uniform lot-release rate is 5/150 (in lots per minute [1/min]) for the 5000- and 6000-lot productions, the proposed multi-level simulation can be slower than the low-level-only simulation because of an increased overhead of priority-queue models in highly congested groups.
Even though a particular group's WIP increases or decreases macroscopically, a local WIP convergence can make the group operate at the high level for a short time. When switching back to the low level, a low-level queue model must execute comparisons between newly received dummy e events and the remaining lots (which can number more than 500 at the 5/150 rate). This additional queue overhead makes the mixed-level simulation perform worse than the low-level-only simulation. However, such high-injection cases are rare because the machine-group queues of general fabs are managed so that they do not diverge, so the corresponding fab models do not encounter extremely large queues.
At the high-release rates from 5/130 to 5/190, the simulation results according to the p-value change are illustrated in Fig. 14(b). The smaller p-value enables machine groups to be confirmed as steady-state groups more easily. Thus, the overall speedup increases, while the accuracy becomes degraded. The larger p-value makes it difficult for unsteady machine groups to operate at the high level. Still, it quickly makes steady-state groups run at the low abstraction level when a small amount of 1/λ changes occur because of the stochastic property of the target DE model.
The simulation results for various lot-release and dispatching policies are displayed in Fig. 15. The overall speedups vary between 1.76 and 4.08, with an accuracy expense from 5% to 0.6%. Compared to other lot-release policies, the inventory models following the CONWIP policy generate lots in more irregular patterns having larger variation, which results in lower ratios of high-level duration and speedups. In the CONWIP simulation, parts of adaptive abstraction-change examples of the first machine group (EG_DEP) under the irregular lot release are illustrated in Fig. 16. The intervals between the last convergence tests (of d max of t w and 1/λ) and dummy-e generations are the durations of the high abstraction level.

VI. CONCLUSION
To speed up a discrete-event wafer-fab model, we proposed a two-level modeling and simulation approach that adjusts the abstraction level of machine groups during runtime. The proposed method attempts to simulate the wafer-lot processing in a steady-state machine group based on observed mean-delay information (as a high-level operation) instead of the detailed computation of the low-level DE group model. To detect a steady state of a machine group, the ALC collects the queuing-model parameter values from its low-level machine group's outgoing wafer-lot events. When the number of parameter samples reaches a defined count, the ALC checks whether the arrival-rate and waiting-time distributions are statistically indistinguishable. If the queuing-parameter distributions of consecutive samples have converged, then the high-level model handles input wafer lots based on the observed mean spent time. At the high level, the ALC checks whether the inter-arrival time of input wafer-lot events deviates from the observed distribution and monitors the arrival of an operation-status-change event (such as a machine-down). If a WIP divergence condition is satisfied, the ALC reactivates the low-level model after generating the dummy events to ensure workload consistency between the low-level and high-level models. The proposed approach shows a speedup of up to 4.08 times when the fab is managed to be stable, with 0.6 to 8.3% accuracy reduction under the given scenarios.