This paper presents initial results on applying a simulation-based Approximate Dynamic Programming (ADP) approach to the optimization of Preventive Maintenance (PM) scheduling decisions in semiconductor manufacturing systems, using the Intel Mini-Fab benchmark as an illustrative example. Our approach is based on an actor-critic architecture in which the critic is a parametric estimate of the optimal differential cost function for an infinite-horizon optimization model under the average cost criterion. The actor is defined using post-decision state variables and a heuristic approach. The critic is a linear parametric structure whose parameters are tuned by a temporal-difference learning algorithm with gradient descent. Simulation experiments validated the applicability of our algorithm to the Intel Mini-Fab, showing a significant reduction in average cycle time compared with a series of fixed baseline PM schedules.
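To make the critic update concrete, the following is a minimal sketch of one temporal-difference step for a linear approximation of the differential cost under the average cost criterion. It is an illustration under stated assumptions, not the paper's implementation: the function names, the step sizes, the feature vectors, and the running-average estimate of the average cost are all hypothetical. The greedy rule over post-decision states stands in for the heuristic actor, whose details the abstract does not give.

```python
import numpy as np


def td_average_cost_update(theta, avg_cost, phi_s, phi_next, cost,
                           step=0.05, avg_step=0.01):
    """One TD(0) step for a linear critic h(s) ~ phi(s) @ theta.

    All names and step sizes here are illustrative assumptions.
    """
    # Temporal difference for the average cost criterion:
    # observed cost minus the average cost estimate, plus the change
    # in the approximate differential cost between successive states.
    delta = cost - avg_cost + phi_next @ theta - phi_s @ theta
    # Gradient step: for a linear critic the gradient w.r.t. theta is phi_s.
    theta_new = theta + step * delta * phi_s
    # Running estimate of the long-run average cost (assumed form).
    avg_cost_new = avg_cost + avg_step * (cost - avg_cost)
    return theta_new, avg_cost_new, delta


def greedy_action(immediate_costs, post_features, theta):
    """Pick the action whose post-decision state looks cheapest.

    A hypothetical stand-in for the paper's heuristic actor.
    """
    values = np.asarray(immediate_costs) + np.asarray(post_features) @ theta
    return int(np.argmin(values))


# Toy usage with two features and two candidate PM decisions.
theta = np.zeros(2)
theta, avg_cost, delta = td_average_cost_update(
    theta, 0.0,
    phi_s=np.array([1.0, 0.0]),
    phi_next=np.array([0.0, 1.0]),
    cost=2.0, step=0.5, avg_step=0.1,
)
action = greedy_action([1.0, 0.5], [[1.0, 0.0], [0.0, 1.0]], theta)
```

In a simulation-based setting, an update of this kind would be applied once per simulated transition, with the cost signal chosen to reflect the performance measure of interest (here, cycle time).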