Intervention in gene regulatory networks in the context of Markov decision processes has usually involved finding an optimal one-transition policy, in which a decision whether or not to apply treatment is made at every transition. In an effort to model dosing constraints, a cyclic approach to intervention has previously been proposed in which there is a sequence of treatment windows and treatment is allowed only at the beginning of each window. This protocol ignores two practical aspects of therapy. First, a treatment typically has some duration of action: a drug will be effective for some period, after which there can be a recovery phase. This, too, might involve a cyclic protocol. Second, in practice, a physician might monitor a patient at every stage and decide whether to apply treatment; if treatment is applied, the patient will be under the influence of the drug for some duration, followed by a recovery period. This results in an acyclic protocol. In this paper we take a unified approach to both cyclic and acyclic control with duration of effectiveness by placing the problem in the general framework of multiperiod decision epochs with infinite-horizon discounted cost. The time interval between successive decision epochs can span multiple time units: given the current state and the action taken, a joint probability distribution is defined for the next state and the time at which the next decision epoch will be called. Optimal control policies are derived, synthetic networks are used to investigate the properties of both cyclic and acyclic interventions with a fixed duration of effectiveness, and the methodology is applied to a mutated mammalian cell-cycle network.
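To make the multiperiod decision-epoch idea concrete, the following is a minimal sketch of value iteration for a discounted-cost problem in which the action determines both the next state and the number of time units until the next decision epoch. It is not the paper's algorithm or network model: the two-state toy system, the cost values, the discount factor, and all names (`value_iteration`, `joint`, `cost`) are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# All names and numbers below are hypothetical, chosen only for illustration.
# joint[(s, a)] lists (next_state, tau, prob): with probability prob, the next
# decision epoch is called tau time units later, with the network in next_state.
Joint = Dict[Tuple[int, int], List[Tuple[int, int, float]]]
# cost[(s, a)] is the expected discounted cost accumulated until the next epoch.
Cost = Dict[Tuple[int, int], float]


def value_iteration(n_states: int, n_actions: int, joint: Joint, cost: Cost,
                    gamma: float = 0.9, tol: float = 1e-10,
                    max_iter: int = 10_000) -> Tuple[List[float], List[int]]:
    """Minimize expected discounted cost when epochs span several time units."""
    v = [0.0] * n_states
    policy = [0] * n_states
    for _ in range(max_iter):
        new_v = []
        for s in range(n_states):
            # Future cost is discounted by gamma**tau, the length of the epoch.
            q_values = [
                cost[(s, a)]
                + sum(p * gamma ** tau * v[s2] for s2, tau, p in joint[(s, a)])
                for a in range(n_actions)
            ]
            best_a = min(range(n_actions), key=q_values.__getitem__)
            policy[s] = best_a
            new_v.append(q_values[best_a])
        delta = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if delta < tol:
            break
    return v, policy


# Toy system: state 0 is desirable, state 1 is undesirable.
# Action 0 (no treatment) lasts 1 time unit; action 1 (treatment) occupies
# 3 time units, standing in for drug effectiveness plus recovery.
joint = {
    (0, 0): [(0, 1, 0.9), (1, 1, 0.1)],
    (1, 0): [(1, 1, 1.0)],
    (0, 1): [(0, 3, 1.0)],
    (1, 1): [(0, 3, 1.0)],
}
cost = {(0, 0): 0.0, (1, 0): 1.0, (0, 1): 0.5, (1, 1): 1.5}

v, policy = value_iteration(2, 2, joint, cost)
print(policy, v)
```

With these made-up numbers the resulting policy withholds treatment in the desirable state and treats in the undesirable one. Because the decision is re-evaluated only when an epoch ends, the sketch corresponds to the acyclic setting; a cyclic protocol could be represented in the same data structure by fixing `tau` to the window length for every action.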