Dynamic programming (DP) is a principled way to design optimal controllers for certain classes of nonlinear systems; unfortunately, DP is computationally very expensive. The reinforcement learning methods known as adaptive critics (AC) provide computationally feasible means for performing approximate dynamic programming (ADP). The term 'adaptive' in AC refers to the critic's improved estimates of the value function used by DP. To apply DP, the user must craft a Utility function that embodies all the problem-specific design specifications/criteria. Model reference adaptive control methods have been successfully used in the control community to effect on-line redesign of a controller in response to variations in plant parameters, with the idea that the resulting closed-loop system dynamics will mimic those of a reference model. The present work (1) uses a reference model in ADP as the key information input to the Utility function, and (2) uses ADP off-line to design the desired controller. Future work will extend this to on-line application. This method is demonstrated for a hypersonic-shaped airplane called LoFLYTE®, whose handling characteristics are natively a little "hotter" than a pilot would desire. A control augmentation subsystem is designed using ADP to make the plane "feel like" a better-behaved one, as specified by a reference model. The number of inputs to the successfully designed controller is among the largest seen in the literature to date.
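To make the role of the reference model concrete, the following is a minimal sketch, not the authors' implementation, of how a Utility function built around model-following error might look in an ADP setting. The first-order reference dynamics, the weights q and r_w, and all function names here are illustrative assumptions.

```python
import numpy as np

# Hypothetical "well-behaved" reference model: x_ref' = A_ref x_ref + B_ref r
A_ref = np.array([[-2.0]])
B_ref = np.array([[2.0]])

def step_reference(x_ref, r, dt=0.01):
    """Advance the reference model one step (forward Euler)."""
    return x_ref + dt * (A_ref @ x_ref + B_ref @ np.atleast_1d(r))

def utility(x_plant, x_ref, u, q=1.0, r_w=0.01):
    """Reference-model-based Utility: penalize model-following error and control effort."""
    err = np.asarray(x_plant) - np.asarray(x_ref)
    u = np.atleast_1d(u)
    return float(q * err @ err + r_w * u @ u)

# Example: one step of comparing plant state against the reference model
x_ref = np.array([0.0])
x_plant = np.array([0.1])
x_ref = step_reference(x_ref, r=1.0)
print(utility(x_plant, x_ref, u=0.2))
```

In an adaptive-critic scheme, the critic would be trained to approximate the accumulated value of such a Utility along trajectories, and the controller (action network) would then be improved against the critic's estimate; the reference model supplies the target behavior that the Utility encodes.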