In this paper, we study how to optimize the transmission decisions of nodes to support mission-critical applications such as surveillance, security monitoring, and military operations. We focus on a scenario where multiple source nodes simultaneously transmit mission-critical data through relay nodes to one or more destinations in a multi-hop wireless Mission-Critical Network (MCN). In such a network, the wireless nodes can be modeled as agents that acquire local information from their neighbors and, based on this information, make timely transmission decisions to minimize the end-to-end delays of the mission-critical applications. Importantly, in practice the MCN must cope with time-varying network dynamics. Hence, the agents need to make transmission decisions by considering not only the current network status, but also how that status evolves over time and how it is influenced by the actions the nodes take. We formulate the agents' autonomic decision-making problem as a Markov decision process (MDP) and construct a distributed MDP framework that accounts for the informationally-decentralized nature of the multi-hop MCN. We further propose an online model-based reinforcement learning approach that allows agents to solve the distributed MDP at runtime by modeling the network dynamics with priority queuing. We compare the proposed model-based approach against model-free reinforcement learning approaches in the MCN. The results show that the proposed model-based reinforcement learning approach not only outperforms myopic approaches without learning capability, but also outperforms conventional model-free reinforcement learning approaches.
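To make the model-based idea concrete, the sketch below shows a generic tabular model-based agent: it estimates transition probabilities and expected costs from observed samples, then plans with value iteration over the learned model and picks the action with the lowest expected cost. This is a minimal illustrative sketch under simplifying assumptions (a small discrete state space, cost = delay), not the paper's distributed-MDP algorithm or its priority-queuing model; all names here are hypothetical.

```python
from collections import defaultdict

class ModelBasedAgent:
    """Illustrative tabular model-based RL agent: learn an empirical
    transition/cost model online, then plan with value iteration.
    States and actions are assumed to be small discrete sets."""

    def __init__(self, states, actions, gamma=0.95):
        self.states, self.actions, self.gamma = states, actions, gamma
        self.counts = defaultdict(lambda: defaultdict(int))  # (s, a) -> {s': visit count}
        self.cost_sum = defaultdict(float)                   # (s, a) -> accumulated cost
        self.V = {s: 0.0 for s in states}                    # cost-to-go estimates

    def update_model(self, s, a, cost, s_next):
        """Record one observed transition (s, a) -> s_next with its cost."""
        self.counts[(s, a)][s_next] += 1
        self.cost_sum[(s, a)] += cost

    def _q(self, s, a):
        """Expected discounted cost of taking a in s under the learned model."""
        n = sum(self.counts[(s, a)].values())
        if n == 0:
            return 0.0  # optimistic default for unexplored pairs
        avg_cost = self.cost_sum[(s, a)] / n
        exp_next = sum((cnt / n) * self.V[s2]
                       for s2, cnt in self.counts[(s, a)].items())
        return avg_cost + self.gamma * exp_next

    def plan(self, sweeps=50):
        """Value iteration over the learned model (cost minimization)."""
        for _ in range(sweeps):
            for s in self.states:
                self.V[s] = min(self._q(s, a) for a in self.actions)

    def act(self, s):
        """Greedy action with respect to the current cost-to-go estimates."""
        return min(self.actions, key=lambda a: self._q(s, a))
```

As a toy usage, take the state to be a relay's queue occupancy (0 or 1): after observing that "send" empties the queue at unit cost while "hold" keeps it occupied at higher cost, planning yields a policy that sends whenever the queue is non-empty. The contrast with model-free methods is that the agent replans from its learned model rather than updating action values only from individual samples.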