In this paper, we study efficient rate control schemes, based on reinforcement learning, for delay-sensitive communications over wireless fading channels. Our objective is to find a rate control scheme that optimizes link layer performance; specifically, one that maximizes the system throughput subject to a fixed bit error rate (BER) constraint and a long-term average power constraint. We assume that the buffer at the transmitter is finite; hence packets are dropped when the buffer is full. We assume that the fading channel under study can be modeled as a finite-state Markov chain. However, the transition probabilities of the channel states are not known, and the only information available about the wireless channel is the instantaneous channel gain, which is estimated at the receiver and fed back to the transmitter on the fly. We use a reinforcement learning approach to learn the time-varying channel environment and to search for the optimal control policy online. Simulation results show that, starting from an arbitrary control policy, the learning agent gradually refines its estimate of the system model and adjusts the control policy toward optimality.
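As an illustration of the model-learning step only, the following minimal sketch shows one way an agent could estimate the unknown transition probabilities of a finite-state Markov channel from the sequence of observed (fed-back) channel states. It is not the scheme evaluated in this paper: the three-state channel `TRUE_P`, the helper names `simulate_channel` and `estimate_transitions`, and the add-one smoothing are all assumptions made for the example.

```python
import random

def simulate_channel(P, n_steps, start=0, seed=0):
    """Sample a state sequence from a finite-state Markov chain with transition matrix P."""
    rng = random.Random(seed)
    states = [start]
    for _ in range(n_steps):
        row = P[states[-1]]
        # Draw the next channel state according to the current state's row of P.
        states.append(rng.choices(range(len(row)), weights=row)[0])
    return states

def estimate_transitions(states, n_states):
    """Count-based estimate of the transition matrix from an observed state sequence."""
    # Start every count at 1 (add-one smoothing, an assumption of this sketch)
    # so that rarely visited states still yield a valid probability row.
    counts = [[1.0] * n_states for _ in range(n_states)]
    for s, s_next in zip(states, states[1:]):
        counts[s][s_next] += 1.0
    return [[c / sum(row) for c in row] for row in counts]

# Hypothetical 3-state channel (bad, medium, good); values chosen for illustration.
TRUE_P = [
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]

observed = simulate_channel(TRUE_P, n_steps=20000)
P_hat = estimate_transitions(observed, n_states=3)

for row in P_hat:
    print([round(p, 3) for p in row])
```

With more observations the estimated rows approach those of the true matrix, which is the sense in which a learning agent can refine its model of the channel before, or while, updating its control policy.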