We propose a novel scheme for delay-optimal scheduling in multi-user multi-relay cellular wireless networks. The cell area is divided into several sectors, each serviced by an individual relay station (RS). To allow simultaneous transmissions by users in neighbouring sectors, we assume that the users of each sector use a separate set of orthogonal channels to communicate with the RS and the base station (BS). Moreover, a separate orthogonal channel is shared among the relays for transmission to the BS. For uplink communication, each user chooses between two modes of transmission, namely direct transmission and relayed transmission, through a simple transmission mode selection algorithm. In the first phase of transmission (from the users to the BS and the RSs), users are allocated fractions of the time-slot in a time-division multiple access (TDMA) fashion. In the second phase of transmission (from the RSs to the BS), each RS is allocated a fraction of the time-slot. We model the problem of end-to-end (e2e) delay-optimal scheduling as an infinite-horizon average-reward Markov decision process (MDP) for users and relays in two separate stages. An online learning approach is then employed to solve the problem in a distributed manner for both users and relays in each phase of transmission. The proposed online stochastic learning solution converges to the optimal solution almost surely (with probability 1) under some realistic conditions. Simulation results show that the proposed approach outperforms conventional scheduling schemes.
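
The abstract does not spell out the learning rule, so the following is only a minimal sketch of how online learning for an average-cost MDP can look, not the scheme proposed here. It uses standard relative value iteration (RVI) Q-learning on a toy single-user queue. Everything in it is an assumption made for illustration: the buffer size, the arrival probability, the two hypothetical modes (0 for direct, 1 for relayed) with made-up success probabilities, the epsilon-greedy exploration, and the use of queue length as a proxy for delay.

```python
import random


def rvi_q_update(q, state, action, cost, next_state, step, ref=(0, 0)):
    """One RVI Q-learning step for an average-cost MDP.

    The value of a fixed reference pair `ref` is subtracted from the target,
    which keeps the Q-values bounded without a discount factor.
    """
    target = cost + min(q[next_state]) - q[ref[0]][ref[1]]
    q[state][action] += step * (target - q[state][action])
    return q[state][action]


def simulate(num_slots=5000, buffer_size=5, arrival_prob=0.3,
             success_prob=(0.4, 0.7), explore=0.1, seed=0):
    """Learn Q-values for a toy uplink queue (all parameters hypothetical)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(buffer_size + 1)]
    visits = [[0, 0] for _ in range(buffer_size + 1)]
    state = 0  # current queue length
    for _ in range(num_slots):
        # Epsilon-greedy choice between mode 0 (direct) and mode 1 (relayed).
        if rng.random() < explore:
            action = rng.randrange(2)
        else:
            action = 0 if q[state][0] <= q[state][1] else 1
        # At most one packet departs and at most one arrives per slot.
        served = 1 if state > 0 and rng.random() < success_prob[action] else 0
        arrived = 1 if rng.random() < arrival_prob else 0
        next_state = min(buffer_size, state - served + arrived)
        # Per-slot cost is the queue length; step size decays with visit count.
        visits[state][action] += 1
        rvi_q_update(q, state, action, float(state), next_state,
                     1.0 / visits[state][action])
        state = next_state
    return q
```

Calling `simulate()` returns one row of Q-values per queue length, and the mode with the smaller value in a row is the greedy choice for that state. In this toy model the update needs only the locally observed queue length, which illustrates one way a learning rule can run in a distributed manner.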