I. Introduction
End-to-end learning has attracted considerable attention in recent years and is considered a promising technology for future wireless communication systems [1], [2]. Its key idea is to implement the transmitter, channel, and receiver as a single neural network (NN), referred to as an autoencoder, that is trained to achieve the highest possible information rate [3], [4]. Since its first application to wireless communications [5], end-to-end learning has been extended to other fields, including optical wireless [6] and optical fiber [7]. However, most of the literature is either based on simulations with simple channel models, such as additive white Gaussian noise (AWGN) or Rayleigh block fading (RBF), or experimental but performed in static environments [3], [8]. Such setups do not account for the Doppler and delay spread encountered in practical wireless systems, which cause the channel response to vary in both time and frequency. The evaluation of end-to-end learning on more realistic channel models has been largely overlooked in the existing literature, yet it is critical to bringing the technology from theory to practice.