Spectrum Efficient Mode Selection and Resource Allocation Optimization for D2D Communication in HetNet: A Multi-Agent Q-Learning Approach | IEEE Journals & Magazine | IEEE Xplore