Max-AST: Combining Convolution, Local and Global Self-Attentions for Audio Event Classification | IEEE Conference Publication | IEEE Xplore