Audio-Visual Grounding Referring Expression for Robotic Manipulation | IEEE Conference Publication | IEEE Xplore