DETECLAP: Enhancing Audio-Visual Representation Learning with Object Information | IEEE Conference Publication | IEEE Xplore