Calibrating Multi-modal Representations: A Pursuit of Group Robustness without Annotations | IEEE Conference Publication | IEEE Xplore