Improving Vision-and-Language Reasoning via Spatial Relations Modeling | IEEE Conference Publication | IEEE Xplore