Perception Tokens Enhance Visual Reasoning in Multimodal Language Models | IEEE Conference Publication | IEEE Xplore