Fine-Tuning Multimodal Transformer Models for Generating Actions in Virtual and Real Environments | IEEE Journals & Magazine | IEEE Xplore