Vision-language model-driven scene understanding and robotic object manipulation

IEEE Conference Publication | IEEE Xplore