In this paper, a real-time system that uses hand gestures to interactively control presentations is proposed. The system employs a thermal camera for robust human body segmentation, handling the complex background and varying illumination caused by the projector. A fast and robust hand localization algorithm is proposed, with which the head, torso, and arm are sequentially localized. Hand trajectories are segmented and recognized as gestures that drive the interactions. A dual-step calibration algorithm maps interaction regions between the thermal camera and the projected content by integrating a Web camera. Experiments show that the system achieves a high recognition rate for hand gestures and performs the corresponding interactions correctly.
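The abstract does not detail the dual-step calibration, but a common way to realize such a two-stage mapping is to compose two planar homographies: one from the thermal camera frame to the Web camera frame, and one from the Web camera frame to the projected-content coordinates. The sketch below is illustrative only; the point correspondences, resolutions, and the DLT-based estimator are assumptions, not the authors' actual method.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 planar homography from >=4 point pairs via DLT."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)          # null-space vector = homography entries
    return H / H[2, 2]                 # normalize so H[2,2] == 1

def apply_homography(H, pt):
    """Map a 2D point through homography H (homogeneous divide)."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)

# Step 1: thermal camera -> Web camera. Correspondences would come from a
# calibration target visible to both sensors; these values are made up.
thermal_pts = [(0, 0), (100, 0), (100, 80), (0, 80)]
web_pts     = [(10, 20), (210, 25), (205, 185), (12, 180)]
H_tw = estimate_homography(thermal_pts, web_pts)

# Step 2: Web camera -> projected content. The projected slide's corners,
# as seen by the Web camera, map to slide pixel coordinates (1024x768 assumed).
web_corners   = [(10, 20), (210, 25), (205, 185), (12, 180)]
slide_corners = [(0, 0), (1024, 0), (1024, 768), (0, 768)]
H_wp = estimate_homography(web_corners, slide_corners)

# Composed mapping: thermal camera coordinates -> slide coordinates.
H_tp = H_wp @ H_tw
```

With `H_tp` in hand, a hand position detected in the thermal image can be mapped directly onto the slide, so a gesture performed over a projected button lands on the correct interaction region.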