This paper gives an overview of SNAP&TELL, a multi-modal wearable computer system that combines real-time gesture tracking with audio-based system control commands to recognize objects in the environment, including outdoor landmarks. The system uses a single camera to capture images, which are then processed by several algorithms that perform color-based segmentation, fingertip shape analysis, robust tracking, and invariant object recognition in order to quickly identify the object encircled (SNAPshot) by the user's pointing gesture. In turn, the system returns an audio narration that tells the user about the object's classification, historical facts, usage, etc. This system provides enabling technology for the design of intelligent assistants to support "Web-On-The-World" applications, with potential uses such as travel assistance, business advertisement, the design of smart living and working spaces, and pervasive wireless services and internet vehicles.
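To make the processing chain above concrete, the following is a minimal illustrative sketch of three of its stages: color-based segmentation, fingertip localization, and extraction of the encircled region. It is not the authors' implementation. The function names, the RGB skin-tone rule, the "topmost pixel is the fingertip" heuristic, and the bounding-box crop are all assumptions chosen for brevity; the abstract does not specify the algorithms used, and the recognition and audio stages are omitted.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255
Point = Tuple[int, int]       # (x, y) in image coordinates, y grows downward


def segment_by_color(image: List[List[Pixel]]) -> List[List[bool]]:
    """Mark pixels that pass a crude, hypothetical RGB skin-tone rule."""
    def is_skin(p: Pixel) -> bool:
        r, g, b = p
        return (r > 95 and g > 40 and b > 20
                and r > g and r > b and (r - min(g, b)) > 15)
    return [[is_skin(p) for p in row] for row in image]


def find_fingertip(mask: List[List[bool]]) -> Optional[Point]:
    """Return the topmost segmented pixel (assumes an upward-pointing finger)."""
    for y, row in enumerate(mask):
        for x, on in enumerate(row):
            if on:
                return (x, y)
    return None  # no hand pixels found in this frame


def encircled_region(path: List[Point]) -> Tuple[int, int, int, int]:
    """Bounding box (x_min, y_min, x_max, y_max) of a tracked fingertip path."""
    if not path:
        raise ValueError("fingertip path is empty")
    xs = [x for x, _ in path]
    ys = [y for _, y in path]
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    background: Pixel = (0, 0, 0)
    skin: Pixel = (200, 120, 100)
    # 4x4 toy frame with a vertical "finger" in column 2, rows 1..3
    frame = [[skin if (x == 2 and y >= 1) else background for x in range(4)]
             for y in range(4)]
    print(find_fingertip(segment_by_color(frame)))
    # Fingertip positions collected over several frames while circling an object
    print(encircled_region([(3, 4), (10, 2), (12, 9), (5, 11)]))
```

In a full system along the lines described, the fingertip position would be tracked from frame to frame, and the image patch inside the resulting region would be passed to the object recognizer, whose result selects the audio narration.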