Abstract

This paper provides an overview of SNAP&TELL, a multi-modal wearable computer system. The system performs real-time gesture tracking, combined with audio-based control commands, to recognize objects in the environment, including outdoor landmarks. A single camera captures images, which are then processed with color segmentation, fingertip shape analysis, robust tracking, and invariant object recognition to quickly identify the objects encircled and SNAPped by the user’s pointing gesture. The system then returns an audio narration, TELLing the user information about the object’s classification, historical facts, usage, etc. This system provides enabling technology for designing intelligent assistants that support “Web-On-The-World” applications, with potential uses such as travel assistance, business advertisement, the design of smart living and working spaces, and pervasive wireless services and Internet vehicles.
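
To make the processing chain described above concrete, the following is a minimal sketch of the SNAP-to-TELL pipeline. It is illustrative only, not the authors’ implementation: the function names, the crude RGB skin thresholds, the topmost-pixel fingertip heuristic, and the placeholder recognition stage are all assumptions introduced for illustration.

```python
# Hypothetical sketch of the SNAP&TELL pipeline stages named in the abstract:
# color segmentation -> fingertip analysis -> gesture tracking -> recognition -> narration.
import numpy as np

def skin_color_segmentation(frame_rgb, r_min=95, g_min=40, b_min=20):
    """Rough skin-color mask in RGB space (illustrative thresholds only)."""
    r, g, b = frame_rgb[..., 0], frame_rgb[..., 1], frame_rgb[..., 2]
    return (r > r_min) & (g > g_min) & (b > b_min) & (r > g) & (r > b)

def fingertip_from_mask(mask):
    """Take the topmost skin pixel as a crude fingertip estimate."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    i = np.argmin(ys)                    # topmost point of the hand region
    return int(xs[i]), int(ys[i])

def track_encircling_gesture(fingertip_trail, close_dist=20):
    """Return the bounding box of the trail once it roughly closes on itself."""
    pts = np.asarray(fingertip_trail)
    if len(pts) < 10 or np.linalg.norm(pts[0] - pts[-1]) > close_dist:
        return None                      # encircling gesture not yet completed
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    return x0, y0, x1, y1

def recognize_object(snapshot):
    """Placeholder for the invariant object-recognition stage."""
    return "unknown landmark"            # a real system matches invariant features

def snap_and_tell(frames):
    """Process a stream of RGB frames; return a narration string once an object is SNAPped."""
    trail = []
    for frame in frames:
        mask = skin_color_segmentation(frame)
        tip = fingertip_from_mask(mask)
        if tip is not None:
            trail.append(tip)
        box = track_encircling_gesture(trail)
        if box is not None:
            x0, y0, x1, y1 = box
            label = recognize_object(frame[y0:y1, x0:x1])
            return f"This appears to be: {label}"   # TELL step (spoken audio in the real system)
    return None
```

In the actual system the narration step would be rendered as synthesized audio and the recognition stage would use invariant features robust to viewpoint and lighting; the sketch only shows how the stages chain together per frame.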
