We formalize the representation of gestures and present a model that is capable of synchronizing expressive and relevant gestures with text-to-speech input. A gesture consists of gesture primitives that are executed simultaneously. We formally define the gesture primitive and introduce the concept of a spatially targeted gesture primitive, i.e., a gesture primitive that is directed at a target of interest. The spatially targeted gesture primitive is useful for situations where the direction of the gesture is important for meaningful human-robot interaction. We contribute an algorithm to determine how a spatially targeted gesture primitive is generated. We also contribute a process to analyze the input text, determine relevant gesture primitives from the input text, compose gestures from gesture primitives and rank the combinations of gestures. We propose a set of criteria that weights and ranks the combinations of gestures. Although we illustrate the utility of our model, algorithm and process using a NAO humanoid robot, our contributions are applicable to other robots.