Coordination and Fusion in Multimodal Interaction
November 2001
Mark T. Maybury, The MITRE Corporation
ABSTRACT
When we converse with one another, we utilize an array of media to interact, including spoken language, gestures, and drawings. We exploit multiple sensory systems or modalities of communication including vision, audition, and taction. Providing machines with the ability to interpret multimedia input and generate coordinated multimedia output promises benefits including:
- More efficient interaction: enabling faster task completion with less work.
- More effective interaction: doing the right thing at the right time, tailoring the content and form of interaction to the context of the user, task, and dialogue.
- More natural interaction: supporting fused spoken, written, and gestural interaction, as found in human-human communication.
Our research has focused on intelligent systems that exploit multiple media and modes.

Additional Search Keywords
coordination, fusion, multimedia input analysis, multimedia output generation