This project focuses on the joint interpretation of information from multiple input modalities in the context of real-world applications.
Human-human communication typically involves many signals and cues, such as speech, gestures, writing, and facial orientation and expression. These input sources provide complementary or redundant information, which makes this style of interaction flexible, expressive, and robust. We aim to realize a similar style of interaction between humans and computers.


We have developed a multimodal interface for an appointment scheduling task on a computerized calendar. The user can interact with the system through any combination of speech, pen gestures on a touch-sensitive screen, and handwritten words. In a typical scenario, the user might say "Schedule a meeting on Monday" while drawing a line on the calendar to indicate when the meeting should start and how long it should last. The user might also write words on the newly scheduled meeting to annotate it, draw a cross on another meeting to cancel it, or point to a meeting and say "Reschedule this on Tuesday."
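To make the idea of combining complementary and redundant inputs concrete, here is a minimal sketch of one possible approach: each modality produces a partial "frame" of slots, and the frames are merged into a single calendar command. This is an illustration only, not the project's interpreter engine. The `Frame` class, the `merge` function, and all slot names are hypothetical, and the step that turns raw speech or pen strokes into frames is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    """Partial interpretation from one input modality (hypothetical structure)."""
    action: Optional[str] = None        # e.g. "schedule", "cancel", "reschedule"
    day: Optional[str] = None           # e.g. "Monday"
    start: Optional[str] = None         # start time as "HH:MM"
    duration_min: Optional[int] = None  # meeting length in minutes
    target: Optional[str] = None        # identifier of an existing meeting


def merge(*frames: Frame) -> Frame:
    """Combine partial frames slot by slot.

    Complementary slots are taken from whichever modality supplies them.
    Redundant slots must agree; a disagreement is reported as a conflict.
    """
    merged = {}
    for frame in frames:
        for slot, value in vars(frame).items():
            if value is None:
                continue  # this modality says nothing about the slot
            if slot in merged and merged[slot] != value:
                raise ValueError(
                    f"conflicting values for {slot!r}: "
                    f"{merged[slot]!r} vs {value!r}"
                )
            merged[slot] = value
    return Frame(**merged)


# Assumed outputs of the (not shown) speech and pen recognizers:
# speech gives the action and the day, the drawn line gives day, start, length.
speech = Frame(action="schedule", day="Monday")
pen = Frame(day="Monday", start="14:00", duration_min=60)

command = merge(speech, pen)
print(command)
```

In this sketch the day is supplied by both modalities (redundant), while the action comes only from speech and the start time and duration come only from the pen gesture (complementary). A real system would also need to handle timing, recognition uncertainty, and references such as "this", which are left out here.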




We are currently working on improving this multimodal calendar and investigating the application of our multimodal interpreter engine in other domains.
Site maintained by: Céline Morel
Last modification: Jan/03/00