Automatic speech recognition has attracted considerable interest and funding over the past two decades, mainly because of the wide range of applications envisioned for this technology. Intensive research in recent years has significantly improved recognition performance, triggering the development of many research spoken dialogue systems.
However, it is widely believed that recognition performance will remain limited, at least for the foreseeable future. Building usable speech user interfaces on unreliable technology therefore requires ways to recover gracefully from recognition errors.


Our approach assumes a collaborative user who is willing to help the system overcome recognition errors. The user corrects errors by providing additional input, either in the same modality or by switching to a different one. This multimodal approach to error recovery attempts to exploit two facts: input in different modalities provides redundant information, and switching modalities in itself alleviates user frustration.

We developed a prototype speech user interface for a dictation task, augmented with repair by respeaking, spelling, handwriting and gestures. In a typical scenario, the user highlights errors in the displayed hypothesis and corrects them by providing input in a modality of their choice. Preliminary studies have shown that repair by spelling or handwriting can be very effective, requiring significantly less effort than repair by choosing from N-best alternatives.
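
The repair scenario described above can be illustrated with a minimal sketch. It is not the prototype's implementation: the `Repair` record, the `apply_repair` function, and the example sentence are all hypothetical, and the sketch assumes that the repair input has already been recognized into words by whichever recognizer handles the chosen modality.

```python
from dataclasses import dataclass


@dataclass
class Repair:
    """Hypothetical repair event: the user highlights words and re-enters them."""
    start: int         # index of the first highlighted word
    end: int           # index one past the last highlighted word
    modality: str      # e.g. "respeak", "spell", "handwrite" (illustrative labels)
    words: list[str]   # words recognized from the repair input


def apply_repair(hypothesis: list[str], repair: Repair) -> list[str]:
    """Replace the highlighted span with the words recognized from the repair input."""
    if not (0 <= repair.start <= repair.end <= len(hypothesis)):
        raise ValueError("highlighted span lies outside the hypothesis")
    return hypothesis[:repair.start] + repair.words + hypothesis[repair.end:]


# Made-up example: the user highlights "nice beach" and spells the intended words.
hypothesis = "recognize speech with a nice beach".split()
repair = Repair(start=4, end=6, modality="spell", words=["new", "approach"])
corrected = apply_repair(hypothesis, repair)
print(" ".join(corrected))
```

A real system would also have to handle the case where the repair input is itself misrecognized, for example by letting the user repeat the repair in another modality.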


 




