The Interactive Systems Laboratories (ISL) develops user interfaces that improve human-computer and human-human communication. The Laboratories' research falls into two areas: speech and multimodality. In multimodality, ISL conducts research in handwriting recognition, gaze tracking, sign translation, and lip-reading. In speech, ISL has made advances in speech recognition, machine translation, discourse analysis, and speech understanding. ISL has two locations: Carnegie Mellon University, USA, and the University of Karlsruhe, Germany. These locations give ISL an international presence in the speech and multimodal communities.
History

Since its establishment in 1991, ISL has been at the forefront of speech and multimodality technology. In the same year as its founding, ISL, along with the Advanced Telecommunications Research Institute International (ATR) of Japan, founded the Consortium for Speech Translation Advanced Research (C-STAR) to conduct research in spoken language translation. The Consortium's objective is to build speech translation prototypes for tourism. It has grown over the last decade to include more than a dozen affiliates in Europe, Asia, America and India, including AT&T of the US, the European Media Lab (EML) of Germany, LIMSI of France, SRI of the UK, and the Indian Institute of Technology in India.

ISL's speech translation system, developed as part of the C-STAR project, is named JANUS. In 1993, JANUS was one of the first systems to demonstrate that speaker-independent, continuous speech-to-speech translation is possible. The JANUS system provides speech recognition and machine translation for a variety of languages, including English, German, Spanish, Korean, Chinese, French, Italian, Portuguese, Swedish, Serbo-Croatian, Russian, Turkish, Arabic, Tamil and Czech. Another emphasis of the JANUS system is its ability to handle conversational, 'sloppy' speech under noise and/or cross-talk, such as in a conference room or over the telephone. JANUS ranked first in the official 1996 and 1997 DARPA Hub-5 benchmark (conversational telephone speech) and in the official German Verbmobil benchmark in 1994, 1995 and 1996.

Current Research

Speech

The JANUS system is currently being applied in the Babylon project. The goal of the Babylon project is to develop rapid, two-way, natural language speech translation interfaces and platforms for the war fighter for use in force protection, refugee processing, and medical triage. In line with current field operations, Babylon has four target languages: Pashto, Dari, Arabic, and Mandarin.
Babylon is an innovative project, not only for its interest in new, challenging languages, but also for its objective of deploying the system on a hand-held device. Running on a hand-held device makes Babylon a feasible application for field and triage work.

Negotiating through Spoken Language in E-Commerce, or NESPOLE, encompasses both speech and multimodality. The project supports multilingual communication to foster productive exchanges and cooperation across languages and cultures. NESPOLE involves four languages: English, French, German and Italian. CMU is working in collaboration with several European partners: AETHRA, APT Trentino and ITC-irst in Italy, and Joseph Fourier University in France. NESPOLE has been successfully demonstrated at the Carnegie Science Center of Pittsburgh and at LangTech 2002 in Berlin.

LingWear is a mobile tourist information system that allows users to traverse foreign cities and inquire about sights, accommodations, and other places of interest. The system runs on a wearable computer, so users can take it wherever they travel. It translates between English, German and Japanese. The development of LingWear took a multi-engine approach, using Example-Based Machine Translation (EBMT) and Statistical Machine Translation (SMT) methods to improve language portability, and an Interlingua-Based Machine Translation (ILMT) method to improve domain portability.

Multimodality

The main multimodal project in ISL is the INTERACT project. Its purpose is to enhance human-computer communication by processing and combining multiple communication modalities known to be helpful in human communicative situations. Several human-computer interaction tasks are explored to see how automatic gesture, speech and handwriting recognition, face and eye tracking, lip-reading and sound source localization can help make human-computer interaction more natural.
One of ISL's multimodal projects bringing together much of the research conducted in the INTERACT project is our Meeting Browser. The Meeting Browser records meetings, automatically transcribes and summarizes them, and allows a user to search a meeting, or a set of meetings, for a particular speaker, topic or idea. The Meeting Browser encompasses speaker identification, automatic speech recognition, language model adaptation, dialogue analysis and automatic summarization. This project holds potential for a variety of applications in academia, research and industry.

Another of our multimodal projects is the Sign Translation project. Our research currently focuses on the translation of Chinese signs into English. The system, which is deployed on a PDA, automatically detects signs in a natural scene. Using an extended EBMT method, the system then translates the Chinese characters into English text. Imagine the difference between a sign that says "Watch Your Step" and one that says "Visitor Entrance."

As globalization and technology continue to advance, the need for improved methods of human-human and human-computer interaction grows. Technologies in machine translation and speech recognition may help bridge the gap between cultures, while still preserving the world's unique languages and traditions. Multimodal technologies will better enable individuals to interact with computers as computers become ever more prevalent in daily activities. The Interactive Systems Laboratories is proud of its involvement in the research and development of new technologies to improve human-human and human-computer interaction.

ISL is affiliated with the following departments: