Speaker: Cenk Demiroglu, Senior Research Scientist, M*Modal Technologies

Title: Multi-Sensor Segmentation-Based Speech Enhancement for Ultra-Low Bitrate Speech Coders

Abstract: The speech signal is significantly distorted in harsh noise environments, such as inside a tank or a helicopter. When such distorted speech is encoded with a parametric speech coder, the quality and intelligibility of the output speech are typically very low. This is an important problem, especially in military applications, where understanding the message correctly is critical. In this talk, the DARPA Advanced Speech Encoding (ASE) program will be briefly presented, with an overview of the system proposed by the Georgia Tech speech group. Then, a novel multi-sensor speech enhancement system will be presented, which was shown to substantially improve the intelligibility of the current NATO MELPe speech coding standard in military environments. The reasons for the intelligibility gain and possible ways to improve it further will also be discussed using several speech samples.

The proposed enhancement system uses a broad sound-class-level segmentation algorithm that is based on hard-threshold heuristics and performed poorly for some sound classes. In fact, segmentation errors were found to degrade intelligibility in some cases. To improve the accuracy of the automatic segmenter, some noise-robust statistical speech recognition algorithms, such as missing-feature-based approaches, were investigated. In the second part of the presentation, the results of that multi-sensor, noise-robust speech recognition research will be presented.

Speaker Bio: Cenk Demiroglu received his Ph.D. from the ECE department at the Georgia Institute of Technology. During that time he worked on noise-robust automatic speech recognition (ASR) and multi-sensor speech enhancement systems. He worked at Customspeech USA, Inc. for 3 years as a technical lead, where he developed a state-of-the-art LVCSR system and an HMM-based TTS system. He is currently a senior research scientist at M*Modal Technologies, doing research on noise-robust and multilingual ASR.