We use neural networks for lipreading. Our task is speaker-dependent continuous spelling of the German alphabet.

Why lipreading? We want to improve the recognition rate of acoustic speech recognizers, especially under conditions that are not optimal for those systems (cross-talk, ...).

The goal is an on-line lipreader that is robust against on-line conditions such as changes in illumination, translation, and size, without relying on additional aids such as lip markers.
Examples from the database:

Subsystems of our lip-reader:
    Face-Tracker
    Lip-Finder: neural-net architecture that finds the corners of the lips
    Visual TDNN: preprocessing with gray-level images, LDA, PCA, FFT, gray-value modification
    Acoustic TDNN: preprocessing with 16 mel-scale coefficients
    Audio-visual combined MS-TDNN: the two streams are combined on the phonetic layer
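
The following is a minimal sketch of the idea behind the last three subsystems, not the actual implementation. It shows a single time-delay layer (each output frame sees a sliding window of input frames) and one possible way to combine the acoustic and visual nets on the phonetic layer. The function names, the fixed `visual_weight`, and the plain weighted sum are assumptions made for illustration; the real MS-TDNN may combine the streams differently.

```python
import math


def tdnn_layer(frames, weights, bias):
    """Apply one time-delay layer with sigmoid units.

    frames:  list of T input vectors, each of length D
    weights: list of U units; each unit is a list of `delay` vectors of length D
    bias:    list of U biases
    Returns T - delay + 1 output vectors, each of length U.
    """
    delay = len(weights[0])
    outputs = []
    for t in range(len(frames) - delay + 1):
        window = frames[t:t + delay]  # the frames this output position sees
        row = []
        for unit_weights, b in zip(weights, bias):
            s = b + sum(
                w * x
                for w_vec, x_vec in zip(unit_weights, window)
                for w, x in zip(w_vec, x_vec)
            )
            row.append(1.0 / (1.0 + math.exp(-s)))
        outputs.append(row)
    return outputs


def combine_phoneme_layers(acoustic, visual, visual_weight=0.5):
    """Weighted sum of per-frame phoneme activations (assumed combination rule)."""
    if len(acoustic) != len(visual):
        raise ValueError("both streams must have the same number of frames")
    acoustic_weight = 1.0 - visual_weight
    return [
        [acoustic_weight * pa + visual_weight * pv for pa, pv in zip(fa, fv)]
        for fa, fv in zip(acoustic, visual)
    ]


# Toy run: 5 frames of 16 mel-scale coefficients, 3 hypothetical phoneme units,
# and a delay window of 3 frames. All weights are zero, purely for illustration.
acoustic_frames = [[0.0] * 16 for _ in range(5)]
acoustic_weights = [[[0.0] * 16 for _ in range(3)] for _ in range(3)]
acoustic_phonemes = tdnn_layer(acoustic_frames, acoustic_weights, [0.0] * 3)
```

In this sketch a larger `visual_weight` would let the visual stream dominate, which might be useful when the acoustic signal is degraded, for example by cross-talk.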

System overview:

This work is sponsored by the state of Baden-Württemberg, Germany (Landesforschungsschwerpunkt Neuroinformatik). Partial support was also provided by the Advanced Research Projects Agency (US). One of the first members was Chris Bregler, in 1992; he is now with the International Computer Science Institute in Berkeley (have a look at his lipreading page, Visual Acoustic Speech Recognition (Computer Lipreading)).

Screen dump:

Site maintained by: Céline Morel