Interactive Systems Lab

The idea of the dialogue analysis module in the meeting room context is to use features other than keywords for information access to spoken communication. Traditional information retrieval methods focus only on a very narrow notion of topic as a bag of keywords, whereas spoken language also takes place in a particular situation and in a particular style. In this paper we can give only one simplified example in which the speaker identities and their dominance are important, namely the selection of a meeting from the database. Other problems not covered here include the selection of a database out of a collection of databases, the segmentation of a meeting, and the selection of a segment within a meeting. Also not covered is work on the detection of dialogue acts, games and activities.
Five meetings in the meeting database have been annotated with topic segmentations. Selecting a meeting by a query that contains the precise time, all of the keywords, or precise information about who was there and how much they talked would be trivial. The location of the meeting, on the other hand, is uninformative, since all meetings were recorded around the conference table in our lab.
For dialogue selection it is assumed that the queries correspond to features of a dialogue segment and that each segment in the database is equally likely to be chosen as a query.

A neural network that detects the dialogue identity of a segment has been built. The network is designed to output a probability distribution over meeting identities, and it is tested using round robin over the whole database. To assess information access performance, the reduction of the empirical entropy of the meeting identity was measured in bits. This retrieval model is quite natural, since we can assume that a user remembers just some part of the meeting and that most features are similar (yet not identical) in other segments of the meeting.
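The entropy-reduction measure can be illustrated with a minimal sketch. The exact formula used in the experiments is not given here, so the following is an assumption: the prior entropy of the meeting identity is taken from the segment counts (each segment equally likely to be the query), and the remaining uncertainty is estimated as the average negative log2 probability that the network assigns to the correct meeting. The function names and the input layout (`true_meetings`, `predicted`) are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def entropy_reduction_bits(true_meetings, predicted, eps=1e-12):
    """Estimate how many bits the classifier output removes.

    true_meetings: one meeting id per query segment (hypothetical layout).
    predicted: one dict per segment mapping meeting id -> probability,
               standing in for the network output (assumption).
    """
    n = len(true_meetings)

    # Prior: every segment is equally likely to be the query, so the prior
    # probability of a meeting is its share of the segments.
    counts = Counter(true_meetings)
    prior_entropy = entropy_bits([c / n for c in counts.values()])

    # Remaining uncertainty: average negative log2 probability assigned to
    # the correct meeting (clipped at eps to avoid log of zero).
    remaining = -sum(
        math.log2(max(dist.get(meeting, 0.0), eps))
        for meeting, dist in zip(true_meetings, predicted)
    ) / n

    return prior_entropy - remaining


if __name__ == "__main__":
    # Toy data only, not results from the meeting database.
    truth = ["A", "A", "B", "B"]
    outputs = [
        {"A": 0.9, "B": 0.1},
        {"A": 0.8, "B": 0.2},
        {"A": 0.3, "B": 0.7},
        {"A": 0.1, "B": 0.9},
    ]
    print(f"{entropy_reduction_bits(truth, outputs):.3f} bits")
```

Under this definition, a feature set that carries no information about the meeting yields a reduction of about zero, while a perfect predictor recovers the full prior entropy. In a round-robin setup, each prediction would come from a network that was not trained on the segment being tested.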
The results show that keyword-based methods are powerful, but that alternatives such as speaker identity and activity exist that seem to (a) be more natural, (b) be a likely part of queries, (c) be easy to visualize in a browsing task and (d) explain most of the word-level information implicitly.

 