Interactive Systems Lab

Gaze information plays an important role in identifying a person's focus of attention. The information can provide useful communication cues to a multimodal interface. For example, it can be used to identify where a person is looking, and what he/she is paying attention to.

To make intelligent, interactive environments respond appropriately to their users' needs, they must be equipped with perceptive capabilities that capture as much relevant information as possible about the users and the context in which they act. Knowing a person's focus of attention is a major step towards a better understanding of what users do, how and with what or whom they interact, and to what they refer.
We address the problem of tracking the focus of attention of participants in a meeting, i.e. tracking who is looking at whom during a meeting. Such information can, for example, be used to control interaction with a smart meeting room or to index and analyze multimedia meeting records.
A body of research literature suggests that humans are generally interested in what they look at, and the close relationship between gaze and attention during social interaction has been emphasized. In addition, recent user studies reported strong evidence that people naturally look at the objects or devices with which they interact.
A first step in determining someone's focus of attention is therefore to find out in which direction the person is looking. Two factors contribute to where a person looks: head orientation and eye orientation.

 

Head orientation can be estimated with non-intrusive methods, whereas eye orientation cannot.
Our approach to tracking at whom participants look, i.e. their focus of attention, is the following:
1. Detect all participants in the scene.
2. Estimate each participant's head orientation.
3. Map each estimated head orientation to its likely targets using a probabilistic framework.
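Step 3 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each potential target is modeled by a Gaussian over the observer's horizontal head rotation (pan) and that the posterior over targets follows from Bayes' rule. The target names, angles, standard deviations and function names are hypothetical and are not taken from the system described here.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def focus_posterior(pan_deg, targets, priors=None):
    """Return P(target | observed head pan) using Bayes' rule.

    targets maps a target name to (mean_pan_deg, std_deg); these
    class-conditional Gaussians are an assumption of this sketch.
    """
    names = list(targets)
    if priors is None:
        # Assume a uniform prior when none is supplied.
        priors = {name: 1.0 / len(names) for name in names}
    scores = {
        name: gaussian_pdf(pan_deg, *targets[name]) * priors[name]
        for name in names
    }
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# Hypothetical seating layout seen from one participant.
targets = {
    "left": (-45.0, 15.0),
    "center": (0.0, 15.0),
    "right": (45.0, 15.0),
}
posterior = focus_posterior(-30.0, targets)
likely = max(posterior, key=posterior.get)
```

Returning the full posterior rather than only the most likely target keeps the result usable when it is later combined with other sources of information.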
To improve the robustness of focus of attention tracking, we would like to combine various sources of information.
We have found that focus of attention is correlated with who is speaking in a meeting, and that a person's focus of attention can be estimated from information about who is talking at or before a given moment.
To estimate where a person is looking based on who is speaking, we use probability distributions of where participants look during certain "speaking constellations".
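One simple way to obtain such distributions is to count, for a given observer, how often each target was looked at under each speaking constellation and then normalize the counts. The sketch below assumes a constellation is encoded as a tuple of 0/1 speaking flags, one per participant; this encoding, the sample data and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def train_constellation_table(samples):
    """Estimate P(target | constellation) from labeled frames.

    samples is an iterable of (constellation, target) pairs for one
    observer, where constellation is a tuple of 0/1 speaking flags.
    """
    counts = defaultdict(Counter)
    for constellation, target in samples:
        counts[constellation][target] += 1

    table = {}
    for constellation, target_counts in counts.items():
        total = sum(target_counts.values())
        table[constellation] = {
            target: n / total for target, n in target_counts.items()
        }
    return table


# Hypothetical labeled frames: who was speaking, and where the observer looked.
samples = [
    ((1, 0, 0), "A"),
    ((1, 0, 0), "A"),
    ((1, 0, 0), "B"),
    ((1, 0, 0), "A"),
    ((0, 1, 0), "B"),
    ((0, 1, 0), "B"),
]
table = train_constellation_table(samples)
```

At run time, the distribution for the current constellation is a single dictionary lookup; constellations never seen in training would need a fallback such as a uniform distribution.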
The accuracy of sound-based prediction of focus of attention can be improved significantly by taking a history of speaker constellations into account. We have trained neural networks to predict focus of attention based on who was speaking during a short period of time.
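The text does not specify how the speaker history is presented to the networks. A common choice, assumed here for illustration only, is to flatten a sliding window of constellations into one fixed-length input vector. The window length, zero padding and function name below are hypothetical, and the network itself is not shown.

```python
def history_features(constellations, t, window):
    """Flatten the `window` constellations ending at frame t into one vector.

    constellations is a sequence of equal-length tuples of 0/1 speaking
    flags. Frames before the start of the recording are padded with zeros.
    """
    n_participants = len(constellations[0])
    features = []
    for i in range(t - window + 1, t + 1):
        if i < 0:
            features.extend([0] * n_participants)
        else:
            features.extend(constellations[i])
    return features


# Hypothetical sequence of speaking flags for three participants.
frames = [(1, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
x = history_features(frames, t=2, window=3)
x_start = history_features(frames, t=0, window=3)
```

The resulting vector has `window * n_participants` entries regardless of `t`, so it can be fed to any classifier that expects a fixed input size.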
Finally, the head-pose-based and the sound-based estimates are combined to obtain a multimodal estimate of the participants' focus of attention. This led to significant improvements over using either modality alone.
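The combination rule is not spelled out in the text above. As a sketch under that caveat, one straightforward option is a weighted sum of the two posterior distributions over the same set of targets. The weight `alpha`, the example probabilities and the function name are assumptions made for illustration.

```python
def fuse(head_posterior, sound_posterior, alpha=0.5):
    """Weighted sum of two posteriors over targets, renormalized.

    alpha is the weight of the sound-based estimate; (1 - alpha) is the
    weight of the head-pose-based estimate. Missing targets count as 0.
    """
    targets = set(head_posterior) | set(sound_posterior)
    mixed = {
        t: (1.0 - alpha) * head_posterior.get(t, 0.0)
        + alpha * sound_posterior.get(t, 0.0)
        for t in targets
    }
    total = sum(mixed.values())
    return {t: p / total for t, p in mixed.items()}


# Hypothetical per-frame estimates for one observer.
head = {"A": 0.6, "B": 0.3, "C": 0.1}
sound = {"A": 0.2, "B": 0.7, "C": 0.1}
fused = fuse(head, sound, alpha=0.5)
best = max(fused, key=fused.get)
```

In this example the head-pose estimate alone favors A, while the fused estimate favors B, which shows how the second modality can change the decision. A suitable value for `alpha` would have to be chosen on held-out data.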
Our system for focus of attention detection in meetings has been successfully installed in both of our labs, at the Universität Karlsruhe, Germany, and at Carnegie Mellon University in Pittsburgh, USA.

full paper:
Tracking Focus of Attention in Meetings
Rainer Stiefelhagen
IEEE International Conference on Multimodal Interfaces, Pittsburgh, PA, USA, October 14-16, 2002.
