SPEECH ACT, DIALOG GAME, and DIALOG ACTIVITY 
TAGGING MANUAL 
for Spanish Conversational Speech

Version 1.0


Project CLARITY
Language Technologies Institute
Carnegie Mellon University


Ann Thymé-Gobbel
Natural Speech Technologies, Inc.

Lori Levin
Carnegie Mellon University

0. INTRODUCTION

This manual describes a three-level discourse coding scheme developed for, and used in, the manual tagging of the CallHome Spanish and CallFriend Spanish databases in the CLARITY project at the Language Technologies Institute at Carnegie Mellon University.

The goal of CLARITY is to explore the use of discourse structure in understanding conversational speech. The project combines empirical methods for dialogue processing with state-of-the-art large-vocabulary continuous speech recognition (LVCSR), using the JANUS recognizer. The three levels of the coding scheme are (1) a speech act level consisting of a tag set extended from DAMSL and Switchboard; (2) a dialogue game level defined by initiative and speaker intention; and (3) an activity level defined within topic units. Each of the three levels of tagging is discussed in the following sections. The manually tagged dialogues are used to train automatic classifiers.
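As a rough illustration of how the three levels relate, the sketch below represents a tagged dialogue as nested records and flattens it into one row per utterance, the kind of table a classifier might be trained on. It is only a sketch under stated assumptions: the class names, field names, and all tag labels are hypothetical placeholders and are not taken from the tag sets defined in this manual, and the strict nesting of speech acts inside games inside activities is a simplifying assumption rather than a claim about the coding scheme.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Utterance:
    speaker: str
    text: str
    speech_act: str  # level 1 tag (placeholder label, not from the manual)


@dataclass
class DialogGame:
    label: str       # level 2 tag (placeholder label)
    initiator: str   # speaker who holds the initiative in this game
    utterances: List[Utterance] = field(default_factory=list)


@dataclass
class Activity:
    label: str       # level 3 tag (placeholder label)
    games: List[DialogGame] = field(default_factory=list)


def flatten(activities: List[Activity]) -> List[Tuple[str, str, str, str]]:
    """Return one (activity, game, speech_act, text) row per utterance."""
    rows = []
    for activity in activities:
        for game in activity.games:
            for utt in game.utterances:
                rows.append((activity.label, game.label, utt.speech_act, utt.text))
    return rows


# Hypothetical two-utterance exchange; every tag here is a placeholder.
example = Activity(
    label="PLACEHOLDER_ACTIVITY",
    games=[
        DialogGame(
            label="PLACEHOLDER_GAME",
            initiator="A",
            utterances=[
                Utterance("A", "¿Qué tal?", "PLACEHOLDER_QUESTION"),
                Utterance("B", "Bien, gracias.", "PLACEHOLDER_ANSWER"),
            ],
        )
    ],
)

rows = flatten([example])
```

Each row carries the labels from all three levels, so a single utterance can be looked up by its speech act, by the game it belongs to, or by the surrounding activity.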

INDEX

1. Speech Act Level Tagging: Background Information
2. Speech Act Level Tagging: Descriptions and Examples
   2a. Semantic Feature Dimension
   2b. List of Speech Act Tags
       Questions
       Answers
       Agreement/Disagreement
       Discourse Markers
       Forward Functions
       Control Acts
       Statements
       Other Tags
3. Speech Act Level Tagging: Decision Trees
4. Dialog Game Level Tagging: Background Information
5. Dialog Game Level Tagging: Descriptions and Examples
6. Dialog Game Segmentation
7. Activity Level Tagging: Descriptions and Examples
8. Putting It All Together
9. References


Contact: lsl+@nl.cs.cmu.edu

last updated: January 26, 1999