Context-Sensitive Spoken Dialogue Processing with the DOP Model

Rens Bod
Institute for Logic, Language and Computation, University of Amsterdam, Spuistraat 134, NL-1012 VB Amsterdam
&
School of Computer Studies, University of Leeds, Leeds LS2 9JT, UK
rens@scs.leeds.ac.uk

Abstract

We show how the DOP model can be used for fast and robust context-sensitive processing of spoken input in a practical spoken dialogue system called OVIS. OVIS, Openbaar Vervoer Informatie Systeem ("Public Transport Information System"), is a Dutch spoken language information system which operates over ordinary telephone lines. The prototype system is the immediate goal of the NWO Priority Programme "Language and Speech Technology". In this paper, we extend the original DOP model to context-sensitive interpretation of spoken input. The system we describe uses the OVIS corpus (which consists of 10,000 trees enriched with compositional semantics) to compute from an input word-graph the best utterance together with its meaning. Dialogue context is taken into account by dividing up the OVIS corpus into context-dependent subcorpora. Each system question triggers a subcorpus by which the user answer is analyzed and interpreted. Our experiments indicate that the context-sensitive DOP model obtains better accuracy than the original model, allowing for fast and robust processing of spoken input.

1. Introduction

The Data-Oriented Parsing (DOP) model is a corpus-based parsing model which uses subtrees from parse trees in a corpus to analyze new sentences. The occurrence-frequencies of the subtrees are used to estimate the most probable analysis of a sentence (cf. Bod 1992, 93, 95, 98; Bod & Kaplan 1998; Bonnema et al. 1997; Chappelier & Rajman 1998; Charniak 1996; Cormons 1999; Goodman 1996, 98; Kaplan 1996; Rajman 1995; Scha 1990, 92; Scholtes 1992; Sekine & Grishman 1995; Sima'an 1995, 97, 99; Tugwell 1995; Way 1999). To date, DOP has mainly been applied to corpora of trees whose labels consist of


