Clustering of syntactic and discursive information for the dynamic adaptation of Language Models

Juan Manuel Lucas Cuesta, Fernando Fernández Martínez, Javier Ferreiros López, Verónica López Ludeña, Rubén San Segundo Hernández

Abstract


In this paper we present an approach for clustering dialogue items, both semantic and discursive. We use Latent Semantic Analysis (LSA) to cluster the dialogue items according to a correlation-based distance. After building a set of groups that partition the semantic or discursive space, we train a stochastic Language Model (LM) for each group. We use these LMs to dynamically adapt the language model of the speech recognition module in a Spoken Dialogue System. Dialogue-based information (namely, the posterior probabilities of the dialogue items that our Dialogue Manager estimates on each dialogue turn) is used to automatically estimate the interpolation weights among the LMs. An initial evaluation shows a reduction in word error rate when the information obtained from an utterance is used to rescore that same utterance.
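
As an illustration of the adaptation step, the following minimal sketch shows one way interpolation weights could be derived from dialogue-item posteriors and applied to cluster-specific LMs. The abstract does not specify the weighting scheme, so everything here is an assumption: the weight of a cluster is taken to be the normalized sum of the posteriors of its items, the LMs are reduced to toy unigram tables, and all names and values (cluster_weights, interpolate, the example items and probabilities) are hypothetical.

```python
from collections import defaultdict


def cluster_weights(posteriors, item_to_cluster):
    """Sum dialogue-item posteriors per cluster and normalize them.

    Assumption: this summing rule is illustrative, not the paper's method.
    """
    totals = defaultdict(float)
    for item, p in posteriors.items():
        totals[item_to_cluster[item]] += p
    norm = sum(totals.values())
    if norm == 0:
        raise ValueError("posteriors sum to zero")
    return {cluster: total / norm for cluster, total in totals.items()}


def interpolate(word, cluster_lms, weights):
    """Linear interpolation: P(w) = sum_k lambda_k * P_k(w)."""
    return sum(lam * cluster_lms[c].get(word, 0.0) for c, lam in weights.items())


# Hypothetical posteriors estimated by a dialogue manager for one turn.
posteriors = {"play": 0.6, "stop": 0.2, "volume": 0.2}
# Hypothetical clustering result: each dialogue item belongs to one group.
item_to_cluster = {"play": "playback", "stop": "playback", "volume": "settings"}
# Toy unigram LMs, one per cluster (real systems would use n-gram models).
cluster_lms = {
    "playback": {"play": 0.5, "stop": 0.4, "louder": 0.1},
    "settings": {"play": 0.1, "stop": 0.1, "louder": 0.8},
}

weights = cluster_weights(posteriors, item_to_cluster)
p_louder = interpolate("louder", cluster_lms, weights)
print(weights, p_louder)
```

Because the weights are recomputed from the posteriors of each turn, the interpolated model shifts toward the clusters that the dialogue state currently makes most likely.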
