Spanish Morphological Generation with Wide-Coverage Lexicons and Decision Trees

Daniel Ferrés, Ahmed AbuRa'ed, Horacio Saggion


Morphological Generation is the task of producing the appropriate inflected form of a lemma in a given textual context and according to some morphological features. This paper describes and evaluates wide-coverage morphological lexicons and a Decision Tree algorithm that perform Morphological Generation in Spanish at a state-of-the-art level. The Freeling, Leffe and Apertium Spanish lexicons, the J48 Decision Tree algorithm, and the combination of J48 with the Freeling and Leffe lexicons have been evaluated on the following Spanish datasets: i) the CoNLL2009 Shared Task dataset, ii) the Durrett and DeNero dataset of Spanish verbs (DDN), and iii) the SIGMORPHON 2016 Shared Task (task-1) dataset. The results show that: i) the Freeling and Leffe lexicons achieve high coverage and precision on the DDN and SIGMORPHON 2016 datasets, ii) the J48 algorithm achieves state-of-the-art results on all three datasets, and iii) the combination of Freeling, Leffe and the J48 algorithm outperforms our other approaches on the three evaluation datasets, slightly improves on the CoNLL2009 and SIGMORPHON 2016 results reported in the state-of-the-art literature, and achieves results comparable to those reported in the state-of-the-art literature on the DDN dataset.
