Characterizing Spans for Sequence Labeling: A Case on Anglicism Detection

Elena Álvarez Mellado, Julio Gonzalo

Abstract


We propose a set of formal dimensions for characterizing spans in sequence labeling evaluation, and we apply them to a dataset and to model results for anglicism detection in Spanish. The results show that the best-performing system is outperformed by other models on certain types of spans. Our methodology can uncover performance limitations that go unnoticed under standard evaluation.
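
As a rough illustration of the idea, the sketch below scores two systems separately on each type of span instead of reporting a single aggregate number. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not list the dimensions, so span length (single-token versus multi-token) stands in as a hypothetical one, and the `ENG` label, the toy spans, and the function names are invented for the example.

```python
from collections import defaultdict

# A span is (start_token, end_token_exclusive, label).
# The "ENG" label and all spans below are made-up toy data.


def span_length_bucket(span):
    """Hypothetical dimension: bucket a span by its length in tokens."""
    start, end, _ = span
    return "single-token" if end - start == 1 else "multi-token"


def recall_by_bucket(gold_spans, predicted_spans, bucket_fn=span_length_bucket):
    """Exact-match recall over gold spans, reported separately per bucket."""
    predicted = set(predicted_spans)
    hits = defaultdict(int)
    totals = defaultdict(int)
    for span in gold_spans:
        bucket = bucket_fn(span)
        totals[bucket] += 1
        if span in predicted:
            hits[bucket] += 1
    return {bucket: hits[bucket] / totals[bucket] for bucket in totals}


gold = [(0, 1, "ENG"), (3, 5, "ENG"), (7, 8, "ENG"), (10, 13, "ENG"), (15, 16, "ENG")]
# System A finds 4 of 5 gold spans overall but misses one multi-token span.
system_a = [(0, 1, "ENG"), (7, 8, "ENG"), (10, 13, "ENG"), (15, 16, "ENG")]
# System B finds only 3 of 5 overall but recovers every multi-token span.
system_b = [(0, 1, "ENG"), (3, 5, "ENG"), (10, 13, "ENG")]

print(recall_by_bucket(gold, system_a))
print(recall_by_bucket(gold, system_b))
```

In this toy setup system A has the higher overall recall, yet system B does better on multi-token spans, which is the kind of difference an aggregate score hides.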
