Towards Robust Models for Fake News Detection in Spanish

Sergio Gómez González, Mariona Coll Ardanuy, Paolo Rosso

Abstract


This paper addresses fake news detection exclusively in Spanish, an application domain that has received little research attention. Because news topics change continuously, models that cannot adapt become ineffective in the long term, so robustness is key in this domain. With that goal in mind, we apply several techniques, including data exploitation and augmentation, to improve the performance of a simple pre-trained transformer-based model, and we compare it with a generative large language model. We evaluate performance on two dataset splits: a standard partition that balances the training and test sets, and a more realistic (adversarial) one. Finally, we discuss which aspects most influence the robustness and performance of fake news detection models.
