Automatic counter-narrative generation for hate speech in Spanish

Maria Estrella Vallecillo-Rodríguez, Arturo Montejo-Ráez, Maria Teresa Martín-Valdivia


This paper analyzes the use of language models to automatically generate counter-narratives for hate speech in Spanish. Although a few studies exist for English and other languages, no previous work has addressed this task for Spanish. We show that GPT-3 outperforms the other models in generating non-offensive and informative counter-narratives, which sometimes present compelling arguments. We apply few-shot learning with different prompting strategies and analyze the results of each. Additionally, we release a new corpus, CONAN-SP, consisting of 238 pairs of hate speech and counter-narratives in Spanish, to the research community to facilitate further research in this area. These findings highlight the potential of language models to combat hate speech in Spanish through counter-narrative generation.
