Categorizing Misogynistic Behaviours in Italian, English and Spanish Tweets

Silvia Lazzardi, Viviana Patti, Paolo Rosso


Misogyny is a multifaceted phenomenon that can manifest linguistically in numerous ways. The 2018 EVALITA and IberEval evaluation campaigns proposed a shared task on Automatic Misogyny Identification (AMI) based on Italian, English and Spanish tweets. Because the participating teams obtained low scores in the categorization of misogynistic behaviour, this study investigates the possible causes. We measured the overlap and the homogeneity of the clusters while varying the number of categories, and this experiment showed that the clusters overlap. Finally, we tested several machine learning models on both the original data sets and on versions in which some categories were merged according to their overlap; merging yielded an increase in macro F1.
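
To illustrate why merging overlapping categories can raise macro F1, the following is a minimal sketch, not the authors' pipeline. The labels `cat_a`, `cat_b`, `cat_c`, the toy predictions and the merge mapping are all hypothetical and are not drawn from the AMI data; in the study the merge is driven by the measured overlap between categories.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def merge_labels(labels, mapping):
    """Relabel items according to mapping; unmapped labels are left unchanged."""
    return [mapping.get(label, label) for label in labels]


# Hypothetical toy data: the classifier confuses cat_a and cat_b with each other.
y_true = ["cat_a", "cat_a", "cat_b", "cat_b", "cat_c", "cat_c"]
y_pred = ["cat_a", "cat_b", "cat_b", "cat_a", "cat_c", "cat_c"]

# Hypothetical merge of the two overlapping categories into one.
mapping = {"cat_a": "cat_ab", "cat_b": "cat_ab"}

original_score = macro_f1(y_true, y_pred)
merged_score = macro_f1(merge_labels(y_true, mapping), merge_labels(y_pred, mapping))

print(round(original_score, 3))  # 0.667
print(round(merged_score, 3))    # 1.0
```

In this toy case every error is a confusion between the two merged categories, so the merged score reaches 1.0; on real data the gain would depend on how much of the error mass falls between the merged categories.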
