Normalisation of Education Information in Digitalised Recruitment Processes

Laura García-Sardiña, Federico Retyk, Hermenegildo Fabregat, Lucas Alvarez Lacasa, Rus Poves, Rabih Zbib


Digitalised recruitment processes typically rely on key information automatically extracted from resumes. Educational background information is particularly noisy because the variety of degree names keeps growing, which makes its normalisation decisive for any subsequent use of such data. In this work, we define the normalisation of education information as its transformation into level/field-of-study pairs. To that end, we define and share a new taxonomy of fields of study within the labour context. We develop a simple approach in which the level of study is identified using expert rules, and the field of study is normalised using a combination of rules, which cover the most frequent occurrences, and classifier predictions, which generalise over the less frequent cases. We evaluate the proposed system on a new test set that we also make publicly available. Finally, we investigate the application of education normalisation to a candidate-job matching use case.
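
As an illustration of the two-stage idea described above (rules for the level of study; rules first and a classifier as fallback for the field of study), the following is a minimal Python sketch. Everything in it is an assumption made for illustration: the patterns, the level and field labels, and the `toy_classifier` stand-in are hypothetical and do not reproduce the rules, taxonomy, or trained classifier of this work.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical level rules (illustrative only, not the expert rules of the paper).
LEVEL_RULES = [
    (re.compile(r"\b(ph\.?\s?d|doctorate|doctoral)\b", re.I), "doctorate"),
    (re.compile(r"\b(master|m\.?sc|mba)\b", re.I), "master"),
    (re.compile(r"\b(bachelor|b\.?sc)\b", re.I), "bachelor"),
]

# Hypothetical field rules covering a few frequent phrasings.
FIELD_RULES = {
    "computer science": "computer_science",
    "mechanical engineering": "engineering",
}


def normalise_level(text: str) -> Optional[str]:
    """Return the first level whose pattern matches, or None if no rule fires."""
    for pattern, level in LEVEL_RULES:
        if pattern.search(text):
            return level
    return None


def toy_classifier(text: str) -> str:
    """Stand-in for a trained field-of-study classifier (assumption)."""
    return "health" if "nurs" in text.lower() else "other"


def normalise_field(text: str, classifier: Callable[[str], str]) -> str:
    """Apply keyword rules first; fall back to the classifier otherwise."""
    lowered = text.lower()
    for keyword, field in FIELD_RULES.items():
        if keyword in lowered:
            return field
    return classifier(text)


def normalise_education(
    text: str, classifier: Callable[[str], str] = toy_classifier
) -> Tuple[Optional[str], str]:
    """Map a raw education string to a (level, field-of-study) pair."""
    return normalise_level(text), normalise_field(text, classifier)


if __name__ == "__main__":
    for raw in ["MSc in Computer Science", "Bachelor of Nursing"]:
        print(raw, "->", normalise_education(raw))
```

In this sketch the rules take precedence because they are cheap and predictable for frequent cases, while the classifier is consulted only when no rule matches; a real system would replace `toy_classifier` with a trained model.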
