Indexing metadata

Generating Multiple-Choice Questions in Spanish and Basque using LLMs: A Comparative Manual Evaluation


 
Dublin Core PKP metadata items Metadata for this document
 
1. Title Title of document Generating Multiple-Choice Questions in Spanish and Basque using LLMs: A Comparative Manual Evaluation
 
2. Creator Author's name, affiliation, country Maddalen López de Lacalle
 
2. Creator Author's name, affiliation, country Xabier Saralegi
 
2. Creator Author's name, affiliation, country Aitzol Saizar
 
3. Subject Discipline(s)
 
3. Subject Keyword(s)
 
4. Description Abstract Multiple-Choice Questions (MCQs) are widely applied across various domains, such as education and assessing the technical skills of staff in companies. However, creating such questions manually is challenging and time-consuming, especially for specialized fields. In this paper, we explore how generative large language models (LLMs) can be exploited to generate MCQs from instructional texts that serve as tests for vocational qualification assessment. We focus on two topics, basic first aid and production scheduling in companies, for which we created two datasets of parallel course texts in Spanish and Basque. The manual evaluation reveals that both the open-source Llama3 instructed models (8B and 70B) and the proprietary GPT-4o can generate MCQs of acceptable quality in a zero-shot setting for Spanish. No significant differences were observed in performance based on model size or licensing type, with performance rates of 91%, 84%, and 80% for GPT-4o, Llama3-70B, and Llama3-8B, respectively. However, the results for Basque show a marked decline, with performance dropping to 70% for GPT-4o and 59% for Llama3-70B, and a notably low 27% for Llama3-8B. Finally, few-shot generation using the Basque-adapted Llama-eus-8B foundational model shows promising potential.
 
5. Publisher Organizing agency, location Sociedad Española para el Procesamiento del Lenguaje Natural
 
6. Contributor Sponsor(s)
 
7. Date (YYYY-MM-DD) 2025-03-31
 
8. Type Status and genre Peer-reviewed article
 
8. Type Type
 
9. Format File format PDF
 
10. Identifier Uniform Resource Identifier http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6674
 
11. Source Title; vol., no. (year) Procesamiento del Lenguaje Natural; Vol. 74 (2025): Procesamiento del Lenguaje Natural, Journal No. 74, March 2025
 
12. Language Spanish=es es_ES
 
13. Relation Supplementary files
 
14. Coverage Geo-spatial location, chronological period, research sample (gender, age, etc.)
 
15. Rights Copyright and permissions Copyright (c) 2025 Procesamiento del Lenguaje Natural