OntoLM: Integrating Knowledge Bases and Language Models for classification in the medical domain

Fabio Yáñez-Romero, Andrés Montoyo, Rafael Muñoz, Yoan Gutiérrez, Armando Suárez


Large language models have shown impressive performance in Natural Language Processing tasks, but their black box characteristics render the explainability of the model’s decision difficult to achieve and the integration of semantic knowledge. There has been a growing interest in combining external knowledge sources with language models to address these drawbacks. This paper, OntoLM, proposes a novel architecture combining an ontology with a pre-trained language model to classify biomedical entities in text. This approach involves constructing and processing graphs from ontologies and then using a graph neural network to contextualize each entity. Next, the language model and the graph neural network output are combined into a final classifier. Results show that OntoLM improves the classification of entities in medical texts using a set of categories obtained from the Unified Medical Language System. We can create more traceable natural language processing architectures using ontology graphs and graph neural networks.

Texto completo: