Using Semantic Graphs and Word Sense Disambiguation Techniques to Improve Text Summarization
Resumen
This paper presents a semantic graph-based method for extractive summarization. The summarizer uses WordNet concepts and relations to produce a semantic graph that represents the document, and a degree-based clustering algorithm is used to discover different themes or topics within the text. The selection of sentences for the summary is based on the presence in them of the most representative concepts for each topic. The method has proven to be an efficient approach to the identification of salient concepts and topics in free text. In a test on the DUC data for single document summarization, our system achieves significantly better results than previously published approaches based on terms and mere syntactic information. Besides, the system can be easily ported to other domains, as it only requires modifying the knowledge base and the method for concept annotation. In addition, we address the problem of word ambiguity in semantic approaches to automatic summarization.