Comparative Scientific Research Analysis with a Language-Independent Cross-Collection Model

Michael Paul, Roxana Girju


This paper addresses the problem of analyzing scientific research across multiple collections of research literature. We use topic modeling in three novel comparative tasks: (1) unsupervised discovery and comparison of scientific topics across multiple disciplines; (2) comparison of topics within the same discipline; and (3) analysis of topic evolution over time within and across disciplines. We also experiment with trend analysis and propose a novel measure of topic influence based on the temporal correlation of related topics. Additionally, we evaluate the model on document classification, where it yields performance comparable to an optimally tuned SVM.
