Enriching User Reviews Through An Opinion Extraction System

F. Javier Ortega, José A. Troyano, Fermín L. Cruz, Fernando Enríquez


Web sites based on User-Generated Content (UGC) are potentially valuable in a number of fields. In this work we study the usefulness of these systems as a means of understanding the perceptions that users express about the services or items they review. To this end, we have compiled and analyzed opinions expressed and shared by users on TripAdvisor, one of the most relevant UGC-based websites in the tourism domain. The study focuses on two kinds of data: structured (for example, numerical ratings) and unstructured (natural language texts). We perform a quantitative and a qualitative analysis of the information extracted from our dataset by an opinion extraction system. The qualitative analysis is especially interesting because it provides valuable information to tourist agents and hospitality managers: it can identify the strong and weak points of hotels according to user perceptions, going beyond the structured data offered by TripAdvisor. Finally, we study the complementarity of the knowledge extracted from the textual opinions and the structured data, and observe a noticeable increase in the amount of information available when both sources are combined.
