IBEREVAL OM: Mining Opinions from the new textual genres

Alexandra Balahur, Ester Boldrini, Andrés Montoyo, Patricio Martínez-Barco

Abstract


The increasing amount of subjective data on the Web is creating the need to
develop effective Question Answering systems able to distinguish such information from factual
data and subsequently process it with specific methods. The participants in the IBEREVAL OM tasks
will be given a set of opinion questions (in Spanish and English). Optionally, they can also
receive the same set of questions annotated with the source, target and expected polarity, as well as
the time span each question refers to. They will also be provided with a collection of
blog posts (in Spanish and English), extracted using the Technorati blog search engine, in which the
answers to the opinion questions should be found.
The gold standard for this blog post collection will be annotated beforehand by three annotators
using the EmotiBlog scheme. The EmotiBlog corpus and the set of questions presented in
(Balahur et al., 2009), in their present state, will be provided for system training. Participants can
take part in two subtasks: 1) in the first, they will be asked to provide the list of
answers to each question (in the same language as the question, or in the other language); 2) in
the second, they will be asked to provide a summary of the answers, that is, the top x% of the
most important answers, presented in a non-redundant manner. The gold standard for the summaries will be
automatically extracted from the manual annotations, taking into account the “intensity” parameter of
the opinions expressed.
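
As an illustration only, the second subtask (keeping the top x% of answers by intensity, without repetitions) could be sketched as follows. The `Answer` type, the numeric `intensity` scale, the exact-match notion of redundancy and the choice to compute the percentage over the deduplicated answers are all assumptions made for this sketch; the abstract does not specify how the organizers implement any of these steps.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    text: str       # answer snippet taken from a blog post
    intensity: int  # hypothetical numeric intensity; higher means stronger opinion


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially repeated answers match."""
    return " ".join(text.lower().split())


def summarize(answers, top_percent):
    """Return the top `top_percent` % of distinct answers, strongest first.

    Redundancy is approximated by exact match after normalization, which is
    a simplification chosen for this sketch, not the task's definition.
    """
    # Keep one answer per normalized text, preferring the highest intensity.
    best = {}
    for answer in answers:
        key = normalize(answer.text)
        if key not in best or answer.intensity > best[key].intensity:
            best[key] = answer

    # Stable sort: answers with equal intensity keep their original order.
    ranked = sorted(best.values(), key=lambda a: a.intensity, reverse=True)
    if not ranked:
        return []

    # Assumption: the percentage applies to the deduplicated answers,
    # and at least one answer is always returned.
    keep = max(1, math.ceil(len(ranked) * top_percent / 100))
    return ranked[:keep]


answers = [
    Answer("Great phone", 3),
    Answer("great   phone", 1),
    Answer("Battery is poor", 2),
    Answer("It is fine", 1),
    Answer("Love the screen", 3),
]
summary = summarize(answers, top_percent=50)
```

A real system would likely need a softer redundancy check (for example, lexical or semantic similarity between answers) instead of the exact match used here.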
