Local modifications and paraphrases in Wikipedia's revision history

Camille Dutrey, Houda Bouamor, Delphine Bernhard, Aurélien Max


In this article, we analyse the modifications available in the French Wikipedia revision history.
We define a typology of modifications based on a detailed study of WiCoPaCo, a freely available resource built by automatically mining Wikipedia's revision history.
Based on this typology, we describe a manual annotation study of a subset of the corpus, designed to assess the difficulty of automatic paraphrase identification in such data.
Finally, we assess a rule-based paraphrase identification tool.

