Using collocation segmentation to extract translation units in a phrase-based statistical machine translation system

Marta R. Costa-jussà, Vidas Daudaravicius, Rafael E. Banchs


This report evaluates the impact of using a novel collocation segmentation method for phrase extraction in the standard phrase-based statistical machine translation approach. The collocation segmentation technique is applied simultaneously to the source and target sides, and the resulting segmentation is used to extract translation units. Experiments are reported on the Spanish-to-English EuroParl task, and the results are promising in terms of translation quality.

