Unsupervised Neural Machine Translation, a new paradigm solely based on monolingual text

Mikel Artetxe, Gorka Labaka, Eneko Agirre

Resumen


This article presents UnsupNMT, a 3-year project of which the first year has already been completed. UnsupNMT proposes a radically different approach to machine translation: unsupervised translation, that is, translation based on monolingual data alone with no need for bilingual resources. This method is based on deep learning of temporal sequences and uses cutting-edge interlingual word representations in the form of cross-lingual word embeddings. This project is not only a highly innovative proposal but it also opens a new paradigm in machine translation which branches out to other disciplines, such us transfer learning. Despite the current limitations of unsupervised machine translation, the techniques developed are expected to have great repercussions in areas where machine translation achieves worse results, such as translation between languages which have little contact, e.g. German and Russian.

Texto completo:

PDF