

State-of-the-art methods for learning cross-lingual word embeddings have relied on bilingual dictionaries or parallel corpora. The need for parallel data supervision can be alleviated with character-level information. While these methods showed encouraging results, they are not on par with their supervised counterparts and are limited to pairs of languages sharing a common alphabet. In this work, we show that we can build a bilingual dictionary between two languages without using any parallel corpora, by aligning monolingual word embedding spaces in an unsupervised way. Without using any character information, our model even outperforms existing supervised methods on cross-lingual tasks for some language pairs. Our experiments demonstrate that our method works very well also for distant language pairs. We finally show that our method is a first step towards fully unsupervised machine translation and describe experiments on the English-Esperanto language pair, on which there only exists a limited amount of parallel data.

In this paper, we introduce a model that either is on par with, or outperforms, supervised state-of-the-art methods, without employing any cross-lingual annotated data. We only use two large monolingual corpora, one in the source and one in the target language. Our method leverages adversarial training to learn a linear mapping from a source to a target space and operates in two steps. First, in a two-player game, a discriminator is trained to distinguish between the mapped source embeddings and the target embeddings, while the mapping (which can be seen as a generator) is jointly trained to fool the discriminator. Second, we extract a synthetic dictionary from the resulting shared embedding space and fine-tune the mapping with the closed-form Procrustes solution from Schönemann (1966).
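The first step, the two-player game, can be sketched on toy data as follows. This is a minimal illustration, not the paper's actual setup: it uses a linear logistic discriminator for brevity (a real discriminator would be a multilayer network), and the dimensions, learning rate, and Gaussian toy "embeddings" are illustrative assumptions.

```python
import numpy as np

# Toy adversarial alignment: a logistic discriminator D is trained to output 1
# on target embeddings and 0 on mapped source embeddings, while the linear
# mapping W (the generator) is updated to make D output 1 on mapped sources.
rng = np.random.default_rng(1)
dim, n, lr = 4, 256, 0.1

# Anisotropic "source" vs isotropic "target" embeddings (toy stand-ins).
X = rng.standard_normal((n, dim)) * np.array([3.0, 1.0, 1.0, 1.0])
Y = rng.standard_normal((n, dim))

W = np.eye(dim)      # the mapping to learn (the generator)
w_d = np.zeros(dim)  # discriminator weights
b_d = 0.0            # discriminator bias

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    # Discriminator step: binary cross-entropy, label 1 for Y, 0 for X @ W.T.
    XW = X @ W.T
    p_tgt = sigmoid(Y @ w_d + b_d)   # should be pushed towards 1
    p_src = sigmoid(XW @ w_d + b_d)  # should be pushed towards 0
    w_d -= lr * (Y.T @ (p_tgt - 1) + XW.T @ p_src) / n
    b_d -= lr * (np.mean(p_tgt - 1) + np.mean(p_src))

    # Generator step: update W to fool D, i.e. minimize -log D(W x).
    p_src = sigmoid(X @ W.T @ w_d + b_d)
    W -= lr * np.outer(w_d, (p_src - 1) @ X) / n

print(W.shape)  # the learned linear mapping
```

The alternating updates mirror the two-player structure described above: the discriminator descends its classification loss while the mapping descends the "fooling" loss with flipped labels.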
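The second step relies on the classical orthogonal Procrustes result: given paired matrices of source and target embeddings from the synthetic dictionary, the orthogonal mapping minimizing the Frobenius distance between mapped sources and targets has a closed form via an SVD. A minimal sketch (variable names and the toy data are mine, not the paper's):

```python
import numpy as np

def procrustes(X, Y):
    """Closed-form orthogonal Procrustes solution (Schönemann, 1966).

    X, Y: (n_pairs, dim) aligned embedding matrices (dictionary pairs).
    Returns the orthogonal W minimizing ||X @ W.T - Y||_F, i.e. W = U V^T
    where U S V^T is the SVD of Y^T X.
    """
    U, _, Vt = np.linalg.svd(Y.T @ X)
    return U @ Vt

# Sanity check: if Y is X under a known rotation Q, procrustes recovers Q.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # random orthogonal matrix
Y = X @ Q.T
W = procrustes(X, Y)
print(np.allclose(W, Q))
```

In practice this refinement would be applied to the dictionary pairs extracted from the adversarially aligned space, replacing the adversarially learned mapping with the exact minimizer on those pairs.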
