On May 18th at 11:30am, Jasneet will give a practice talk for his Master's thesis defence on the topic “BILINGUAL LANGUAGE MODELS USING WORD EMBEDDINGS FOR MACHINE TRANSLATION”.
Abstract
Bilingual language models (Bi-LMs) are language models over pairs of words in the source and target languages of a machine translation task. When translating from the source to the target language, the decoder in a phrase-based machine translation system segments the source sentence into phrases and then translates each phrase into the target language. While decoding each phrase, the decoder lacks sufficient information about source words outside the phrase under consideration. Bi-LMs have been used to tackle this problem. Bi-LMs are estimated by first creating bi-token sequences using word alignments over a parallel corpus. We propose the use of bilingual word embeddings to deal with the large number of bi-token types in a bi-token language model. Our approach outperforms previous work by 1.4 BLEU points in our machine translation experiments.
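To make the idea of a bi-token sequence concrete, here is a minimal sketch in Python. It assumes one common construction in which each target word is paired with the source words aligned to it; the thesis may define bi-tokens differently. The function name `make_bitokens`, the `_` and `+` separators, the `NULL` placeholder, and the toy sentence pair are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def make_bitokens(source, target, alignment, null="NULL"):
    """Build one bi-token per target word from a word alignment.

    `alignment` is a list of (source_index, target_index) pairs.
    This pairing scheme is an assumption for illustration, not
    necessarily the one used in the thesis.
    """
    # Collect the source positions aligned to each target position.
    aligned = defaultdict(list)
    for src_idx, tgt_idx in alignment:
        aligned[tgt_idx].append(src_idx)

    bitokens = []
    for tgt_idx, tgt_word in enumerate(target):
        # Keep the aligned source words in source-sentence order.
        src_words = [source[i] for i in sorted(aligned[tgt_idx])]
        # Unaligned target words are paired with a placeholder.
        src_side = "+".join(src_words) if src_words else null
        bitokens.append(f"{tgt_word}_{src_side}")
    return bitokens


# Toy sentence pair with a one-to-one alignment (made up for this example).
source = ["das", "Haus", "ist", "klein"]
target = ["the", "house", "is", "small"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]
bitokens = make_bitokens(source, target, alignment)
print(bitokens)
```

Because every distinct combination of a target word and its aligned source words becomes its own type, the bi-token vocabulary grows much faster than either monolingual vocabulary, which is the sparsity problem the abstract refers to.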
M.Sc. Examining Committee: