18 May 2016

On May 19th at 2pm in TASC1 9204, Jasneet will defend his Master's thesis, “BILINGUAL LANGUAGE MODELS USING WORD EMBEDDINGS FOR MACHINE TRANSLATION”.

Abstract

Bilingual language models (Bi-LMs) are language models over pairs of words in the source and target languages of a machine translation task. When translating from the source to the target language, the decoder in a phrase-based machine translation system segments the source sentence into phrases and then translates each phrase into the target language. While decoding each phrase, the decoder does not have sufficient information about source words that lie outside the phrase under consideration. Bi-LMs have been used to tackle this problem. Bi-LMs are estimated by first creating bi-token sequences using word alignments over a parallel corpus. We propose the use of bilingual word embeddings to deal with the large number of bi-token types in a bi-token language model. Our approach outperforms previous work with an increase of 1.4 BLEU points in our machine translation experiments.
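
To make the notion of a bi-token sequence concrete, here is a minimal sketch of one common way to build it from a word-aligned sentence pair: each target word is paired with the source words aligned to it. This is only an illustration and not necessarily the construction used in the thesis. The function name `build_bitokens`, the `_` and `+` separators, the `NULL` placeholder for unaligned words, and the example sentence are all assumptions made for this sketch.

```python
def build_bitokens(source, target, alignment, null="NULL"):
    """Pair each target word with the source words aligned to it.

    source, target: lists of words.
    alignment: iterable of (source_index, target_index) pairs.
    All naming and formatting choices here are hypothetical.
    """
    # Collect, for every target position, the source positions aligned to it.
    aligned = {}
    for src_idx, tgt_idx in alignment:
        aligned.setdefault(tgt_idx, []).append(src_idx)

    bitokens = []
    for tgt_idx, tgt_word in enumerate(target):
        # Keep aligned source words in source-sentence order.
        src_words = [source[i] for i in sorted(aligned.get(tgt_idx, []))]
        # Unaligned target words get a placeholder source side.
        src_part = "+".join(src_words) if src_words else null
        bitokens.append(f"{tgt_word}_{src_part}")
    return bitokens


# Toy sentence pair with a one-to-one alignment (illustrative data only).
source = ["das", "haus", "ist", "klein"]
target = ["the", "house", "is", "small"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]
bitokens = build_bitokens(source, target, alignment)
```

A standard n-gram language model can then be trained over such sequences. Because every distinct combination of target word and aligned source words is a separate type, the vocabulary grows quickly, which is the sparsity problem the abstract refers to.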

M.Sc. Examining Committee:

  • Dr. Anoop Sarkar, Co-Senior Supervisor
  • Dr. Fred Popowich, Co-Senior Supervisor
  • Dr. Jiannan Wang, Examiner
  • Dr. Ryan Shea, Chair