On Tuesday, November 17th, at 3:30 p.m. in TASC1 9204 WEST, Te Bu will defend his M.Sc. thesis, titled “Joint Prediction of Word Alignment and Alignment Types for Statistical Machine Translation”.
Here is the abstract of his thesis:
Learning word alignments between parallel sentence pairs is an important task in Statistical Machine Translation. Existing models for word alignment have assumed that word alignment links are untyped. In this work, we propose new machine learning models that use linguistically informed link types to enrich word alignments. We use 11 different alignment link types based on annotated data released by the Linguistic Data Consortium. We first provide a solution to the sub-problem of alignment type prediction given an aligned word pair and then propose two different models to simultaneously predict word alignment and alignment types. Our experimental results show that we can recover alignment link types with an F-score of 81.5%. Our joint model improves the word alignment F-score by 0.9% over a baseline that does not use typed alignment links. We expect typed word alignments to benefit SMT and other NLP tasks that rely on word alignments.