On Tuesday, November 10th, Te Bu will give a practice talk on “Joint prediction of word alignment and alignment types for statistical machine translation”. The talk will be in TASC1 9408 at 3:30pm.
Learning word alignments between parallel sentence pairs is an important task in Statistical Machine Translation. Existing models for word alignment have assumed that word alignment links are untyped. In this work, we propose new machine learning models that use linguistically informed link types to enrich word alignments. We use 11 different alignment link types based on annotated data released by the Linguistics Data Consortium. We first provide a solution to the sub-problem of alignment type prediction given an aligned word pair and then propose two different models to simultaneously predict word alignment and alignment types. Our experimental results show that we can recover alignment link types with an F-score of 81.4%. Our joint model improves the word alignment F-score by 4.6% over a baseline that does not use typed alignment links. We expect typed word alignments to bene t SMT and other NLP tasks that rely on word alignments.