Baskaran will give a practice talk for his IJCNLP 2013 paper on Friday, October 4th. The talk will be held in ASB 9921 at 0930 hours. A short description follows:
Hierarchical phrase-based translation (Hiero; Chiang 2007) has gained wide acceptance within the MT community. It has been shown to be more effective than equivalent phrase-based models for language pairs involving long-distance reordering.
We propose a Bayesian model for extracting Hiero rules by reasoning over the derivations of phrase pairs. Our model, which employs scalable variational Bayesian inference, extracts a sparse Hiero grammar with better discriminative power. We evaluate our model on three different language pairs, demonstrating improvements in small-data settings and competitive performance on large-scale datasets.
We then take a step back to consider the Hiero training pipeline in its entirety, where the alignments and Hiero rules are learned in disconnected steps. We propose a novel unified-cascade framework for jointly learning the alignments and Hiero rules in distinct but iterative steps. Using two distinct models for the two components, we demonstrate the effectiveness of our framework on the translation task for two language pairs.