Publications

  • Siahbani, M., & Sarkar, A. (2014). Two Improvements to Left-to-Right Decoding for Hierarchical Phrase-based Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha, Qatar: Association for Computational Linguistics.
  • Siahbani, M., & Sarkar, A. (2014). Expressive Hierarchical Rule Extraction for Left-to-Right Translation. In Proceedings of the 11th Biennial Conference of the Association for Machine Translation in the Americas (AMTA-2014). Vancouver, Canada.
  • Sankaran, B., & Sarkar, A. (2014). Bayesian Iterative-cascade Framework for Hierarchical Phrase-based Translation. In Proceedings of the 11th Biennial Conference of the Association for Machine Translation in the Americas (AMTA-2014). Vancouver, Canada.
  • Dholakia, R., & Sarkar, A. (2014). Pivot-based Triangulation for Low-Resource Languages. In Proceedings of the 11th Biennial Conference of the Association for Machine Translation in the Americas (AMTA-2014). Vancouver, Canada.
  • Siahbani, M., Mehdizadeh Seraj, R., & Sarkar, A. (2014). Incremental Translation using a Hierarchical Phrase-based Translation System. In Proceedings of the 2014 IEEE Spoken Language Technology Workshop (SLT 2014). Nevada, USA.

News

Brat Rapid Annotation Tool 28 Oct 2014

In the lab meeting this week (29th of Oct), Jasneet will present Brat, a web-based tool for text annotation.
Brat is designed in particular for structured annotation, where the notes are not freeform text but have a fixed form that can be automatically processed and interpreted by a computer.
The meeting will be at TASC1 9408 from 1030 hours.
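
To make "fixed form" concrete, here is a minimal sketch of reading one structured annotation. It assumes brat's standoff format for text-bound annotations (a tab-separated ID, a type with character offsets, and the covered text). The names `TextBound` and `parse_text_bound`, and the sample lines, are hypothetical and for illustration only.

```python
from typing import List, NamedTuple, Tuple


class TextBound(NamedTuple):
    """One text-bound annotation (hypothetical container for this sketch)."""
    ann_id: str                   # e.g. "T1"
    label: str                    # annotation type, e.g. "Person"
    spans: List[Tuple[int, int]]  # (start, end) character offsets
    text: str                     # the annotated text itself


def parse_text_bound(line: str) -> TextBound:
    # Assumed layout: ID <TAB> TYPE START END[;START END...] <TAB> TEXT
    ann_id, type_and_offsets, text = line.rstrip("\n").split("\t", 2)
    label, offsets = type_and_offsets.split(" ", 1)
    spans = []
    # Discontinuous annotations separate their fragments with ";"
    for fragment in offsets.split(";"):
        start, end = fragment.split()
        spans.append((int(start), int(end)))
    return TextBound(ann_id, label, spans, text)


# Made-up sample lines, not taken from any real annotation file
ann = parse_text_bound("T1\tPerson 0 7\tJasneet")
split_ann = parse_text_bound("T2\tLocation 0 5;16 23\tNorth America")
print(ann.label, ann.spans, ann.text)
```

Because every line follows the same layout, a few string splits are enough to recover the type, offsets, and text, which is what makes such annotations easy to process automatically.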

Expressive Hierarchical Rule Extraction for Left-to-Right Translation 14 Oct 2014

In the lab meeting this week, 15th of Oct, Maryam will give a talk about Expressive Hierarchical Rule Extraction for Left-to-Right Translation. The meeting will be at TASC1 9408 from 1030 hours. The abstract of the paper follows:

Left-to-right (LR) decoding (Watanabe et al., 2006) is a promising decoding algorithm for hierarchical phrase-based translation (Hiero) that visits input spans in arbitrary order producing the output translation in left to right order. This leads to far fewer language model calls. But the constrained SCFG grammar used in LR-Hiero (GNF) with at most two non-terminals is unable to account for some complex phrasal reordering. Allowing more non-terminals in the rules results in a more expressive grammar. LR-decoding can be used to decode with SCFGs with more than two non-terminals, but the CKY decoders used for Hiero systems cannot deal with such expressive grammars due to a blowup in computational complexity. In this paper we present a dynamic programming algorithm for GNF rule extraction which efficiently extracts sentence level SCFG rule sets with an arbitrary number of non-terminals. We analyze the performance of the obtained grammar for statistical machine translation on three language pairs.

Incremental Translation using Hierarchical Phrase-based MT 07 Oct 2014

In the lab meeting this week, 8th of Oct, Maryam and Ramtin will give a talk about their recent paper. The meeting will be at TASC1 9408 from 1030 hours.

Abstract: Hierarchical phrase-based machine translation (Hiero) is a prominent approach for Statistical Machine Translation usually comparable to or better than conventional phrase-based systems. But Hiero typically uses the CKY decoding algorithm which requires the entire input sentence before decoding begins, as it produces the translation in a bottom-up fashion. Left-to-right (LR) decoding is a promising decoding algorithm for Hiero that produces the output translation in left to right order. In this paper we focus on simultaneous translation using the Hiero translation framework. In simultaneous translation, translations are generated incrementally as source language speech input is processed. We propose a novel approach for incremental translation by integrating segmentation and decoding in LR-Hiero. We compare two incremental decoding algorithms for LR-Hiero and present translation quality scores (BLEU) and the latency of generating translations for both decoders on audio lectures from the TED collection.

Evaluating Visual Text Analytics: Characterizing the Target Problem Space 29 Sep 2014

In the lab meeting this week, 1st of Oct, Milan will give an overview of the evaluation of visual text analytics. The talk will focus on characterizing the target problem space which forms the framework for the evaluation process. The meeting will be at TASC1 9408 from 1030 hours.

Paper/Idea Discussions for Upcoming Conferences 22 Sep 2014

In the lab meeting this week, 24th of Sep, we will discuss lab members' paper ideas, plans, and work in progress for the upcoming TACL or *ACL conference deadlines. The meeting will be at TASC1 9408 from 1030 hours.