News

Jetic will give a talk about NMT @ ACL 2019
18 Nov 2019

In our lab meeting tomorrow, Jetic will give a talk on recent NMT papers, more specifically those tackling inconsistencies, model compression, and training efficiency. Here is the title and abstract:

Beyond Chasing After BLEU Scores I

Abstract: Neural machine translation models, while offering great performance on paper, suffer severely in production, where inconsistencies in translation, huge model sizes, and expensive training and inference operations hinder reliability and cost-efficiency. In this presentation (1/2), we look at recent NMT papers published in the proceedings of ACL 2019 that may help with these specific issues.

Tuesday, November 19th, 10:30 a.m. TASC1 9408.   Title: Beyond Chasing After BLEU Scores I

Pooya will give a talk about NMT
11 Nov 2019

In our lab meeting tomorrow, Pooya will give a talk about NMT. Here is the title and abstract:

Interpretable neural models in NLP

Abstract: Neural models have become the state-of-the-art in many machine learning applications. However, they usually do not explain their predictions, which is a barrier to understanding their behavior and trusting them, especially in critical settings. In this presentation we investigate several NLP papers which have tried to propose interpretable neural models as a remedy.

Tuesday, November 12th, 10:30 a.m. TASC1 9408.   Title: Interpretable neural models in NLP

Wasifa will give a practice thesis talk
24 Oct 2019

In our lab meeting next Tuesday, Wasifa will give a practice thesis talk. Here is the title and abstract:

Employing Neural Hierarchical Model with Pointer Generator Networks for Abstractive Text Summarization

Abstract: As the growth of online data continues, automatic summarization is integral in generating a condensed version of a text while preserving the meaning of the original input. Although most of the earlier works on automatic summarization use extractive approaches to identify the most important information of a document, recent research focuses on the more challenging task of making the summaries abstractive. Sequence-to-sequence models with attention have been quantitatively shown to be effective for abstractive summarization, but the quality of the generated summaries is often poor, with incorrect and redundant information. In this thesis, we present an end-to-end neural network framework which combines a hierarchical content selector and a pointer-generator network abstractor through a multi-level attention mechanism that uses the sentence importance scores from the former model to help the word-level attention of the latter model make better decisions when generating the output words. Hence, words from key sentences are attended to more than words in less salient sentences of the input. Our approach is motivated by human writers, who tend to focus only on the relevant portions of an article when summarizing while ignoring anything irrelevant that might degrade the output quality. We conduct experiments on the challenging CNN/Daily Mail dataset, which consists of long newswire articles paired with multi-sentence summaries. Experimental results show that our end-to-end architecture outperforms the extractive systems and a strong lead-3 baseline, and achieves competitive ROUGE and METEOR scores with previous abstractive systems on the same dataset. Qualitative analysis on test data shows that the generated summaries are fluent as well as informative.
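Below is a minimal sketch of the multi-level attention idea described in the abstract: sentence importance scores from the content selector rescale the word-level attention of the abstractor before renormalization. The function names, shapes, and values are illustrative assumptions, not the thesis code.

```python
# Sketch only: sentence-level salience modulating word-level attention.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multilevel_attention(word_scores, sent_importance, word_to_sent):
    """Rescale word-level attention by the importance of the sentence
    each word belongs to, then renormalize to a distribution.

    word_scores     : (num_words,) raw attention logits from the abstractor
    sent_importance : (num_sents,) scores from the hierarchical content selector
    word_to_sent    : (num_words,) index of the sentence containing each word
    """
    word_attn = softmax(word_scores)                     # standard word attention
    scaled = word_attn * sent_importance[word_to_sent]   # weight by sentence salience
    return scaled / scaled.sum()                         # renormalize

# Toy example: 5 source words spread over 2 sentences.
word_scores = np.array([0.2, 1.0, 0.5, 0.3, 0.8])
sent_importance = np.array([0.9, 0.1])      # sentence 0 is far more salient
word_to_sent = np.array([0, 0, 0, 1, 1])
print(multilevel_attention(word_scores, sent_importance, word_to_sent))
```

With this weighting, words in the salient first sentence dominate the final attention distribution, which is the behavior the abstract attributes to the combined model.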

Tuesday, October 29th, 10:30 a.m. TASC1 9408.  

Pooya and Zhenqi will present their recent papers due to appear in EMNLP 2019.
21 Oct 2019

In our lab meeting tomorrow, Zhenqi and Pooya will present their EMNLP 2019 papers. Here are the titles and abstracts of their talks:

Interrogating the Explanatory Power of Attention in Neural Machine Translation

Abstract: Attention models have become a crucial component in neural machine translation (NMT). They are often implicitly or explicitly used to justify the model’s decision in generating a specific token, but it has not yet been rigorously established to what extent attention is a reliable source of information in NMT. To evaluate the explanatory power of attention for NMT, we examine the possibility of yielding the same prediction but with counterfactual attention models that modify crucial aspects of the trained attention model. Using these counterfactual attention mechanisms we assess the extent to which they still preserve the generation of function and content words in the translation process. Compared to a state-of-the-art attention model, our counterfactual attention models produce 68% of function words and 21% of content words in our German-English dataset. Our experiments demonstrate that attention models by themselves cannot reliably explain the decisions made by an NMT model.
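A minimal sketch of the counterfactual-attention test described above: the trained attention distribution is swapped for a modified one (here uniform or permuted) and we check whether the decoder would still predict the same token. Everything below uses toy random parameters and assumed names; it is not the paper's implementation.

```python
# Sketch only: does a counterfactual attention distribution change the prediction?
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predict(attn, enc_states, W_out):
    """Next-token prediction from an attention-weighted context vector."""
    context = attn @ enc_states             # (hidden,)
    return int(np.argmax(W_out @ context))  # index of the predicted token

src_len, hidden, vocab = 6, 8, 20
enc_states = rng.normal(size=(src_len, hidden))
W_out = rng.normal(size=(vocab, hidden))
trained_attn = softmax(rng.normal(size=src_len))   # stand-in for a trained model

# Counterfactual variants: uniform weights and a random permutation.
uniform_attn = np.full(src_len, 1.0 / src_len)
permuted_attn = rng.permutation(trained_attn)

base = predict(trained_attn, enc_states, W_out)
for name, attn in [("uniform", uniform_attn), ("permuted", permuted_attn)]:
    same = predict(attn, enc_states, W_out) == base
    print(f"{name}: same prediction as trained attention -> {same}")
```

If the same token is predicted under heavily modified attention, the attention weights alone are a weak explanation for that decision, which is the kind of evidence the paper aggregates over function and content words.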

Deconstructing Supertagging into Multi-task Sequence Prediction

Abstract: In this paper, we present a Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks. MT-DNN not only leverages large amounts of cross-task data, but also benefits from a regularisation effect that leads to more general representations to help adapt to new tasks and domains. MT-DNN extends the model proposed in Liu et al. (2015) by incorporating a pre-trained bidirectional transformer language model, known as BERT (Devlin et al., 2018). MT-DNN obtains new state-of-the-art results on ten NLU tasks, including SNLI, SciTail, and eight out of nine GLUE tasks, pushing the GLUE benchmark to 82.7% (2.2% absolute improvement) as of February 25, 2019 on the latest GLUE test set. We also demonstrate using the SNLI and SciTail datasets that the representations learned by MT-DNN allow domain adaptation with substantially fewer in-domain labels than the pre-trained BERT representations. Our code and pre-trained models will be made publicly available.
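As a rough illustration of the multi-task setup this abstract describes, the sketch below wires one shared encoder to several task-specific heads; in MT-DNN the shared encoder would be a pre-trained BERT model and the heads would cover tasks such as the GLUE, SNLI, and SciTail benchmarks. The classes, dimensions, and task names here are assumptions for demonstration only.

```python
# Sketch only: one shared encoder, one lightweight head per task.
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 16

class SharedEncoder:
    """Stand-in for the shared (e.g. BERT-style) encoder."""
    def __init__(self, vocab=100):
        self.emb = rng.normal(size=(vocab, HIDDEN))

    def __call__(self, token_ids):
        # Mean-pooled embeddings as a toy sentence representation.
        return self.emb[token_ids].mean(axis=0)

class TaskHead:
    """One linear classification head per NLU task."""
    def __init__(self, num_labels):
        self.W = rng.normal(size=(num_labels, HIDDEN))

    def __call__(self, rep):
        return int(np.argmax(self.W @ rep))

encoder = SharedEncoder()
heads = {"snli": TaskHead(3), "scitail": TaskHead(2)}   # illustrative tasks

# In training, mini-batches from the different tasks would be interleaved so
# that all of them update the shared encoder; here we only show the routing.
tokens = np.array([4, 17, 42, 8])
rep = encoder(tokens)
for task, head in heads.items():
    print(task, "->", head(rep))
```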

Tuesday, October 22nd, 10:30 a.m. TASC1 9408.

Pooya will present his WNGT 2019 paper.
15 Oct 2019

In our lab meeting tomorrow, Pooya will present on the interpretability of the attention mechanism in NMT. Here is the title and abstract of his talk:

Interrogating the Explanatory Power of Attention in Neural Machine Translation

Abstract: Attention models have become a crucial component in neural machine translation (NMT). They are often implicitly or explicitly used to justify the model’s decision in generating a specific token, but it has not yet been rigorously established to what extent attention is a reliable source of information in NMT. To evaluate the explanatory power of attention for NMT, we examine the possibility of yielding the same prediction but with counterfactual attention models that modify crucial aspects of the trained attention model. Using these counterfactual attention mechanisms we assess the extent to which they still preserve the generation of function and content words in the translation process. Compared to a state-of-the-art attention model, our counterfactual attention models produce 68% of function words and 21% of content words in our German-English dataset. Our experiments demonstrate that attention models by themselves cannot reliably explain the decisions made by an NMT model.

Tuesday, October 15th, 10:30 a.m. TASC1 9408.
