News

Nishant will talk about using discriminators for sequence generation.
16 Jan 2019

In our lab meeting this week, Nishant will talk about some clever adversarial training tricks for sequence generation tasks. The title and abstract of his talk:

Simple yet effective usage of discriminators in conditional sequence generation

Abstract: Adversarial methods of training typically employ a “discriminator” that criticizes some aspect of the predicted output of a generative model. Although discriminators are a key component in seemingly complex frameworks such as generative adversarial networks (GANs) and adversarial feature learning, they can be used in quite simple but clever ways to control sequence generation in particular. In this discussion, we will see two such applications, to language modelling and machine translation.
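
To make the idea concrete before the talk, here is a minimal, hypothetical sketch of the simplest such usage (not necessarily the approach Nishant will present): a discriminator scores candidate output sequences, and its score is combined with the generator's log-probability to rerank hypotheses. All names here are made up for illustration.

```python
import torch
import torch.nn as nn

class SeqDiscriminator(nn.Module):
    """Scores a token sequence; higher means 'more like real text'."""
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 1)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        _, (h, _) = self.rnn(self.emb(token_ids))
        return torch.sigmoid(self.out(h[-1])).squeeze(-1)   # (batch,)

def rerank(candidates, gen_logprobs, disc, weight=1.0):
    """Pick the candidate maximizing generator log-prob plus a weighted
    discriminator score: the simplest way a discriminator can steer
    sequence generation without full GAN training."""
    with torch.no_grad():
        d = disc(candidates)                       # (batch,)
    scores = gen_logprobs + weight * torch.log(d + 1e-9)
    return candidates[scores.argmax()]
```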

Relevant Papers:

Wednesday, January 16th, 11:00 a.m. TASC1 9408.

Introducing SFUTranslate Toolkit
09 Jan 2019

In our lab meeting this week, Hassan will introduce the new Neural Machine Translation toolkit recently developed in the lab. The toolkit is designed so that the different parts of the data preparation, model creation, and training pipeline are easily accessible and alterable modules with very little dependency on each other. The main objective of the toolkit is to make implementing new models and ideas as easy as possible.
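
The toolkit's own API is not shown in this post, so the sketch below is only a hypothetical illustration of that design principle, with made-up interface names: a trainer that depends on small interchangeable interfaces rather than on concrete data-preparation or model classes.

```python
from typing import Dict, Iterable, Protocol  # Protocol: Python 3.8+

class DataPreparer(Protocol):
    """Anything that can yield training batches."""
    def batches(self) -> Iterable[Dict]: ...

class TranslationModel(Protocol):
    """Anything that can compute a training loss for a batch."""
    def loss(self, batch: Dict) -> float: ...

def train(data: DataPreparer, model: TranslationModel, epochs: int = 1) -> None:
    # The trainer depends only on the two tiny interfaces above, so the
    # data-preparation and model modules can be swapped independently.
    for _ in range(epochs):
        for batch in data.batches():
            model.loss(batch)  # a real trainer would also backprop and update
```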

Wednesday, January 9th, 11:00 a.m. TASC1 9408.

Natlang at EMNLP 2018
23 Oct 2018

In our lab meeting this week, Ashkan, Jetic, and Nishant will present their posters, and Golnar will have a talk covering the material she prepared for EMNLP 2018. Here are the titles and abstracts of their presentations:

Golnar: In-domain Context-aware Token Embeddings Improve Biomedical Named Entity Recognition

Abstract: The rapidly expanding volume of publications in the biomedical domain makes a timely evaluation of the latest literature increasingly difficult. That, along with a push for automated evaluation of clinical reports, presents opportunities for effective natural language processing methods. In this study we target the problem of named entity recognition, where texts are processed to annotate terms that are relevant for biomedical studies. Terms of interest in the domain include gene and protein names, and cell lines and types. Here we report on a pipeline built on Embeddings from Language Models (ELMo) and a deep learning package for natural language processing (AllenNLP). We trained context-aware token embeddings on a dataset of biomedical papers using ELMo, and incorporated these embeddings in the LSTM-CRF model used by AllenNLP for named entity recognition. We show these representations improve named entity recognition for different types of biomedical named entities. We also achieve a new state of the art in gene mention detection on the BioCreative II gene mention shared task.
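
As a rough illustration of the first stage of this pipeline, here is a minimal sketch using the public allennlp.modules.elmo API; the weight-file paths are placeholders for in-domain biomedical ELMo weights, not real files.

```python
from allennlp.modules.elmo import Elmo, batch_to_ids

# Placeholder paths: in the paper these would be ELMo weights trained on
# biomedical text, not files that ship with AllenNLP.
OPTIONS_FILE = "bio_elmo_options.json"
WEIGHTS_FILE = "bio_elmo_weights.hdf5"

elmo = Elmo(OPTIONS_FILE, WEIGHTS_FILE, num_output_representations=1, dropout=0.0)

sentences = [["BRCA1", "regulates", "DNA", "repair", "."]]
character_ids = batch_to_ids(sentences)                # (batch, tokens, 50)
embeddings = elmo(character_ids)["elmo_representations"][0]

# Each token now has a context-aware vector; the full system feeds these
# into AllenNLP's LSTM-CRF tagger for entity prediction.
print(embeddings.shape)                                # (1, 5, embedding_dim)
```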

Ashkan: Prediction Improves Simultaneous Neural Machine Translation

Abstract: Simultaneous speech translation aims to maintain translation quality while minimizing the delay between reading input and incrementally producing the output. We propose a new general-purpose prediction action which predicts future words in the input to improve quality and minimize delay in simultaneous translation. We train this agent using reinforcement learning with a novel reward function. Our agent with prediction has better translation quality and less delay compared to an agent-based simultaneous translation system without prediction.
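
For intuition, here is a toy sketch of a read/write agent loop extended with a prediction action. It is not the paper's agent: the policy, translator, and predictor are arbitrary callables, the loop never revises its guesses (a real system would), and in the paper the policy is trained with reinforcement learning.

```python
READ, WRITE, PREDICT = range(3)

def run_agent(source, policy, translate_prefix, predict_next, max_steps=1000):
    """Toy simultaneous-translation loop. PREDICT appends a guessed future
    source word so that WRITE can commit target tokens before that word is
    actually read, trading a risk to quality for lower delay."""
    seen, output, i = [], [], 0
    for _ in range(max_steps):                  # hard cap keeps the toy loop finite
        hyp = translate_prefix(seen)            # best translation of what we've seen
        if i >= len(source) and len(output) >= len(hyp):
            break                               # source exhausted, output flushed
        action = policy(seen, output)
        if action == READ and i < len(source):
            seen.append(source[i]); i += 1      # consume one real source token
        elif action == PREDICT and i < len(source):
            seen.append(predict_next(seen))     # append a *guessed* source token
        elif len(hyp) > len(output):            # WRITE: commit one target token
            output.append(hyp[len(output)])
        elif i < len(source):                   # nothing new to write: read instead
            seen.append(source[i]); i += 1
    return output
```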

Jetic: Top-down Tree Structured Decoding with Syntactic Connections for Neural Machine Translation and Parsing

Abstract: The addition of syntax-aware decoding in Neural Machine Translation (NMT) systems requires an effective tree-structured neural network, a syntax-aware attention model, and a language generation model that is sensitive to sentence structure. We exploit a top-down tree-structured model called DRNN (Doubly-Recurrent Neural Networks), first proposed by Alvarez-Melis and Jaakkola (2017), to create an NMT model called Seq2DRNN that combines a sequential encoder with tree-structured decoding augmented with a syntax-aware attention model. Unlike previous approaches to syntax-based NMT, which use dependency parsing models, our method uses constituency parsing, which we argue provides useful information for translation. In addition, we use the syntactic structure of the sentence to add new connections to the tree-structured decoder neural network (Seq2DRNN+SynC). We compare our NMT model with sequential and state-of-the-art syntax-based NMT models and show that our model produces more fluent translations with better reordering. Since our model is capable of doing translation and constituency parsing at the same time, we also compare our parsing accuracy against other neural parsing models.
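
For readers unfamiliar with DRNNs, the sketch below paraphrases the core cell of Alvarez-Melis and Jaakkola (2017) as a simplified illustration, not the Seq2DRNN implementation:

```python
import torch
import torch.nn as nn

class DRNNCell(nn.Module):
    """Simplified doubly-recurrent cell: each tree node's state combines an
    ancestral recurrence (from its parent) and a fraternal recurrence
    (from its previous sibling)."""
    def __init__(self, dim):
        super().__init__()
        self.ancestral = nn.GRUCell(dim, dim)   # parent -> child stream
        self.fraternal = nn.GRUCell(dim, dim)   # sibling -> sibling stream
        self.merge = nn.Linear(2 * dim, dim)

    def forward(self, x, h_parent, h_sibling):  # all inputs: (batch, dim)
        h_a = self.ancestral(x, h_parent)
        h_f = self.fraternal(x, h_sibling)
        return torch.tanh(self.merge(torch.cat([h_a, h_f], dim=-1)))
```

In the full model, each node additionally predicts whether it has children and whether more siblings follow, which is how the decoder decides when to stop growing the tree.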

Nishant: Decipherment of Substitution Ciphers Using Neural Language Models

Abstract: The decipherment of homophonic substitution ciphers using language models (LMs) is a well-studied task in Natural Language Processing (NLP). Previous work on this topic scores short local spans of possible plaintext decipherments using n-gram LMs. The most widely used technique is the beam search with n-gram LMs proposed by Nuhn et al. (2013). We propose a new approach to decipherment using a beam search algorithm that scores the entire candidate plaintext at each step with a neural LM. We augment beam search with a novel rest cost estimation that exploits the predictive power of a neural LM. This work, to our knowledge, is the first to use a large pretrained neural language model for decipherment. Our neural decipherment approach outperforms the state-of-the-art n-gram based methods on many different ciphers. On challenging ciphers such as the Beale cipher, our system reports significantly lower error rates with much smaller beam sizes.
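
To see how this differs from rescoring short spans with n-gram LMs, here is a toy beam search over substitution keys that scores the entire candidate plaintext at every step; lm_score stands in for the neural LM, the cipher is assumed 1:1 for brevity, and the paper's rest-cost estimation is omitted.

```python
import heapq

def beam_decipher(ciphertext, alphabet, lm_score, beam_size=8):
    """Toy beam search over substitution keys (1:1 cipher for brevity; the
    paper handles homophonic ciphers). The key point: every hypothesis is
    scored on the ENTIRE candidate plaintext via lm_score(text) -> log-prob,
    where a neural LM would sit, rather than on short local spans."""
    # Fix the most frequent cipher symbols first.
    symbols = sorted(set(ciphertext), key=ciphertext.count, reverse=True)
    beam = [(0.0, {})]                                  # (score, partial key)
    for sym in symbols:
        expanded = []
        for _, key in beam:
            for letter in alphabet:
                if letter in key.values():              # keep the key 1:1
                    continue
                new_key = dict(key)
                new_key[sym] = letter
                # Undeciphered symbols become "?"; lm_score must tolerate them.
                text = "".join(new_key.get(c, "?") for c in ciphertext)
                expanded.append((lm_score(text), new_key))
        beam = heapq.nlargest(beam_size, expanded, key=lambda t: t[0])
    return max(beam, key=lambda t: t[0])[1]             # best full key
```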

Tuesday, October 23rd, 11:00 a.m. TASC1 9408.

Nishant and Hassan will present their work this week
02 Oct 2018

In our lab meeting this week, Nishant and Hassan will present their work for 30 minutes each. Nishant will go over his accepted EMNLP paper, and Hassan will summarize his internship work at RIKEN. Here are the titles and abstracts of their presentations:

Nishant: Decipherment of Substitution Ciphers Using Neural Language Models

Abstract: The decipherment of homophonic substitution ciphers using language models (LMs) is a well-studied task in Natural Language Processing (NLP). Previous work on this topic scores short local spans of possible plaintext decipherments using n-gram LMs. The most widely used technique is the beam search with n-gram LMs proposed by Nuhn et al. (2013). We propose a new approach to decipherment using a beam search algorithm that scores the entire candidate plaintext at each step with a neural LM. We augment beam search with a novel rest cost estimation that exploits the predictive power of a neural LM. This work, to our knowledge, is the first to use a large pretrained neural language model for decipherment. Our neural decipherment approach outperforms the state-of-the-art n-gram based methods on many different ciphers. On challenging ciphers such as the Beale cipher, our system reports significantly lower error rates with much smaller beam sizes.

Hassan: SHINRA: a Dataset for Multi-Labeled Multilingual Classification of Wikipedia Articles

Abstract: In order to construct a language understanding system, such as question answering, that can also explain its decisions in language, we need world knowledge that machines can process. Wikipedia is a great resource for such knowledge, but in its current form it is hard for machines to process. In order to create a machine-processable knowledge base, we are trying to structure Wikipedia. Our first step towards this goal is to classify Wikipedia entities into predefined categories. This project summarizes our attempt to prepare the data for the Wikipedia entity classification task.
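
As a small, hypothetical illustration of that data-preparation step (the entities and categories below are made up, not the actual SHINRA inventory), the task is multi-label because one entity can belong to several categories at once:

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Made-up entities and categories for illustration only.
entities = [
    ("Vancouver", ["City", "Location"]),
    ("BRCA1",     ["Gene"]),
    ("RIKEN",     ["Organization", "Research_Institute"]),
]

mlb = MultiLabelBinarizer()
label_matrix = mlb.fit_transform([cats for _, cats in entities])

print(mlb.classes_)    # discovered label inventory
print(label_matrix)    # one binary indicator vector per entity
```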

Tuesday, October 2nd, 11:30 a.m. TASC1 9408.

Nishant will defend his Master's thesis
25 Jul 2018

On July 26th at 2:30 p.m. in TASC1 9204 W, Nishant will defend his Master's thesis, “Decipherment of Substitution Ciphers Using Neural Language Models”.

Abstract:

The decipherment of homophonic substitution ciphers using language models (LMs) is a well-studied task in Natural Language Processing (NLP). Previous work on this topic scores short local spans of possible plaintext decipherments using n-gram LMs. The most widely used technique is the beam search with n-gram LMs proposed by Nuhn et al. (2013). We propose a new approach to decipherment using a beam search algorithm that scores the entire candidate plaintext at each step with a neural LM. We augment beam search with a novel rest cost estimation that exploits the predictive power of a neural LM. This work, to our knowledge, is the first to use a large pretrained neural language model for decipherment. Our neural decipherment approach outperforms the state-of-the-art n-gram based methods on many different ciphers. On challenging ciphers such as the Beale cipher, our system reports significantly lower error rates with much smaller beam sizes.

M.Sc. Examining Committee:

Dr. Anoop Sarkar, Senior Supervisor
Dr. Fred Popowich, Supervisor
Dr. David Campbell, Examiner
Dr. Keval Vora, Chair
