News

A Review of Nested Named Entity Recognition
19 May 2020

In our lab meeting tomorrow, Vincent will present a review of Nested Named Entity Recognition. A Zoom link will be posted to Twist on the morning of the meeting.

A Review of Nested Named Entity Recognition

Abstract: Named entity recognition (NER) is the task of extracting certain semantic entities, such as persons and organizations, from a sentence or paragraph. In other words, it detects the span of each entity and the semantic category it belongs to. NER plays an important role in many downstream tasks such as relation extraction, co-reference resolution, and entity linking. Nested NER refers to the situation where some entities may contain others. Because of technical difficulties rather than semantic ones, nested NER was ignored for a long time. However, nested entities are very common, especially in the biomedical domain, and such fine-grained entities provide necessary, detailed information for downstream tasks. This review focuses on three parts: i. sequence labeling models with multi-label classification; ii. sequence labeling models with a modified decoder; iii. other models apart from sequence labeling.
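As a minimal illustration of the first family of models mentioned above (sequence labeling with multi-label classification), the short sketch below shows how nested entities can be encoded as multiple BIO labels per token. The example sentence and entity spans are invented for illustration and are not taken from the review.

# Toy sketch: nested entities as per-token BIO label sets.
tokens = ["The", "University", "of", "British", "Columbia", "opened"]

# Two gold entities, one nested inside the other; (start, end) are token
# indices with end exclusive.
entities = [
    (1, 5, "ORG"),  # "University of British Columbia"
    (3, 5, "GPE"),  # "British Columbia"
]

# Each token receives a *set* of BIO labels, one per entity layer it is in.
labels = [set() for _ in tokens]
for start, end, etype in entities:
    labels[start].add(f"B-{etype}")
    for i in range(start + 1, end):
        labels[i].add(f"I-{etype}")
for lab in labels:
    if not lab:
        lab.add("O")

for tok, lab in zip(tokens, labels):
    print(f"{tok:12s} {sorted(lab)}")

# A multi-label classifier then predicts, for every token, which of the
# possible BIO tags apply, instead of a single mutually exclusive tag.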

Tuesday, May 19th, 09:30 a.m.

State-of-the-Art Speech Recognition with Sequence-to-Sequence Models
12 May 2020

In our lab meeting tomorrow, Jetic will introduce a Sequence-to-Sequence model for Speech Recognition. A Zoom link will be posted to Twist on the morning of the meeting.

State-of-the-Art Speech Recognition with Sequence-to-Sequence Models

Abstract: Attention-based encoder-decoder architectures such as Listen, Attend, and Spell (LAS) subsume the acoustic, pronunciation and language model components of a traditional automatic speech recognition (ASR) system into a single neural network. In previous work, we have shown that such architectures are comparable to state-of-the-art ASR systems on dictation tasks, but it was not clear if such architectures would be practical for more challenging tasks such as voice search. In this work, we explore a variety of structural and optimization improvements to our LAS model which significantly improve performance. On the structural side, we show that word piece models can be used instead of graphemes. We also introduce a multi-head attention architecture, which offers improvements over the commonly-used single-head attention. On the optimization side, we explore synchronous training, scheduled sampling, label smoothing, and minimum word error rate optimization, which are all shown to improve accuracy. We present results with a unidirectional LSTM encoder for streaming recognition. On a 12,500 hour voice search task, we find that the proposed changes improve the WER from 9.2% to 5.6%, while the best conventional system achieves 6.7%; on a dictation task our model achieves a WER of 4.1% compared to 5% for the conventional system.
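To make the multi-head attention idea mentioned in the abstract concrete, here is a small NumPy sketch: each head attends over the encoder states with its own projections, and the heads' outputs are concatenated into one context vector. The dimensions, random weights, and function names are illustrative assumptions, not the paper's configuration.

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(query, memory, num_heads, rng):
    # query: (T_dec, d_model) decoder states; memory: (T_enc, d_model) encoder states.
    d_model = query.shape[-1]
    d_head = d_model // num_heads
    outputs = []
    for _ in range(num_heads):
        # Per-head projection matrices (learned in a real model, random here).
        w_q, w_k, w_v = (rng.standard_normal((d_model, d_head)) * 0.1 for _ in range(3))
        q, k, v = query @ w_q, memory @ w_k, memory @ w_v
        scores = softmax(q @ k.T / np.sqrt(d_head))   # (T_dec, T_enc)
        outputs.append(scores @ v)                    # (T_dec, d_head)
    return np.concatenate(outputs, axis=-1)           # (T_dec, d_model)

rng = np.random.default_rng(0)
decoder_state = rng.standard_normal((1, 64))    # one decoder step
encoder_states = rng.standard_normal((50, 64))  # 50 acoustic frames
context = multi_head_attention(decoder_state, encoder_states, num_heads=4, rng=rng)
print(context.shape)  # (1, 64)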

Tuesday, May 12th, 09:30 a.m.

Linguistic Feature Extraction from BERT for Neural Machine Translation
05 May 2020

In our lab meeting tomorrow, Hassan will introduce a Neural Machine Translation method based on linguistic feature extraction. A Zoom link will be posted to Twist on the morning of the meeting.

Linguistic Feature Extraction from BERT for Neural Machine Translation

Abstract: The emergence of massively pre-trained language models (e.g. BERT) naturally leads to studying the usefulness of such models for improving the accuracy of translation models. In this presentation, we focus on feature extraction from BERT and discuss the improvements that the extracted feature vectors bring to a Neural Machine Translation framework. We will also cover some background from other papers suggesting closely related ideas.
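As a minimal sketch of the basic step the talk builds on, the snippet below runs a frozen BERT over a source sentence and extracts its contextual vectors as features that an NMT encoder could consume. It uses the Hugging Face transformers library rather than the paper's own code, and the layer-combination shown at the end is a common choice assumed here for illustration.

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertModel.from_pretrained("bert-base-cased")
model.eval()

sentence = "The cat sat on the mat."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Final-layer contextual vectors, one per subword token: (1, seq_len, 768).
features = outputs.last_hidden_state
print(features.shape)

# Some approaches instead combine several layers, e.g. an average of the
# last four hidden layers, before feeding the vectors to the NMT model.
last_four = torch.stack(outputs.hidden_states[-4:]).mean(dim=0)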

Tuesday, May 5th, 09:30 a.m.

Neural Hidden Markov Model for Word Alignment
28 Apr 2020

In our lab meeting tomorrow, Anahita will introduce a Neural method for Word Alignment. A Zoom link will be posted to Twist on the morning of the meeting.

Neural Hidden Markov Model for Word Alignment

Abstract: We present our results for neuralizing an unsupervised Hidden Markov Model (HMM) for word alignment. This work proposes a hidden Markov model with neural network-based lexicon and alignment models, which are trained jointly using the Baum-Welch algorithm. Our experimental results show that the neural HMM generally outperforms its GIZA++ IBM Model 4 baseline.
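The toy NumPy sketch below shows the HMM alignment structure the talk discusses: alignment (transition) probabilities depend on the jump width between source positions, and lexicon (emission) probabilities score each target word given a source word. In the neural HMM both distributions come from neural networks; here they are placeholder random tables, so this only illustrates the forward computation that Baum-Welch training builds on.

import numpy as np

rng = np.random.default_rng(0)
I, J = 4, 5            # source length, target length

def normalize(x, axis=-1):
    return x / x.sum(axis=axis, keepdims=True)

# p(a_j = i | a_{j-1} = i') parameterized by the jump width i - i'.
jump = normalize(rng.random(2 * I - 1))
trans = np.array([[jump[i - ip + I - 1] for i in range(I)] for ip in range(I)])
trans = normalize(trans, axis=1)

# p(f_j | e_i): in the neural model this is a softmax over the target
# vocabulary computed from source-word representations.
emit = normalize(rng.random((I, J)), axis=1)   # emit[i, j] = p(f_j | e_i)

# Forward algorithm: probability of the target sentence, summing over all
# alignments; Baum-Welch reuses these quantities during training.
alpha = np.full(I, 1.0 / I) * emit[:, 0]
for j in range(1, J):
    alpha = (alpha @ trans) * emit[:, j]
print("p(f | e) =", alpha.sum())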

Tuesday, Apr 28th, 09:30 a.m.

A Review of Representational Constraints to Improve Zero-Shot NMT
21 Apr 2020

In our lab meeting tomorrow, Nishant will review techniques for improving zero-shot NMT. A Zoom link will be posted to Twist on the morning of the meeting.

A Review of Representational Constraints to Improve Zero-Shot NMT

Abstract: Multilingual Neural Machine Translation (NMT) models are capable of translating between multiple source and target languages. Despite various approaches to train such models, they have difficulty with zero-shot translation: translating between language pairs that were not seen together during training. We first diagnose why state-of-the-art multilingual NMT models that rely purely on parameter sharing (language-specific methods) fail to generalize to unseen language pairs. We then review auxiliary losses (language-independent constraints) on the NMT encoder and decoder that impose representational invariance across languages.
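As a hedged sketch of one language-independent constraint of the kind the review covers, the snippet below defines an auxiliary loss that pulls the encoder's pooled representations of a sentence and its translation toward each other, added alongside the usual translation cross-entropy. The mean pooling and cosine loss here are illustrative choices, not a specific paper's formulation.

import torch
import torch.nn.functional as F

def representation_alignment_loss(enc_src, enc_tgt, src_mask, tgt_mask):
    # enc_*: (batch, time, dim) encoder states for a sentence and its translation.
    # *_mask: (batch, time) float masks with 1.0 on real tokens, 0.0 on padding.
    pool_src = (enc_src * src_mask.unsqueeze(-1)).sum(1) / src_mask.sum(1, keepdim=True)
    pool_tgt = (enc_tgt * tgt_mask.unsqueeze(-1)).sum(1) / tgt_mask.sum(1, keepdim=True)
    # 1 - cosine similarity, averaged over the batch.
    return (1 - F.cosine_similarity(pool_src, pool_tgt, dim=-1)).mean()

# In training, the auxiliary term is weighted and added to the NMT loss, e.g.:
# total_loss = translation_loss + lambda_align * representation_alignment_loss(...)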

Tuesday, Apr 21st, 09:30 a.m.
