News

Synthesizer: Rethinking Self-Attention in Transformer Models
07 Jul 2020

In our lab meeting tomorrow, Nishant will introduce the paper Synthesizer: Rethinking Self-Attention in Transformer Models. A Zoom link will be posted to Twist on the morning of the meeting.

Synthesizer: Rethinking Self-Attention in Transformer Models

Abstract: The dot product self-attention is known to be central and indispensable to state-of-the-art Transformer models. But is it really required? This paper investigates the true importance and contribution of the dot product-based self-attention mechanism on the performance of Transformer models. Via extensive experiments, we find that (1) random alignment matrices surprisingly perform quite competitively and (2) learning attention weights from token-token (query-key) interactions is not that important after all. To this end, we propose Synthesizer, a model that learns synthetic attention weights without token-token interactions. Our experimental results show that Synthesizer is competitive against vanilla Transformer models across a range of tasks, including machine translation (EnDe, EnFr), language modeling (LM1B), abstractive summarization (CNN/Dailymail), dialogue generation (PersonaChat) and multi-task language understanding (GLUE, SuperGLUE).
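
To make the idea concrete, here is a minimal sketch (not the authors' code; the module name, the two-layer projection and the dimension choices are assumptions) of the Dense Synthesizer variant, in which each token predicts its own row of attention weights without any query-key dot products:

```python
# Minimal sketch of Dense Synthesizer attention (assumptions, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseSynthesizerAttention(nn.Module):
    def __init__(self, d_model: int, max_len: int):
        super().__init__()
        # Each token's hidden vector is mapped to max_len attention logits.
        self.proj = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(),
                                  nn.Linear(d_model, max_len))
        self.value = nn.Linear(d_model, d_model)

    def forward(self, x):                      # x: (batch, seq_len, d_model)
        seq_len = x.size(1)
        logits = self.proj(x)[:, :, :seq_len]  # (batch, seq_len, seq_len); no token-token dot product
        attn = F.softmax(logits, dim=-1)
        return attn @ self.value(x)            # weighted sum of value vectors

# Usage: synth = DenseSynthesizerAttention(d_model=64, max_len=128)
#        out = synth(torch.randn(2, 10, 64))
```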

Tuesday, July 7th, 09:30 a.m.

Does Privacy Exist When We Are Online?
30 Jun 2020

In our lab meeting tomorrow, Hassan will introduce online privacy. A Zoom link will be posted to Twist on the morning of the meeting.

Does Privacy Exist When We Are Online?

Abstract: You have most likely tried creating accounts on different websites and have been forced to look at their privacy policy and terms pages. If you are like me, you have very likely tried to get to the last line of those pages as fast as possible to accept the terms and get your account. In this presentation, we are going to discuss “privacy” and “digital identity” and what you might have missed while quickly skimming those terms pages (we will study the privacy policy and terms of an example service). We will also learn about our rights as service users in relation to these policies. At the end, I will present my suggestions for maintaining a higher level of privacy while using the internet.

Tuesday, June 30th, 09:30 a.m.

Recent trends in Automatic Speech Translation
23 Jun 2020

In our lab meeting tomorrow, Ashkan will introduce Automatic Speech Translation. A Zoom link will be posted to Twist on the morning of the meeting.

Recent trends in Automatic Speech Translation

Abstract: Automatic Speech Translation (AST) aims to directly translate audio signals in the source language into text in the target language. For many years, the standard approach to speech translation was a pipeline that first transcribes speech with an ASR component and then translates the transcript with an MT component. In recent years, it has been shown that we can remove the transcription step and build an end-to-end model that is strong enough to compete with the cascaded model. In this talk, I will go through the most influential ideas in this research direction.
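
As a rough illustration (the model objects and method names below are hypothetical placeholders, not a specific toolkit), the two approaches differ in whether an intermediate transcript is produced:

```python
# Cascaded vs. end-to-end speech translation, sketched with hypothetical components.
def cascaded_st(audio, asr_model, mt_model):
    """Cascade: transcribe in the source language, then translate the text."""
    transcript = asr_model.transcribe(audio)   # speech -> source-language text
    return mt_model.translate(transcript)      # source-language text -> target-language text

def end_to_end_st(audio, st_model):
    """End-to-end: a single model maps source speech directly to target text."""
    return st_model.translate(audio)           # speech -> target text, no intermediate transcript
```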

Tuesday, June 23rd, 09:30 a.m.

Reformer: The Efficient Transformer
09 Jun 2020

In our lab meeting tomorrow, Pooya will introduce the paper Reformer: The Efficient Transformer. A Zoom link will be posted to Twist on the morning of the meeting.

Reformer: The Efficient Transformer

Abstract: Large Transformer models routinely achieve state-of-the-art results on a number of tasks but training these models can be prohibitively costly, especially on long sequences. We introduce two techniques to improve the efficiency of Transformers. For one, we replace dot-product attention by one that uses locality-sensitive hashing, changing its complexity from O(L²) to O(L log L), where L is the length of the sequence. Furthermore, we use reversible residual layers instead of the standard residuals, which allows storing activations only once in the training process instead of N times, where N is the number of layers. The resulting model, the Reformer, performs on par with Transformer models while being much more memory-efficient and much faster on long sequences.
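
As an illustration of the second technique, here is a minimal sketch (a toy under stated assumptions, not the Reformer implementation) of a reversible residual block: the inputs of each layer can be recomputed exactly from its outputs, so activations need not be stored for every layer during training:

```python
# Reversible residual block: forward and exact inverse (toy sketch).
import torch

def rev_block_forward(x1, x2, F, G):
    """y1 = x1 + F(x2); y2 = x2 + G(y1)."""
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def rev_block_inverse(y1, y2, F, G):
    """Recover (x1, x2) from (y1, y2) without stored activations."""
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

# Usage with toy sublayers standing in for attention / feed-forward functions:
F_fn, G_fn = torch.nn.Linear(8, 8), torch.nn.Linear(8, 8)
x1, x2 = torch.randn(2, 8), torch.randn(2, 8)
y1, y2 = rev_block_forward(x1, x2, F_fn, G_fn)
rx1, rx2 = rev_block_inverse(y1, y2, F_fn, G_fn)
assert torch.allclose(rx1, x1, atol=1e-6) and torch.allclose(rx2, x2, atol=1e-6)
```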

Tuesday, June 9th, 09:30 a.m.

A review on Nested Named Entity Recognition
19 May 2020

In our lab meeting tomorrow, Vincent will introduce a review on Nested Named Entity Recognition. A Zoom link will be posted to Twist on the morning of the meeting.

A review on Nested Named Entity Recognition

Abstract: Named entity recognition (NER) is the task of extracting certain semantic entities, such as persons and organizations, from a sentence or a paragraph. In other words, it detects the span of each entity and the semantic category it belongs to. NER plays an important role in many downstream tasks such as relation extraction, co-reference resolution and entity linking. Nested NER refers to the situation where some entities may contain others. Due to technical problems, not semantic ones, nested NER was ignored for a long time. However, nested NER is very common, especially in the biomedical domain, and fine-grained entities provide necessary and detailed information for downstream tasks. This review mainly focuses on three parts: i. sequence labeling models with multiple-label classification; ii. sequence labeling models with a modified decoder; iii. other models apart from sequence labeling.
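
As a small illustration (a toy example, not one of the reviewed models), the span-enumeration view makes the nested setting easy to see: every candidate span is labeled independently, so one entity may sit inside another:

```python
# Toy sketch of nested entities under a span-enumeration view.
tokens = ["The", "University", "of", "Washington", "hospital"]
# Nested gold annotation: the ORG span contains a GPE span.
gold_spans = {(1, 4): "ORG",        # "University of Washington"
              (3, 4): "GPE"}        # "Washington"

def enumerate_spans(tokens, max_len=4):
    """All candidate spans up to max_len tokens; a classifier would score each one."""
    return [(i, j) for i in range(len(tokens))
            for j in range(i + 1, min(i + max_len, len(tokens)) + 1)]

for span in enumerate_spans(tokens):
    label = gold_spans.get(span, "O")   # a real system would predict this label
    if label != "O":
        print(" ".join(tokens[span[0]:span[1]]), "->", label)
```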

Tuesday, May 19th, 09:30 a.m.
