This week, Logan will discuss parts of his upcoming publication in Findings of ACL 2021, as well as his submission to EMNLP 2021.
Our recent progress on the Undeciphered Proto-Elamite Script
Abstract: Compositionality of Complex Graphemes in the Undeciphered Proto-Elamite Script using Image and Text Embedding Models
We introduce a language modeling architecture which operates over sequences of images, or over multimodal sequences of images with associated labels. We use this architecture alongside other embedding models to investigate a category of signs called complex graphemes (CGs) in the undeciphered proto-Elamite script. We argue that CGs have meanings which are at least partly compositional, and we discover novel rules governing the construction of CGs. We find that a language model over sign images produces more interpretable results than a model over text or over sign images and text, which suggests that the names given to signs may be obscuring signals in the corpus. Our results reveal previously unknown regularities in proto-Elamite sign use that can inform future decipherment efforts, and our image-aware language model provides a novel way to abstract away from biases introduced by human annotators.
Creating a Signlist from Sign Images in an Undeciphered Script using Deep Clustering
We propose an architecture for revising transliterations of an undeciphered script by clustering sign images from that script. The clustering is optimized on a multi-part objective that includes unsupervised tasks such as entropy of the sign labels and visual similarity between signs, as well as partial supervision that exploits existing transliterations for language modeling. This allows us to learn revised labelings for an undeciphered script which may be difficult for human annotators to transliterate, since distinctions between signs may be relevant or irrelevant based on contextual information spread across the entire corpus. By automating this process, we obtain a simplified signlist which we find to give better results than the existing transliterations on downstream tasks.
Tuesday, June 8th, 09:30 a.m.
This week, Ashkan will present his research progress on Simultaneous MT. A Zoom link will be sent tomorrow morning.
Our recent progress on Simultaneous Machine Translation
Abstract: Simultaneous neural machine translation (SNMT) aims to maintain translation quality while minimizing the delay between reading the input and incrementally producing the output. The eventual goal of SNMT is to match the performance of highly skilled human interpreters who can simultaneously listen to a speaker in a source language and produce a translation in the target language with minimal delay. In this presentation I will talk about our latest progress on generating more efficient policies to balance the trade-off between translation quality and delay.
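To make the quality/delay trade-off concrete, here is a minimal sketch of a fixed "wait-k" read/write schedule, a well-known baseline policy from the simultaneous translation literature. It is not the policy presented in this talk; the function names and the simple per-token delay measure are illustrative assumptions.

```python
def wait_k_actions(src_len, tgt_len, k):
    """Return the READ/WRITE schedule of a wait-k policy (illustrative baseline).

    The policy first reads k source tokens, then alternates between writing
    one target token and reading one source token until the source runs out,
    after which it only writes.
    """
    actions = []
    read = written = 0
    while written < tgt_len:
        # Stay k tokens ahead of the output, but never read past the source.
        if read < min(src_len, written + k):
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions


def source_read_per_target(actions):
    """For each WRITE, count how many source tokens had been read so far."""
    read = 0
    counts = []
    for action in actions:
        if action == "READ":
            read += 1
        else:
            counts.append(read)
    return counts


if __name__ == "__main__":
    schedule = wait_k_actions(src_len=4, tgt_len=4, k=2)
    print(schedule)
    print(source_read_per_target(schedule))
```

A small k means each target token is produced after seeing little source context (low delay, potentially lower quality); setting k to the source length recovers ordinary full-sentence translation.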
Tuesday, June 1st, 09:30 a.m.
This week, Jetic will introduce recent research on memory networks and their applications. A Zoom link will be sent tomorrow morning.
Recent work on Memory Networks
Abstract: Recent years have seen wide application of external knowledge sources in end-to-end NLP tasks. Through the use of memory networks (Sukhbaatar et al., 2015) alongside variants of the attention mechanism (Bahdanau et al., 2014), neural models can directly access information from a much wider variety of sources, such as Wikipedia or a task-specific knowledge base, in a context-aware manner, and have been shown to perform well against knowledge-agnostic or symbolic context-agnostic approaches. This talk highlights recent advances in memory network research and several of its applications.
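As a rough illustration of how a model can read from external memory in a context-aware way, the sketch below performs a single attention-weighted memory read in the spirit of end-to-end memory networks: the query is scored against each memory key, the scores are normalized with a softmax, and the output is the weighted sum of the memory values. The function names and toy vectors are assumptions for illustration; real systems use learned embeddings and usually several such hops.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_read(query, keys, values):
    """Single memory hop: attend over keys, return (weights, weighted value sum)."""
    # Dot-product match between the query and every memory key.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Weighted sum of the value vectors, one dimension at a time.
    output = [sum(w * value[d] for w, value in zip(weights, values)) for d in range(dim)]
    return weights, output


if __name__ == "__main__":
    # Toy memory with two entries; the query matches the first key.
    keys = [[10.0, 0.0], [0.0, 10.0]]
    values = [[1.0, 0.0], [0.0, 1.0]]
    weights, output = memory_read([1.0, 0.0], keys, values)
    print(weights, output)
```

Because the weights depend on the query, the same memory yields different outputs for different contexts, which is the sense in which access is context-aware.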
Tuesday, May 25th, 09:30 a.m.
This week, Hassan will review a paper on error analysis. It is one of the two “Best Long Paper Award” winners at EACL 2021, which ended three days ago. A Zoom link will be sent tomorrow morning.
Error Analysis and the Role of Morphology
Abstract: We evaluate two common conjectures in error analysis of NLP models: (i) Morphology is predictive of errors; and (ii) the importance of morphology increases with the morphological complexity of a language. We show across four different tasks and up to 57 languages that of these conjectures, somewhat surprisingly, only (i) is true. Using morphological features does improve error prediction across tasks; however, this effect is less pronounced with morphologically complex languages. We speculate this is because morphology is more discriminative in morphologically simple languages. Across all four tasks, case and gender are the morphological features most predictive of error.
Tuesday, Apr 27th, 09:30 a.m.
This week, Nishant will review a paper on multilingual BERT. A Zoom link will be sent tomorrow morning.
How multilingual is multilingual BERT?
Abstract: This work by Pires et al. (2019) empirically investigates the degree to which pre-trained, contextualized, general-purpose linguistic representations generalize across languages. The key finding is that multilingual BERT (M-BERT), released by Devlin et al. (2018) as a single language model pre-trained from monolingual corpora in 104 languages, is surprisingly good at zero-shot cross-lingual model transfer, in which task-specific annotations in one language are used to fine-tune the model for evaluation in another language. To understand why, the authors present a large number of probing experiments, showing that transfer is possible even to languages in different scripts, that transfer works best between typologically similar languages, that monolingual corpora can train models for code-switching, and that the model can find translation pairs. From these results, we can conclude that M-BERT does create multilingual representations, but that these representations exhibit systematic deficiencies affecting certain language pairs.
Tuesday, Apr 20th, 09:30 a.m.