17 Sep 2019

In our lab meeting this week, Logan will talk about some recent work on the analysis of the undeciphered proto-Elamite script that he presented in a NAACL 2019 workshop. The title and abstract of the talk:

Sign Clustering and Topic Extraction in Proto-Elamite

Abstract: We describe a first attempt at using techniques from computational linguistics to analyze the undeciphered proto-Elamite script. Using hierarchical clustering, n-gram frequencies, and LDA topic models, we both replicate results obtained by manual decipherment and reveal previously-unobserved relationships between signs. This demonstrates the utility of these techniques as an aid to manual decipherment.

Tuesday, September 17th, 10:30 a.m. TASC1 9408.

10 Sep 2019

In our lab meeting this week, Nishant will talk about the state-of-the-art methods for neural word sense disambiguation. The title and abstract of his talk:

Sense Vocabulary Compression through the Semantic Knowledge of WordNet for Neural Word Sense Disambiguation

Abstract: We tackle the issue of the limited quantity of manually sense-annotated corpora for the task of word sense disambiguation by exploiting the semantic relationships between senses, such as synonymy, hypernymy and hyponymy, in order to compress the sense vocabulary of Princeton WordNet and thus reduce the number of different sense tags that must be observed to disambiguate all words of the lexical database. We propose two different methods that greatly reduce the size of neural WSD models, with the benefit of improving their coverage without additional training data, and without impacting their precision. In addition to our method, we present a WSD system which relies on pre-trained BERT word vectors in order to achieve results that significantly outperform the state of the art on all WSD evaluation tasks.

Tuesday, September 10th, 10:30 a.m. TASC1 9408.

10 Sep 2019

Hi everyone, we’ll have the first lab meeting of the Fall Semester on Tuesday, September 10.

Join us for our weekly lab meetings and paper discussions on Tuesdays at 10:30 a.m. in TASC1 9408.

12 Aug 2019

In our lab meeting this week, Anahita will talk about part-of-speech induction. The title and abstract of her talk:

Inducing Multilingual Text Analysis Tools via Robust Projection across Aligned Corpora

Abstract: We will discuss the seminal work of Yarowsky and Ngai (2001) for inducing part-of-speech taggers for languages that have no annotated training data, but have translated text in a resource-rich language. This method does not assume any knowledge about the target language (no tagging dictionary is assumed), making it applicable to a wide array of resource-poor languages.

Tuesday, August 13th, 12:00 p.m. TASC1 9408.

06 Aug 2019

In our lab meeting this week, Zhenqi will talk about fine-grained entity recognition. The title and abstract of his talk:

Overview of latest works on fine-grained and ultra-fine entity recognition

Abstract: Recent work on the fine-grained entity typing task has shifted its attention from encoding the mention and its context to learning better type embeddings. We will present four recent works on the fine-grained and ultra-fine entity recognition task and explore techniques such as zero-shot learning, graph convolutional networks, and embedding entity types in hyperbolic space.

Tuesday, August 6th, 12:00 p.m. TASC1 9408.

