News

Nishant will talk about Decipherment.
16 May 2018

For our lab meeting this week, Nishant will talk about improving decipherment with RNNs. Here's the title and abstract of his talk:

Title: Can Recurrent Neural Networks help improve Decipherment?

Abstract: Decipherment in NLP has always been an interesting topic of research. In the past, there have been quite a few successful approaches to the decipherment problem inspired by statistical machine translation (SMT) methodologies. With the rise of Neural Machine Translation (NMT) to prominence and the encouraging results that systems powered by neural language models have produced, approaching the decipherment problem with deep learning is a logical next step. In this talk, Nishant will discuss the progress of his research on Neural Decipherment, a few solved ciphers, and a comparison of his results with notable previous work.

Wednesday, May 16th, 10-11 AM, Location: TASC1 9408.

Ashkan's presentation about Unsupervised Machine Translation.
09 May 2018

For our lab meeting this week, Ashkan will present an ICLR 2018 paper about Unsupervised Machine Translation. Here’s the title and abstract of his talk:

Title: Unsupervised Machine Translation Using Monolingual Corpora Only

Abstract: Machine translation has recently achieved impressive performance thanks to recent advances in deep learning and the availability of large-scale parallel corpora. There have been numerous attempts to extend these successes to low-resource language pairs, yet requiring tens of thousands of parallel sentences. In this work, we take this research direction to the extreme and investigate whether it is possible to learn to translate even without any parallel data. We propose a model that takes sentences from monolingual corpora in two different languages and maps them into the same latent space. By learning to reconstruct in both languages from this shared feature space, the model effectively learns to translate without using any labeled data. We demonstrate our model on two widely used datasets and two language pairs, reporting BLEU scores of 32.8 and 15.1 on the Multi30k and WMT English-French datasets, without using even a single parallel sentence at training time.
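To make the core idea in the abstract concrete, here is a minimal PyTorch sketch of the shared-latent-space setup: a single encoder shared by both languages and one decoder per language, trained to reconstruct a sentence from a noisy version of itself. The model sizes, noise function, and toy batch are placeholders rather than the authors' configuration, and the paper's back-translation and adversarial terms are omitted.

```python
# Illustrative sketch of the shared-latent-space idea: one encoder shared by
# both languages, one decoder per language, trained with a denoising
# auto-encoding loss on monolingual data. All sizes and data are toy placeholders.
import torch
import torch.nn as nn

VOCAB, EMB, HID = 1000, 64, 128

class SharedLatentAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.encoder = nn.GRU(EMB, HID, batch_first=True)   # shared across languages
        self.dec = nn.ModuleDict({                           # one decoder per language
            "src": nn.GRU(EMB, HID, batch_first=True),
            "tgt": nn.GRU(EMB, HID, batch_first=True),
        })
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, noisy, dec_in, lang):
        _, z = self.encoder(self.embed(noisy))               # z: shared latent state
        dec_states, _ = self.dec[lang](self.embed(dec_in), z)
        return self.out(dec_states)

def add_noise(batch):
    # crude stand-in for the paper's word-dropout / local-shuffling noise
    mask = torch.rand_like(batch, dtype=torch.float) < 0.1
    return torch.where(mask, torch.randint_like(batch, VOCAB), batch)

model = SharedLatentAE()
loss_fn = nn.CrossEntropyLoss()
batch = torch.randint(0, VOCAB, (8, 12))                     # toy monolingual batch of token ids
logits = model(add_noise(batch), batch[:, :-1], "src")       # denoise + teacher forcing
loss = loss_fn(logits.reshape(-1, VOCAB), batch[:, 1:].reshape(-1))
loss.backward()
```

The real system alternates this denoising objective with back-translation between the two languages, which is what turns the shared auto-encoder into a translator.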

Wednesday, May 9th, 10-11 AM, Location: TASC1 9408.

Logan's practice talk about Weighted Synchronous Context-Free Grammars
01 May 2018

Our first lab meeting of the summer semester starts with Logan. Here is the title and abstract of his talk:

Title: A Weight-Preserving Lexicalization for Weighted Synchronous Context-Free Grammars.

Abstract: This is a practice talk for an upcoming presentation at WATA 2018. The talk will present an algorithm which lexicalizes a weighted SCFG without changing the weight which it assigns to strings. We also show that it is possible to normalize the lexicalized grammar into a probabilistic SCFG, and we evaluate the lexicalized grammars on a synthetic inference task.

Wednesday, May 2nd, 10-11 AM, Location: TASC1 9408.

Artashes will talk about the 'Attention Is All You Need' paper in our lab meeting
10 Apr 2018

This week in our lab meeting, Artashes will talk about Google Brain's paper "Attention Is All You Need". Here is the abstract of the paper:

Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
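As a concrete reference for the talk, here is a small NumPy sketch of the scaled dot-product attention the Transformer is built from, softmax(QK^T / sqrt(d_k)) V. The toy shapes are arbitrary; the full model stacks multi-head attention, feed-forward layers, and positional encodings on top of this building block.

```python
# Scaled dot-product attention: each query attends to all keys, and the
# resulting weights form a weighted average of the values.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(5, 16))   # 5 query positions, d_k = 16
K = rng.normal(size=(7, 16))   # 7 key/value positions
V = rng.normal(size=(7, 32))
print(scaled_dot_product_attention(Q, K, V).shape)    # (5, 32)
```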

If we have more time, he'll also talk about how he's applying this architecture to time-series regression and to text classification for sentiment analysis.

Wednesday, April 11th, 10-11 AM, Location: TASC1 9408.

Fatemeh from the Linguistics Department is our next speaker for this week's lab meeting
03 Apr 2018

In our lab meeting this week, Fatemeh Torabi will talk about word embeddings. Here is the abstract of her talk:

Abstract: Word embeddings obtained from neural networks trained on big text corpora have become popular representations of word meaning in computational linguistics. The most popular recent model, word2vec, simultaneously generates a set of word and context embeddings, the latter usually discarded after training. We demonstrate how these two layers of distributional representation can be used in predicting taxonomic similarity vs. asymmetric association between words. Our study is composed of both artificial language experiments and evaluations based on word similarity and relatedness datasets collected through crowdsourcing and psycholinguistic experiments. In particular, we use two recently published datasets: SimLex-999 (Hill et al., 2015), which includes explicitly instructed ratings of word similarity, and the explicitly instructed production norms of Jouravlev & McRae (2016) for word relatedness. We find that people respond with words closer to the cue within the context embedding space (rather than the word embedding space) when they are explicitly asked to generate thematically related words. Taxonomic similarity ratings, however, are better predicted by word embeddings alone. This suggests that the distributional information encoded in different layers of the neural network reflects different aspects of word meaning. Our experiments also elaborate on word2vec as a model of human lexical memory by showing that both types of semantic relations among words are encoded within a unified network through reinforcement learning. Recommendations for biasing the model to organize words based on either taxonomic similarity or relatedness are introduced for practical applications.
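For anyone who wants to poke at the word-vs-context-embedding distinction before the talk, here is a small sketch using gensim's word2vec (skip-gram with negative sampling). The toy corpus and word pair are placeholders, and the attribute names assume gensim 4.x, where model.wv holds the input ("word") vectors and model.syn1neg holds the output ("context") vectors when negative sampling is used.

```python
# Compare similarities computed in the word-embedding space vs. against the
# context (output) embeddings that word2vec normally discards after training.
import numpy as np
from gensim.models import Word2Vec

sentences = [["doctor", "treats", "patient", "in", "hospital"],
             ["nurse", "helps", "doctor", "at", "hospital"],
             ["surgeon", "operates", "on", "patient"]] * 50   # toy corpus

model = Word2Vec(sentences, vector_size=50, window=3, sg=1, negative=5,
                 min_count=1, epochs=20, seed=1)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

w, c = "doctor", "hospital"
ctx_vec = model.syn1neg[model.wv.key_to_index[c]]        # context-embedding row for c
print("word-word similarity:   ", cosine(model.wv[w], model.wv[c]))
print("word-context similarity:", cosine(model.wv[w], ctx_vec))
```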

Wednesday, April 4th, 10-11 AM, Location: TASC1 9408.
