This week in our lab meeting, Artashes will talk about Google Brain’s paper “Attention Is All You Need”. Here is the abstract of the paper:
Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
If we have more time, he’ll also talk about how he is applying this architecture to time-series regression and to sentiment-analysis text classification.
Wednesday, April 11th, 10-11 AM, Location: TASC1 9408.
In our lab meeting this week, Fatemeh Torabi will talk about word embeddings. Here is the abstract of her talk:
Abstract: Word embeddings obtained from neural networks trained on big text corpora have become popular representations of word meaning in computational linguistics. The most popular model recently, i.e., word2vec, simultaneously generates a set of word and context embeddings, the latter usually discarded after training. We demonstrate how these two layers of distributional representation can be used in predicting taxonomic similarity vs. asymmetric association between words. Our study is composed of both artificial language experiments and evaluations based on word similarity and relatedness datasets collected through crowdsourcing and psycholinguistic experiments. In particular, we use two recently published datasets: SimLex-999 (Hills et al. 2016) including explicitly instructed ratings for word similarity, and explicitly instructed production norms (Jouravlev & McRae, 2016) for word relatedness. We find that people respond with words closer to the cue within the context embedding space (rather than the word embedding space) when they are explicitly asked to generate thematically related words. Taxonomic similarity ratings are however better predicted by word embeddings alone. This suggests that the distributional information encoded in different layers of the neural network reflects different aspects of word meaning. Our experiments also elaborate on word2vec as a model of human lexical memory by showing that both types of semantic relations among words are encoded within a unified network through reinforcement learning. Recommendations for biasing the model to organize words based on either taxonomic similarity or relatedness are introduced for practical applications.
Wednesday, April 4th, 10-11 AM, Location: TASC1 9408.
In our lab meeting this week, Nadia will talk about the EPIC corpus. Here is the title and abstract of her talk:
Title: The European Parliament Interpreting Corpus E.P.I.C.
Abstract: The EPIC corpus is an open, parallel, trilingual (Italian, English and Spanish) corpus of European Parliament speeches and their corresponding interpretations. The basic idea behind the EPIC project was to collect a large quantity of interpreting data (i.e. source and interpreted speeches) to produce much-needed empirical research on the characteristics of interpreted speeches and to inform and improve training practices. More specifically, the research interest of the Directionality Research Group lies in the study of interpreters’ strategies across different language combinations and directions (language-pair-related issues and directionality issues, respectively).
Wednesday, March 21, 10-11 AM, Location: TASC1 9408.
In our lab meeting this week, Varada Kolhatkar from the Linguistics department will join us and talk about understanding discourse, from resolving complex anaphora to identifying constructive news comments. Here is the title and abstract of her talk:
Title: Understanding Discourse: From Resolving Complex Anaphora to Identifying Constructive News Comments
Abstract: Recently, computational linguists have been increasingly interested in understanding discourse. This talk focuses on two prominent issues related to understanding discourse structure: (a) resolving complex cases of anaphora which depends on understanding how discourse is constructed and maintained, and (b) identifying constructive reader comments which requires the analysis of discourse at the pragmatic level.
The goal of the first project is to develop computational methods for tackling cases of anaphora where the antecedents are typically of a non-nominal syntactic form and the referents typically represent proposition-like entities, as shown in example (1). This research is guided by two primary questions: first, how an automated process can determine the interpretation of such expressions, and second, to what extent the knowledge derived from the linguistics literature can help in this process.
(1) Living expenses are much lower in rural India than in New York, but this fact is not fully captured if prices are converted with currency exchange rates.
The goal of the second project is to encourage constructive discussion online. In particular, I will talk about identifying constructive news comments. This research is guided by three questions: a) what characteristics make a reader comment constructive, b) how can an automated process determine constructive language in news comments and c) how do constructiveness and toxicity interact in online language. The methods which are being developed can assist in moderation tasks, typically performed by humans, such as promoting constructive comments and summarizing reader comments.
Wednesday, March 14th, 10-11 AM, Location: TASC1 9408.
In our first lab meeting in March, Anoop will talk about computational morphology. Here is the title and abstract of his talk:
Title: The K&K result
Abstract: Kaplan and Kay (henceforth K&K) announce two goals: “to provide the core of a mathematical framework for phonology” and “to establish a solid basis for computation in the domain of phonological and orthographic systems.” They show how the algebra of regular relations, with their corresponding automata, can be used to compile systems of phonological rules in the style of SPE, including directionality, optionality, and ordering. They sketch mechanisms for incorporating a lexicon and for dealing with exceptional forms, thus providing a complete treatment in a unified framework. (text from “Commentary on Kaplan and Kay” by Mark Liberman)
Wednesday, March 7th, 10-11 AM, Location: TASC1 9408.