In our lab meeting this week, Nadia will talk about using deep neural networks to encode semantic information in word representations. Here is the title and abstract of her talk:
Title: Deep contextualized word representations
Abstract: We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bi-directional language model (biLM), which is pre-trained on a large text corpus. We show that these representations can be easily added to existing models and significantly improve the state of the art across six challenging NLP problems, including question answering, textual entailment, and sentiment analysis. We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.
Link to paper: https://arxiv.org/abs/1802.05365
Wednesday, July 3rd, 10:00 a.m., Location: TASC1 9408.
This week in our lab meeting, Mahmoud from the Computational Logic Lab will give a presentation on dynamic gated graph neural networks. Here is the title and abstract of his talk:
Title: Scene graph generation with dynamic gated graph neural networks
Abstract: In spite of recent advances in Visual Question Answering (VQA), current VQA models often fail on sufficiently new samples, converge on an answer after listening to only a few words of the question, and do not alter their answers across different images. Most of these models try to build a loose association between the given training QA pairs and images in an end-to-end framework. But, to achieve success at VQA task, a model must be able to recognize the objects and their visual relationships in an image, identify the attributes of these objects, and reason about the role of each object in the scene context. To address these issues, we propose a new deep model, called Dynamic Gated Graph Neural Networks (D-GGNN), for extracting a scene graph for an image, given a set of bounding box proposals. A scene graph is a visually-grounded digraph for an image, where the nodes represent the objects and the edges show the relationships between them. Unlike the recently proposed Gated Graph Neural Networks (GGNN), the D-GGNN can be applied to an input image when only partial relationship information, or none at all, is known. In each training episode, the D-GGNN sequentially builds a candidate scene graph for a given training input and labels additional nodes and edges of the graph. The scene graph is constructed using a deep reinforcement learning framework, where the actions are choosing labels for edges and nodes, and the rewards are defined by the match between the ground-truth annotations in the data and the labels assigned at a point in the search. The predicted scene graph is then used to answer questions about the image using an attention mechanism, where we compute an attention weight for each object of the scene graph based on the given question. Our preliminary experiments show promising results on both VQA and scene graph generation tasks.
Wednesday, June 26th, 10-11 AM, Location: TASC1 9408.
In our lab meeting tomorrow, Anahita will present another paper from ICLR 2018 about Neural Phrase-based Machine Translation. Here is the title and abstract of the paper:
Title: Towards Neural Phrase-based Machine Translation
Abstract: In this paper, we present Neural Phrase-based Machine Translation (NPMT). Our method explicitly models the phrase structures in output sequences using Sleep-WAke Networks (SWAN), a recently proposed segmentation-based sequence modeling method. To mitigate the monotonic alignment requirement of SWAN, we introduce a new layer to perform (soft) local reordering of input sequences. Different from existing neural machine translation (NMT) approaches, NPMT does not use attention-based decoding mechanisms. Instead, it directly outputs phrases in a sequential order and can decode in linear time. Our experiments show that NPMT achieves superior performances on IWSLT 2014 German-English/English-German and IWSLT 2015 English-Vietnamese machine translation tasks compared with strong NMT baselines. We also observe that our method produces meaningful phrases in output languages.
The paper can be found here: https://openreview.net/forum?id=HktJec1RZ
Wednesday, June 13th, 10-11 AM, Location: TASC1 9408.
This week in our lab meeting, Anoop will talk about computational decipherment. Here’s the title and abstract of his talk:
Title: Computational Decipherment of Ancient Scripts
Abstract: A brief overview of methods in the computational decipherment of ancient scripts. We will compare and contrast with more recent unsupervised neural machine translation models.
Wednesday, June 6th, 10-11 AM, Location: TASC1 9408.
In our lab meeting this week, Lindsey presented a paper from the Facebook team. Here is the abstract of his talk:
Abstract: Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it can easily be trained end-to-end and scales to a large amount of training data. The model shows significant performance gains on the DUC-2004 shared task compared with several strong baselines.
The paper can be found here: https://arxiv.org/pdf/1509.00685.pdf
Wednesday, May 30th, 10-11 AM, Location: TASC1 9408.