Today in the lab meeting, Dan Fass will be leading the discussion. The title of his talk is “A Selected Review of N-gram Research and Two Proposals”. Here is a brief description of his talk:
N-grams are widely used in NLP applications such as statistical machine translation, speech recognition, optical character recognition, and spelling correction. Three research groups are notable for developing n-grams that combine lexical and syntactic information of various types. A representation for n-gram information is proposed that attempts to reconcile differences between those n-grams, along with a classification underpinning the proposed representation and others. Potential applications and extensions of the representation are briefly described.
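For readers new to the term, here is a minimal sketch of plain word n-gram extraction. It is not the representation proposed in the talk; the function name `ngrams` and the toy sentence are illustrative assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous n-token window of `tokens` as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Toy sentence with whitespace tokenization (an assumption for illustration).
tokens = "the cat sat on the mat".split()

# Count bigrams (n = 2); a language model would estimate probabilities from such counts.
bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common())
```

N-grams that combine lexical and syntactic information, as discussed in the talk, would replace the plain word tokens with richer units.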
Today in the lab meeting, Golnar will give a talk about Named Entity Recognition with LSTM-X. Here is a brief description of her talk: “We will be discussing variations of LSTM tailored for Named Entity Recognition and plans for integrating them into GraphNER.”
In the lab meeting tomorrow, Ashkan will give a talk about Real-Time Neural Machine Translation. Here is a brief description of his talk:
Simultaneous translation differs from the more usual consecutive translation. In simultaneous translation, the objective of a translator, or a translation system, is defined as a combination of quality and delay, as opposed to consecutive translation, in which translation quality alone matters. In order to minimize delay while maximizing quality, a simultaneous translator must start generating symbols in the target language before the full source sentence is received. In our meeting, I will talk about the main ideas and approaches presented in two recent papers in this area of research. Here are the links: https://arxiv.org/abs/1606.02012 https://arxiv.org/abs/1610.00388
In the lab meeting today, we will be discussing the following paper: Neubig, Graham, et al. “DyNet: The Dynamic Neural Network Toolkit.” arXiv preprint arXiv:1701.03980 (2017).
In the lab meeting today, Nishant will give a talk about handling Out-of-Vocabulary (OOV) words in Machine Translation. Here is a brief description of his talk:
Out-of-vocabulary (OOV) words, that is, words that appear in the task at hand but not in the training set, are a ubiquitous and difficult problem in machine translation. Data-driven machine translation systems are able to translate words that have been seen in the training corpora; however, translating unseen words is still a bottleneck for even the best-performing systems. In general, the amount of parallel data is finite, so infrequent terms are often absent from the parallel training corpora. This lack of information can produce incomplete, erroneous and disfluent translations. In this discussion, we will investigate the different approaches to handling OOVs in Statistical Machine Translation, leading up to Neural Machine Translation.
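To make the definition concrete, the following is a minimal sketch of how one might measure the OOV problem on a toy corpus. It is not one of the approaches covered in the talk; the function name `oov_report`, the whitespace tokenization, and the example sentences are all illustrative assumptions.

```python
from collections import Counter


def oov_report(train_sentences, test_sentences):
    """Return (counts of OOV tokens, token-level OOV rate) for the test set."""
    # Vocabulary = every word type seen in the training sentences.
    vocab = {tok for sent in train_sentences for tok in sent.split()}
    test_tokens = [tok for sent in test_sentences for tok in sent.split()]
    # A test token is OOV if it never occurred in training.
    oov_counts = Counter(tok for tok in test_tokens if tok not in vocab)
    rate = sum(oov_counts.values()) / len(test_tokens) if test_tokens else 0.0
    return oov_counts, rate


# Toy data (hypothetical): "chased" and "zebra" never occur in training.
train = ["the cat sat on the mat", "the dog barked"]
test = ["the cat chased the zebra"]

oov_counts, rate = oov_report(train, test)
print(sorted(oov_counts), rate)
```

A real system would use the same tokenization or subword segmentation as the translation model, which can change the measured OOV rate considerably.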