News

Bdour will give her thesis defence practice talk
10 Dec 2019

This Thursday, Bdour will practice her thesis defence. Here is the title and abstract:

**Multifaceted NLP Analysis of Hate Speech And Kinetic Actions Descriptions Online**

Abstract: Despite the many great advantages of social media and online forums in bringing people, communities, and groups together, other problems have emerged on these sites, including hate speech and abusive behaviour online. Unfortunately, these platforms can be used as spaces to bully, harass, assault, or even plan to carry out a kinetic action against others. Most of the data that comes from these sources is noisy, unstructured, and unlabelled, which makes designing supervised classifiers a task that requires a great deal of human effort to label the data and determine the severity of the toxicity in it. The human toll of working with this data may also include negative psychological effects after reading a potentially large amount of it. For these reasons, our goal is to provide a framework for exploring such unstructured data in order to determine the important topics, features, sentiment, and entities involved without the need to manually read all the text, including the capability to automatically redact toxic terminology. The net result is a less harmful environment for people who need to analyse this data to explore these documents and identify documents of interest. We use different state-of-the-art natural language processing and machine learning techniques to design a pipeline that takes in unstructured noisy data and converts it into actionable structured data that incorporates visualisation. We also design a simple and modifiable scoring scheme that combines all the features of the multidimensional analysis and returns a score that can be used as a filtering metric to perform information retrieval on the documents, thus prioritising those that require human intervention. We then provide an evaluation of the resulting system using a range of objective and subjective criteria.
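
To give a concrete idea of what such a modifiable scoring scheme might look like, here is a minimal Python sketch. It is not taken from the thesis itself: the feature names and weights are hypothetical, and each per-document feature score is assumed to be normalised to [0, 1] by the upstream analysis.

```python
# A minimal sketch (hypothetical feature names and weights, not the thesis
# implementation) of a modifiable scoring scheme: per-document scores from
# the multidimensional analysis are combined into one filtering score.

from typing import Dict

WEIGHTS: Dict[str, float] = {
    "toxicity": 0.5,          # e.g. output of a hate-speech classifier
    "negative_sentiment": 0.2,
    "entity_risk": 0.2,       # e.g. presence of targeted entities
    "topic_relevance": 0.1,   # e.g. similarity to topics of interest
}

def document_score(features: Dict[str, float]) -> float:
    """Combine per-dimension scores (assumed to lie in [0, 1]) into one score."""
    return sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)

# Rank documents so that the most concerning ones surface for human review first.
docs = [
    {"toxicity": 0.9, "negative_sentiment": 0.7, "entity_risk": 0.4, "topic_relevance": 0.8},
    {"toxicity": 0.1, "negative_sentiment": 0.3, "entity_risk": 0.0, "topic_relevance": 0.2},
]
ranked = sorted(docs, key=document_score, reverse=True)
print([round(document_score(d), 2) for d in ranked])  # -> [0.75, 0.13]
```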

Thursday, December 12th, 2:00 p.m. TASC1 9408.   Title: Multifaceted NLP Analysis of Hate Speech And Kinetic Actions Descriptions Online

Happy holidays, everyone!

Nadia will give her thesis defence practice talk
08 Dec 2019

In our lab meeting next Tuesday, Nadia will practice her thesis defence. Here is the title and abstract:

Translation versus Language Model Pre-training Objectives for Word Sense Disambiguation

Abstract: Contextual word representations pre-trained on large text data have advanced the state of the art in many tasks in Natural Language Processing. Most recent approaches pre-train such models using a language modelling (LM) objective. In this work, we compare and contrast such LM models with the encoder of an encoder-decoder model pre-trained using a machine translation (MT) objective. For certain tasks such as word sense disambiguation, the MT task provides an intuitively better pre-training objective, since different senses of a word tend to translate differently into a target language, while word senses might not always need to be distinguished under an LM objective. Our experimental results on word sense disambiguation provide insight into pre-training objective functions and can be helpful in guiding future work on large-scale pre-trained models for transfer learning in NLP.
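
As a rough illustration of the kind of setup involved (not Nadia's actual experiments), the sketch below performs nearest-neighbour word sense disambiguation over contextual word vectors; the `encode` function is only a placeholder for whichever pre-trained encoder is being compared (an LM model or the encoder of an MT model).

```python
# A rough sketch of nearest-neighbour word sense disambiguation over
# contextual word vectors. `encode` is a placeholder for the encoder under
# comparison (an LM-pre-trained model or the encoder of an MT model); here
# it returns deterministic random vectors so the script runs on its own.

import numpy as np

def encode(sentence: str, target: str) -> np.ndarray:
    # Placeholder: a real experiment would return the contextual vector of
    # `target` as it occurs in `sentence`, taken from the pre-trained encoder.
    seed = abs(hash((sentence, target))) % (2 ** 32)
    return np.random.default_rng(seed).normal(size=768)

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def disambiguate(sentence: str, target: str, sense_exemplars: dict) -> str:
    """Pick the sense whose exemplar usage is closest to the query usage."""
    query = encode(sentence, target)
    return max(sense_exemplars,
               key=lambda s: cosine(query, encode(sense_exemplars[s], target)))

exemplars = {
    "bank#finance": "She deposited the cheque at the bank.",
    "bank#river": "They sat on the bank of the river.",
}
print(disambiguate("The bank raised its interest rates.", "bank", exemplars))
```

The intuition from the abstract is that an MT-pre-trained encoder should separate the two usages of "bank" in the vector space, since they translate differently into a target language (e.g. "banque" vs. "rive" in French), whereas an LM objective does not directly require that distinction.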

Tuesday, December 10th, 10:30 a.m. TASC1 9408.   Title: Translation versus Language Model Pre-training Objectives for Word Sense Disambiguation

Jetic will give a talk about NMT@ACL2019
02 Dec 2019

In our lab meeting tomorrow, Jetic will give a talk on recent NMT papers, more specifically those tackling translation inconsistencies, model compression, and training efficiency. Here is the title and abstract:

Beyond Chasing After BLEU Scores II

Abstract: Neural machine translation models, while offering great performance on paper, suffer severely in production, where inconsistencies in translation, huge model sizes, and expensive training and inference operations hinder reliability and cost-efficiency. In this presentation (2/2), we look at recent NMT papers published in the ACL 2019 proceedings that may help with these very specific issues.

Tuesday, December 3rd, 10:30 a.m. TASC1 9408.   Title: Beyond Chasing After BLEU Scores II

Hassan will do his PhD Depth Examination
25 Nov 2019

In our lab meeting tomorrow, Hassan will give a depth presentation on Lexical Constraints in NMT. In attendance will be his Ph.D. Examining Committee members, Dr. Anoop Sarkar, Dr. Angel Chang, and Dr. Fred Popowich.

Here is the title and abstract:

Imposing Bilingual Lexical Constraints to Neural Machine Translation

Abstract: *Neural Machine Translation (NMT) models have achieved astonishing results in recent years, and yet infrequent words and out-of-domain terminology remain a problem for them. Bilingual lexical resources can be used to guide the model through difficulties when it faces infrequent and out-of-domain vocabulary words. Hence, imposing bilingual lexical preferences (constraints) on NMT models has received rising attention in the past few years.*

*In this report, we summarize different threads of work on constrained NMT, including approaches that modify the input or output of the model (pre-processing and post-processing) without changing the model itself, as well as approaches that change the inference algorithm and target vocabulary set to satisfy the bilingual lexical constraints.*
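
As a concrete illustration of the pre-/post-processing family mentioned above, here is a minimal sketch under simplifying assumptions (not any specific paper's method): constrained source terms are replaced with placeholder tokens before translation, and the required target-side terms are substituted back afterwards. `fake_translate` is a toy stand-in for a black-box NMT system that copies placeholder tokens through to its output.

```python
# A minimal sketch of the placeholder-based pre-/post-processing approach:
# constrained source terms are replaced with placeholder tokens, the sentence
# is translated by an unmodified NMT system, and the required target terms
# are restored in the output.

from typing import Callable, Dict

def translate_with_constraints(src: str,
                               constraints: Dict[str, str],
                               translate: Callable[[str], str]) -> str:
    placeholders = {}
    for i, (src_term, tgt_term) in enumerate(constraints.items()):
        if src_term in src:
            tag = f"<term{i}>"
            src = src.replace(src_term, tag)        # pre-processing
            placeholders[tag] = tgt_term
    out = translate(src)                            # unchanged NMT model
    for tag, tgt_term in placeholders.items():
        out = out.replace(tag, tgt_term)            # post-processing
    return out

# Toy "NMT system": reverses the word order but keeps placeholder tokens intact.
fake_translate = lambda s: " ".join(reversed(s.split()))

print(translate_with_constraints(
    "the patient received acetaminophen",
    {"acetaminophen": "paracétamol"},
    fake_translate,
))  # -> "paracétamol received patient the"
```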

Tuesday, November 26th, 10:30 a.m. TASC1 9408.   Title: Imposing Bilingual Lexical Constraints to Neural Machine Translation

Jetic will give a talk about NMT@ACL2019
18 Nov 2019

In our lab meeting tomorrow, Jetic will give a talk on recent NMT papers, more specifically those tackling translation inconsistencies, model compression, and training efficiency. Here is the title and abstract:

Beyond Chasing After BLEU Scores I

Abstract: Neural machine translation models, while offering great performance on paper, suffer severely in production, where inconsistencies in translation, huge model sizes, and expensive training and inference operations hinder reliability and cost-efficiency. In this presentation (1/2), we look at recent NMT papers published in the ACL 2019 proceedings that may help with these very specific issues.

Tuesday, November 19th, 10:30 a.m. TASC1 9408.   Title: Beyond Chasing After BLEU Scores I

Recent Publications