News

Mohammad Mahdi Soleimani Nasab MSc Thesis Defence
07 Apr 2016

On April 1st at 10:30am, Mahdi defended his thesis, “On the Importance of Decoding in Semi-supervised Learning”.

Abstract:

In many natural language processing (NLP) tasks, a large amount of unlabelled data is available while labelled data is hard to obtain. Bootstrapping techniques have been shown to be very successful on a variety of NLP tasks using only a small amount of supervision. In this research we have studied different bootstrapping techniques that separate the training step of the algorithm from the decoding step, which produces the argmax label on test data. We then explore generative models trained in the conventional way using the EM algorithm, but with an initialization step and a decoding technique similar to those of the Yarowsky bootstrapping algorithm. The new model is tested on named entity classification and word sense disambiguation tasks and has shown significant improvement over previous generative approaches.

Mahdi Soleimani Thesis Practice Talk
31 Mar 2016

Today Mahdi will give a practice talk for his master's thesis defence during the lab meeting.

Abstract: In many natural language processing (NLP) tasks, a large amount of unlabelled data is available while labelled data is hard to obtain. Bootstrapping techniques have been shown to be very successful on a variety of NLP tasks using only a small amount of supervision. In this research we have studied different bootstrapping techniques that separate the training step of the algorithm from the decoding step, which produces the argmax label on test data. We then explore generative models trained in the conventional way using the EM algorithm, but with an initialization step and a decoding technique similar to those of the Yarowsky bootstrapping algorithm. The new approach is tested on named entity classification and word sense disambiguation tasks and has shown significant improvement over previous generative models.

Dr. Orland Hoeber's Research Talk
03 Mar 2016

Dr. Orland Hoeber is visiting us from the University of Regina. He is going to give a talk about his research during our lab meeting at 11:30 on March 3rd. His primary research interests are in the fields of information visualization, visual analytics, geovisual analytics, web/image search interfaces, web search personalization, human-computer interaction, mobile computing, and web intelligence.

Transition-Based Dependency Parsing by Juneki Hong
24 Feb 2016

Juneki Hong is a visiting student from CMU. He will give a talk about his research during the lab meeting tomorrow (Thursday) at 11:30. He will talk about Transition-Based Dependency Parsing: how to introduce backtracking revision operators to the Arc-Eager action set, and how you could train a parser via coaching. He will also discuss speed-accuracy tradeoffs and beam search, and show some preliminary results.

Mahdi Soleimani Thesis Talk
02 Feb 2016

During our lab meeting on February 4th, Mahdi Soleimani will talk about his MSc thesis research on bootstrapping classifiers with limited amounts of training data. Abstract: For many NLP tasks, a large amount of unlabelled data is available while labelled data is hard to obtain. Bootstrapping techniques have been shown to be very successful on different NLP tasks using only a small amount of supervision (labelled data) alongside a large set of unlabelled data. While most previous research and algorithms focus on the parameter estimation step in bootstrapping, here we have studied the decoding step (classification using the estimated parameters). We show that by using different decoding techniques, similar to the decoding step in the Yarowsky algorithm, a simple EM algorithm can achieve the same results as more complicated learning approaches.
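
As a rough illustration of why the decoding step can matter independently of parameter estimation, the sketch below applies two decoding rules to the same fixed parameters. It is not taken from the thesis: the feature names, labels, probabilities, and the functions `decode_sum` and `decode_max` are all hypothetical, and the "decision-list style" rule is only an assumed reading of what decoding "similar to the Yarowsky algorithm" might look like (labelling by the single most confident feature).

```python
import math

# Hypothetical toy parameters P(feature | label), standing in for what an
# EM-trained generative model might estimate. All numbers are invented.
FEATURE_GIVEN_LABEL = {
    "person":  {"Mr.": 0.50, "Inc.": 0.10, "Corp.": 0.10, "other": 0.30},
    "company": {"Mr.": 0.10, "Inc.": 0.30, "Corp.": 0.30, "other": 0.30},
}
PRIOR = {"person": 0.5, "company": 0.5}


def decode_sum(features):
    """Conventional argmax decoding: combine evidence from all features."""
    def score(label):
        log_likelihood = sum(
            math.log(FEATURE_GIVEN_LABEL[label][f]) for f in features
        )
        return math.log(PRIOR[label]) + log_likelihood
    return max(PRIOR, key=score)


def posterior(label, feature):
    """P(label | feature) for a single feature, via Bayes' rule."""
    joint = {l: PRIOR[l] * FEATURE_GIVEN_LABEL[l][feature] for l in PRIOR}
    return joint[label] / sum(joint.values())


def decode_max(features):
    """Decision-list style decoding (assumed): the single most confident
    (feature, label) pair decides the label."""
    best_score, best_label = max(
        (posterior(label, f), label) for label in PRIOR for f in features
    )
    return best_label


example = ["Mr.", "Inc.", "Corp."]
print(decode_sum(example))  # two weaker "company" cues outvote "Mr."
print(decode_max(example))  # the single strongest cue, "Mr.", decides
```

With these invented numbers the two rules disagree on the same example even though the parameters are identical, which is the kind of effect the abstract attributes to the choice of decoding technique.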

Recent Publications