02 Feb 2016

On February 4th, Mahdi Soleimani will talk about his MSc thesis research on bootstrapping classifiers with limited amounts of training data during our lab meeting. Abstract: For many NLP tasks, a large amount of unlabelled data is available while labelled data is hard to obtain. Bootstrapping techniques have been shown to be very successful on a range of NLP tasks, using only a small amount of supervision (labelled data) alongside a large set of unlabelled data. While most previous research has focused on the parameter estimation step in bootstrapping, here we study the decoding step (classification using the estimated parameters). We show that, by using different decoding techniques similar to the decoding step in the Yarowsky algorithm, a simple EM algorithm can achieve the same results as more complicated learning approaches.