News

Optimizing Multivariate Performance Measures for Learning Relation Extraction Models
02 Jul 2015

Dr. Reza Haffari will visit our lab tomorrow, Friday 3rd of July. He will give a talk on Optimizing Multivariate Performance Measures for Learning Relation Extraction Models. The talk will be at 1 PM in TASC1 9204 West. Here are the abstract and a short bio:

Title: Optimizing Multivariate Performance Measures for Learning Relation Extraction Models

Abstract: We describe a novel max-margin learning approach to optimize non-linear performance measures for distantly-supervised relation extraction models. Our approach can be used more generally to learn latent variable models under multivariate non-linear performance measures, such as the Fβ-score. Our approach interleaves the Concave-Convex Procedure (CCCP) for populating latent variables with dual decomposition to factorize the original hard problem into smaller independent sub-problems. The experimental results demonstrate that our learning algorithm is more effective than the ones commonly used in the literature for distant supervision of information extraction models. Under several data conditions, we show that our method outperforms the baseline, with up to an 8.5% improvement in F1-score.
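The outer loop the abstract describes, alternating between imputing latent variables (the concave step) and solving a convex learning sub-problem, can be sketched in miniature. This is an illustrative toy only: the feature map, data, and a latent structured-perceptron update are assumptions standing in for the talk's max-margin dual-decomposition inner solver.

```python
# Toy sketch of a CCCP-style loop for latent-variable learning.
# A perceptron-style update replaces the max-margin inner solver;
# phi is a made-up indicator feature map for illustration.

def phi(x, y, z):
    # joint feature map: one indicator feature per (input, label, latent) triple
    return {(x, y, z): 1.0}

def score(w, feats):
    return sum(w.get(k, 0.0) * v for k, v in feats.items())

def cccp_train(data, labels, latents, epochs=10):
    w = {}
    for _ in range(epochs):
        for x, y in data:
            # concave step: impute the best latent value for the gold label
            z_star = max(latents, key=lambda z: score(w, phi(x, y, z)))
            # convex step: find the highest-scoring (label, latent) pair overall
            y_hat, z_hat = max(((yy, zz) for yy in labels for zz in latents),
                               key=lambda p: score(w, phi(x, p[0], p[1])))
            if (y_hat, z_hat) != (y, z_star):
                # update toward the gold structure, away from the prediction
                for k, v in phi(x, y, z_star).items():
                    w[k] = w.get(k, 0.0) + v
                for k, v in phi(x, y_hat, z_hat).items():
                    w[k] = w.get(k, 0.0) - v
    return w
```

The key structural point is the interleaving: the latent variables are fixed before each convex update, which is what makes the inner problem tractable enough to decompose.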

Bio: Reza Haffari is an Assistant Professor in the Faculty of IT, Monash University. His research is at the intersection of Machine Learning and Natural Language Processing (NLP). His primary research is developing new models and learning algorithms for real-life problems, particularly those that arise in NLP. This includes topics like structured prediction, domain adaptation, and semi-supervised learning for problems such as machine translation, parsing, language modelling, and information extraction.

Training in Big Text Data Workshop
29 Apr 2015

On Wed 4/28 we will have an extended “lab meeting” with special guests Evangelos Milios and Axel Soto from Dalhousie University, who will be joining us for a collection of presentations and discussion in the natural language lab.

Schedule for the Training in Big Text Data Workshop, 9:30 am to 2:30 pm:

  • 9:30 – 12:00: Student Presentations (Ellert, Odilinye, Marques, Sabharwal, Tofiloski)
  • 12:00 – 1:30: Lunch
  • 1:30 – 2:30: General Research Discussions

New Features of Lensing Wikipedia + Apache Spark
14 Apr 2015

In the lab meeting this week, 15th of April, Anoop will give a demo of the new features in the Lensing Wikipedia project. In the second half of the meeting, Anoop will talk about how to exploit Apache Spark for distributed computing. The meeting will be at the usual location and time.

Glm-parser Project
08 Apr 2015

In the lab meeting this week, 8th of April, Ziqi Wang and Yulan Huang will talk about the glm-parser project, an effort to write a state-of-the-art dependency parser in Python and Cython. They will discuss the design of the project, take a brief look at the algorithms, and describe the speed-ups they obtained through better algorithms, the use of Cython, and other optimizations. The project is hosted on GitHub at:

glm-parser

Graph-based Semi-supervised Learning
30 Mar 2015

Golnar is going to give a talk about graph-based semi-supervised learning in the lab meeting this week. The meeting is on Wednesday, 1st of April, at 1:30 pm.

Abstract: “Semi-supervised learning (SSL) brings the best of supervised and unsupervised learning together: it takes advantage of labelled data when available, while using information hidden in the usually abundant unlabelled data.
Graph-based SSL has frequently beaten other SSL approaches in the past, and has been applied to many NLP applications: POS-tagging, dependency parsing, and semantic analysis, to name a few. It encourages similar data points to take similar labels even if they appear far from each other in the training data (e.g., across sentences).
In this talk, I will cover the basics of graph-based SSL, such as graph construction, graph propagation, and inductive vs. transductive methods, using POS-tagging as a running example task.”
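The graph-propagation idea in the abstract, letting labels flow from seed nodes to similar neighbours, can be illustrated with a minimal iterative label-propagation sketch. The graph, seed labels, and row-normalized update rule below are simplifying assumptions in the spirit of standard graph-based SSL, not the specific method of the talk.

```python
# Minimal sketch of iterative label propagation on a similarity graph.
# Seed (labelled) nodes are clamped after every step; unlabelled nodes
# absorb the label scores of their neighbours until convergence.

import numpy as np

def label_propagation(W, Y, labeled, iters=100):
    """W: (n, n) symmetric similarity matrix; Y: (n, k) one-hot label
    scores (zero rows for unlabelled nodes); labeled: boolean seed mask."""
    P = W / W.sum(axis=1, keepdims=True)  # row-normalize into transitions
    F = Y.astype(float).copy()
    for _ in range(iters):
        F = P @ F                  # propagate neighbours' label scores
        F[labeled] = Y[labeled]    # clamp the seed labels each iteration
    return F.argmax(axis=1)
```

On a four-node chain with the two endpoints labelled with different classes, the two interior nodes each take the label of their nearer endpoint, which is exactly the “similar points take similar labels” behaviour the abstract describes.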

Recent Publications