Publications

  • Sankaran, B., Sarkar, A., & Duh, K. (2013). Multi-Metric Optimization Using Ensemble Tuning. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 947–957). Atlanta, Georgia: Association for Computational Linguistics.
  • Siahbani, M., Sankaran, B., & Sarkar, A. (2013). Efficient Left-to-Right Hierarchical Phrase-Based Translation with Improved Reordering. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (pp. 1089–1099). Seattle, Washington, USA: Association for Computational Linguistics.
  • Razmara, M., & Sarkar, A. (2013). Stacking for Statistical Machine Translation. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 334–339). Sofia, Bulgaria: Association for Computational Linguistics.
  • Razmara, M., Siahbani, M., Haffari, R., & Sarkar, A. (2013). Graph Propagation for Paraphrasing Out-of-Vocabulary Words in Statistical Machine Translation. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1105–1115). Sofia, Bulgaria: Association for Computational Linguistics.
  • Razmara, M., Siahbani, M., & Sarkar, A. (2013). Ensemble Triangulation for Statistical Machine Translation. In Proceedings of the 6th International Joint Conference on Natural Language Processing (pp. 252–260). Nagoya, Japan: Association for Computational Linguistics.

News

Biomedical Text Mining, visit from BC Genome Sciences Centre 07 Apr 2014

Lab meeting on Wed 4/9 at 10:30 (note the new time) in TASC1 9208.

A group of researchers from the BC Genome Sciences Centre will visit us this week to talk about Biomedical Text Mining and Artificial Intelligence Applied to Clinical Reporting.

The group includes Inanc Birol, a Senior Scientist at the BC Genome Sciences Centre, and Victoria Stuart, who will present on the following topic to guide our discussion.

We are a newly formed group (Jan. 2014) within the GSC Bioinformatics Technology Lab that seeks to apply a combined natural language processing, text mining, computational linguistics, machine learning, and artificial intelligence (NLP/TM/CL/ML/AI) approach to annotating clinical reports with (i) the clinical data generated through our sequencing pipeline, and (ii) the relevant biomedical literature. As this is being done in a clinical setting, quality and accuracy (genomic sequencing and analyses), turnaround time (speed), and standardized reporting are critically important. Our BTL-AI group is addressing the clinical reporting component by developing protocols to retrieve the biomedical literature relevant to our clinical reports, which will be provided to our end users as annotations that include quality assessments. In the broader sense, our work is relevant to the information overload affecting all researchers, including the sheer volume of published material such as that provided by PubMed, which currently contains >24 million biomedical records, with new records being added at a rate of ~1 million records/year. Through our project, we will develop and provide the tools needed to automatically search, retrieve, analyze, condense and focus the scientific literature, transparently returning information in an indicated domain together with confidence measures. We have been reviewing the current state of the art in this topic area (NLP/TM/CL/ML/AI) and have identified leading, top-performing tools and resources developed through community efforts (e.g. the BioNLP Shared Task challenges) that we are currently bringing into our lab for evaluation and, as needed, extension (customization).

t-Distributed Stochastic Neighbour Embedding (t-SNE) 02 Apr 2014

The topic of tomorrow's lab meeting, April 2, is t-distributed Stochastic Neighbor Embedding. We are going to try something different this time: please watch the following talk before the meeting. We will then go through the details of the model (i.e., the equations) and discuss the results.

t-SNE

The meeting will be at the usual time and place: TASC1 9408, starting at 11:30.

Schloss Dagstuhl seminar on SMT into morphologically rich languages 26 Mar 2014

In the lab meeting tomorrow, Ann will talk about her experience attending the Schloss Dagstuhl seminar on SMT into morphologically rich languages. The details of the seminar can be found here.

No lab meeting tomorrow 12 Mar 2014

In honor of the ACL short paper deadline, there will not be a lab meeting tomorrow. We will resume next week.

Hierarchical Browsing for LensingWikipedia 05 Mar 2014

In the lab meeting tomorrow, March 5th, Anoop will discuss possible short papers for the upcoming ACL deadline. Anoop will also discuss a proposal for hierarchical faceted browsing for the LensingWikipedia project.