02 Mar 2021

In our lab meeting tomorrow, Golnar will practice her PhD seminar talk on NER and word sense disambiguation.

Investigations into the Value of Labeled and Unlabeled Data in Biomedical Entity Recognition and Word Sense Disambiguation

Abstract: Human annotations, especially in highly technical domains, are expensive and time consuming to gather, and can also be erroneous. As a result, we never have sufficiently accurate data to train and evaluate supervised methods. In this thesis, we address this problem by taking a semi-supervised approach to biomedical named entity recognition (NER), and by proposing an inventory-independent evaluation framework for supervised and unsupervised word sense disambiguation.

Our contributions are as follows:

• We introduce a novel graph-based semi-supervised approach to named entity recognition (NER) and exploit pre-trained contextualized word embeddings in several biomedical NER tasks.
• We propose a new evaluation framework for word sense disambiguation that permits a fair comparison between supervised methods trained on different sense inventories as well as unsupervised methods without a fixed sense inventory.

Tuesday, Mar 2nd, 09:30 a.m.