For contact information, see the contact page.
Natural Language Processing (NLP) is the automatic analysis of human languages, such as English or Korean, by computer algorithms. Unlike artificially created programming languages, where the structure and meaning of programs are easy to encode, human languages pose an interesting challenge, both in their analysis and in the learning of language from observations.
Success in NLP implies great benefits to society. Imagine a world where you can pick up a phone and talk in English, while at the other end of the line your words are spoken in Chinese. Imagine a computer-animated representation of yourself fluently speaking what you have written in an email. Imagine medical experts automatically uncovering protein/drug interactions in gigabytes of medical abstracts, a quantity of text no human could possibly read and summarize. Imagine feeding a computer an ancient script that no living person can remember, then listening as the computer reads aloud in this dead language.
NLP can be used to transduce one linguistic form into another or to parse language into a structured form. Transduction involves summarizing, paraphrasing or translating language. Parsing involves converting unstructured data into a structured form, such as speech into text, or large text collections like the web into informative labels. Examples of parsing include identifying a group of words as a person’s name or identifying the recursive grammar of a language.
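As a toy illustration of the parsing example above (identifying a group of words as a person's name), here is a minimal Python sketch. It is an assumption made purely for illustration, not a method used by the lab: the `TITLES` list and the `find_person_names` function are hypothetical, and the rule (a run of capitalized words following an honorific) is far simpler than the statistical approaches used in practice.

```python
import re

# Hypothetical list of honorifics; a title is taken as a cue that a name follows.
TITLES = ("Dr.", "Prof.", "Mr.", "Ms.", "Mrs.")

# One or more capitalized words, separated by whitespace, directly after a title.
_TITLE_ALTERNATIVES = "|".join(re.escape(title) for title in TITLES)
_NAME_PATTERN = re.compile(
    r"(?:%s)\s+([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)" % _TITLE_ALTERNATIVES
)


def find_person_names(text):
    """Return the capitalized word groups that follow a title in `text`."""
    return [match.group(1) for match in _NAME_PATTERN.finditer(text)]


if __name__ == "__main__":
    sentence = "Yesterday Dr. Ada Lovelace met Prof. Alan Turing in London."
    print(find_person_names(sentence))
```

A rule like this misses any name that does not follow a title and would mislabel a capitalized non-name that does, which is one reason the question of what a system can learn automatically from data matters.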
As in other sub-fields of artificial intelligence, there is the issue of what knowledge the computer needs to process human language, and how that knowledge is obtained. For example, if the computer is to “learn” this knowledge, what can be learned automatically, and what can be achieved with human supervision?
The natural language laboratory at Simon Fraser University was founded in 1983 and is one of the larger North American labs working on natural language processing and computational linguistics.
Faculty, students and researchers from the School of Computing Science and the Department of Linguistics use the lab to conduct research in both the theory and applications of natural language processing. Our lab has done research in several areas of NLP, including: information extraction, machine translation, summarization of natural language document collections, semi-supervised learning of language and language processing tasks, statistical syntactic and semantic parsing using treebanks, theory of parsing and probabilistic grammars, information retrieval, computer-assisted language learning, and natural language interfaces.
A more comprehensive description of our current and past research is available in the list of Projects on this web site. Our lab has a strong relationship with the natural language industry in Canada. In 1999, researchers associated with the lab formed a company called Gavagai Inc., which became Axonwave Software Inc. in 2003. The company was founded to commercialize natural language processing technology.
Our lab also has relationships with natural language research groups around the world.
We are always looking for motivated, capable graduate students. We are also keen on fostering links with industry for joint projects based on research grants and/or student internships that allow graduate students to spend time at companies doing high-risk, high-reward natural language processing projects.
Please contact either of the following faculty members:
If you are a graduate student in the School of Computing Science and would like to know more about the lab and our activities, or if you are a prospective graduate student, please contact us for more information on the lab and to discuss your research interests. We are looking for qualified and motivated PhD students. Please note, however, that if you apply to graduate school in the School of Computing Science at SFU, you are admitted only on the merit of your academic credentials and not on your potential interest in any particular research lab.
The lab is located in the TASC-1 Building (Room 9404). We run computational experiments on our departmental Linux grid, which has a large number of nodes, some of them with 32 GB or 64 GB of RAM. We have web servers and file servers for the massive NLP datasets that we work on in our research. We also have several desktop machines in the lab, including Linux, Sun, Windows and Mac workstations.
In addition, lab members have access to Westgrid, a large community grid computing resource in Western Canada.