What is Natural Language Processing?

Natural Language Processing (NLP) is the automatic analysis of human languages such as English and Korean by computer algorithms. Unlike artificially created programming languages, where the structure and meaning of programs are easy to encode, human languages pose an interesting challenge, both in their analysis and in the learning of language from observations.

Success in NLP implies great benefits to society. Imagine a world where you can pick up a phone and talk in English, while at the other end of the line your words are spoken in Chinese. Imagine a computer-animated representation of yourself fluently speaking what you have written in an email. Imagine medical experts automatically uncovering protein/drug interactions in gigabytes of medical abstracts, a quantity of text no human could possibly read and summarize. Imagine feeding a computer an ancient script that no living person can remember, then listening as the computer reads aloud in this dead language.

NLP can be used for the transduction of one linguistic form to another or parsing of language into a structured form. Transduction of language involves summarizing, paraphrasing or translating languages. Parsing involves conversion of unstructured data into a structured form, such as speech into text or large text collections like the web into informative labels. Examples of parsing include identifying a group of words as a person’s name or identifying the recursive grammar of a language.

As in other sub-fields of artificial intelligence, there is the issue of what knowledge the computer needs in order to process human language, and how that knowledge is obtained. For example, if the computer is to “learn” this knowledge, what can be learned automatically, and what requires human supervision?

Our lab has a particular focus on research into statistical machine translation and the visual and textual summarization of information contained in natural language.

History of our Lab

The natural language laboratory at Simon Fraser University was founded in 1983 and is one of the larger North American labs working on natural language processing and computational linguistics.

Faculty, students and researchers from the School of Computing Science and the Department of Linguistics use the lab to conduct research in both the theory and applications of natural language processing. Our lab has done research in several areas of NLP including:

Information extraction
Machine translation
Summarization of natural language document collections
Semi-supervised learning of language and language processing tasks
Statistical syntactic and semantic parsing using treebanks
Theory of parsing and probabilistic grammars
Information retrieval
Computer assisted language learning
Natural language interfaces

A more comprehensive description of our current and past research is available in the list of Projects on this web site. Our lab has a strong relationship with the natural language industry in Canada. In 1999, researchers associated with the lab formed a company called Gavagai Inc., which became Axonwave Software Inc. in 2003. The company was founded to commercialize natural language processing technology.

Our lab also has relationships with natural language research groups around the world.

Computational Infrastructure

The lab is located in the TASC-1 Building (Room 9404). We run computational experiments on our departmental Linux grid, which has a large number of nodes, some of them with 32 GB or 64 GB of RAM. We have web servers and file servers for the massive NLP datasets that we work on in our research. We also have several desktop machines in the lab, including Linux, Sun, Windows and Mac workstations.

In addition, lab members have access to WestGrid, a large community grid computing resource in Western Canada.

For Prospective Students and Potential Industry Partners

We are always looking for strong, motivated graduate students. We are also keen on fostering links with industry through joint projects based on research grants and/or student internships that allow graduate students to spend time at companies working on high-risk, high-reward natural language processing projects.

Please contact either of the following faculty members:

If you are a graduate student in the School of Computing Science and would like to know more about the lab and our activities, or if you are a prospective graduate student, please contact us to learn more about the lab and to discuss your research interests. We are looking for qualified and motivated PhD students. Please note, however, that admission to graduate school in the School of Computing Science at SFU is based only on the merit of your academic credentials, not on your potential interest in any particular research lab.

Contact Us

The SFU Natural Language Laboratory and the faculty offices are located on the SFU Burnaby campus in the TASC-1 building (Technology and Science Complex 1), just south of Science Road and north of South Campus Road, east of the South Science Building. The lab is in Suite 9404.

See here for maps and driving and transit instructions. If you come by bus, stay on until the last stop on campus (the Bus Loop next to Lot E in the map). It is a short walk south to TASC-1. If you drive up, you will have to find parking in one of the visitor spots and make your way to the TASC-1 building. The closest visitor parking is Lot VB in the map above.

Fred Popowich’s office number is 9423, on the 2nd floor of the building. Anoop Sarkar’s office number is 9427, on the 2nd floor of the building.

Mailing Address

Natural Language Lab
Simon Fraser University
Technology and Applied Sciences Building: TASC 9404
8888 University Drive
Burnaby, B.C. V5A 1S6

Phone Numbers

Lab phone: +1.778.782.3208
Fax: +1.778.782.3045
Email: nll-contact at sfu dot ca