In many morphologically complex languages, morphology encodes information that languages such as English express through syntax involving a series of function words. In such cases, factored phrase-based statistical machine translation (SMT) is not directly applicable, as there is no relation between the morphological structures of words in the source language and the target language. In this talk, we present a scheme to employ factored phrase-based SMT when the morphological structures of the source and target languages are widely disparate, and experiment with it in SMT between English and Turkish.
Kemal Oflazer received his PhD in Computer Science from Carnegie Mellon University and his MS in CS and BS in EE degrees from Middle East Technical University in Ankara, Turkey. He is currently a faculty member at Carnegie Mellon University - Qatar. Prior to this, he was on the faculties of Sabanci University in Istanbul and Bilkent University in Ankara, Turkey. He has held visiting positions at the Computing Research Laboratory at New Mexico State University and at the Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA. He has served on the Editorial Boards of Machine Translation, Computational Linguistics, and the Journal of AI Research, and is currently the Book Reviews Editor for Natural Language Engineering. He also served as the Program Co-chair for the 43rd Annual Meeting of the Association for Computational Linguistics in 2005, in Ann Arbor. His current research interests are in statistical machine translation into morphologically complex languages and in using language processing for language learning applications.