Peter Venable
Thesis Title:
Modeling Syntax for Parsing and Translation
Degree Type:
Ph.D. in Computer Science
Advisor(s):
John Lafferty
Graduated:
December 2003
Abstract:
Syntactic structure is an important component of natural language utterances, for both form and content. Therefore, a variety of applications can benefit from the integration of syntax into their statistical models of language. In this thesis, two new syntax-based models are presented, along with their training algorithms: a monolingual generative model of sentence structure, and a model of the relationship between the structure of a sentence in one language and the structure of its translation into another language. After these models are trained and tested on the respective tasks of monolingual parsing and word-level bilingual corpus alignment, they are demonstrated in two additional applications. First, a new statistical parser is automatically induced for a language in which none was available, using a bilingual corpus. Second, a statistical translation system is augmented with syntax-based models. Thus the contributions of this thesis include: a statistical parsing system; a bilingual parsing system, which infers a structural relationship between two languages using a bilingual corpus; a method for automatically building a parser for a language where no parser is available; and a translation model that incorporates phrase structure.
Thesis Committee:
John Lafferty (Chair)
Daniel Sleator
Jaime Carbonell
Michael Collins (MIT)
Randy Bryant, Head, Computer Science Department
James Morris, Dean, School of Computer Science
Keywords: Statistical, syntax, parsing, translation
CMU-CS-03-216.pdf (2.55 MB) (130 pages)