| Speaker: | Owen Rambow |
| | CoGenTex Co., Ithaca, USA |
| Date: | Friday 3rd September 1999 |
| Time: | 13:00-14:00 |
| Place: | E6A 102, Macquarie University |
This event is co-hosted with the Department of Computing.
Abstract: In this talk, I explore the notion of syntactic dependency, a fundamental notion in syntax dating at least to Tesnière's work in the 1930s. In a dependency representation of the syntax of a sentence, only words are represented; there are no nonterminal categories. A word w_1 depends on another word w_2 if, roughly, w_2 licenses the presence of w_1 in the sentence. We can define a "deep-syntactic representation" (based on a similar notion in Mel'čuk's Meaning-Text Theory) in which only full lexical words, and no function words, are represented as nodes. In addition, word order is not represented. At this level, almost all of the language-specific details of syntax have been abstracted away, and we have a very general yet still syntactic representation.
In the first part of the talk, I will argue that this level of representation is ideally suited for applications: it expresses in a direct manner the crucial syntactic relations which are needed in natural language processing applications, such as machine translation or information extraction. At the same time, it avoids the need for developing a semantic representation and full-scale semantic resources, which is difficult and time-consuming. I will illustrate this point by comparing the use of a deep-syntactic representation in machine translation with the use of a semantically based interlingua.
In the second part of the talk, I will discuss formal (mathematical) models for dependency representations. Given the usefulness of this representation, it is surprising that no good formal models exist. The issue is complex because of so-called "non-projective" constructions, in which a dependency representation does not map to the surface string of the sentence without "tangled branches". There is a natural affinity between lexicalized phrase-structure formalisms such as Lexicalized Tree Adjoining Grammar (LTAG: Joshi 1988) and dependency representations. In fact, LTAG goes a long way towards providing a mathematical model for syntactic dependency. However, it fails to correctly model the dependency between words in general, instead concentrating on deriving correct phrase structure. I will present a new formalism under development, D-Tree Substitution Grammar (DSG; Rambow, Vijay-Shanker & Weir 1995; 1999), which can be motivated directly from the desire to model dependency. I will illustrate several cases which pose problems for LTAG but which DSG can derive while maintaining proper dependencies.
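The "tangled branches" of a non-projective construction can be made concrete with a small check. The following is only a minimal sketch, not part of the talk: it assumes a surface dependency tree encoded as a list of head positions, and the function name `is_projective` and the example analysis are illustrative assumptions. A tree is treated as projective when every word lying between a head and one of its dependents is itself dominated by that head.

```python
from typing import List, Optional


def is_projective(heads: List[Optional[int]]) -> bool:
    """Return True if the dependency tree has no "tangled branches".

    heads[i] is the position of the word that word i depends on,
    or None if word i is the root. The encoding is a hypothetical
    one chosen for this sketch; the input is assumed to be a
    well-formed tree (exactly one root, no cycles).
    """

    def dominated_by(word: int, ancestor: int) -> bool:
        # Walk up the chain of heads from `word`, looking for `ancestor`.
        current: Optional[int] = word
        while current is not None:
            if current == ancestor:
                return True
            current = heads[current]
        return False

    for dependent, head in enumerate(heads):
        if head is None:
            continue  # the root has no incoming arc to check
        lo, hi = sorted((dependent, head))
        # Every word strictly between head and dependent must belong
        # to the head's subtree; otherwise the arc crosses another one.
        if not all(dominated_by(k, head) for k in range(lo + 1, hi)):
            return False
    return True


# "John saw Mary": both nouns depend on the verb.
simple = [1, None, 1]

# "A hearing is scheduled on the issue today", under one illustrative
# analysis in which "on" depends on "hearing" across the verb "is".
extraposed = [1, 2, None, 2, 1, 6, 4, 3]

print(is_projective(simple))
print(is_projective(extraposed))
```

Under the analysis assumed here, the arc from "hearing" to "on" spans the verb "is", which "hearing" does not dominate, so the second tree is reported as non-projective.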
About the Presenter: Owen Rambow obtained his Ph.D. in computational linguistics from the University of Pennsylvania in 1994, and has worked on automatic text generation technology since 1987, participating in the design and implementation of several text generation systems. He has also worked in natural language understanding, as well as formal and mathematical linguistics.
Dr. Rambow was a co-founder of CoGenTex (Ithaca) with Dr. Richard Kittredge and Dr. Tanya Korelsky in 1990. Since that time he has conducted research on theoretical issues in syntax as well as applications. His research interests include the notion of syntactic dependency, formal models of syntax, computational models of parsing, and the proper linguistic representation of word order. His work on text generation has concentrated on applications in the software engineering domain and the role of domain-specific and sublanguage information.
Dr. Rambow has been responsible for CoGenTex's cooperation with the University of Pennsylvania, aimed at transitioning parsing technology developed there into robust application systems. He is also project leader for the machine translation project, as well as for the KACS project.
Enquiries: sals@mri.mq.edu.au
| Last modified: August, 1999 |