SALS-SIG Research Seminar

TEI Encoding and Syntactic Tagging of an Old French text


Speaker:

Dominique Estival

University of Melbourne
Date: Tuesday 10th November 1998
Time: 11:30 - 12:30
Place: Seminar room 357, Building E6A, Macquarie University

Abstract:

In this talk I will report on one of the concrete outcomes of a research project on the Computational Modelling of Syntactic Change, undertaken at the University of Melbourne. In this part of the project, we are collecting and encoding historical texts and tagging them for syntactic analysis. We have so far produced a TEI-conformant version of an Old French text, La Vie de Saint Louis, written by Jehan de Joinville around 1305, and we are in the process of adding syntactic tags to this text. These syntactic tags are derived from the Penn-Helsinki coding scheme, which was devised for the syntactic encoding of Middle English texts, and they have been translated into TEI.

The talk thus addresses two issues: the development of a TEI encoding for the text, and the adaptation of the Penn-Helsinki syntactic coding scheme. The first part of this work raises issues of a textual nature that are independent of the language of the text, and proposes concrete, immediate solutions. The second part points to a more general extension of the Penn-Helsinki tagset to other types of texts and to other languages.

In the long run, the whole project should lead to a better understanding of two universal characteristics of language: language variation and language change. I will argue that an understanding of the limits and types of variation in language is necessary to deal adequately with actual language, and that ultimately the integration of language variability in NLP modelling should result in models and applications that are closer to the behavior of humans in real language situations.


Enquiries: sals@mri.mq.edu.au

Last modified: November, 1998