Abstract

This paper describes an implemented program that takes a raw, untagged text corpus as its only input (no open-class dictionary) and generates a partial list of verbs occurring in the text and the subcategorization frames (SFs) in which they occur. Verbs are detected by a novel technique based on the Case Filter of Rouvret and Vergnaud (1980). The completeness of the output list increases monotonically with the total number of occurrences of each verb in the corpus. False positive rates are one to three percent of observations. Five SFs are currently detected and more are planned. Ultimately, I expect to provide a large SF dictionary to the NLP community and to train dictionaries for specific corpora.

Original language: English
Pages (from-to): 209-214
Number of pages: 6
Journal: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 1991-June
State: Published - 1991
Event: 29th Annual Meeting of the Association for Computational Linguistics, ACL 1991 - Berkeley, United States
Duration: Jun 18 1991 to Jun 21 1991

Title: Automatic acquisition of subcategorization frames from untagged text
