Abstract
This paper describes and evaluates a rule-based system implementing a novel method for quote attribution in Portuguese text, working on top of a Constraint-Grammar parse. Both direct and indirect speech are covered, as well as certain other text-embedded quote sources. In a first step, the system performs quote segmentation and identifies speech verbs, taking into account the different styles used in literature and news text. Speakers are then identified using syntactically and semantically grounded Constraint-Grammar rules. We rely on relational links and stream variables to handle anaphorical mentions and to recover the names of implied or underspecified speakers. In an evaluation including both literature and news text, the system performed well on both the segmentation and attribution tasks, achieving F-scores of 98-99% for the former and 89-94% for the latter.
Original language | English |
---|---|
Title of host publication | Proceedings of CG-MTA 2023 : Constraint Grammar Workshop at NoDaLiDa 2023, Thórshavn |
Publisher | Association for Computational Linguistics (ACL) |
Publication date | 2023 |
Pages | 1-9 |
Publication status | Published - 2023 |