Development of robust NER Models and Named Entity Tagsets for Ancient Greek

Chiara Palladino*, Tariq Yousef

*Corresponding author

Publication: Chapter in book/report/conference proceeding; Conference contribution in proceedings; Research; Peer-reviewed

Abstract

This contribution presents a novel approach to the development and evaluation of transformer-based models for Named Entity Recognition and Classification in Ancient Greek texts. We trained two models with annotated datasets by consolidating potentially ambiguous entity types under a harmonized tagset. Then, we tested their performance with out-of-domain texts, reproducing a real-world use case. Both models performed very well under these conditions, with the multilingual model Ancient Greek Alignment being slightly superior. In the conclusion, we emphasize current limitations due to the scarcity of high-quality annotated corpora and to the lack of cohesive annotation strategies for ancient languages.

Original language: English
Title: 3rd Workshop on Language Technologies for Historical and Ancient Languages, LT4HALA 2024 at LREC-COLING 2024 - Workshop Proceedings
Editors: Rachele Sprugnoli, Marco Passarotti
Publisher: European Language Resources Association (ELRA)
Publication date: 2024
Pages: 89–97
ISBN (electronic): 9782493814463
Status: Published - 2024
Event: 3rd Workshop on Language Technologies for Historical and Ancient Languages, LT4HALA 2024 - Torino, Italy
Duration: 25 May 2024 → …

Conference

Conference: 3rd Workshop on Language Technologies for Historical and Ancient Languages, LT4HALA 2024
Country/Territory: Italy
City: Torino
Period: 25/05/2024 → …
