Abstract
This paper illustrates a workflow for developing and evaluating automatic translation alignment models for Ancient Greek. We designed an annotation style guide and a gold standard for the alignment of Ancient Greek-English and Ancient Greek-Portuguese, measured inter-annotator agreement, and used the resulting dataset to evaluate the performance of various translation alignment models. We proposed a fine-tuning strategy that employs unsupervised training with mono- and bilingual texts and supervised training using manually aligned sentences. The results indicate that the fine-tuned model based on XLM-RoBERTa performs best among the models evaluated, and that it achieves good results on language pairs that were not part of the training data.
Original language | English |
---|---|
Title | Proceedings of the Thirteenth Language Resources and Evaluation Conference |
Editors | Nicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Jan Odijk, Stelios Piperidis |
Publisher | European Language Resources Association |
Publication date | Jun 2022 |
Pages | 5894-5905 |
ISBN (Electronic) | 9791095546726 |
Status | Published - Jun 2022 |
Published externally | Yes |
Event | 13th Conference on Language Resources and Evaluation - Palais du Pharo, Marseille, France. Duration: 20 Jun 2022 → 25 Jun 2022 |
Conference
Conference | 13th Conference on Language Resources and Evaluation |
---|---|
Location | Palais du Pharo |
Country/Territory | France |
City | Marseille |
Period | 20/06/2022 → 25/06/2022 |