Abstract
This paper presents three datasets containing texts in Ancient Greek, manually aligned at word level against translations in English (Grc-Eng), Portuguese (Grc-Por) and Latin (Grc-Lat). The datasets were created by two domain experts through annotation on the Ugarit Translation Alignment Editor (https://ugarit.ialigner.com/). The quality of each dataset was assessed through Inter-Annotator Agreement (IAA), which was above 80%. Each dataset contains the aligned pairs and an Annotation Style Guide, and serves as a Gold Standard for translation alignment of Ancient Greek, for the evaluation of automatic translation alignment models, and as high-quality training data. The Annotation Style Guide provides a starting point for approaching the task of translation alignment in research and teaching. The data is stored on GitHub and Zenodo.
Original language | English |
---|---|
Article number | 22 |
Journal | Journal of Open Humanities Data |
Volume | 9 |
Number of pages | 6 |
ISSN | 2059-481X |
DOI | |
Status | Published - 10 Nov 2023 |