Abstract
Like many other historical languages, the processing of Classical Arabic is hindered by the absence of adequate training datasets and accurate "off-the-shelf" models that can be readily used in processing pipelines. In this paper, we discuss our ongoing work to develop and train deep learning models specifically designed to handle various tasks related to Classical Arabic texts. We concentrate on Named Entity Recognition, classification of person relationships, toponym classification, detection of onomastic section boundaries, onomastic element classification, and date recognition and classification. Our efforts aim to confront the difficulties tied to these tasks and to deliver effective solutions for analyzing Classical Arabic texts. Though this work is still under development, the preliminary results presented in the paper suggest that the fine-tuned models perform at a satisfactory to excellent level on the tasks for which they were trained.
Original language | English |
---|---|
Title | Ancient Language Processing Workshop |
Editors | Adam Anderson, Shai Gordin, Bin Li, Yudong Liu, Marco C. Passarotti |
Number of pages | 10 |
Publication date | 2023 |
Pages | 160-169 |
ISBN (Print) | 978-954-452-087-8 |
DOI | |
Status | Published - 2023 |
Published externally | Yes |
Event | 14th International Conference on Recent Advances in Natural Language Processing - , Bulgaria Duration: 4 Sep 2023 → 6 Sep 2023 |
Conference
Conference | 14th International Conference on Recent Advances in Natural Language Processing |
---|---|
Country/Territory | Bulgaria |
Period | 4 Sep 2023 → 6 Sep 2023 |