Enhancing State-of-the-Art NLP Models for Classical Arabic

Tariq Yousef, Lisa Mischer, Hamid Reza Hakimi, Maxim Romanov

Publication: Chapter in book/report/conference proceeding > Conference contribution in proceedings > Research > peer-reviewed


Abstract

Like many other historical languages, Classical Arabic is hindered by the absence of adequate training datasets and accurate "off-the-shelf" models that can be readily used in processing pipelines. In this paper, we discuss our ongoing work to develop and train deep learning models specially designed to manage various tasks related to Classical Arabic texts. We specifically concentrate on Named Entity Recognition, classification of person relationships, toponym classification, detection of onomastic section boundaries, onomastic element classification, as well as date recognition and classification. Our efforts aim to confront the difficulties tied to these tasks and to deliver effective solutions for analyzing Classical Arabic texts. Though this work is still under development, the preliminary results presented in the paper suggest excellent to satisfactory performance of the fine-tuned models, successfully achieving the intended objectives for which they were trained.
Original language: English
Title: Ancient Language Processing Workshop
Editors: Adam Anderson, Shai Gordin, Bin Li, Yudong Liu, Marco C. Passarotti
Number of pages: 10
Publication date: 2023
Pages: 160-169
ISBN (Print): 978-954-452-087-8
DOI
Status: Published - 2023
Published externally: Yes
Event: 14th International Conference on Recent Advances in Natural Language Processing, Bulgaria
Duration: 4 Sep 2023 to 6 Sep 2023

Conference

Conference: 14th International Conference on Recent Advances in Natural Language Processing
Country/Territory: Bulgaria
Period: 04/09/2023 to 06/09/2023
