Enhancing State-of-the-Art NLP Models for Classical Arabic

Tariq Yousef, Lisa Mischer, Hamid Reza Hakimi, Maxim Romanov

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

Like many other historical languages, Classical Arabic is hindered by the absence of adequate training datasets and accurate "off-the-shelf" models that can be readily used in processing pipelines. In this paper, we discuss our ongoing work to develop and train deep learning models specially designed to manage various tasks related to Classical Arabic texts. We specifically concentrate on Named Entity Recognition, classification of person relationships, toponym classification, detection of onomastic section boundaries, onomastic element classification, as well as date recognition and classification. Our efforts aim to confront the difficulties tied to these tasks and to deliver effective solutions for analyzing Classical Arabic texts. Though this work is still under development, the preliminary results presented in the paper suggest excellent to satisfactory performance of the fine-tuned models, successfully achieving the intended objectives for which they were trained.

Original language: English
Title of host publication: Ancient Language Processing Workshop
Editors: Adam Anderson, Shai Gordin, Bin Li, Yudong Liu, Marco C. Passarotti
Number of pages: 10
Publication date: 2023
Pages: 160-169
ISBN (Print): 978-954-452-087-8
Publication status: Published - 2023
Externally published: Yes
Event: 14th International Conference on Recent Advances in Natural Language Processing, Bulgaria
Duration: 4 Sept 2023 - 6 Sept 2023

Conference

Conference: 14th International Conference on Recent Advances in Natural Language Processing
Country/Territory: Bulgaria
Period: 04/09/2023 - 06/09/2023
