Hana-MC: Heading of Arabic News Analysis by Multi-label Classification

Mariem El Abdi, Boutheina Smine, Sadok Ben Yahia*, Hella Kaffel Ben Ayed

*Corresponding author

Publication: Contribution to journal, conference article, research, peer-reviewed


Abstract

The spread of epidemics and diseases across many countries has substantially increased the consumption of news from online sources. Arabic news stories often cover more than one topic, which makes them a natural fit for the multi-label classification paradigm. However, recent studies on multi-label Arabic text classification deal with full news articles, which are rather long texts. We therefore put forward Hana-MC, a large dataset of concise Arabic news focused mainly on the coronavirus and built from various news portals. We conducted a comparative study of several multi-label classification approaches, including algorithm adaptation, problem transformation, and ensemble methods. Experimental results showed that the ensemble method RAKELD with a Random Forest base classifier obtained the best accuracy score.
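As a rough illustration of the idea behind a disjoint random k-labelset ensemble such as the RAKELD method named in the abstract, the sketch below partitions the label set into disjoint groups of size k, trains one label-powerset classifier per group, and reassembles the per-group predictions. It is not the authors' implementation. The class and function names, the toy English headlines, and the four topic labels are all invented for the example, and a simple token-overlap nearest-neighbour learner stands in for the Random Forest base classifier so that the code needs only the standard library.

```python
import random


def partition_labels(n_labels, k, seed=0):
    """Shuffle the label indices and cut them into disjoint groups of at most k."""
    rng = random.Random(seed)
    indices = list(range(n_labels))
    rng.shuffle(indices)
    return [indices[i:i + k] for i in range(0, n_labels, k)]


class OverlapNearestNeighbour:
    """Placeholder base learner (NOT a Random Forest): copies the class of the
    training text that shares the most whitespace-separated tokens."""

    def fit(self, texts, classes):
        self.token_sets_ = [set(t.split()) for t in texts]
        self.classes_ = list(classes)
        return self

    def predict(self, texts):
        predictions = []
        for text in texts:
            tokens = set(text.split())
            best = max(range(len(self.token_sets_)),
                       key=lambda i: len(tokens & self.token_sets_[i]))
            predictions.append(self.classes_[best])
        return predictions


class DisjointLabelsetEnsemble:
    """Hypothetical sketch: one label-powerset model per disjoint label group."""

    def __init__(self, k, base_factory, seed=0):
        self.k = k
        self.base_factory = base_factory
        self.seed = seed

    def fit(self, texts, Y):
        self.n_labels_ = len(Y[0])
        self.groups_ = partition_labels(self.n_labels_, self.k, self.seed)
        self.models_ = []
        for group in self.groups_:
            # Label powerset: each distinct combination of the group's labels
            # becomes one class of an ordinary single-label problem.
            classes = [tuple(row[j] for j in group) for row in Y]
            self.models_.append(self.base_factory().fit(texts, classes))
        return self

    def predict(self, texts):
        out = [[0] * self.n_labels_ for _ in texts]
        for group, model in zip(self.groups_, self.models_):
            for i, combo in enumerate(model.predict(texts)):
                # Write each group's predicted combination back to its columns.
                for j, value in zip(group, combo):
                    out[i][j] = value
        return out


# Invented toy data; the real dataset consists of Arabic headlines.
LABELS = ["health", "economy", "politics", "sport"]
headlines = [
    "vaccine rollout boosts markets",
    "minister announces new lockdown rules",
    "league suspended after virus outbreak",
    "central bank cuts interest rates",
]
Y = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
]

model = DisjointLabelsetEnsemble(k=2, base_factory=OverlapNearestNeighbour)
model.fit(headlines, Y)
prediction = model.predict(["virus outbreak hits league"])[0]
print([name for name, flag in zip(LABELS, prediction) if flag])
# prints ['health', 'sport']
```

Because the groups are disjoint, every label is predicted by exactly one model, so no voting step is needed. An overlapping variant would instead average the votes of several models per label.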

Original language: English
Journal: Procedia Computer Science
Volume: 246
Pages (from-to): 3556-3565
Number of pages: 10
ISSN: 1877-0509
DOI
Status: Published - Nov 2024
Event: 28th International Conference on Knowledge Based and Intelligent information and Engineering Systems, KES 2024 - Seville, Spain
Duration: 11 Nov 2022 to 12 Nov 2022

Conference

Conference: 28th International Conference on Knowledge Based and Intelligent information and Engineering Systems, KES 2024
Country/Territory: Spain
City: Seville
Period: 11/11/2022 to 12/11/2022

Bibliographical note

Publisher Copyright:
© 2024 The Authors.
