ARCID: A new approach to deal with imbalanced datasets classification

  • Safa Abdellatif*
  • Mohamed Ali Ben Hassine
  • Sadok Ben Yahia
  • Amel Bouzeghoub

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Article in proceedings > Research > peer-review

Abstract

Classification is one of the most fundamental and well-known tasks in data mining. Class imbalance is the most challenging issue encountered when performing classification; it arises when the number of instances belonging to the class of interest (the minor class) is much lower than that of the other classes (the major classes). The class imbalance problem has become increasingly pronounced as machine learning algorithms are applied to real-world applications such as medical diagnosis, text classification, and fraud detection. Standard classifiers may yield very good results on the majority classes. However, such classifiers yield poor results on the minority classes, since they assume a relatively balanced class distribution and equal misclassification costs. To overcome this problem, we propose in this paper a novel associative classification algorithm called Association Rule-based Classification for Imbalanced Datasets (ARCID). This algorithm aims to extract significant knowledge from imbalanced datasets by emphasizing information extracted from minor classes without drastically impacting the predictive accuracy of the classifier. Experiments on five datasets obtained from the UCI repository have been conducted with reference to four assessment measures. Results show that ARCID outperforms standard algorithms. Furthermore, it is very competitive with Fitcare, which is a class-imbalance-insensitive algorithm.
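
The abstract's point that a classifier can look good on the majority classes while failing on the minority class can be illustrated with a minimal Python sketch. This is not ARCID or Fitcare: it uses a trivial always-predict-the-majority baseline, and the labels, the 95/5 split, and the helper names (`majority_baseline`, `accuracy`, `recall`) are assumptions made up purely for illustration.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return a classifier that always predicts the most frequent training label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return lambda _instance: most_common_label


def accuracy(y_true, y_pred):
    """Fraction of all instances whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of instances whose true class is `label` that were predicted as `label`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in relevant) / len(relevant)


# Synthetic toy labels (hypothetical, not from the paper): 95 "major", 5 "minor".
y_true = ["major"] * 95 + ["minor"] * 5
predict = majority_baseline(y_true)
y_pred = [predict(None) for _ in y_true]

print(accuracy(y_true, y_pred))         # 0.95
print(recall(y_true, y_pred, "major"))  # 1.0
print(recall(y_true, y_pred, "minor"))  # 0.0
```

Overall accuracy is high even though no minority instance is ever recognized, which is why imbalance-aware work typically reports per-class measures alongside accuracy.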

Original language: English
Title of host publication: SOFSEM 2018: Theory and Practice of Computer Science - 44th International Conference on Current Trends in Theory and Practice of Computer Science, Proceedings
Editors: Jiří Wiedermann, A Min Tjoa, Stefan Biffl, Ladjel Bellatreche, Jan van Leeuwen
Publisher: Springer
Publication date: 2018
Pages: 569-580
ISBN (Print): 9783319731162
Publication status: Published - 2018
Externally published: Yes
Event: 44th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM 2018 - Krems, Austria
Duration: 29 Jan 2018 to 2 Feb 2018

Conference

Conference: 44th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM 2018
Country/Territory: Austria
City: Krems
Period: 29/01/2018 to 02/02/2018
Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10706 LNCS
ISSN: 0302-9743

Keywords

  • Associative classification
  • Data mining
  • Imbalanced datasets
  • Machine learning
