Abstract
With the increasing focus on applying Big Data in all sectors of society, the performance of machine learning has become essential. Efficient machine learning depends on efficient feature selection algorithms. Filter feature selection algorithms are model-free and therefore very fast, but they require a threshold to function. We have created a novel automatic meta-filter feature selection algorithm, the Ranked Distinct Elitism Selection Filter (RDESF), which is fully automatic and is composed of five common filters and a distinct selection process.
To test the performance and speed of RDESF, it is benchmarked against four other common automatic feature selection algorithms: backward selection, forward selection, NLPCA, and PCA, as well as against using no feature selection algorithm at all. The benchmarking is performed through two experiments with two different data sets, both of which are time-series regression problems. The prediction is performed by a Multilayer Perceptron (MLP).
Our results show that RDESF is a strong competitor and allows for a fully automatic feature selection system using filters. RDESF was outperformed only by forward selection, which was expected, as it is a wrapper that includes the prediction model in the feature selection process. PCA is often used in the machine learning literature and can be considered the default feature selection method. RDESF outperformed PCA in both experiments in terms of both prediction error and computational speed. RDESF is a new step toward filter-based automatic feature selection algorithms that can be used in many different applications.
Original language | English |
---|---|
Title | Advances in Big Data : Proceedings of the 2nd INNS Conference on Big Data |
Editors | Plamen Angelov, Yannis Manolopoulos, Lazaros Iliadis, Asim Roy, Marley Vellasco |
Publisher | Springer |
Publication date | 2017 |
Pages | 71-80 |
ISBN (Print) | 978-3-319-47897-5 |
ISBN (Electronic) | 978-3-319-47898-2 |
DOI | |
Status | Published - 2017 |
Event | 2nd INNS Conference on Big Data - Thessaloniki, Greece Duration: 23 Oct 2016 → 25 Oct 2016 Conference number: 2 https://conferences.cwa.gr/inns-big-data2016/ |
Conference
Conference | 2nd INNS Conference on Big Data |
---|---|
Number | 2 |
Country/Territory | Greece |
City | Thessaloniki |
Period | 23/10/2016 → 25/10/2016 |
Internet address |
Name | Advances in Intelligent Systems and Computing |
---|---|
Volume | 529 |
ISSN | 2194-5357 |