Journal Article FZJ-2021-03405

Enhanced Binary Cuckoo Search With Frequent Values and Rough Set Theory for Feature Selection


2021
IEEE New York, NY

IEEE Access 9, 119430 - 119453 (2021) [10.1109/ACCESS.2021.3107901]


Please use a persistent id in citations: doi:10.1109/ACCESS.2021.3107901

Abstract: Redundant and irrelevant features in datasets decrease classification accuracy and increase the computational time of classification algorithms, the risk of overfitting, and the complexity of the underlying classification model. Feature selection is a preprocessing technique used with classification algorithms to select relevant features. Several approaches that combine Rough Set Theory (RST) with Nature Inspired Algorithms (NIAs) have been used successfully for feature selection. However, due to the inherent limitations of RST for some data types and the inefficient convergence of NIAs on high dimensional datasets, these approaches have mainly focused on one specific type of dataset: low dimensional and nominal. This paper proposes a new filter feature selection approach based on Binary Cuckoo Search (BCS) and RST, which is more efficient for low and high dimensional nominal, mixed and numerical datasets. It enhances BCS with new initialization and global update mechanisms that increase the efficiency of convergence on high dimensional datasets. It also develops a more efficient objective function for numerical, mixed and nominal datasets. The proposed approach was validated on 16 benchmark datasets (4 nominal, 4 mixed and 8 numerical) drawn from the UCI repository. It was also evaluated against standard BCS; five NIAs with fuzzy RST approaches; two popular traditional feature selection approaches; and multi-objective evolutionary, genetic, and Particle Swarm Optimization (PSO) algorithms. Decision tree and Naive Bayes algorithms were used to measure the classification performance of the proposed approach. The results show that the proposed approach achieved improved classification accuracy while minimizing the number of features compared to other state-of-the-art methods.
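
To illustrate how RST can score a feature subset in a filter setting, the following minimal Python sketch computes the classical rough-set dependency degree on nominal data: the fraction of objects whose equivalence class (same values on the selected features) carries a single decision label. This is the textbook measure, not the objective function proposed in the paper, which the abstract does not specify. The function name `dependency_degree` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def dependency_degree(rows, labels, subset):
    """Classical rough-set dependency degree of the decision on `subset`.

    rows   : list of tuples of nominal feature values (illustrative data)
    labels : list of decision labels, one per row
    subset : list of feature indices to evaluate
    Returns |positive region| / |universe|, a value in [0, 1].
    """
    if not rows:
        return 0.0
    # Equivalence classes: objects that agree on every feature in `subset`.
    classes = defaultdict(list)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        classes[key].append(label)
    # A class belongs to the positive region when all its labels agree.
    consistent = sum(len(ls) for ls in classes.values() if len(set(ls)) == 1)
    return consistent / len(rows)


# Toy nominal dataset (hypothetical): three features, binary decision.
rows = [
    ("sunny", "hot", "high"),
    ("sunny", "hot", "normal"),
    ("rainy", "mild", "high"),
    ("rainy", "mild", "high"),
]
labels = ["no", "yes", "yes", "no"]

print(dependency_degree(rows, labels, [0, 1, 2]))  # 0.5
print(dependency_degree(rows, labels, [2]))        # 0.25
```

A binary search method such as BCS would encode each candidate subset as a bit vector and use a score of this kind, typically combined with a penalty on subset size, as its fitness. Because the equivalence classes rely on exact value matches, this classical measure is only meaningful for nominal features, which is consistent with the limitation of RST for other data types that the abstract mentions.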


Contributing Institute(s):
  1. Zivile Sicherheitsforschung (IAS-7)
Research Program(s):
  1. 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)

Appears in the scientific report 2021
Database coverage:
Medline ; Creative Commons Attribution CC BY 4.0 ; DOAJ ; OpenAccess ; Article Processing Charges ; Clarivate Analytics Master Journal List ; Current Contents - Electronics and Telecommunications Collection ; Current Contents - Engineering, Computing and Technology ; DOAJ Seal ; Essential Science Indicators ; Fees ; IF < 5 ; JCR ; SCOPUS ; Science Citation Index Expanded ; Web of Science Core Collection

The record appears in these collections:
Document types > Articles > Journal Article
Institute Collections > IAS > IAS-7
Workflow collections > Public records
Workflow collections > Publication Charges
Publications database
Open Access

 Record created 2021-09-06, last modified 2023-05-31

