% IMPORTANT: The following is UTF-8 encoded. This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.
@ARTICLE{Alia:894810,
author = {Alia, Ahmed and Taweel, Adel},
title = {{Enhanced Binary Cuckoo Search With Frequent Values and
          Rough Set Theory for Feature Selection}},
journal = {IEEE Access},
volume = {9},
issn = {2169-3536},
address = {New York, NY},
publisher = {IEEE},
reportid = {FZJ-2021-03405},
pages = {119430--119453},
year = {2021},
abstract = {Redundant and irrelevant features in datasets decrease
            classification accuracy and increase the computational time of
            classification algorithms, the risk of overfitting, and the
            complexity of the underlying classification model. Feature
            selection is a preprocessing technique used with classification
            algorithms to select the relevant features. Several approaches
            that combine Rough Set Theory (RST) with Nature-Inspired
            Algorithms (NIAs) have been used successfully for feature
            selection. However, due to the inherent limitations of RST for
            some data types and the inefficient convergence of NIAs for
            high-dimensional datasets, these approaches have mainly focused
            on a specific type of low-dimensional nominal dataset. This
            paper proposes a new filter feature selection approach based on
            Binary Cuckoo Search (BCS) and RST, which is more efficient for
            both low- and high-dimensional nominal, mixed, and numerical
            datasets. It enhances BCS with new initialization and global
            update mechanisms that improve convergence on high-dimensional
            datasets, and it develops a more efficient objective function
            for numerical, mixed, and nominal datasets. The proposed
            approach was validated on 16 benchmark datasets (4 nominal,
            4 mixed, and 8 numerical) drawn from the UCI repository. It was
            also evaluated against standard BCS; five NIAs combined with
            fuzzy RST approaches; two popular traditional feature selection
            approaches; and multi-objective evolutionary, Genetic, and
            Particle Swarm Optimization (PSO) algorithms. Decision tree and
            Naive Bayes classifiers were used to measure the classification
            performance of the proposed approach. The results show that the
            proposed approach achieved higher classification accuracy with
            fewer selected features than other state-of-the-art methods.},
cin = {IAS-7},
ddc = {621.3},
cid = {I:(DE-Juel1)IAS-7-20180321},
pnm = {5111 - Domain-Specific Simulation \& Data Life Cycle Labs
       (SDLs) and Research Groups (POF4-511)},
pid = {G:(DE-HGF)POF4-5111},
typ = {PUB:(DE-HGF)16},
UT = {WOS:000692224900001},
doi = {10.1109/ACCESS.2021.3107901},
url = {https://juser.fz-juelich.de/record/894810},
}