Rank Selection in Non-negative Matrix Factorization: systematic comparison and a new MAD metric

Muzzarelli, Laura; Patil, Kaustubh R.; Weis, Susanne; Eickhoff, Simon B.
doi:10.1109/IJCNN.2019.8852146
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@INPROCEEDINGS{Muzzarelli:865858,
      author       = {Muzzarelli, Laura and Weis, Susanne and Eickhoff, Simon B.
                      and Patil, Kaustubh R.},
      title        = {{R}ank {S}election in {N}on-negative {M}atrix
                      {F}actorization: systematic comparison and a new {MAD}
                      metric},
      publisher    = {IEEE},
      reportid     = {FZJ-2019-05146},
      pages        = {8},
      year         = {2019},
      note         = {This study was partly supported by the Helmholtz Portfolio
                      Theme "Supercomputing and Modeling for the Human Brain" and
                      the European Union’s Horizon 2020 Research and Innovation
                      Programme under Grant Agreement No. 785907 (HBP SGA2).},
      abstract     = {Non-Negative Matrix Factorization (NMF) is a powerful
                      dimensionality reduction and factorization method
                      that provides a part-based representation of the data. In
                      the absence of a priori knowledge about the latent
                      dimensionality of the data, it is necessary to select a
                      rank of the reduced representation. Several rank selection
                      methods have been proposed, but no consensus exists on
                      when a method is suitable to use. In this work, we propose
                      a new metric for rank selection based on imputation
                      cross-validation, and we systematically compare it against
                      six other metrics while assessing the effects of data
                      properties. Using synthetic datasets with different
                      properties, our work critically evidences that most
                      methods fail to identify the true rank. We show that
                      properties of the data heavily impact the ability of
                      different methods. Imputation-based metrics, including our
                      new MADimput, provided the best accuracy irrespective of
                      the data type, but no solution worked perfectly in all
                      circumstances. One should therefore carefully assess
                      characteristics of their dataset in order to identify the
                      most suitable metric for rank selection.},
      month        = {Jul},
      date         = {2019-07-14},
      organization = {2019 International Joint Conference on
                      Neural Networks (IJCNN), Budapest
                      (Hungary), 14 Jul 2019 - 19 Jul 2019},
      cin          = {INM-7},
      cid          = {I:(DE-Juel1)INM-7-20090406},
      pnm          = {574 - Theory, modelling and simulation (POF3-574) / HBP
                      SGA2 - Human Brain Project Specific Grant Agreement 2
                      (785907)},
      pid          = {G:(DE-HGF)POF3-574 / G:(EU-Grant)785907},
      typ          = {PUB:(DE-HGF)8},
      doi          = {10.1109/IJCNN.2019.8852146},
      url          = {https://juser.fz-juelich.de/record/865858},
}