Preprint FZJ-2024-01127

Empirical Comparison between Cross-Validation and Mutation-Validation in Model Selection


2023
arXiv

arXiv [10.48550/ARXIV.2311.14079]


Please use a persistent id in citations: doi:10.48550/ARXIV.2311.14079

Abstract: Mutation validation (MV) is a recently proposed approach for model selection, garnering significant interest due to its unique characteristics and potential benefits compared to the widely used cross-validation (CV) method. In this study, we empirically compared MV and $k$-fold CV using benchmark and real-world datasets. By employing Bayesian tests, we compared generalization estimates yielding three posterior probabilities: practical equivalence, CV superiority, and MV superiority. We also evaluated the differences in the capacity of the selected models and computational efficiency. We found that both MV and CV select models with practically equivalent generalization performance across various machine learning algorithms and the majority of benchmark datasets. MV exhibited advantages in terms of selecting simpler models and lower computational costs. However, in some cases MV selected overly simplistic models leading to underfitting and showed instability in hyperparameter selection. These limitations of MV became more evident in the evaluation of a real-world neuroscientific task of predicting sex at birth using brain functional connectivity.
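As an illustration of the $k$-fold CV side of this comparison, the following is a minimal sketch of hyperparameter selection by cross-validation. It is not the authors' code: the toy 1-D data, the nearest-neighbour classifier, and all function names (`k_fold_indices`, `knn_predict`, `cv_accuracy`, `select_by_cv`) are assumptions made for illustration only. MV and the Bayesian tests are not covered here.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train, x, n_neighbors):
    """Majority vote among the n_neighbors training points closest to x (1-D).

    Labels are 0 or 1; a tie is resolved in favour of class 0.
    """
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:n_neighbors]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def cv_accuracy(data, n_neighbors, k=5, seed=0):
    """Mean held-out accuracy over k folds for one hyperparameter value."""
    scores = []
    for fold in k_fold_indices(len(data), k, seed):
        held_out = set(fold)
        train = [p for i, p in enumerate(data) if i not in held_out]
        test = [data[i] for i in fold]
        correct = sum(knn_predict(train, x, n_neighbors) == y for x, y in test)
        scores.append(correct / len(test))
    return mean(scores)


def select_by_cv(data, candidates, k=5, seed=0):
    """Return the candidate with the highest CV accuracy, plus all scores."""
    scores = {c: cv_accuracy(data, c, k, seed) for c in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: two overlapping Gaussian classes on the real line.
rng = random.Random(42)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(50)]
data += [(rng.gauss(2.0, 1.0), 1) for _ in range(50)]

best, scores = select_by_cv(data, candidates=[1, 5, 15])
print("selected n_neighbors:", best)
print("CV accuracy per candidate:", scores)
```

Each candidate is scored on the same folds, so differences in the estimates reflect the hyperparameter rather than the split.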

Keyword(s): Machine Learning (cs.LG) ; Machine Learning (stat.ML) ; FOS: Computer and information sciences


Contributing Institute(s):
  1. Gehirn & Verhalten (INM-7)
Research Program(s):
  1. 5254 - Neuroscientific Data Analytics and AI (POF4-525)

Appears in the scientific report 2023
Database coverage:
OpenAccess

The record appears in these collections:
Institute Collections > INM > INM-7
Document Types > Reports > Preprints
Workflow Collections > Public Records
Publications Database
Open Access

Record created 2024-01-30, last modified 2024-02-26

