Journal Article FZJ-2025-02765

Overview of leakage scenarios in supervised machine learning

2025
SpringerOpen Heidelberg [u.a.]

Journal of Big Data 12(1), 135 (2025) [10.1186/s40537-025-01193-8]

Please use a persistent id in citations: doi:10.1186/s40537-025-01193-8

Abstract: Machine learning (ML) provides powerful tools for predictive modeling. ML’s popularity stems from the promise of sample-level prediction, with applications across a variety of fields from physics and marketing to healthcare. However, if not properly implemented and evaluated, ML pipelines may contain leakage, which typically results in overoptimistic performance estimates and failure to generalize to new data. This can have severe negative financial and societal implications. Our aim is to expand understanding of the causes of leakage when designing, implementing, and evaluating ML pipelines. Illustrated by concrete examples, we provide a comprehensive overview and discussion of the various types of leakage that may arise in ML pipelines.

Contributing Institute(s):
  1. Gehirn & Verhalten (INM-7)
Research Program(s):
  1. 5254 - Neuroscientific Data Analytics and AI (POF4-525)

Appears in the scientific report 2025
Database coverage:
Medline ; Creative Commons Attribution CC BY 4.0 ; DOAJ ; OpenAccess ; Article Processing Charges ; Clarivate Analytics Master Journal List ; Current Contents - Engineering, Computing and Technology ; DOAJ Seal ; Essential Science Indicators ; Fees ; IF >= 5 ; JCR ; SCOPUS ; Science Citation Index Expanded ; Web of Science Core Collection

The record appears in these collections:
Document types > Articles > Journal articles
Institute collections > INM > INM-7
Workflow collections > Public records
Publications database
Open Access

Record created 2025-06-08, last modified 2025-08-04


OpenAccess:
s40537-025-01193-8 - Download full text (PDF)
Main paper - Download full text (PDF)