Computational simulation of virtual patients reduces dataset bias and improves machine learning-based detection of ARDS from noisy heterogeneous ICU datasets

Sharafutdinov, Konstantin; Hardman, Jonathan G.; Bickenbach, Johannes; Schuppert, Andreas; Fritsch, Sebastian Johannes; Mayer, Hannah; Ghalati, Pejman Farhadi; Polzin, Richard; Iravani, Mina; Marx, Gernot; Bates, Declan G.; Saffaran, Sina
doi:10.1109/OJEMB.2023.3243190
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@ARTICLE{Sharafutdinov:1005293,
      author       = {Sharafutdinov, Konstantin and Fritsch, Sebastian Johannes
                      and Iravani, Mina and Ghalati, Pejman Farhadi and Saffaran,
                      Sina and Bates, Declan G. and Hardman, Jonathan G. and
                      Polzin, Richard and Mayer, Hannah and Marx, Gernot and
                      Bickenbach, Johannes and Schuppert, Andreas},
      title        = {{C}omputational simulation of virtual patients reduces
                      dataset bias and improves machine learning-based detection
                      of {ARDS} from noisy heterogeneous {ICU} datasets},
      journal      = {{IEEE} Open Journal of Engineering in Medicine and Biology},
      volume       = {5},
      issn         = {2644-1276},
      address      = {New York, NY},
      publisher    = {IEEE},
      reportid     = {FZJ-2023-01408},
      pages        = {611--620},
      year         = {2023},
      abstract     = {Goal: Machine learning (ML) technologies that leverage
                      large-scale patient data are promising tools
                      predicting disease evolution in individual patients. However,
                      the limited generalizability of ML models developed on
                      single-center datasets, and their unproven performance in
                      real-world settings, remain significant constraints to their
                      widespread adoption in clinical practice. One approach to
                      tackle this issue is to base learning on large multi-center
                      datasets. However, such heterogeneous datasets can introduce
                      further biases driven by data origin, as data structures and
                      patient cohorts may differ between hospitals. Methods:
                      In this paper, we demonstrate how mechanistic virtual patient
                      (VP) modeling can be used to capture specific features of
                      patients’ states and dynamics, while reducing biases
                      introduced by heterogeneous datasets. We show how VP
                      modeling can be used for data augmentation through
                      identification of individualized model parameters
                      approximating disease states of patients with suspected acute
                      respiratory distress syndrome (ARDS) from observational data
                      of mixed origin. We compare the results of an
                      unsupervised learning method (clustering) in two cases: where
                      the learning is based on original patient data and on data
                      derived in the matching procedure of the VP model to real
                      patient data. Results: More robust cluster configurations
                      were observed in clustering using the model-derived data. VP
                      model-based clustering also reduced biases introduced by the
                      inclusion of data from different hospitals and was able to
                      discover an additional cluster with significant ARDS
                      enrichment. Conclusions: Our results indicate
                      that mechanistic VP modeling can be used to significantly
                      reduce biases introduced by learning from heterogeneous
                      datasets and to allow improved discovery of patient cohorts
                      driven exclusively by medical conditions.},
      cin          = {JSC},
      ddc          = {570},
      cid          = {I:(DE-Juel1)JSC-20090406},
      pnm          = {5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs)
                      and Research Groups (POF4-511) / SMITH -
                      Medizininformatik-Konsortium - Beitrag Forschungszentrum
                      Jülich (01ZZ1803M)},
      pid          = {G:(DE-HGF)POF4-5112 / G:(BMBF)01ZZ1803M},
      typ          = {PUB:(DE-HGF)16},
      pubmed       = {39184970},
      UT           = {WOS:001294340500001},
      doi          = {10.1109/OJEMB.2023.3243190},
      url          = {https://juser.fz-juelich.de/record/1005293},
}