001025679 001__ 1025679
001025679 005__ 20240709082025.0
001025679 0247_ $$2doi$$a10.48550/ARXIV.2401.01874
001025679 037__ $$aFZJ-2024-03068
001025679 1001_ $$0P:(DE-HGF)0$$aBrozos, Christoforos$$b0
001025679 245__ $$aGraph Neural Networks for Surfactant Multi-Property Prediction
001025679 260__ $$barXiv$$c2024
001025679 3367_ $$0PUB:(DE-HGF)25$$2PUB:(DE-HGF)$$aPreprint$$bpreprint$$mpreprint$$s1714560096_354
001025679 3367_ $$2ORCID$$aWORKING_PAPER
001025679 3367_ $$028$$2EndNote$$aElectronic Article
001025679 3367_ $$2DRIVER$$apreprint
001025679 3367_ $$2BibTeX$$aARTICLE
001025679 3367_ $$2DataCite$$aOutput Types/Working Paper
001025679 520__ $$aSurfactants are of high importance in different industrial sectors such as cosmetics, detergents, oil recovery and drug delivery systems. Therefore, many quantitative structure-property relationship (QSPR) models have been developed for surfactants. Each predictive model typically focuses on one surfactant class, mostly nonionics. Graph Neural Networks (GNNs) have exhibited great predictive performance for property prediction of ionic liquids, polymers and drugs in general. Specifically for surfactants, GNNs can successfully predict critical micelle concentration (CMC), a key surfactant property associated with micellization. A key factor in the predictive ability of QSPR and GNN models is the data available for training. Based on an extensive literature search, we create the largest available CMC database with 429 molecules and the first large data collection for surface excess concentration ($\Gamma_{m}$), another surfactant property associated with foaming, with 164 molecules. Then, we develop GNN models to predict the CMC and $\Gamma_{m}$ and we explore different learning approaches, i.e., single- and multi-task learning, as well as different training strategies, namely ensemble and transfer learning. We find that a multi-task GNN with ensemble learning trained on all $\Gamma_{m}$ and CMC data performs best. Finally, we test the ability of our CMC model to generalize to industrial-grade pure-component surfactants. The GNN yields highly accurate predictions for CMC, showing great potential for future industrial applications.
001025679 536__ $$0G:(DE-HGF)POF4-899$$a899 - ohne Topic (POF4-899)$$cPOF4-899$$fPOF IV$$x0
001025679 588__ $$aDataset connected to DataCite
001025679 650_7 $$2Other$$aChemical Physics (physics.chem-ph)
001025679 650_7 $$2Other$$aMachine Learning (cs.LG)
001025679 650_7 $$2Other$$aFOS: Physical sciences
001025679 650_7 $$2Other$$aFOS: Computer and information sciences
001025679 7001_ $$0P:(DE-HGF)0$$aRittig, Jan G.$$b1
001025679 7001_ $$0P:(DE-HGF)0$$aBhattacharya, Sandip$$b2
001025679 7001_ $$0P:(DE-HGF)0$$aAkanny, Elie$$b3
001025679 7001_ $$0P:(DE-HGF)0$$aKohlmann, Christina$$b4
001025679 7001_ $$0P:(DE-Juel1)172025$$aMitsos, Alexander$$b5$$eCorresponding author$$ufzj
001025679 773__ $$a10.48550/ARXIV.2401.01874
001025679 909CO $$ooai:juser.fz-juelich.de:1025679$$pVDB
001025679 9101_ $$0I:(DE-HGF)0$$6P:(DE-HGF)0$$aBASF$$b0
001025679 9101_ $$0I:(DE-588b)36225-6$$6P:(DE-HGF)0$$aRWTH Aachen$$b0$$kRWTH
001025679 9101_ $$0I:(DE-588b)36225-6$$6P:(DE-HGF)0$$aRWTH Aachen$$b1$$kRWTH
001025679 9101_ $$0I:(DE-HGF)0$$6P:(DE-HGF)0$$aBASF$$b2
001025679 9101_ $$0I:(DE-HGF)0$$6P:(DE-HGF)0$$aBASF$$b3
001025679 9101_ $$0I:(DE-HGF)0$$6P:(DE-HGF)0$$aBASF$$b4
001025679 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)172025$$aForschungszentrum Jülich$$b5$$kFZJ
001025679 9101_ $$0I:(DE-588b)36225-6$$6P:(DE-Juel1)172025$$aRWTH Aachen$$b5$$kRWTH
001025679 9131_ $$0G:(DE-HGF)POF4-899$$1G:(DE-HGF)POF4-890$$2G:(DE-HGF)POF4-800$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$aDE-HGF$$bProgrammungebundene Forschung$$lohne Programm$$vohne Topic$$x0
001025679 9141_ $$y2024
001025679 920__ $$lyes
001025679 9201_ $$0I:(DE-Juel1)IEK-10-20170217$$kIEK-10$$lModellierung von Energiesystemen$$x0
001025679 980__ $$apreprint
001025679 980__ $$aVDB
001025679 980__ $$aI:(DE-Juel1)IEK-10-20170217
001025679 980__ $$aUNRESTRICTED
001025679 981__ $$aI:(DE-Juel1)ICE-1-20170217