001     1048464
005     20251212202212.0
024 7 _ |a 10.1371/journal.pcbi.1013679
|2 doi
024 7 _ |a 1553-734X
|2 ISSN
024 7 _ |a 1553-7358
|2 ISSN
024 7 _ |a 10.34734/FZJ-2025-04662
|2 datacite_doi
037 _ _ |a FZJ-2025-04662
082 _ _ |a 610
100 1 _ |a Flöge, Klemens
|0 P:(DE-HGF)0
|b 0
245 _ _ |a OneProt: Towards multi-modal protein foundation models via latent space alignment of sequence, structure, binding sites and text encoders
260 _ _ |a San Francisco, Calif.
|c 2025
|b Public Library of Science
336 7 _ |a article
|2 DRIVER
336 7 _ |a Output Types/Journal article
|2 DataCite
336 7 _ |a Journal Article
|b journal
|m journal
|0 PUB:(DE-HGF)16
|s 1765560250_32338
|2 PUB:(DE-HGF)
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a JOURNAL_ARTICLE
|2 ORCID
336 7 _ |a Journal Article
|0 0
|2 EndNote
520 _ _ |a Recent advances in Artificial Intelligence have enabled multi-modal systems to model and translate diverse information spaces. Extending beyond text and vision, we introduce OneProt, a multi-modal Deep Learning model for proteins that integrates structural, sequence, text, and binding site data. Using the ImageBind framework, OneProt aligns the latent spaces of protein modality encoders in a lightweight fine-tuning scheme that focuses on pairwise alignment with sequence data, rather than requiring full matches. This novel approach comprises a mix of Graph Neural Networks and transformer architectures. It demonstrates good performance in retrieval tasks and showcases the efficacy of multi-modal systems in Protein Machine Learning through a broad spectrum of downstream baselines, including enzyme function prediction and binding site analysis. Furthermore, OneProt enables the transfer of representational information from specialized encoders to the sequence encoder, enhancing capabilities for distinguishing evolutionarily related and unrelated sequences and exhibiting representational properties where evolutionarily related proteins align in similar directions within the latent space. In addition, we extensively investigate modality ablations to identify the encoders that contribute the most to predictive performance, highlighting the significance of the binding site encoder, which has not been used in similar models previously. This work expands the horizons of multi-modal protein models, paving the way for transformative applications in drug discovery, biocatalytic reaction planning, and protein engineering.
536 _ _ |a 2171 - Biological and environmental resources for sustainable use (POF4-217)
|0 G:(DE-HGF)POF4-2171
|c POF4-217
|f POF IV
|x 0
536 _ _ |a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5112
|c POF4-511
|f POF IV
|x 1
536 _ _ |a Helmholtz AI Consultant Team FB Information (E54.303.11)
|0 G:(DE-Juel-1)E54.303.11
|c E54.303.11
|x 2
588 _ _ |a Dataset connected to CrossRef, Journals: juser.fz-juelich.de
700 1 _ |a Udayakumar, Srisruthi
|0 P:(DE-HGF)0
|b 1
700 1 _ |a Sommer, Johanna
|0 P:(DE-HGF)0
|b 2
700 1 _ |a Piraud, Marie
|0 P:(DE-HGF)0
|b 3
700 1 _ |a Kesselheim, Stefan
|0 P:(DE-Juel1)185654
|b 4
|u fzj
700 1 _ |a Fortuin, Vincent
|0 P:(DE-HGF)0
|b 5
700 1 _ |a Günnemann, Stephan
|0 P:(DE-HGF)0
|b 6
700 1 _ |a van der Weg, Karel J.
|0 P:(DE-Juel1)164893
|b 7
700 1 _ |a Gohlke, Holger
|0 P:(DE-Juel1)172663
|b 8
700 1 _ |a Merdivan, Erinc
|0 P:(DE-HGF)0
|b 9
700 1 _ |a Bazarova, Alina
|0 P:(DE-Juel1)192120
|b 10
|e Corresponding author
773 _ _ |a 10.1371/journal.pcbi.1013679
|g Vol. 21, no. 11, p. e1013679 -
|0 PERI:(DE-600)2193340-6
|n 11
|p e1013679
|t PLoS Computational Biology
|v 21
|y 2025
|x 1553-734X
856 4 _ |u https://juser.fz-juelich.de/record/1048464/files/journal.pcbi.1013679-2.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1048464
|p openaire
|p open_access
|p OpenAPC
|p driver
|p VDB
|p openCost
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 4
|6 P:(DE-Juel1)185654
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 8
|6 P:(DE-Juel1)172663
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 10
|6 P:(DE-Juel1)192120
913 1 _ |a DE-HGF
|b Forschungsbereich Erde und Umwelt
|l Erde im Wandel – Unsere Zukunft nachhaltig gestalten
|1 G:(DE-HGF)POF4-210
|0 G:(DE-HGF)POF4-217
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-200
|4 G:(DE-HGF)POF
|v Für eine nachhaltige Bio-Ökonomie – von Ressourcen zu Produkten
|9 G:(DE-HGF)POF4-2171
|x 0
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5112
|x 1
914 1 _ |y 2025
915 p c |a APC keys set
|0 PC:(DE-HGF)0000
|2 APC
915 p c |a Local Funding
|0 PC:(DE-HGF)0001
|2 APC
915 p c |a DFG OA Publikationskosten
|0 PC:(DE-HGF)0002
|2 APC
915 p c |a DOAJ Journal
|0 PC:(DE-HGF)0003
|2 APC
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0200
|2 StatID
|b SCOPUS
|d 2024-12-16
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0160
|2 StatID
|b Essential Science Indicators
|d 2024-12-16
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1050
|2 StatID
|b BIOSIS Previews
|d 2024-12-16
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1190
|2 StatID
|b Biological Abstracts
|d 2024-12-16
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0600
|2 StatID
|b Ebsco Academic Search
|d 2024-12-16
915 _ _ |a JCR
|0 StatID:(DE-HGF)0100
|2 StatID
|b PLOS COMPUT BIOL : 2022
|d 2024-12-16
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0501
|2 StatID
|b DOAJ Seal
|d 2024-02-08T09:42:16Z
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0500
|2 StatID
|b DOAJ
|d 2024-02-08T09:42:16Z
915 _ _ |a WoS
|0 StatID:(DE-HGF)0113
|2 StatID
|b Science Citation Index Expanded
|d 2024-12-16
915 _ _ |a Fees
|0 StatID:(DE-HGF)0700
|2 StatID
|d 2024-12-16
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0150
|2 StatID
|b Web of Science Core Collection
|d 2024-12-16
915 _ _ |a IF < 5
|0 StatID:(DE-HGF)9900
|2 StatID
|d 2024-12-16
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
915 _ _ |a Peer Review
|0 StatID:(DE-HGF)0030
|2 StatID
|b ASC
|d 2024-12-16
915 _ _ |a Article Processing Charges
|0 StatID:(DE-HGF)0561
|2 StatID
|d 2024-12-16
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0300
|2 StatID
|b Medline
|d 2024-12-16
915 _ _ |a Creative Commons Attribution CC BY 4.0
|0 LIC:(DE-HGF)CCBY4
|2 HGFVOC
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0199
|2 StatID
|b Clarivate Analytics Master Journal List
|d 2024-12-16
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)IBG-4-20200403
|k IBG-4
|l Bioinformatik
|x 0
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 1
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)IBG-4-20200403
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 _ _ |a APC
980 1 _ |a APC
980 1 _ |a FullTexts


Marc 21