| 001 | 1052326 | ||
| 005 | 20260127203441.0 | ||
| 037 | _ | _ | |a FZJ-2026-00934 |
| 100 | 1 | _ | |a Bazarova, Alina |0 P:(DE-Juel1)192120 |b 0 |e Corresponding author |
| 111 | 2 | _ | |a Systems Biology Lecture Series |c Berlin |w Germany |
| 245 | _ | _ | |a Multi-modal integration for biological tasks: perks, caveats and applications |f 2026-01-21 - |
| 260 | _ | _ | |c 2026 |
| 336 | 7 | _ | |a Conference Paper |0 33 |2 EndNote |
| 336 | 7 | _ | |a Other |2 DataCite |
| 336 | 7 | _ | |a INPROCEEDINGS |2 BibTeX |
| 336 | 7 | _ | |a LECTURE_SPEECH |2 ORCID |
| 336 | 7 | _ | |a Talk (non-conference) |b talk |m talk |0 PUB:(DE-HGF)31 |s 1769503847_4497 |2 PUB:(DE-HGF) |x Invited |
| 336 | 7 | _ | |a Other |2 DINI |
| 502 | _ | _ | |c MDC-BIMSB |
| 520 | _ | _ | |a In this talk, I will present OneProt, a versatile artificial intelligence framework for protein analysis that leverages multi-modal integration across structural, sequence, textual, and binding-site data. To align these heterogeneous modalities, OneProt adopts an ImageBind-inspired training strategy, enabling efficient cross-modal representation learning without requiring fully paired data. By combining graph neural networks and transformer-based architectures, OneProt achieves strong performance across tasks such as enzyme function prediction and binding-site analysis. I will highlight two key features of the framework: its ability to seamlessly incorporate custom modalities during pre-training, and a lightweight fine-tuning strategy that relies only on a simple multi-layer perceptron projection. Through empirical results, I will demonstrate how multi-modal integration can reduce the reliance on large task-specific datasets while maintaining competitive downstream performance. Alongside these benefits, I will discuss the practical challenges and caveats of adding new modalities, including alignment noise, modality imbalance, and training stability. Finally, I will present preliminary results from a follow-up project, OneProtGPT, which couples OneProt with scientific large language models to enable cross-modal retrieval and the grounding of protein representations in natural language. |
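The abstract above mentions ImageBind-inspired alignment and fine-tuning via a simple multi-layer perceptron projection. The sketch below is a minimal, hypothetical illustration of that general idea (not the OneProt implementation): two modalities are mapped by small MLP heads into a shared unit-norm embedding space and aligned with an InfoNCE-style contrastive loss. All dimensions, weights, and function names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_project(x, w1, b1, w2, b2):
    # Lightweight MLP projection head -- in a OneProt-style setup this is
    # the kind of small module used for fine-tuning; weights here are random.
    h = np.maximum(x @ w1 + b1, 0.0)                      # ReLU hidden layer
    z = h @ w2 + b2
    return z / np.linalg.norm(z, axis=-1, keepdims=True)  # project to unit sphere

def info_nce(anchor, positive, temperature=0.07):
    # ImageBind-style contrastive objective: each anchor embedding should be
    # most similar to its paired embedding from the other modality.
    logits = anchor @ positive.T / temperature            # (N, N) similarity matrix
    labels = np.arange(len(anchor))
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[labels, labels].mean()              # mean -log p(correct pair)

# Toy batch: 4 proteins with sequence features (dim 16) and structure features (dim 24).
seq_feats = rng.normal(size=(4, 16))
struct_feats = rng.normal(size=(4, 24))

# Hypothetical projection heads mapping both modalities into a shared 8-d space.
z_seq = mlp_project(seq_feats, rng.normal(size=(16, 32)), np.zeros(32),
                    rng.normal(size=(32, 8)), np.zeros(8))
z_struct = mlp_project(struct_feats, rng.normal(size=(24, 32)), np.zeros(32),
                       rng.normal(size=(32, 8)), np.zeros(8))

loss = info_nce(z_seq, z_struct)
print(f"InfoNCE alignment loss: {loss:.3f}")
```

Because only the projection heads need gradients under this scheme, adding a new modality amounts to training one small head against the frozen shared space, which is what makes the fine-tuning strategy lightweight.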
| 536 | _ | _ | |a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511) |0 G:(DE-HGF)POF4-5112 |c POF4-511 |f POF IV |x 0 |
| 536 | _ | _ | |a Helmholtz AI Consultant Team FB Information (E54.303.11) |0 G:(DE-Juel-1)E54.303.11 |c E54.303.11 |x 1 |
| 909 | C | O | |o oai:juser.fz-juelich.de:1052326 |p VDB |
| 910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 0 |6 P:(DE-Juel1)192120 |
| 913 | 1 | _ | |a DE-HGF |b Key Technologies |l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action |1 G:(DE-HGF)POF4-510 |0 G:(DE-HGF)POF4-511 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-500 |4 G:(DE-HGF)POF |v Enabling Computational- & Data-Intensive Science and Engineering |9 G:(DE-HGF)POF4-5112 |x 0 |
| 914 | 1 | _ | |y 2026 |
| 920 | 1 | _ | |0 I:(DE-Juel1)JSC-20090406 |k JSC |l Jülich Supercomputing Centre |x 0 |
| 980 | _ | _ | |a talk |
| 980 | _ | _ | |a VDB |
| 980 | _ | _ | |a I:(DE-Juel1)JSC-20090406 |
| 980 | _ | _ | |a UNRESTRICTED |