Multi-modal integration for biological tasks: perks, caveats and applications

Bazarova, Alina
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@INPROCEEDINGS{Bazarova:1052326,
      author       = {Bazarova, Alina},
      title        = {{M}ulti-modal integration for biological tasks: perks,
                      caveats and applications},
      school       = {MDC-BIMSB},
      reportid     = {FZJ-2026-00934},
      year         = {2026},
      abstract     = {In this talk, I will present OneProt, a versatile
                      artificial intelligence framework for protein analysis that
                      leverages multi-modal integration across structural,
                      sequence, textual, and binding-site data. To align these
                      heterogeneous modalities, OneProt adopts an
                      ImageBind-inspired training strategy, enabling efficient
                      cross-modal representation learning without requiring fully
                      paired data. By combining graph neural networks and
                      transformer-based architectures, OneProt achieves strong
                      performance across tasks such as enzyme function prediction
                      and binding-site analysis. I will highlight two key features
                      of the framework: its ability to seamlessly incorporate
                      custom modalities during pre-training, and a lightweight
                      fine-tuning strategy that relies only on a simple
                      multi-layer perceptron projection. Through empirical
                      results, I will demonstrate how multi-modal integration can
                      reduce the reliance on large task-specific datasets while
                      maintaining competitive downstream performance. Alongside
                      these benefits, I will discuss the practical challenges and
                      caveats of adding new modalities, including alignment noise,
                      modality imbalance, and training stability. Finally, I will
                      present preliminary results from a follow-up project,
                      OneProtGPT, which integrates OneProt with scientific large
                      language models to enable cross-modal retrieval and the
                      integration of protein representations with natural
                      language.},
      organization = {Systems Biology Lecture Series, Berlin
                      (Germany)},
      subtyp       = {Invited},
      cin          = {JSC},
      cid          = {I:(DE-Juel1)JSC-20090406},
      pnm          = {5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs)
                      and Research Groups (POF4-511) / Helmholtz AI Consultant
                      Team FB Information (E54.303.11)},
      pid          = {G:(DE-HGF)POF4-5112 / G:(DE-Juel-1)E54.303.11},
      typ          = {PUB:(DE-HGF)31},
      url          = {https://juser.fz-juelich.de/record/1052326},
}