Journal Article FZJ-2025-00819

http://join2-wiki.gsi.de/foswiki/pub/Main/Artwork/join2_logo100x88.png
MISATO: machine learning dataset of protein–ligand complexes for structure-based drug discovery

 ;  ;  ;  ;  ;  ;  ;  ;  ;  ;  ;  ;

2024
Nature Research London

Nature computational science 4(5), 367 - 378 () [10.1038/s43588-024-00627-2]

This record in other databases:      

Please use a persistent id in citations: doi:  doi:

Abstract: Large language models have greatly enhanced our ability to understand biology and chemistry, yet robust methods for structure-based drug discovery, quantum chemistry and structural biology are still sparse. Precise biomolecule–ligand interaction datasets are urgently needed for large language models. To address this, we present MISATO, a dataset that combines quantum mechanical properties of small molecules and associated molecular dynamics simulations of ~20,000 experimental protein–ligand complexes with extensive validation of experimental data. Starting from the existing experimental structures, semi-empirical quantum mechanics was used to systematically refine these structures. A large collection of molecular dynamics traces of protein–ligand complexes in explicit water is included, accumulating over 170 μs. We give examples of machine learning (ML) baseline models proving an improvement of accuracy by employing our data. An easy entry point for ML experts is provided to enable the next generation of drug discovery artificial intelligence models.

Classification:

Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511) (POF4-511)

Appears in the scientific report 2024
Database coverage:
Medline ; Creative Commons Attribution CC BY 4.0 ; OpenAccess ; Clarivate Analytics Master Journal List ; DEAL Nature ; Emerging Sources Citation Index ; IF >= 10 ; JCR ; SCOPUS ; Web of Science Core Collection
Click to display QR Code for this record

The record appears in these collections:
Document types > Articles > Journal Article
Workflow collections > Public records
Institute Collections > JSC
Publications database
Open Access

 Record created 2025-01-20, last modified 2025-02-03


OpenAccess:
Download fulltext PDF
Rate this document:

Rate this document:
1
2
3
 
(Not yet reviewed)