Regression-Based Active Learning for Accessible Acceleration of Ultra-Large Library Docking.

Researchers

Albert Guskov, Alexey Mishin, Andrey Rogachev, Egor Marin, Khalid Mustafin, Margarita Kovaleva, Maria Kadukova, Polina Khorn, Valentin Borshchevskiy

Journal

Journal of Chemical Information and Modeling

Models

linear regression

Abstract

Structure-based drug discovery is a process for both hit finding and optimization that relies on a validated three-dimensional model of a target biomolecule, used to rationalize the structure-function relationship for this particular target. Ultralarge virtual screening has recently emerged as an approach for rapid discovery of high-affinity hit compounds, but it requires substantial computational resources. This study shows that active learning with simple linear regression models can accelerate virtual screening, retrieving up to 90% of the top 1% of the docking hit list after docking just 10% of the ligands. The results demonstrate that complex models, such as deep learning approaches, are unnecessary for predicting the imprecise results of ligand docking with a low sampling depth. Furthermore, we explore active learning meta-parameters and find that constant batch size models with a simple ensembling method provide the best ligand retrieval rate. Finally, our approach is validated on an ultralarge virtual screening data set, retrieving 70% of the top 0.05% of ligands after screening only 2% of the library. Altogether, this work provides a computationally accessible approach for accelerated virtual screening that can serve as a blueprint for the future design of low-compute agents for exploration of the chemical space via large-scale accelerated docking. With recent breakthroughs in protein structure prediction, this method can significantly increase accessibility for the academic community and aid in the rapid discovery of high-affinity hit compounds for various targets.
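The active learning loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the ligand features are random binary vectors standing in for fingerprints, the "docking" function is a synthetic noisy linear score, and the names `fit_ridge`, `predict`, and `active_learning_screen` are hypothetical. The regression is a closed-form ridge fit with NumPy, the batch size is constant, selection is greedy (lowest predicted score first), and the ensembling step mentioned in the abstract is omitted for brevity.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the last weight is the bias term."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict(X, w):
    """Predict scores from features and the weights returned by fit_ridge."""
    return X @ w[:-1] + w[-1]


def active_learning_screen(features, dock, budget, batch_size, seed=0):
    """Dock `budget` ligands in constant-size batches chosen by a linear model.

    `dock` is a callable mapping ligand indices to docking scores
    (lower is better). Returns a boolean mask of docked ligands.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    docked = np.zeros(n, dtype=bool)
    scores = np.full(n, np.nan)

    # Round 0: no model exists yet, so the first batch is drawn at random.
    first = rng.choice(n, size=batch_size, replace=False)
    scores[first] = dock(first)
    docked[first] = True

    while docked.sum() < budget:
        # Refit on everything docked so far, then rank the undocked pool.
        w = fit_ridge(features[docked], scores[docked])
        pool = np.flatnonzero(~docked)
        pred = predict(features[pool], w)
        k = min(batch_size, int(budget - docked.sum()))
        pick = pool[np.argsort(pred)[:k]]  # greedy: best predicted scores
        scores[pick] = dock(pick)
        docked[pick] = True
    return docked


# Synthetic stand-in data (assumption: not real docking results).
rng = np.random.default_rng(0)
n_ligands, n_bits = 20000, 128
X = rng.integers(0, 2, size=(n_ligands, n_bits)).astype(float)
true_scores = X @ rng.normal(size=n_bits) + rng.normal(size=n_ligands)


def dock(idx):
    """Pretend to dock the ligands at `idx` by looking up synthetic scores."""
    return true_scores[idx]


budget = n_ligands // 10  # dock 10% of the library
docked = active_learning_screen(X, dock, budget=budget, batch_size=budget // 5)

# Recall: fraction of the true top 1% that ended up among the docked ligands.
top_1pct = np.argsort(true_scores)[: n_ligands // 100]
recall = docked[top_1pct].mean()
print(f"docked {docked.sum()} ligands, top-1% recall = {recall:.2f}")
```

Because the synthetic scores are linear in the features by construction, the recall printed here says nothing about real docking campaigns; the sketch only shows the structure of the loop (random seed batch, refit, rank, dock the next batch).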
