|

Review of unsupervised pretraining strategies for molecules representation.

Researchers

Journal

Modalities

Models

Abstract

In recent years, the computer-assisted techniques make a great progress in the field of drug discovery. And, yet, the problem of limited labeled data problem is still challenging and also restricts the performance of these techniques in specific tasks, such as molecular property prediction, compound-protein interaction and de novo molecular generation. One effective solution is to utilize the experience and knowledge gained from other tasks to cope with related pursuits. Unsupervised pretraining is promising, due to its capability of leveraging a vast number of unlabeled molecules and acquiring a more informative molecular representation for the downstream tasks. In particular, models trained on large-scale unlabeled molecules can capture generalizable features, and this ability can be employed to improve the performance of specific downstream tasks. Many relevant pretraining works have been recently proposed. Here, we provide an overview of molecular unsupervised pretraining and related applications in drug discovery. Challenges and possible solutions are also summarized.© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected].

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *