
Protein-ligand binding residue prediction enhancement through hybrid deep heterogeneous learning of sequence and structure data.

Abstract

Knowledge of protein-ligand binding residues is important for understanding the functions of proteins and their interaction mechanisms. Accurately identifying the potential binding sites of a specific ligand from an experimentally solved protein structure remains a challenging problem. Compared with structure-alignment-based methods, machine learning algorithms offer a flexible alternative that depends less on annotated homogeneous protein structures. Several factors are important for an efficient protein-ligand prediction model, e.g. a discriminative feature representation and an effective learning architecture that can handle both large-scale and severely imbalanced data.
In this study, we propose a novel deep-learning-based method called DELIA for protein-ligand binding residue prediction. In DELIA, a hybrid deep neural network is designed to integrate 1D sequence-based features with 2D structure-based amino acid distance matrices. To overcome the severe data imbalance between binding and non-binding residues, three strategies are designed to enhance the model: oversampling within mini-batches, random under-sampling, and a stacking ensemble. Experimental results on five benchmark datasets demonstrate the effectiveness of the proposed DELIA pipeline.
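As a rough illustration of two ingredients named above, the following sketch builds a 2D residue distance matrix and draws mini-batches in which binding residues are oversampled. It is not DELIA's implementation: the function names `distance_matrix` and `oversampled_batches`, the use of one 3D coordinate per residue (for example the C-alpha atom), the 30% positive share per batch, and the toy data are all assumptions made for the example.

```python
import numpy as np

def distance_matrix(coords):
    """Pairwise Euclidean distances between residues.

    coords: array of shape (L, 3), one 3D point per residue
    (assumed here to be a representative atom such as C-alpha).
    """
    diff = coords[:, None, :] - coords[None, :, :]   # shape (L, L, 3)
    return np.sqrt((diff ** 2).sum(axis=-1))         # shape (L, L)

def oversampled_batches(labels, batch_size, pos_fraction, rng):
    """Yield index arrays where binding residues (label 1) fill a fixed
    share of every mini-batch, sampled with replacement."""
    pos = np.flatnonzero(labels == 1)
    neg = rng.permutation(np.flatnonzero(labels == 0))
    n_pos = max(1, int(round(batch_size * pos_fraction)))
    n_neg = batch_size - n_pos
    # Walk once through the shuffled negatives; leftovers that cannot
    # fill a whole batch are dropped.
    for start in range(0, len(neg) - n_neg + 1, n_neg):
        neg_idx = neg[start:start + n_neg]
        pos_idx = rng.choice(pos, size=n_pos, replace=True)
        yield rng.permutation(np.concatenate([pos_idx, neg_idx]))

# Toy data: 50 residues with random coordinates, 3 of them binding.
rng = np.random.default_rng(0)
coords = rng.normal(size=(50, 3)) * 10.0
labels = np.zeros(50, dtype=int)
labels[[3, 17, 29]] = 1

dist = distance_matrix(coords)
batches = list(oversampled_batches(labels, batch_size=10,
                                   pos_fraction=0.3, rng=rng))
```

In this toy setup the binding residues make up 6% of the data but 30% of each batch, so every gradient step sees positive examples. Random under-sampling and the stacking ensemble mentioned in the abstract are not shown here.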
The web server of DELIA is available at www.csbio.sjtu.edu.cn/bioinf/delia/.
Supplementary data are available at Bioinformatics online.
© The Author(s) (2020). Published by Oxford University Press. All rights reserved.
