Predicting protein-peptide binding residues via interpretable deep learning.

Abstract

Identifying protein-peptide binding residues is fundamentally important for understanding the mechanisms of protein function and for drug discovery. Although several computational methods have been developed, they rely heavily on third-party tools or information for feature design, which easily results in low computational efficiency and poor predictive performance. To address these limitations, we propose PepBCL, a novel BERT (Bidirectional Encoder Representations from Transformers)-based Contrastive Learning framework that predicts protein-peptide binding residues from protein sequences alone. PepBCL is an end-to-end predictive model that is independent of hand-designed features. Specifically, we introduce a well pre-trained protein language model that automatically extracts and learns high-level latent representations of protein sequences relevant to protein structure and function. Further, we design a novel contrastive learning module to optimize the feature representations of binding residues in the imbalanced dataset. We demonstrate that our proposed method significantly outperforms state-of-the-art methods in benchmark comparisons and achieves more robust performance. Moreover, we find that integrating traditional features with our learned features further improves performance. Our results highlight the flexibility and adaptability of deep learning-based protein language models in capturing both conserved and non-conserved sequential characteristics of peptide-binding residues. Interestingly, we demonstrate that peptide-binding residues in local sequential regions have more specific sequential patterns than other protein-ligand binding residues, which may reflect functional differences.
Finally, to facilitate the use of our method, we have established an online prediction platform implementing PepBCL, available at http://server.wei-group.net/PepBCL/. Code repository: https://github.com/Ruheng-W/PepBCL. Supplementary data are available at Bioinformatics online.

© The Author(s) (2022). Published by Oxford University Press. All rights reserved.
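The abstract does not spell out the form of the contrastive learning module, so the following is only an illustrative sketch of the general idea, assuming a classic margin-based pairwise contrastive loss over per-residue embeddings. The function names, the `margin` parameter, and the toy embeddings are hypothetical and are not taken from PepBCL itself. Residues with the same label (binding or non-binding) are pulled together, while residues with different labels are pushed at least `margin` apart.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_contrastive_loss(embeddings, labels, margin=1.0):
    """Mean margin-based contrastive loss over all residue pairs.

    Assumed form (not confirmed by the source):
      same label      -> d^2
      different label -> max(0, margin - d)^2
    where d is the Euclidean distance between the two embeddings.
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(embeddings)), 2):
        d = euclidean(embeddings[i], embeddings[j])
        if labels[i] == labels[j]:
            total += d ** 2                      # pull same-class residues together
        else:
            total += max(0.0, margin - d) ** 2   # push different classes apart
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Toy example: two binding residues (label 1) and one non-binding residue (label 0).
toy_embeddings = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]
toy_labels = [1, 1, 0]
toy_loss = pairwise_contrastive_loss(toy_embeddings, toy_labels, margin=1.0)
print(toy_loss)
```

In this toy case the two binding residues coincide and the non-binding residue lies farther away than the margin, so no pair contributes to the loss. In a real model the embeddings would come from the protein language model, and the loss would typically be combined with a classification objective.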
