|

TT3D: Leveraging Pre-Computed Protein 3D Sequence Models to Predict Protein-Protein Interactions.

Researchers

Journal

Modalities

Models

Abstract

High-quality computational structural models are now pre-computed and available for nearly every protein in UniProt. However, the best way to leverage these models to predict which pairs of proteins interact in a high-throughput manner is not immediately clear. The recent Foldseek method of van Kempen et al. encodes the structural information of distances and angles along the protein backbone into a linear string of the same length as the protein string, using tokens from a 21-letter discretized structural alphabet (3Di).We show that using both the amino acid sequence and the 3Di sequence generated by Foldseek as inputs to our recent deep-learning method, Topsy-Turvy, substantially improves the performance of predicting protein-protein interactions cross-species. Thus TT3D presents a way to reuse all the computational effort going into producing high-quality structural models from sequence, while being sufficiently lightweight so that high-quality binary PPI predictions across all protein pairs can be made genome-wide.TT3D is available at https://github.com/samsledje/D-SCRIPT. An archived version of the code at time of submission can be found at https://zenodo.org/records/10037674.Supplementary data are available at Bioinformatics online.© The Author(s) 2023. Published by Oxford University Press.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *