Cryo2Struct: A Large Labeled Cryo-EM Density Map Dataset for AI-based Reconstruction of Protein Structures

Abstract

The advent of single-particle cryogenic electron microscopy (cryo-EM) has brought forth a new era of structural biology, enabling the routine determination of large biological protein complexes and assemblies at atomic resolution. High-resolution structures of these complexes and assemblies significantly expedite biomedical research and drug discovery. However, automatically and accurately reconstructing protein structures from high-resolution cryo-EM density maps remains time-consuming and challenging when template structures for the protein chains in a target complex are unavailable. Artificial intelligence (AI) methods such as deep learning, when trained on limited amounts of labeled cryo-EM density maps, generate unstable reconstructions. To address this issue, we created Cryo2Struct, a dataset of 7,600 preprocessed cryo-EM density maps whose voxels are labeled according to their corresponding known protein structures, for training and testing AI methods that infer protein structures from density maps. It is larger and of better quality than any existing publicly available dataset. We trained and tested deep learning models on Cryo2Struct to confirm that it is ready for the large-scale development of AI methods for reconstructing protein structures from cryo-EM density maps. The source code, data, and instructions to reproduce our results are freely available at https://github.com/BioinfoMachineLearning/cryo2struct.
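To make the idea of voxels "labeled according to their corresponding known protein structures" concrete, here is a minimal sketch of one way such labeling could work. It is not the Cryo2Struct pipeline's code. The function name `label_voxels`, the (x, y, z) index order, the grid origin, and the 1 Å voxel size are all assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]
Index = Tuple[int, int, int]


def label_voxels(
    atoms: List[Tuple[str, Coord]],
    origin: Coord,
    voxel_size: float,
    shape: Index,
) -> Dict[Index, str]:
    """Assign each atom's label to the voxel that contains it.

    Hypothetical helper: `atoms` holds (label, (x, y, z)) pairs in angstroms,
    `origin` is the coordinate of the grid corner, and `shape` is the grid
    size along x, y, z. Voxels without an atom are simply absent from the
    result, which stands in for a "background" class.
    """
    labels: Dict[Index, str] = {}
    for label, position in atoms:
        # Convert the real-space coordinate to an integer grid index.
        index = tuple(
            math.floor((c - o) / voxel_size) for c, o in zip(position, origin)
        )
        # Skip atoms that fall outside the density grid.
        if all(0 <= i < n for i, n in zip(index, shape)):
            labels[index] = label
    return labels


# Toy input: two atoms inside a 4 x 4 x 4 grid and one far outside it.
atoms = [
    ("CA", (10.2, 11.7, 12.1)),
    ("N", (9.1, 11.0, 12.9)),
    ("CA", (100.0, 0.0, 0.0)),
]
labels = label_voxels(atoms, origin=(8.0, 10.0, 12.0), voxel_size=1.0, shape=(4, 4, 4))
print(labels)
```

In this toy case the two in-bounds atoms map to voxels (2, 1, 0) and (1, 1, 0), and the third is dropped. A real pipeline would also need to handle the axis ordering and origin conventions of the map file format, which this sketch ignores.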
