
How deepBICS quantifies intensities of transcription factor-DNA binding and facilitates prediction of single nucleotide variant pathogenicity with a deep learning model trained on ChIP-seq data sets.

Abstract

The binding of DNA sequences to cell type-specific transcription factors is essential for regulating gene expression in all organisms. Many variants occurring in these binding regions play crucial roles in human disease by disrupting the cis-regulation of gene expression. We first implemented a sequence-based deep learning model called deepBICS to quantify the intensity of transcription factor-DNA binding. The experimental results not only showed the superiority of deepBICS on ChIP-seq data sets but also suggested that deepBICS, used as a language model, could help classify disease-related and neutral variants. We then built a language model-based method called deepBICS4SNV to predict the pathogenicity of single nucleotide variants. The good performance of deepBICS4SNV on two tests related to Mendelian disorders and viral diseases shows that the sequence contextual information derived from language models can improve prediction accuracy and generalization capability.
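To make the idea of sequence-based variant scoring concrete, here is a minimal sketch of the general pattern: encode a reference DNA window and its single-nucleotide-variant counterpart, score both with a binding model, and compare the two scores. This is not the deepBICS or deepBICS4SNV implementation. The function names, the motif, and the `toy_binding_score` scorer are all hypothetical stand-ins chosen for illustration; in practice a trained model would take the place of the scorer.

```python
from typing import Callable, List

BASES = "ACGT"
Encoded = List[List[int]]


def one_hot(seq: str) -> Encoded:
    """Encode a DNA string as rows of (A, C, G, T); unknown bases such as N become all zeros."""
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def apply_snv(seq: str, pos: int, alt: str) -> str:
    """Return a copy of seq with the base at 0-based index pos replaced by alt."""
    if not 0 <= pos < len(seq):
        raise ValueError("position lies outside the sequence window")
    if len(alt) != 1 or alt.upper() not in BASES:
        raise ValueError("alt must be a single base from ACGT")
    return seq[:pos] + alt.upper() + seq[pos + 1:]


# Assumption: an arbitrary illustrative motif, not a learned model.
MOTIF_ROWS = one_hot("TGACTCA")


def toy_binding_score(encoded: Encoded) -> float:
    """Hypothetical scorer: best fraction of motif positions matched over all windows."""
    width = len(MOTIF_ROWS)
    best = 0
    for start in range(len(encoded) - width + 1):
        matches = sum(
            sum(x * m for x, m in zip(encoded[start + i], MOTIF_ROWS[i]))
            for i in range(width)
        )
        best = max(best, matches)
    return best / width


def variant_effect(
    seq: str, pos: int, alt: str, score_fn: Callable[[Encoded], float]
) -> float:
    """Score difference (alternate minus reference); negative means weaker predicted binding."""
    ref_score = score_fn(one_hot(seq))
    alt_score = score_fn(one_hot(apply_snv(seq, pos, alt)))
    return alt_score - ref_score


# Usage: a variant inside the motif lowers the toy score.
reference = "GGCATGACTCAGTCC"
delta = variant_effect(reference, 8, "G", toy_binding_score)
print(f"score change: {delta:.3f}")
```

Passing the scorer in as a function keeps the reference-versus-alternate comparison separate from the model itself, so the same comparison could be reused with any sequence model that maps an encoded window to a binding intensity.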
