BertSNR: an interpretable deep learning framework for single nucleotide resolution identification of transcription factor binding sites based on DNA language model.

Abstract

Transcription factors (TFs) are pivotal in the regulation of gene expression, and accurate identification of transcription factor binding sites (TFBSs) at high resolution is crucial for understanding the mechanisms underlying gene regulation. Identifying TFBSs from DNA sequences remains a significant challenge in computational biology. A variety of computational approaches have been developed to address it, but these methods are limited in their ability to achieve high-resolution identification and often lack interpretability.

We propose BertSNR, an interpretable deep learning framework for identifying TFBSs at single nucleotide resolution. BertSNR integrates sequence-level and token-level information through multi-task learning based on pre-trained DNA language models. Benchmark comparisons show that BertSNR outperforms existing state-of-the-art methods in TFBS prediction. Importantly, we enhanced the interpretability of the model through attention weight visualization and motif analysis, and discovered a subtle relationship between attention weights and motifs. Moreover, BertSNR effectively identifies TFBSs in promoter regions, facilitating the study of intricate gene regulation.

The BertSNR source code can be found at https://github.com/lhy0322/BertSNR.

© The Author(s) 2024. Published by Oxford University Press.
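The abstract does not spell out how the sequence-level and token-level objectives are combined. As a rough illustration of what multi-task learning over these two levels can look like, the sketch below mixes a sequence-level binary cross-entropy (does this sequence contain a binding site?) with a mean per-position binary cross-entropy (is this position inside a binding site?). The function names, the weighting parameter `alpha`, and the choice of cross-entropy are all assumptions made for illustration and are not taken from the BertSNR code.

```python
import math


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    # Clip so that log() never receives exactly 0.
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def multitask_loss(seq_prob, seq_label, token_probs, token_labels, alpha=0.5):
    """Hypothetical weighted sum of a sequence-level and a token-level loss.

    seq_prob / seq_label: predicted probability and 0/1 label for the
        whole sequence.
    token_probs / token_labels: per-position probabilities and 0/1 labels.
    alpha: assumed mixing weight in [0, 1]; 1.0 uses only the sequence
        loss, 0.0 uses only the token loss.
    """
    if not token_probs or len(token_probs) != len(token_labels):
        raise ValueError("token_probs and token_labels must be non-empty and equal in length")
    seq_loss = bce(seq_prob, seq_label)
    # Average the token loss over positions so sequence length does not
    # change the relative weight of the two terms.
    token_loss = sum(bce(p, y) for p, y in zip(token_probs, token_labels)) / len(token_probs)
    return alpha * seq_loss + (1.0 - alpha) * token_loss


if __name__ == "__main__":
    # Toy example: a 6-position sequence whose middle positions are bound.
    loss = multitask_loss(
        seq_prob=0.9,
        seq_label=1,
        token_probs=[0.1, 0.2, 0.8, 0.9, 0.7, 0.1],
        token_labels=[0, 0, 1, 1, 1, 0],
    )
    print(f"combined loss: {loss:.4f}")
```

In a real model both probabilities would come from heads on top of a shared pre-trained encoder, so that gradients from both tasks update the same representation; the authors' repository linked above is the reference for the actual design.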
