DeepOCR: A multi-species deep-learning framework for accurate identification of open chromatin regions in livestock.
Abstract
A wealth of experimental evidence has suggested that open chromatin regions (OCRs) are involved in many critical biological activities, such as DNA replication, enhancer activity, and gene transcription. Accurately identifying OCRs in livestock species can provide critical insights into the distribution and characteristics of OCRs for disease treatment in livestock, thereby improving animal welfare. However, most current machine-learning methods for OCR prediction were originally designed for humans and a limited number of model organisms, and thus their performance on non-model organisms, specifically livestock, is often unsatisfactory. To bridge this gap, we propose DeepOCR, a lightweight depth-separable residual network model for predicting OCRs in livestock, including chicken, cattle, and sheep. DeepOCR integrates a single convolution layer and two improved residual structure blocks to extract and learn important features from the input DNA sequences. A fully connected layer is also employed to further process the extracted features and improve the robustness of the entire network. Our benchmarking experiments demonstrated superior prediction performance of DeepOCR compared to state-of-the-art approaches on testing datasets of the three species. The source code of DeepOCR is freely available for academic purposes at https://github.com/jasonzhao371/DeepOCR/. We anticipate that DeepOCR will serve as a practical and reliable computational tool for OCR-related studies in livestock species.
Copyright © 2024 The Authors. Published by Elsevier Ltd. All rights reserved.
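The abstract describes the architecture only at a high level, so the following is a minimal pure-Python sketch of the general idea behind a depth-separable residual block, not the authors' implementation. The one-hot input encoding, the ReLU activation, the identity skip connection, and all function names (`one_hot`, `depthwise_conv1d`, `pointwise_conv1d`, `separable_residual_block`, `conv_param_counts`) are assumptions made for illustration; the actual layer sizes and details are in the DeepOCR repository.

```python
# Illustrative sketch only: all names are hypothetical, not from the DeepOCR code base.

def one_hot(seq):
    """Encode a DNA string as 4 channels (A, C, G, T); unknown bases stay all zeros.
    One-hot encoding is an assumption; the abstract does not state the input encoding."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    channels = [[0.0] * len(seq) for _ in range(4)]
    for pos, base in enumerate(seq.upper()):
        if base in index:
            channels[index[base]][pos] = 1.0
    return channels


def depthwise_conv1d(x, kernels):
    """Apply one kernel per channel with zero 'same' padding (odd kernel length)."""
    out = []
    for channel, kernel in zip(x, kernels):
        half = len(kernel) // 2
        padded = [0.0] * half + channel + [0.0] * half
        out.append([
            sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(channel))
        ])
    return out


def pointwise_conv1d(x, weights):
    """Mix channels at each position; weights[o][c] maps input channel c to output o."""
    length = len(x[0])
    return [
        [sum(w * x[c][i] for c, w in enumerate(row)) for i in range(length)]
        for row in weights
    ]


def separable_residual_block(x, kernels, weights):
    """Depthwise then pointwise convolution, ReLU, plus an identity skip connection.
    Assumes the output has the same number of channels as the input."""
    y = pointwise_conv1d(depthwise_conv1d(x, kernels), weights)
    return [
        [max(0.0, v) + skip for v, skip in zip(y_ch, x_ch)]
        for y_ch, x_ch in zip(y, x)
    ]


def conv_param_counts(c_in, c_out, k):
    """Weights (no bias) in a standard vs. a depthwise-separable 1D convolution."""
    return c_in * c_out * k, c_in * k + c_in * c_out
```

As a rough indication of why such a design can be called lightweight: for a hypothetical layer with 64 input channels, 64 output channels, and kernel width 7, `conv_param_counts(64, 64, 7)` gives 28,672 weights for a standard convolution against 4,544 for the separable version. These sizes are chosen only for the example and are not taken from the paper.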