A Contrastive-Learning-Based Deep Neural Network for Cancer Subtyping by Integrating Multi-Omics Data.
Abstract
Accurate identification of cancer subtypes is crucial for disease prognosis evaluation and personalized patient management. Recent advances in computational methods have demonstrated that multi-omics data provide valuable insights into tumor molecular subtyping. However, the high dimensionality and small sample size of the data may result in ambiguous and overlapping cancer subtypes during clustering. In this study, we propose a novel contrastive-learning-based approach to address this issue. The proposed end-to-end deep learning method uses self-supervised learning to extract crucial information from the multi-omics features for patient clustering. By applying our method to nine public cancer datasets, we have demonstrated superior performance compared to existing methods in separating patients with different survival outcomes (p < 0.05). To further evaluate the impact of various omics data on cancer survival, we developed an XGBoost classification model and found that mRNA had the highest importance score, followed by DNA methylation and miRNA. In the presented case study, our method successfully clustered subtypes and identified 14 cancer-related genes, of which 12 (85.7%) were validated through literature review. Our findings demonstrate that our method is capable of identifying cancer subtypes that are both statistically and biologically significant. The code for COLCS is available at https://github.com/Mercuriiio/COLCS. © 2024. International Association of Scientists in the Interdisciplinary Areas.
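The abstract does not state which contrastive objective COLCS uses. As a generic illustration of the idea, the sketch below implements the NT-Xent loss (the objective popularized by SimCLR) in NumPy: two views of each patient's embedding are pulled together while all other patients in the batch are pushed apart. The function name `nt_xent_loss`, the inputs `z1` and `z2`, and the `temperature` default are assumptions made for this example and are not taken from the COLCS code.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss for two views of the same batch.

    z1, z2: arrays of shape (N, d); row i of z1 and row i of z2 are two
    views (hypothetical here) of the same patient's embedding.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)               # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-normalize rows
    sim = (z @ z.T) / temperature                      # scaled cosine similarity
    np.fill_diagonal(sim, -np.inf)                     # a sample is not its own pair
    # For row i the positive is the other view of the same patient.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable row-wise log-softmax.
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - row_max - log_norm
    return -log_prob[np.arange(2 * n), pos].mean()
```

When the two views agree (for example, `z2` is a lightly perturbed copy of `z1`), the loss is lower than when the views are unrelated, which is the signal a contrastive encoder is trained on.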