CHD-CXR: a de-identified publicly available dataset of chest x-ray for congenital heart disease.
Abstract
Congenital heart disease is a prevalent birth defect, accounting for approximately one-third of major birth defects. Early detection is the main challenge, especially in medically underserved regions, where a shortage of specialized physicians often leads to missed diagnoses. Standardized chest x-rays can assist in diagnosis and treatment, but their usefulness is limited because the cardiac manifestations are subtle. The emergence of deep learning in computer vision has made it possible to detect subtle changes in chest x-rays, such as lung vessel density, and thereby to detect congenital heart disease in children, which warrants further investigation. However, the lack of expert-annotated, high-quality medical image datasets hinders the progress of artificial intelligence for medical imaging. In response, we have released a dataset containing 828 DICOM chest x-ray files from children with diagnosed congenital heart disease, alongside the corresponding cardiac ultrasound reports. This dataset emphasizes complex structural characteristics, facilitating the transition from machine learning to machine teaching in deep learning. To assess the dataset's applicability, we trained a preliminary model and achieved an area under the receiver operating characteristic curve (AUC) of 0.85. A detailed introduction and the publicly available dataset are provided at: https://www.kaggle.com/competitions/congenital-heart-disease.
© 2024 Zhixin, Gang, Zhixian, Sibao and Silin.
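For readers unfamiliar with the reported metric, the following is a minimal sketch of how an area under the ROC curve can be computed from binary labels and model scores, using the rank-based (Mann-Whitney) formulation. It is not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions made purely for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    Labels are assumed to be 1 (positive) or 0 (negative).
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied items occupy 1-based ranks i+1 .. j; give each the average rank.
        avg_rank = (i + 1 + j) / 2.0
        positives_in_run = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_run
        i = j

    # U statistic for the positive class, normalized by the number of pairs.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy example (hypothetical values, not taken from the dataset).
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(toy_labels, toy_scores))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of positive and negative cases.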