Limited generalizability of deep learning algorithm for pediatric pneumonia classification on external data.

Abstract

This study aimed to (1) develop a deep learning system (DLS) to identify pneumonia in pediatric chest radiographs and (2) evaluate its generalizability by comparing its performance on internal versus external test datasets.

Radiographs of patients between 1 and 5 years old from the Guangzhou Women and Children’s Medical Center (Guangzhou dataset) and the NIH ChestXray14 dataset were included. We used 5232 radiographs from the Guangzhou dataset to train a ResNet-50 deep convolutional neural network (DCNN) to identify pediatric pneumonia. The DCNN was tested on a holdout set of 624 radiographs from the Guangzhou dataset (internal test set) and 383 radiographs from the NIH ChestXray14 dataset (external test set). Receiver operating characteristic curves were generated, and the area under the curve (AUC) was compared via the DeLong parametric method. Colored heatmaps were generated using class activation mapping (CAM) to identify the image pixels most important to DCNN decision-making.

The DCNN achieved an AUC of 0.95 on the internal test set and 0.54 on the external test set for identifying pneumonia (p < 0.0001). Heatmaps generated by the DCNN showed that the algorithm focused on clinically relevant features for images from the internal test set, but not for images from the external test set.

Our model had high performance when tested on an internal dataset but significantly lower performance when tested on an external dataset. Likewise, marked differences existed in the clinical relevance of the features highlighted by heatmaps generated from internal versus external datasets. This study underscores potential limitations in the generalizability of such DLS models.

© 2021. American Society of Emergency Radiology.
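The internal-versus-external comparison above rests on the AUC, which can be read as the probability that a randomly chosen pneumonia radiograph receives a higher model score than a randomly chosen normal one. The abstract does not say how the authors computed it, so the following is only a minimal sketch of that definition in plain Python. The function name `roc_auc` and all labels and scores are hypothetical toy values, not data from the study.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs
    in which the positive example has the higher score.
    Tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores (1 = pneumonia, 0 = normal); NOT the study's data.
internal_labels = [1, 1, 1, 0, 0, 0]
internal_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

external_labels = [1, 1, 1, 0, 0, 0]
external_scores = [0.6, 0.3, 0.5, 0.55, 0.4, 0.35]

internal_auc = roc_auc(internal_labels, internal_scores)
external_auc = roc_auc(external_labels, external_scores)
print(f"internal AUC = {internal_auc:.2f}, external AUC = {external_auc:.2f}")
```

An AUC near 0.5, as in the toy external set here, means the scores rank positives above negatives about as often as chance would. This pairwise form is quadratic in the number of examples, which is fine for test sets of a few hundred images; a rank-based formulation would likely be preferable for larger ones.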
