The usage of deep neural network improves distinguishing COVID-19 from other suspected viral pneumonia by clinicians on chest CT: a real-world study.
Abstract
Based on the current clinical routine, we aimed to develop a novel deep learning model to distinguish coronavirus disease 2019 (COVID-19) pneumonia from other types of pneumonia and validate it with a real-world dataset (RWD).
A total of 563 chest CT scans of 380 patients (227/380 were diagnosed with COVID-19 pneumonia) from 5 hospitals were collected to train our deep learning (DL) model. Lung regions were extracted by U-net, then transformed and fed to a pre-trained ResNet-50-based IDANNet (Identification and Analysis of New covid-19 Net) to produce a diagnostic probability. Fivefold cross-validation was employed to validate the model. Another 318 scans of 316 patients (243/316 were diagnosed with COVID-19 pneumonia) from 2 other hospitals were enrolled prospectively as the RWD to test our DL model's performance and compare it with that of 3 experienced radiologists.
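Because the training set contains more scans than patients (563 scans of 380 patients), one way to set up fivefold cross-validation is to assign whole patients, rather than individual scans, to folds. The sketch below illustrates that idea only. The abstract does not state how the folds were built, so the patient-level grouping, the function name `patient_level_folds`, and the toy patient IDs are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def patient_level_folds(scan_patient_ids, k=5, seed=0):
    """Return k (train_idx, val_idx) pairs over scan indices.

    Assumption (not from the paper): all scans of one patient are kept in
    the same fold so that no patient appears in both training and validation.
    """
    # Group scan indices by the patient they belong to.
    scans_by_patient = defaultdict(list)
    for scan_idx, patient_id in enumerate(scan_patient_ids):
        scans_by_patient[patient_id].append(scan_idx)

    # Shuffle patients reproducibly, then deal them round-robin into k folds.
    patients = sorted(scans_by_patient)
    random.Random(seed).shuffle(patients)
    folds = [[] for _ in range(k)]
    for i, patient_id in enumerate(patients):
        folds[i % k].extend(scans_by_patient[patient_id])

    # Each fold serves once as the validation set; the rest form the training set.
    splits = []
    for held_out in range(k):
        val_idx = sorted(folds[held_out])
        train_idx = sorted(
            idx for f, fold in enumerate(folds) if f != held_out for idx in fold
        )
        splits.append((train_idx, val_idx))
    return splits


# Hypothetical toy data: patient "a" and "c" each contribute two scans.
toy_ids = ["a", "a", "b", "c", "c", "d", "e", "f"]
for train_idx, val_idx in patient_level_folds(toy_ids, k=5, seed=0):
    print(len(train_idx), len(val_idx))
```

This sketch does not stratify by diagnosis; a fuller version might also balance the proportion of COVID-19 cases across folds.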
A three-dimensional DL model was successfully established. The diagnostic threshold to differentiate COVID-19 from non-COVID-19 pneumonia was 0.685, with an AUC of 0.906 (95% CI: 0.886-0.913) in the internal validation group. In the RWD cohort, our model achieved an AUC of 0.868 (95% CI: 0.851-0.876) with a sensitivity of 0.811 and a specificity of 0.822, non-inferior to the performance of 3 experienced radiologists, suggesting promise for use in clinical practice.
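For readers less familiar with these metrics, the following minimal sketch shows how sensitivity, specificity, and AUC can be computed from per-scan diagnostic probabilities. It is not the authors' evaluation code. The function names and toy values are hypothetical, and treating a probability greater than or equal to the threshold as a COVID-19 prediction is an assumption; only the threshold value 0.685 comes from the text.

```python
def sensitivity_specificity(labels, probs, threshold=0.685):
    """Sensitivity and specificity when probs >= threshold counts as COVID-19.

    labels: 1 for COVID-19 pneumonia, 0 for non-COVID-19 pneumonia.
    The ">=" convention is an assumption; the source gives only the threshold.
    """
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probs):
        predicted_positive = p >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, probs):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy probabilities, not data from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_probs = [0.9, 0.7, 0.5, 0.6, 0.3, 0.2]
sens, spec = sensitivity_specificity(toy_labels, toy_probs)
print(sens, spec, auc(toy_labels, toy_probs))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports both.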
The established DL model accurately distinguished COVID-19 pneumonia from other suspected pneumonia in a real-world setting and could become a reliable tool in clinical routine.
• In an internal validation set, our DL model achieved its best performance in differentiating COVID-19 from non-COVID-19 pneumonia, with a sensitivity of 0.836, a specificity of 0.800, and an AUC of 0.906 (95% CI: 0.886-0.913), when the threshold was set at 0.685.
• In the prospective RWD cohort, our DL diagnostic model achieved a sensitivity of 0.811, a specificity of 0.822, and an AUC of 0.868 (95% CI: 0.851-0.876), non-inferior to the performance of 3 experienced radiologists.
• The attention heatmaps were generated entirely by the model without additional manual annotation, and the attention regions were highly aligned with the ROIs used by human radiologists for diagnosis.