Performance of a Deep Learning Algorithm Compared with Radiologic Interpretation for Lung Cancer Detection on Chest Radiographs in a Health Screening Population.
Abstract
Background The performance of a deep learning algorithm for lung cancer detection on chest radiographs in a health screening population is unknown.
Purpose To validate a commercially available deep learning algorithm for lung cancer detection on chest radiographs in a health screening population.
Materials and Methods Out-of-sample testing of a deep learning algorithm was retrospectively performed using chest radiographs from individuals undergoing a comprehensive medical check-up between July 2008 and December 2008 (validation test). To evaluate the algorithm performance for visible lung cancer detection, the area under the receiver operating characteristic curve (AUC) and diagnostic measures, including sensitivity and false-positive rate (FPR), were calculated. The algorithm performance was compared with that of radiologists using the McNemar test and the Moskowitz method. Additionally, the deep learning algorithm was applied to a screening cohort undergoing chest radiography between January 2008 and December 2012, and its performance was calculated.
Results In a validation test comprising 10 285 radiographs from 10 202 individuals (mean age, 54 years ± 11 [standard deviation]; 5857 men) with 10 radiographs of visible lung cancers, the algorithm’s AUC was 0.99 (95% confidence interval: 0.97, 1), and it showed comparable sensitivity (90% [nine of 10 radiographs]) to that of the radiologists (60% [six of 10 radiographs]; P = .25) with a higher FPR (3.1% [319 of 10 275 radiographs] vs 0.3% [26 of 10 275 radiographs]; P < .001). In the screening cohort of 100 525 chest radiographs from 50 070 individuals (mean age, 53 years ± 11; 28 090 men) with 47 radiographs of visible lung cancers, the algorithm’s AUC was 0.97 (95% confidence interval: 0.95, 0.99), and its sensitivity and FPR were 83% (39 of 47 radiographs) and 3% (2999 of 100 478 radiographs), respectively.
Conclusion A deep learning algorithm detected lung cancers on chest radiographs with a performance comparable to that of radiologists, which will be helpful for radiologists in healthy populations with a low prevalence of lung cancer. © RSNA, 2020 Online supplemental material is available for this article. See also the editorial by Armato in this issue.
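The diagnostic measures quoted in the Results can be recomputed from the counts given in the abstract. The following Python sketch does this and also shows how an exact (binomial) McNemar test works for paired readings. It is an illustration only, not the study's analysis code. The helper names `rate` and `mcnemar_exact` are hypothetical, and the discordant-pair counts passed to `mcnemar_exact` are assumptions: the abstract does not report how the algorithm's and the radiologists' detections overlapped, nor whether an exact or asymptotic form of the test was used.

```python
from math import comb


def rate(numerator: int, denominator: int) -> float:
    """Proportion such as sensitivity (TP / cancers) or FPR (FP / non-cancers)."""
    return numerator / denominator


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar P value from discordant pair counts.

    b: cases positive for reader A only; c: cases positive for reader B only.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Counts taken directly from the abstract.
validation = {
    "algorithm_sensitivity": rate(9, 10),
    "radiologist_sensitivity": rate(6, 10),
    "algorithm_fpr": rate(319, 10275),
    "radiologist_fpr": rate(26, 10275),
}
screening = {
    "algorithm_sensitivity": rate(39, 47),
    "algorithm_fpr": rate(2999, 100478),
}

for name, value in {**validation, **{"screening_" + k: v for k, v in screening.items()}}.items():
    print(f"{name}: {value:.1%}")

# HYPOTHETICAL discordant split (not reported in the abstract): the algorithm
# detects 3 cancers the radiologists missed, and the radiologists detect none
# that the algorithm missed.
print(f"exact McNemar P for b=3, c=0: {mcnemar_exact(3, 0):.2f}")
```

Because McNemar's test uses only the discordant pairs, a difference of nine versus six detections among 10 cancers rests on very few informative cases, which is why a 30-percentage-point gap in sensitivity can still be statistically nonsignificant.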