Technical Adequacy of Fully Automated Artificial Intelligence Body Composition Tools: Assessment in a Heterogeneous Sample of External CT Examinations.

Abstract

Please see the Editorial Comment by Robert D. Boutin discussing this article. Chinese (audio/PDF) and Spanish (audio/PDF) translations are available for this article’s abstract.

BACKGROUND. Clinically usable artificial intelligence (AI) tools analyzing imaging studies should be robust to expected variations in study parameters.

OBJECTIVE. The purposes of this study were to assess the technical adequacy of a set of automated AI abdominal CT body composition tools in a heterogeneous sample of external CT examinations performed outside of the authors’ hospital system and to explore possible causes of tool failure.

METHODS. This retrospective study included 8949 patients (4256 men, 4693 women; mean age, 55.5 ± 15.9 years) who underwent 11,699 abdominal CT examinations performed at 777 unique external institutions with 83 unique scanner models from six manufacturers, with images subsequently transferred to the local PACS for clinical purposes. Three independent automated AI tools were deployed to assess body composition (bone attenuation, amount and attenuation of muscle, amount of visceral and subcutaneous fat). One axial series per examination was evaluated. Technical adequacy was defined as tool output values within empirically derived reference ranges. Failures (i.e., tool output outside of reference range) were reviewed to identify possible causes.

RESULTS. All three tools were technically adequate in 11,431 of 11,699 (97.7%) examinations. At least one tool failed in 268 (2.3%) of the examinations. Individual adequacy rates were 97.8% for the bone tool, 99.1% for the muscle tool, and 98.9% for the fat tool. A single type of image processing error (anisometry error, due to incorrect DICOM header voxel dimension information) accounted for 81 of 92 (88.0%) examinations in which all three tools failed, and all three tools failed whenever this error occurred. Anisometry error was the most common specific cause of failure of all tools (bone, 31.6%; muscle, 81.0%; fat, 62.8%). A total of 79 of 81 (97.5%) anisometry errors occurred on scanners from a single manufacturer; 80 of 81 (98.8%) occurred on the same scanner model. No cause of failure was identified for 59.4% of failures of the bone tool, 16.0% of failures of the muscle tool, or 34.9% of failures of the fat tool.

CONCLUSION. The automated AI body composition tools had high technical adequacy rates in a heterogeneous sample of external CT examinations, supporting the generalizability of the tools and their potential for broad use.

CLINICAL IMPACT. Certain causes of AI tool failure related to technical factors may be largely preventable through use of proper acquisition and reconstruction protocols.
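The adequacy definition in METHODS (tool output within a reference range) and the rates in RESULTS (per-tool rates and the share of examinations in which all tools pass) can be illustrated with a minimal sketch. This is not the authors' implementation: the tool keys, the `REFERENCE_RANGES` values, and the function names are hypothetical placeholders chosen for illustration, and the study's empirically derived ranges are not given in the abstract.

```python
# Hypothetical reference ranges (illustrative values only, NOT from the study).
REFERENCE_RANGES = {
    "bone_hu": (0.0, 500.0),          # bone attenuation, HU
    "muscle_area_cm2": (20.0, 400.0), # muscle cross-sectional area
    "fat_area_cm2": (5.0, 1500.0),    # fat cross-sectional area
}


def is_adequate(tool, value):
    """Return True if a tool's output lies inside its reference range.

    A missing output (None) is treated as a failure; this is an assumption.
    """
    low, high = REFERENCE_RANGES[tool]
    return value is not None and low <= value <= high


def summarize(exams):
    """Compute per-tool adequacy rates and the all-tools-adequate rate.

    exams: list of dicts mapping tool name -> output value (or None).
    """
    if not exams:
        raise ValueError("no examinations to summarize")
    n = len(exams)
    per_tool = {
        tool: sum(is_adequate(tool, exam.get(tool)) for exam in exams) / n
        for tool in REFERENCE_RANGES
    }
    all_adequate = sum(
        all(is_adequate(tool, exam.get(tool)) for tool in REFERENCE_RANGES)
        for exam in exams
    ) / n
    return per_tool, all_adequate


if __name__ == "__main__":
    # Toy data: one examination passes all tools, three fail at least one.
    exams = [
        {"bone_hu": 150, "muscle_area_cm2": 150, "fat_area_cm2": 200},
        {"bone_hu": 900, "muscle_area_cm2": 150, "fat_area_cm2": 200},
        {"bone_hu": 150, "muscle_area_cm2": None, "fat_area_cm2": 200},
        {"bone_hu": -50, "muscle_area_cm2": 5, "fat_area_cm2": 3000},
    ]
    per_tool, all_adequate = summarize(exams)
    print(per_tool, all_adequate)
```

Note that the all-tools rate is computed per examination rather than as a product of the per-tool rates, since failures of different tools on the same examination are not independent (as the anisometry error in RESULTS illustrates).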
