Comparison of Evaluation Metrics of Deep Learning for Imbalanced Imaging Data in Osteoarthritis Studies.

Abstract

The aim of this study was to compare evaluation metrics for deep learning methods developed on imbalanced imaging data in osteoarthritis (OA) studies.

This retrospective study used 2996 sagittal intermediate-weighted (IW) fat-suppressed (FS) knee MRIs with MRI Osteoarthritis Knee Score (MOAKS) readings from 2467 participants in the Osteoarthritis Initiative (OAI) study. Using the trained deep learning models, we obtained probabilities of the presence of bone marrow lesions (BMLs) from MRIs in the testing dataset at the sub-region (15 sub-regions), compartment, and whole-knee levels. To assess the model’s performance, we compared different evaluation metrics (e.g., receiver operating characteristic (ROC) and precision-recall (PR) curves) in the testing dataset across the various class ratios (presence of BMLs vs. absence of BMLs) found at these three data levels.

In a sub-region with an extremely high imbalance ratio, the model achieved an ROC-AUC of 0.84, a PR-AUC of 0.10, a sensitivity of 0, and a specificity of 1.

The commonly used ROC curve is not sufficiently informative, especially for imbalanced data. We provide the following practical suggestions based on our data analysis: 1) ROC-AUC is recommended for balanced data; 2) PR-AUC should be used for moderately imbalanced data (i.e., when the proportion of the minor class is above 5% and less than 50%); and 3) for severely imbalanced data (i.e., when the proportion of the minor class is below 5%), it is not practical to apply a deep learning model, even with techniques that address imbalanced data.

Copyright © 2023 The Authors. Published by Elsevier Ltd. All rights reserved.
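The contrast between ROC-AUC and PR-AUC on imbalanced data can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the synthetic labels and scores (2% positives, all scores below 0.5), the 0.5 decision threshold, the use of average precision as the PR-AUC estimate, the function names, and the handling of a minor-class proportion of exactly 5%, which the abstract does not specify. It does not reproduce the study's data, models, or reported values.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision, a step-wise estimate of PR-AUC (tied scores are not grouped)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this positive's rank
    return total / true_pos


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity after thresholding the scores (threshold is an assumption)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


def suggest_metric(minor_fraction):
    """Map the minor-class proportion to the abstract's three suggestions.

    Assumption: a proportion of exactly 5% is treated as moderately imbalanced.
    """
    if minor_fraction < 0.05:
        return "deep learning not practical"
    if minor_fraction < 0.5:
        return "PR-AUC"
    return "ROC-AUC"


# Synthetic example: 980 negatives and 20 positives (2% minor class).
# Positives tend to score higher than negatives, but no score reaches 0.5.
neg_scores = [0.4 * i / 980 for i in range(980)]
pos_scores = [0.205 + 0.01 * j for j in range(20)]
labels = [0] * len(neg_scores) + [1] * len(pos_scores)
scores = neg_scores + pos_scores

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
sens, spec = sensitivity_specificity(labels, scores)
minor_fraction = sum(labels) / len(labels)

print(f"ROC-AUC={auc:.2f}  PR-AUC (AP)={ap:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
print("Suggested metric:", suggest_metric(minor_fraction))
```

In this toy setting the ranking-based ROC-AUC looks respectable while the average precision stays low and the thresholded classifier detects no positives, which is the same qualitative pattern the abstract reports for its most imbalanced sub-region.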


