Learning from undercoded clinical records for automated International Classification of Diseases (ICD) coding.

Researchers

Chunlei Tang, Dan Shi, David W Bates, Joseph M Plasek, Li Zhou, Lifang He, Yao Zhang, Yifei Lin, Yucheng Jin, Yun Xiong

Journal

Journal of the American Medical Informatics Association: JAMIA

Abstract

To develop an unbiased objective for learning automatic coding algorithms from clinical records annotated with only part of the relevant International Classification of Diseases codes, because annotation noise in undercoded clinical records used as training data can mislead the learning process of deep neural networks.

We use Medical Information Mart for Intensive Care III as our dataset. We employ positive-unlabeled learning to obtain an unbiased loss estimate that is free of misleading training signals. We then use a reweighting mechanism to compensate for the imbalance between positive and negative samples. To further close the performance gap caused by poor-quality annotation, we integrate supervision from the automatic annotation tool Medical Concept Annotation Toolkit, which can ease the heavy burden of manual validation.

Our benchmarking results show that positive-unlabeled learning with reweighting outperforms competitive baseline methods over a range of missing-label ratios. Integrating supervision from the annotation tool further boosts performance.

Given the annotation noise and severe class imbalance, unbiased loss estimation and the reweighting mechanism are both important for learning from undercoded clinical records. The unbiased loss requires estimating false-negative ratios, and estimating them with trained models is practical and competitive.

Combining positive-unlabeled learning with reweighting and supervision from the annotation tool is a promising way to learn from undercoded clinical records.

© The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: [email protected].
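The abstract does not give the exact objective, so the following is only a minimal sketch of one standard way to build an unbiased loss when relevant codes can be missing but assigned codes are trusted. It assumes a single known false-negative ratio per code (the probability that a truly relevant code was left off the record) and a simple positive-class weight for the imbalance. The names `bce`, `corrected_loss`, `fn_ratio`, and `pos_weight` are hypothetical and are not taken from the paper.

```python
import numpy as np


def bce(p, y):
    """Binary cross-entropy of predicted probability p against label y (0 or 1)."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))


def corrected_loss(p, observed, fn_ratio, pos_weight=1.0):
    """Per-example loss on observed labels for one ICD code.

    Assumptions (illustrative, not the paper's stated method):
      - an assigned code (observed == 1) is always truly relevant;
      - a truly relevant code is missing with probability fn_ratio < 1.
    Under these assumptions the expected value of this loss over the
    annotation noise equals the weighted loss on the true labels.
    """
    loss_pos = pos_weight * bce(p, 1.0)  # reweighted loss for a relevant code
    loss_neg = bce(p, 0.0)               # loss for an irrelevant code
    # Coded examples carry extra weight and subtract the share of negative
    # loss that uncoded-but-relevant examples will wrongly contribute.
    coded = (loss_pos - fn_ratio * loss_neg) / (1.0 - fn_ratio)
    return np.where(observed == 1, coded, loss_neg)


if __name__ == "__main__":
    probs = np.array([0.3, 0.9, 0.05])
    observed = np.array([1, 0, 0])
    print(corrected_loss(probs, observed, fn_ratio=0.4, pos_weight=2.0))
```

In this sketch the correction term can make the loss on a coded example negative when the model is already confident, which is one reason practical variants often clip or otherwise constrain the estimate. The false-negative ratio is passed in as a constant here; the abstract indicates the authors estimate it, including through trained models.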

Similar Posts

Synthetic minority oversampling of vital statistics data with generative adversarial networks.

Using natural language processing for automated classification of disease and to identify misclassified ICD codes in cardiac disease.

Ontology-Based AI Design Patterns and Constraints in Cancer Registry Data Validation.

Building a challenging medical dataset for comparative evaluation of classifier capabilities.

GGAECDA: Predicting circRNA-disease associations using graph autoencoder based on graph representation learning.
