| |

Transformer-based models for ICD-10 coding of death certificates with Portuguese text.

Researchers

Journal

Modalities

Models

Abstract

Natural Language Processing (NLP) can offer important tools for unlocking relevant information from clinical narratives. Although Transformer-based models can achieve remarkable results in several different NLP tasks, these models have been less used in clinical NLP, and particularly in low resource languages, of which Portuguese is one example. It is still not entirely clear whether pre-trained Transformer models are useful for clinical tasks, without further architecture engineering or particular training strategies. In this work, we propose a BERT model to assign ICD-10 codes for causes of death, by analyzing free-text descriptions in death certificates, together with the associated autopsy reports and clinical bulletins, from the Portuguese Ministry of Health. We used a novel pre-training procedure that incorporates in-domain knowledge, and also a fine-tuning method to address the class imbalance issue. Experimental results show that, in this particular clinical task that requires the processing of relatively short documents, Transformer-based models can achieve very strong results, significantly outperforming tailored approaches based on recurrent neural networks.Copyright © 2022 The Author(s). Published by Elsevier Inc. All rights reserved.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *