|

MetBERT: a generalizable and pre-trained deep learning model for the prediction of metastatic cancer from clinical notes.

Researchers

Journal

Modalities

Models

Abstract

Distant metastasis is the major cause of cancer-related deaths; however, early diagnosis of cancer metastasis remains a significant challenge. The recent advances in pre-trained natural language processing models coupled with the accumulation of publicly available Electronic Health Records (EHR) data provide an unprecedented opportunity to computationally tackle the challenge. Here, we fine-tuned multiple state-of-the-art BERT-based models using discharge summaries from the open MIMIC-III dataset and derived MetBERT, a novel model tailored to predict cancer metastasis from clinical notes. MetBERT achieved high performance (AUC=0.94) on our in-house validation dataset, suggesting its high generalizability. In addition, MetBERT enabled determining the date of cancer metastasis using the rich information in clinical notes and therefore could be potentially deployed as a tool for early diagnosis. Finally, we interpreted MetBERT at different scales and revealed a possible association between radiation therapy and metastasis risk in multiple cancer types.©2022 AMIA – All rights reserved.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *