|

Multisemantic Level Patch Merger Vision Transformer for Diagnosis of Pneumonia.

Researchers

Journal

Modalities

Models

Abstract

The most popular test for pneumonia, a serious health threat to children, is chest X-ray imaging. However, the diagnosis of pneumonia relies on the expertise of experienced radiologists, and the scarcity of medical resources has forced us to conduct research on CAD (computer-aided diagnosis). In this study, we propose MP-ViT, the Multisemantic Level Patch Merger Vision Transformer, to achieve automatic diagnosis of pneumonia in chest X-ray images. We introduce Patch Merger to reduce the computational cost of ViT. Meanwhile, the intermediate results calculated by Patch Merger participate in the final classification in a concise way, so as to make full use of the intermediate information of the high-level semantic space to learn from local to overall and to avoid information loss caused by Patch Merger. We conducted experiments on a dataset with 3,883 chest X-ray images described as pneumonia and 1,349 images labeled as normal, and the results show that even without pretraining ViT on a large dataset, our model can achieve the accuracy of 0.91, the precision of 0.92, the recall of 0.89, and the F1-score of 0.90, which is better than Patch Merger on a small dataset. The model can provide CAD for physicians and improve diagnostic reliability.Copyright © 2022 Zheng Jiang and Liang Chen.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *