
Global Clue-Guided Cross-Memory Quaternion Transformer Network for Multisource Remote Sensing Data Classification

Abstract

Multisource remote sensing data classification is a challenging research topic; the key is to address the inherent heterogeneity between multimodal data while exploiting their complementarity. Existing deep learning models usually adopt feature-level fusion designs directly, most of which fail to overcome the impact of heterogeneity, limiting their performance. As such, a multimodal joint classification framework, called the global clue-guided cross-memory quaternion transformer network (GCCQTNet), is proposed for multisource data classification, i.e., joint classification of hyperspectral image (HSI) and synthetic aperture radar (SAR)/light detection and ranging (LiDAR) data. First, a three-branch structure is built to extract local and global features, in which an independent squeeze-expansion-like fusion (ISEF) structure updates the local and global representations by treating the global information as an agent, suppressing the negative impact of multimodal heterogeneity layer by layer. A cross-memory quaternion transformer (CMQT) structure is then constructed to model the complex inner relationships between intramodality and intermodality features, capturing more discriminative fusion features that fully characterize multimodal complementarity. Finally, a cross-modality comparative learning (CMCL) structure imposes a consistency constraint on global information learning and, in conjunction with a classification head, guides the end-to-end training of GCCQTNet. Extensive experiments on three public multisource remote sensing datasets demonstrate the superiority of GCCQTNet over other state-of-the-art methods.
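The abstract describes the pipeline only at a high level; the PyTorch sketch below illustrates one plausible reading of that data flow. Everything here is an assumption: the `ISEFBlock` gating form, the use of a vanilla transformer encoder layer as a stand-in for the cross-memory quaternion transformer, and the InfoNCE-style `cmcl_loss` are placeholders for illustration, not the paper's actual implementation.

```python
# Hedged sketch of the GCCQTNet data flow described in the abstract.
# Module internals and dimensions are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ISEFBlock(nn.Module):
    """Independent squeeze-expansion-like fusion (assumed form): the global
    token acts as an agent that exchanges information with each local branch,
    suppressing heterogeneity layer by layer."""
    def __init__(self, dim):
        super().__init__()
        self.squeeze = nn.Linear(3 * dim, dim)   # compress both locals + global
        self.expand_hsi = nn.Linear(dim, dim)    # redistribute agent to HSI branch
        self.expand_sar = nn.Linear(dim, dim)    # redistribute agent to SAR/LiDAR branch

    def forward(self, f_hsi, f_sar, f_global):
        agent = self.squeeze(torch.cat([f_hsi, f_sar, f_global], dim=-1))
        # The global agent gates each local stream (assumed gating mechanism).
        f_hsi = f_hsi + torch.sigmoid(self.expand_hsi(agent)) * f_hsi
        f_sar = f_sar + torch.sigmoid(self.expand_sar(agent)) * f_sar
        return f_hsi, f_sar, agent

class GCCQTNetSketch(nn.Module):
    """Three-branch fusion -> CMQT-style joint modeling -> classification head."""
    def __init__(self, dim=64, n_layers=3, n_classes=15):
        super().__init__()
        self.isef = nn.ModuleList(ISEFBlock(dim) for _ in range(n_layers))
        # Placeholder for the cross-memory quaternion transformer (CMQT):
        # a standard transformer encoder layer, not the paper's quaternion design.
        self.cmqt = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, f_hsi, f_sar, f_global):
        for blk in self.isef:                    # layer-by-layer ISEF fusion
            f_hsi, f_sar, f_global = blk(f_hsi, f_sar, f_global)
        tokens = torch.stack([f_hsi, f_sar, f_global], dim=1)
        fused = self.cmqt(tokens).mean(dim=1)    # intra-/inter-modality modeling
        return self.head(fused), f_hsi, f_sar

def cmcl_loss(z_hsi, z_sar, tau=0.1):
    """Cross-modality comparative learning: an InfoNCE-style consistency
    constraint between the two modalities' projections (assumed loss form)."""
    z1, z2 = F.normalize(z_hsi, dim=-1), F.normalize(z_sar, dim=-1)
    logits = z1 @ z2.t() / tau
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)
```

Under these assumptions, training would minimize the classification loss on the fused output plus `cmcl_loss` on the two branch features, matching the abstract's description of CMCL guiding end-to-end training alongside the classification head.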
