BoostXML: Gradient Boosting for Extreme Multilabel Text Classification With Tail Labels.

Journal

IEEE transactions on neural networks and learning systems

Abstract

Multilabel learning involving hundreds of thousands or even millions of labels is referred to as extreme multilabel learning (XML). In XML, labels often follow a power-law distribution, so that the majority of labels occur in very few data points; these are the tail labels. Recent years have witnessed intensive use of deep-learning methods for high-performance XML, but these methods are typically optimized for head labels with abundant training instances and pay less attention to performance on tail labels, which, like needles in haystacks, are often the focus of attention in real-life applications. In light of this, we present BoostXML, a deep learning-based method for extreme multilabel text classification that is enhanced by gradient boosting. In each Boosting Step, BoostXML pays more attention to tail labels by optimizing the residual, which comes mostly from unfitted training instances with tail labels. A Corrective Step is further proposed to avoid a mismatch between the text encoder and the weak learners during optimization, which reduces the risk of falling into local optima and improves model performance. A Pretraining Step is also introduced in the initial stage of BoostXML to avoid excessive bias toward tail labels. Extensive experiments on five benchmark datasets against state-of-the-art baselines demonstrate the advantage of BoostXML in tail-label prediction.
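The idea of a boosting step that fits the residual can be illustrated with a minimal, generic sketch of gradient boosting for multilabel classification. This is not the BoostXML implementation: the linear ridge weak learners, the synthetic data, and every name below (`boost`, `fit_linear_weak_learner`, `lr`, `ridge`) are assumptions made for illustration, and the text encoder, the Corrective Step, and the Pretraining Step are not modeled.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce(Y, P, eps=1e-12):
    # Mean binary cross-entropy over all (instance, label) pairs.
    P = np.clip(P, eps, 1.0 - eps)
    return float(-np.mean(Y * np.log(P) + (1.0 - Y) * np.log(1.0 - P)))


def fit_linear_weak_learner(X, R, ridge=1e-2):
    # Hypothetical weak learner: ridge regression from features to residuals.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ R)


def boost(X, Y, n_steps=20, lr=0.5):
    n, n_labels = Y.shape
    F = np.zeros((n, n_labels))          # accumulated logits of the ensemble
    learners = []
    losses = [bce(Y, sigmoid(F))]        # loss before any boosting step
    for _ in range(n_steps):
        # Residual = negative gradient of the cross-entropy w.r.t. the logits.
        # It is largest where the current ensemble still fits poorly.
        R = Y - sigmoid(F)
        W = fit_linear_weak_learner(X, R)
        F = F + lr * (X @ W)             # add the new weak learner
        learners.append(W)
        losses.append(bce(Y, sigmoid(F)))
    return learners, losses


# Synthetic data (an assumption): later label columns are made rarer
# by a more negative bias, loosely imitating head and tail labels.
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(200, 10))
X = np.hstack([X_raw, np.ones((200, 1))])        # append an intercept column
W_true = rng.normal(size=(10, 8))
label_bias = -0.5 * np.arange(8)
Y = ((X_raw @ W_true + label_bias) > 0).astype(float)

learners, losses = boost(X, Y)
print("initial loss:", losses[0])
print("final loss:", losses[-1])
```

Because each weak learner is fit to `Y - sigmoid(F)`, instances and labels that are already well predicted contribute little to later steps, while poorly fitted ones dominate the residual. In this toy setup that is the only mechanism shown; how BoostXML couples the boosting steps with its text encoder is described in the paper itself.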
