Don't Just Pay Attention, PLANT It

Published in Preprint, 2024

Recommended citation: "Don't Just Pay Attention, PLANT It." Preprint (2024). https://arxiv.org/pdf/2410.23066

Abstract: The keystone of state-of-the-art Extreme Multi-Label Text Classification (XMTC) models is the multi-label attention layer in the decoder, which directs label-specific focus to salient tokens in the input text. Learning these optimal attention weights, however, is onerous and resource-intensive. To alleviate this strain, we introduce PLANT (Pretrained and Leveraged AtteNTion), a novel transfer learning strategy for fine-tuning XMTC decoders. The central idea is to transfer a pretrained learning-to-rank (L2R) model and use its activations as attention weights, so that it serves as the 'planted' attention layer in the decoder. On the full MIMIC-III dataset, PLANT leads on four of seven metrics, and on five metrics for the top-50 code set, demonstrating its effectiveness. Remarkably, on the rare-50 code set, PLANT achieves a 12.7-52.2% improvement on four metrics. On MIMIC-IV, it leads on three metrics. Notably, in low-shot scenarios, PLANT matches the precision of traditional attention models despite using significantly less data (1/10 for precision at 5, 1/5 for precision at 15), highlighting its efficiency with skewed label distributions.
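The abstract describes the core mechanism: a pretrained L2R model whose activations are reused as label-specific attention weights inside the XMTC decoder. Below is a minimal sketch of what "planting" such an attention layer could look like. It is not the authors' implementation; the module names, tensor shapes, and the assumed `l2r_scorer` interface (token states in, per-label token scores out) are illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of a "planted" attention decoder:
# label-specific attention weights come from a frozen, pretrained L2R
# scorer's activations instead of being learned from scratch.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PlantedAttentionDecoder(nn.Module):
    def __init__(self, hidden_dim: int, num_labels: int, l2r_scorer: nn.Module):
        super().__init__()
        # Assumed interface: l2r_scorer maps (batch, seq_len, hidden_dim)
        # token states to (batch, seq_len, num_labels) relevance scores.
        self.l2r_scorer = l2r_scorer
        for p in self.l2r_scorer.parameters():
            p.requires_grad = False  # keep the pretrained L2R model frozen
        # Per-label linear head over the attended representations.
        self.label_weights = nn.Parameter(torch.empty(num_labels, hidden_dim))
        self.label_bias = nn.Parameter(torch.zeros(num_labels))
        nn.init.xavier_uniform_(self.label_weights)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden_dim) from the text encoder.
        scores = self.l2r_scorer(token_states)        # (batch, seq_len, num_labels)
        attn = F.softmax(scores, dim=1)               # attention over tokens, per label
        # Label-specific context vectors: (batch, num_labels, hidden_dim).
        context = torch.einsum("bsl,bsh->blh", attn, token_states)
        # Per-label logits from a label-wise dot product plus bias.
        logits = (context * self.label_weights).sum(-1) + self.label_bias
        return logits
```

A frozen stand-in for the L2R scorer, e.g. `nn.Linear(hidden_dim, num_labels)`, is enough to exercise this sketch; in the paper's setting that role would be played by the pretrained L2R model's activations.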

Download paper: https://arxiv.org/pdf/2410.23066

Code