Enhanced Adverse Drug Event Extraction Using Prefix-Based Multi-Prompt Tuning in Transformer Models
DOI: http://dx.doi.org/10.62527/joiv.8.3-2.3454
Abstract
Extracting mentions of adverse drug events and the relationships between them is crucial for effective pharmacovigilance and drug safety surveillance. Transformer-based models have recently improved this task substantially through fine-tuning. However, traditional fine-tuning of transformer models, especially those with many parameters, is resource-intensive and memory-inefficient, and it often leaves a gap between the pre-training objective and the downstream task-specific objective. Soft prompting is a lightweight alternative that updates only a trainable prompt to guide task-specific adaptation, and it has shown performance comparable to traditional fine-tuning of large language models on simple tasks. However, its effectiveness on complex tasks such as token-level sequence labeling, which requires multiple predictions for a single input sequence, remains underexplored, particularly in multi-task settings. In addition, a single holistic prompt shared across subtasks in multi-task learning can bias the model toward some subtasks at the expense of others, and individual prompt tokens can even degrade model predictions. This study proposes a prefix-based multi-prompt soft tuning method with attention-driven prompt token selection for tuning transformer models on multi-task dual sequence labeling for concept and relation extraction. We experimented with BERT and SciBERT models under both frozen and unfrozen parameter strategies. Our approach achieved state-of-the-art performance on the n2c2 2018 and TAC 2017 datasets for adverse drug event extraction, with multi-prompt tuning of unfrozen models surpassing traditional fine-tuning. It also outperforms GatorTron, the largest clinical natural language processing model, on the n2c2 2018 dataset. These results highlight the potential of soft prompts for efficiently adapting large language models to complex downstream NLP tasks.
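The two ideas named in the abstract, per-subtask soft prompts prepended as a prefix to frozen input embeddings, and attention-driven pruning of unhelpful prompt tokens, can be sketched as follows. This is a minimal illustration under assumed dimensions, not the authors' implementation; the helper names (`prepend_prompt`, `select_prompt_tokens`) and the mean dot-product scoring used for selection are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, seq_len = 16, 10
prompt_len = 4  # soft-prompt tokens per subtask

# Frozen token embeddings for one input sentence
# (stand-in for the output of BERT's embedding layer).
token_emb = rng.normal(size=(seq_len, d_model))

# One trainable prompt matrix per subtask: in soft prompt tuning these are
# the only parameters updated; the transformer weights stay frozen.
prompts = {
    "concept":  rng.normal(scale=0.02, size=(prompt_len, d_model)),
    "relation": rng.normal(scale=0.02, size=(prompt_len, d_model)),
}

def prepend_prompt(prompt, emb):
    """Prefix the trainable prompt vectors to the input embeddings."""
    return np.concatenate([prompt, emb], axis=0)

def select_prompt_tokens(prompt, emb, k):
    """Attention-driven selection (simplified): keep the k prompt tokens that
    receive the highest average dot-product attention from the input tokens."""
    scores = emb @ prompt.T        # (seq_len, prompt_len) attention scores
    avg = scores.mean(axis=0)      # mean attention received per prompt token
    keep = np.argsort(avg)[-k:]    # indices of the top-k prompt tokens
    return prompt[np.sort(keep)]   # preserve original prompt-token order

for task, prompt in prompts.items():
    pruned = select_prompt_tokens(prompt, token_emb, k=2)
    x = prepend_prompt(pruned, token_emb)
    # The encoder branch for this subtask now sees k + seq_len positions.
    assert x.shape == (2 + seq_len, d_model)
```

Keeping a separate prompt per subtask (rather than one holistic prompt) is what lets each extraction task steer the shared encoder independently, which is the motivation the abstract gives for the multi-prompt design.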