Classification of health product defect reports by deep learning – Scientific Reports

References

  1. Nagaich, U. & Sadhna, D. Drug recall: An incubus for pharmaceutical companies and most serious drug recall of history. Int. J. Pharm. Investig. 5, 13–19 (2015).

  2. US Food & Drug Administration. Annual Report. (2022). https://www.fda.gov/media/166289/download

  3. Lindström-Gommers, L. & Mullin, T. International Conference on Harmonization: Recent reforms as a driver of global regulatory harmonization and innovation in medical products. Clin. Pharmacol. Ther. 105, 926–931 (2019).

  4. Ang, P. S. et al. A risk classification model for prioritising the management of quality issues relating to substandard medicines in Singapore. Pharmacoepidemiol. Drug Saf. 31, 729–738 (2022).

  5. Vasey, B. et al. Reporting guideline for the early-stage clinical evaluation of decision support systems driven by artificial intelligence: DECIDE-AI. Nat. Med. 28, 924–933 (2022).

  6. Vaswani, A. et al. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 30, 5999–6009 (2017).

  7. Vig, J. A Multiscale Visualization of Attention in the Transformer Model. ACL 2019 – 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of System Demonstrations 37–42. arXiv preprint arXiv:1906.05714 (2019).

  8. Rogers, A., Kovaleva, O. & Rumshisky, A. A primer in BERTology: What we know about how BERT works. Trans. Assoc. Comput. Linguist. 8, 842–866 (2020).

  9. Clark, K., Khandelwal, U., Levy, O. & Manning, C. D. What Does BERT Look At? An Analysis of BERT’s Attention. arXiv preprint arXiv:1906.04341 (2019).

  10. He, P., Liu, X., Gao, J. & Chen, W. DeBERTa: Decoding-enhanced BERT with Disentangled Attention. arXiv preprint arXiv:2006.03654 (2020).

  11. Suzgun, M. et al. Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them. arXiv preprint arXiv:2210.09261 (2022).

  12. Raffel, C. et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res. 21, 5485–5551 (2020).

  13. Radford, A. et al. Language Models are Unsupervised Multitask Learners. OpenAI Blog 1, 9 (2019).

  14. Yang, Z. et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding. Adv. Neural Inf. Process. Syst. https://doi.org/10.48550/arXiv.1906.08237 (2019).

  15. Liu, Y. et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692 (2019).

  16. Brown, T. B. et al. Language Models are Few-Shot Learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).

  17. Clark, K., Luong, M. T., Le, Q. V. & Manning, C. D. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. arXiv preprint arXiv:2003.10555 (2020).

  18. Bahdanau, D., Cho, K. H. & Bengio, Y. Neural Machine Translation by Jointly Learning to Align and Translate. 3rd International Conference on Learning Representations, ICLR – Conference Track Proceedings. arXiv preprint arXiv:1409.0473 (2015).

  19. Bommasani, R. et al. On the Opportunities and Risks of Foundation Models. arXiv preprint arXiv:2108.07258 (2021).

  20. Ghaseminejad Raeini, M. The evolution of language models: From N-Grams to LLMs, and beyond. Nat. Lang. Process. J. 12, 100168 (2025).

  21. Hu, Y. et al. PheCatcher: Leveraging LLM-Generated Synthetic Data for Automated Phenotype Definition Extraction from Biomedical Literature. Stud. Health Technol. Inf. 329, 718–722 (2025).

  22. Li, Y., Li, J., He, J. & Tao, C. AE-GPT: Using large language models to extract adverse events from surveillance reports – A use case with influenza vaccine adverse events. PLoS ONE https://doi.org/10.1371/journal.pone.0300919 (2024).

  23. Devlin, J. et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL HLT 2019–2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies – Proceedings of the Conference 1, 4171–4186. arXiv preprint arXiv:1810.04805 (2019).

  24. Sun, C. et al. Biomedical named entity recognition using BERT in the machine reading comprehension framework. J. Biomed. Inform. 118, 103799 (2021).

  25. Gu, Y. et al. Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing. ACM Trans. Comput. Healthc. (HEALTH) 3 (1), 1–23 (2021).

  26. Tan, F. et al. Multigrained Representation Analysis and Ensemble Learning for Text Moderation. IEEE Trans. Neural Netw. Learn. Syst. 34, 7014–7023 (2022).

  27. Senn, S., Tlachac, M. L., Flores, R. & Rundensteiner, E. Ensembles of BERT for depression classification. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2022, 4691–4694 (2022).

  28. Widad, A., El Habib, B. L. & Ayoub, E. F. BERT for Question Answering applied on Covid-19. Procedia Comput. Sci. 198, 379–384 (2022).

  29. Xu, C., Yuan, F. & Chen, S. BJBN: BERT-JOIN-BiLSTM networks for medical auxiliary diagnostic. J. Healthc. Eng. https://doi.org/10.1155/2022/3496810 (2022).

  30. Ji, Z., Wei, Q. & Xu, H. BERT-based ranking for biomedical entity normalization. AMIA Jt. Summits Transl. Sci. Proc. 2020, 269–277 (2020).

  31. Jiang, L. et al. iUP-BERT: Identification of umami peptides based on BERT features. Foods 11, 3742 (2022).

  32. Aldahdooh, J., Vähä-Koskela, M., Tang, J. & Tanoli, Z. Using BERT to identify drug-target interactions from whole PubMed. BMC Bioinform. 23, 245 (2022).

  33. Tejani, A. S. et al. Performance of Multiple Pretrained BERT Models to Automate and Accelerate Data Annotation for Large Datasets. Radiol. Artif. Intell. 4, e220007 (2022).

  34. Kuo, C. C., Chen, K. Y. & Luo, S. B. Audio-Aware Spoken Multiple-Choice Question Answering with Pre-Trained Language Models. IEEE/ACM Trans. Audio Speech Lang. Process. 29, 3170–3179 (2021).

  35. Wang, Z. Y. et al. Pre-trained models based receiver design with natural redundancy for Chinese characters. IEEE Commun. Lett. 26, 2350–2354 (2022).

  36. Kowsher, M. et al. Bangla-BERT: Transformer-based efficient model for transfer learning and language understanding. IEEE Access 10, 91855–91870 (2022).

  37. Zhu, X., Wu, H. & Zhang, L. Automatic short-answer grading via BERT-based deep neural networks. IEEE Trans. Learn. Technol. 15, 364–375 (2022).

  38. Liu, N., Hu, Q., Xu, H., Xu, X. & Chen, M. Med-BERT: A pretraining framework for medical records named entity recognition. IEEE Trans. Ind. Inform. 18, 5600–5608 (2022).

  39. Zhou, C. Comparative evaluation of GPT, BERT, and XLNet: Insights into their performance and applicability in NLP tasks. Trans. Comput. Sci. Intell. Syst. Res. 7, 415–421 (2024).

  40. Gardazi, N. M. et al. BERT applications in natural language processing: a review. Artif. Intell. Rev. 58, 166 (2025).

  41. Zhong, R., Ghosh, D., Klein, D. & Steinhardt, J. Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level. Findings of the Association for Computational Linguistics: ACL-IJCNLP 3813–3827 (2021).

  42. Vinyals, O. et al. Matching Networks for One Shot Learning. Adv. Neural Inf. Process. Syst. 29, 3637–3645 (2016).

  43. Baevski, A. et al. Cloze-driven Pretraining of Self-attention Networks. EMNLP-IJCNLP 2019–2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference 5360–5369. arXiv preprint arXiv:1903.07785 (2019).

  44. Schick, T. & Schütze, H. Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference. EACL 2021 – 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference 255–269. arXiv preprint arXiv:2001.07676 (2021).

  45. Schick, T. & Schütze, H. It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners. NAACL-HLT 2021–2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference 2339–2352. arXiv preprint arXiv:2009.07118 (2020).

  46. Gao, T., Fisch, A. & Chen, D. Making Pre-trained Language Models Better Few-shot Learners. ACL-IJCNLP 2021–59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference 3816–3830. arXiv preprint arXiv:2012.15723 (2020).

  47. Shin, T. et al. Eliciting Knowledge from Language Models with Automatically Generated Prompts. EMNLP 2020 – Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference 4222–4235. arXiv preprint arXiv:2010.15980 (2020).

  48. Lester, B., Al-Rfou, R. & Constant, N. The Power of Scale for Parameter-Efficient Prompt Tuning. EMNLP 2021 – Conference on Empirical Methods in Natural Language Processing, Proceedings 3045–3059. arXiv preprint arXiv:2104.08691 (2021).

  49. Liu, X. et al. GPT Understands, Too. AI Open https://doi.org/10.1016/j.aiopen.2023.08.012 (2023).

  50. Li, X. L. & Liang, P. Prefix-Tuning: Optimizing Continuous Prompts for Generation. ACL-IJCNLP 2021 – 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference 4582–4597. arXiv preprint arXiv:2101.00190 (2021).

  51. Qin, G. & Eisner, J. Learning How to Ask: Querying LMs with Mixtures of Soft Prompts. NAACL-HLT 2021–2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference 5203–5212. arXiv preprint arXiv:2104.06599 (2021).

  52. Liu, X. et al. P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks. arXiv preprint arXiv:2110.07602 (2021).

  53. Khandelwal, U., He, H., Qi, P. & Jurafsky, D. Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context. ACL 2018 – 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) 1, 284–294. arXiv preprint arXiv:1805.04623 (2018).

  54. Zorzi, M., Combi, C., Lora, R., Pagliarini, M. & Moretti, U. Automagically encoding Adverse Drug Reactions in MedDRA. International Conference on Healthcare Informatics, IEEE, 90–99 (2015).

  55. Tiftikci, M., Özgür, A., He, Y. & Hur, J. Machine learning-based identification and rule-based normalization of adverse drug reactions in drug labels. BMC Bioinform. 20, 1–9 (2019).

  56. Létinier, L. et al. Artificial Intelligence for Unstructured Healthcare Data: Application to Coding of Patient Reporting of Adverse Drug Reactions. Clin. Pharmacol. Ther. 110, 392–400 (2021).

  57. McInnes, L., Healy, J. & Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv preprint arXiv:1802.03426 (2018).

  58. Lundberg, S. M. & Lee, S. I. A Unified Approach to Interpreting Model Predictions. Adv. Neural Inf. Process. Syst. https://doi.org/10.48550/arXiv.1705.07874 (2017).

  59. Peryea, T. et al. Global Substance Registration System: consistent scientific descriptions for substances related to health. Nucleic Acids Res. 49, D1179–D1185 (2021).

  60. Li, Y. et al. Artificial intelligence-powered pharmacovigilance: A review of machine and deep learning in clinical text-based adverse drug event detection for benchmark datasets. J. Biomed. Inform. 152, 104621 (2024).

  61. Howard, J. & Ruder, S. Universal Language Model Fine-tuning for Text Classification. ACL 2018 – 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) 1, 328–339. arXiv preprint arXiv:1801.06146 (2018).

  62. He, J. et al. Prompt Tuning in Biomedical Relation Extraction. J. Healthc. Inf. Res. 8, 206–224 (2024).

  63. Chooi, W. H. et al. Vaccine contamination: Causes and control. Vaccine 40, 1699–1701 (2022).

  64. Wu, Y. et al. Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv preprint arXiv:1609.08144 (2016).

  65. Hao, Y., Dong, L., Wei, F. & Xu, K. Visualizing and Understanding the Effectiveness of BERT. EMNLP-IJCNLP 2019–2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference 4143–4152. arXiv preprint arXiv:1908.05620 (2019).

  66. Tan, C. et al. A Survey on Deep Transfer Learning. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. Lecture Notes in Computer Science, vol. 11141. Springer, Cham. https://doi.org/10.1007/978-3-030-01424-7_27 (2018).

  67. Kingma, D. P. & Ba, J. L. Adam: A Method for Stochastic Optimization. 3rd International Conference on Learning Representations, ICLR – Conference Track Proceedings. arXiv preprint arXiv:1412.6980 (2015).

  68. Loshchilov, I. & Hutter, F. Decoupled Weight Decay Regularization. 7th International Conference on Learning Representations, ICLR. arXiv preprint arXiv:1711.05101 (2019).

  69. Beltagy, I., Lo, K. & Cohan, A. SciBERT: A Pretrained Language Model for Scientific Text. EMNLP-IJCNLP 2019 – Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference 3615–3620 (2019).

  70. Alsentzer, E. et al. Publicly Available Clinical BERT Embeddings. Proceedings of the 2nd Clinical Natural Language Processing Workshop, 72–78 (2019).

  71. Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3, 1–9 (2016).

  72. Lee, J. et al. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36, 1234–1240 (2019).

  73. Shrikumar, A., Greenside, P. & Kundaje, A. Learning important features through propagating activation differences. 34th International Conference on Machine Learning ICML 2017 70, 3145–3153 (2017).
