Classification of health product defect reports by deep learning – Scientific Reports

References

  1. Nagaich, U. & Sadhna, D. Drug recall: An incubus for pharmaceutical companies and most serious drug recall of history. Int. J. Pharm. Investig. 5, 13–19 (2015).

  2. US Food & Drug Administration. Annual Report. (2022). https://www.fda.gov/media/166289/download

  3. Lindström-Gommers, L. & Mullin, T. International Conference on Harmonization: Recent reforms as a driver of global regulatory harmonization and innovation in medical products. Clin. Pharmacol. Ther. 105, 926–931 (2019).

  4. Ang, P. S. et al. A risk classification model for prioritising the management of quality issues relating to substandard medicines in Singapore. Pharmacoepidemiol. Drug Saf. 31, 729–738 (2022).

  5. Vasey, B. et al. Reporting guideline for the early-stage clinical evaluation of decision support systems driven by artificial intelligence: DECIDE-AI. Nat. Med. 28, 924–933 (2022).

  6. Vaswani, A. et al. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 30, 5999–6009 (2017).

  7. Vig, J. A Multiscale Visualization of Attention in the Transformer Model. ACL 2019 – 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of System Demonstrations 37–42. arXiv preprint arXiv:1906.05714 (2019).

  8. Rogers, A., Kovaleva, O. & Rumshisky, A. A primer in BERTology: What we know about how BERT works. Trans. Assoc. Comput. Linguist. 8, 842–866 (2020).

  9. Clark, K., Khandelwal, U., Levy, O. & Manning, C. D. What Does BERT Look At? An Analysis of BERT’s Attention. arXiv preprint arXiv:1906.04341 (2019).

  10. He, P., Liu, X., Gao, J. & Chen, W. DeBERTa: Decoding-enhanced BERT with Disentangled Attention. arXiv preprint arXiv:2006.03654 (2020).

  11. Suzgun, M. et al. Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them. arXiv preprint arXiv:2210.09261 (2022).

  12. Raffel, C. et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res. 21, 5485–5551 (2020).

  13. Radford, A. et al. Language Models are Unsupervised Multitask Learners. OpenAI Blog 1, 9 (2019).

  14. Yang, Z. et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding. Adv. Neural Inf. Process. Syst. https://doi.org/10.48550/arXiv.1906.08237 (2019).

  15. Liu, Y. et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692 (2019).

  16. Brown, T. B. et al. Language Models are Few-Shot Learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).

  17. Clark, K., Luong, M. T., Le, Q. V. & Manning, C. D. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. arXiv preprint arXiv:2003.10555 (2020).

  18. Bahdanau, D., Cho, K. H. & Bengio, Y. Neural Machine Translation by Jointly Learning to Align and Translate. 3rd International Conference on Learning Representations, ICLR – Conference Track Proceedings. arXiv preprint arXiv:1409.0473 (2015).

  19. Bommasani, R. et al. On the Opportunities and Risks of Foundation Models. arXiv preprint arXiv:2108.07258 (2021).

  20. Ghaseminejad Raeini, M. The evolution of language models: From N-Grams to LLMs, and beyond. Nat. Lang. Process. J. 12, 100168 (2025).

  21. Hu, Y. et al. PheCatcher: Leveraging LLM-Generated Synthetic Data for Automated Phenotype Definition Extraction from Biomedical Literature. Stud. Health Technol. Inf. 329, 718–722 (2025).

  22. Li, Y., Li, J., He, J. & Tao, C. AE-GPT: Using large language models to extract adverse events from surveillance reports – A use case with influenza vaccine adverse events. PLoS ONE https://doi.org/10.1371/journal.pone.0300919 (2024).

  23. Devlin, J. et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL HLT 2019–2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies – Proceedings of the Conference 1, 4171–4186. arXiv preprint arXiv:1810.04805 (2019).

  24. Sun, C. et al. Biomedical named entity recognition using BERT in the machine reading comprehension framework. J. Biomed. Inform. 118, 103799 (2021).

  25. Gu, Y. et al. Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing. ACM Trans. Comput. Healthc. (HEALTH) 3 (1), 1–23 (2021).

  26. Tan, F. et al. Multigrained Representation Analysis and Ensemble Learning for Text Moderation. IEEE Trans. Neural Netw. Learn. Syst. 34, 7014–7023 (2022).

  27. Senn, S., Tlachac, M. L., Flores, R. & Rundensteiner, E. Ensembles of BERT for depression classification. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2022, 4691–4694 (2022).

  28. Widad, A., El Habib, B. L. & Ayoub, E. F. BERT for Question Answering applied on Covid-19. Procedia Comput. Sci. 198, 379–384 (2022).

  29. Xu, C., Yuan, F. & Chen, S. BJBN: BERT-JOIN-BiLSTM networks for medical auxiliary diagnostic. J. Healthc. Eng. https://doi.org/10.1155/2022/3496810 (2022).

  30. Ji, Z., Wei, Q. & Xu, H. BERT-based ranking for biomedical entity normalization. AMIA Jt. Summits Transl. Sci. Proc. 2020, 269–277 (2020).

  31. Jiang, L. et al. iUP-BERT: Identification of umami peptides based on BERT features. Foods 11, 3742 (2022).

  32. Aldahdooh, J., Vähä-Koskela, M., Tang, J. & Tanoli, Z. Using BERT to identify drug-target interactions from whole PubMed. BMC Bioinform. 23, 245 (2022).

  33. Tejani, A. S. et al. Performance of Multiple Pretrained BERT Models to Automate and Accelerate Data Annotation for Large Datasets. Radiol. Artif. Intell. 4, e220007 (2022).

  34. Kuo, C. C., Chen, K. Y. & Luo, S. B. Audio-Aware Spoken Multiple-Choice Question Answering with Pre-Trained Language Models. IEEE/ACM Trans. Audio Speech Lang. Process. 29, 3170–3179 (2021).

  35. Wang, Z. Y. et al. Pre-trained models based receiver design with natural redundancy for Chinese characters. IEEE Commun. Lett. 26, 2350–2354 (2022).

  36. Kowsher, M. et al. Bangla-BERT: Transformer-based efficient model for transfer learning and language understanding. IEEE Access 10, 91855–91870 (2022).

  37. Zhu, X., Wu, H. & Zhang, L. Automatic short-answer grading via BERT-based deep neural networks. IEEE Trans. Learn. Technol. 15, 364–375 (2022).

  38. Liu, N., Hu, Q., Xu, H., Xu, X. & Chen, M. Med-BERT: A pretraining framework for medical records named entity recognition. IEEE Trans. Ind. Inform. 18, 5600–5608 (2022).

  39. Zhou, C. Comparative evaluation of GPT, BERT, and XLNet: Insights into their performance and applicability in NLP tasks. Trans. Comput. Sci. Intell. Syst. Res. 7, 415–421 (2024).

  40. Gardazi, N. M. et al. BERT applications in natural language processing: a review. Artif. Intell. Rev. 58, 166 (2025).

  41. Zhong, R., Ghosh, D., Klein, D. & Steinhardt, J. Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level. Findings of the Association for Computational Linguistics: ACL-IJCNLP 3813–3827 (2021).

  42. Vinyals, O. et al. Matching Networks for One Shot Learning. Adv. Neural Inf. Process. Syst. 29, 3637–3645 (2016).

  43. Baevski, A. et al. Cloze-driven Pretraining of Self-attention Networks. EMNLP-IJCNLP 2019–2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference 5360–5369. arXiv preprint arXiv:1903.07785 (2019).

  44. Schick, T. & Schütze, H. Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference. EACL 2021 – 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference 255–269. arXiv preprint arXiv:2001.07676 (2021).

  45. Schick, T. & Schütze, H. It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners. NAACL-HLT 2021–2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference 2339–2352. arXiv preprint arXiv:2009.07118 (2020).

  46. Gao, T., Fisch, A. & Chen, D. Making Pre-trained Language Models Better Few-shot Learners. ACL-IJCNLP 2021–59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference 3816–3830. arXiv preprint arXiv:2012.15723 (2020).

  47. Shin, T. et al. Eliciting Knowledge from Language Models with Automatically Generated Prompts. EMNLP 2020 – Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference 4222–4235. arXiv preprint arXiv:2010.15980 (2020).

  48. Lester, B., Al-Rfou, R. & Constant, N. The Power of Scale for Parameter-Efficient Prompt Tuning. EMNLP 2021 – Conference on Empirical Methods in Natural Language Processing, Proceedings 3045–3059. arXiv preprint arXiv:2104.08691 (2021).

  49. Liu, X. et al. GPT Understands, Too. AI Open https://doi.org/10.1016/j.aiopen.2023.08.012 (2023).

  50. Li, X. L. & Liang, P. Prefix-Tuning: Optimizing Continuous Prompts for Generation. ACL-IJCNLP 2021 – 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference 4582–4597. arXiv preprint arXiv:2101.00190 (2021).

  51. Qin, G. & Eisner, J. Learning How to Ask: Querying LMs with Mixtures of Soft Prompts. NAACL-HLT 2021–2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference 5203–5212. arXiv preprint arXiv:2104.06599 (2021).

  52. Liu, X. et al. P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks. arXiv preprint arXiv:2110.07602 (2021).

  53. Khandelwal, U., He, H., Qi, P. & Jurafsky, D. Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context. ACL 2018 – 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) 1, 284–294. arXiv preprint arXiv:1805.04623 (2018).

  54. Zorzi, M., Combi, C., Lora, R., Pagliarini, M. & Moretti, U. Automagically encoding Adverse Drug Reactions in MedDRA. International Conference on Healthcare Informatics, IEEE, 90–99 (2015).

  55. Tiftikci, M., Özgür, A., He, Y. & Hur, J. Machine learning-based identification and rule-based normalization of adverse drug reactions in drug labels. BMC Bioinform. 20, 1–9 (2019).

  56. Létinier, L. et al. Artificial Intelligence for Unstructured Healthcare Data: Application to Coding of Patient Reporting of Adverse Drug Reactions. Clin. Pharmacol. Ther. 110, 392–400 (2021).

  57. McInnes, L., Healy, J. & Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv preprint arXiv:1802.03426 (2018).

  58. Lundberg, S. M. & Lee, S. I. A Unified Approach to Interpreting Model Predictions. Adv. Neural Inf. Process. Syst. https://doi.org/10.48550/arXiv.1705.07874 (2017).

  59. Peryea, T. et al. Global Substance Registration System: consistent scientific descriptions for substances related to health. Nucleic Acids Res. 49, D1179–D1185 (2021).

  60. Li, Y. et al. Artificial intelligence-powered pharmacovigilance: A review of machine and deep learning in clinical text-based adverse drug event detection for benchmark datasets. J. Biomed. Inform. 152, 104621 (2024).

  61. Howard, J. & Ruder, S. Universal Language Model Fine-tuning for Text Classification. ACL 2018 – 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) 1, 328–339. arXiv preprint arXiv:1801.06146 (2018).

  62. He, J. et al. Prompt Tuning in Biomedical Relation Extraction. J. Healthc. Inf. Res. 8, 206–224 (2024).

  63. Chooi, W. H. et al. Vaccine contamination: Causes and control. Vaccine 40, 1699–1701 (2022).

  64. Wu, Y. et al. Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv preprint arXiv:1609.08144 (2016).

  65. Hao, Y., Dong, L., Wei, F. & Xu, K. Visualizing and Understanding the Effectiveness of BERT. EMNLP-IJCNLP 2019–2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference 4143–4152. arXiv preprint arXiv:1908.05620 (2019).

  66. Tan, C. et al. A Survey on Deep Transfer Learning. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. Lecture Notes in Computer Science, vol. 11141. Springer, Cham. https://doi.org/10.1007/978-3-030-01424-7_27 (2018).

  67. Kingma, D. P. & Ba, J. L. Adam: A Method for Stochastic Optimization. 3rd International Conference on Learning Representations, ICLR – Conference Track Proceedings. arXiv preprint arXiv:1412.6980 (2015).

  68. Loshchilov, I. & Hutter, F. Decoupled Weight Decay Regularization. 7th International Conference on Learning Representations, ICLR. arXiv preprint arXiv:1711.05101 (2019).

  69. Beltagy, I., Lo, K. & Cohan, A. SciBERT: A Pretrained Language Model for Scientific Text. EMNLP-IJCNLP 2019 – Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference 3615–3620 (2019).

  70. Alsentzer, E. et al. Publicly Available Clinical BERT Embeddings. Proceedings of the 2nd Clinical Natural Language Processing Workshop, 72–78 (2019).

  71. Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3, 1–9 (2016).

  72. Lee, J. et al. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36, 1234–1240 (2019).

  73. Shrikumar, A., Greenside, P. & Kundaje, A. Learning important features through propagating activation differences. 34th International Conference on Machine Learning ICML 2017 70, 3145–3153 (2017).
