{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,28]],"date-time":"2026-06-28T11:55:14Z","timestamp":1782647714468,"version":"3.54.5"},"reference-count":38,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,2,7]],"date-time":"2024-02-07T00:00:00Z","timestamp":1707264000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,7]],"date-time":"2024-02-07T00:00:00Z","timestamp":1707264000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Deep learning has demonstrated significant advancements across various domains. However, its implementation in specialized areas, such as medical settings, remains approached with caution. In these high-stake environments, understanding the model's decision-making process is critical. This study assesses the performance of different pretrained Bidirectional Encoder Representations from Transformers (BERT) models and delves into understanding its decision-making within the context of medical image protocol assignment.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Methods<\/jats:title>\n                <jats:p>Four different pre-trained BERT models (BERT, BioBERT, ClinicalBERT, RoBERTa) were fine-tuned for the medical image protocol classification task. Word importance was measured by attributing the classification output to every word using a gradient-based method. Subsequently, a trained radiologist reviewed the resulting word importance scores to assess the model\u2019s decision-making process relative to human reasoning.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>The BERT model came close to human performance on our test set. The BERT model successfully identified relevant words indicative of the target protocol. Analysis of important words in misclassifications revealed potential systematic errors in the model.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>The BERT model shows promise in medical image protocol assignment by reaching near human level performance and identifying key words effectively. The detection of systematic errors paves the way for further refinements to enhance its safety and utility in clinical settings.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12911-024-02444-z","type":"journal-article","created":{"date-parts":[[2024,2,7]],"date-time":"2024-02-07T10:02:54Z","timestamp":1707300174000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["Exploring the performance and explainability of fine-tuned BERT models for neuroradiology protocol assignment"],"prefix":"10.1186","volume":"24","author":[{"given":"Salmonn","family":"Talebi","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Elizabeth","family":"Tong","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Anna","family":"Li","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ghiam","family":"Yamin","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Greg","family":"Zaharchuk","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Mohammad R. K.","family":"Mofrad","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2024,2,7]]},"reference":[{"key":"2444_CR1","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1146\/annurev-bioeng-071516-044442","volume":"19","author":"D Shen","year":"2017","unstructured":"Shen D, Wu G, Suk H-I. Deep learning in medical image analysis. Annual review of biomedical engineering. 2017;19:221.","journal-title":"Annual review of biomedical engineering"},{"issue":"6","key":"2444_CR2","doi-asserted-by":"publisher","first-page":"1236","DOI":"10.1093\/bib\/bbx044","volume":"19","author":"R Miotto","year":"2018","unstructured":"Miotto R, Wang F, Wang S, Jiang X, Dudley JT. Deep learning for healthcare: review, opportunities and challenges. Briefings in bioinformatics. 2018;19(6):1236\u201346.","journal-title":"Briefings in bioinformatics"},{"issue":"1","key":"2444_CR3","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41746-018-0065-x","volume":"1","author":"A Madani","year":"2018","unstructured":"Madani A, Ong JR, Tibrewal A, Mofrad MR. Deep echocardiography: data-efficient supervised and semi- supervised deep learning towards automated diagnosis of cardiac disease. NPJ digital medicine. 2018;1(1):1\u201311.","journal-title":"NPJ digital medicine"},{"key":"2444_CR4","doi-asserted-by":"publisher","first-page":"104956","DOI":"10.1016\/j.ijmedinf.2022.104956","volume":"170","author":"Kim Yoojoong","year":"2023","unstructured":"Yoojoong Kim, et al. \u201cPredicting medical specialty from text based on a domain-specific pre-trained BERT.\u201d Int J Med Inform. 2023;170:104956.","journal-title":"Int J Med Inform"},{"key":"2444_CR5","doi-asserted-by":"publisher","DOI":"10.1016\/j.imu.2022.101139","volume":"36","author":"Alexander Turchin","year":"2023","unstructured":"Turchin Alexander, Masharsky Stanislav, Zitnik Marinka. Comparison of BERT implementations for natural language processing of narrative medical documents. Informatics in Medicine Unlocked. 2023;36: 101139.","journal-title":"Informatics in Medicine Unlocked"},{"key":"2444_CR6","unstructured":"Wang A, Pruksachatkun Y, Nangia N, et al. SuperGLUE: A stickier benchmark for general-purpose language understanding systems. In: Proceedings of the Advances in Neural Information Processing Systems. Vancouver; 2019. p. 3261\u20133275."},{"key":"2444_CR7","unstructured":"Pandey B, Kumar Pandey D, Pratap Mishra B, Rhmann W. A comprehensive survey of deep learning in the field of medical imaging and medical natural language processing: Challenges and research directions. J King Saud Univ Comput Inf Sci. 2021:1\u201317."},{"key":"2444_CR8","unstructured":"F. Doshi-Velez and B. Kim, \u201cTowards a rigorous science of interpretable machine learning,\u201d arXiv preprint arXiv:1702.08608, 2017."},{"key":"2444_CR9","doi-asserted-by":"crossref","unstructured":"Albahri AS, Duhaim AM, Fadhel MA, Alnoor A, Baqer NS, Alzubaidi L, Albahri OS Alamoodi AH, Bai J, Salhi A, et al. A systematic review of trustworthy and explainable artificial Intelligence in healthcare: assessment of quality, bias risk, and data fusion. Inf Fusion. 2023;96:156\u201391.","DOI":"10.1016\/j.inffus.2023.03.008"},{"key":"2444_CR10","unstructured":"(2019) Explainable ai: the basics policy brief. [Online]. Available: https:\/\/royalsociety.org\/-\/media\/policy\/projects\/explainable-ai\/ 985 AI-and-interpretability-policy-briefing.pdf"},{"key":"2444_CR11","unstructured":"G. Cina`, T. Ro\u00a8ber, R. Goedhart, and I. Birbil, \u201cWhy we do need explainable ai for healthcare,\u201d arXiv preprint arXiv:2206.15363, 2022."},{"issue":"7","key":"2444_CR12","doi-asserted-by":"crossref","first-page":"e14","DOI":"10.1002\/jmri.26211","volume":"49","author":"EJ van Beek","year":"2019","unstructured":"van Beek EJ, Kuhl C, Anzai Y, Desmond P, Ehman RL, Gong Q, Gold G, Gulani V, Hall-Craggs M, Leiner T, et al. Value of mri in medicine: more than just another test? Journal of Magnetic Resonance Imaging. 2019;49(7):e14\u201325.","journal-title":"Journal of Magnetic Resonance Imaging"},{"issue":"1","key":"2444_CR13","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1016\/j.jacr.2010.07.009","volume":"8","author":"CC Blackmore","year":"2011","unstructured":"Blackmore CC, Mecklenburg RS, Kaplan GS. Effectiveness of clinical decision support in controlling inappropriate imaging. Journal of the American College of Radiology. 2011;8(1):19\u201325.","journal-title":"Journal of the American College of Radiology"},{"issue":"5","key":"2444_CR14","doi-asserted-by":"publisher","first-page":"440","DOI":"10.1016\/j.jacr.2014.01.021","volume":"11","author":"GW Boland","year":"2014","unstructured":"Boland GW, Duszak R, Kalra M. Protocol design and optimization. Journal of the American College of Radiology. 2014;11(5):440\u20131.","journal-title":"Journal of the American College of Radiology"},{"issue":"10","key":"2444_CR15","doi-asserted-by":"publisher","first-page":"1210","DOI":"10.1016\/j.jacr.2016.04.009","volume":"13","author":"A Schemmel","year":"2016","unstructured":"Schemmel A, Lee M, Hanley T, Pooler BD, Kennedy T, Field A, Wiegmann D, John-Paul JY. Radiology workflow disruptors: a detailed analysis. Journal of the American College of Radiology. 2016;13(10):1210\u20134.","journal-title":"Journal of the American College of Radiology"},{"issue":"11","key":"2444_CR16","doi-asserted-by":"publisher","first-page":"981","DOI":"10.1056\/NEJMp1714229","volume":"378","author":"DS Char","year":"2018","unstructured":"Char DS, Shah NH, Magnus D. Implementing machine learning in health care\u2014addressing ethical challenges. The New England journal of medicine. 2018;378(11):981.","journal-title":"The New England journal of medicine"},{"issue":"5","key":"2444_CR17","doi-asserted-by":"publisher","first-page":"568","DOI":"10.1093\/jamia\/ocx125","volume":"25","author":"AD Brown","year":"2018","unstructured":"Brown AD, Marotta TR. Using machine learning for sequence-level automated MRI protocol selection in neuroradiology. Journal of the American Medical Informatics Association. 2018;25(5):568\u201371. https:\/\/doi.org\/10.1093\/jamia\/ocx125.","journal-title":"Journal of the American Medical Informatics Association."},{"issue":"9","key":"2444_CR18","doi-asserted-by":"publisher","first-page":"1149","DOI":"10.1016\/j.jacr.2020.03.012","volume":"17","author":"A Kalra","year":"2020","unstructured":"Kalra A, Chakraborty A, Fine B, Reicher J. Machine Learning for Automation of Radiology Protocols for Quality and Efficiency Improvement. Journal of the American College of Radiology. 2020;17(9):1149\u201358. https:\/\/doi.org\/10.1016\/j.jacr.2020.03.012.","journal-title":"Journal of the American College of Radiology."},{"key":"2444_CR19","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1016\/j.jbi.2018.09.008","volume":"87","author":"Y Wang","year":"2018","unstructured":"Wang Y, Liu S, Afzal N, et al. A comparison of word embeddings for the biomedical natural language processing. Journal of Biomedical Informatics. 2018;87:12\u201320. https:\/\/doi.org\/10.1016\/j.jbi.2018.09.008.","journal-title":"Journal of Biomedical Informatics."},{"key":"2444_CR20","unstructured":"Vaswani A, Shazeer N, Parmar N, et al. Attention is All you Need. In: Guyon I, Luxburg UV, Bengio S, et al., eds. Advances in Neural Information Processing Systems. Vol 30. Curran Associates, Inc.; 2017. https:\/\/proceedings.neurips.cc\/paper\/2017\/file\/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf"},{"key":"2444_CR21","doi-asserted-by":"publisher","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics; 2019:4171-4186. doi: https:\/\/doi.org\/10.18653\/v1\/N19-1423","DOI":"10.18653\/v1\/N19-1423"},{"key":"2444_CR22","doi-asserted-by":"publisher","unstructured":"Peters ME, Neumann M, Iyyer M, et al. Deep Contextualized Word Representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Association for Computational Linguistics; 2018:2227-2237. doi: https:\/\/doi.org\/10.18653\/v1\/N18-1202","DOI":"10.18653\/v1\/N18-1202"},{"issue":"4","key":"2444_CR23","doi-asserted-by":"publisher","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","volume":"36","author":"J Lee","year":"2020","unstructured":"Lee J, Yoon W, Kim S, et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2020;36(4):1234\u201340. https:\/\/doi.org\/10.1093\/bioinformatics\/btz682.","journal-title":"Bioinformatics."},{"key":"2444_CR24","unstructured":"Huang, Kexin, Jaan Altosaar, and Rajesh Ranganath. \"Clinicalbert: Modeling clinical notes and predicting hospital readmission.\"\u00a0arXiv preprint arXiv:1904.05342\u00a0(2019)."},{"key":"2444_CR25","doi-asserted-by":"crossref","unstructured":"T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz et al., \u201cTransformers: State-of-the-art natural language processing,\u201d in Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations, 2020, pp. 38\u201345.","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"2444_CR26","doi-asserted-by":"crossref","unstructured":"Pennington J, Socher R, Manning CD. Glove: Global vectors for word representation. Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014.","DOI":"10.3115\/v1\/D14-1162"},{"key":"2444_CR27","unstructured":"Sundararajan M, Taly A, Yan Q. Axiomatic attribution for deep networks. In Proc. 34th International Conference on Machine Learning. 2017;70:3319\u201328."},{"key":"2444_CR28","unstructured":"N. Kokhlikyan, V. Miglani, M. Martin, E. Wang, B. Alsallakh, J. Reynolds, A. Melnikov, N. Kliushkina, C. Araya, S. Yan et al., \u201cCaptum: A unified and generic model interpretability library for pytorch,\u201d arXiv preprint arXiv:2009.07896, 2020."},{"key":"2444_CR29","doi-asserted-by":"crossref","unstructured":"D. Alvarez-Melis and T. S. Jaakkola, \u201cA causal framework for explaining the predictions of black-box sequence-to-sequence models,\u201d arXiv preprint arXiv:1707.01943, 2017.","DOI":"10.18653\/v1\/D17-1042"},{"key":"2444_CR30","doi-asserted-by":"crossref","unstructured":"Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. \"\" Why should i trust you?\" Explaining the predictions of any classifier.\"\u00a0Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. 2016.","DOI":"10.1145\/2939672.2939778"},{"key":"2444_CR31","unstructured":"Jain SWallace BC. Attention is not explanation. In Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: Human Language Technologies,\u00a0Volume 1 (Long and Short Papers). 2019. p. 3543\u201356."},{"key":"2444_CR32","unstructured":"Achiam, OpenAI Josh et al. \u201cGPT-4 Technical Report.\u201d (2023)."},{"key":"2444_CR33","unstructured":"Bills S, Cammarata N, Mossing D, Tillman H, Gao L, Goh G, Sutskever I, Leike J, Wu J, Saunders W. Language models can explain neurons in language models. 2023. URL https:\/\/openaipublic.blob.core.windows.net\/neuron-explainer\/paper\/index.html. Accessed 14 May 2023."},{"issue":"2","key":"2444_CR34","doi-asserted-by":"publisher","first-page":"160","DOI":"10.1016\/j.acra.2016.09.013","volume":"24","author":"Andrew D Brown","year":"2017","unstructured":"D Brown Andrew, R Marotta Thomas. A natural language processing-based model to automate mri brain protocol selection and prioritization. Acad Radiol. 2017;24(2):160\u20136.","journal-title":"Acad Radiol"},{"key":"2444_CR35","unstructured":"D. Hendrycks, C. Burns, A. Chen, and S. Ball, \u201cCuad: An expert-annotated nlp dataset for legal contract review,\u201d arXiv preprint arXiv:2103.06268, 2021."},{"key":"2444_CR36","doi-asserted-by":"crossref","unstructured":"Lai V, Tan C. On human predictions with explanations and predictions of machine learning models: A case study on deception detection. In Proceedings of the conference on fairness, accountability, and transparency. 2019. pp. 29\u201338.","DOI":"10.1145\/3287560.3287590"},{"key":"2444_CR37","doi-asserted-by":"crossref","unstructured":"Hao Y, Dong L, Wei F, Xu K. Self-attention attribution: Interpreting information interactions inside transformer. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 14. 2021. pp. 12 963\u201312 971.","DOI":"10.1609\/aaai.v35i14.17533"},{"key":"2444_CR38","doi-asserted-by":"crossref","unstructured":"Hayati SA, Kang D, Ungar L. Does bert learn as humans perceive? understanding linguistic styles through lexica. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). 2021. URL https:\/\/arxiv.org\/abs\/2109.02738.","DOI":"10.18653\/v1\/2021.emnlp-main.510"}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-024-02444-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12911-024-02444-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-024-02444-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,7]],"date-time":"2024-02-07T10:05:43Z","timestamp":1707300343000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-024-02444-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,7]]},"references-count":38,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["2444"],"URL":"https:\/\/doi.org\/10.1186\/s12911-024-02444-z","relation":{},"ISSN":["1472-6947"],"issn-type":[{"value":"1472-6947","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,7]]},"assertion":[{"value":"17 August 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 January 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 February 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"This retrospective study (and all experimental protocols) was conducted with the approval of the Stanford Institutional Review Board (IRB) and under a waiver of informed consent. The study was approved for collaboration between Stanford University and the University of California, Berkeley. All methods were carried out in accordance with the relevant guidelines and regulations.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"40"}}