{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,13]],"date-time":"2026-05-13T01:59:28Z","timestamp":1778637568703,"version":"3.51.4"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,5,26]],"date-time":"2021-05-26T00:00:00Z","timestamp":1621987200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,5,26]],"date-time":"2021-05-26T00:00:00Z","timestamp":1621987200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Biomedical question answering (QA) is a sub-task of natural language processing in a specific domain, which aims to answer a question in the biomedical field based on one or more related passages and can provide people with accurate healthcare-related information. Recently, a lot of approaches based on the neural network and large scale pre-trained language model have largely improved its performance. However, considering the lexical characteristics of biomedical corpus and its small scale dataset, there is still much improvement room for biomedical QA tasks.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Inspired by the importance of syntactic and lexical features in the biomedical corpus, we proposed a new framework to extract external features, such as part-of-speech and named-entity recognition, and fused them with the original text representation encoded by pre-trained language model, to enhance the biomedical question answering performance. Our model achieves an overall improvement of all three metrics on BioASQ 6b, 7b, and 8b factoid question answering tasks.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>The experiments on BioASQ question answering dataset demonstrated the effectiveness of our external feature-enriched framework. It is proven by the experiments conducted that external lexical and syntactic features can improve Pre-trained Language Model\u2019s performance in biomedical domain question answering task.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s12859-021-04176-7","type":"journal-article","created":{"date-parts":[[2021,5,26]],"date-time":"2021-05-26T10:02:58Z","timestamp":1622023378000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":23,"title":["External features enriched model for biomedical question answering"],"prefix":"10.1186","volume":"22","author":[{"given":"Gezheng","family":"Xu","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenge","family":"Rong","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yanmeng","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuanxin","family":"Ouyang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhang","family":"Xiong","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,5,26]]},"reference":[{"key":"4176_CR1","doi-asserted-by":"crossref","unstructured":"Zhang Y, Qian S, Fang Q, Xu C. Multi-modal knowledge-aware hierarchical attention network for explainable medical question answering. In: Proceedings of the 27th ACM international conference on multimedia; 2019. p. 1089\u201397.","DOI":"10.1145\/3343031.3351033"},{"key":"4176_CR2","doi-asserted-by":"crossref","unstructured":"Yin J, Jiang X, Lu Z, Shang L, Li H, Li X. Neural generative question answering. In: Proceedings of the 25th international joint conference on artificial intelligence; 2016. p. 2972\u20132978.","DOI":"10.18653\/v1\/W16-0106"},{"key":"4176_CR3","doi-asserted-by":"crossref","unstructured":"Chen D, Fisch A, Weston J, Bordes A. Reading wikipedia to answer open-domain questions. In: Proceedings of the 55th annual meeting of the association for computational linguistics; 2017. p. 1870\u20131879.","DOI":"10.18653\/v1\/P17-1171"},{"key":"4176_CR4","doi-asserted-by":"crossref","unstructured":"Wiese G, Weissenborn D, Neves ML. Neural domain adaptation for biomedical question answering. In: Proceedings of the 21st conference on computational natural language learning; 2017. p. 281\u2013289.","DOI":"10.18653\/v1\/K17-1029"},{"key":"4176_CR5","unstructured":"Devlin J, Chang M, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies; 2019. p. 4171\u20134186."},{"key":"4176_CR6","doi-asserted-by":"crossref","unstructured":"Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L. Deep contextualized word representations. In: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies; 2018. p. 2227\u20132237.","DOI":"10.18653\/v1\/N18-1202"},{"issue":"4","key":"4176_CR7","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","volume":"36","author":"J Lee","year":"2020","unstructured":"Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, Kang J. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2020;36(4):1234\u201340.","journal-title":"Bioinformatics"},{"key":"4176_CR8","doi-asserted-by":"crossref","unstructured":"Lamurias A, Couto FM. Lasigebiotm at MEDIQA 2019: biomedical question answering using bidirectional transformers and named entity recognition. In: Proceedings of the 18th BioNLP workshop and shared task. 2019. p. 523\u2013527.","DOI":"10.18653\/v1\/W19-5057"},{"issue":"1","key":"4176_CR9","doi-asserted-by":"publisher","first-page":"58","DOI":"10.1186\/s13321-018-0312-9","volume":"10","author":"FM Couto","year":"2018","unstructured":"Couto FM, Lamurias A. MER: a shell script and annotation server for minimal named entity recognition and linking. J Cheminform. 2018;10(1):58\u201315810.","journal-title":"J Cheminform"},{"key":"4176_CR10","unstructured":"Tateisi Y, Tsujii J. Part-of-speech annotation of biology research abstracts. In: Proceedings of the 4th international conference on language resources and evaluation. 2004."},{"key":"4176_CR11","doi-asserted-by":"crossref","unstructured":"Yoon W, Lee J, Kim D, Jeong M, Kang J. Pre-trained language model for biomedical question answering. In: Proceedings of 2019 ECML PKDD workshop on machine learning and knowledge discovery in databases; 2019. p. 727\u2013740.","DOI":"10.1007\/978-3-030-43887-6_64"},{"key":"4176_CR12","doi-asserted-by":"crossref","unstructured":"Telukuntla SK, Kapri A, Zadrozny W. UNCC biomedical semantic question answering systems. bioasq: Task-7b, phase-b. In: Proceedings of 2019 ECML PKDD workshop on machine learning and knowledge discovery in databases; 2019. p. 695\u2013710.","DOI":"10.1007\/978-3-030-43887-6_62"},{"key":"4176_CR13","unstructured":"Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov RR, Le QV. XLNet: Generalized autoregressive pretraining for language understanding. In: Proceedings of 2019 annual conference on neural information processing systems; 2019. p. 5753\u20135763."},{"key":"4176_CR14","unstructured":"Jeong M, Sung M, Kim G, Kim D, Yoon W, Yoo J, Kang J. Transferability of natural language inference to biomedical question answering. In: Working notes of CLEF 2020 conference and labs of the evaluation forum. 2020."},{"key":"4176_CR15","doi-asserted-by":"crossref","unstructured":"Qu C, Yang L, Qiu M, Croft WB, Zhang Y, Iyyer M. BERT with history answer embedding for conversational question answering. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval; 2019. p. 1133\u20131136.","DOI":"10.1145\/3331184.3331341"},{"key":"4176_CR16","doi-asserted-by":"crossref","unstructured":"Levine Y, Lenz B, Dagan O, Padnos D, Sharir O, Shalev-Shwartz S, Shashua A, Shoham Y. SenseBERT: driving some sense into BERT. In: Proceedings of the 58th annual meeting of the association for computational linguistics; 2020. p. 4656\u20134667.","DOI":"10.18653\/v1\/2020.acl-main.423"},{"key":"4176_CR17","unstructured":"Wang W, Bi B, Yan M, Wu C, Xia J, Bao Z, Peng L, Si L. StructBERT: incorporating language structures into pre-training for deep language understanding. In: Proceedings of 8th international conference on learning representations; 2020."},{"key":"4176_CR18","doi-asserted-by":"crossref","unstructured":"Wu S, He Y. Enriching pre-trained language model with entity information for relation classification. In: Proceedings of the 28th ACM international conference on information and knowledge management; 2019. p. 2361\u20132364.","DOI":"10.1145\/3357384.3358119"},{"key":"4176_CR19","doi-asserted-by":"crossref","unstructured":"Oita M, Vani K, Oezdemir-Zaech F. Semantically corroborating neural attention for biomedical question answering. In: Proceedings of 2019 ECML PKDD workshop on machine learning and knowledge discovery in databases; 2019. p. 670\u2013685.","DOI":"10.1007\/978-3-030-43887-6_60"},{"key":"4176_CR20","doi-asserted-by":"crossref","unstructured":"Rajpurkar P, Zhang J, Lopyrev K, Liang P. SQuAD: 100, 000+ questions for machine comprehension of text. In: Proceedings of the 2016 conference on empirical methods in natural language processing; 2016. p. 2383\u20132392.","DOI":"10.18653\/v1\/D16-1264"},{"key":"4176_CR21","doi-asserted-by":"crossref","unstructured":"Kamath S, Grau B, Ma Y. How to pre-train your model? Comparison of different pre-training models for biomedical question answering. In: Proceedings of 2019 ECML PKDD workshop on machine learning and knowledge discovery in databases; 2019. p. 646\u2013660.","DOI":"10.1007\/978-3-030-43887-6_58"},{"key":"4176_CR22","unstructured":"Bird S. NLTK: The natural language toolkit. In: Proceedings of 21st international conference on computational linguistics and 44th annual meeting of the association for computational linguistics; 2006. p. 69\u201372."},{"key":"4176_CR23","unstructured":"Srivastava RK, Greff K, Schmidhuber J. Training very deep networks. In: Proceedings of 2015 annual conference on neural information processing systems; 2015. p. 2377\u20132385."},{"key":"4176_CR24","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I. Attention is all you need. In: Proceedings of 2017 annual conference on neural information processing systems; 2017. p. 5998\u20136008."},{"key":"4176_CR25","doi-asserted-by":"publisher","first-page":"138","DOI":"10.1186\/s12859-015-0564-6","volume":"16","author":"G Tsatsaronis","year":"2015","unstructured":"Tsatsaronis G, Balikas G, Malakasiotis P, Partalas I, Zschunke M, Alvers MR, Weissenborn D, Krithara A, Petridis S, Polychronopoulos D, Almirantis Y, Pavlopoulos J, Baskiotis N, Gallinari P, Arti\u00e8res T, Ngomo AN, Heino N, Gaussier \u00c9, Barrio-Alvers L, Schroeder M, Androutsopoulos I, Paliouras G. An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition. BMC Bioinform. 2015;16:138\u2013113828.","journal-title":"BMC Bioinform."},{"key":"4176_CR26","doi-asserted-by":"crossref","unstructured":"Gururangan S, Marasovic A, Swayamdipta S, Lo K, Beltagy I, Downey D, Smith NA. Don\u2019t stop pretraining: Adapt language models to domains and tasks. In: Proceedings of the 58th annual meeting of the association for computational linguistics; 2020. p. 8342\u20138360.","DOI":"10.18653\/v1\/2020.acl-main.740"},{"key":"4176_CR27","doi-asserted-by":"publisher","first-page":"73729","DOI":"10.1109\/ACCESS.2019.2920708","volume":"7","author":"D Kim","year":"2019","unstructured":"Kim D, Lee J, So CH, Jeon H, Jeong M, Choi Y, Yoon W, Sung M, Kang J. A neural named entity recognition and multi-type normalization tool for biomedical text mining. IEEE Access. 2019;7:73729\u201340.","journal-title":"IEEE Access"},{"key":"4176_CR28","doi-asserted-by":"crossref","unstructured":"Beltagy I, Lo K, Cohan A. SciBERT: Pretrained language model for scientific text. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing; 2019. p. 3613\u20133618.","DOI":"10.18653\/v1\/D19-1371"},{"key":"4176_CR29","doi-asserted-by":"crossref","unstructured":"Nentidis A, Bougiatiotis K, Krithara A, Paliouras G. Results of the seventh edition of the BioASQ challenge. In: Proceedings of 2019 ECML PKDD workshop on machine learning and knowledge discovery in databases; 2019. p. 553\u2013568.","DOI":"10.1007\/978-3-030-43887-6_51"},{"key":"4176_CR30","unstructured":"Peng S, You R, Xie Z, Wang B, Zhang Y, Zhu S. The Fudan participation in the 2015 BioASQ challenge: large-scale biomedical semantic indexing and question answering. In: Working Notes of CLEF 2015 conference and labs of the evaluation forum; 2015."},{"key":"4176_CR31","doi-asserted-by":"crossref","unstructured":"Hosein S, Andor D, McDonald R. Measuring domain portability and error propagation in biomedical QA. In: Proceedings of 2019 ECML PKDD workshop on machine learning and knowledge discovery in databases; 2019. p. 686\u2013694.","DOI":"10.1007\/978-3-030-43887-6_61"},{"key":"4176_CR32","unstructured":"Kommaraju V, Gunasekaran K, Li K, Bansal T, McCallum A, Williams I, Istrate A. Unsupervised pre-training for biomedical question answering. In: Working notes of CLEF 2020 conference and labs of the evaluation forum; 2020."},{"key":"4176_CR33","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1162\/tacl_a_00300","volume":"8","author":"M Joshi","year":"2020","unstructured":"Joshi M, Chen D, Liu Y, Weld DS, Zettlemoyer L, Levy O. SpanBERT: improving pre-training by representing and predicting spans. Trans Assoc Comput Linguist. 2020;8:64\u201377.","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"4176_CR34","unstructured":"Nentidis A, Krithara A, Bougiatiotis K, Paliouras G. Overview of BioASQ 8a and 8b: results of the eighth edition of the BioASQ tasks a and b. In: Working notes of CLEF 2020 conference and labs of the evaluation forum; 2020."},{"key":"4176_CR35","first-page":"452","volume":"7","author":"T Kwiatkowski","year":"2019","unstructured":"Kwiatkowski T, Palomaki J, Redfield O, Collins M, Parikh AP, Alberti C, Epstein D, Polosukhin I, Devlin J, Lee K, Toutanova K, Jones L, Kelcey M, Chang M, Dai AM, Uszkoreit J, Le Q, Petrov S. Natural questions: a benchmark for question answering research. Trans Assoc Comput Linguist. 2019;7:452\u201366.","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"4176_CR36","doi-asserted-by":"publisher","first-page":"249","DOI":"10.1162\/tacl_a_00266","volume":"7","author":"S Reddy","year":"2019","unstructured":"Reddy S, Chen D, Manning CD. CoQA: a conversational question answering challenge. Trans Assoc Comput Linguist. 2019;7:249\u201366.","journal-title":"Trans. Assoc. Comput. Linguist."},{"key":"4176_CR37","doi-asserted-by":"crossref","unstructured":"Rajpurkar P, Jia R, Liang P. Know what you don\u2019t know: unanswerable questions for SQuAD. In: Proceedings of the 56th annual meeting of the association for computational linguistics; 2018. p. 784\u2013789.","DOI":"10.18653\/v1\/P18-2124"},{"key":"4176_CR38","doi-asserted-by":"crossref","unstructured":"Williams A, Nangia N, Bowman SR. A broad-coverage challenge corpus for sentence understanding through inference. In: Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies; 2018. p. 1112\u20131122.","DOI":"10.18653\/v1\/N18-1101"},{"key":"4176_CR39","doi-asserted-by":"publisher","first-page":"2320","DOI":"10.1093\/bioinformatics\/bth227","volume":"20","author":"LH Smith","year":"2004","unstructured":"Smith LH, Rindflesch TC, Wilbur WJ. Medpost: a part-of-speech tagger for biomedical text. Bioinformatics. 2004;20:2320\u20131.","journal-title":"Bioinformatics"},{"key":"4176_CR40","doi-asserted-by":"crossref","unstructured":"Neumann M, King D, Beltagy I, Ammar W. Scispacy: Fast and robust models for biomedical natural language processing. In: Proceedings of the 18th BioNLP workshop and shared task, BioNLP@ACL 2019; 2019.","DOI":"10.18653\/v1\/W19-5034"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-021-04176-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-021-04176-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-021-04176-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,4]],"date-time":"2023-11-04T05:10:13Z","timestamp":1699074613000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-021-04176-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,5,26]]},"references-count":40,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["4176"],"URL":"https:\/\/doi.org\/10.1186\/s12859-021-04176-7","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,5,26]]},"assertion":[{"value":"6 November 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 May 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 May 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"272"}}