{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,3]],"date-time":"2026-05-03T23:46:28Z","timestamp":1777851988245,"version":"3.51.4"},"reference-count":41,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2024,10,1]],"date-time":"2024-10-01T00:00:00Z","timestamp":1727740800000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["72074222"],"award-info":[{"award-number":["72074222"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012456","name":"National Social Science Foundation of China","doi-asserted-by":"crossref","award":["21CTQ016"],"award-info":[{"award-number":["21CTQ016"]}],"id":[{"id":"10.13039\/501100012456","id-type":"DOI","asserted-by":"crossref"}]},{"name":"CAMS Innovation Fund for Medical Sciences","award":["2021-I2M-1-056"],"award-info":[{"award-number":["2021-I2M-1-056"]}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Health Informatics J"],"published-print":{"date-parts":[[2024,10]]},"abstract":"<jats:p>Objective: Faced with the challenges of differential diagnosis caused by the complex clinical manifestations and high pathological heterogeneity of pituitary adenomas, this study aims to construct a high-quality annotated corpus to characterize pituitary adenomas in clinical notes containing rich diagnosis and treatment information. Methods: A dataset from a pituitary adenomas neurosurgery treatment center of a tertiary first-class hospital in China was retrospectively collected. A semi-automatic corpus construction framework was designed. A total of 2000 documents containing 9430 sentences and 524,232 words were annotated, and the text corpus of pituitary adenomas (TCPA) was constructed and analyzed. Its potential application in large language models (LLMs) was explored through fine-tuning and prompting experiments. Results: TCPA had 4782 medical entities and 28,998 tokens, achieving good quality with the inter-annotator agreement value of 0.862\u20130.986. The LLMs experiments showed that TCPA can be used to automatically identify clinical information from free texts, and introducing instances with clinical characteristics can effectively reduce the need for training data, thereby reducing labor costs. Conclusion: This study characterized pituitary adenomas in clinical notes, and the proposed method were able to serve as references for relevant research in medical natural language scenarios with highly specialized language structure and terminology.<\/jats:p>","DOI":"10.1177\/14604582241291442","type":"journal-article","created":{"date-parts":[[2024,10,9]],"date-time":"2024-10-09T21:12:04Z","timestamp":1728508324000},"update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":1,"title":["Characterizing pituitary adenomas in clinical notes: Corpus construction and its application in LLMs"],"prefix":"10.1177","volume":"30","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4352-3250","authenticated-orcid":false,"given":"Jiahui","family":"Hu","sequence":"first","affiliation":[{"name":"Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-1429-2926","authenticated-orcid":false,"given":"Jin","family":"Fu","sequence":"additional","affiliation":[{"name":"Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3705-5737","authenticated-orcid":false,"given":"Wanqing","family":"Zhao","sequence":"additional","affiliation":[{"name":"Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1426-670X","authenticated-orcid":false,"given":"Pei","family":"Lou","sequence":"additional","affiliation":[{"name":"Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1666-2717","authenticated-orcid":false,"given":"Ming","family":"Feng","sequence":"additional","affiliation":[{"name":"Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-4796-3720","authenticated-orcid":false,"given":"Huiling","family":"Ren","sequence":"additional","affiliation":[{"name":"Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9187-0954","authenticated-orcid":false,"given":"Shanshan","family":"Feng","sequence":"additional","affiliation":[{"name":"Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3466-2593","authenticated-orcid":false,"given":"Yansheng","family":"Li","sequence":"additional","affiliation":[{"name":"DHC Mediway Technology Co., Ltd., Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9526-9306","authenticated-orcid":false,"given":"An","family":"Fang","sequence":"additional","affiliation":[{"name":"Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2024,10,8]]},"reference":[{"key":"bibr1-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1210\/endrev\/bnad014"},{"key":"bibr2-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1001\/jama.2023.5444"},{"key":"bibr3-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-022-01779-1"},{"key":"bibr4-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2020.103526"},{"key":"bibr5-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1093\/jamia\/ocz200"},{"key":"bibr6-14604582241291442","doi-asserted-by":"publisher","DOI":"10.2196\/45849"},{"key":"bibr7-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2021.103961"},{"key":"bibr8-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2023.104478"},{"key":"bibr9-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1186\/s13326-022-00269-1"},{"key":"bibr10-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1186\/s12911-023-02239-8"},{"key":"bibr11-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1093\/jamia\/ocae197"},{"issue":"11","key":"bibr12-14604582241291442","first-page":"410","volume":"13","author":"Meedin N","year":"2022","journal-title":"Int J Adv Comput Sci Appl"},{"key":"bibr13-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbaa110"},{"key":"bibr14-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-023-35482-0"},{"key":"bibr15-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-021-27358-6"},{"key":"bibr16-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1038\/s41591-023-02448-8"},{"key":"bibr17-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1007\/s11023-020-09548-1"},{"key":"bibr18-14604582241291442","volume-title":"Evaluating large language models trained on code","author":"Chen M","year":"2021"},{"key":"bibr19-14604582241291442","first-page":"342","volume":"270","author":"Funkner A","year":"2020","journal-title":"Stud Health Technol Inf"},{"key":"bibr20-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2021.103761"},{"key":"bibr21-14604582241291442","unstructured":"Institute of Medical Information, Chinese Academy of Medical Sciences (IMICAMS). Chinese clinical natural language processing (CCNLP) platform. https:\/\/ccnlp.imicams.ac.cn\/ (2024, accessed 16 August 2024)."},{"issue":"240","key":"bibr22-14604582241291442","first-page":"1","volume":"24","author":"Chowdhery A","year":"2023","journal-title":"J Mach Learn Res"},{"key":"bibr23-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP48485.2024.10446885"},{"key":"bibr24-14604582241291442","first-page":"1877","volume":"33","author":"Brown T","year":"2020","journal-title":"Adv Neural Inf Process Syst"},{"key":"bibr25-14604582241291442","volume-title":"Lora: low-rank adaptation of large language models","author":"Hu EJ","year":"2021"},{"key":"bibr26-14604582241291442","volume-title":"ChatGLM: a family of large language models from GLM-130B to GLM-4 all tools","author":"Glm T","year":"2024"},{"key":"bibr27-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2878696"},{"key":"bibr28-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-00889-5_10"},{"key":"bibr29-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2878696"},{"key":"bibr30-14604582241291442","unstructured":"Institute of Medical Information, Chinese Academy of Medical Sciences (IMICAMS). Text corpus of pituitary adenomas (TCPA). https:\/\/ccnlp.imicams.ac.cn\/tcpa\/ (2024, accessed 16 August 2024)."},{"key":"bibr31-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1210\/endrev\/bnac024"},{"key":"bibr32-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1038\/s41574-023-00883-8"},{"key":"bibr33-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1530\/EJE-21-0462"},{"key":"bibr34-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1016\/j.jocn.2023.07.026"},{"key":"bibr35-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkaa1113"},{"key":"bibr36-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkad976"},{"key":"bibr37-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1007\/s10916-023-01925-4"},{"key":"bibr38-14604582241291442","doi-asserted-by":"publisher","DOI":"10.2196\/48659"},{"key":"bibr39-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1093\/jamia\/ocad259"},{"key":"bibr40-14604582241291442","volume-title":"Gpt-ner: named entity recognition via large language models","author":"Wang S","year":"2023"},{"key":"bibr41-14604582241291442","doi-asserted-by":"publisher","DOI":"10.1093\/jamia\/ocad218"}],"container-title":["Health Informatics Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14604582241291442","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/14604582241291442","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14604582241291442","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T22:28:44Z","timestamp":1777501724000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/14604582241291442"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10]]},"references-count":41,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,10]]}},"alternative-id":["10.1177\/14604582241291442"],"URL":"https:\/\/doi.org\/10.1177\/14604582241291442","relation":{},"ISSN":["1460-4582","1741-2811"],"issn-type":[{"value":"1460-4582","type":"print"},{"value":"1741-2811","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10]]},"article-number":"14604582241291442"}}