{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,3]],"date-time":"2026-05-03T23:45:52Z","timestamp":1777851952607,"version":"3.51.4"},"reference-count":35,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2022,10,1]],"date-time":"2022-10-01T00:00:00Z","timestamp":1664582400000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"Veterans Administration Merit Review Grant","award":["IIR 14-011"],"award-info":[{"award-number":["IIR 14-011"]}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Health Informatics J"],"published-print":{"date-parts":[[2022,10]]},"abstract":"<jats:p>Colorectal cancer incidence has continually fallen among those 50\u00a0years old and over. However, the incidence has increased in those under 50. Even with the recent screening guidelines recommending that screening begins at age 45, nearly half of all early-onset colorectal cancer will be missed. Methods are needed to identify high-risk individuals in this age group for targeted screening. Colorectal cancer studies, as with other clinical studies, have required labor intensive chart review for the identification of those affected and risk factors. Natural language processing and machine learning can be used to automate the process and enable the screening of large numbers of patients. This study developed and compared four machine learning and statistical models: logistic regression, support vector machine, random forest, and deep neural network, in their performance in classifying colorectal cancer patients. 
Excellent classification performance is achieved with AUCs over 97%.<\/jats:p>","DOI":"10.1177\/14604582221134406","type":"journal-article","created":{"date-parts":[[2022,10,27]],"date-time":"2022-10-27T06:26:27Z","timestamp":1666851987000},"update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":8,"title":["Identification of colorectal cancer using structured and free text clinical data"],"prefix":"10.1177","volume":"28","author":[{"given":"Douglas F","family":"Redd","sequence":"first","affiliation":[{"name":"Washington DC VA Medical Center, Washington, DC, USA Biomedical Informatics Center, The George Washington University School of Medicine and Health Sciences, Washington, DC, USA"}]},{"given":"Yijun","family":"Shao","sequence":"additional","affiliation":[{"name":"Washington DC VA Medical Center, Washington, DC, USA Biomedical Informatics Center, The George Washington University School of Medicine and Health Sciences, Washington, DC, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8353-7473","authenticated-orcid":false,"given":"Qing","family":"Zeng-Treitler","sequence":"additional","affiliation":[{"name":"Washington DC VA Medical Center, Washington, DC, USA Biomedical Informatics Center, The George Washington University School of Medicine and Health Sciences, Washington, DC, USA"}]},{"given":"Laura J","family":"Myers","sequence":"additional","affiliation":[{"name":"Richard L Roudebush VA Medical Center, Indianapolis, IN, USA Indiana University School of Medicine, Indianapolis, IN, USA Regenstrief Institute Inc, Indianapolis, IN, USA"}]},{"given":"Barry C","family":"Barker","sequence":"additional","affiliation":[{"name":"Richard L Roudebush VA Medical Center, Indianapolis, IN, USA"}]},{"given":"Stuart J","family":"Nelson","sequence":"additional","affiliation":[{"name":"Biomedical Informatics Center, The George Washington University School of Medicine and Health Sciences, Washington, DC, 
USA"}]},{"given":"Thomas F","family":"Imperiale","sequence":"additional","affiliation":[{"name":"Richard L Roudebush VA Medical Center, Indianapolis, IN, USA Indiana University School of Medicine, Indianapolis, IN, USA Regenstrief Institute Inc, Indianapolis, IN, USA"}]}],"member":"179","published-online":{"date-parts":[[2022,10,27]]},"reference":[{"key":"bibr1-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1002\/cncr.24760"},{"key":"bibr2-14604582221134406","doi-asserted-by":"publisher","DOI":"10.7326\/0003-4819-149-9-200811040-00243"},{"key":"bibr3-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1038\/ajg.2009.104"},{"key":"bibr4-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1053\/j.gastro.2008.02.002"},{"key":"bibr5-14604582221134406","volume-title":"Colorectal Cancer - Cancer Stat Facts","year":"2020"},{"key":"bibr6-14604582221134406","doi-asserted-by":"crossref","unstructured":"Wachter K. Colorectal cancer rates up in people aged 40 to 44, 4. GI & Hepatology News AGA Institute, 2010, pp. 
1\u20134.","DOI":"10.1016\/S0031-398X(10)70122-3"},{"key":"bibr7-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1001\/archinternmed.2011.602"},{"key":"bibr8-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1002\/cncr.22012"},{"key":"bibr9-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1002\/jso.2930510311"},{"key":"bibr10-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1007\/s00268-004-7306-7"},{"key":"bibr11-14604582221134406","doi-asserted-by":"publisher","DOI":"10.3322\/caac.21457"},{"key":"bibr12-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1001\/jama.2021.6238"},{"key":"bibr13-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1111\/j.1553-2712.2004.tb01433.x"},{"key":"bibr14-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1097\/00002060-198906000-00010"},{"key":"bibr15-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1177\/0272989X11400418"},{"key":"bibr16-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1016\/j.gie.2020.04.077"},{"key":"bibr17-14604582221134406","doi-asserted-by":"publisher","DOI":"10.2196\/preprints.32973"},{"key":"bibr18-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1016\/j.cgh.2012.11.035"},{"key":"bibr19-14604582221134406","first-page":"1564","volume":"2011","author":"Xu H","year":"2011","journal-title":"AMIA Annu Symp Proc"},{"key":"bibr20-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-10-213"},{"key":"bibr21-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-15-S11-S11"},{"key":"bibr22-14604582221134406","first-page":"993","volume":"3","author":"Blei DM","year":"2003","journal-title":"J Machine Learn Research"},{"key":"bibr23-14604582221134406","volume-title":"Corporate Data Warehouse (CDW)","author":"Health Services Research & Development"},{"key":"bibr24-14604582221134406","first-page":"191","volume":"411","author":"Le Cessie S","year":"1992","journal-title":"J R Stat 
Soc Ser C (Applied Statistics)"},{"key":"bibr25-14604582221134406","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/1130.003.0016"},{"key":"bibr26-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010933404324"},{"key":"bibr27-14604582221134406","volume-title":"Proceedings of the Python for Scientific Computing Conference","author":"Bergstra JBO","year":"2010"},{"key":"bibr28-14604582221134406","volume-title":"Lasagne","author":"Dieleman SSJ","year":"2015"},{"key":"bibr29-14604582221134406","first-page":"1139","volume":"28","author":"Sutskever I","year":"2013","journal-title":"Proc 30th Int Conf Machine Learn PMLR"},{"key":"bibr30-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1186\/s12911-019-0846-4"},{"key":"bibr31-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1002\/acr.23140"},{"key":"bibr32-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1002\/pds.4810"},{"key":"bibr33-14604582221134406","doi-asserted-by":"publisher","DOI":"10.2196\/23930"},{"key":"bibr34-14604582221134406","doi-asserted-by":"publisher","DOI":"10.3322\/caac.21601"},{"key":"bibr35-14604582221134406","doi-asserted-by":"publisher","DOI":"10.1007\/s10916-020-01701-8"}],"container-title":["Health Informatics 
Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14604582221134406","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/14604582221134406","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14604582221134406","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T22:28:05Z","timestamp":1777501685000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/14604582221134406"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10]]},"references-count":35,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,10]]}},"alternative-id":["10.1177\/14604582221134406"],"URL":"https:\/\/doi.org\/10.1177\/14604582221134406","relation":{},"ISSN":["1460-4582","1741-2811"],"issn-type":[{"value":"1460-4582","type":"print"},{"value":"1741-2811","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10]]},"article-number":"14604582221134406"}}