{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,19]],"date-time":"2025-11-19T14:52:52Z","timestamp":1763563972468,"version":"3.41.0"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2017,8,22]],"date-time":"2017-08-22T00:00:00Z","timestamp":1503360000000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2017,8,22]],"date-time":"2017-08-22T00:00:00Z","timestamp":1503360000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["P60AR062755"],"award-info":[{"award-number":["P60AR062755"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["UL1TR000062"],"award-info":[{"award-number":["UL1TR000062"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"published-print":{"date-parts":[[2017,12]]},"DOI":"10.1186\/s12911-017-0518-1","type":"journal-article","created":{"date-parts":[[2017,8,22]],"date-time":"2017-08-22T12:06:55Z","timestamp":1503403615000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":62,"title":["Word2Vec inversion and traditional text classifiers for phenotyping lupus"],"prefix":"10.1186","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0585-3507","authenticated-orcid":false,"given":"Clayton A.","family":"Turner","sequence":"first","affiliation":[]},{"given":"Alexander D.","family":"Jacobs","sequence":"additional","affiliation":[]},{"given":"Cassios K.","family":"Marques","sequence":"additional","affiliation":[]},{"given":"James C.","family":"Oates","sequence":"additional","affiliation":[]},{"given":"Diane L.","family":"Kamen","sequence":"additional","affiliation":[]},{"given":"Paul E.","family":"Anderson","sequence":"additional","affiliation":[]},{"given":"Jihad S.","family":"Obeid","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2017,8,22]]},"reference":[{"issue":"5p2","key":"518_CR1","doi-asserted-by":"publisher","first-page":"1620","DOI":"10.1111\/j.1475-6773.2005.00444.x","volume":"40","author":"KJ O\u2019Malley","year":"2005","unstructured":"O\u2019Malley KJ, Cook KF, Price MD, Wildes KR, Hurdle JF, Ashton CM. Measuring Diagnoses: ICD Code Accuracy. Health Serv Res. 2005; 40(5p2):1620\u201339. doi: 10.1111\/j.1475-6773.2005.00444.x .","journal-title":"Health Serv Res"},{"issue":"8","key":"518_CR2","doi-asserted-by":"publisher","first-page":"1602","DOI":"10.1161\/01.STR.29.8.1602","volume":"29","author":"LB Goldstein","year":"1998","unstructured":"Goldstein LB. Accuracy of ICD-9-CM Coding for the Identification of Patients With Acute Ischemic Stroke : Effect of Modifier Codes. Stroke. 1998; 29(8):1602\u201304. doi: 10.1161\/01.STR.29.8.1602 .","journal-title":"Stroke"},{"issue":"3","key":"518_CR3","doi-asserted-by":"publisher","first-page":"660","DOI":"10.1212\/WNL.49.3.660","volume":"49","author":"C Benesch","year":"1997","unstructured":"Benesch C, Witter Jr D, Wilder A, Duncan P, Samsa G, Matchar D. Inaccuracy of the International Classification of Diseases (ICD-9-CM) in identifying the diagnosis of ischemic cerebrovascular disease. Neurology. 1997; 49(3):660\u20134. doi: 10.1212\/WNL.49.3.660 .","journal-title":"Neurology"},{"key":"518_CR4","doi-asserted-by":"crossref","unstructured":"Taddy M. Association for Computational Linguistics. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Beijing: 2015. p. 45\u20139. http:\/\/anthology.aclweb.org\/P\/P15\/P15-2.pdf#page=73 .","DOI":"10.3115\/v1\/P15-2008"},{"issue":"Pt 2","key":"518_CR5","first-page":"968","volume":"129","author":"Y Aphinyanaphongs","year":"2007","unstructured":"Aphinyanaphongs Y, Aliferis C. Text categorization models for identifying unproven cancer treatments on the web. Stud Health Technol Inform. 2007; 129(Pt 2):968\u201372.","journal-title":"Stud Health Technol Inform"},{"issue":"1-3","key":"518_CR6","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1016\/S1386-5056(02)00057-6","volume":"67","author":"P Ruch","year":"2002","unstructured":"Ruch P, Baud R, Geissb\u00fchler A. Evaluating and reducing the effect of data corruption when applying bag of words approaches to medical records. Int J Med Inform. 2002; 67(1-3):75\u201383. doi: 10.1016\/S1386-5056(02)00057-6 .","journal-title":"Int J Med Inform"},{"issue":"90001","key":"518_CR7","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1093\/nar\/gkh061","volume":"32","author":"O Bodenreider","year":"2004","unstructured":"Bodenreider O. The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res. 2004; 32(90001):267\u201370. doi: 10.1093\/nar\/gkh061 .","journal-title":"Nucleic Acids Res"},{"key":"518_CR8","volume-title":"AMIA... Annual Symposium proceedings \/ AMIA Symposium. AMIA Symposium","author":"BT McInnes","year":"2007","unstructured":"McInnes BT, Pedersen T, Carlis J. Using UMLS Concept Unique Identifiers (CUIs) for word sense disambiguation in the biomedical domain. In: AMIA... Annual Symposium proceedings \/ AMIA Symposium. AMIA Symposium. Bethesda: National Institute of Health: 2007. p. 533\u20137."},{"issue":"5","key":"518_CR9","doi-asserted-by":"publisher","first-page":"507","DOI":"10.1136\/jamia.2009.001560","volume":"17","author":"GK Savova","year":"2010","unstructured":"Savova GK, Masanz JJ, Ogren PV, Zheng J, Sohn S, Kipper-Schuler KC, Chute CG. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications. J Am Med Inform Assoc JAMIA. 2010; 17(5):507\u201313. doi: 10.1136\/jamia.2009.001560 .","journal-title":"J Am Med Inform Assoc JAMIA"},{"issue":"5","key":"518_CR10","doi-asserted-by":"publisher","first-page":"614","DOI":"10.1136\/amiajnl-2011-000093","volume":"18","author":"V Garla","year":"2011","unstructured":"Garla V, Lo Re V, Dorey-Stein Z, Kidwai F, Scotch M, Womack J, Justice A, Brandt C. The Yale cTAKES extensions for document classification: architecture and application. J Am Med Inform Assoc JAMIA. 2011; 18(5):614\u201320. doi: 10.1136\/amiajnl-2011-000093 .","journal-title":"J Am Med Inform Assoc JAMIA"},{"issue":"Suppl 1","key":"518_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1472-6947-13-S1-S1","volume":"13 Suppl 1","author":"B Tang","year":"2013","unstructured":"Tang B, Cao H, Wu Y, Jiang M, Xu H. Recognizing clinical entities in hospital discharge summaries using Structural Support Vector Machines with word representation features. BMC Med Inform Decis Mak. 2013; 13 Suppl 1(Suppl 1):1. doi: 10.1186\/1472-6947-13-S1-S1 .","journal-title":"BMC Med Inform Decis Mak"},{"key":"518_CR12","first-page":"240403","volume":"2014","author":"B Tang","year":"2014","unstructured":"Tang B, Cao H, Wang X, Chen Q, Xu H. Evaluating word representation features in biomedical named entity recognition tasks. BioMed Res Int. 2014; 2014:240403. doi: 10.1155\/2014\/240403 .","journal-title":"BioMed Res Int"},{"issue":"9","key":"518_CR13","doi-asserted-by":"publisher","first-page":"1725","DOI":"10.1002\/art.1780400928","volume":"40","author":"MC Hochberg","year":"1997","unstructured":"Hochberg MC. Updating the american college of rheumatology revised criteria for the classification of systemic lupus erythematosus. Arthritis Rheum. 1997; 40(9):1725.","journal-title":"Arthritis Rheum"},{"issue":"11","key":"518_CR14","doi-asserted-by":"publisher","first-page":"1271","DOI":"10.1002\/art.1780251101","volume":"25","author":"EM Tan","year":"1982","unstructured":"Tan EM, Cohen AS, Fries JF, Masi AT, Mcshane DJ, Rothfield NF, Schaller JG, Talal N, Winchester RJ. The 1982 revised criteria for the classification of systemic lupus erythematosus. Arthritis Rheum. 1982; 25(11):1271\u20137.","journal-title":"Arthritis Rheum"},{"issue":"8","key":"518_CR15","doi-asserted-by":"publisher","first-page":"2677","DOI":"10.1002\/art.34473","volume":"64","author":"D Isenberg","year":"2012","unstructured":"Isenberg D, Wallace DJ, Nived O, Ramsey-goldman R, Ph MD, Bae S-c, Ph MDD. Derivation and Validation of Systemic Lupus International Collaborating Clinics Classification Criteria for Systemic Lupus Erythematosus. Arthritis Rheum. 2012; 64(8):2677\u201386. doi: 10.1002\/art.34473. .","journal-title":"Arthritis Rheum"},{"issue":"3","key":"518_CR16","doi-asserted-by":"publisher","first-page":"299","DOI":"10.1109\/TKDE.2005.50","volume":"17","author":"J Huang","year":"2005","unstructured":"Huang J, Ling CX. Using AUC and accuracy in evaluating learning algorithms. IEEE Trans Knowl Data Eng. 2005; 17(3):299\u2013310. doi: 10.1109\/TKDE.2005.50 .","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"518_CR17","volume-title":"Natural Language Processing with Python","author":"S Bird","year":"2009","unstructured":"Bird S, Klein E, Loper E. Natural Language Processing with Python, 1st edn. Sebastopol: O\u2019Reilly Media, Inc.; 2009."},{"key":"518_CR18","volume-title":"Python reference manual. Technical report","author":"G Rossum","year":"1995","unstructured":"Rossum G. Python reference manual. Technical report. The Netherlands: Amsterdam; 1995."},{"key":"518_CR19","unstructured":"Mikolov T, Corrado G, Chen K, Dean J. Efficient Estimation of Word Representations in Vector Space. In: Proceedings of the International Conference on Learning Representations (ICLR 2013). J. Mach. Learn. Res.2013. p. 1\u201312. arXiv:1301.3781v3."},{"key":"518_CR20","first-page":"3371","volume":"11","author":"P Vincent","year":"2010","unstructured":"Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol PA. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res. 2010; 11:3371\u2013408.","journal-title":"J Mach Learn Res"},{"key":"518_CR21","volume-title":"Advances in Neural Information Processing Systems","author":"N Srivastava","year":"2012","unstructured":"Srivastava N, Salakhutdinov RR. Multimodal learning with deep boltzmann machines. In: Advances in Neural Information Processing Systems. Cambridge: MIT Press: 2012. p. 2222\u201330."},{"key":"518_CR22","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E. Scikit-learn: Machine learning in Python. J Mach Learn Res. 2011; 12:2825\u201330.","journal-title":"J Mach Learn Res"},{"key":"518_CR23","volume-title":"Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop","author":"F Bastien","year":"2012","unstructured":"Bastien F, Lamblin P, Pascanu R, Bergstra J, Goodfellow IJ, Bergeron A, Bouchard N, Bengio Y. Theano: new features and speed improvements. In: Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop. Cambridge: MIT Press: 2012."},{"key":"518_CR24","doi-asserted-by":"crossref","unstructured":"Bergstra J, Breuleux O, Bastien F, Lamblin P, Pascanu R, Desjardins G, Turian J, Warde-Farley D, Bengio Y. Theano: a CPU and GPU math expression compiler. In: Proceedings of the Python for Scientific Computing Conference (SciPy).2010. Oral Presentation. http:\/\/www-etud.iro.umontreal.ca\/~wardefar\/publications\/theano_scipy2010.pdf .","DOI":"10.25080\/Majora-92bf1922-003"},{"key":"518_CR25","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/3071.001.0001","volume-title":"Foundations of neural networks, fuzzy systems, and knowledge engineering","author":"NK Kasabov","year":"1996","unstructured":"Kasabov NK. Foundations of neural networks, fuzzy systems, and knowledge engineering. Cambridge: MIT Press; 1996. pp. 14\u201328. http:\/\/boente.eti.br\/fuzzy\/ebook-fuzzy-kazabov.pdf ."},{"issue":"3","key":"518_CR26","first-page":"2913","volume":"22","author":"CY Lee","year":"2011","unstructured":"Lee CY, Chiou CW. An Improved Random Forest Classifier for Text Categorization. J Comput. 2011; 22(3):2913\u201320.","journal-title":"J Comput."},{"key":"518_CR27","volume-title":"European Conference on Machine Learning","author":"DD Lewis","year":"1998","unstructured":"Lewis DD. Naive (bayes) at forty: The independence assumption in information retrieval. In: European Conference on Machine Learning. Florham Park: Springer: 1998. p. 4\u201315."},{"key":"518_CR28","doi-asserted-by":"crossref","unstructured":"Joachims T. Text Categorization with Suport Vector Machines: Learning with Many Relevant Features. In: Proceedings of the 10th European Conference on Machine Learning ECML \u201998: 1998. p. 137\u201342. doi: 10.1007\/BFb0026683 .","DOI":"10.1007\/BFb0026683"},{"key":"518_CR29","unstructured":"Pouransari H, Ghili S. Deep learning for sentiment analysis of movie reviews: Stanford University; 2014. https:\/\/cs224d.stanford.edu\/reports\/PouransariHadi.pdf ."},{"issue":"438","key":"518_CR30","first-page":"548","volume":"92","author":"B Efron","year":"1997","unstructured":"Efron B, Tibshirani R. Improvements on cross-validation: The.632 plus bootstrap method. J Am Stat Assoc. 1997; 92(438):548. doi: 10.1080\/01621459.1997.10474007 .","journal-title":"J Am Stat Assoc"},{"key":"518_CR31","volume-title":"R: A Language and Environment for Statistical Computing","author":"R Core Team","year":"2015","unstructured":"R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2015. R Foundation for Statistical Computing. https:\/\/www.R-project.org\/ ."},{"key":"518_CR32","unstructured":"LeDell E, Petersen M, van der Laan M. cvAUC: Cross-Validated Area Under the ROC Curve Confidence Intervals. 2014. R package version 1.1.0. http:\/\/CRAN.R-project.org\/package=cvAUC . Accessed May 2016."},{"key":"518_CR33","unstructured":"Lewis DD. Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval. 1998. doi: 10.1007\/BFb0026666 . http:\/\/citeseer.ist.psu.edu\/viewdoc\/summary?doi=10.1.1.11.8264 . Accessed May 2016."},{"issue":"2","key":"518_CR34","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1145\/306686.306688","volume":"17","author":"WW Cohen","year":"1999","unstructured":"Cohen WW, Singer Y. Context-sensitive learning methods for text categorization. ACM Trans Inf Syst (TOIS). 1999; 17(2):141\u201373.","journal-title":"ACM Trans Inf Syst (TOIS)"},{"key":"518_CR35","doi-asserted-by":"crossref","unstructured":"Abu-Nimeh S, Nappa D, Wang X, Nair S. A comparison of machine learning techniques for phishing detection. In: Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit on - eCrime \u201907: 2007. p. 60\u201369. doi: 10.1145\/1299015.1299021 .","DOI":"10.1145\/1299015.1299021"},{"issue":"12","key":"518_CR36","first-page":"2913","volume":"7","author":"B Xu","year":"2012","unstructured":"Xu B, Guo X, Ye Y, Cheng J. An Improved Random Forest Classifier for Text Categorization. J Comput. 2012; 7(12):2913\u201320.","journal-title":"J Comput"},{"key":"518_CR37","doi-asserted-by":"crossref","unstructured":"Breiman L. Random forests. Mach Learn. 2001:5\u201332. doi: 10.1023\/A:1010933404324 . http:\/\/dx.doi.org\/10.1023%2FA3A1010933404324 .","DOI":"10.1023\/A:1010933404324"},{"key":"518_CR38","doi-asserted-by":"crossref","unstructured":"Clark J, Koprinska I, Poon J, Sydney U. A neural network based approach to automated e-mail classification. In: Proceedings IEEE\/WIC International Conference on Web Intelligence (WI 2003): 2003. p. 702\u2013705. doi: 10.1109\/WI.2003.1241300 .","DOI":"10.1109\/WI.2003.1241300"},{"issue":"1","key":"518_CR39","first-page":"160","volume":"20","author":"R Collobert","year":"2008","unstructured":"Collobert R, Weston J. A Unified Architecture for Natural Language Processing : Deep Neural Networks with Multitask Learning. Architecture. 2008; 20(1):160\u20137. doi: 10.1145\/1390156.1390177 .","journal-title":"Architecture"},{"key":"518_CR40","doi-asserted-by":"crossref","unstructured":"Cires D, Meier U. Multi-column Deep Neural Networks for Image Classification. Appl Sci (February). 2012;20. http:\/\/arxiv.org\/abs\/1202.2745.","DOI":"10.1109\/CVPR.2012.6248110"},{"key":"518_CR41","unstructured":"Zhang X-y. Deep Neural Networks. 2013; 23(3):540\u201352. arXiv:1402.1869v2."},{"key":"518_CR42","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1186\/1471-2105-12-77","volume":"12","author":"X Robin","year":"2011","unstructured":"Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez JC, M\u00fcller M. Proc: an open-source package for r and s+ to analyze and compare roc curves. BMC Bioinforma. 2011; 12:77.","journal-title":"BMC Bioinforma"}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-017-0518-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12911-017-0518-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-017-0518-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,25]],"date-time":"2025-06-25T00:47:05Z","timestamp":1750812425000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-017-0518-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,8,22]]},"references-count":42,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2017,12]]}},"alternative-id":["518"],"URL":"https:\/\/doi.org\/10.1186\/s12911-017-0518-1","relation":{},"ISSN":["1472-6947"],"issn-type":[{"type":"electronic","value":"1472-6947"}],"subject":[],"published":{"date-parts":[[2017,8,22]]},"assertion":[{"value":"30 January 2017","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 August 2017","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 August 2017","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"This research was approved by the Institutional Review Board for Human Research (IRB) at the Medical University of South Carolina under protocol number Pro00009511. The study was approved with respect to the study of human subjects as adequately protecting the rights and welfare of the individuals involved. Additionally, the Institutional Review Board for Human Research (IRB) recommended approval of the investigator\u2019s request for Waiver of Consent pursuant to 45 CFR 46.116(d) because the research involves no more than minimal risk to the subject, the waiver will not adversely affect the rights and welfare of the subjects, and the research could not be practicably carried out without the waiver. The Institutional Review Board for Human Research (IRB) also recommended approval of the investigator\u2019s request for a HIPAA Waiver of Authorization, as it appears that the criteria of the Privacy Rule have been satisfied. The HIPAA Waiver of Authorization was reviewed under expedited review procedures. No IRB member who has a conflicting interest was involved in the review or approval of this study, except to provide information as requested by the IRB.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}},{"value":"Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Publisher\u2019s Note"}}],"article-number":"126"}}