{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T19:24:34Z","timestamp":1773775474326,"version":"3.50.1"},"reference-count":55,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2020,3,17]],"date-time":"2020-03-17T00:00:00Z","timestamp":1584403200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Scientific Research Instrument Developing Project of the Chinese Academy of Sciences","award":["YJKYYQ20170067"],"award-info":[{"award-number":["YJKYYQ20170067"]}]},{"name":"Science and Technology Service Network Program of Chinese Academy of Sciences","award":["KFJ-STS-SCYD-007"],"award-info":[{"award-number":["KFJ-STS-SCYD-007"]}]},{"name":"Institute-City Cooperation Project of Chinese Academy of Sciences","award":["18YFYSZC00010"],"award-info":[{"award-number":["18YFYSZC00010"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,2,14]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Big data in medical diagnosis can provide abundant value for clinical diagnosis, decision support and many other applications, but obtaining a large number of labeled medical data will take a lot of time and manpower. In this paper, a classification model based on semi-supervised learning algorithm using both labeled and unlabeled data is proposed to process big data in medical diagnosis, which includes structured, semi-structured and unstructured data. For the medical laboratory data, this paper proposes a self-training algorithm based on repeated labeling strategy to solve the problem that mislabeled samples weaken the performance of classifiers. Aiming at medical record data, this paper extracts features with high correlation of classification results based on domain expert knowledge base first, and then chooses the unlabeled medical record data with the highest confidence to expand the training set and optimizes the performance of the classifiers of tri-training algorithm, which uses supervised learning algorithm to train three basic classifiers. The experimental results show that the proposed medical diagnosis data classification model based on semi-supervised learning algorithm has good performance.<\/jats:p>","DOI":"10.1093\/comjnl\/bxaa006","type":"journal-article","created":{"date-parts":[[2020,1,17]],"date-time":"2020-01-17T12:09:30Z","timestamp":1579262970000},"page":"177-191","source":"Crossref","is-referenced-by-count":13,"title":["Classification Model on Big Data in Medical Diagnosis Based on Semi-Supervised Learning"],"prefix":"10.1093","volume":"65","author":[{"given":"Lei","family":"Wang","sequence":"first","affiliation":[{"name":"Suzhou Institute of Biomedical Engineering and Technology, Key Laboratory of Biomedical Testing Technology of CAS, Chinese Academy of Sciences, No.88 Keling Road, SND, Suzhou 215163, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qing","family":"Qian","sequence":"additional","affiliation":[{"name":"Suzhou Institute of Biomedical Engineering and Technology, Key Laboratory of Biomedical Testing Technology of CAS, Chinese Academy of Sciences, No.88 Keling Road, SND, Suzhou 215163, China"},{"name":"Tianjin Guokeyigong Science & Technology Development Co., Ltd., Research & Development Department, Tianjin 300300, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qiang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Suzhou Institute of Biomedical Engineering and Technology, Key Laboratory of Biomedical Testing Technology of CAS, Chinese Academy of Sciences, No.88 Keling Road, SND, Suzhou 215163, China"},{"name":"Tianjin Guokeyigong Science & Technology Development Co., Ltd., Research & Development Department, Tianjin 300300, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jishuai","family":"Wang","sequence":"additional","affiliation":[{"name":"Suzhou Institute of Biomedical Engineering and Technology, Key Laboratory of Biomedical Testing Technology of CAS, Chinese Academy of Sciences, No.88 Keling Road, SND, Suzhou 215163, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenbo","family":"Cheng","sequence":"additional","affiliation":[{"name":"Suzhou Institute of Biomedical Engineering and Technology, Key Laboratory of Biomedical Testing Technology of CAS, Chinese Academy of Sciences, No.88 Keling Road, SND, Suzhou 215163, China"},{"name":"State Key Lab of Optical Technologies on Nano-Fabrication and Micro-Engineering, Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu 410075, China"},{"name":"Department of Optics and Optical Engineering,University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei","family":"Yan","sequence":"additional","affiliation":[{"name":"State Key Lab of Optical Technologies on Nano-Fabrication and Micro-Engineering, Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu 410075, China"},{"name":"Department of Optics and Optical Engineering,University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2020,3,17]]},"reference":[{"key":"2022021610435047100_ref1","doi-asserted-by":"crossref","first-page":"370","DOI":"10.1016\/j.eswa.2017.06.027","article-title":"Smart health: Big data enabled health paradigm within smart cities","volume":"87","author":"Pramanik","year":"2017","journal-title":"Expert Syst Appl"},{"key":"2022021610435047100_ref2","doi-asserted-by":"crossref","first-page":"1272","DOI":"10.1016\/j.amjmed.2018.05.038","article-title":"The big health data\u2013intelligent machine paradox","volume":"131","author":"Miller","year":"2018","journal-title":"Am J Med"},{"key":"2022021610435047100_ref3","doi-asserted-by":"crossref","first-page":"1747","DOI":"10.3390\/s19081747","article-title":"Meaningful integration of data from heterogeneous health services and home environment based on ontology","volume":"19","author":"Peng","year":"2019","journal-title":"Sensors"},{"key":"2022021610435047100_ref4","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.cmpb.2016.04.016","article-title":"An effective model for store and retrieve big health data in cloud computing","volume":"132","author":"Zohreh","year":"2016","journal-title":"Comput Meth Prog Bio"},{"key":"2022021610435047100_ref5","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1016\/j.outlook.2017.11.006","article-title":"Systems biology for nursing in the era of big data and precision health","volume":"66","author":"Found","year":"2018","journal-title":"Nursing Outlook"},{"key":"2022021610435047100_ref6","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1016\/j.bdr.2015.02.002","article-title":"Promises and challenges of big data computing in health sciences","volume":"2","author":"Huang","year":"2015","journal-title":"Big Data Res"},{"key":"2022021610435047100_ref7","doi-asserted-by":"crossref","first-page":"408","DOI":"10.1016\/j.procs.2015.04.021","article-title":"A survey of big data analytics in healthcare and government","volume":"50","author":"Archenaa","year":"2015","journal-title":"Procedia Comput Sci"},{"key":"2022021610435047100_ref8","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/j.cmpb.2011.03.018","article-title":"Classifier ensemble construction with rotation forest to improve medical diagnosis performance of machine learning algorithms","volume":"104","author":"Ozcift","year":"2011","journal-title":"Comput Meth Prog Bio"},{"key":"2022021610435047100_ref9","doi-asserted-by":"crossref","first-page":"292","DOI":"10.3390\/electronics8030292","article-title":"A state-of-the-art survey on deep learning theory and architectures","volume":"8","author":"Alom","year":"2019","journal-title":"Electronics"},{"key":"2022021610435047100_ref10","doi-asserted-by":"crossref","first-page":"e262","DOI":"10.1016\/S1470-2045(19)30149-4","article-title":"Big data and machine learning algorithms for health-care delivery","volume":"20","author":"Ngiam","year":"2019","journal-title":"Lancet Oncol"},{"key":"2022021610435047100_ref11","doi-asserted-by":"crossref","first-page":"2558","DOI":"10.3390\/s19112558","article-title":"Electrocardiogram classification based on faster regions with convolutional neural network","volume":"19","author":"Ji","year":"2019","journal-title":"Sensors"},{"key":"2022021610435047100_ref12","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1016\/j.patcog.2018.07.026","article-title":"Hierarchical Bayesian image analysis: From low-level modeling to robust supervised learning","volume":"85","author":"Lagrange","year":"2017","journal-title":"Pattern Recognit"},{"key":"2022021610435047100_ref13","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1016\/j.coisb.2017.04.012","article-title":"Perspectives on big data applications of health information","volume":"3","author":"Cano","year":"2017","journal-title":"Curr Opin Syst Bio"},{"key":"2022021610435047100_ref14","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1016\/j.future.2017.12.059","article-title":"Smart health monitoring and management system: Toward autonomous wearable sensing for internet of things using big data analytics","volume":"91","author":"Din","year":"2019","journal-title":"Future Gener Comp Sy"},{"key":"2022021610435047100_ref15","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1016\/j.cmpb.2018.10.008","article-title":"Towards an efficient and energy-aware mobile big health data architecture","volume":"166","author":"Navaz","year":"2018","journal-title":"Comput Meth Progr Biomed"},{"key":"2022021610435047100_ref16","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1016\/j.eswa.2019.05.002","article-title":"Benchmarking unsupervised near-duplicate image detection","volume":"135","author":"Morra","year":"2019","journal-title":"Expert Syst Appl"},{"key":"2022021610435047100_ref17","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1016\/j.jsv.2018.08.040","article-title":"Active learning for semi-supervised structural health monitoring","volume":"437","author":"Bull","year":"2018","journal-title":"J Sound Vib"},{"key":"2022021610435047100_ref18","doi-asserted-by":"crossref","first-page":"337","DOI":"10.3390\/rs9040337","article-title":"Supervised and semi-supervised multi-view canonical correlation analysis ensemble for heterogeneous domain adaptation in remote sensing image classification","volume":"9","author":"Samat","year":"2017","journal-title":"Remote Sens"},{"key":"2022021610435047100_ref19","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1016\/j.cmpb.2018.10.004","article-title":"A semi-supervised deep learning method based on stacked sparse auto-encoder for cancer prediction using RNA-seq data","volume":"166","author":"Xiao","year":"2018","journal-title":"Comput Meth Prog Biomed"},{"key":"2022021610435047100_ref20","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1016\/j.artmed.2017.10.003","article-title":"SSEL-ADE: A semi-supervised ensemble learning framework for extracting adverse drug events from social media","volume":"84","author":"Liu","year":"2018","journal-title":"Artif Intell Med"},{"key":"2022021610435047100_ref21","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/j.knosys.2019.02.008","article-title":"Semi-supervised aspect-level sentiment classification model based on variational autoencoder","volume":"171","author":"Fu","year":"2019","journal-title":"Knowl Based Syst"},{"key":"2022021610435047100_ref22","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1016\/j.neucom.2019.01.059","article-title":"Semi-supervised target-oriented sentiment classification","volume":"337","author":"Xu","year":"2019","journal-title":"Neurocomputing"},{"key":"2022021610435047100_ref23","doi-asserted-by":"crossref","first-page":"130117","DOI":"10.1016\/j.jbi.2019.103117","article-title":"Semi-supervised learning to improve generalizability of risk prediction models","volume":"92","author":"Chi","year":"2019","journal-title":"J Biomed Inf"},{"key":"2022021610435047100_ref24","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1016\/j.neunet.2019.03.014","article-title":"Semi-supervised deep learning of brain tissue segmentation","volume":"16","author":"Ito","year":"2019","journal-title":"Neural Netw"},{"key":"2022021610435047100_ref25","doi-asserted-by":"crossref","first-page":"2706","DOI":"10.3390\/s18082706","article-title":"Semi-supervised generative adversarial nets with multiple generators for SAR image recognition","volume":"18","author":"Gao","year":"2018","journal-title":"Sensors"},{"key":"2022021610435047100_ref26","doi-asserted-by":"crossref","first-page":"280","DOI":"10.1016\/j.media.2019.03.009","article-title":"Not-so-supervised: A survey of semi-supervised, multi-instance, and transfer learning in medical image analysis","volume":"54","author":"Cheplygina","year":"2019","journal-title":"Med Image Anal"},{"key":"2022021610435047100_ref27","article-title":"A review on graph-based semi-supervised learning methods for hyperspectral image classification","author":"Sawant","year":"2018","journal-title":"Egyptian J Remote Sens Space Sci"},{"key":"2022021610435047100_ref28","doi-asserted-by":"crossref","first-page":"101393","DOI":"10.1016\/j.scs.2018.12.021","article-title":"Household appliance recognition through a Bayes classification model","volume":"46","author":"Yan","year":"2019","journal-title":"Sustain Cities Soc"},{"key":"2022021610435047100_ref29","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1016\/j.ipm.2018.10.014","article-title":"A novel intelligent classification model for breast cancer diagnosis","volume":"56","author":"Liu","year":"2019","journal-title":"Inform Process Manag"},{"key":"2022021610435047100_ref30","doi-asserted-by":"crossref","first-page":"687","DOI":"10.1016\/j.ejor.2017.12.001","article-title":"A support vector machine-based ensemble algorithm for breast cancer diagnosis","volume":"267","author":"Wang","year":"2018","journal-title":"Eur J Oper Res"},{"key":"2022021610435047100_ref31","doi-asserted-by":"crossref","first-page":"631","DOI":"10.3390\/electronics8060631","article-title":"Parallel implementation on FPGA of support vector machines using stochastic gradient descent","volume":"8","author":"Lopes","year":"2019","journal-title":"Electronics"},{"key":"2022021610435047100_ref32","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1016\/j.neucom.2018.07.016","article-title":"A new image classification model based on brain parallel interaction mechanism","volume":"315","author":"Yu","year":"2018","journal-title":"Neurocomputing"},{"key":"2022021610435047100_ref33","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1016\/j.patcog.2017.02.009","article-title":"Nonlinear dictionary learning with application to image classification","volume":"75","author":"Hu","year":"2018","journal-title":"Pattern Recognit"},{"key":"2022021610435047100_ref34","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.eswa.2018.08.039","article-title":"Dispersion ratio based decision tree model for classification","volume":"116","author":"Roy","year":"2019","journal-title":"Expert Syst Appl"},{"key":"2022021610435047100_ref35","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1016\/j.patrec.2017.09.036","article-title":"A novel kNN algorithm with data-driven k parameter computation","volume":"109","author":"Zhang","year":"2018","journal-title":"Pattern Recognit Lett"},{"key":"2022021610435047100_ref36","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1016\/j.neucom.2015.08.112","article-title":"Efficient kNN classification algorithm for big data","volume":"195","author":"Deng","year":"2016","journal-title":"Neurocomputing"},{"key":"2022021610435047100_ref37","doi-asserted-by":"crossref","first-page":"1064","DOI":"10.1016\/j.procs.2016.04.224","article-title":"Using machine learning algorithms for breast cancer risk prediction and diagnosis","volume":"83","author":"Asri","year":"2016","journal-title":"Procedia Comput Sci"},{"key":"2022021610435047100_ref38","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.csbj.2016.11.004","article-title":"A hybrid computer-aided-diagnosis system for prediction of breast cancer recurrence (HPBCR) using optimized ensemble learning","volume":"15","author":"Mohebian","year":"2017","journal-title":"Comput Struct Biotec J"},{"key":"2022021610435047100_ref39","doi-asserted-by":"crossref","first-page":"340","DOI":"10.1016\/j.eswa.2018.08.040","article-title":"A dynamic gradient boosting machine using genetic optimizer for practical breast cancer prognosis","volume":"116","author":"Lu","year":"2019","journal-title":"Expert Syst Appl"},{"key":"2022021610435047100_ref40","article-title":"A new nested ensemble technique for automated diagnosis of breast cancer","author":"Abdar","year":"2018","journal-title":"Pattern Recognit Lett"},{"key":"2022021610435047100_ref41","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1016\/j.eswa.2019.05.028","article-title":"A comparison of random forest variable selection methods for classification prediction modeling","volume":"134","author":"Speiser","year":"2019","journal-title":"Expert Syst Appl"},{"key":"2022021610435047100_ref42","doi-asserted-by":"crossref","first-page":"1309","DOI":"10.3390\/rs11111309","article-title":"Individual tree-crown detection in RGB imagery using semi-supervised deep learning neural networks","volume":"11","author":"Weinstein","year":"2019","journal-title":"Remote Sens"},{"key":"2022021610435047100_ref43","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1016\/j.tele.2017.01.007","article-title":"A knowledge-based system for breast cancer classification using fuzzy logic method","volume":"34","author":"Nilashi","year":"2017","journal-title":"Telemat Inform"},{"key":"2022021610435047100_ref44","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1016\/j.ijforecast.2018.04.003","article-title":"Predicting the failures of prediction markets: A procedure of decision making using classification models","volume":"35","author":"Tai","year":"2019","journal-title":"Int J Forecast"},{"key":"2022021610435047100_ref45","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1016\/j.artmed.2018.12.007","article-title":"Normal and pathological gait classification LSTM model","volume":"94","author":"Khokhlova","year":"2019","journal-title":"Artif Intell Med"},{"key":"2022021610435047100_ref46","doi-asserted-by":"crossref","first-page":"1042","DOI":"10.3390\/rs9101042","article-title":"Generative adversarial networks-based semi-supervised learning for hyperspectral image classification","volume":"9","author":"He","year":"2017","journal-title":"Remote Sens"},{"key":"2022021610435047100_ref47","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1016\/j.neucom.2019.02.016","article-title":"Semi-supervised and active learning through manifold reciprocal kNN graph for image retrieval","volume":"340","author":"Pedronette","year":"2019","journal-title":"Neurocomputing"},{"key":"2022021610435047100_ref48","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1016\/j.asoc.2016.11.022","article-title":"A sentiment classification model based on multiple classifiers","volume":"50","author":"Catal","year":"2017","journal-title":"Appl Soft Comput"},{"key":"2022021610435047100_ref49","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1016\/j.compeleceng.2019.04.012","article-title":"HIC-net: A deep convolutional neural network model for classification of histopathological breast images","volume":"76","author":"\u00d6zt\u00fcrk","year":"2019","journal-title":"Comput Electr Eng"},{"key":"2022021610435047100_ref50","doi-asserted-by":"crossref","first-page":"103111","DOI":"10.1016\/j.jbi.2019.103111","article-title":"A method for analyzing inpatient care variability through physicians\u2019 orders","volume":"91","author":"Lenert","year":"2019","journal-title":"J Biomed Inform"},{"key":"2022021610435047100_ref51","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1016\/j.patcog.2016.08.011","article-title":"An efficient semi-supervised representatives feature selection algorithm based on information theory","volume":"61","author":"Wang","year":"2017","journal-title":"Pattern Recognit"},{"key":"2022021610435047100_ref52","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1016\/j.patrec.2019.06.003","article-title":"Self-reinforced diffusion for graph-based semi-supervised learning","volume":"125","author":"Li","year":"2019","journal-title":"Pattern Recognit Lett"},{"key":"2022021610435047100_ref53","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1016\/j.neunet.2019.03.002","article-title":"Joint sparse graph and flexible embedding for graph-based semi-supervised learning","volume":"114","author":"Dornaika","year":"2019","journal-title":"Neural Netw"},{"key":"2022021610435047100_ref54","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1016\/j.patrec.2018.05.004","article-title":"TGLSTM: A time based graph deep learning approach to gait recognition","volume":"126","author":"Battistone","year":"2019","journal-title":"Pattern Recognit Lett"},{"key":"2022021610435047100_ref55","doi-asserted-by":"crossref","first-page":"1529","DOI":"10.1109\/TKDE.2005.186","article-title":"Tri-training: Exploiting unlabeled data using three classifiers","volume":"17","author":"Zhou","year":"2005","journal-title":"IEEE T Knowl Data En"}],"container-title":["The Computer Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/comjnl\/article-pdf\/65\/2\/177\/42537639\/bxaa006.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/comjnl\/article-pdf\/65\/2\/177\/42537639\/bxaa006.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,2,16]],"date-time":"2022-02-16T10:45:32Z","timestamp":1645008332000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/comjnl\/article\/65\/2\/177\/5808795"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,3,17]]},"references-count":55,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2020,3,17]]},"published-print":{"date-parts":[[2022,2,14]]}},"URL":"https:\/\/doi.org\/10.1093\/comjnl\/bxaa006","relation":{},"ISSN":["0010-4620","1460-2067"],"issn-type":[{"value":"0010-4620","type":"print"},{"value":"1460-2067","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,2]]},"published":{"date-parts":[[2020,3,17]]}}}