{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T04:26:55Z","timestamp":1774672015796,"version":"3.50.1"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T00:00:00Z","timestamp":1665964800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T00:00:00Z","timestamp":1665964800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61972134"],"award-info":[{"award-number":["61972134"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Young Elite Teachers in Henan Province","award":["2020GGJS050"],"award-info":[{"award-number":["2020GGJS050"]}]},{"name":"Doctor Foundation of Henan Polytechnic University","award":["B2018-36"],"award-info":[{"award-number":["B2018-36"]}]},{"name":"Innovative and Scientific Research Team of Henan Polytechnic University","award":["T2021-3"],"award-info":[{"award-number":["T2021-3"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Studies have shown that classifying cancer subtypes can provide valuable information for a range of cancer research, from aetiology and tumour biology to prognosis and personalized treatment. Current methods usually adopt gene expression data to perform cancer subtype classification. However, cancer samples are scarce, and the high-dimensional features of their gene expression data are too sparse to allow most methods to achieve desirable classification results.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>In this paper, we propose a deep learning approach by combining a convolutional neural network (CNN) and bidirectional gated recurrent unit (BiGRU): our approach, DCGN, aims to achieve nonlinear dimensionality reduction and learn features to eliminate irrelevant factors in gene expression data. Specifically, DCGN first uses the synthetic minority oversampling technique algorithm to equalize data. The CNN can handle high-dimensional data without stress and extract important local features, and the BiGRU can analyse deep features and retain their important information; the DCGN captures key features by combining both neural networks to overcome the challenges of small sample sizes and sparse, high-dimensional features. In the experiments, we compared the DCGN to seven other cancer subtype classification methods using breast and bladder cancer gene expression datasets. The experimental results show that the DCGN performs better than the other seven methods and can provide more satisfactory classification results.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s12859-022-04980-9","type":"journal-article","created":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T09:03:48Z","timestamp":1665997428000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":35,"title":["Deep learning approach for cancer subtype classification using high-dimensional gene expression data"],"prefix":"10.1186","volume":"23","author":[{"given":"Jiquan","family":"Shen","sequence":"first","affiliation":[]},{"given":"Jiawei","family":"Shi","sequence":"additional","affiliation":[]},{"given":"Junwei","family":"Luo","sequence":"additional","affiliation":[]},{"given":"Haixia","family":"Zhai","sequence":"additional","affiliation":[]},{"given":"Xiaoyan","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Zhengjiang","family":"Wu","sequence":"additional","affiliation":[]},{"given":"Chaokun","family":"Yan","sequence":"additional","affiliation":[]},{"given":"Huimin","family":"Luo","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,10,17]]},"reference":[{"issue":"5","key":"4980_CR1","doi-asserted-by":"publisher","first-page":"646","DOI":"10.1016\/j.cell.2011.02.013","volume":"144","author":"D Hanahan","year":"2011","unstructured":"Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011;144(5):646\u201374.","journal-title":"Cell"},{"issue":"9","key":"4980_CR2","first-page":"e69","volume":"45","author":"Y Sun","year":"2017","unstructured":"Sun Y, Yao J, Yang L, Chen R, Nowak NJ, Goodison S. Computational approach for deriving cancer progression roadmaps from static sample data. Nucleic Acids Res. 2017;45(9):e69.","journal-title":"Nucleic Acids Res"},{"issue":"7403","key":"4980_CR3","doi-asserted-by":"publisher","first-page":"346","DOI":"10.1038\/nature10983","volume":"486","author":"C Curtis","year":"2012","unstructured":"Curtis C, Shah SP, Chin S-F, Turashvili G, Rueda OM, Dunning MJ, Speed D, Lynch AG, Samarajiwa S, Yuan Y, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature. 2012;486(7403):346\u201352.","journal-title":"Nature"},{"issue":"8","key":"4980_CR4","doi-asserted-by":"publisher","first-page":"1160","DOI":"10.1200\/JCO.2008.18.1370","volume":"27","author":"JS Parker","year":"2009","unstructured":"Parker JS, Mullins M, Cheang MC, Leung S, Voduc D, Vickery T, Davies S, Fauron C, He X, Hu Z, et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol. 2009;27(8):1160\u20137.","journal-title":"J Clin Oncol"},{"issue":"5","key":"4980_CR5","doi-asserted-by":"publisher","first-page":"1476","DOI":"10.1093\/bioinformatics\/btz769","volume":"36","author":"R Chen","year":"2019","unstructured":"Chen R, Yang L, Goodison S, et al. Deep learning approach to identifying cancer subtypes using high-dimensional genomic data. Bioinformatics. 2019;36(5):1476\u201383.","journal-title":"Bioinformatics"},{"issue":"1","key":"4980_CR6","doi-asserted-by":"publisher","first-page":"104","DOI":"10.1186\/s13073-017-0493-2","volume":"9","author":"KP Soh","year":"2017","unstructured":"Soh KP, Szczurek E, Sakoparnig T, et al. Predicting cancer type from tumour DNA signatures. Genome Med. 2017;9(1):104.","journal-title":"Genome Med"},{"issue":"3","key":"4980_CR7","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1007\/BF00994018","volume":"20","author":"C Cortes","year":"1995","unstructured":"Cortes C, Vapnik V. Support-Vector Networks. Mach Learn. 1995;20(3):273\u201397.","journal-title":"Mach Learn"},{"issue":"03","key":"4980_CR8","first-page":"10","volume":"48","author":"MQ Ye","year":"2018","unstructured":"Ye MQ, Gao LY, Wan CHY. Gene expression data classification based on artificial bee colony and SVM. J Shandong Univ (Engineering Edition). 2018;48(03):10\u20136.","journal-title":"J Shandong Univ (Engineering Edition)"},{"issue":"3","key":"4980_CR9","doi-asserted-by":"publisher","first-page":"6915","DOI":"10.4249\/scholarpedia.6915","volume":"5","author":"D Karaboga","year":"2010","unstructured":"Karaboga D. Artificial bee colony algorithm. Scholarpedia. 2010;5(3):6915.","journal-title":"Scholarpedia"},{"issue":"3","key":"4980_CR10","first-page":"7","volume":"10","author":"H Duan","year":"2021","unstructured":"Duan H, Huang JS, Zhang SH. Study of cancer subtype classification model based on gene expression profile. Math Model Appl. 2021;10(3):7.","journal-title":"Math Model Appl"},{"issue":"9","key":"4980_CR11","first-page":"20","volume":"324","author":"G Yang","year":"2018","unstructured":"Yang G, Shang X, Li Z. Identification of cancer subtypes by integrating multiple types of transcriptomics data with deep learning in breast cancer. Neurocomputing. 2018;324(9):20\u201330.","journal-title":"Neurocomputing"},{"key":"4980_CR12","doi-asserted-by":"crossref","unstructured":"Bengio Y, Lamblin P, Popovici D, et al. Greedy layer-wise training of deep networks[C]\/\/ Advances in Neural Information Processing Systems 19, In: Proceedings of the twentieth annual conference on neural information processing systems, Vancouver, British Columbia, Canada, 2006. DBLP, 2007.","DOI":"10.7551\/mitpress\/7503.003.0024"},{"key":"4980_CR13","doi-asserted-by":"publisher","DOI":"10.27389\/d.cnki.gxadu.2019.002388","author":"Z Liang","year":"2019","unstructured":"Liang Z. Classification of gene expression data based on Boosting. Xi\u2019an Univ Electron Sci Technol. 2019. https:\/\/doi.org\/10.27389\/d.cnki.gxadu.2019.002388.","journal-title":"Xi'an Univ Electron Sci Technol"},{"key":"4980_CR14","doi-asserted-by":"publisher","DOI":"10.27307\/d.cnki.gsjtu.2020.000051","author":"Y Xiao","year":"2020","unstructured":"Xiao Y. Research on cancer diagnosis based on deep learning of gene expression data. Shanghai Jiaotong Univ. 2020. https:\/\/doi.org\/10.27307\/d.cnki.gsjtu.2020.000051.","journal-title":"Shanghai Jiaotong Univ"},{"key":"4980_CR15","doi-asserted-by":"publisher","first-page":"1122536","DOI":"10.1155\/2022\/1122536","volume":"2022","author":"S Majumder","year":"2022","unstructured":"Majumder S, et al. Performance analysis of deep learning models for binary classification of cancer gene expression data. J Healthc Eng. 2022;2022:1122536\u20131122536.","journal-title":"J Healthc Eng"},{"key":"4980_CR16","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1613\/jair.953","volume":"16","author":"NV Chawla","year":"2002","unstructured":"Chawla NV, et al. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. 2002;16:321\u201357.","journal-title":"J Artif Intell Res"},{"key":"4980_CR17","doi-asserted-by":"publisher","first-page":"541","DOI":"10.1162\/neco.1989.1.4.541","volume":"1","author":"Y Lecun","year":"1989","unstructured":"Lecun Y, Boser B, Denker JS, et al. Backpropagation applied to handwritten zip code. Neural Comput. 1989;1:541\u201351.","journal-title":"Neural Comput"},{"key":"4980_CR18","doi-asserted-by":"crossref","unstructured":"Cho K, Merrienboer BV, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. Comput Sci. 2014;1406.1078.","DOI":"10.3115\/v1\/D14-1179"},{"key":"4980_CR19","unstructured":"Zeiler MD, Fergus R. Visualizing and understanding convolutional networks. CoRR, 2013, abs\/1311.2901"},{"key":"4980_CR20","unstructured":"Chung J, Gulcehre C, Cho KH, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling. Eprint Arxiv, 2014."},{"key":"4980_CR21","doi-asserted-by":"publisher","unstructured":"Yi\u011fit G, Amasyali MF, Simple but effective GRU variants. In: 2021 international conference on INnovations in intelligent SysTems and applications (INISTA), 2021, pp. 1\u20136. https:\/\/doi.org\/10.1109\/INISTA52262.2021.9548535","DOI":"10.1109\/INISTA52262.2021.9548535"},{"key":"4980_CR22","unstructured":"Hendrycks D, Gimpel K. Gaussian error linear units (GELUs). 2016."},{"key":"4980_CR23","doi-asserted-by":"publisher","first-page":"152","DOI":"10.1016\/j.ccr.2014.01.009","volume":"25","author":"W Choi","year":"2014","unstructured":"Choi W, Porten S, Kim S, et al. Identification of distinct basal and luminal subtypes of muscle-invasive bladder cancer with different sensitivities to frontline chemotherapy. Cancer Cell. 2014;25:152\u201365.","journal-title":"Cancer Cell"},{"key":"4980_CR24","doi-asserted-by":"publisher","first-page":"540","DOI":"10.1016\/j.cell.2017.09.007","volume":"171","author":"AG Robertson","year":"2017","unstructured":"Robertson AG, Kim J, Al-Ahmadie H, et al. Comprehensive molecular characterization of muscle-invasive bladder cancer. Cell. 2017;171:540-56.e25.","journal-title":"Cell"},{"key":"4980_CR25","doi-asserted-by":"publisher","first-page":"244ra91","DOI":"10.1126\/scitranslmed.3008970","volume":"6","author":"S Rebouissou","year":"2014","unstructured":"Rebouissou S, Bernard-Pierrot I, de Reyni\u00e8s A, et al. EGFR as a potential therapeutic target for a subset of muscle-invasive bladder cancers presenting a basal-like phenotype. Sci Transl Med. 2014;6:244ra91.","journal-title":"Sci Transl Med"},{"key":"4980_CR26","doi-asserted-by":"publisher","first-page":"3737","DOI":"10.1038\/s41598-018-22126-x","volume":"8","author":"N Marzouka","year":"2018","unstructured":"Marzouka N, Eriksson P, Rovira C, Liedberg F, Sj\u00f6dahl G, H\u00f6glund M. A validation and extended description of the Lund taxonomy for urothelial carcinoma using the TCGA cohort. Sci Rep. 2018;8:3737.","journal-title":"Sci Rep"},{"key":"4980_CR27","unstructured":"Kamoun A, De Reyni\u00e8s A, Allory Y, et al. A consensus molecular classification of muscle-invasive bladder cancer. Social Science Electronic Publishing."},{"key":"4980_CR28","unstructured":"Kingma DP, Ba J. Adam: a method for stochastic optimization. In: International conference on learning representations, 2014. pp. 1\u201313."},{"key":"4980_CR29","doi-asserted-by":"crossref","unstructured":"Zhou ZH, Feng J. Deep forest: towards an alternative to deep neural networks. 2017.","DOI":"10.24963\/ijcai.2017\/497"},{"issue":"5","key":"4980_CR30","doi-asserted-by":"publisher","first-page":"1189","DOI":"10.1214\/aos\/1013203451","volume":"29","author":"JH Friedman","year":"2001","unstructured":"Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat. 2001;29(5):1189\u2013232.","journal-title":"Ann Stat"},{"key":"4980_CR31","unstructured":"Qi M. LightGBM: a highly efficient gradient boosting decision tree[C]\/\/ Neural Information Processing Systems. Curran Associates Inc. 2017."},{"key":"4980_CR32","doi-asserted-by":"publisher","first-page":"250","DOI":"10.1016\/j.ins.2016.01.033","volume":"340","author":"X Deng","year":"2016","unstructured":"Deng X, Liu Q, Deng Y, et al. An improved method to construct basic probability assignment based on the confusion matrix for classification problem. Inf Sci. 2016;340:250\u201361.","journal-title":"Inf Sci"},{"key":"4980_CR33","first-page":"62","volume":"27","author":"TA Wan","year":"2015","unstructured":"Wan TA, Jun HU, et al. Kappa coefficient: a popular measure of rater agreement. Shanghai Arch Psychiatry. 2015;27:62.","journal-title":"Shanghai Arch Psychiatry"},{"key":"4980_CR34","volume-title":"Hamming distance","author":"R Sanchez-Reillo","year":"2009","unstructured":"Sanchez-Reillo R, Tamer S, Lu G, et al. Hamming distance. US: Springer; 2009."},{"issue":"1","key":"4980_CR35","doi-asserted-by":"publisher","first-page":"269","DOI":"10.1214\/12-AOAS578","volume":"7","author":"R Shen","year":"2013","unstructured":"Shen R, Wang S, Mo Q. Sparse integrative clustering of multiple omics data sets. Ann Appl Stat. 2013;7(1):269\u201394.","journal-title":"Ann Appl Stat"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-022-04980-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-022-04980-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-022-04980-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,28]],"date-time":"2023-11-28T23:53:17Z","timestamp":1701215597000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-022-04980-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,17]]},"references-count":35,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["4980"],"URL":"https:\/\/doi.org\/10.1186\/s12859-022-04980-9","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,17]]},"assertion":[{"value":"28 May 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 October 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 October 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"430"}}