{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,15]],"date-time":"2026-01-15T03:14:51Z","timestamp":1768446891170,"version":"3.49.0"},"reference-count":30,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2019,9,13]],"date-time":"2019-09-13T00:00:00Z","timestamp":1568332800000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2019,9,13]],"date-time":"2019-09-13T00:00:00Z","timestamp":1568332800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61403288"],"award-info":[{"award-number":["61403288"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["71871174"],"award-info":[{"award-number":["71871174"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Natural Science Foundation of Hubei Province, China","award":["2019cfb589"],"award-info":[{"award-number":["2019cfb589"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2019,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n              <jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Long-chain non-coding RNA (lncRNA) is closely related to many biological activities. Since its sequence structure is similar to that of messenger RNA (mRNA), it is difficult to distinguish between the two based only on sequence biometrics. Therefore, it is particularly important to construct a model that can effectively identify lncRNA and mRNA.<\/jats:p>\n              <\/jats:sec>\n              <jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>First, the difference in the k-mer frequency distribution between lncRNA and mRNA sequences is considered in this paper, and they are transformed into the k-mer frequency matrix. Moreover, k-mers with more species are screened by relative entropy. The classification model of the lncRNA and mRNA sequences is then proposed by inputting the k-mer frequency matrix and training the convolutional neural network. Finally, the optimal k-mer combination of the classification model is determined and compared with other machine learning methods in humans, mice and chickens. The results indicate that the proposed model has the highest classification accuracy. Furthermore, the recognition ability of this model is verified to a single sequence.<\/jats:p>\n              <\/jats:sec>\n              <jats:sec>\n                <jats:title>Conclusion<\/jats:title>\n                <jats:p>We established a classification model for lncRNA and mRNA based on k-mers and the convolutional neural network. The classification accuracy of the model with 1-mers, 2-mers and 3-mers was the highest, with an accuracy of 0.9872 in humans, 0.8797 in mice and 0.9963 in chickens, which is better than those of the random forest, logistic regression, decision tree and support vector machine.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-019-3039-3","type":"journal-article","created":{"date-parts":[[2019,9,13]],"date-time":"2019-09-13T04:39:46Z","timestamp":1568349586000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":45,"title":["A classification model for lncRNA and mRNA based on k-mers and a convolutional neural network"],"prefix":"10.1186","volume":"20","author":[{"given":"Jianghui","family":"Wen","sequence":"first","affiliation":[]},{"given":"Yeshu","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Yu","family":"Shi","sequence":"additional","affiliation":[]},{"given":"Haoran","family":"Huang","sequence":"additional","affiliation":[]},{"given":"Bing","family":"Deng","sequence":"additional","affiliation":[]},{"given":"Xinping","family":"Xiao","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2019,9,13]]},"reference":[{"key":"3039_CR1","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1038\/nature11233","volume":"489","author":"S Djebali","year":"2012","unstructured":"Djebali S, Davis CA, Merkel A, et al. Landscape of transcription in human cells. Nature. 2012;489:101\u20138.","journal-title":"Nature."},{"issue":"8","key":"3039_CR2","first-page":"57","volume":"45","author":"V Wucher","year":"2017","unstructured":"Wucher V, Legeai F, H\u00e9dan B, et al. FEELnc: a tool for long non-coding RNA annotation and its application to the dog transcriptome. Nucleic Acids Res. 2017;45(8):57\u201368.","journal-title":"Nucleic Acids Res"},{"key":"3039_CR3","first-page":"1","volume":"2016","author":"SY Han","year":"2016","unstructured":"Han SY, Liang YC, Li Y, et al. Long noncoding RNA identification: comparing machine learning based tools for long noncoding transcripts discrimination. Biomed Res Int. 2016;2016:1\u201314.","journal-title":"Biomed Res Int"},{"issue":"3","key":"3039_CR4","first-page":"433","volume":"37","author":"WS Li","year":"2017","unstructured":"Li WS, Xiao XW, Su H, et al. The research progress of LncRNA. J Gannan Med Univ. 2017;37(3):433\u20137.","journal-title":"J Gannan Med Univ"},{"issue":"10","key":"3039_CR5","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1100\/tsw.2010.7","volume":"8","author":"DP Caley","year":"2010","unstructured":"Caley DP, Pink RC, Truillano D. Long non-coding RNAs, chromatin and development. Sci World J. 2010;8(10):90\u2013102.","journal-title":"Sci World J"},{"issue":"5908","key":"3039_CR6","doi-asserted-by":"publisher","first-page":"1717","DOI":"10.1126\/science.1163802","volume":"322","author":"T Nagano","year":"2008","unstructured":"Nagano T, Mitchell JA, Sanz LA, et al. The air noncoding RNA epigenetically silences transcription by targeting G9a to chromatin. Science. 2008;322(5908):1717\u201320.","journal-title":"Science."},{"issue":"7200","key":"3039_CR7","doi-asserted-by":"publisher","first-page":"126","DOI":"10.1038\/nature06992","volume":"454","author":"X Wang","year":"2008","unstructured":"Wang X, Arai S, Song X, et al. Induced ncRNAs allosterically modify RNA-binding proteins in cis to inhibit transcription. Nature. 2008;454(7200):126\u201330.","journal-title":"Nature."},{"issue":"6","key":"3039_CR8","doi-asserted-by":"publisher","first-page":"354","DOI":"10.1016\/j.tcb.2011.04.001","volume":"21","author":"O Wapinski","year":"2011","unstructured":"Wapinski O, Chang HY. Corrigendum: long noncoding RNAs and human disease. Trends Cell Biol. 2011;21(6):354\u201361.","journal-title":"Trends Cell Biol"},{"key":"3039_CR9","doi-asserted-by":"publisher","first-page":"345","DOI":"10.1093\/nar\/gkm391","volume":"35","author":"L Kong","year":"2007","unstructured":"Kong L, Zhang Y, Ye ZQ, et al. CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res. 2007;35:345\u20139.","journal-title":"Nucleic Acids Res"},{"issue":"17","key":"3039_CR10","doi-asserted-by":"publisher","first-page":"166","DOI":"10.1093\/nar\/gkt646","volume":"41","author":"L Sun","year":"2013","unstructured":"Sun L, Luo H, Bu D, et al. Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts. Nucleic Acids Res. 2013;41(17):166\u201373.","journal-title":"Nucleic Acids Res"},{"key":"3039_CR11","volume-title":"Multi-feature based long non-coding RNA recognition method","author":"HX Dang","year":"2013","unstructured":"Dang HX. Multi-feature based long non-coding RNA recognition method. Xian: Xidian University; 2013."},{"issue":"4","key":"3039_CR12","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1016\/j.molcel.2007.12.013","volume":"29","author":"PD Mariner","year":"2008","unstructured":"Mariner PD, Walters RD, Espinoza CA, et al. Human Alu RNA is a modular transacting repressor of mRNA transcription during heat shock. Mol Cell. 2008;29(4):499\u2013509.","journal-title":"Mol Cell"},{"issue":"13","key":"3039_CR13","doi-asserted-by":"publisher","first-page":"275","DOI":"10.1093\/bioinformatics\/btr209","volume":"27","author":"MF Lin","year":"2011","unstructured":"Lin MF, Jungreis I, Kellis M. PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions. Bioinformatics. 2011;27(13):275\u201382.","journal-title":"Bioinformatics."},{"issue":"11","key":"3039_CR14","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1093\/nar\/gku325","volume":"42","author":"S Lertampaiporn","year":"2014","unstructured":"Lertampaiporn S, Thammarongtham C, Nukoolkit C, et al. Identification of non-coding RNAs with a new composite feature in the hybrid random forest ensemble algorithm. Nucleic Acids Res. 2014;42(11):93\u2013104.","journal-title":"Nucleic Acids Res"},{"key":"3039_CR15","volume-title":"Identification of long non-coding RNA and mRNA based on maximum entropy and k-mer","author":"M Wei","year":"2015","unstructured":"Wei M. Identification of long non-coding RNA and mRNA based on maximum entropy and k-mer. Xian: Xidian University; 2015."},{"issue":"12","key":"3039_CR16","doi-asserted-by":"publisher","first-page":"113","DOI":"10.3390\/genes7120113","volume":"7","author":"A Qaisar","year":"2016","unstructured":"Qaisar A, Syed R, Azizuddin B, et al. A review of computational methods for finding non-coding rna genes. Genes. 2016;7(12):113.","journal-title":"Genes."},{"key":"3039_CR17","doi-asserted-by":"publisher","first-page":"105620","DOI":"10.1016\/j.asoc.2019.105620","volume":"83","author":"H Li","year":"2019","unstructured":"Li H, Wang Y, Xu X, et al. Short-term passenger flow prediction under passenger flow control using a dynamic radial basis function network. Appl Soft Comput. 2019;83:105620.","journal-title":"Appl Soft Comput"},{"issue":"36","key":"3039_CR18","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1016\/j.inffus.2016.11.015","volume":"7","author":"Y Chen","year":"2017","unstructured":"Chen Y, Wang L, Li F, et al. Air quality data clustering using EPLS method. Information Fusion. 2017;7(36):225\u201332.","journal-title":"Information Fusion"},{"issue":"12","key":"3039_CR19","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1093\/bioinformatics\/btw255","volume":"32","author":"H Zeng","year":"2016","unstructured":"Zeng H, Edwards MD, Liu G, et al. Convolutional neural network architectures for predicting DNA-protein binding. Bioinformatics. 2016;32(12):121\u20137.","journal-title":"Bioinformatics."},{"issue":"8","key":"3039_CR20","doi-asserted-by":"publisher","first-page":"831","DOI":"10.1038\/nbt.3300","volume":"33","author":"B Alipanahi","year":"2015","unstructured":"Alipanahi B, Delong A, Weirauch MT, et al. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat Biotechnol. 2015;33(8):831\u20138.","journal-title":"Nat Biotechnol"},{"key":"3039_CR21","doi-asserted-by":"crossref","unstructured":"Zhang Q, Zhu L, Huang DS. High-order convolutional neural network architecture for predicting DNA-protein binding sites. IEEE\/ACM Trans Comput Biol Bioinform. 2019;16(4):1184\u201392.","DOI":"10.1109\/TCBB.2018.2819660"},{"key":"3039_CR22","doi-asserted-by":"publisher","unstructured":"Zhang Q, Zhu L, Bao WZ, et al. Weakly-supervised convolutional neural network architecture for predicting protein-DNA binding. IEEE\/ACM Trans Comput Biol Bioinform. 2018:1\u20131. Online.\u00a0\n                    https:\/\/doi.org\/10.1109\/TCBB.2018.2864203\n                    \n                  .","DOI":"10.1109\/TCBB.2018.2864203"},{"issue":"1","key":"3039_CR23","doi-asserted-by":"publisher","first-page":"3217","DOI":"10.1038\/s41598-017-03554-7","volume":"7","author":"Q Zhang","year":"2017","unstructured":"Zhang Q, Zhu L, Huang DS. WSMD: weakly-supervised motif discovery in transcription factor ChIP-seq data. Sci Rep. 2017;7(1):3217.","journal-title":"Sci Rep"},{"issue":"1","key":"3039_CR24","doi-asserted-by":"publisher","first-page":"80","DOI":"10.1186\/s13059-018-1459-4","volume":"19","author":"GH Chuai","year":"2018","unstructured":"Chuai GH, Ma HH, Yan JF, et al. DeepCRISPR: optimized CRISPR guide RNA design by deep learning. Genome Biol. 2018;19(1):80.","journal-title":"Genome Biol"},{"issue":"14","key":"3039_CR25","doi-asserted-by":"publisher","first-page":"23775","DOI":"10.18632\/oncotarget.15864","volume":"8","author":"L Gasri-Plotnitsky","year":"2017","unstructured":"Gasri-Plotnitsky L, Ovadia A, Shamalov K, et al. A novel lncRNA, GASL1, inhibits cell proliferation and restricts E2F1 activity. Oncotarget. 2017;8(14):23775\u201386.","journal-title":"Oncotarget."},{"key":"3039_CR26","first-page":"63","volume":"1","author":"KC Chou","year":"2009","unstructured":"Chou KC, Shen HB. Recent advances in developing web-servers for predicting protein attributes. Nat Sci. 2009;1:63\u201392.","journal-title":"Nat Sci"},{"key":"3039_CR27","doi-asserted-by":"publisher","first-page":"218","DOI":"10.2174\/1573406411666141229162834","volume":"11","author":"KC Chou","year":"2015","unstructured":"Chou KC. Impacts of bioinformatics to medicinal chemistry. Med Chem. 2015;11:218\u201334.","journal-title":"Med Chem"},{"key":"3039_CR28","doi-asserted-by":"publisher","first-page":"2337","DOI":"10.2174\/1568026617666170414145508","volume":"17","author":"KC Chou","year":"2017","unstructured":"Chou KC. An unprecedented revolution in medicinal chemistry driven by the progress of biological science. Curr Top Med Chem. 2017;17:2337\u201358.","journal-title":"Curr Top Med Chem"},{"key":"3039_CR29","volume-title":"Biological classification based on k-mer frequency statistics","author":"X Chen","year":"2011","unstructured":"Chen X. Biological classification based on k-mer frequency statistics. Changchun: Jilin University; 2011."},{"key":"3039_CR30","first-page":"18","volume-title":"Statistics learning method","author":"H Li","year":"2012","unstructured":"Li H. Statistics learning method. Beijing: Peking University impress; 2012. p. 18\u20139."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-019-3039-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12859-019-3039-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-019-3039-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,9,11]],"date-time":"2020-09-11T23:08:14Z","timestamp":1599865694000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-019-3039-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,9,13]]},"references-count":30,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2019,12]]}},"alternative-id":["3039"],"URL":"https:\/\/doi.org\/10.1186\/s12859-019-3039-3","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,9,13]]},"assertion":[{"value":"30 December 2018","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 August 2019","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 September 2019","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"469"}}