{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,13]],"date-time":"2026-01-13T07:43:43Z","timestamp":1768290223726,"version":"3.49.0"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"8","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,4,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: In this article, we show that the classification of human precursor microRNA (pre-miRNAs) hairpins from both genome pseudo hairpins and other non-coding RNAs (ncRNAs) is a common and essential requirement for both comparative and non-comparative computational recognition of human miRNA genes. However, the existing computational methods do not address this issue completely or successfully. Here we present the development of an effective classifier system (named as microPred) for this classification problem by using appropriate machine learning techniques. Our approach includes the introduction of more representative datasets, extraction of new biologically relevant features, feature selection, handling of class imbalance problem in the datasets and extensive classifier performance evaluation via systematic cross-validation methods.<\/jats:p>\n               <jats:p>Results: Our microPred classifier yielded higher and, especially, much more reliable classification results in terms of both sensitivity (90.02%) and specificity (97.28%) than the existing pre-miRNA classification methods. 
When validated with 6095 non-human animal pre-miRNAs and 139 virus pre-miRNAs from miRBase, microPred resulted in 92.71% (5651\/6095) and 94.24% (131\/139) recognition rates, respectively.<\/jats:p>\n               <jats:p>Availability: The microPred classifier, the datasets used, and the features extracted are freely available at http:\/\/web.comlab.ox.ac.uk\/people\/ManoharaRukshan.Batuwita\/microPred.htm.<\/jats:p>\n               <jats:p>Contact: \u00a0manb@comlab.ox.ac.uk; vasile.palade@comlab.ox.ac.uk<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp107","type":"journal-article","created":{"date-parts":[[2009,2,22]],"date-time":"2009-02-22T03:48:44Z","timestamp":1235274524000},"page":"989-995","source":"Crossref","is-referenced-by-count":213,"title":["<i>microPred<\/i>: effective classification of pre-miRNAs for human miRNA gene prediction"],"prefix":"10.1093","volume":"25","author":[{"given":"Rukshan","family":"Batuwita","sequence":"first","affiliation":[{"name":"Oxford University Computing Laboratory, University of Oxford, Wolfson Building, Parks Road, Oxford, OX1 3QD, UK"}]},{"given":"Vasile","family":"Palade","sequence":"additional","affiliation":[{"name":"Oxford University Computing Laboratory, University of Oxford, Wolfson Building, Parks Road, Oxford, OX1 3QD, UK"}]}],"member":"286","published-online":{"date-parts":[[2009,2,20]]},"reference":[{"key":"2023051607021421100_B1","first-page":"39","article-title":"Applying support vector machines to imbalanced datasets","volume-title":"Proc. of 15th ECML.","author":"Akbani","year":"2004"},{"key":"2023051607021421100_B2","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. 
Biol"},{"key":"2023051607021421100_B3","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1016\/S0092-8674(04)00045-5","article-title":"MicroRNAs: genomics, biogenesis, mechanism, and function","volume":"116","author":"Bartel","year":"2004","journal-title":"Cell"},{"key":"2023051607021421100_B4","doi-asserted-by":"crossref","first-page":"766","DOI":"10.1038\/ng1590","article-title":"Identification of hundreds of conserved and nonconserved human microRNAs","volume":"37","author":"Bentwich","year":"2005","journal-title":"Nat. Genet."},{"key":"2023051607021421100_B5","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/j.cell.2004.12.031","article-title":"Phylogenetic shadowing and computational identification of human microRNA genes","volume":"120","author":"Berezikov","year":"2005","journal-title":"Cell"},{"key":"2023051607021421100_B6","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1038\/ng1794","article-title":"Approaches to microRNA discovery","volume":"38","author":"Berezikov","year":"2006","journal-title":"Nat. Genet."},{"key":"2023051607021421100_B7","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1023\/A:1009715923555","article-title":"A tutorial on support vector machines for pattern recognition","volume":"2","author":"Burges","year":"1998","journal-title":"Data Min. Knowl. Discov."},{"key":"2023051607021421100_B8","author":"Chang","year":"2001","journal-title":"LIBSVM: a library for support vector machines."},{"key":"2023051607021421100_B9","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1146\/annurev.genom.8.080706.092351","article-title":"Roles of microRNAs in vertebrate physiology and human disease","volume":"8","author":"Chang","year":"2007","journal-title":"Annu. Rev. Genomics Hum. 
Genet."},{"key":"2023051607021421100_B10","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: synthetic minority over-sampling technique","volume":"16","author":"Chawla","year":"2002","journal-title":"Artif. Intell. Res."},{"key":"2023051607021421100_B11","doi-asserted-by":"crossref","first-page":"578","DOI":"10.1261\/rna.7220505","article-title":"Structural RNA has lower folding energy than random RNA of the same dinucleotide frequency","volume":"11","author":"Clote","year":"2005","journal-title":"RNA"},{"key":"2023051607021421100_B12","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1016\/j.cell.2005.06.036","article-title":"miRNAs, cancer, and stem cell division","volume":"122","author":"Croce","year":"2005","journal-title":"Cell"},{"key":"2023051607021421100_B13","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1038\/nbt1394","article-title":"Discovering microRNAs from deep sequencing data using miRDeep","volume":"26","author":"Friedlander","year":"2008","journal-title":"Nat. Biotechnol."},{"key":"2023051607021421100_B14","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1093\/nar\/gki081","article-title":"Rfam: annotating non-coding RNAs in complete genomes","volume":"33","author":"Griffiths-Jones","year":"2005","journal-title":"Nucleic Acids Res."},{"issue":"Database Issue","key":"2023051607021421100_B15","doi-asserted-by":"crossref","first-page":"D140","DOI":"10.1093\/nar\/gkj112","article-title":"miRBase: microRNA sequences, targets and gene nomenclature","volume":"34","author":"Griffiths-Jones","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023051607021421100_B16","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1146\/annurev.genom.8.080706.092419","article-title":"Annotating noncoding RNA genes","volume":"8","author":"Griffiths-Jones","year":"2007","journal-title":"Annu. Rev. Genomics Hum. 
Genet"},{"key":"2023051607021421100_B17","first-page":"1157","article-title":"An introduction to variable and feature selection","volume":"3","author":"Guyon","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"2023051607021421100_B18","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1093\/bioinformatics\/btl257","article-title":"Hairpins in a haystack: recognizing microRNA precursors in comparative genomics data","volume":"22","author":"Hertel","year":"2006","journal-title":"Bioinformatics"},{"key":"2023051607021421100_B19","doi-asserted-by":"crossref","first-page":"3429","DOI":"10.1093\/nar\/gkg599","article-title":"Vienna RNA secondary structure server","volume":"31","author":"Hofacker","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023051607021421100_B20","first-page":"264","article-title":"z-SVM: an SVM for improved classification of imbalanced data","volume-title":"Proc. of 19th AUS-AI.","author":"Imam","year":"2006"},{"key":"2023051607021421100_B21","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1093\/nar\/gkm368","article-title":"MiPred: classification of real and pseudo microRNA precursors using random forest prediction model with combined features","volume":"35","author":"Jiang","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023051607021421100_B22","doi-asserted-by":"crossref","first-page":"1667","DOI":"10.1162\/089976603321891855","article-title":"Asymptotic behaviours of support vector machines with Gaussian kernel","volume":"15","author":"Keerthi","year":"2003","journal-title":"Neural Comput."},{"key":"2023051607021421100_B23","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1016\/j.tig.2006.01.003","article-title":"Genomics of microRNA","volume":"22","author":"Kim","year":"2006","journal-title":"Trends Genet."},{"key":"2023051607021421100_B24","doi-asserted-by":"crossref","first-page":"2919","DOI":"10.1080\/01431160110107743","article-title":"The role of feature selection in artificial 
neural network applications","volume":"23","author":"Kovzoglu","year":"2002","journal-title":"Int. J. Remote Sensing"},{"key":"2023051607021421100_B25","doi-asserted-by":"crossref","first-page":"860","DOI":"10.1038\/35057062","article-title":"Initial sequencing and analysis of the human genome","volume":"409","author":"Lander","year":"2001","journal-title":"Nature"},{"key":"2023051607021421100_B26","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1093\/nar\/gkj002","article-title":"snoRNA-LBME-db, a comprehensive database of human H\/ACA and C\/D box snoRNAs","volume":"34","author":"Lestrade","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023051607021421100_B27","doi-asserted-by":"crossref","first-page":"1540","DOI":"10.1126\/science.1080372","article-title":"Vertebrate microRNA genes","volume":"299","author":"Lim","year":"2003","journal-title":"Science"},{"key":"2023051607021421100_B28","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1089\/dna.2006.0551","article-title":"Principles and limitations of computational microRNA gene and target finding","volume":"26","author":"Lindow","year":"2007","journal-title":"DNA Cell Biol."},{"key":"2023051607021421100_B29","doi-asserted-by":"crossref","first-page":"1321","DOI":"10.1093\/bioinformatics\/btm026","article-title":"De novo SVM classification of precursor microRNAs from genomic pseudo hairpins using global and intrinsic folding measures","volume":"23","author":"Loong","year":"2007","journal-title":"Bioinformatics"},{"key":"2023051607021421100_B30","doi-asserted-by":"crossref","first-page":"577","DOI":"10.1093\/nar\/gki591","article-title":"DINAMelt web server for nucleic acid melting prediction","volume":"33","author":"Markham","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023051607021421100_B31","doi-asserted-by":"crossref","first-page":"1105","DOI":"10.1002\/bip.360290621","article-title":"The equilibrium partition function and base pair binding probabilities for RNA 
secondary structures","volume":"29","author":"McCaskill","year":"1990","journal-title":"Biopolymers"},{"key":"2023051607021421100_B32","doi-asserted-by":"crossref","first-page":"1203","DOI":"10.1016\/j.cell.2006.07.031","article-title":"A pattern-based method for the identification of MicroRNA binding sites and their corresponding heteroduplexes","volume":"126","author":"Miranda","year":"2006","journal-title":"Cell"},{"key":"2023051607021421100_B33","first-page":"43","article-title":"Facing imbalance classes through aggregation of classifiers","volume-title":"Proc. of 14th ICIAP. IEEE Comp. Soc.","author":"Molinara","year":"2007"},{"key":"2023051607021421100_B34","first-page":"1","article-title":"Inverted repeats, stem-loops, and cruciforms: significance for initiation of DNA replication","volume":"63","author":"Pearson","year":"1996","journal-title":"J. Cell Bio-Chem"},{"key":"2023051607021421100_B35","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1093\/nar\/29.1.137","article-title":"RefSeq and LocusLink: NCBI gene-centered resources","volume":"29","author":"Pruitt","year":"2001","journal-title":"Nucleic Acids Res"},{"key":"2023051607021421100_B36","doi-asserted-by":"crossref","first-page":"1193","DOI":"10.1016\/j.cell.2006.10.040","article-title":"Large-scale sequencing reveals 21U-RNAs and additional microRNAs and endogenous siRNAs in C. elegans","volume":"127","author":"Ruby","year":"2006","journal-title":"Cell"},{"key":"2023051607021421100_B37","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1186\/1471-2105-6-267","article-title":"Identification of clustered microRNAs using an ab initio prediction method","volume":"6","author":"Sewer","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023051607021421100_B38","first-page":"270","article-title":"Support vector machines for predicting microRNA hairpins","volume-title":"Proc. 
of BIOCOMP.","author":"Szafranski","year":"2006"},{"key":"2023051607021421100_B39","first-page":"55","article-title":"Controlling the sensitivity of support vector machines","volume-title":"In Proc. of IJCAI. IJCAII Organization","author":"Veropoulos","year":"1999"},{"key":"2023051607021421100_B40","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1145\/1007730.1007734","article-title":"Mining with rarity: a unifying framework","volume":"6","author":"Weiss","year":"2004","journal-title":"SIGKDD Expl."},{"key":"2023051607021421100_B41","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1186\/1471-2105-6-310","article-title":"Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine","volume":"6","author":"Xue","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023051607021421100_B42","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1007\/s00018-005-5467-7","article-title":"Evidence that miRNAs are different from other RNAs","volume":"63","author":"Zhang","year":"2005","journal-title":"Cell. Mol. 
Life Sci."},{"key":"2023051607021421100_B43","doi-asserted-by":"crossref","first-page":"3406","DOI":"10.1093\/nar\/gkg595","article-title":"Mfold web server for nucleic acid folding and hybridization prediction","volume":"31","author":"Zuker","year":"2003","journal-title":"Nucleic Acids Res."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/8\/989\/50287484\/bioinformatics_25_8_989.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/8\/989\/50287484\/bioinformatics_25_8_989.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T07:05:57Z","timestamp":1684220757000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/8\/989\/324698"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,2,20]]},"references-count":43,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2009,4,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp107","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,4,15]]},"published":{"date-parts":[[2009,2,20]]}}}