{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:05Z","timestamp":1772138045895,"version":"3.50.1"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"20","license":[{"start":{"date-parts":[[2021,5,16]],"date-time":"2021-05-16T00:00:00Z","timestamp":1621123200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100003141","name":"Consejo Nacional de Ciencia y Tecnolog\u00eda","doi-asserted-by":"publisher","award":["2016\/277850"],"award-info":[{"award-number":["2016\/277850"]}],"id":[{"id":"10.13039\/501100003141","id-type":"DOI","asserted-by":"publisher"}]},{"name":"FORDECYT-PRONACES Ciencias de Frontera","award":["2019\/101732"],"award-info":[{"award-number":["2019\/101732"]}]},{"name":"C\u00e1tedras CONACyT program"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,10,25]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Machine learning algorithms excavate important variables from big data. However, deciding on the relevance of identified variables is challenging. The addition of artificial noise, \u2018decoy\u2019 variables, to raw data, \u2018target\u2019 variables, enables calculating a false-positive rate and a biological relevance probability for each variable rank. 
These scores allow the setting of a cut-off for informative variables, depending on the required sensitivity\/specificity of a scientific question.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We tested the function of the Target\u2013Decoy MineR (TDM) using synthetic data with different degrees of perturbation. Following, we applied the TDM to experimental Omics (metabolomics, transcriptomics and proteomics) results. The TDM graphs indicate the degree of difference between sample groups. Further, the TDM reports the contribution of each variable to correct classification, i.e. its biological relevance.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>An implementation of the algorithm in R is freely available from https:\/\/bitbucket.org\/cesaremov\/targetdecoy_mining\/. 
The Target\u2013Decoy MineR is applicable to different types of quantitative data in tabular format.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab369","type":"journal-article","created":{"date-parts":[[2021,5,11]],"date-time":"2021-05-11T15:15:20Z","timestamp":1620746120000},"page":"3595-3603","source":"Crossref","is-referenced-by-count":2,"title":["Target\u2013Decoy MineR for determining the biological relevance of variables in noisy datasets"],"prefix":"10.1093","volume":"37","author":[{"given":"Cesar\u00e9","family":"Ovando-V\u00e1zquez","sequence":"first","affiliation":[{"name":"National Supercomputing Center, Potosinan Institute for Scientific and Technological Research (IPICYT) , San Luis Potos\u00ed ZIP 78216, Mexico"}]},{"given":"Daniel","family":"C\u00e1zarez-Garc\u00eda","sequence":"additional","affiliation":[{"name":"Department of Biochemistry and Biotechnology, Center for Research and Advanced Studies (CINVESTAV) Irapuato , Irapuato-Le\u00f3n, Irapuato Gto ZIP 36824, Mexico"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6732-1958","authenticated-orcid":false,"given":"Robert","family":"Winkler","sequence":"additional","affiliation":[{"name":"Department of Biochemistry and Biotechnology, Center for Research and Advanced Studies (CINVESTAV) Irapuato , Irapuato-Le\u00f3n, Irapuato Gto ZIP 36824, Mexico"}]}],"member":"286","published-online":{"date-parts":[[2021,5,16]]},"reference":[{"key":"2023051609033568200_btab369-B1","doi-asserted-by":"crossref","first-page":"2418","DOI":"10.1093\/bioinformatics\/btv146","article-title":"Cardinal: an R package for statistical analysis of mass spectrometry-based imaging 
experiments","volume":"31","author":"Bemis","year":"2015","journal-title":"Bioinformatics"},{"key":"2023051609033568200_btab369-B2","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. Ser. B (Methodological)"},{"key":"2023051609033568200_btab369-B3","doi-asserted-by":"crossref","first-page":"3063","DOI":"10.1111\/jcmm.14219","article-title":"The latest progress on miR-374 and its functional implications in physiological and pathological processes","volume":"23","author":"Bian","year":"2019","journal-title":"J. Cell. Mol. Med"},{"key":"2023051609033568200_btab369-B4","doi-asserted-by":"crossref","first-page":"1145","DOI":"10.1016\/S0031-3203(96)00142-2","article-title":"The use of the area under the ROC curve in the evaluation of machine learning algorithms","volume":"30","author":"Bradley","year":"1997","journal-title":"Pattern Recognit"},{"key":"2023051609033568200_btab369-B5","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/BF00058655","article-title":"Bagging predictors","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach. Learn"},{"key":"2023051609033568200_btab369-B6","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. 
Learn"},{"key":"2023051609033568200_btab369-B7","volume-title":"Classification and Regression Trees","author":"Breiman","year":"1984","edition":"1st edn"},{"key":"2023051609033568200_btab369-B8","doi-asserted-by":"crossref","first-page":"885","DOI":"10.1039\/C7IB00155J","article-title":"Lipidomic profiles of Drosophila melanogaster and cactophilic fly species: models of human metabolic diseases","volume":"9","author":"C\u00e1zarez-Garc\u00eda","year":"2017","journal-title":"Integrat. Biol"},{"key":"2023051609033568200_btab369-B9","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1186\/1471-2105-7-3","article-title":"Gene selection and classification of microarray data using random forest","volume":"7","author":"D\u00edaz-Uriarte","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023051609033568200_btab369-B10","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1038\/nmeth1019","article-title":"Target\u2013decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry","volume":"4","author":"Elias","year":"2007","journal-title":"Nat. Methods"},{"key":"2023051609033568200_btab369-B11","doi-asserted-by":"crossref","first-page":"2225","DOI":"10.1016\/j.patrec.2010.03.014","article-title":"Variable selection using random forests","volume":"31","author":"Genuer","year":"2010","journal-title":"Pattern Recognit. 
Lett"},{"key":"2023051609033568200_btab369-B12","doi-asserted-by":"crossref","first-page":"2270","DOI":"10.1093\/bioinformatics\/bts447","article-title":"MALDIquant: a versatile R package for the analysis of mass spectrometry data","volume":"28","author":"Gibb","year":"2012","journal-title":"Bioinformatics"},{"key":"2023051609033568200_btab369-B13","volume-title":"Reprinted in Memorie di metodologica statistica","author":"Gini","year":"1912"},{"key":"2023051609033568200_btab369-B14","doi-asserted-by":"crossref","first-page":"1896","DOI":"10.1002\/jcp.24662","article-title":"Big data bioinformatics","volume":"229","author":"Greene","year":"2014","journal-title":"J. Cell. Physiol"},{"key":"2023051609033568200_btab369-B15","first-page":"570733","volume-title":"Front. Oncol","author":"Guo","year":"2020"},{"key":"2023051609033568200_btab369-B16","doi-asserted-by":"crossref","first-page":"1111","DOI":"10.1007\/s13361-011-0139-3","article-title":"Target\u2013decoy approach and false discovery rate: when things may go wrong","volume":"22","author":"Gupta","year":"2011","journal-title":"J. Am. Soc. Mass. Spectrom"},{"key":"2023051609033568200_btab369-B17","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1016\/j.csda.2012.09.020","article-title":"A new variable selection approach using Random Forests","volume":"60","author":"Hapfelmeier","year":"2013","journal-title":"Comput. Stat. Data Anal"},{"key":"2023051609033568200_btab369-B18","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-84858-7","volume-title":"The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 
Springer Series in Statistics","author":"Hastie","year":"2009","edition":"2nd edn"},{"key":"2023051609033568200_btab369-B19","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/s12031-019-01340-w","article-title":"Potential roles of miR-374a-5p in mediating neuroprotective effects and related molecular mechanism","volume":"69","author":"Jiang","year":"2019","journal-title":"J. Mol. Neurosci"},{"key":"2023051609033568200_btab369-B20","author":"Kassambara","year":"2017"},{"key":"2023051609033568200_btab369-B21","doi-asserted-by":"crossref","first-page":"3148","DOI":"10.1021\/acs.jproteome.5b00081","article-title":"Improved false discovery rate estimation procedure for shotgun proteomics","volume":"14","author":"Keich","year":"2015","journal-title":"J. Proteome Res"},{"key":"2023051609033568200_btab369-B22","first-page":"1137","author":"Kohavi","year":"1995"},{"key":"2023051609033568200_btab369-B23","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1186\/1471-2105-12-253","article-title":"Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems","volume":"12","author":"L\u00ea Cao","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023051609033568200_btab369-B24","doi-asserted-by":"crossref","first-page":"1127","DOI":"10.3945\/jn.111.138438","article-title":"Dietary protein and sugar differentially affect development and metabolic pools in ecologically diverse Drosophila","volume":"141","author":"Matzkin","year":"2011","journal-title":"J. 
Nutr"},{"key":"2023051609033568200_btab369-B25","volume-title":"e1071: Misc Functions of the Department of Statistics","author":"Meyer","year":"2020"},{"key":"2023051609033568200_btab369-B26","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1186\/s12864-016-2542-4","article-title":"Multivariate models from RNA-Seq SNVs yield candidate molecular targets for biomarker discovery: SNV-DA","volume":"17","author":"Paul","year":"2016","journal-title":"BMC Genomics"},{"key":"2023051609033568200_btab369-B27","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1186\/1471-2105-11-395","article-title":"MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data","volume":"11","author":"Pluskal","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023051609033568200_btab369-B28","volume-title":"R: A Language and Environment for Statistical Computing","year":"2018"},{"key":"2023051609033568200_btab369-B29","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1186\/1471-2105-12-77","article-title":"pROC: an open-source package for R and S+ to analyze and compare ROC curves","volume":"12","author":"Robin","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023051609033568200_btab369-B30","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1002\/jms.3512","article-title":"GridMass: a fast two-dimensional feature detection method for LC\/MS","volume":"50","author":"Trevi\u00f1o","year":"2015","journal-title":"J. 
Mass Spectrom"},{"key":"2023051609033568200_btab369-B31","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-21706-2","volume-title":"Modern Applied Statistics with S","author":"Venables","year":"2002","edition":"4th edn"},{"key":"2023051609033568200_btab369-B32","doi-asserted-by":"crossref","first-page":"2153","DOI":"10.1111\/pbi.13129","article-title":"Comparative proteomics combined with analyses of transgenic plants reveal ZmREM1.3 mediates maize resistance to southern corn rust","volume":"17","author":"Wang","year":"2019","journal-title":"Plant Biotechnol. J"},{"key":"2023051609033568200_btab369-B33","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4419-9890-3","volume-title":"Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery (Use R!)","author":"Williams","year":"2011","edition":"1st edn"},{"key":"2023051609033568200_btab369-B34","doi-asserted-by":"crossref","first-page":"e14011","DOI":"10.7717\/peerj.1401","article-title":"An evolving computational platform for biological mass spectrometry: workflows, statistics and data mining with MASSyPup64","volume":"3","author":"Winkler","year":"2015","journal-title":"PeerJ"},{"key":"2023051609033568200_btab369-B35","doi-asserted-by":"crossref","first-page":"1","DOI":"10.3389\/fpls.2016.00195","article-title":"Popper and the Omics","volume":"7","author":"Winkler","year":"2016","journal-title":"Front. Plant Sci"},{"key":"2023051609033568200_btab369-B36","doi-asserted-by":"crossref","first-page":"103985","DOI":"10.1016\/j.jprot.2020.103985","article-title":"ProtyQuant: comparing label-free shotgun proteomics datasets using accumulated peptide probabilities","volume":"230","author":"Winkler","year":"2021","journal-title":"J. 
Proteomics"},{"key":"2023051609033568200_btab369-B37","author":"Wright","year":"2020"},{"key":"2023051609033568200_btab369-B38","doi-asserted-by":"crossref","first-page":"102151","DOI":"10.1016\/j.isci.2021.102151","article-title":"Transcriptomic profiling of SARS-CoV-2 infected human cell lines identifies HSP90 as target for COVID-19 therapy","volume":"24","author":"Wyler","year":"2021","journal-title":"iScience"},{"key":"2023051609033568200_btab369-B39","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1016\/j.gene.2019.02.066","article-title":"MiR-155-3p acts as a tumor suppressor and reverses paclitaxel resistance via negative regulation of MYD88 in human breast cancer","volume":"700","author":"Zhang","year":"2019","journal-title":"Gene"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab369\/38709474\/btab369.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/20\/3595\/50338501\/btab369.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/20\/3595\/50338501\/btab369.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,30]],"date-time":"2024-08-30T13:29:41Z","timestamp":1725024581000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/20\/3595\/6276427"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,5,16]]},"references-count":39,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2021,10,25]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\
/btab369","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.11.09.374181","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,10,15]]},"published":{"date-parts":[[2021,5,16]]}}}