{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T23:35:13Z","timestamp":1775777713285,"version":"3.50.1"},"reference-count":17,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2024,9,10]],"date-time":"2024-09-10T00:00:00Z","timestamp":1725926400000},"content-version":"vor","delay-in-days":9,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100010663","name":"European Research Council","doi-asserted-by":"publisher","award":["810296"],"award-info":[{"award-number":["810296"]}],"id":[{"id":"10.13039\/100010663","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,9,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>A fundamental step in many analyses of high-dimensional data is dimension reduction. Two basic approaches are introduction of new synthetic coordinates and selection of extant features. Advantages of the latter include interpretability, simplicity, transferability, and modularity. A common criterion for unsupervized feature selection is variance or dynamic range. However, in practice, it can occur that high-variance features are noisy, that important features have low variance, or that variances are simply not comparable across features because they are measured in unrelated numeric scales or physical units. Moreover, users may want to include measures of signal-to-noise ratio and non-redundancy into feature selection.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Here, we introduce the RNR algorithm, which selects features based on (i) the reproducibility of their signal across replicates and (ii) their non-redundancy, measured by linear dependence. It takes as input a typically large set of features measured on a collection of objects with two or more replicates per object. It returns an ordered list of features, i1,i2,\u2026,ik, where feature i1 is the one with the highest reproducibility across replicates, i2 that with the highest reproducibility across replicates after projecting out the dimension spanned by i1, and so on. Applications to microscopy-based imaging of cells and proteomics highlight benefits of the approach.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The RNR method is available via Bioconductor (Huber W, Carey VJ, Gentleman R et al. (Orchestrating high-throughput genomic analysis with bioconductor. Nat Methods 2015;12:115\u201321.) in the R package FeatSeekR. Its source code is also available at https:\/\/github.com\/tcapraz\/FeatSeekR under the GPL-3 open source license.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae548","type":"journal-article","created":{"date-parts":[[2024,9,6]],"date-time":"2024-09-06T19:19:48Z","timestamp":1725650388000},"source":"Crossref","is-referenced-by-count":2,"title":["Feature selection by replicate reproducibility and non-redundancy"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2547-067X","authenticated-orcid":false,"given":"T\u00fcmay","family":"Capraz","sequence":"first","affiliation":[{"name":"Genome Biology Unit, EMBL , Heidelberg, 69117, Germany"},{"name":"Faculty of Biosciences, University of Heidelberg , Heidelberg, 69117, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0474-2218","authenticated-orcid":false,"given":"Wolfgang","family":"Huber","sequence":"additional","affiliation":[{"name":"Genome Biology Unit, EMBL , Heidelberg, 69117, Germany"}]}],"member":"286","published-online":{"date-parts":[[2024,9,10]]},"reference":[{"key":"2024091905063189700_btae548-B1","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1038\/nature19949","article-title":"Mass-spectrometric exploration of proteome structure and function","volume":"537","author":"Aebersold","year":"2016","journal-title":"Nature"},{"key":"2024091905063189700_btae548-B2","doi-asserted-by":"crossref","first-page":"6562","DOI":"10.1073\/pnas.102102699","article-title":"Selection bias in gene extraction on the basis of microarray gene-expression data","volume":"99","author":"Ambroise","year":"2002","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2024091905063189700_btae548-B3","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1038\/s41467-017-00249-5","article-title":"Multi-laboratory assessment of reproducibility, qualitative and quantitative performance of swath-mass spectrometry","volume":"8","author":"Collins","year":"2017","journal-title":"Nat Commun"},{"key":"2024091905063189700_btae548-B4","doi-asserted-by":"crossref","first-page":"3048","DOI":"10.1016\/j.patcog.2011.12.008","article-title":"An unsupervised approach to feature discretization and selection","volume":"45","author":"Ferreira","year":"2012","journal-title":"Pattern Recognition"},{"key":"2024091905063189700_btae548-B5","doi-asserted-by":"crossref","first-page":"e05464","DOI":"10.7554\/eLife.05464","article-title":"A map of directional genetic interactions in a metazoan cell","volume":"4","author":"Fischer","year":"2015","journal-title":"Elife"},{"key":"2024091905063189700_btae548-B6","first-page":"1157","article-title":"An introduction to variable and feature selection","volume":"3","author":"Guyon","year":"2003","journal-title":"J Mach Learn Res"},{"key":"2024091905063189700_btae548-B7","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1109\/TSMC.1973.4309314","article-title":"Textural features for image classification","author":"Haralick","year":"1973","journal-title":"IEEE Trans Syst Man Cybern"},{"key":"2024091905063189700_btae548-B8","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1038\/nmeth.3252","article-title":"Orchestrating high-throughput genomic analysis with bioconductor","volume":"12","author":"Huber","year":"2015","journal-title":"Nat Methods"},{"key":"2024091905063189700_btae548-B9","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1038\/nmeth.2436","article-title":"Mapping genetic interactions in human cancer cells with rnai and multiparametric phenotyping","volume":"10","author":"Laufer","year":"2013","journal-title":"Nat Methods"},{"key":"2024091905063189700_btae548-B10","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1007\/978-1-4757-1904-8_8","volume-title":"Principal Component Analysis","author":"Jolliffe","year":"1986"},{"key":"2024091905063189700_btae548-B11","doi-asserted-by":"crossref","first-page":"e2005970","DOI":"10.1371\/journal.pbio.2005970","article-title":"Cellprofiler 3.0: next-generation image processing for biology","volume":"16","author":"McQuin","year":"2018","journal-title":"PLoS Biol"},{"key":"2024091905063189700_btae548-B12","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1177\/0013164484441003","article-title":"The measurement of classification agreement: an adjustment to the rand statistic for chance agreement","volume":"44","author":"Morey","year":"1984","journal-title":"Educ Psychol Meas"},{"key":"2024091905063189700_btae548-B13","doi-asserted-by":"crossref","first-page":"979","DOI":"10.1093\/bioinformatics\/btq046","article-title":"Ebimage\u2014an r package for image processing with applications to cellular phenotypes","volume":"26","author":"Pau","year":"2010","journal-title":"Bioinformatics"},{"key":"2024091905063189700_btae548-B14","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1038\/nbt.2841","article-title":"Openswath enables automated, targeted analysis of data-independent acquisition ms data","volume":"32","author":"R\u00f6st","year":"2014","journal-title":"Nat Biotechnol"},{"key":"2024091905063189700_btae548-B15","doi-asserted-by":"crossref","first-page":"e507","DOI":"10.1093\/bioinformatics\/btl214","article-title":"Novel unsupervised feature filtering of biological data","volume":"22","author":"Varshavsky","year":"2006","journal-title":"Bioinformatics"},{"key":"2024091905063189700_btae548-B16","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1016\/j.knosys.2014.11.008","article-title":"Unsupervised feature selection via maximum projection and minimum redundancy","volume":"75","author":"Wang","year":"2015","journal-title":"Knowl Based Syst"},{"key":"2024091905063189700_btae548-B17","doi-asserted-by":"crossref","first-page":"713","DOI":"10.1198\/jasa.2010.tm09415","article-title":"A framework for feature selection in clustering","volume":"105","author":"Witten","year":"2010","journal-title":"J Am Stat Assoc"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae548\/59074478\/btae548.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/9\/btae548\/59188996\/btae548.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/9\/btae548\/59188996\/btae548.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,19]],"date-time":"2024-09-19T01:06:49Z","timestamp":1726708009000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae548\/7754483"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9]]},"references-count":17,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2024,9,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae548","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2023.07.04.547623","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,9]]},"published":{"date-parts":[[2024,9]]},"article-number":"btae548"}}