{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,4]],"date-time":"2026-05-04T11:28:36Z","timestamp":1777894116704,"version":"3.51.4"},"reference-count":45,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2022,1,24]],"date-time":"2022-01-24T00:00:00Z","timestamp":1642982400000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["390740016"],"award-info":[{"award-number":["390740016"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"Fundacao para a Ciencia e a Tecnologia","doi-asserted-by":"crossref","award":["CEECINST\/00102\/2018"],"award-info":[{"award-number":["CEECINST\/00102\/2018"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001871","name":"Fundacao para a Ciencia e a Tecnologia","doi-asserted-by":"crossref","award":["PTDC\/CCI-BIO\/4180\/2020"],"award-info":[{"award-number":["PTDC\/CCI-BIO\/4180\/2020"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001871","name":"Fundacao para a Ciencia e a Tecnologia","doi-asserted-by":"crossref","award":["UIDB\/00297\/2020"],"award-info":[{"award-number":["UIDB\/00297\/2020"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001871","name":"Fundacao para a Ciencia e a Tecnologia","doi-asserted-by":"crossref","award":["UIDB\/04516\/2020"],"award-info":[{"award-number":["UIDB\/04516\/2020"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001871","name":"Fundacao para a Ciencia e a 
Tecnologia","doi-asserted-by":"crossref","award":["UIDB\/50021\/2020"],"award-info":[{"award-number":["UIDB\/50021\/2020"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001871","name":"Fundacao para a Ciencia e a Tecnologia","doi-asserted-by":"crossref","award":["UIDB\/50022\/2020"],"award-info":[{"award-number":["UIDB\/50022\/2020"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"crossref"}]},{"name":"European Union's Horizon 2020 research and innovation programme","award":["951970"],"award-info":[{"award-number":["951970"]}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Stat Methods Med Res"],"published-print":{"date-parts":[[2022,5]]},"abstract":"<jats:p> The extraction of novel information from omics data is a challenging task, in particular, since the number of features (e.g. genes) often far exceeds the number of samples. In such a setting, conventional parameter estimation leads to ill-posed optimization problems, and regularization may be required. In addition, outliers can largely impact classification accuracy. <\/jats:p><jats:p> Here we introduce ROSIE, an ensemble classification approach, which combines three sparse and robust classification methods for outlier detection and feature selection and further performs a bootstrap-based validity check. Outliers of ROSIE are determined by the rank product test using outlier rankings of all three methods, and important features are selected as features commonly selected by all methods. <\/jats:p><jats:p> We apply ROSIE to RNA-Seq data from The Cancer Genome Atlas (TCGA) to classify observations into Triple-Negative Breast Cancer (TNBC) and non-TNBC tissue samples. The pre-processed dataset consists of [Formula: see text] genes and more than [Formula: see text] samples. We demonstrate that ROSIE selects important features and outliers in a robust way. 
Identified outliers are concordant with the distribution of the commonly selected genes by the three methods, and results are in line with other independent studies. Furthermore, we discuss the association of some of the selected genes with the TNBC subtype in other investigations. In summary, ROSIE constitutes a robust and sparse procedure to identify outliers and important genes through binary classification. Our approach is ad hoc applicable to other datasets, fulfilling the overall goal of simultaneously identifying outliers and candidate disease biomarkers to be targeted in therapy research and personalized medicine frameworks. <\/jats:p>","DOI":"10.1177\/09622802211072456","type":"journal-article","created":{"date-parts":[[2022,1,24]],"date-time":"2022-01-24T16:44:12Z","timestamp":1643042652000},"page":"947-958","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":5,"title":["ROSIE: RObust Sparse ensemble for outlIEr detection and gene selection in cancer omics data"],"prefix":"10.1177","volume":"31","author":[{"given":"Antje","family":"Jensch","sequence":"first","affiliation":[{"name":"Institute for Systems Theory and Automatic Control, University of Stuttgart, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4135-1857","authenticated-orcid":false,"given":"Marta B.","family":"Lopes","sequence":"additional","affiliation":[{"name":"Center for Mathematics and Applications (CMA), NOVA School of Science and Technology, Caparica, Portugal"},{"name":"NOVA Laboratory for Computer Science and Informatics (NOVA LINCS), NOVA School of Science and Technology, Caparica, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1954-5487","authenticated-orcid":false,"given":"Susana","family":"Vinga","sequence":"additional","affiliation":[{"name":"INESC-ID, Instituto Superior T\u00e9cnico, Universidade de Lisboa, Portugal"},{"name":"IDMEC, Instituto Superior T\u00e9cnico, Universidade de Lisboa, 
Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5145-0058","authenticated-orcid":false,"given":"Nicole","family":"Radde","sequence":"additional","affiliation":[{"name":"Institute for Systems Theory and Automatic Control, University of Stuttgart, Germany"}]}],"member":"179","published-online":{"date-parts":[[2022,1,24]]},"reference":[{"key":"bibr1-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1093\/toxsci\/kft094"},{"key":"bibr2-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1038\/tpj.2017.17"},{"key":"bibr3-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1016\/j.biotechadv.2007.11.002"},{"key":"bibr4-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1016\/S2225-4110(16)30053-0"},{"key":"bibr5-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1186\/s12931-017-0631-9"},{"key":"bibr6-09622802211072456","doi-asserted-by":"publisher","DOI":"10.7554\/eLife.33105"},{"key":"bibr7-09622802211072456","doi-asserted-by":"publisher","DOI":"10.3389\/fonc.2020.00423"},{"key":"bibr8-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-019-13983-9"},{"key":"bibr9-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1111\/j.1469-1809.1936.tb02137.x"},{"key":"bibr10-09622802211072456","volume-title":"Discriminant Analysis and Statistical Pattern Recognition","author":"McLachlan 
G","year":"2004"},{"key":"bibr11-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1093\/aje\/kwt245"},{"key":"bibr12-09622802211072456","doi-asserted-by":"publisher","DOI":"10.2202\/1544-6115.1175"},{"key":"bibr13-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1023\/A:1012487302797"},{"key":"bibr14-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1214\/07-AOS515"},{"key":"bibr15-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-018-2149-7"},{"key":"bibr16-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-020-03653-9"},{"key":"bibr17-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1002\/cem.2775"},{"key":"bibr18-09622802211072456","doi-asserted-by":"publisher","DOI":"10.2139\/ssrn.2619056"},{"key":"bibr19-09622802211072456","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v072.i05"},{"key":"bibr20-09622802211072456","unstructured":"Kondo Y. RSKC: Robust Sparse K-Means, 2016. https:\/\/CRAN.R-project.org\/package=RSKC. R package version 2.4.2."},{"key":"bibr21-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1016\/j.chemolab.2017.11.017"},{"key":"bibr22-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1016\/j.chemolab.2017.11.017"},{"key":"bibr23-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1007\/s10549-010-1293-1"},{"key":"bibr24-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1016\/j.febslet.2004.07.055"},{"key":"bibr25-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-014-0367-1"},{"key":"bibr26-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1111\/1467-9868.00346"},{"key":"bibr27-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1530509100"},{"key":"bibr28-09622802211072456","unstructured":"Research Network TCGA. The Cancer Genome Atlas. https:\/\/cancergenome.nih.gov\/. 
Accessed December 2019."},{"key":"bibr29-09622802211072456","doi-asserted-by":"publisher","DOI":"10.5114\/wo.2014.47136"},{"key":"bibr30-09622802211072456","unstructured":"Ver\u00edssimo A. brca.data: BRCA gene expression and clinical data from TCGA (with import script), 2019. https:\/\/github.com\/averissimo\/brca.data\/releases\/download\/1.0\/brca.data\u02d91.0.tar.gz. R package version 1.0."},{"key":"bibr31-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkz966"},{"key":"bibr32-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1101\/gr.080531.108"},{"key":"bibr33-09622802211072456","unstructured":"from Jed Wing MKC, Weston S, Williams A et al. caret: Classification and Regression Training, 2019. https:\/\/CRAN.R-project.org\/package=caret. R package version 6.0-84."},{"key":"bibr34-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1177\/0962280218794722"},{"key":"bibr35-09622802211072456","first-page":"00","volume":"5","author":"Chen YT","year":"2005","journal-title":"Cancer 
Immun"},{"key":"bibr36-09622802211072456","doi-asserted-by":"publisher","DOI":"10.2147\/CMAR.S187151"},{"key":"bibr37-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btp616"},{"key":"bibr38-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1007\/s12282-019-00988-x"},{"key":"bibr39-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1158\/2159-8290.CD-14-1092"},{"key":"bibr40-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1006\/bbrc.1998.9440"},{"key":"bibr41-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1038\/sj.bjc.6600740"},{"key":"bibr42-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-020-14533-4"},{"key":"bibr43-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-020-67525-1"},{"key":"bibr44-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1007\/s11222-021-10061-3"},{"key":"bibr45-09622802211072456","doi-asserted-by":"publisher","DOI":"10.1007\/s11222-020-09950-w"}],"container-title":["Statistical Methods in Medical 
Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/09622802211072456","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/09622802211072456","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/09622802211072456","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,2]],"date-time":"2025-03-02T12:56:36Z","timestamp":1740920196000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/09622802211072456"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,24]]},"references-count":45,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,5]]}},"alternative-id":["10.1177\/09622802211072456"],"URL":"https:\/\/doi.org\/10.1177\/09622802211072456","relation":{},"ISSN":["0962-2802","1477-0334"],"issn-type":[{"value":"0962-2802","type":"print"},{"value":"1477-0334","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,24]]}}}