{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:42:44Z","timestamp":1753875764490,"version":"3.41.2"},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2022,2,25]],"date-time":"2022-02-25T00:00:00Z","timestamp":1645747200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,3,10]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Although sifting functional genes has been discussed for years, traditional selection methods tend to be ineffective in capturing potential specific genes. First, typical methods focus on finding features (genes) relevant to class while irrelevant to each other. However, the features that can offer rich discriminative information are more likely to be the complementary ones. Next, almost all existing methods assess feature relations in pairs, yielding an inaccurate local estimation and lacking a global exploration. In this paper, we introduce multi-variable Area Under the receiver operating characteristic Curve (AUC) to globally evaluate the complementarity among features by employing Area Above the receiver operating characteristic Curve (AAC). Due to AAC, the class-relevant information newly provided by a candidate feature and that preserved by the selected features can be achieved beyond pairwise computation. Furthermore, we propose an AAC-based feature selection algorithm, named Multi-variable AUC-based Combined Features Complementarity, to screen discriminative complementary feature combinations. Extensive experiments on public datasets demonstrate the effectiveness of the proposed approach. 
Besides, we provide a gene set about prostate cancer and discuss its potential biological significance from the machine learning aspect and based on the existing biomedical findings of some individual genes.<\/jats:p>","DOI":"10.1093\/bib\/bbac029","type":"journal-article","created":{"date-parts":[[2022,1,26]],"date-time":"2022-01-26T12:10:29Z","timestamp":1643199029000},"source":"Crossref","is-referenced-by-count":2,"title":["Multi-variable AUC for sifting complementary features and its biomedical application"],"prefix":"10.1093","volume":"23","author":[{"given":"Yue","family":"Su","sequence":"first","affiliation":[{"name":"College of Computer Science at Nankai University, China"}]},{"given":"Keyu","family":"Du","sequence":"additional","affiliation":[{"name":"College of Computer Science at Nankai University, China"}]},{"given":"Jun","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Mathematics and Statistics Science at Ludong University, China"}]},{"given":"Jin-mao","family":"Wei","sequence":"additional","affiliation":[{"name":"College of Computer Science at Nankai University, China"}]},{"given":"Jian","family":"Liu","sequence":"additional","affiliation":[{"name":"College of Computer Science at Nankai University, China"}]}],"member":"286","published-online":{"date-parts":[[2022,2,25]]},"reference":[{"issue":"5","key":"2022031506404708600_ref1","doi-asserted-by":"crossref","first-page":"1538","DOI":"10.1109\/TCBB.2017.2712775","article-title":"Local-nearest-neighbors-based feature weighting for gene selection","volume":"15","author":"An","year":"2018","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2022031506404708600_ref2","article-title":"Benchmark of filter methods for feature selection in high-dimensional gene expression survival data","author":"Andrea","year":"2021","journal-title":"Brief 
Bioinform"},{"issue":"10","key":"2022031506404708600_ref3","doi-asserted-by":"crossref","first-page":"1113","DOI":"10.1038\/ng.2764","article-title":"The cancer genome atlas pan-cancer analysis project","volume":"45","author":"Chang","year":"2013","journal-title":"Nat Genet"},{"key":"2022031506404708600_ref4","first-page":"124","article-title":"Fast: a roc-based feature selection metric for small samples and imbalanced data classification problems","volume-title":"Proc. 14th ACM SIGKDD","author":"Chen","year":"2008"},{"issue":"4","key":"2022031506404708600_ref5","article-title":"Disentangling pten-cooperating tumor suppressor gene networks in cancer","volume":"4","author":"de la Rosa","year":"2017","journal-title":"Mol Cell Oncol"},{"issue":"2","key":"2022031506404708600_ref6","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1109\/TCYB.2018.2859342","article-title":"Multiple relevant feature ensemble selection based on multilayer co-evolutionary consensus mapreduce","volume":"50","author":"Ding","year":"2020","journal-title":"IEEE Trans Cybern"},{"volume-title":"UCI machine learning repository","year":"2017","author":"Dua","key":"2022031506404708600_ref7"},{"issue":"6","key":"2022031506404708600_ref8","first-page":"1157","article-title":"An introduction to variable and feature selection","volume":"3","author":"Guyon","year":"2003","journal-title":"J Mach Learn Res"},{"issue":"2","key":"2022031506404708600_ref9","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1023\/A:1010920819831","article-title":"A simple generalisation of the area under the roc curve for multiple class classification problems","volume":"45","author":"Hand","year":"2001","journal-title":"Mach Learn"},{"issue":"1","key":"2022031506404708600_ref10","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1148\/radiology.143.1.7063747","article-title":"The meaning and use of the area under a receiver operating characteristic (roc) 
curve","volume":"143","author":"Hanley","year":"1982","journal-title":"Radiology"},{"issue":"9","key":"2022031506404708600_ref11","doi-asserted-by":"crossref","first-page":"1263","DOI":"10.1109\/TKDE.2008.239","article-title":"Learning from imbalanced data","volume":"21","author":"He","year":"2009","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"2","key":"2022031506404708600_ref12","first-page":"123","article-title":"Protein phosphatase and trail receptor genes as new candidate tumor genes on chromosome 8p in prostate cancer","volume":"5","author":"Hornstein","year":"2008","journal-title":"Cancer Genomics Proteomics"},{"key":"2022031506404708600_ref13","article-title":"Machine learning based on attribute interactions","author":"Jakulin","year":"2005","journal-title":"PhD dissertation, Faculty Comput Inf Sci, Ljubljana Univ, Ljubljana, Slovenia"},{"issue":"5","key":"2022031506404708600_ref14","doi-asserted-by":"crossref","first-page":"3299","DOI":"10.1109\/TIE.2016.2527623","article-title":"A hybrid feature selection scheme for reducing diagnostic performance deterioration caused by outliers in data-driven diagnostics","volume":"63","author":"Kang","year":"2016","journal-title":"IEEE Trans Ind Electron"},{"issue":"1-2","key":"2022031506404708600_ref15","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1016\/S0004-3702(97)00043-X","article-title":"Wrappers for feature subset selection","volume":"97","author":"Kohavi","year":"1997","journal-title":"Artif Intell"},{"issue":"17","key":"2022031506404708600_ref16","doi-asserted-by":"crossref","first-page":"i421","DOI":"10.1093\/bioinformatics\/btw430","article-title":"Complementary feature selection from alternative splicing events and gene expression for phenotype prediction","volume":"32","author":"Labuzzetta","year":"2016","journal-title":"Bioinformatics"},{"issue":"6","key":"2022031506404708600_ref17","doi-asserted-by":"crossref","DOI":"10.1145\/3136625","article-title":"Feature selection: A data 
perspective","volume":"50","author":"Li","year":"2016","journal-title":"Acm Computing Surveys"},{"key":"2022031506404708600_ref18","first-page":"62","article-title":"Conditional infomax learning: An integrated framework for feature extraction and fusion","volume-title":"Proc. 9th Eur. Conf. Comput. Vis","author":"Lin","year":"2006"},{"issue":"2","key":"2022031506404708600_ref19","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1109\/TCBB.2014.2361329","article-title":"Gene selection integrated with biological knowledge for plant stress response using neighborhood system and rough set theory","volume":"12","author":"Meng","year":"2015","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"issue":"1","key":"2022031506404708600_ref20","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1016\/j.bbrc.2014.06.007","article-title":"Eph receptor a10 has a potential as a target for a prostate cancer therapy","volume":"450","author":"Nagano","year":"2014","journal-title":"Biochem Biophys Res Commun"},{"issue":"8","key":"2022031506404708600_ref21","doi-asserted-by":"crossref","first-page":"1226","DOI":"10.1109\/TPAMI.2005.159","article-title":"Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy","volume":"27","author":"Peng","year":"2005","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"5","key":"2022031506404708600_ref22","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbab097","article-title":"Improving feature selection performance for classification of gene expression data using harris hawks optimizer with variable neighborhood learning","volume":"22","author":"Qu","year":"2021","journal-title":"Briefings Bioinform"},{"issue":"1-2","key":"2022031506404708600_ref23","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1023\/A:1025667309714","article-title":"Theoretical and empirical analysis of relieff and 
rrelieff","volume":"53","author":"Robnik-\u0160ikonja","year":"2003","journal-title":"Mach Learn"},{"issue":"19","key":"2022031506404708600_ref24","doi-asserted-by":"crossref","first-page":"2507","DOI":"10.1093\/bioinformatics\/btm344","article-title":"A review of feature selection techniques in bioinformatics","volume":"23","author":"Saeys","year":"2007","journal-title":"Bioinformatics"},{"key":"2022031506404708600_ref25","first-page":"178","article-title":"Filter methods for feature selection - a comparative study","volume-title":"Proc. 8th IDEAL","author":"S\u00e1nchez-Maro\u00f1o","year":"2007"},{"issue":"3","key":"2022031506404708600_ref26","first-page":"754","article-title":"A top-r feature selection algorithm for microarray gene expression data","volume":"9","author":"Sharma","year":"2011","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"issue":"1","key":"2022031506404708600_ref27","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1111\/andr.12438","article-title":"Roles of histone h3.5 in human spermatogenesis and spermatogenic disorders","volume":"6","author":"Shiraishi","year":"2018","journal-title":"Andrology"},{"issue":"3","key":"2022031506404708600_ref28","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1186\/s12859-017-1468-4","article-title":"Avc: Selecting discriminative features on basis of auc by maximizing variable complementarity","volume":"18","author":"Sun","year":"2017","journal-title":"BMC Bioinf"},{"key":"2022031506404708600_ref29","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.patcog.2019.01.047","article-title":"A novel ecoc algorithm for multiclass microarray data classification based on data complexity analysis","volume":"90","author":"Sun","year":"2019","journal-title":"Pattern Recognition"},{"issue":"4","key":"2022031506404708600_ref30","doi-asserted-by":"crossref","first-page":"1378","DOI":"10.1093\/bib\/bbz061","article-title":"A critical assessment of the feature selection methods used for biomarker discovery in current metaproteomics studies","volume":"21","author":"Tang","year":"2020","journal-title":"Briefings Bioinform"},{"issue":"10","key":"2022031506404708600_ref31","doi-asserted-by":"crossref","first-page":"6567","DOI":"10.1073\/pnas.082099299","article-title":"Diagnosis of multiple cancer types by shrunken centroids of gene expression","volume":"99","author":"Tibshirani","year":"2002","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"4","key":"2022031506404708600_ref32","doi-asserted-by":"crossref","first-page":"828","DOI":"10.1109\/TKDE.2017.2650906","article-title":"Feature selection by maximizing independent classification information","volume":"29","author":"Wang","year":"2017","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2022031506404708600_ref33","first-page":"400","article-title":"Feature selection for maximizing the area under the roc curve","author":"Wang","year":"2009","journal-title":"Proc 13th 
ICDMW"},{"issue":"3","key":"2022031506404708600_ref34","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1109\/TKDE.2009.114","article-title":"Ensemble rough hypercuboid approach for classifying cancers","volume":"22","author":"Wei","year":"2010","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"2022031506404708600_ref35","first-page":"687","article-title":"Data visualization and feature selection: New algorithms for nongaussian data","volume":"12","author":"Yang","year":"2000","journal-title":"Advances Neural Inf Process Syst"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/2\/bbac029\/42806627\/bbac029.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/2\/bbac029\/42806627\/bbac029.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,3,15]],"date-time":"2022-03-15T06:52:55Z","timestamp":1647327175000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac029\/6536295"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,25]]},"references-count":35,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,3,10]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac029","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"type":"print","value":"1467-5463"},{"type":"electronic","value":"1477-4054"}],"subject":[],"published-other":{"date-parts":[[2022,3]]},"published":{"date-parts":[[2022,2,25]]},"article-number":"bbac029"}}