{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,14]],"date-time":"2026-02-14T10:50:00Z","timestamp":1771066200047,"version":"3.50.1"},"reference-count":18,"publisher":"Oxford University Press (OUP)","issue":"20","license":[{"start":{"date-parts":[[2021,5,19]],"date-time":"2021-05-19T00:00:00Z","timestamp":1621382400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,10,25]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>When designing prediction models built with many features and relatively small sample sizes, feature selection methods often overfit training data, leading to selection of irrelevant features. One way to potentially mitigate overfitting is to incorporate domain knowledge during feature selection. Here, a feature ranking algorithm called \u2018Family Rank\u2019 is presented in which features are ranked based on a combination of graphical domain knowledge and feature scores computed from empirical data.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>A simulated dataset is used to demonstrate a scenario in which family rank outperforms other state-of-the-art graph based ranking algorithms, decreasing the sample size needed to detect true predictors by 2- to 3-fold. An example from oncology is then used to explore a real-world application of family rank.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>An implementation of Family Rank is freely available at https:\/\/cran.r-project.org\/package=FamilyRank.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab387","type":"journal-article","created":{"date-parts":[[2021,5,18]],"date-time":"2021-05-18T19:12:03Z","timestamp":1621365123000},"page":"3626-3631","source":"Crossref","is-referenced-by-count":4,"title":["Family Rank: a graphical domain knowledge informed feature ranking algorithm"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6991-3217","authenticated-orcid":false,"given":"Michelle","family":"Saul","sequence":"first","affiliation":[{"name":"College of Health Solutions, Arizona State University , Tempe, AZ 85287-9020, USA"},{"name":"Caris Life Sciences , Tempe, AZ 85281, USA"}]},{"given":"Valentin","family":"Dinu","sequence":"additional","affiliation":[{"name":"College of Health Solutions, Arizona State University , Tempe, AZ 85287-9020, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,5,19]]},"reference":[{"key":"2023051609045929900_btab387-B1","first-page":"3","article-title":"Classification and regression by randomForest","volume":"2","author":"Andy","year":"2002","journal-title":"R. News"},{"key":"2023051609045929900_btab387-B2","doi-asserted-by":"crossref","first-page":"827","DOI":"10.1038\/nbt.1665","article-title":"The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models","volume":"28","author":"Consortium","year":"2010","journal-title":"Nat. Biotechnol"},{"key":"2023051609045929900_btab387-B3","first-page":"1","article-title":"The igraph software package for complex network research","volume":"1695","author":"Csardi","year":"2006","journal-title":"Int. J. Complex Syst"},{"key":"2023051609045929900_btab387-B4","first-page":"5","article-title":"Misc functions of the Department of Statistics (e1071), TU Wien","volume":"1","author":"Dimitriadou","year":"2008","journal-title":"R Package"},{"key":"2023051609045929900_btab387-B5","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1016\/S0955-0674(03)00009-7","article-title":"Function prediction and protein networks","volume":"15","author":"Huynen","year":"2003","journal-title":"Curr. Opin. Cell Biol"},{"key":"2023051609045929900_btab387-B6","volume-title":"Novartis Foundation Symposium","author":"Kanehisa","year":"2002"},{"key":"2023051609045929900_btab387-B7","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1016\/j.mpmed.2007.10.003","article-title":"Principles of cytotoxic chemotherapy","volume":"36","author":"Lind","year":"2008","journal-title":"Medicine"},{"key":"2023051609045929900_btab387-B8","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1186\/1471-2105-6-233","article-title":"GeneRank: using search engine technology for the analysis of microarray experiments","volume":"6","author":"Morrison","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023051609045929900_btab387-B9","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1093\/bioinformatics\/btv634","article-title":"EMDomics: a robust and powerful method for the identification of genes differentially expressed between heterogeneous classes","volume":"32","author":"Nabavi","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051609045929900_btab387-B10","volume-title":"The pagerank citation ranking: Bringing order to the web","year":"1999"},{"key":"2023051609045929900_btab387-B11","doi-asserted-by":"crossref","first-page":"R5","DOI":"10.1186\/bcr2468","article-title":"Effect of training-sample size and classification difficulty on the accuracy of genomic predictors","volume":"12","author":"Popovici","year":"2010","journal-title":"Breast Cancer Res"},{"key":"2023051609045929900_btab387-B12"},{"key":"2023051609045929900_btab387-B13","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1186\/1471-2105-12-77","article-title":"pROC: an open-source package for R and S+ to analyze and compare ROC curves","volume":"12","author":"Robin","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023051609045929900_btab387-B14","author":"Saul","year":"2021","journal-title":"FamilyRank: Algorithm for Ranking Predictors Using Graphical Domain Knowledge. Release 1.0"},{"key":"2023051609045929900_btab387-B15","doi-asserted-by":"crossref","first-page":"D447","DOI":"10.1093\/nar\/gku1003","article-title":"STRING v10: protein\u2013protein interaction networks, integrated over the tree of life","volume":"43","author":"Szklarczyk","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023051609045929900_btab387-B16","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-21706-2","volume-title":"Modern Applied Statistics with S","author":"Venables","year":"2002"},{"key":"2023051609045929900_btab387-B17","doi-asserted-by":"crossref","first-page":"258","DOI":"10.1093\/nar\/gkg034","article-title":"STRING: a database of predicted functional associations between proteins","volume":"31","author":"Von Mering","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023051609045929900_btab387-B18","doi-asserted-by":"crossref","first-page":"D433","DOI":"10.1093\/nar\/gki005","article-title":"STRING: known and predicted protein\u2013protein associations, integrated and transferred across organisms","volume":"33","author":"Von Mering","year":"2005","journal-title":"Nucleic Acids Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab387\/37957646\/btab387.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/20\/3626\/50338745\/btab387.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/20\/3626\/50338745\/btab387.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T09:07:22Z","timestamp":1684228042000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/20\/3626\/6278293"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,5,19]]},"references-count":18,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2021,10,25]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab387","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,10,15]]},"published":{"date-parts":[[2021,5,19]]}}}