{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,15]],"date-time":"2024-08-15T05:10:35Z","timestamp":1723698635375},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"S14","license":[{"start":{"date-parts":[[2020,9,1]],"date-time":"2020-09-01T00:00:00Z","timestamp":1598918400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,9,30]],"date-time":"2020-09-30T00:00:00Z","timestamp":1601424000000},"content-version":"vor","delay-in-days":29,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>The abundance of molecular profiling of breast cancer tissues entailed active research on molecular marker-based early diagnosis of metastasis. Recently there is a surging interest in combining gene expression with gene networks such as protein-protein interaction (PPI) network, gene co-expression (CE) network and pathway information to identify robust and accurate biomarkers for metastasis prediction, reflecting the common belief that cancer is a systems biology disease. However, controversy exists in the literature regarding whether network markers are indeed better features than genes alone for predicting as well as understanding metastasis. We believe much of the existing results may have been biased by the overly complicated prediction algorithms, unfair evaluation, and lack of rigorous statistics. In this study, we propose a simple approach to use network edges as features, based on two types of networks respectively, and compared their prediction power using three classification algorithms and rigorous statistical procedure on one of the largest datasets available. To detect biomarkers that are significant for the prediction and to compare the robustness of different feature types, we propose an unbiased and novel procedure to measure feature importance that eliminates the potential bias from factors such as different sample size, number of features, as well as class distribution.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Experimental results reveal that edge-based feature types consistently outperformed gene-based feature type in random forest and logistic regression models under all performance evaluation metrics, while the prediction accuracy of edge-based support vector machine (SVM) model was poorer, due to the larger number of edge features compared to gene features and the lack of feature selection in SVM model. Experimental results also show that edge features are much more robust than gene features and the top biomarkers from edge feature types are statistically more significantly enriched in the biological processes that are well known to be related to breast cancer metastasis.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>Overall, this study validates the utility of edge features as biomarkers but also highlights the importance of carefully designed experimental procedures in order to achieve statistically reliable comparison results.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s12859-020-03692-2","type":"journal-article","created":{"date-parts":[[2020,9,30]],"date-time":"2020-09-30T08:05:22Z","timestamp":1601453122000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Robust edge-based biomarker discovery improves prediction of breast cancer metastasis"],"prefix":"10.1186","volume":"21","author":[{"given":"Nahim","family":"Adnan","sequence":"first","affiliation":[]},{"given":"Chengwei","family":"Lei","sequence":"additional","affiliation":[]},{"given":"Jianhua","family":"Ruan","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,9,30]]},"reference":[{"issue":"8","key":"3692_CR1","doi-asserted-by":"publisher","first-page":"591","DOI":"10.1038\/nrc1670","volume":"5","author":"B Weigelt","year":"2005","unstructured":"Weigelt B, Peterse JL, Van\u2019t Veer LJ. Breast cancer metastasis: markers and models. Nat Rev Cancer. 2005; 5(8):591\u2013602.","journal-title":"Nat Rev Cancer"},{"issue":"1","key":"3692_CR2","doi-asserted-by":"publisher","first-page":"7","DOI":"10.3322\/caac.21442","volume":"68","author":"RL Siegel","year":"2016","unstructured":"Siegel RL, Miller KD, Jemal A. Cancer statistics, 2016. CA Cancer J Clin. 2016; 68(1):7\u201330.","journal-title":"CA Cancer J Clin"},{"key":"3692_CR3","unstructured":"Breast Cancer - Metastatic: Statistics. Online. https:\/\/www.cancer.net\/cancer-types\/breast-cancer-metastatic\/statistics. Accessed 20 Feb 2019."},{"key":"3692_CR4","doi-asserted-by":"publisher","first-page":"530","DOI":"10.1038\/415530a","volume":"415","author":"LJ Van\u2019t Veer","year":"2002","unstructured":"Van\u2019t Veer LJ, Dai H, Van De Vijver MJ, He YD, Hart AAM, Mao M, Peterse HL, Van Der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH. Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002; 415:530\u20136.","journal-title":"Nature"},{"issue":"25","key":"3692_CR5","doi-asserted-by":"publisher","first-page":"1999","DOI":"10.1056\/NEJMoa021967","volume":"347","author":"MJ Van De Vijver","year":"2002","unstructured":"Van De Vijver MJ, He YD, Van \u2019t Veer LJ, Dai H, Hart AAM, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, Parrish M, Atsma D, Witteveen A, Glas A, Delahaye L, Van Der Velde T, Bartelink H, Rodenhuis S, Rutgers ET, Friend SH, Bernards R. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med. 2002; 347(25):1999\u20132009.","journal-title":"N Engl J Med"},{"issue":"9460","key":"3692_CR6","doi-asserted-by":"publisher","first-page":"671","DOI":"10.1016\/S0140-6736(05)17947-1","volume":"365","author":"Y Wang","year":"2005","unstructured":"Wang Y, Klijn JG, Zhang Y, Sieuwerts AM, Look MP, Yang F, Talantov D, Timmermans M, Meijer-van Gelder ME, Yu J, Jatkoe T, Berns EM, Atkins D, Foekens JA. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. The Lancet. 2005; 365(9460):671\u20139.","journal-title":"The Lancet"},{"issue":"2","key":"3692_CR7","first-page":"171","volume":"21","author":"D Givol","year":"2004","unstructured":"Givol D, Domany E, Getz G, Kela I, Ein-Dor L. Outcome signature genes in breast cancer: is there a unique set?Bioinformatics. 2004; 21(2):171\u20138.","journal-title":"Bioinformatics"},{"key":"3692_CR8","doi-asserted-by":"publisher","first-page":"375","DOI":"10.1186\/1471-2164-9-375","volume":"9","author":"MH van Vliet","year":"2008","unstructured":"van Vliet MH, Reyal F, Horlings HM, van de Vijver MJ, Reinders MJ, Wessels LF. Pooling breast cancer datasets has a synergetic effect on classification performance and improves signature stability. BMC Genomics. 2008; 9:375.","journal-title":"BMC Genomics"},{"issue":"1","key":"3692_CR9","doi-asserted-by":"publisher","first-page":"140","DOI":"10.1038\/msb4100180","volume":"3","author":"H-Y Chuang","year":"2007","unstructured":"Chuang H-Y, Lee E, Liu Y-T, Lee D, Ideker T. Network-based classification of breast cancer metastasis. Mol Syst Biol. 2007; 3(1):140.","journal-title":"Mol Syst Biol"},{"issue":"2","key":"3692_CR10","doi-asserted-by":"publisher","first-page":"212","DOI":"10.1093\/biostatistics\/kxl002","volume":"8","author":"MY Park","year":"2006","unstructured":"Park MY, Hastie T, Tibshirani R. Averaged gene expressions for regression. Biostatistics. 2006; 8(2):212\u201327.","journal-title":"Biostatistics"},{"key":"3692_CR11","doi-asserted-by":"publisher","first-page":"1338","DOI":"10.1038\/ng.2007.2","volume":"39","author":"MA Pujana","year":"2007","unstructured":"Pujana MA, Han J-DJ, Starita LM, Stevens KN, Tewari M, Ahn JS, Rennert G, Moreno V, Kirchhoff T, Gold B, Assmann V, ElShamy WM, Rual J-F, Levine D, Rozek LS, Gelman RS, Gunsalus KC, Greenberg RA, Sobhian B, Bertin N, Venkatesan K, Ayivi-Guedehoussou N, Sol\u00e9 X, Hern\u00e1ndez P, L\u00e1zaro C, Nathanson KL, Weber BL, Cusick ME, Hill DE, Offit K, Livingston DM, Gruber SB, Parvin JD, Vidal M. Network modeling links breast cancer susceptibility and centrosome dysfunction. Nat Genet. 2007; 39:1338\u201349.","journal-title":"Nat Genet"},{"issue":"11","key":"3692_CR12","doi-asserted-by":"publisher","first-page":"1000217","DOI":"10.1371\/journal.pcbi.1000217","volume":"4","author":"E Lee","year":"2008","unstructured":"Lee E, Chuang H-Y, Kim J-W, Ideker T, Lee D. Inferring pathway activity toward precise disease classification. PLoS Comput Biol. 2008; 4(11):1000217.","journal-title":"PLoS Comput Biol"},{"issue":"2","key":"3692_CR13","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1038\/nbt.1522","volume":"27","author":"IW Taylor","year":"2009","unstructured":"Taylor IW, Linding R, Warde-Farley D, Liu Y, Pesquita C, Faria D, Bull S, Pawson T, Morris Q, Wrana JL. Dynamic modularity in protein interaction networks predicts breast cancer outcome. Nat Biotechnol. 2009; 27(2):199\u2013204.","journal-title":"Nat Biotechnol"},{"issue":"18","key":"3692_CR14","doi-asserted-by":"publisher","first-page":"625","DOI":"10.1093\/bioinformatics\/btq393","volume":"26","author":"A Sch\u00f6nhuth","year":"2010","unstructured":"Sch\u00f6nhuth A, Davicioni E, Moser F, Ester M, Dao P, Salari R, Colak R. Inferring cancer subnetwork markers using density-constrained biclustering. Bioinformatics. 2010; 26(18):625\u201331.","journal-title":"Bioinformatics"},{"issue":"1","key":"3692_CR15","doi-asserted-by":"publisher","first-page":"277","DOI":"10.1186\/1471-2105-11-277","volume":"11","author":"G Abraham","year":"2010","unstructured":"Abraham G, Kowalczyk A, Loi S, Haviv I, Zobel J. Prediction of breast cancer prognosis using gene set statistics provides signature stability and biological context. BMC Bioinformatics. 2010; 11(1):277.","journal-title":"BMC Bioinformatics"},{"issue":"2","key":"3692_CR16","doi-asserted-by":"publisher","first-page":"222","DOI":"10.1515\/jib-2011-188","volume":"8","author":"E van den Akker","year":"2011","unstructured":"van den Akker E, Verbruggen B, Heijmans B, Beekman M, Kok J, Slagboom E, Reinders M. Integrating protein-protein interaction networks with gene-gene co-expression networks improves gene signatures for classifying breast cancer metastasis. J Integr Bioinforma. 2011; 8(2):222\u201338.","journal-title":"J Integr Bioinforma"},{"issue":"5","key":"3692_CR17","doi-asserted-by":"publisher","first-page":"1002511","DOI":"10.1371\/journal.pcbi.1002511","volume":"8","author":"C Winter","year":"2012","unstructured":"Winter C, Kristiansen G, Kersting S, Roy J, Aust D, Kn\u00f6sel T, R\u00fcmmele P, Jahnke B, Hentrich V, R\u00fcckert F, Niedergethmann M, Weichert W, Bahra M, Schlitt HJ, Settmacher U, Friess H, B\u00fcchler M, Saeger H-D, Schroeder M, Pilarsky C, Gr\u00fctzmann R. Google goes cancer: improving outcome prediction for cancer patients by network-based ranking of marker genes. PLoS Comput Biol. 2012; 8(5):1002511.","journal-title":"PLoS Comput Biol"},{"issue":"12","key":"3692_CR18","doi-asserted-by":"publisher","first-page":"311","DOI":"10.1093\/bioinformatics\/btv255","volume":"31","author":"A Allahyar","year":"2015","unstructured":"Allahyar A, De Ridder J. FERAL: network-based classifier with application to breast cancer outcome prediction. Bioinformatics. 2015; 31(12):311\u20139.","journal-title":"Bioinformatics"},{"issue":"16","key":"3692_CR19","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1093\/nar\/gkx642","volume":"45","author":"N Alcaraz","year":"2017","unstructured":"Alcaraz N, Vandin F, Baumbach J, Ditzel HJ, List M, Batra R. De novo pathway-based biomarker identification. Nucleic Acids Res. 2017; 45(16):151.","journal-title":"Nucleic Acids Res"},{"key":"3692_CR20","doi-asserted-by":"publisher","first-page":"35","DOI":"10.1016\/j.jtbi.2014.05.041","volume":"362","author":"W Zhang","year":"2014","unstructured":"Zhang W, Zeng T, Chen L. EdgeMarker: identifying differentially correlated molecule pairs as edge-biomarkers. J Theor Biol. 2014; 362:35\u201343.","journal-title":"J Theor Biol"},{"issue":"2","key":"3692_CR21","doi-asserted-by":"publisher","first-page":"241","DOI":"10.1136\/amiajnl-2011-000658","volume":"19","author":"X Liu","year":"2012","unstructured":"Liu X, Liu Z-P, Zhao X-M, Chen L. Identifying disease genes and module biomarkers by differential interactions. J Am Med Inform Assoc. 2012; 19(2):241\u20138.","journal-title":"J Am Med Inform Assoc"},{"issue":"17","key":"3692_CR22","doi-asserted-by":"publisher","first-page":"2399","DOI":"10.1093\/bioinformatics\/btu199","volume":"30","author":"R Ben-Hamo","year":"2014","unstructured":"Ben-Hamo R, Gidoni M, Efroni S. PhenoNet: identification of key networks associated with disease phenotype. Bioinformatics. 2014; 30(17):2399\u2013405.","journal-title":"Bioinformatics"},{"issue":"4","key":"3692_CR23","doi-asserted-by":"publisher","first-page":"563","DOI":"10.1093\/bioinformatics\/btu672","volume":"31","author":"S Ma","year":"2015","unstructured":"Ma S, Jiang T, Jiang R. Differential regulation enrichment analysis via the integration of transcriptional regulatory network and gene expression data. Bioinformatics. 2015; 31(4):563\u201371.","journal-title":"Bioinformatics"},{"issue":"9","key":"3692_CR24","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1093\/nar\/gku182","volume":"42","author":"Y Li","year":"2014","unstructured":"Li Y, Liang C, Wong K-C, Jin K, Zhang Z. Inferring probabilistic miRNA\u2013mRNA interaction signatures in cancers: a role-switch approach. Nucleic Acids Res. 2014; 42(9):76.","journal-title":"Nucleic Acids Res"},{"key":"3692_CR25","doi-asserted-by":"publisher","first-page":"289","DOI":"10.3389\/fgene.2013.00289","volume":"4","author":"C Staiger","year":"2013","unstructured":"Staiger C, Cadot S, Gy\u00f6rffy B, Wessels L, Klau G. Current composite-feature classification methods do not outperform simple single-genes classifiers in breast cancer prognosis. Front Genet. 2013; 4:289.","journal-title":"Front Genet"},{"issue":"4","key":"3692_CR26","doi-asserted-by":"publisher","first-page":"34796","DOI":"10.1371\/journal.pone.0034796","volume":"7","author":"C Staiger","year":"2012","unstructured":"Staiger C, Cadot S, Kooter R, Dittrich M, M\u00fcller T, Klau GW, Wessels LFA. A critical evaluation of network and pathway-based classifiers for outcome prediction in breast cancer. PLoS ONE. 2012; 7(4):34796.","journal-title":"PLoS ONE"},{"issue":"10","key":"3692_CR27","doi-asserted-by":"publisher","first-page":"2257","DOI":"10.1093\/annonc\/mdq758","volume":"22","author":"X Zhang","year":"2011","unstructured":"Zhang X, Yan Z, Zhang J, Gong L, Li W, Cui J, Liu Y, Gao Z, Li J, Shen L, Lu Y. Combination of hsa-miR-375 and hsa-miR-142-5p as a predictor for recurrence risk in gastric cancer patients following surgical resection. Ann Oncol. 2011; 22(10):2257\u201366.","journal-title":"Ann Oncol"},{"issue":"1","key":"3692_CR28","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1186\/1471-2105-7-3","volume":"7","author":"R D\u00edaz-Uriarte","year":"2006","unstructured":"D\u00edaz-Uriarte R, De Andres SA. Gene selection and classification of microarray data using random forest. BMC Bioinformatics. 2006; 7(1):3.","journal-title":"BMC Bioinformatics"},{"issue":"5","key":"3692_CR29","first-page":"1","volume":"13","author":"N Adnan","year":"2020","unstructured":"Adnan N, Liu Z, Huang TH, Ruan J. Comparative evaluation of network features for the prediction of breast cancer metastasis. BMC Med Genet. 2020; 13(5):1\u201310.","journal-title":"BMC Med Genet"},{"issue":"D1","key":"3692_CR30","doi-asserted-by":"publisher","first-page":"D529","DOI":"10.1093\/nar\/gky1079","volume":"47","author":"R McAdam","year":"2019","unstructured":"Oughtred R, Stark C, Breitkreutz BJ, Rust J, Boucher L, Chang C, Kolas N, O\u2019Donnell L, Leung G, McAdam R, Zhang F, Dolma S, Willems A, Coulombe-Huntington J, Chatr-Aryamontri A, Dolinski K, Tyers M. The BioGRID interaction database: 2019 update. Nucleic Acids Res. 2019; 47(D1):D529\u2013D541.","journal-title":"Nucleic Acids Res"},{"key":"3692_CR31","doi-asserted-by":"crossref","unstructured":"Pearson\u2019s Correlation Coefficient In: Kirch W, editor. Encyclopedia of Public Health. Dordrecht: Springer: 2008. p. 1090\u20131.","DOI":"10.1007\/978-1-4020-5614-7_2569"},{"key":"3692_CR32","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1103\/RevModPhys.74.47","volume":"74","author":"R Albert","year":"2002","unstructured":"Albert R, Barab\u00e1si A-L. Statistical mechanics of complex networks. Rev Mod Phys. 2002; 74:47\u201397.","journal-title":"Rev Mod Phys"},{"key":"3692_CR33","unstructured":"Buitinck L, Louppe G, Blondel M, Pedregosa F, Mueller A, Grisel O, Niculae V, Prettenhofer P, Gramfort A, Grobler J, Layton R, VanderPlas J, Joly A, Holt B, Varoquaux G. API design for machine learning software: experiences from the scikit-learn project. In: ECML PKDD Workshop: Languages for Data Mining and Machine Learning: 2013. p. 108\u201322."},{"volume-title":"Area under the ROC Curve","year":"2013","key":"3692_CR34","unstructured":"Melo F. In: Dubitzky W, Wolkenhauer O, Cho K-H, Yokota H, (eds).Area under the ROC Curve. New York: Springer; 2013, pp. 38\u20139."},{"issue":"3","key":"3692_CR35","doi-asserted-by":"publisher","first-page":"276","DOI":"10.11613\/BM.2012.031","volume":"22","author":"ML McHugh","year":"2012","unstructured":"McHugh ML. Interrater reliability: the kappa statistic. Biochemia Medica. 2012; 22(3):276\u201382.","journal-title":"Biochemia Medica"},{"key":"3692_CR36","doi-asserted-by":"crossref","unstructured":"Chinchor N. MUC-4 Evaluation Metrics. In: Proc. of the Fourth Message Understanding Conference: 1992. p. 22\u201329.","DOI":"10.3115\/1072064.1072067"},{"key":"3692_CR37","doi-asserted-by":"crossref","unstructured":"Pepe MS. The statistical evaluation of medical tests for classification and prediction: Oxford University Press; 2003.","DOI":"10.1093\/oso\/9780198509844.001.0001"},{"key":"3692_CR38","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4614-6849-3","volume-title":"Applied Predictive Modeling","author":"M Kuhn","year":"2013","unstructured":"Kuhn M, Johnson K, Vol. 26. Applied Predictive Modeling. New York: Springer; 2013."},{"issue":"1","key":"3692_CR39","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1093\/nar\/gkn923","volume":"37","author":"DW Huang","year":"2008","unstructured":"Huang DW, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2008; 37(1):1\u201313.","journal-title":"Nucleic Acids Res"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03692-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-020-03692-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03692-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,15]],"date-time":"2024-08-15T04:55:34Z","timestamp":1723697734000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-03692-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9]]},"references-count":39,"journal-issue":{"issue":"S14","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["3692"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-03692-2","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2020,9]]},"assertion":[{"value":"30 September 2020","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"359"}}