{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,22]],"date-time":"2026-03-22T04:03:53Z","timestamp":1774152233844,"version":"3.50.1"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,7,10]],"date-time":"2020-07-10T00:00:00Z","timestamp":1594339200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,7,10]],"date-time":"2020-07-10T00:00:00Z","timestamp":1594339200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000005","name":"U.S. Department of Defense","doi-asserted-by":"publisher","award":["W81XWH-19-1-0495"],"award-info":[{"award-number":["W81XWH-19-1-0495"]}],"id":[{"id":"10.13039\/100000005","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000092","name":"U.S. National Library of Medicine","doi-asserted-by":"publisher","award":["R01LM011663"],"award-info":[{"award-number":["R01LM011663"]}],"id":[{"id":"10.13039\/100000092","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Even though we have established a few risk factors for <jats:italic>metastatic breast cancer<\/jats:italic> (<jats:italic>MBC<\/jats:italic>) through epidemiologic studies, these risk factors have not proven to be effective in predicting an individual\u2019s risk of developing metastasis. Therefore, identifying critical risk factors for MBC continues to be a major research imperative, and one which can lead to advances in breast cancer clinical care. 
The objective of this research is to leverage Bayesian Networks (BN) and information theory to identify key risk factors for breast cancer metastasis from data.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Methods<\/jats:title>\n                <jats:p>We develop the <jats:italic>Markov Blanket and Interactive risk factor Learner<\/jats:italic> (<jats:italic>MBIL<\/jats:italic>) algorithm, which learns single and interactive risk factors having a direct influence on a patient\u2019s outcome. We evaluate the effectiveness of MBIL using simulated datasets, and compare MBIL with the BN learning algorithms <jats:italic>Fast Greedy Search<\/jats:italic> (<jats:italic>FGS<\/jats:italic>), <jats:italic>PC algorithm<\/jats:italic> (<jats:italic>PC<\/jats:italic>), and <jats:italic>CPC algorithm<\/jats:italic> (<jats:italic>CPC<\/jats:italic>). We apply MBIL to learn risk factors for 5\u2009year breast cancer metastasis using a clinical dataset we curated. We evaluate the learned risk factors by consulting with breast cancer experts and literature. We further evaluate the effectiveness of MBIL at learning risk factors for breast cancer metastasis by comparing it to the BN learning algorithms <jats:italic>Necessary Path Condition<\/jats:italic> (<jats:italic>NPC<\/jats:italic>) and <jats:italic>Greedy Equivalent Search<\/jats:italic> (<jats:italic>GES<\/jats:italic>).<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>The averages of the Jaccard index for the simulated datasets containing 2000 records were 0.705, 0.272, 0.228, and 0.147 for MBIL, FGS, PC, and CPC respectively. MBIL, NPC, and GES all learned that <jats:italic>grade<\/jats:italic> and <jats:italic>lymph_nodes_positive<\/jats:italic> are direct risk factors for 5\u2009year metastasis. Only MBIL and NPC found that <jats:italic>surgical_margins<\/jats:italic> is a direct risk factor. 
Only NPC found that <jats:italic>invasive<\/jats:italic> is a direct risk factor. MBIL learned that <jats:italic>HER2<\/jats:italic> and <jats:italic>ER<\/jats:italic> interact to directly affect 5\u2009year metastasis. Neither GES nor NPC learned that <jats:italic>HER2<\/jats:italic> and <jats:italic>ER<\/jats:italic> are direct risk factors.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Discussion<\/jats:title>\n                <jats:p>The results involving simulated datasets indicated that MBIL can learn direct risk factors substantially better than standard Bayesian network learning algorithms. An application of MBIL to a real breast cancer dataset identified both single and interactive risk factors that directly influence breast cancer metastasis, which can be investigated further.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-020-03638-8","type":"journal-article","created":{"date-parts":[[2020,7,10]],"date-time":"2020-07-10T08:04:00Z","timestamp":1594368240000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Leveraging Bayesian networks and information theory to learn risk factors for breast cancer metastasis"],"prefix":"10.1186","volume":"21","author":[{"given":"Xia","family":"Jiang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alan","family":"Wells","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Adam","family":"Brufsky","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Darshan","family":"Shetty","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kahmil","family":"Shajihan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Richard 
E.","family":"Neapolitan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2020,7,10]]},"reference":[{"key":"3638_CR1","unstructured":"CDC (Centers for Disease Control and Prevention), Leading Causes of Death in Females, United States. https:\/\/www.cdc.gov\/women\/lcod\/index.htm. Accessed Jan 2018."},{"key":"3638_CR2","volume-title":"Cancer Facts and Figures","author":"American Cancer Society","year":"2018","unstructured":"American Cancer Society. Cancer Facts and Figures. Atlanta: American Cancer Society, Inc; 2018. https:\/\/www.cancer.org\/research\/cancer-facts-statistics\/all-cancer-facts-figures\/cancer-facts-figures-2018.html."},{"key":"3638_CR3","unstructured":"U.S. breast cancer statistic, breast cancer.org. https:\/\/www.breastcancer.org\/symptoms\/understand_bc\/statistics. Accessed Jan 2018."},{"key":"3638_CR4","unstructured":"The Breast Cancer Landscape. https:\/\/cdmrp.army.mil\/bcrp\/pdfs\/Breast%20Cancer%20Landscape.pdf. Accessed Jan 2018."},{"issue":"4","key":"3638_CR5","doi-asserted-by":"publisher","first-page":"679","DOI":"10.1016\/j.cell.2006.11.001","volume":"127","author":"GP Gupta","year":"2006","unstructured":"Gupta GP, Massague J. Cancer metastasis: building a framework. Cell. 2006;127(4):679\u201395.","journal-title":"Cell"},{"key":"3638_CR6","unstructured":"Statistics for Metastatic Breast Cancer Metastatic. Breast Cancer Network. http:\/\/mbcn.org\/education\/category\/statistics\/. Accessed Jan 2018."},{"issue":"22","key":"3638_CR7","doi-asserted-by":"publisher","first-page":"1681","DOI":"10.1093\/jnci\/87.22.1681","volume":"87","author":"RG Ziegler","year":"1995","unstructured":"Ziegler RG, Benichou J, Byrne C, et al. Proportion of breast cancer cases in the United States explained by well-established risk factors. J Natl Cancer Inst. 
1995;87(22):1681\u20135.","journal-title":"J Natl Cancer Inst"},{"key":"3638_CR8","doi-asserted-by":"publisher","first-page":"346","DOI":"10.1038\/nature10983","volume":"486","author":"C Curtis","year":"2012","unstructured":"Curtis C, Shah SP, Chin SF, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroup. Nature. 2012;486:346\u201352.","journal-title":"Nature"},{"key":"3638_CR9","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1186\/s12859-016-1084-8","volume":"17","author":"Z Zeng","year":"2016","unstructured":"Zeng Z, Jiang X, Neapolitan RE. Discovering causal interactions using Bayesian network scoring and Information gain. BMC Bioinformatics. 2016;17:21.","journal-title":"BMC Bioinformatics"},{"issue":"5631","key":"3638_CR10","doi-asserted-by":"publisher","first-page":"336","DOI":"10.1126\/science.1085242","volume":"301","author":"JC Carrington","year":"2003","unstructured":"Carrington JC, Ambros V. Role of microRNAs in plant and animal development. Science. 2003;301(5631):336\u20138.","journal-title":"Science."},{"key":"3638_CR11","doi-asserted-by":"publisher","unstructured":"Lee S, Jiang X. Modeling miRNA-mRNA interactions that cause phenotypic abnormality in breast cancer patients. PLoS One. 2017;12(8). https:\/\/doi.org\/10.1371\/journal.pone.0182666.","DOI":"10.1371\/journal.pone.0182666"},{"issue":"11","key":"3638_CR12","doi-asserted-by":"publisher","first-page":"2348","DOI":"10.1261\/rna.1034808","volume":"14","author":"L-X Yan","year":"2008","unstructured":"Yan L-X, Huang X-F, Shao Q, Huang M-Y, Deng L, Wu Q-L, et al. MicroRNA miR-21 overexpression in human breast cancer is associated with advanced clinical stage, lymph node metastasis and patient poor prognosis. Rna. 
2008;14(11):2348\u201360.","journal-title":"Rna"},{"issue":"3","key":"3638_CR13","doi-asserted-by":"publisher","first-page":"350","DOI":"10.1038\/cr.2008.24","volume":"18","author":"S Zhu","year":"2008","unstructured":"Zhu S, Wu H, Wu F, Nie D, Sheng S, Mo Y-Y. MicroRNA-21 targets tumor suppressor genes in invasion and metastasis. Cell Res. 2008;18(3):350\u20139.","journal-title":"Cell Res"},{"key":"3638_CR14","volume-title":"Learning Bayesian Networks","author":"RE Neapolitan","year":"2004","unstructured":"Neapolitan RE. Learning Bayesian Networks. Upper Saddle River: Prentice Hall; 2004."},{"key":"3638_CR15","volume-title":"Probabilistic Reasoning in Intelligent Systems","author":"J Pearl","year":"1988","unstructured":"Pearl J. Probabilistic Reasoning in Intelligent Systems. Burlington: Morgan Kaufmann; 1988."},{"key":"3638_CR16","volume-title":"Probabilistic reasoning in expert systems","author":"RE Neapolitan","year":"1989","unstructured":"Neapolitan RE. Probabilistic reasoning in expert systems. NY: Wiley; 1989."},{"key":"3638_CR17","volume-title":"Bayesian networks and influence diagrams","author":"UB Kjaerulff","year":"2010","unstructured":"Kjaerulff UB, Madsen AL. Bayesian networks and influence diagrams. NY: Springer; 2010."},{"key":"3638_CR18","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4612-2748-9","volume-title":"Causation, Prediction, and Search","author":"P Spirtes","year":"1993","unstructured":"Spirtes P, Glymour C, Scheines R. Causation, Prediction, and Search. 2nd ed. New York: Springer-Verlag; 1993. Boston, MA; MIT Press; 2000. (https:\/\/bd2kccd.github.io\/docs\/tetrad\/).","edition":"2"},{"key":"3638_CR19","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1023\/A:1020249912095","volume":"50","author":"N Friedman","year":"2003","unstructured":"Friedman N, Koller K. Being Bayesian about network structure: a Bayesian approach to structure discovery in Bayesian networks. Mach Learn. 
2003;50:95\u2013125.","journal-title":"Mach Learn"},{"issue":"3","key":"3638_CR20","first-page":"197","volume":"20","author":"D Heckerman","year":"1995","unstructured":"Heckerman D, Geiger D, Chickering D. Learning Bayesian networks: the combination of knowledge and statistical data. Mach Learn. 1995;20(3):197\u2013243.","journal-title":"Mach Learn"},{"key":"3638_CR21","first-page":"309","volume":"9","author":"GF Cooper","year":"1992","unstructured":"Cooper GF, Herskovits E. A Bayesian method for the induction of probabilistic networks from data. Mach Learn. 1992;9:309\u201347.","journal-title":"Mach Learn"},{"key":"3638_CR22","volume-title":"Learning from data: lecture notes in statistics","author":"M Chickering","year":"1996","unstructured":"Chickering M. Learning Bayesian networks is NP-complete. In: Fisher D, Lenz H, editors. Learning from data: lecture notes in statistics. New York: Springer Verlag; 1996."},{"key":"3638_CR23","doi-asserted-by":"publisher","unstructured":"Jiang X, Jao J, Neapolitan RE. Learning predictive interactions using Information Gain and Bayesian network scoring. PLOS ONE. 2015. https:\/\/doi.org\/10.1371\/journal.pone.0143247.","DOI":"10.1371\/journal.pone.0143247"},{"issue":"3","key":"3638_CR24","doi-asserted-by":"publisher","first-page":"379","DOI":"10.1002\/j.1538-7305.1948.tb01338.x","volume":"27","author":"CE Shannon","year":"1948","unstructured":"Shannon CE. A mathematical theory of communication. Bell Syst Tech J. 1948;27(3):379\u2013423.","journal-title":"Bell Syst Tech J"},{"key":"3638_CR25","doi-asserted-by":"publisher","first-page":"338","DOI":"10.1016\/S0019-9958(65)90241-X","volume":"8","author":"LA Zadeh","year":"1965","unstructured":"Zadeh LA. Fuzzy sets. Inf Control. 1965;8:338\u201353.","journal-title":"Inf Control"},{"key":"3638_CR26","doi-asserted-by":"crossref","unstructured":"Fabian CJ. The what, why and how of aromatase inhibitors: hormonal agents for treatment and prevention of breast cancer. Int J Clin Pract. 
2007;61(12):2051\u201363. https:\/\/onlinelibrary.wiley.com\/doi\/full\/10.1111\/j.1742-1241.2007.01587.x.","DOI":"10.1111\/j.1742-1241.2007.01587.x"},{"key":"3638_CR27","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1186\/1756-0381-5-16","volume":"5","author":"RJ Urbanowicz","year":"2012","unstructured":"Urbanowicz RJ, Kiralis J, Sinnott-Armstrong NA, et al. GAMETES: a fast, direct algorithm for generating pure, strict, epistatic models with random architectures. BioData Min. 2012;5:16.","journal-title":"BioData Min"},{"issue":"2","key":"3638_CR28","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1210\/er.2006-0045","volume":"29","author":"G Arpino","year":"2008","unstructured":"Arpino G, Wiechmann L, Osborne CK, Schiff R. Crosstalk between the estrogen receptor and the HER tyrosine kinase receptor family: molecular mechanism and clinical implications for endocrine therapy resistance. Endocr Rev. 2008;29(2):217\u201333.","journal-title":"Endocr Rev"},{"issue":"1\u20132","key":"3638_CR29","doi-asserted-by":"publisher","first-page":"4","DOI":"10.3121\/cmr.2008.825","volume":"7","author":"AA Onitilo","year":"2009","unstructured":"Onitilo AA, Engel JM, Greenlee RT, Mukesh BN. Breast cancer subtypes based on ER\/PR and Her2 expression: comparison of clinicopathologic features and survival. Clin Med Res. 2009;7(1\u20132):4\u201313.","journal-title":"Clin Med Res"},{"issue":"2","key":"3638_CR30","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1007\/s10549-016-4059-6","volume":"161","author":"X Li","year":"2017","unstructured":"Li X, Yang J, Peng L, Sahin AA, et al. Triple-negative breast cancer has worse overall survival and cause-specific survival than non-triple-negative breast cancer. Breast Cancer Res Treat. 
2017;161(2):297\u201387.","journal-title":"Breast Cancer Res Treat"},{"issue":"3","key":"3638_CR31","doi-asserted-by":"publisher","first-page":"743","DOI":"10.1007\/s10549-017-4383-5","volume":"165","author":"CA Parise","year":"2017","unstructured":"Parise CA, Caggiano V. Risk of mortality of node-negative, ER\/PR\/HER2 breast cancer subtypes in T1, T2, and T3 tumors. Breast Cancer Res Treat. 2017;165(3):743\u201350.","journal-title":"Breast Cancer Res Treat"},{"issue":"2","key":"3638_CR32","doi-asserted-by":"publisher","first-page":"577","DOI":"10.1007\/s10549-017-4620-y","volume":"168","author":"AF Pichilingue-Febres","year":"2018","unstructured":"Pichilingue-Febres AF, Arias-Linares MA, Araujo-Castillo RV. Comments on \"risk of mortality of node-negative, ER\/PR\/HER2 breast cancer subtypes in T1, T2, and T3 tumors\" by Parise CA and Caggiano V, breast Cancer res treat, 2017. Breast Cancer Res Treat. 2018;168(2):577\u20138.","journal-title":"Breast Cancer Res Treat"}],"container-title":["BMC 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03638-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-020-03638-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03638-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,9]],"date-time":"2021-07-09T23:16:11Z","timestamp":1625872571000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-03638-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,10]]},"references-count":32,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["3638"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-03638-8","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,7,10]]},"assertion":[{"value":"6 June 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 July 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 July 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The dataset is a result of the PROTOCOL TITLE: A New Generation Clinical Decision Support System, which was approved by Northwestern University IRB #: STU00200923-MOD0006.The need for patient consent was waived by the ethics committee because the data consists only of de-identified data mined from EHR 
databases.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"298"}}