{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T21:39:17Z","timestamp":1773524357943,"version":"3.50.1"},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T00:00:00Z","timestamp":1742342400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T00:00:00Z","timestamp":1742342400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>The Mapper algorithm is a data mining topological tool that can help us to obtain higher level understanding of disease by visualising the structure of patient data as a similarity graph. It has been successfully applied for exploratory analysis of cancer data in the past, delivering several significant subgroup discoveries. Using the Mapper algorithm in practice requires setting up multiple parameters. The graph then needs to be manually analysed according to a research question at hand. It has been highlighted in the literature that Mapper\u2019s parameters have significant impact on the output graph shape and there is no established way to select their optimal values. Hence while using the Mapper algorithm, different parameter values and consequently different output graphs need to be studied. This prevents routine application of the Mapper algorithm in real world settings.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Methods<\/jats:title>\n            <jats:p>We propose a new algorithm for subgroup discovery within the Mapper graph. We refer to the task as hotspot detection as it is designed to identify homogenous and geometrically compact subsets of patients, which are distinct with respect to their clinical or molecular profiles (e.g. survival). Furthermore, we propose to include the existence of a hotspot as a criterion while searching the parameter space, addressing one of the key limitations of the Mapper algorithm (i.e. parameter selection).<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>Two experiments were performed to demonstrate the efficacy of the algorithm, including an artificial hotspot in the Two Circles dataset and a real world case study of subgroup discovery in oestrogen receptor-positive breast cancer. Our hotspot detection algorithm successfully identified graphs containing homogenous communities of nodes within the Two Circles dataset. When applied to gene expression data of ER+ breast cancer patients, appropriate parameters were identified to generate a Mapper graph revealing a hotspot of ER+ patients with poor prognosis and characteristic patterns of gene expression. This was subsequently confirmed in an independent breast cancer dataset.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>Our proposed method can be effectively applied for subgroup discovery with pathology data. It allows\u00a0us to find optimal parameters of the Mapper algorithm, bridging the gap between its potential and the translational research.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/s12911-025-02852-9","type":"journal-article","created":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T02:06:41Z","timestamp":1742350001000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["A novel method for subgroup discovery in precision medicine based on topological data analysis"],"prefix":"10.1186","volume":"25","author":[{"given":"Ciara F.","family":"Loughrey","sequence":"first","affiliation":[]},{"given":"Sarah","family":"Maguire","sequence":"additional","affiliation":[]},{"given":"Pawe\u0142","family":"D\u0142otko","sequence":"additional","affiliation":[]},{"given":"Lu","family":"Bai","sequence":"additional","affiliation":[]},{"given":"Nick","family":"Orr","sequence":"additional","affiliation":[]},{"given":"Anna","family":"Jurek-Loughrey","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,3,19]]},"reference":[{"key":"2852_CR1","unstructured":"Singh G, M\u00e9moli F, Carlsson GE, et\u00a0al. Topological methods for the analysis of high dimensional data sets and 3d object recognition. PBG@ Eurographics. 2007;2."},{"key":"2852_CR2","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1016\/j.coisb.2016.12.012","volume":"1","author":"G Carlsson","year":"2017","unstructured":"Carlsson G. The shape of biomedical data. Curr Opin Syst Biol. 2017;1:109\u201313.","journal-title":"Curr Opin Syst Biol."},{"issue":"7","key":"2852_CR3","doi-asserted-by":"publisher","first-page":"816","DOI":"10.1002\/dvdy.175","volume":"249","author":"EJ Am\u00e9zquita","year":"2020","unstructured":"Am\u00e9zquita EJ, Quigley MY, Ophelders T, Munch E, Chitwood DH. The shape of things to come: topological data analysis and biology, from molecules to organisms. Dev Dyn. 2020;249(7):816\u201333.","journal-title":"Dev Dyn."},{"issue":"19","key":"2852_CR4","doi-asserted-by":"publisher","first-page":"3091","DOI":"10.1093\/bioinformatics\/btab553","volume":"37","author":"CF Loughrey","year":"2021","unstructured":"Loughrey CF, Fitzpatrick P, Orr N, Jurek-Loughrey A. The topology of data: opportunities for cancer research. Bioinformatics. 2021;37(19):3091\u20138.","journal-title":"Bioinformatics."},{"issue":"1","key":"2852_CR5","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41467-018-03664-4","volume":"9","author":"M Saggar","year":"2018","unstructured":"Saggar M, Sporns O, Gonzalez-Castillo J, Bandettini PA, Carlsson G, Glover G, et al. Towards a new approach to reveal dynamical organization of the brain using topological data analysis. Nat Commun. 2018;9(1):1\u201314.","journal-title":"Nat Commun."},{"key":"2852_CR6","doi-asserted-by":"crossref","unstructured":"Li L, Cheng WY, Glicksberg BS, Gottesman O, Tamler R, Chen R, et\u00a0al. Identification of type 2 diabetes subgroups through topological analysis of patient similarity. Sci Transl Med. 2015;7(311):311ra174.","DOI":"10.1126\/scitranslmed.aaa9364"},{"issue":"46","key":"2852_CR7","doi-asserted-by":"publisher","first-page":"18566","DOI":"10.1073\/pnas.1313480110","volume":"110","author":"JM Chan","year":"2013","unstructured":"Chan JM, Carlsson G, Rabadan R. Topology of viral evolution. PNAS. 2013;110(46):18566\u201371.","journal-title":"PNAS."},{"key":"2852_CR8","doi-asserted-by":"publisher","unstructured":"Lum PY, Singh G, Lehman A, Ishkanov T, Vejdemo-Johansson M, Alagappan M, et\u00a0al. Extracting insights from the shape of complex data using topology. Sci Rep. 2013;3. https:\/\/doi.org\/10.1038\/srep01236.","DOI":"10.1038\/srep01236"},{"key":"2852_CR9","doi-asserted-by":"publisher","unstructured":"Mathews JC, Nadeem S, Levine AJ, Pouryahya M, Deasy JO, Tannenbaum A. Robust and interpretable PAM50 reclassification exhibits survival advantage for myoepithelial and immune phenotypes. NPJ Breast Cancer. 2019;5(1). https:\/\/doi.org\/10.1038\/s41523-019-0124-8.","DOI":"10.1038\/s41523-019-0124-8"},{"key":"2852_CR10","doi-asserted-by":"publisher","unstructured":"Nicolau M, Levine AJ, Carlsson G. Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. PNAS. 2011;108(17):7265\u201370. https:\/\/doi.org\/10.1073\/pnas.1102826108.","DOI":"10.1073\/pnas.1102826108"},{"issue":"8","key":"2852_CR11","doi-asserted-by":"publisher","first-page":"957","DOI":"10.1093\/bioinformatics\/btm033","volume":"23","author":"M Nicolau","year":"2007","unstructured":"Nicolau M, Tibshirani R, B\u00f8rresen-Dale AL, Jeffrey SS. Disease-specific genomic analysis: identifying the signature of pathologic biology. Bioinformatics. 2007;23(8):957\u201365.","journal-title":"Bioinformatics."},{"issue":"1","key":"2852_CR12","first-page":"478","volume":"19","author":"M Carriere","year":"2018","unstructured":"Carriere M, Michel B, Oudot S. Statistical analysis and parameter selection for mapper. JMLR. 2018;19(1):478\u2013516.","journal-title":"JMLR."},{"key":"2852_CR13","first-page":"1","volume":"21","author":"F Belch\u0131","year":"2020","unstructured":"Belch\u0131 F, Brodzki J, Burfitt M, Niranjan M. A numerical measure of the instability of Mapper-type algorithms. JMLR. 2020;21:1\u201345.","journal-title":"JMLR."},{"key":"2852_CR14","doi-asserted-by":"crossref","unstructured":"Kang SJ, Lim Y. Ensemble mapper. Stat. 2021;10(1).","DOI":"10.1002\/sta4.405"},{"issue":"7","key":"2852_CR15","doi-asserted-by":"publisher","first-page":"2567","DOI":"10.1007\/s10489-018-01397-x","volume":"49","author":"M Mojarad","year":"2019","unstructured":"Mojarad M, et al. A fuzzy clustering ensemble based on cluster clustering and iterative Fusion of base clusters. Appl Intell. 2019;49(7):2567\u201381.","journal-title":"Appl Intell."},{"key":"2852_CR16","doi-asserted-by":"publisher","first-page":"105107","DOI":"10.1016\/j.knosys.2019.105107","volume":"189","author":"QT Bui","year":"2020","unstructured":"Bui QT, et al. F-Mapper: A Fuzzy Mapper clustering algorithm. Knowl-Based Syst. 2020;189:105107.","journal-title":"Knowl-Based Syst."},{"issue":"2","key":"2852_CR17","doi-asserted-by":"publisher","first-page":"191","DOI":"10.1016\/0098-3004(84)90020-7","volume":"10","author":"JC Bezdek","year":"1984","unstructured":"Bezdek JC, Ehrlich R, Full W. FCM: The fuzzy c-means clustering algorithm. Comput Geosci. 1984;10(2):191\u2013203.","journal-title":"Comput Geosci."},{"key":"2852_CR18","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","volume":"20","author":"PJ Rousseeuw","year":"1987","unstructured":"Rousseeuw PJ. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math. 1987;20:53\u201365.","journal-title":"J Comput Appl Math."},{"key":"2852_CR19","doi-asserted-by":"crossref","unstructured":"Fitzpatrick P, Jurek-Loughrey A, D\u0142otko P, Del\u00a0Rincon JM. Ensemble learning for mapper parameter optimization. In: 2023 IEEE 35th International Conference on Tools with Artificial Intelligence (ICTAI). IEEE; 2023. pp. 129\u2013134.","DOI":"10.1109\/ICTAI59109.2023.00026"},{"issue":"1","key":"2852_CR20","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12859-021-04360-9","volume":"22","author":"E Carr","year":"2021","unstructured":"Carr E, Carri\u00e8re M, Michel B, Chazal F, Iniesta R. Identifying homogeneous subgroups of patients and important features: a topological machine learning approach. BMC Bioinformatics. 2021;22(1):1\u20137.","journal-title":"BMC Bioinformatics."},{"key":"2852_CR21","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine learning in Python. J Mach Learn Res. 2011;12:2825\u201330.","journal-title":"J Mach Learn Res."},{"issue":"7403","key":"2852_CR22","doi-asserted-by":"publisher","first-page":"346","DOI":"10.1038\/nature10983","volume":"486","author":"C Curtis","year":"2012","unstructured":"Curtis C, Shah SP, Chin SF, Turashvili G, Rueda OM, Dunning MJ, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature. 2012;486(7403):346\u201352.","journal-title":"Nature."},{"issue":"7418","key":"2852_CR23","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1038\/nature11412","volume":"490","author":"D Koboldt","year":"2012","unstructured":"Koboldt D, Fulton R, McLellan M, Schmidt H, Kalicki-Veizer J, McMichael J, et al. Comprehensive molecular portraits of human breast tumours. Nature. 2012;490(7418):61\u201370.","journal-title":"Nature."},{"issue":"6","key":"2852_CR24","doi-asserted-by":"publisher","first-page":"580","DOI":"10.1038\/ng.2653","volume":"45","author":"J Lonsdale","year":"2013","unstructured":"Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, et al. The genotype-tissue expression (GTEx) project. Nat Genet. 2013;45(6):580\u20135.","journal-title":"Nat Genet."},{"issue":"11","key":"2852_CR25","doi-asserted-by":"publisher","first-page":"205","DOI":"10.21105\/joss.00205","volume":"2","author":"L McInnes","year":"2017","unstructured":"McInnes L, Healy J, Astels S. hdbscan: Hierarchical density based clustering. JOSS. 2017;2(11):205.","journal-title":"JOSS."},{"issue":"4","key":"2852_CR26","doi-asserted-by":"publisher","first-page":"433","DOI":"10.1002\/wics.101","volume":"2","author":"H Abdi","year":"2010","unstructured":"Abdi H, Williams LJ. Principal component analysis. Wiley Interdiscip Rev Comput Stat. 2010;2(4):433\u201359.","journal-title":"Wiley Interdiscip Rev Comput Stat."},{"key":"2852_CR27","doi-asserted-by":"crossref","unstructured":"McInnes L, Healy J, Melville J. Umap: uniform manifold approximation and projection for dimension reduction. 2018. arXiv preprint arXiv:1802.03426.","DOI":"10.21105\/joss.00861"},{"key":"2852_CR28","unstructured":"Van\u00a0der Maaten L, Hinton G. Visualizing data using t-SNE. J Mach Learn Res. 2008;9(11)."},{"issue":"5552","key":"2852_CR29","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1126\/science.295.5552.7a","volume":"295","author":"M Balasubramanian","year":"2002","unstructured":"Balasubramanian M, Schwartz EL. The isomap algorithm and topological stability. Science. 2002;295(5552):7.","journal-title":"Science."},{"issue":"8","key":"2852_CR30","doi-asserted-by":"publisher","first-page":"1160","DOI":"10.1200\/JCO.2008.18.1370","volume":"27","author":"JS Parker","year":"2009","unstructured":"Parker JS, Mullins M, Cheang MC, Leung S, Voduc D, Vickery T, et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. JCO. 2009;27(8):1160.","journal-title":"JCO."},{"issue":"5","key":"2852_CR31","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/bcr2635","volume":"12","author":"A Prat","year":"2010","unstructured":"Prat A, Parker JS, Karginova O, Fan C, Livasy C, Herschkowitz JI, et al. Phenotypic and molecular characterization of the claudin-low intrinsic subtype of breast cancer. Breast Cancer Res. 2010;12(5):1\u201318.","journal-title":"Breast Cancer Res."},{"issue":"3","key":"2852_CR32","doi-asserted-by":"publisher","first-page":"412","DOI":"10.5306\/wjco.v5.i3.412","volume":"5","author":"O Yersal","year":"2014","unstructured":"Yersal O, Barutca S. Biological subtypes of breast cancer: prognostic and therapeutic implications. WJCO. 2014;5(3):412.","journal-title":"WJCO."},{"key":"2852_CR33","unstructured":"Iniesta R, Carr E, Carriere M, Yerolemou N, Michel B, Chazal F. Topological Data Analysis and its usefulness for precision medicine studies. SORT-Stat Oper Res Trans. 2022;46(1):115\u201336."},{"issue":"3","key":"2852_CR34","doi-asserted-by":"publisher","first-page":"198","DOI":"10.1038\/nrm1857","volume":"7","author":"AR Joyce","year":"2006","unstructured":"Joyce AR, Palsson B\u00d8. The model organism as a system: integrating\u2019omics\u2019 data sets. Nat Rev Mol Cell Biol. 2006;7(3):198\u2013210.","journal-title":"Nat Rev Mol Cell Biol."}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-025-02852-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12911-025-02852-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-025-02852-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T02:06:47Z","timestamp":1742350007000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-025-02852-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,19]]},"references-count":34,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["2852"],"URL":"https:\/\/doi.org\/10.1186\/s12911-025-02852-9","relation":{},"ISSN":["1472-6947"],"issn-type":[{"value":"1472-6947","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,3,19]]},"assertion":[{"value":"6 August 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 January 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 March 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"N\/a.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"N\/a.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"139"}}