{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T02:56:51Z","timestamp":1774061811315,"version":"3.50.1"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2022,8,2]],"date-time":"2022-08-02T00:00:00Z","timestamp":1659398400000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Mart\u00ed i Franqu\u00e8s Fellowship Program","award":["2020PMF-PIPF-45"],"award-info":[{"award-number":["2020PMF-PIPF-45"]}]},{"name":"PFR Program","award":["2019PFR-URV-B2-41"],"award-info":[{"award-number":["2019PFR-URV-B2-41"]}]},{"DOI":"10.13039\/501100003329","name":"Ministerio de Econom\u00eda y Competitividad","doi-asserted-by":"publisher","award":["PGC2018-094754-BC21"],"award-info":[{"award-number":["PGC2018-094754-BC21"]}],"id":[{"id":"10.13039\/501100003329","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003329","name":"Ministerio de Econom\u00eda y Competitividad","doi-asserted-by":"publisher","award":["RED2018-102518-T"],"award-info":[{"award-number":["RED2018-102518-T"]}],"id":[{"id":"10.13039\/501100003329","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002809","name":"Generalitat de Catalunya","doi-asserted-by":"publisher","award":["2020PANDE00098"],"award-info":[{"award-number":["2020PANDE00098"]}],"id":[{"id":"10.13039\/501100002809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,9,20]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Agglomerative hierarchical clustering has become a common tool for the analysis and visualization of data, thus being present in a large amount of scientific research and predating all areas of bioinformatics and computational biology. In this work, we focus on a critical problem, the nonuniqueness of the clustering when there are tied distances, for which several solutions exist but are not implemented in most hierarchical clustering packages. We analyze the magnitude of this problem in one particular setting: the clustering of microsatellite markers using the Unweighted Pair-Group Method with Arithmetic Mean. To do so, we have calculated the fraction of publications at the Scopus database in which more than one hierarchical clustering is possible, showing that about 46% of the articles are affected. Additionally, to show the problem from a practical point of view, we selected two opposite examples of articles that have multiple solutions: one with two possible dendrograms, and the other with more than 2.5 million different possible hierarchical clusterings.<\/jats:p>","DOI":"10.1093\/bib\/bbac312","type":"journal-article","created":{"date-parts":[[2022,8,2]],"date-time":"2022-08-02T02:20:55Z","timestamp":1659406855000},"source":"Crossref","is-referenced-by-count":16,"title":["Nonunique UPGMA clusterings of microsatellite markers"],"prefix":"10.1093","volume":"23","author":[{"given":"Nat\u00e0lia","family":"Segura-Alabart","sequence":"first","affiliation":[{"name":"Departament d\u2019Enginyeria Inform\u00e1tica i Matem\u00e1tiques, Universitat Rovira i Virgili , Av. Pa\u00efsos Catalans 26, 43007, Tarragona , Spain"}]},{"given":"Francesc","family":"Serratosa","sequence":"additional","affiliation":[{"name":"Departament d\u2019Enginyeria Inform\u00e1tica i Matem\u00e1tiques, Universitat Rovira i Virgili , Av. Pa\u00efsos Catalans 26, 43007, Tarragona , Spain"}]},{"given":"Sergio","family":"G\u00f3mez","sequence":"additional","affiliation":[{"name":"Departament d\u2019Enginyeria Inform\u00e1tica i Matem\u00e1tiques, Universitat Rovira i Virgili , Av. Pa\u00efsos Catalans 26, 43007, Tarragona , Spain"}]},{"given":"Alberto","family":"Fern\u00e1ndez","sequence":"additional","affiliation":[{"name":"Departament d\u2019Enginyeria Qu\u00e1mica, Universitat Rovira i Virgili , Av. Pa\u00efsos Catalans 26, 43007, Tarragona , Spain"}]}],"member":"286","published-online":{"date-parts":[[2022,8,1]]},"reference":[{"key":"2022092013220420300_ref1","doi-asserted-by":"crossref","first-page":"1347","DOI":"10.1007\/s11033-016-4070-3","article-title":"Molecular markers: a potential resource for ginger genetic diversity studies","volume":"43","author":"Ismail","year":"2016","journal-title":"Mol Biol Rep"},{"issue":"22","key":"2022092013220420300_ref2","doi-asserted-by":"crossref","first-page":"6531","DOI":"10.1093\/nar\/18.22.6531","article-title":"DNA polymorphisms amplified by arbitrary primers are useful as genetic markers","volume":"18","author":"Williams","year":"1990","journal-title":"Nucleic Acids Res"},{"key":"2022092013220420300_ref3","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1007\/BF00564200","article-title":"The comparison of RFLP, RAPD, AFLP and SSR (microsatellite) markers for germplasm analysis","volume":"2","author":"Powell","year":"1996","journal-title":"Mol Breed"},{"key":"2022092013220420300_ref4","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1007\/s001220050961","article-title":"Development, characterization and mapping of microsatellite markers in Eucalyptus grandis and E. urophylla","volume":"97","author":"Brondani","year":"1998","journal-title":"Theor Appl Genet"},{"issue":"6","key":"2022092013220420300_ref5","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1038\/nrg1348","article-title":"Microsatellites: simple sequences with complex evolution","volume":"5","author":"Ellegren","year":"2004","journal-title":"Nat Rev Genet"},{"issue":"16","key":"2022092013220420300_ref6","doi-asserted-by":"crossref","first-page":"6463","DOI":"10.1093\/nar\/17.16.6463","article-title":"Hypervariability of simple sequences as a general source for polymorphic DNA markers","volume":"17","author":"Tautz","year":"1989","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"2022092013220420300_ref7","doi-asserted-by":"crossref","first-page":"183","DOI":"10.5194\/aab-60-183-2017","article-title":"Using microsatellite markers to analyze genetic diversity in 14 sheep types in Iran","volume":"60","author":"Ebrahimi","year":"2017","journal-title":"Arch Anim Breed"},{"issue":"5","key":"2022092013220420300_ref8","doi-asserted-by":"crossref","first-page":"1349","DOI":"10.22438\/jeb\/41\/5(SI)\/MS_28","article-title":"Genetic diversity of banana prawns Fenneropenaeus merguiensis in Malaysian waters using microsatellite markers","volume":"41","author":"Aziz","year":"2020","journal-title":"J Environ Biol"},{"issue":"2","key":"2022092013220420300_ref9","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1007\/s001220100684","article-title":"Molecular characterization and similarity relationships among apricot (Prunus armeniaca L.) genotypes using simple sequence repeats","volume":"104","author":"Hormaza","year":"2002","journal-title":"Theor Appl Genet"},{"issue":"10","key":"2022092013220420300_ref10","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1111\/jph.12848","article-title":"Population structure and linkage disequilibrium in a large collection of Fusarium oxysporum strains analysed through iPBS markers","volume":"167","author":"Ates","year":"2019","journal-title":"J Phytopathol"},{"key":"2022092013220420300_ref11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2017\/6943952","article-title":"Isolation and characterization of thermophilic bacteria from Jordanian hot springs: Bacillus licheniformis and Thermomonas hydrothermalis isolates as potential producers of thermostable enzymes","volume":"2017","author":"Mohammad","year":"2017","journal-title":"Int J Microbiol"},{"key":"2022092013220420300_ref12","first-page":"47","volume-title":"2010 International Conference on Educational and Information Technology (ICEIT 2010)","author":"Han","year":"2010"},{"issue":"2","key":"2022092013220420300_ref13","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1093\/oxfordjournals.molbev.a025590","article-title":"Multiple UPGMA and neighbor-joining trees and the performance of some computer packages","volume":"13","author":"Backeljau","year":"1996","journal-title":"Mol Biol Evol"},{"key":"2022092013220420300_ref14","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1038\/212218a0","article-title":"A generalized sorting strategy for computer classifications","volume":"212","author":"Lance","year":"1966","journal-title":"Nature"},{"issue":"3","key":"2022092013220420300_ref15","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1093\/genetics\/89.3.583","article-title":"Estimation of average heterozygosity and genetic distance from a small number of individuals","volume":"89","author":"Nei","year":"1978","journal-title":"Genetics"},{"issue":"4","key":"2022092013220420300_ref16","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1186\/s13321-016-0114-x","article-title":"How frequently do clusters occur in hierarchical clustering analysis? A graph theoretical approach to studying ties in proximity","volume":"8","author":"Leal","year":"2016","journal-title":"J Chem"},{"key":"2022092013220420300_ref17","doi-asserted-by":"crossref","first-page":"254","DOI":"10.1007\/978-3-642-69024-2_30","article-title":"The occurrence of multiple UPGMA phenograms","volume":"1","author":"Hart","year":"1983","journal-title":"Numer Taxon"},{"key":"2022092013220420300_ref18","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1021\/ci000069q","article-title":"Ties in proximity and clustering compounds","volume":"41","author":"MacCuish","year":"2001","journal-title":"J Chem Inf Comput Sci"},{"key":"2022092013220420300_ref19","doi-asserted-by":"crossref","first-page":"153","DOI":"10.2307\/3237253","article-title":"On the sensitivity of ordination and classification methods to variation in the input order of data","volume":"8","author":"Podani","year":"1997","journal-title":"J Veg Sci"},{"issue":"4","key":"2022092013220420300_ref20","article-title":"Genetic diversity of grasspea and its relative species revealed by SSR markers","volume":"10","author":"Wang","year":"2015","journal-title":"PLoS One"},{"key":"2022092013220420300_ref21","volume-title":"The R stats package","author":"R Core Team","year":"2021"},{"key":"2022092013220420300_ref22","volume-title":"cluster: cluster analysis basics and extensions","author":"Maechler","year":"2021"},{"issue":"Oct","key":"2022092013220420300_ref23","first-page":"2825","article-title":"Scikit-learn: machine learning in python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"key":"2022092013220420300_ref24","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1038\/s41592-019-0686-2","article-title":"SciPy 1.0: fundamental algorithms for scientific computing in python","volume":"17","author":"Virtanen","year":"2020","journal-title":"Nat Methods"},{"key":"2022092013220420300_ref25","volume-title":"version 7.10.0 (R2010a)","author":"MATLAB","year":"2010"},{"issue":"3","key":"2022092013220420300_ref26","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1093\/bioinformatics\/bti021","article-title":"Iterative cluster analysis of protein interaction data","volume":"21","author":"Arnau","year":"2005","journal-title":"Bioinformatics"},{"key":"2022092013220420300_ref27","article-title":"Orders and overlapping clusters by pyramids","author":"Diday"},{"key":"2022092013220420300_ref28","first-page":"35","volume-title":"Partitioning Data Sets. DIMACS Series in Discrete Mathematics and Theoretical Computer Science","author":"Bertrand","year":"1995"},{"key":"2022092013220420300_ref29","first-page":"486","volume-title":"Proceedings from the 13th European Symposium on Quantitative Structure-Activity Relationships","author":"Nicolaou","year":"2000"},{"key":"2022092013220420300_ref30","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1007\/s00357-008-9004-x","article-title":"Solving non-uniqueness in agglomerative hierarchical clustering using multidendrograms","volume":"25","author":"Fern\u00e1ndez","year":"2008","journal-title":"J Classif"},{"issue":"11","key":"2022092013220420300_ref31","doi-asserted-by":"crossref","first-page":"1403","DOI":"10.1093\/bioinformatics\/btn129","article-title":"adegenet: a R package for the multivariate analysis of genetic markers","volume":"24","author":"Jombart","year":"2008","journal-title":"Bioinformatics"},{"issue":"21","key":"2022092013220420300_ref32","doi-asserted-by":"crossref","first-page":"3070","DOI":"10.1093\/bioinformatics\/btr521","article-title":"adegenet 1.3-1: new tools for the analysis of genome-wide SNP data","volume":"27","author":"Jombart","year":"2011","journal-title":"Bioinformatics"},{"key":"2022092013220420300_ref33","volume-title":"R: a language and environment for statistical computing","author":"R Core Team","year":"2021"},{"key":"2022092013220420300_ref34","doi-asserted-by":"crossref","first-page":"584","DOI":"10.1007\/s00357-019-09339-z","article-title":"Versatile linkage: a family of space-conserving strategies for agglomerative hierarchical clustering","volume":"37","author":"Fern\u00e1ndez","year":"2020","journal-title":"J Classif"},{"key":"2022092013220420300_ref35","article-title":"Radatools 5.2: communities detection in complex networks and other tools","author":"G\u00f3mez","year":"2021"},{"issue":"12","key":"2022092013220420300_ref36","doi-asserted-by":"crossref","first-page":"5464","DOI":"10.3390\/e15125464","article-title":"Structural patterns in complex systems using multidendrograms","volume":"15","author":"G\u00f3mez","year":"2013","journal-title":"Entropy"},{"issue":"2","key":"2022092013220420300_ref37","doi-asserted-by":"crossref","first-page":"76","DOI":"10.5213\/inj.1632742.371","article-title":"Trends in next-generation sequencing and a new era for whole genome sequencing","volume":"20","author":"Park","year":"2016","journal-title":"Int Neurourol J"},{"issue":"11","key":"2022092013220420300_ref38","doi-asserted-by":"crossref","first-page":"1523","DOI":"10.1093\/bioinformatics\/18.11.1523","article-title":"Euclidian space and grouping of biological objects","volume":"18","author":"Grishin","year":"2002","journal-title":"Bioinformatics"},{"issue":"2","key":"2022092013220420300_ref39","doi-asserted-by":"crossref","first-page":"105","DOI":"10.5614\/j.math.fund.sci.2017.49.2.1","article-title":"Genetic diversity of Indigofera tinctoria L. in Java and Madura islands as natural batik dye based on intersimple sequence repeat markers","volume":"49","author":"Hariri","year":"2017","journal-title":"J Math Fund Sci"},{"issue":"19","key":"2022092013220420300_ref40","doi-asserted-by":"crossref","first-page":"8284","DOI":"10.3390\/su12198284","article-title":"Assessing genetic diversity and population structure of Kalmia latifolia L. in the eastern United States: an essential step towards breeding for adaptability to southeastern environmental conditions","volume":"12","author":"Li","year":"2020","journal-title":"Sustainability"},{"issue":"4","key":"2022092013220420300_ref41","doi-asserted-by":"crossref","first-page":"174","DOI":"10.3923\/biotech.2014.174.180","article-title":"DNA fingerprinting and genetic diversity analysis of chilli germplasm using microsatellite markers","volume":"13","author":"Hossain","year":"2014","journal-title":"Biotechnology"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/5\/bbac312\/45937168\/bbac312.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/5\/bbac312\/45937168\/bbac312.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,9,20]],"date-time":"2022-09-20T17:52:30Z","timestamp":1663696350000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac312\/6652780"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,1]]},"references-count":41,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,9,20]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac312","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,9]]},"published":{"date-parts":[[2022,8,1]]},"article-number":"bbac312"}}