{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,26]],"date-time":"2025-12-26T16:50:52Z","timestamp":1766767852340,"version":"3.38.0"},"reference-count":55,"publisher":"IGI Global","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,10,1]]},"abstract":"<p>The accuracy of \u201cstopping rules\u201d for determining the number of clusters in a data set is examined as a function of the underlying clustering algorithm being used. Using a Monte Carlo study, various stopping rules, used in conjunction with six clustering algorithms, are compared to determine which rule\/algorithm combinations best recover the true number of clusters. The rules and algorithms are tested using disparately sized, artificially generated data sets that contained multiple numbers and levels of clusters, variables, noise, outliers, and elongated and unequally sized clusters. The results indicate that stopping rule accuracy depends on the underlying clustering algorithm being used. The cubic clustering criterion (CCC), when used in conjunction with mixture models or Ward\u2019s method, recovers the true number of clusters more accurately than other rules and algorithms. However, the CCC was more likely than other stopping rules to report more clusters than are actually present. 
Implications are discussed.<\/p>","DOI":"10.4018\/jsds.2011100101","type":"journal-article","created":{"date-parts":[[2011,11,16]],"date-time":"2011-11-16T12:53:45Z","timestamp":1321448025000},"page":"1-13","source":"Crossref","is-referenced-by-count":4,"title":["Determination of the Number of Clusters in a Data Set"],"prefix":"10.4018","volume":"2","author":[{"given":"Derrick S.","family":"Boone","sequence":"first","affiliation":[{"name":"Wake Forest University - Schools of Business, USA"}]}],"member":"2432","reference":[{"key":"jsds.2011100101-0","doi-asserted-by":"publisher","DOI":"10.1109\/TAC.1974.1100705"},{"issue":"1-2","key":"jsds.2011100101-1","first-page":"102","article-title":"How do I choose the optimal number of clusters in cluster analysis.","volume":"10","author":"P.Arabie","year":"2001","journal-title":"Journal of Consumer Psychology"},{"key":"jsds.2011100101-2","doi-asserted-by":"crossref","DOI":"10.1142\/1930","author":"P.Arabie","year":"1996","journal-title":"Clustering and classification"},{"key":"jsds.2011100101-3","doi-asserted-by":"publisher","DOI":"10.1007\/BF02294390"},{"key":"jsds.2011100101-4","doi-asserted-by":"publisher","DOI":"10.1016\/0377-2217(96)00046-X"},{"key":"jsds.2011100101-5","doi-asserted-by":"publisher","DOI":"10.1023\/A:1020321132568"},{"key":"jsds.2011100101-6","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8116(02)00080-0"},{"key":"jsds.2011100101-7","first-page":"69","article-title":"Mixture model cluster analysis using model selection criteria and a new informational measure of complexity","author":"H.Bozdogan","year":"1994","journal-title":"Multivariate statistical 
modelling"},{"key":"jsds.2011100101-8","doi-asserted-by":"publisher","DOI":"10.1080\/03610927408827101"},{"key":"jsds.2011100101-9","doi-asserted-by":"publisher","DOI":"10.2307\/3152003"},{"key":"jsds.2011100101-10","doi-asserted-by":"publisher","DOI":"10.1007\/BF01246098"},{"key":"jsds.2011100101-11","doi-asserted-by":"publisher","DOI":"10.2307\/3151899"},{"key":"jsds.2011100101-12","doi-asserted-by":"crossref","unstructured":"Cormack, R. M. (1971). A review of classification. Journal of the Royal Statistical Society, 134(A), 321-67.","DOI":"10.2307\/2344237"},{"key":"jsds.2011100101-13","doi-asserted-by":"publisher","DOI":"10.1016\/0169-2070(94)90004-3"},{"key":"jsds.2011100101-14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM-algorithm.","volume":"39","author":"A. P.Dempster","year":"1977","journal-title":"Journal of the Royal Statistical Society. Series B. Methodological"},{"journal-title":"The bibliography of marketing research methods","year":"1990","author":"J. R.Dickinson","key":"jsds.2011100101-15"},{"key":"jsds.2011100101-16","doi-asserted-by":"publisher","DOI":"10.1007\/s11002-009-9083-4"},{"journal-title":"Pattern classification and scene analysis","year":"1973","author":"R. O.Duda","key":"jsds.2011100101-17"},{"key":"jsds.2011100101-18","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/41.8.578"},{"key":"jsds.2011100101-19","doi-asserted-by":"publisher","DOI":"10.4324\/9780203451519"},{"key":"jsds.2011100101-20","doi-asserted-by":"publisher","DOI":"10.1214\/aos\/1176344071"},{"key":"jsds.2011100101-21","doi-asserted-by":"publisher","DOI":"10.1007\/BF01908064"},{"key":"jsds.2011100101-22","first-page":"558","article-title":"A simplified Monte Carlo significance test procedure.","volume":"30","author":"A. C. A.Hope","year":"1968","journal-title":"Journal of the Royal Statistical Society. Series B. 
Methodological"},{"key":"jsds.2011100101-23","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.79.8.2554"},{"key":"jsds.2011100101-24","doi-asserted-by":"publisher","DOI":"10.1016\/0167-8116(86)90015-7"},{"key":"jsds.2011100101-25","doi-asserted-by":"publisher","DOI":"10.2307\/2286004"},{"key":"jsds.2011100101-26","doi-asserted-by":"publisher","DOI":"10.1007\/BF00195859"},{"key":"jsds.2011100101-27","doi-asserted-by":"publisher","DOI":"10.1016\/S0969-6989(98)00006-X"},{"key":"jsds.2011100101-28","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2008.88"},{"key":"jsds.2011100101-29","doi-asserted-by":"publisher","DOI":"10.1007\/s12543-009-0001-5"},{"key":"jsds.2011100101-30","unstructured":"MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1, 281-297."},{"key":"jsds.2011100101-31","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2002.1114856"},{"key":"jsds.2011100101-32","first-page":"456","article-title":"CLUSTISZ: A program to test for the quality of clustering of a set of objects.","volume":"12","author":"J. O.McClain","year":"1975","journal-title":"JMR, Journal of Marketing Research"},{"journal-title":"Mixture models: Inference and applications to clustering","year":"1988","author":"G. J.McLachlan","key":"jsds.2011100101-33"},{"key":"jsds.2011100101-34","unstructured":"McLachlan, G. J., & Peel, D. (1998). Misfit: An algorithm for automatic fitting and testing of normal mixtures. In Proceedings of the 14th International Conference on Pattern Recognition (Vol. 1, pp. 
553-557)."},{"key":"jsds.2011100101-35","doi-asserted-by":"publisher","DOI":"10.1007\/BF02293907"},{"key":"jsds.2011100101-36","doi-asserted-by":"publisher","DOI":"10.1007\/BF02294153"},{"key":"jsds.2011100101-37","doi-asserted-by":"publisher","DOI":"10.1007\/BF02294245"},{"key":"jsds.2011100101-38","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.1983.4767342"},{"key":"jsds.2011100101-39","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1002\/0471264385.wei0207","article-title":"Clustering and classification methods","volume":"Vol. 2","author":"G. W.Milligan","year":"2003","journal-title":"Comprehensive handbook of psychology"},{"key":"jsds.2011100101-40","doi-asserted-by":"publisher","DOI":"10.2307\/3151680"},{"key":"jsds.2011100101-41","doi-asserted-by":"publisher","DOI":"10.1007\/s10044-007-0099-1"},{"key":"jsds.2011100101-42","article-title":"The cubic clustering criterion","author":"W. S.Sarle","year":"1983","journal-title":"SAS technical report A-108"},{"key":"jsds.2011100101-43","article-title":"The MODECLUS procedure","author":"W. S.Sarle","year":"1993","journal-title":"SAS technical report P-256"},{"key":"jsds.2011100101-44","doi-asserted-by":"publisher","DOI":"10.1214\/aos\/1176344136"},{"key":"jsds.2011100101-45","doi-asserted-by":"publisher","DOI":"10.1037\/0033-295X.86.2.87"},{"journal-title":"Density estimation","year":"1986","author":"B. 
W.Silverman","key":"jsds.2011100101-46"},{"key":"jsds.2011100101-47","doi-asserted-by":"publisher","DOI":"10.2307\/3152014"},{"key":"jsds.2011100101-48","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4615-4651-1","author":"M.Wedel","year":"2000","journal-title":"Market segmentation: Conceptual and methodological foundations"},{"key":"jsds.2011100101-49","doi-asserted-by":"publisher","DOI":"10.1016\/0167-8116(89)90052-9"},{"key":"jsds.2011100101-50","doi-asserted-by":"publisher","DOI":"10.2307\/3172779"},{"key":"jsds.2011100101-51","doi-asserted-by":"publisher","DOI":"10.1207\/s15327906mbr0503_6"},{"key":"jsds.2011100101-52","doi-asserted-by":"publisher","DOI":"10.1207\/s15327906mbr1301_3"},{"key":"jsds.2011100101-53","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1111\/j.2517-6161.1983.tb01262.x","article-title":"A kth nearest neighbor clustering procedure.","volume":"45","author":"M. A.Wong","year":"1983","journal-title":"Journal of the Royal Statistical Society. Series B. 
Methodological"},{"key":"jsds.2011100101-54","first-page":"1","article-title":"Rediscovering market segmentation.","author":"D.Yankelovich","year":"2006","journal-title":"Harvard Business Review"}],"container-title":["International Journal of Strategic Decision Sciences"],"original-title":[],"language":"ng","link":[{"URL":"https:\/\/www.igi-global.com\/viewtitle.aspx?TitleId=60528","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,14]],"date-time":"2025-03-14T03:32:26Z","timestamp":1741923146000},"score":1,"resource":{"primary":{"URL":"https:\/\/services.igi-global.com\/resolvedoi\/resolve.aspx?doi=10.4018\/jsds.2011100101"}},"subtitle":["A Stopping Rule \u00d7 Clustering Algorithm Comparison"],"short-title":[],"issued":{"date-parts":[[2011,10,1]]},"references-count":55,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2011,10]]}},"URL":"https:\/\/doi.org\/10.4018\/jsds.2011100101","relation":{},"ISSN":["1947-8569","1947-8577"],"issn-type":[{"type":"print","value":"1947-8569"},{"type":"electronic","value":"1947-8577"}],"subject":[],"published":{"date-parts":[[2011,10,1]]}}}