{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T05:01:04Z","timestamp":1770526864938,"version":"3.49.0"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2022,3,8]],"date-time":"2022-03-08T00:00:00Z","timestamp":1646697600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,3,8]],"date-time":"2022-03-08T00:00:00Z","timestamp":1646697600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Adv Data Anal Classif"],"published-print":{"date-parts":[[2024,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Many stochastic models in economics and finance are described by distributions with a lognormal body. Testing for a possible Pareto tail and estimating the parameters of the Pareto distribution in these models is an important topic. Although the problem has been extensively studied in the literature, most applications are characterized by some weaknesses. We propose a method that exploits all the available information by taking into account the data generating process of the whole population. After estimating a lognormal\u2013Pareto mixture with a known threshold via the EM algorithm, we exploit this result to develop an unsupervised tail estimation approach based on the maximization of the profile likelihood function. Monte Carlo experiments and two empirical applications to the size of US metropolitan areas and of firms in an Italian district confirm that the proposed method works well and outperforms two commonly used techniques. Simulation results are available in an online supplementary appendix.\n<\/jats:p>","DOI":"10.1007\/s11634-022-00497-4","type":"journal-article","created":{"date-parts":[[2022,3,8]],"date-time":"2022-03-08T13:02:33Z","timestamp":1646744553000},"page":"251-269","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["On discriminating between lognormal and Pareto tail: an unsupervised mixture-based approach"],"prefix":"10.1007","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9579-3650","authenticated-orcid":false,"given":"Marco","family":"Bee","sequence":"first","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,3,8]]},"reference":[{"key":"497_CR1","doi-asserted-by":"publisher","first-page":"e0257762","DOI":"10.1371\/journal.pone.0257762","volume":"16","author":"M Abdul Majid","year":"2021","unstructured":"Abdul Majid M, Ibrahim K (2021) On Bayesian approach to composite Pareto models. PLoS ONE 16:e0257762","journal-title":"PLoS ONE"},{"key":"497_CR2","doi-asserted-by":"publisher","first-page":"146","DOI":"10.1016\/j.insmatheco.2014.08.008","volume":"61","author":"S Abu Bakar","year":"2015","unstructured":"Abu Bakar S, Hamzah N, Maghsoudi M, Nadarajah S (2015) Modeling loss data using composite models. Insur Math Econom 61:146\u2013154","journal-title":"Insur Math Econom"},{"issue":"5536","key":"497_CR3","doi-asserted-by":"publisher","first-page":"1818","DOI":"10.1126\/science.1062081","volume":"293","author":"RL Axtell","year":"2001","unstructured":"Axtell RL (2001) Zipf distribution of U.S. firm sizes. Science 293(5536):1818\u20131820","journal-title":"Science"},{"key":"497_CR4","doi-asserted-by":"publisher","first-page":"265","DOI":"10.1016\/j.physa.2017.04.012","volume":"481","author":"M Bee","year":"2017","unstructured":"Bee M, Riccaboni M, Schiavo S (2017) Where Gibrat meets Zipf: scale and scope of French firms. Physica A 481:265\u2013275","journal-title":"Physica A"},{"key":"497_CR5","doi-asserted-by":"publisher","first-page":"213","DOI":"10.1007\/s10888-019-09431-9","volume":"18","author":"M Benzidia","year":"2020","unstructured":"Benzidia M, Lubrano M (2020) A Bayesian look at American academic wages: from wage dispersion to wage compression. J Econ Inequal 18:213\u2013238","journal-title":"J Econ Inequal"},{"issue":"Supplement 1","key":"497_CR6","doi-asserted-by":"publisher","first-page":"S17","DOI":"10.1016\/j.cities.2011.11.007","volume":"29","author":"BJ Berry","year":"2012","unstructured":"Berry BJ, Okulicz-Kozaryn A (2012) The city size distribution debate: resolution for US urban regions and megalopolitan areas. Cities 29(Supplement 1):S17\u2013S23","journal-title":"Cities"},{"issue":"3","key":"497_CR7","doi-asserted-by":"publisher","first-page":"561","DOI":"10.1016\/S0167-9473(02)00163-9","volume":"41","author":"C Biernacki","year":"2003","unstructured":"Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput Stat Data Anal 41(3):561\u2013575","journal-title":"Comput Stat Data Anal"},{"key":"497_CR8","doi-asserted-by":"publisher","first-page":"661","DOI":"10.1137\/070710111","volume":"51","author":"A Clauset","year":"2009","unstructured":"Clauset A, Shalizi CR, Newman MEJ (2009) Power-law distributions in empirical data. SIAM Rev 51:661\u2013673","journal-title":"SIAM Rev"},{"key":"497_CR9","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-12381-9","volume-title":"The mathematics of urban morphology","author":"L D\u2019Acci","year":"2019","unstructured":"D\u2019Acci L (2019) The mathematics of urban morphology. Birkh\u00e4user, Boston"},{"key":"497_CR10","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1080\/01621459.1999.10474147","volume":"94","author":"J Del Castillo","year":"1999","unstructured":"Del Castillo J, Puig P (1999) The best test of exponentiality against singly truncated normal alternatives. J Am Stat Assoc 94:529\u2013532","journal-title":"J Am Stat Assoc"},{"issue":"1","key":"497_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","volume":"39","author":"AP Dempster","year":"1977","unstructured":"Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B 39(1):1\u201338","journal-title":"J R Stat Soc B"},{"issue":"1","key":"497_CR12","doi-asserted-by":"publisher","first-page":"42","DOI":"10.1016\/j.jinteco.2011.05.003","volume":"85","author":"J Di Giovanni","year":"2011","unstructured":"Di Giovanni J, Levchenko AA, Ranci\u00e8re R (2011) Power laws in firm size and openness to trade: measurement and implications. J Int Econ 85(1):42\u201352","journal-title":"J Int Econ"},{"issue":"5","key":"497_CR13","doi-asserted-by":"publisher","first-page":"1429","DOI":"10.1257\/0002828043052303","volume":"94","author":"J Eeckhout","year":"2004","unstructured":"Eeckhout J (2004) Gibrat\u2019s law for (all) cities. Am Econ Rev 94(5):1429\u201351","journal-title":"Am Econ Rev"},{"issue":"4","key":"497_CR14","doi-asserted-by":"publisher","first-page":"1676","DOI":"10.1257\/aer.99.4.1676","volume":"99","author":"J Eeckhout","year":"2009","unstructured":"Eeckhout J (2009) Gibrat\u2019s law for (all) cities: reply. Am Econ Rev 99(4):1676\u201383","journal-title":"Am Econ Rev"},{"issue":"5","key":"497_CR15","doi-asserted-by":"publisher","first-page":"736","DOI":"10.1111\/jors.12205","volume":"55","author":"G Fazio","year":"2015","unstructured":"Fazio G, Modica M (2015) Pareto or log-normal? best fit and truncation in the distribution of all cities. J Reg Sci 55(5):736\u2013756","journal-title":"J Reg Sci"},{"key":"497_CR16","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-2765-4","volume-title":"A first course in multivariate statistics","author":"B Flury","year":"1997","unstructured":"Flury B (1997) A first course in multivariate statistics. Springer, Berlin"},{"issue":"5","key":"497_CR17","doi-asserted-by":"publisher","first-page":"219","DOI":"10.1023\/A:1024072610684","volume":"3","author":"A Frigessi","year":"2002","unstructured":"Frigessi A, Haug O, Rue H (2002) A dynamic mixture model for unsupervised tail estimation without threshold selection. Extremes 3(5):219\u2013235","journal-title":"Extremes"},{"key":"497_CR18","doi-asserted-by":"publisher","first-page":"255","DOI":"10.1146\/annurev.economics.050708.142940","volume":"1","author":"X Gabaix","year":"2009","unstructured":"Gabaix X (2009) Power laws in economics and finance. Annu Rev Econ 1:255\u201393","journal-title":"Annu Rev Econ"},{"issue":"1","key":"497_CR19","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1198\/jbes.2009.06157","volume":"29","author":"X Gabaix","year":"2011","unstructured":"Gabaix X, Ibragimov R (2011) Rank-1\/2: a simple way to improve the OLS estimation of tail exponents. J Bus Econ Stat 29(1):24\u201339","journal-title":"J Bus Econ Stat"},{"issue":"2","key":"497_CR20","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1111\/insr.12058","volume":"83","author":"M Gomes","year":"2015","unstructured":"Gomes M, Guillou A (2015) Extreme value theory and statistics of univariate extremes: a review. Int Stat Rev 83(2):263\u2013292","journal-title":"Int Stat Rev"},{"issue":"1","key":"497_CR21","doi-asserted-by":"publisher","first-page":"177","DOI":"10.1111\/pirs.12037","volume":"94","author":"R Gonz\u00e1lez-Val","year":"2015","unstructured":"Gonz\u00e1lez-Val R, Ramos A, Sanz-Gracia F, Vera-Cabello M (2015) Size distributions for all cities: Which one is best? Pap Reg Sci 94(1):177\u2013196","journal-title":"Pap Reg Sci"},{"key":"497_CR22","doi-asserted-by":"publisher","first-page":"1892","DOI":"10.1214\/13-AOS1137","volume":"41","author":"P Hall","year":"2013","unstructured":"Hall P, Horowitz J (2013) A simple bootstrap method for constructing nonparametric confidence bands for functions. Ann Stat 41:1892\u20131921","journal-title":"Ann Stat"},{"issue":"563","key":"497_CR23","doi-asserted-by":"publisher","first-page":"903","DOI":"10.1111\/j.1468-0297.2012.02518.x","volume":"122","author":"W-T Hsu","year":"2012","unstructured":"Hsu W-T (2012) Central place theory and city size distribution. Econ J 122(563):903\u2013932","journal-title":"Econ J"},{"issue":"1","key":"497_CR24","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1016\/j.jue.2012.06.005","volume":"73","author":"Y Ioannides","year":"2013","unstructured":"Ioannides Y, Skouras S (2013) US city size distribution: robustly Pareto, but only in the tail. J Urban Econ 73(1):18\u201329","journal-title":"J Urban Econ"},{"key":"497_CR25","doi-asserted-by":"publisher","DOI":"10.1002\/0471457175","volume-title":"Statistical size distributions in economics and actuarial sciences","author":"C Kleiber","year":"2003","unstructured":"Kleiber C, Kotz S (2003) Statistical size distributions in economics and actuarial sciences. Wiley, New York"},{"key":"497_CR26","volume-title":"Loss models: from data to decisions","author":"SA Klugman","year":"2004","unstructured":"Klugman SA, Panjer HH, Willmot GE (2004) Loss models: from data to decisions, 2nd edn. Wiley, New York","edition":"2"},{"key":"497_CR27","unstructured":"Kondo I, Lewis L, Stella A (2021) Heavy tailed, but not Zipf: firm and establishment size in the U.S. U.S. Census working paper number CES-21-15"},{"issue":"4","key":"497_CR28","doi-asserted-by":"publisher","first-page":"1672","DOI":"10.1257\/aer.99.4.1672","volume":"99","author":"M Levy","year":"2009","unstructured":"Levy M (2009) Gibrat\u2019s law for (all) cities: comment. Am Econ Rev 99(4):1672\u201375","journal-title":"Am Econ Rev"},{"key":"497_CR29","doi-asserted-by":"crossref","unstructured":"Malevergne Y, Pisarenko V, Sornette D (2009) Gibrat\u2019s law for cities: uniformly most powerful unbiased test of the Pareto against the lognormal. Swiss Finance Institute Research Paper Series, pp 09\u201340","DOI":"10.2139\/ssrn.1479481"},{"key":"497_CR30","doi-asserted-by":"publisher","DOI":"10.1002\/9780470191613","volume-title":"The EM algorithm and extensions","author":"G McLachlan","year":"2008","unstructured":"McLachlan G, Krishnan T (2008) The EM algorithm and extensions, 2nd edn. Wiley, New York","edition":"2"},{"issue":"1","key":"497_CR31","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1016\/S0165-1765(01)00524-9","volume":"74","author":"W Reed","year":"2001","unstructured":"Reed W (2001) The Pareto, Zipf and other power laws. Econ Lett 74(1):15\u201319","journal-title":"Econ Lett"},{"issue":"5","key":"497_CR32","doi-asserted-by":"publisher","first-page":"2205","DOI":"10.1257\/aer.101.5.2205","volume":"101","author":"H Rozenfeld","year":"2011","unstructured":"Rozenfeld H, Rybski D, Gabaix X, Makse H (2011) The area and population of cities: new insights from a different perspective on cities. Am Econ Rev 101(5):2205\u201325","journal-title":"Am Econ Rev"},{"key":"497_CR33","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1080\/03461230601110447","volume":"1","author":"DPM Scollnik","year":"2007","unstructured":"Scollnik DPM (2007) On composite lognormal\u2013Pareto models. Scand Actuar J 1:20\u201333","journal-title":"Scand Actuar J"},{"issue":"398","key":"497_CR34","doi-asserted-by":"publisher","first-page":"605","DOI":"10.1080\/01621459.1987.10478472","volume":"82","author":"SG Self","year":"1987","unstructured":"Self SG, Liang K-Y (1987) Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J Am Stat Assoc 82(398):605\u2013610","journal-title":"J Am Stat Assoc"},{"key":"497_CR35","doi-asserted-by":"publisher","first-page":"659","DOI":"10.1007\/s00181-014-0883-x","volume":"49","author":"A Tang","year":"2015","unstructured":"Tang A (2015) Does Gibrat\u2019s law hold for Swedish energy firms? Empir Econ 49:659-674","journal-title":"Empir Econ"},{"key":"497_CR36","volume-title":"Statistical analysis of finite mixture distributions","author":"D Titterington","year":"1985","unstructured":"Titterington D, Smith A, Makov U (1985) Statistical analysis of finite mixture distributions. Wiley, New York"},{"issue":"1","key":"497_CR37","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1214\/aos\/1176346060","volume":"11","author":"CFJ Wu","year":"1983","unstructured":"Wu CFJ (1983) On the convergence properties of the EM algorithm. Ann Stat 11(1):95\u2013103","journal-title":"Ann Stat"}],"container-title":["Advances in Data Analysis and Classification"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-022-00497-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11634-022-00497-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11634-022-00497-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,19]],"date-time":"2024-06-19T08:18:33Z","timestamp":1718785113000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11634-022-00497-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,8]]},"references-count":37,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,6]]}},"alternative-id":["497"],"URL":"https:\/\/doi.org\/10.1007\/s11634-022-00497-4","relation":{},"ISSN":["1862-5347","1862-5355"],"issn-type":[{"value":"1862-5347","type":"print"},{"value":"1862-5355","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,8]]},"assertion":[{"value":"13 November 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 February 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 February 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 March 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 July 2022","order":5,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Update","order":6,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Missing Open Access funding information has been added in the Funding Note.","order":7,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}}]}}