{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,24]],"date-time":"2026-01-24T15:55:00Z","timestamp":1769270100411,"version":"3.49.0"},"reference-count":18,"publisher":"Walter de Gruyter GmbH","issue":"4","license":[{"start":{"date-parts":[[2024,8,8]],"date-time":"2024-08-08T00:00:00Z","timestamp":1723075200000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000057","name":"National Institute of General Medical Sciences","doi-asserted-by":"publisher","award":["P20GM103451"],"award-info":[{"award-number":["P20GM103451"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Correlation coefficients\nand linear regression values computed from group averages can differ from correlation coefficients and linear regression values computed using individual scores. This observation known as the ecological fallacy often assumes that all the individual scores are available from a population. In many situations, one must use a sample from the larger population. In such cases, the computed correlation coefficient and linear regression values will depend on the sample that is chosen and the underlying sampling distribution.\nThe sampling distribution of correlation coefficients and linear regression values for group averages will be identical to the sampling distribution for individuals for normally distributed variables for random samples drawn from infinitely large continuous distributions.\nHowever, data that is acquired in practice is often acquired when sampling without replacement from a finite population. Our objective is to demonstrate through Monte Carlo simulations that the\nsampling distributions for\ncorrelation and linear regression will also be similar for individuals and group averages when sampling without replacement from normally distributed variables. These simulations suggest that when a random sample from a population is selected, the correlation coefficients and linear regression values computed from individual scores will not be more accurate in estimating the entire population values compared to samples when group averages are used as long as the sample size is the same.<\/jats:p>","DOI":"10.1515\/mcma-2024-2013","type":"journal-article","created":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T15:34:48Z","timestamp":1723044888000},"page":"331-363","source":"Crossref","is-referenced-by-count":2,"title":["Investigating the ecological fallacy through sampling distributions constructed from finite populations"],"prefix":"10.1515","volume":"30","author":[{"given":"David J.","family":"Torres","sequence":"first","affiliation":[{"name":"Department of Mathematics and Physical Science , Northern New Mexico College , Espa\u00f1ola , NM 87532 , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Damain","family":"Rouson","sequence":"additional","affiliation":[{"name":"Computer Languages and Systems Software Group , Lawrence Berkeley National Laboratory , Berkeley , California , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"374","published-online":{"date-parts":[[2024,8,8]]},"reference":[{"key":"2024112721095546084_j_mcma-2024-2013_ref_001","doi-asserted-by":"crossref","unstructured":"N.  Cleave, P. J.  Brown and C. D.  Payne,\nEvaluation of methods for ecological inference,\nJ. Roy. Statist. Soc. Ser. A 158 (1995), 55\u201372.","DOI":"10.2307\/2983403"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_002","doi-asserted-by":"crossref","unstructured":"R. A.  Fisher,\nFrequency distribution of the values of the correlation coefficient in samples from an indefinitely large population,\nBiometrika 10 (1915), 507\u2013521.","DOI":"10.1093\/biomet\/10.4.507"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_003","unstructured":"R. A.  Fisher,\nOn the probable error of a coefficient of correlation deduced from a small sample,\nMetron 1 (1921), 3\u201332."},{"key":"2024112721095546084_j_mcma-2024-2013_ref_004","doi-asserted-by":"crossref","unstructured":"R. A.  Fisher,\nThe general sampling distribution of the multiple correlation coefficient,\nProc. Roy. Soc. Lond. Ser. A 121 (1928), 654\u2013673.","DOI":"10.1098\/rspa.1928.0224"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_005","doi-asserted-by":"crossref","unstructured":"H.  Gatignon,\nStatistical Analysis of Management Data, 2nd ed.,\nSpringer, New York, 2010.","DOI":"10.1007\/978-1-4419-1270-1"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_006","doi-asserted-by":"crossref","unstructured":"A. T.  Geronimus and J.  Bound,\nUse of census-based aggregate variables to proxy for socioeconomic group: Evidence from national samples,\nAm. J. Epidemiol. 148 (1988), 475\u2013486.","DOI":"10.1093\/oxfordjournals.aje.a009673"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_007","doi-asserted-by":"crossref","unstructured":"L.  Goodman,\nEcological regressions and behavior of individuals,\nAmer. Sociological Rev. 18 (1953), 663\u2013664.","DOI":"10.2307\/2088121"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_008","unstructured":"L.  Irwin and A. J.  Lichtman,\nAcross the great divide: Inferring individual level behavior from aggregate data,\nPolitical Methodology 3 (1976), 411\u2013439."},{"key":"2024112721095546084_j_mcma-2024-2013_ref_009","doi-asserted-by":"crossref","unstructured":"G.  King,\nA Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data,\nPrinceton University, New Jersey, 1997.","DOI":"10.3886\/ICPSR01132.v1"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_010","doi-asserted-by":"crossref","unstructured":"A. J.  Lichtman,\nCorrelation, regression, and the ecological fallacy: A critique,\nJ. Interdiscip. Hist. 4 (1974), 417\u2013433.","DOI":"10.2307\/202485"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_011","unstructured":"S.  Mahadevan,\nMonte Carlo simulation,\nReliability-Based Mechanical Design,\nMarcel Dekker, New York (1997), 123\u2013146."},{"key":"2024112721095546084_j_mcma-2024-2013_ref_012","unstructured":"R. J.  Muirhead,\nAspects of Multivariate Statistical Theory,\nJohn Wiley & Sons, New Jersey, 2005."},{"key":"2024112721095546084_j_mcma-2024-2013_ref_013","doi-asserted-by":"crossref","unstructured":"S.  Piantadosi, D. P.  Byar and S. B.  Green,\nThe ecological fallacy,\nAm. J. Epidemiol. 127 (1988), 893\u2013904.","DOI":"10.1093\/oxfordjournals.aje.a114892"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_014","doi-asserted-by":"crossref","unstructured":"W. S.  Robinson,\nEcological correlations and the behavior of individuals,\nAmer. Sociological Rev. 15 (1950), 351\u2013357.","DOI":"10.2307\/2087176"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_015","unstructured":"V.  Romanovskij,\nOn the distribution of the regression coefficient in samples from normal population,\nBull. Acad. Sci. URSS 20 (1926), no. 6, 643\u2013648."},{"key":"2024112721095546084_j_mcma-2024-2013_ref_016","doi-asserted-by":"crossref","unstructured":"Y. T.  Shih, C.  Bradley and K. R.  Yabroff,\nEcological and individualistic fallacies in health disparities research,\nJ. National Cancer Inst. 115 (2023), 488\u2013491.","DOI":"10.1093\/jnci\/djad047"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_017","doi-asserted-by":"crossref","unstructured":"D. J.  Torres,\nDescribing the Pearson R distribution of aggregate data,\nMonte Carlo Methods Appl. 1 (2020), 17\u201332.","DOI":"10.1515\/mcma-2020-2054"},{"key":"2024112721095546084_j_mcma-2024-2013_ref_018","doi-asserted-by":"crossref","unstructured":"S. M.  Woodward, D.  Mork, X.  Wu, Z.  Hou, D.  Braun and F.  Dominici,\nCombining aggregate and individual-level data to estimate individual-level associations between air pollution and COVID-19 mortality in the United States,\nPLOS Global Public Health 3 (2023), Article ID e0002178.","DOI":"10.1371\/journal.pgph.0002178"}],"container-title":["Monte Carlo Methods and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/mcma-2024-2013\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/mcma-2024-2013\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,27]],"date-time":"2024-11-27T21:11:07Z","timestamp":1732741867000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/mcma-2024-2013\/html"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,8]]},"references-count":18,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2024,11,14]]},"published-print":{"date-parts":[[2024,12,1]]}},"alternative-id":["10.1515\/mcma-2024-2013"],"URL":"https:\/\/doi.org\/10.1515\/mcma-2024-2013","relation":{},"ISSN":["0929-9629","1569-3961"],"issn-type":[{"value":"0929-9629","type":"print"},{"value":"1569-3961","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,8]]}}}