{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,5]],"date-time":"2026-05-05T04:56:19Z","timestamp":1777956979401,"version":"3.51.4"},"reference-count":84,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2025,4,13]],"date-time":"2025-04-13T00:00:00Z","timestamp":1744502400000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2025,4,13]],"date-time":"2025-04-13T00:00:00Z","timestamp":1744502400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"name":"WASP-HS","award":["NetX"],"award-info":[{"award-number":["NetX"]}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Big Data & Society"],"published-print":{"date-parts":[[2025,6]]},"abstract":"<jats:p>Synthetic data is increasingly used as a substitute for real data due to ethical, legal, and logistical reasons. However, the rise of synthetic data also raises critical questions about its entanglement with the politics of classification and the reproduction of social norms and categories. This paper aims to problematize the use of synthetic data by examining how its production is intertwined with the maintenance of certain worldviews and classifications. We argue that synthetic data, like real data, is embedded with societal biases and power structures, leading to the reproduction of existing social inequalities. 
Through empirical examples, we demonstrate how synthetic data tends to highlight majority elements as the \u201cnormal\u201d and minimize minority elements, and that the slight changes to the data structures that create synthetic data will also inevitably result in what we term \u201cintersectional hallucinations.\u201d These hallucinations are inherent to synthetic data and cannot be entirely eliminated without compromising the purpose of creating synthetic datasets. We contend that decisions about synthetic data involve determining which intersections are essential and which can be disregarded, a practice which will imbue these decisions with norms and values. Our study underscores the need for critical engagement with the mathematical and statistical choices in synthetic data production and advocates for careful consideration of the ontological and political implications of these choices during curatorial style production of synthetic structured data.<\/jats:p>","DOI":"10.1177\/20539517251318289","type":"journal-article","created":{"date-parts":[[2025,4,13]],"date-time":"2025-04-13T11:31:26Z","timestamp":1744543886000},"update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":9,"title":["The ontological politics of synthetic data: Normalities, outliers, and intersectional hallucinations"],"prefix":"10.1177","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7206-2046","authenticated-orcid":false,"given":"Francis","family":"Lee","sequence":"first","affiliation":[{"name":"Division of Science, Technology, and Society, Chalmers Technical University, Goteborg, Sweden"},{"name":"Link\u00f6ping University"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0176-5852","authenticated-orcid":false,"given":"Saghi","family":"Hajisharif","sequence":"additional","affiliation":[{"name":"Department of Science and Technology, Link\u00f6ping University, Linkoping, 
Sweden"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5041-5018","authenticated-orcid":false,"given":"Ericka","family":"Johnson","sequence":"additional","affiliation":[{"name":"Link\u00f6ping University"}]}],"member":"179","published-online":{"date-parts":[[2025,4,13]]},"reference":[{"key":"e_1_3_3_2_1","unstructured":"Angwin J, Larson J, Mattu S, et al. (2016) Machine Bias: There\u2019s software used across the country to predict future criminals. And it\u2019s biased against blacks. Available at: https:\/\/www.propublica.org\/article\/machine-bias-risk-assessments-in-criminal-sentencing."},{"key":"e_1_3_3_3_1","doi-asserted-by":"publisher","DOI":"10.1177\/0162243912449749"},{"key":"e_1_3_3_4_1","first-page":"442","volume-title":"Sample amplification: Increasing dataset size even when learning is impossible","author":"Axelrod B","year":"2020","unstructured":"Axelrod B, Garg S, Sharan V, et al. (2020) Sample amplification: Increasing dataset size even when learning is impossible. In: International Conference on Machine Learning, pp. 442\u2013451: PMLR."},{"key":"e_1_3_3_5_1","doi-asserted-by":"publisher","DOI":"10.2307\/j.ctv12101zq"},{"key":"e_1_3_3_6_1","doi-asserted-by":"publisher","DOI":"10.23987\/sts.66156"},{"key":"e_1_3_3_7_1","volume-title":"Race After Technology","author":"Benjamin R","year":"2019","unstructured":"Benjamin R (2019) Race After Technology. London: Polity."},{"key":"e_1_3_3_8_1","doi-asserted-by":"publisher","DOI":"10.3390\/e23091165"},{"key":"e_1_3_3_9_1","volume-title":"Democracy\u2019s Data Infrastructure: The Technopolitics of the US Census.","author":"Bouk D","year":"2021","unstructured":"Bouk D, Boyd D (2021) Democracy\u2019s Data Infrastructure: The Technopolitics of the US Census. 
New York: Knight First Amendment Institute."},{"key":"e_1_3_3_10_1","doi-asserted-by":"publisher","DOI":"10.1177\/030631200030005001"},{"key":"e_1_3_3_11_1","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/6352.001.0001"},{"key":"e_1_3_3_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11199-008-9400-z"},{"key":"e_1_3_3_13_1","first-page":"1","article-title":"Gender shades: Intersectional accuracy disparities in commercial gender classification","volume":"81","author":"Buolamwini J","year":"2018","unstructured":"Buolamwini J, Gebru T (2018) Gender shades: Intersectional accuracy disparities in commercial gender classification. Proceedings of Machine Learning Research 81: 1\u201315.","journal-title":"Proceedings of Machine Learning Research"},{"key":"e_1_3_3_14_1","unstructured":"Burkardt J (2023) The truncated normal distribution. Available at: https:\/\/people.sc.fsu.edu\/~jburkardt\/presentations\/truncated_normal.pdf."},{"key":"e_1_3_3_15_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-954X.1998.tb03468.x"},{"key":"e_1_3_3_16_1","doi-asserted-by":"publisher","DOI":"10.1177\/0170840605056393"},{"key":"e_1_3_3_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-024-10759-6"},{"key":"e_1_3_3_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3442188.3445879"},{"key":"e_1_3_3_19_1","doi-asserted-by":"publisher","DOI":"10.1086\/669608"},{"key":"e_1_3_3_20_1","doi-asserted-by":"crossref","unstructured":"Ciston S (2019) Imagining intersectional AI. xCoAx. 2019.","DOI":"10.5399\/uo\/ada.2019.15.5"},{"key":"e_1_3_3_21_1","volume-title":"Black Feminist Thought: Knowledge, Consciousness, and the Politics of Empowerment","author":"Collins PH","unstructured":"Collins PH (2000 [1990]) Black Feminist Thought: Knowledge, Consciousness, and the Politics of Empowerment, 2nd edn. 
New York: Routledge.","edition":"2"},{"key":"e_1_3_3_22_1","first-page":"385","volume-title":"The Oxford Handbook of Feminist Theory","author":"Cooper B","year":"2016","unstructured":"Cooper B (2016) Intersectionality. In: Disch L, Hawkesworth M (eds) The Oxford Handbook of Feminist Theory, Vol. 1. Oxford: Oxford University Press, 385\u2013406."},{"key":"e_1_3_3_23_1","volume-title":"The Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence","author":"Crawford K","year":"2021","unstructured":"Crawford K (2021) The Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. New Haven: Yale University Press."},{"key":"e_1_3_3_24_1","first-page":"139","article-title":"Demarginalizing the intersection of race and sex: A black feminist critique of antidiscrimination doctrine, feminist theory and antiracist politics","volume":"1989","author":"Crenshaw K","year":"1989","unstructured":"Crenshaw K (1989) Demarginalizing the intersection of race and sex: A black feminist critique of antidiscrimination doctrine, feminist theory and antiracist politics. University of Chicago Legal Forum 1989: 139\u2013167.","journal-title":"University of Chicago Legal Forum"},{"key":"e_1_3_3_25_1","unstructured":"DataCebo Blog (2021) Meet the synthetic data vault. Available at: https:\/\/datacebo.com\/blog\/intro-to-sdv\/ (accessed 14 March 2023)."},{"key":"e_1_3_3_26_1","doi-asserted-by":"crossref","unstructured":"Dehdarirad T et al. (2024) Enhancing tabular GAN fairness: The Impact of Intersectional Feature Selection. ICMLA. 2024.","DOI":"10.1109\/ICMLA61862.2024.00176"},{"key":"e_1_3_3_27_1","volume-title":"The Politics of Large Numbers: A History of Statistical Reasoning","author":"Desrosi\u00e8res A","year":"1998","unstructured":"Desrosi\u00e8res A (1998) The Politics of Large Numbers: A History of Statistical Reasoning. 
Cambridge: Harvard University Press."},{"key":"e_1_3_3_28_1","volume-title":"Data Feminism","author":"D\u2019Ignazio C","year":"2019","unstructured":"D\u2019Ignazio C, Klein L (2019) Data Feminism. Cambridge: MIT Press."},{"key":"e_1_3_3_29_1","doi-asserted-by":"publisher","DOI":"10.7208\/chicago\/9780226213118.001.0001"},{"key":"e_1_3_3_30_1","volume-title":"The Order of Things: An Archaeology of the Human Sciences.","author":"Foucault M","year":"2007","unstructured":"Foucault M (2007) The Order of Things: An Archaeology of the Human Sciences. London: Routledge."},{"key":"e_1_3_3_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.aos.2013.11.002"},{"key":"e_1_3_3_32_1","volume-title":"Robin Hood and Matthew effects: Differential privacy has disparate impact on synthetic data","volume":"162","author":"Ganev G","year":"2022","unstructured":"Ganev G, Oprisanu B, De Cristofaro E (2022) Robin Hood and Matthew effects: Differential privacy has disparate impact on synthetic data. In: Proceedings of the 39th International Conference on Machine Learning, vol. 162, 2022. Baltimore, MD: PMLR."},{"key":"e_1_3_3_33_1","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511720482"},{"key":"e_1_3_3_34_1","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/9302.001.0001"},{"key":"e_1_3_3_35_1","unstructured":"Grace-Martin K (2008) Outliers: To drop or not to drop. In: The Analysis Factor. Available at: https:\/\/www.theanalysisfactor.com\/outliers-to-drop-or-not-to-drop\/ (accessed 17 March 2023)."},{"key":"e_1_3_3_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-54204-6_10"},{"key":"e_1_3_3_37_1","doi-asserted-by":"publisher","DOI":"10.1177\/20539517211003118"},{"key":"e_1_3_3_38_1","first-page":"4049","volume-title":"Data amplification: Instance-optimal property estimation","author":"Hao Y","year":"2020","unstructured":"Hao Y, Orlitsky A (2020, November) Data amplification: Instance-optimal property estimation. 
In: International Conference on Machine Learning, pp. 4049\u20134059: PMLR."},{"key":"e_1_3_3_39_1","doi-asserted-by":"publisher","DOI":"10.1177\/0306312706054047"},{"issue":"1","key":"e_1_3_3_40_1","article-title":"This ground truth is muddy anyway. Ground truth assemblages for medical AI development","volume":"2025","author":"H\u00f6gberg C","year":"2025","unstructured":"H\u00f6gberg C (2025) This ground truth is muddy anyway. Ground truth assemblages for medical AI development. Sociologisk Forskning 2025(1). Forthcoming.","journal-title":"Sociologisk Forskning"},{"key":"e_1_3_3_41_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-53946-6"},{"key":"e_1_3_3_42_1","doi-asserted-by":"publisher","DOI":"10.1177\/20539517221145372"},{"key":"e_1_3_3_43_1","first-page":"1","article-title":"The intersectional hallucinations of synthetic data","author":"Johnson E","year":"2024","unstructured":"Johnson E, Hajisharif S (2024) The intersectional hallucinations of synthetic data. AI & Society. 1\u20133. Epub ahead of print. DOI: https:\/\/doi.org\/10.1007\/s00146-024-02017-8.","journal-title":"AI & Society"},{"key":"e_1_3_3_44_1","volume-title":"Data feminism for AI","author":"Klein L","year":"2024","unstructured":"Klein L, D\u2019Ignazio C (2024) Data feminism for AI. In: FAccT \u201824. June 03-03, 2024, Brazil: Rio De Janeiro."},{"key":"e_1_3_3_45_1","doi-asserted-by":"publisher","DOI":"10.23987\/sts.75323"},{"key":"e_1_3_3_46_1","first-page":"97","volume-title":"Sensing In\/Security: Sensors as Transnational Security Infrastructures","author":"Lee F","unstructured":"Lee F (2021b) Sensing Salmonella: Modes of sensing and the politics of sensing infrastructures. In: Witjes N, P\u00f6chhacker N, Bowker GC (eds) Sensing In\/Security: Sensors as Transnational Security Infrastructures. 
London: Mattering Press, 97\u2013131."},{"key":"e_1_3_3_47_1","first-page":"417","article-title":"Ontological overflows and the politics of absence: Zika, disease surveillance, and mosquitos","author":"Lee F","year":"2023","unstructured":"Lee F (2023) Ontological overflows and the politics of absence: Zika, disease surveillance, and mosquitos. Science as Culture 33(1): 417\u2013442.","journal-title":"Science as Culture"},{"key":"e_1_3_3_48_1","doi-asserted-by":"publisher","DOI":"10.1177\/2053951719863819"},{"key":"e_1_3_3_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.2023.3273896"},{"key":"e_1_3_3_50_1","volume-title":"All Data Are Local","author":"Loukissas Y","year":"2022","unstructured":"Loukissas Y (2022) All Data Are Local. Cambridge: MIT Press."},{"key":"e_1_3_3_51_1","first-page":"207","volume-title":"Framing Intersectionality: Debates on a Multi-Faceted Concept in Gender Studies","author":"Lykke N","year":"2011","unstructured":"Lykke N (2011) Intersectional analysis: Black box or useful critical feminist thinking technology. In: Lutz S (ed) Framing Intersectionality: Debates on a Multi-Faceted Concept in Gender Studies. Farnham: Ashgate, 207\u2013219."},{"key":"e_1_3_3_52_1","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/10302.001.0001"},{"key":"e_1_3_3_53_1","doi-asserted-by":"publisher","DOI":"10.1177\/030631277600600310"},{"key":"e_1_3_3_54_1","volume-title":"An Engine, Not a Camera: How Financial Models Shape Markets","author":"MacKenzie D","year":"2008","unstructured":"MacKenzie D (2008) An Engine, Not a Camera: How Financial Models Shape Markets, 1st edn. 
Cambridge: MIT Press.","edition":"1"},{"key":"e_1_3_3_55_1","doi-asserted-by":"publisher","DOI":"10.1177\/0306312713517158"},{"key":"e_1_3_3_56_1","doi-asserted-by":"publisher","DOI":"10.1177\/0306312713517157"},{"key":"e_1_3_3_57_1","doi-asserted-by":"publisher","DOI":"10.1515\/9781400835287-004"},{"key":"e_1_3_3_58_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0234962"},{"key":"e_1_3_3_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3457607"},{"key":"e_1_3_3_60_1","unstructured":"Miceli M, Posada J, Yang T (2021) Studying up machine learning data. ArXiv:2109.08131."},{"key":"e_1_3_3_61_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-954X.1999.tb03483.x"},{"key":"e_1_3_3_62_1","doi-asserted-by":"publisher","DOI":"10.1215\/9780822384151"},{"key":"e_1_3_3_63_1","doi-asserted-by":"publisher","DOI":"10.1177\/07352751221076863"},{"key":"e_1_3_3_64_1","doi-asserted-by":"publisher","DOI":"10.1177\/20539517241249390"},{"key":"e_1_3_3_65_1","doi-asserted-by":"publisher","DOI":"10.1186\/s12939-019-1098-8"},{"key":"e_1_3_3_66_1","unstructured":"Parsa RA, Kim JJ, Katzoff M (2009) Application of the truncated distributions and copulas in masking data. In: Joint Statistical Meetings, pp. 2770\u20132780."},{"key":"e_1_3_3_67_1","first-page":"399","volume-title":"The synthetic data vault","author":"Patki N","year":"2016","unstructured":"Patki N, Wedge R, Veeramachaneni K (2016) The synthetic data vault. In: 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Montreal, QC, Canada, October 2016, pp. 399\u2013410. IEEE. 
Available at: http:\/\/ieeexplore.ieee.org\/document\/7796926\/ (accessed 14 March 2023)."},{"key":"e_1_3_3_68_1","doi-asserted-by":"publisher","DOI":"10.1007\/s43681-024-00419-4"},{"key":"e_1_3_3_69_1","doi-asserted-by":"publisher","DOI":"10.3390\/make4020022"},{"key":"e_1_3_3_70_1","doi-asserted-by":"publisher","DOI":"10.1089\/trgh.2020.0054"},{"key":"e_1_3_3_71_1","unstructured":"Reuters (2018) Amazon scraps secret AI recruiting tool that showed bias against women. Available at: https:\/\/www.reuters.com\/article\/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G (accessed 2 April 2020)."},{"issue":"6","key":"e_1_3_3_72_1","first-page":"1082","article-title":"Intersectionality-informed quantitative research: A primer","volume":"103","author":"Rouhani S","year":"2014","unstructured":"Rouhani S (2014) Intersectionality-informed quantitative research: A primer. American Journal of Public Health 103 (6): 1082.","journal-title":"American Journal of Public Health"},{"issue":"2016","key":"e_1_3_3_73_1","first-page":"4972","article-title":"When the algorithm itself is a racist","volume":"10","author":"Sandvig C","year":"2016","unstructured":"Sandvig C, Hamilton K, Karahalios K, et al. (2016) When the algorithm itself is a racist. International Journal of Communication 10 (2016): 4972\u20134990.","journal-title":"International Journal of Communication"},{"key":"e_1_3_3_74_1","article-title":"Synthetic data could be better than real data","author":"Savage N","year":"2023","unstructured":"Savage N (2023) Synthetic data could be better than real data. Nature Machine Intelligence. Epub ahead of print. 
DOI: 10.1038\/d41586-023-01445-8.","journal-title":"Nature Machine Intelligence"},{"key":"e_1_3_3_75_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41586-024-07566-y"},{"key":"e_1_3_3_76_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-954X.1990.tb03347.x"},{"key":"e_1_3_3_77_1","volume-title":"Human-Machine Reconfigurations","author":"Suchman L","year":"2007","unstructured":"Suchman L (2007) Human-Machine Reconfigurations. Cambridge: Cambridge University Press."},{"key":"e_1_3_3_78_1","doi-asserted-by":"publisher","DOI":"10.1177\/20539517231206794"},{"key":"e_1_3_3_79_1","doi-asserted-by":"crossref","unstructured":"Varley T, Kaminski P (2021) Intersectional synergies: Untangling irreducible effects of intersecting identities via information decomposition. arXiv 1\u201310. DOI: 10.48550\/arXiv.2106.10338.","DOI":"10.3390\/e24101387"},{"key":"e_1_3_3_80_1","doi-asserted-by":"crossref","unstructured":"Verma S, Ruben J (2018) Fairness definitions explained. In: 2018 ACM\/IEEE International Workshop on Software Fairness. FairWare\u201918 Gothenburg Sweden May 29 2018. https:\/\/doi.org\/10.1145\/3194770.3194776.","DOI":"10.1145\/3194770.3194776"},{"key":"e_1_3_3_81_1","volume-title":"Towards intersectionality in machine learning","author":"Wang A","year":"2022","unstructured":"Wang A, et al. (2022) Towards intersectionality in machine learning. In: FAccT \u201822, June 21\u201324, 2022."},{"key":"e_1_3_3_82_1","volume-title":"Race, Rhetoric and Media Studies","author":"Washington M","year":"2017","unstructured":"Washington M (2017) Race, Rhetoric and Media Studies. Jackson, MS: University Press of Mississippi."},{"key":"e_1_3_3_83_1","doi-asserted-by":"publisher","DOI":"10.1038\/d41586-024-02355-z"},{"key":"e_1_3_3_84_1","first-page":"7335","volume-title":"Modeling tabular data using conditional GAN","author":"Xu L","year":"2019","unstructured":"Xu L, Skoularidou M, Cuesta-Infante A, et al. (2019) Modeling tabular data using conditional GAN. 
In: Proceedings of the 33rd International Conference on Neural Information Processing Systems, pp. 7335\u20137345. Red Hook, NY: Curran Associates Inc."},{"key":"e_1_3_3_85_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41746-023-00888-7"}],"container-title":["Big Data & Society"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/20539517251318289","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/20539517251318289","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/20539517251318289","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T13:01:10Z","timestamp":1777381270000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/20539517251318289"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,13]]},"references-count":84,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,6]]}},"alternative-id":["10.1177\/20539517251318289"],"URL":"https:\/\/doi.org\/10.1177\/20539517251318289","relation":{},"ISSN":["2053-9517","2053-9517"],"issn-type":[{"value":"2053-9517","type":"print"},{"value":"2053-9517","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,13]]},"article-number":"20539517251318289"}}