{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:43:15Z","timestamp":1777455795321,"version":"3.51.4"},"reference-count":54,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T00:00:00Z","timestamp":1672531200000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100010661","name":"Horizon 2020 Framework Programme","doi-asserted-by":"publisher","award":["ERC-2019-AdG-883107-ALGOSOC"],"award-info":[{"award-number":["ERC-2019-AdG-883107-ALGOSOC"]}],"id":[{"id":"10.13039\/100010661","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Big Data & Society"],"published-print":{"date-parts":[[2023,1]]},"abstract":"<jats:p>Machine-learning algorithms have become deeply embedded in contemporary society. As such, ample attention has been paid to the contents, biases, and underlying assumptions of the training datasets that many algorithmic models are trained on. Yet, what happens when algorithms are trained on data that are not real, but instead data that are \u2018synthetic\u2019, not referring to real persons, objects, or events? Increasingly, synthetic data are being incorporated into the training of machine-learning algorithms for use in various societal domains. There is currently little understanding, however, of the role played by and the ethicopolitical implications of synthetic training data for machine-learning algorithms. In this article, I explore the politics of synthetic data through two central aspects: first, synthetic data promise to emerge as a rich source of exposure to variability for the algorithm. Second, the paper explores how synthetic data promise to place algorithms beyond the realm of risk. 
I propose that an analysis of these two areas will help us better understand the ways in which machine-learning algorithms are envisioned in the light of synthetic data, but also how synthetic training data actively reconfigure the conditions of possibility for machine learning in contemporary society.<\/jats:p>","DOI":"10.1177\/20539517221145372","type":"journal-article","created":{"date-parts":[[2023,1,17]],"date-time":"2023-01-17T08:08:12Z","timestamp":1673942892000},"update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":68,"title":["Machine learning and the politics of synthetic data"],"prefix":"10.1177","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6656-8892","authenticated-orcid":false,"given":"Benjamin N","family":"Jacobsen","sequence":"first","affiliation":[{"name":"Department of Geography, Durham University, Durham, UK"}]}],"member":"179","published-online":{"date-parts":[[2023,1,17]]},"reference":[{"key":"bibr1-20539517221145372","doi-asserted-by":"publisher","DOI":"10.4135\/9781446219539"},{"key":"bibr2-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1215\/9780822395324"},{"key":"bibr3-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1057\/9781137290755"},{"key":"bibr4-20539517221145372","unstructured":"Amaro R (2020) Threshold Value. E-Flux Architecture. 
Available at: https:\/\/www.e-flux.com\/architecture\/education\/322664\/threshold-value\/."},{"key":"bibr5-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1080\/13621020802586628"},{"key":"bibr6-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1177\/0263276411417430"},{"key":"bibr7-20539517221145372","volume-title":"Cloud Ethics: Machine Learning and the Attributes of Ourselves and Others","author":"Amoore L","year":"2020"},{"key":"bibr8-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1080\/17439884.2020.1686014"},{"key":"bibr9-20539517221145372","unstructured":"Angwin J, Larson J, Mattu S et al. (2016) Machine Bias. ProPublica. Available at: https:\/\/www.propublica.org\/article\/machine-bias-risk-assessments-in-criminal-sentencing."},{"key":"bibr10-20539517221145372","doi-asserted-by":"crossref","unstructured":"Bansal M, Krizhevsky A, Ogale A (2019) ChauffeurNet: Learning to drive by imitating the best and synthesizing the worst. In Robotics: Science and Systems, Freiburg im Breisgau, June 22\u201326, 2019, pp. 
1\u201320.","DOI":"10.15607\/RSS.2019.XV.031"},{"key":"bibr11-20539517221145372","volume-title":"Risk Society: Towards a New Modernity","author":"Beck","year":"1992"},{"key":"bibr12-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1177\/2053951716646135"},{"key":"bibr13-20539517221145372","doi-asserted-by":"publisher","DOI":"10.4135\/9781526463210"},{"key":"bibr14-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1086\/521564"},{"key":"bibr15-20539517221145372","volume-title":"On Justification: Economies of Worth","author":"Boltanski L","year":"2006"},{"key":"bibr16-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1080\/1369118X.2012.678878"},{"key":"bibr17-20539517221145372","first-page":"1","volume":"20","author":"Bruder J","year":"2021","journal-title":"Culture Machine"},{"key":"bibr18-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1093\/oso\/9780190493028.001.0001"},{"key":"bibr19-20539517221145372","first-page":"1","volume":"81","author":"Buolamwini J","year":"2018","journal-title":"Proceedings of Machine Learning Research"},{"key":"bibr20-20539517221145372","volume":"5","author":"Chen RJ","year":"2021","journal-title":"Nature"},{"key":"bibr21-20539517221145372","volume-title":"Deep Learning","author":"Courville A","year":"2016"},{"key":"bibr22-20539517221145372","volume-title":"Techniques of the Observer: On Vision and Modernity in the Nineteenth Century","author":"Crary J","year":"1990"},{"key":"bibr23-20539517221145372","doi-asserted-by":"publisher","DOI":"10.2307\/j.ctv1ghv45t"},{"key":"bibr24-20539517221145372","unstructured":"Crawford K, Paglen T (2019) Excavating AI: The politics of images in machine learning training sets. September 19. 
Available at: https:\/\/excavating.ai\/."},{"key":"bibr25-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1177\/20539517211035955"},{"key":"bibr26-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1080\/1369118X.2020.1754877"},{"key":"bibr27-20539517221145372","first-page":"1","author":"Diakopoulos N","year":"2020","journal-title":"New Media & Society"},{"key":"bibr28-20539517221145372","first-page":"1","author":"Dourish P","year":"2016","journal-title":"Big Data & Society"},{"key":"bibr29-20539517221145372","first-page":"197","volume-title":"The Foucault Effect: Studies in Governmentality","author":"Ewald F","year":"1991"},{"key":"bibr30-20539517221145372","volume-title":"Security, Territory, Population: Lectures at the College de France 1977\u20131978","author":"Foucault M","year":"2007"},{"key":"bibr31-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2018.09.013"},{"key":"bibr32-20539517221145372","unstructured":"Gebru T, Morgenstern J, Vecchione B, et al. (2020) Datasheets for datasets. ArXiv: 1\u201318."},{"key":"bibr33-20539517221145372","unstructured":"Gentric S (2021) Deep learning, a key technology behind IDEMIA\u2019s algorithms. Idemia. Available at: https:\/\/www.idemia.com\/news\/deep-learning-key-technology-behind-idemias-algorithms-2021-07-26."},{"key":"bibr34-20539517221145372","unstructured":"Goodfellow I, Pouget-Abadie J, Mirza M, et al. (2014) Generative adversarial nets. Proceedings of the International Conference on Neural Information Processing Systems (NIPS), pp. 
1\u20139."},{"key":"bibr35-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-56286-1_8"},{"key":"bibr36-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1086\/717313"},{"key":"bibr37-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1086\/717320"},{"key":"bibr38-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1162\/GREY_a_00221"},{"key":"bibr39-20539517221145372","unstructured":"Heaven WD (2020) Our weird behavior during the pandemic is messing with AI models. MIT Technology Review. Available at: https:\/\/www.technologyreview.com\/2020\/05\/11\/1001563\/covid-pandemic-broken-ai-machine-learning-amazon-retail-fraud-humans-in-the-loop\/."},{"key":"bibr40-20539517221145372","unstructured":"Heaven WD (2021) Synthetic Data for AI. MIT Technology Review. Available at: https:\/\/www.technologyreview.com\/2022\/02\/23\/1044965\/ai-synthetic-data-2\/."},{"key":"bibr41-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1177\/1461444820958725"},{"key":"bibr42-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1016\/j.patter.2021.100241"},{"key":"bibr43-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1145\/3442188.3445918"},{"key":"bibr44-20539517221145372","volume-title":"The Ethics of Invention: Technology and the Human Future","author":"Jasanoff S","year":"2016"},{"key":"bibr45-20539517221145372","first-page":"1","author":"Kitchin R","year":"2014","journal-title":"Big Data & Society"},{"key":"bibr46-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58621-8_18"},{"key":"bibr47-20539517221145372","doi-asserted-by":"publisher","DOI":"10.1038\/nature14539"},{"key":"bibr48-20539517221145372","volume-title":"Hard Miles without Hard Miles","author":"Newman P","year":"2021"},{"key":"bibr49-20539517221145372","unstructured":"Nikolenko SI (2019) Synthetic data for deep learning. 
ArXiv: 1\u2013156."},{"key":"bibr50-20539517221145372","first-page":"1","volume":"20","author":"Phan T","year":"2021","journal-title":"Culture Machine"},{"key":"bibr51-20539517221145372","doi-asserted-by":"crossref","unstructured":"Tremblay J, Prakash A, Acuna D, et al. (2018) Training deep networks with synthetic data: Bridging the reality gap by domain randomization. IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1082\u20131090.","DOI":"10.1109\/CVPRW.2018.00143"},{"key":"bibr52-20539517221145372","unstructured":"White A (2021) By 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated. Gartner. Available at: https:\/\/blogs.gartner.com\/andrew_white\/2021\/07\/24\/by-2024-60-of-the-data-used-for-the-development-of-ai-and-analytics-projects-will-be-synthetically-generated\/?_ga=2.103596488.916571214.1647348850-633920548.1645012714."},{"key":"bibr53-20539517221145372","first-page":"1","volume":"20","author":"Zeilinger M","year":"2021","journal-title":"Culture Machine"},{"key":"bibr54-20539517221145372","volume-title":"The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power","author":"Zuboff S","year":"2019"}],"container-title":["Big Data & 
Society"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/20539517221145372","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/20539517221145372","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/20539517221145372","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T13:00:15Z","timestamp":1777381215000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/20539517221145372"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1]]},"references-count":54,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,1]]}},"alternative-id":["10.1177\/20539517221145372"],"URL":"https:\/\/doi.org\/10.1177\/20539517221145372","relation":{},"ISSN":["2053-9517","2053-9517"],"issn-type":[{"value":"2053-9517","type":"print"},{"value":"2053-9517","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1]]},"article-number":"20539517221145372"}}