{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T07:13:48Z","timestamp":1776755628891,"version":"3.51.2"},"publisher-location":"New York, NY, USA","reference-count":111,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,8,8]],"date-time":"2023-08-08T00:00:00Z","timestamp":1691452800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nd\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,8,8]]},"DOI":"10.1145\/3600211.3604658","type":"proceedings-article","created":{"date-parts":[[2023,8,29]],"date-time":"2023-08-29T18:41:37Z","timestamp":1693334497000},"page":"855-868","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":11,"title":["Reclaiming the Digital Commons: A Public Data Trust for Training Data"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7547-3951","authenticated-orcid":false,"given":"Alan","family":"Chan","sequence":"first","affiliation":[{"name":"DIRO, Mila, Universit\u00e9 de Montr\u00e9al, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5390-1257","authenticated-orcid":false,"given":"Herbie","family":"Bradley","sequence":"additional","affiliation":[{"name":"University of Cambridge, United Kingdom"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8991-0881","authenticated-orcid":false,"given":"Nitarshan","family":"Rajkumar","sequence":"additional","affiliation":[{"name":"University of Cambridge, United Kingdom"}]}],"member":"320","published-online":{"date-parts":[[2023,8,29]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"2021. Proposal for a REGULATION OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL LAYING DOWN HARMONISED RULES ON ARTIFICIAL INTELLIGENCE (ARTIFICIAL INTELLIGENCE ACT) AND AMENDING CERTAIN UNION LEGISLATIVE ACTS. 
https:\/\/eur-lex.europa.eu\/legal-content\/EN\/TXT\/?uri=CELEX:52021PC0206"},{"key":"e_1_3_2_1_3_1","unstructured":"2022. Establishing a pro-innovation approach to regulating AI. Technical Report. Office for Artificial Intelligence. https:\/\/www.gov.uk\/government\/publications\/establishing-a-pro-innovation-approach-to-regulating-ai\/establishing-a-pro-innovation-approach-to-regulating-ai-policy-statement"},{"key":"e_1_3_2_1_4_1","volume-title":"Gmail: global active users worldwide","year":"2018","unstructured":"2022. Gmail: global active users worldwide 2018. https:\/\/www.statista.com\/statistics\/432390\/active-gmail-users\/"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","unstructured":"2023. A blueprint for building national compute capacity for artificial intelligence. OECD Digital Economy Papers 350. https:\/\/doi.org\/10.1787\/876367e3-en","DOI":"10.1787\/876367e3-en"},{"key":"e_1_3_2_1_6_1","unstructured":"2023. The Private Copying Tariff - The Canadian Private Copying Collective. https:\/\/www.cpcc.ca\/en\/the-cpcc\/private-copying-tariff"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3461702.3462624"},{"key":"e_1_3_2_1_9_1","volume-title":"OpenAI has hired an army of contractors to make basic coding obsolete | Semafor. (Jan","author":"Albergotti Reed","year":"2023","unstructured":"Reed Albergotti and Louise Matsakis. 2023. OpenAI has hired an army of contractors to make basic coding obsolete | Semafor. (Jan. 2023). 
https:\/\/www.semafor.com\/article\/01\/27\/2023\/openai-has-hired-an-army-of-contractors-to-make-basic-coding-obsolete"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","unstructured":"Yuntao Bai Andy Jones Kamal Ndousse Amanda Askell Anna Chen Nova DasSarma Dawn Drain Stanislav Fort Deep Ganguli Tom Henighan Nicholas Joseph Saurav Kadavath Jackson Kernion Tom Conerly Sheer El-Showk Nelson Elhage Zac Hatfield-Dodds Danny Hernandez Tristan Hume Scott Johnston Shauna Kravec Liane Lovitt Neel Nanda Catherine Olsson Dario Amodei Tom Brown Jack Clark Sam McCandlish Chris Olah Ben Mann and Jared Kaplan. 2022. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. https:\/\/doi.org\/10.48550\/arXiv.2204.05862 arXiv:2204.05862 [cs].","DOI":"10.48550\/arXiv.2204.05862"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2212.08073"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","unstructured":"Sid Black Stella Biderman Eric Hallahan Quentin Anthony Leo Gao Laurence Golding Horace He Connor Leahy Kyle McDonell Jason Phang Michael Pieler USVSN\u00a0Sai Prashanth Shivanshu Purohit Laria Reynolds Jonathan Tow Ben Wang and Samuel Weinbach. 2022. GPT-NeoX-20B: An Open-Source Autoregressive Language Model. https:\/\/doi.org\/10.48550\/arXiv.2204.06745 arXiv:2204.06745 [cs].","DOI":"10.48550\/arXiv.2204.06745"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","unstructured":"Emma Bluemke Tantum Collins Ben Garfinkel and Andrew Trask. 2023. Exploring the Relevance of Data Privacy-Enhancing Technologies for AI Governance Use Cases. https:\/\/doi.org\/10.48550\/arXiv.2303.08956 arXiv:2303.08956 [cs].","DOI":"10.48550\/arXiv.2303.08956"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.3389\/fdata.2021.729663"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","unstructured":"Rishi Bommasani Drew\u00a0A. Hudson Ehsan Adeli Russ Altman Simran Arora Sydney von Arx Michael\u00a0S. 
Bernstein Jeannette Bohg Antoine Bosselut Emma Brunskill Erik Brynjolfsson Shyamal Buch Dallas Card Rodrigo Castellon Niladri Chatterji Annie Chen Kathleen Creel Jared\u00a0Quincy Davis Dora Demszky Chris Donahue Moussa Doumbouya Esin Durmus Stefano Ermon John Etchemendy Kawin Ethayarajh Li Fei-Fei Chelsea Finn Trevor Gale Lauren Gillespie Karan Goel Noah Goodman Shelby Grossman Neel Guha Tatsunori Hashimoto Peter Henderson John Hewitt Daniel\u00a0E. Ho Jenny Hong Kyle Hsu Jing Huang Thomas Icard Saahil Jain Dan Jurafsky Pratyusha Kalluri Siddharth Karamcheti Geoff Keeling Fereshte Khani Omar Khattab Pang\u00a0Wei Koh Mark Krass Ranjay Krishna Rohith Kuditipudi Ananya Kumar Faisal Ladhak Mina Lee Tony Lee Jure Leskovec Isabelle Levent Xiang\u00a0Lisa Li Xuechen Li Tengyu Ma Ali Malik Christopher\u00a0D. Manning Suvir Mirchandani Eric Mitchell Zanele Munyikwa Suraj Nair Avanika Narayan Deepak Narayanan Ben Newman Allen Nie Juan\u00a0Carlos Niebles Hamed Nilforoshan Julian Nyarko Giray Ogut Laurel Orr Isabel Papadimitriou Joon\u00a0Sung Park Chris Piech Eva Portelance Christopher Potts Aditi Raghunathan Rob Reich Hongyu Ren Frieda Rong Yusuf Roohani Camilo Ruiz Jack Ryan Christopher R\u00e9 Dorsa Sadigh Shiori Sagawa Keshav Santhanam Andy Shih Krishnan Srinivasan Alex Tamkin Rohan Taori Armin\u00a0W. Thomas Florian Tram\u00e8r Rose\u00a0E. Wang William Wang Bohan Wu Jiajun Wu Yuhuai Wu Sang\u00a0Michael Xie Michihiro Yasunaga Jiaxuan You Matei Zaharia Michael Zhang Tianyi Zhang Xikun Zhang Yuhui Zhang Lucia Zheng Kaitlyn Zhou and Percy Liang. 2022. On the Opportunities and Risks of Foundation Models. https:\/\/doi.org\/10.48550\/arXiv.2108.07258 arXiv:2108.07258 [cs].","DOI":"10.48550\/arXiv.2108.07258"},{"key":"e_1_3_2_1_16_1","volume-title":"Getty Images lawsuit says Stability AI misused photos to train AI","author":"Brittain Blake","year":"2023","unstructured":"Blake Brittain. 2023. Getty Images lawsuit says Stability AI misused photos to train AI. Reuters (Feb. 
2023). https:\/\/www.reuters.com\/legal\/getty-images-lawsuit-says-stability-ai-misused-photos-train-ai-2023-02-06\/"},{"key":"e_1_3_2_1_17_1","volume-title":"Advances in Neural Information Processing Systems, Vol.\u00a033. Curran Associates","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared\u00a0D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems, Vol.\u00a033. Curran Associates, Inc., 1877\u20131901. https:\/\/proceedings.neurips.cc\/paper\/2020\/hash\/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html"},{"key":"e_1_3_2_1_18_1","volume-title":"Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research, Vol.\u00a081)","author":"Buolamwini Joy","year":"2018","unstructured":"Joy Buolamwini and Timnit Gebru. 2018. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (Proceedings of Machine Learning Research, Vol.\u00a081), Sorelle\u00a0A. Friedler and Christo Wilson (Eds.). PMLR, 77\u201391. https:\/\/proceedings.mlr.press\/v81\/buolamwini18a.html"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","unstructured":"Nicholas Carlini Matthew Jagielski Christopher\u00a0A. Choquette-Choo Daniel Paleka Will Pearce Hyrum Anderson Andreas Terzis Kurt Thomas and Florian Tram\u00e8r. 2023. Poisoning Web-Scale Training Datasets is Practical. 
https:\/\/doi.org\/10.48550\/arXiv.2302.10149 arXiv:2302.10149 [cs].","DOI":"10.48550\/arXiv.2302.10149"},{"key":"e_1_3_2_1_20_1","volume-title":"Poisoning and Backdooring Contrastive Learning. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=iC4UHbQ01Mp","author":"Carlini Nicholas","year":"2022","unstructured":"Nicholas Carlini and Andreas Terzis. 2022. Poisoning and Backdooring Contrastive Learning. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=iC4UHbQ01Mp"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","unstructured":"Alan Chan Rebecca Salganik Alva Markelius Chris Pang Nitarshan Rajkumar Dmitrii Krasheninnikov Lauro Langosco Zhonghao He Yawen Duan Micah Carroll Michelle Lin Alex Mayhew Katherine Collins Maryam Molamohammadi John Burden Wanru Zhao Shalaleh Rismani Konstantinos Voudouris Umang Bhatt Adrian Weller David Krueger and Tegan Maharaj. 2023. Harms from Increasingly Agentic Algorithmic Systems. https:\/\/doi.org\/10.48550\/arXiv.2302.10329 arXiv:2302.10329 [cs].","DOI":"10.48550\/arXiv.2302.10329"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","unstructured":"Mark Chen Jerry Tworek Heewoo Jun Qiming Yuan Henrique Ponde de\u00a0Oliveira Pinto Jared Kaplan Harri Edwards Yuri Burda Nicholas Joseph Greg Brockman Alex Ray Raul Puri Gretchen Krueger Michael Petrov Heidy Khlaaf Girish Sastry Pamela Mishkin Brooke Chan Scott Gray Nick Ryder Mikhail Pavlov Alethea Power Lukasz Kaiser Mohammad Bavarian Clemens Winter Philippe Tillet Felipe\u00a0Petroski Such Dave Cummings Matthias Plappert Fotios Chantzis Elizabeth Barnes Ariel Herbert-Voss William\u00a0Hebgen Guss Alex Nichol Alex Paino Nikolas Tezak Jie Tang Igor Babuschkin Suchir Balaji Shantanu Jain William Saunders Christopher Hesse Andrew\u00a0N. 
Carr Jan Leike Josh Achiam Vedant Misra Evan Morikawa Alec Radford Matthew Knight Miles Brundage Mira Murati Katie Mayer Peter Welinder Bob McGrew Dario Amodei Sam McCandlish Ilya Sutskever and Wojciech Zaremba. 2021. Evaluating Large Language Models Trained on Code. https:\/\/doi.org\/10.48550\/arXiv.2107.03374 arXiv:2107.03374 [cs].","DOI":"10.48550\/arXiv.2107.03374"},{"key":"e_1_3_2_1_23_1","volume-title":"Advances in Neural Information Processing Systems, Vol.\u00a030. Curran Associates","author":"Christiano F","year":"2017","unstructured":"Paul\u00a0F Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. 2017. Deep Reinforcement Learning from Human Preferences. In Advances in Neural Information Processing Systems, Vol.\u00a030. Curran Associates, Inc. https:\/\/papers.nips.cc\/paper\/2017\/hash\/d5e2c0adad503c91f91df240d0cd4e49-Abstract.html"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3585385"},{"key":"e_1_3_2_1_25_1","first-page":"1443","article-title":"AI governance: a research agenda. Governance of AI Program, Future of Humanity Institute, University of Oxford: Oxford","volume":"1442","author":"Dafoe Allan","year":"2018","unstructured":"Allan Dafoe. 2018. AI governance: a research agenda. Governance of AI Program, Future of Humanity Institute, University of Oxford: Oxford, UK 1442 (2018), 1443.","journal-title":"UK"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","unstructured":"Mostafa Dehghani Josip Djolonga Basil Mustafa Piotr Padlewski Jonathan Heek Justin Gilmer Andreas Steiner Mathilde Caron Robert Geirhos Ibrahim Alabdulmohsin Rodolphe Jenatton Lucas Beyer Michael Tschannen Anurag Arnab Xiao Wang Carlos Riquelme Matthias Minderer Joan Puigcerver Utku Evci Manoj Kumar Sjoerd van Steenkiste Gamaleldin\u00a0F. 
Elsayed Aravindh Mahendran Fisher Yu Avital Oliver Fantine Huot Jasmijn Bastings Mark\u00a0Patrick Collier Alexey Gritsenko Vighnesh Birodkar Cristina Vasconcelos Yi Tay Thomas Mensink Alexander Kolesnikov Filip Paveti\u0107 Dustin Tran Thomas Kipf Mario Lu\u010di\u0107 Xiaohua Zhai Daniel Keysers Jeremiah Harmsen and Neil Houlsby. 2023. Scaling Vision Transformers to 22 Billion Parameters. https:\/\/doi.org\/10.48550\/arXiv.2302.05442 arXiv:2302.05442 [cs].","DOI":"10.48550\/arXiv.2302.05442"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1093\/idpl"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"crossref","unstructured":"Sylvie Delacroix Joelle Pineau and Jessica Montgomery. 2020. Democratising the Digital Revolution: The Role of Data Governance. https:\/\/papers.ssrn.com\/abstract=3720208","DOI":"10.1007\/978-3-030-69128-8_3"},{"key":"e_1_3_2_1_29_1","volume-title":"The impact of open source software and hardware on technological independence, competitiveness and innovation in the EU economy: final study report","author":"Content and Technology (European\u00a0Commission","unstructured":"Content and Technology (European\u00a0Commission) Directorate-General\u00a0for Communications\u00a0Networks, Knut Blind, Sivan P\u00e4tsch, Sachiko Muto, Mirko B\u00f6hm, Torben Schubert, Paula Grzegorzewska, and Andrew Katz. 2021. The impact of open source software and hardware on technological independence, competitiveness and innovation in the EU economy: final study report. Publications Office of the European Union, LU. https:\/\/data.europa.eu\/doi\/10.2759\/430161"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1257\/jep.28.3.217"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.14763\/2020.4.1530"},{"key":"e_1_3_2_1_32_1","unstructured":"Kawin Ethayarajh Heidi Zhang Yizhong Wang and Dan Jurafsky. 2023. Stanford Human Preferences Dataset. 
https:\/\/huggingface.co\/datasets\/stanfordnlp\/SHP"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","unstructured":"Congyu Fang Hengrui Jia Anvith Thudi Mohammad Yaghini Christopher\u00a0A. Choquette-Choo Natalie Dullerud Varun Chandrasekaran and Nicolas Papernot. 2022. On the Fundamental Limits of Formally (Dis)Proving Robustness in Proof-of-Learning. https:\/\/doi.org\/10.48550\/arXiv.2208.03567 arXiv:2208.03567 [cs stat].","DOI":"10.48550\/arXiv.2208.03567"},{"key":"e_1_3_2_1_34_1","unstructured":"Yakov Feygin Hanlin Li Chirag Lala Brent Hecht Nicholas Vincent Luisa Scarcella and Matthew Prewitt. 2021. A data dividend that works: steps toward building an equitable data economy. (2021)."},{"key":"e_1_3_2_1_35_1","volume-title":"Tech firms","author":"Fischer Sara","year":"2022","unstructured":"Sara Fischer. 2022. Tech firms\u2019 big trust gap: Hardware\u2019s up, social media\u2019s down. Axios (May 2022). https:\/\/www.axios.com\/2022\/05\/25\/tech-firms-big-trust-gap-harris-reputation-survey"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3531146.3533229"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2101.00027"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458723"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_3_2_1_40_1","unstructured":"Jonas Geiping Liam\u00a0H. Fowl Gowthami Somepalli Micah Goldblum Michael Moeller and Tom Goldstein. 2022. What Doesn\u2019t Kill You Makes You Robust(er): How to Adversarially Train against Data Poisoning. https:\/\/openreview.net\/forum?id=VMuenFh7IpP"},{"key":"e_1_3_2_1_41_1","unstructured":"Rishab\u00a0Aiyer Ghosh. 2007. Economic impact of open source software on innovation and the competitiveness of the Information and Communication Technologies (ICT) sector in the EU. (2007). 
https:\/\/ictlogy.net\/bibliography\/reports\/projects.php?idp=895&lang=en Publisher: UNU-MERIT."},{"key":"e_1_3_2_1_42_1","volume-title":"Artificial Intelligence. Our World in Data","author":"Giattino Charlie","year":"2022","unstructured":"Charlie Giattino, Edouard Mathieu, Julia Broden, and Max Roser. 2022. Artificial Intelligence. Our World in Data (2022)."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","unstructured":"Josh\u00a0A. Goldstein Girish Sastry Micah Musser Renee DiResta Matthew Gentzel and Katerina Sedova. 2023. Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations. https:\/\/doi.org\/10.48550\/arXiv.2301.04246 arXiv:2301.04246 [cs].","DOI":"10.48550\/arXiv.2301.04246"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","unstructured":"Chenxi Gu Chengsong Huang Xiaoqing Zheng Kai-Wei Chang and Cho-Jui Hsieh. 2023. Watermarking Pre-trained Language Models with Backdooring. https:\/\/doi.org\/10.48550\/arXiv.2210.07543 arXiv:2210.07543 [cs].","DOI":"10.48550\/arXiv.2210.07543"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3563357.3564075"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","unstructured":"Camille Harris Matan Halevy Ayanna Howard Amy Bruckman and Diyi Yang. 2022. Exploring the Role of Grammar and Word Choice in Bias Toward African American English (AAE) in Hate Speech Classification. In 2022 ACM Conference on Fairness Accountability and Transparency. ACM. https:\/\/doi.org\/10.1145\/3531146.3533144","DOI":"10.1145\/3531146.3533144"},{"key":"e_1_3_2_1_47_1","unstructured":"Daniel Ho Jennifer King Russell Wald and Christopher Wan. 2021. Building a National AI Research Resource: A Blueprint for the National Research Cloud. 
https:\/\/hai.stanford.edu\/sites\/default\/files\/2022-01\/HAI_NRCR_v17.pdf"},{"key":"e_1_3_2_1_48_1","unstructured":"Jordan Hoffmann Sebastian Borgeaud Arthur Mensch Elena Buchatskaya Trevor Cai Eliza Rutherford Diego de\u00a0las Casas Lisa\u00a0Anne Hendricks Johannes Welbl Aidan Clark Tom Hennigan Eric Noland Katherine Millican George van\u00a0den Driessche Bogdan Damoc Aurelia Guy Simon Osindero Karen Simonyan Erich Elsen Oriol Vinyals Jack\u00a0William Rae and Laurent Sifre. 2022. An empirical analysis of compute-optimal large language model training. In Advances in Neural Information Processing Systems Alice\u00a0H. Oh Alekh Agarwal Danielle Belgrave and Kyunghyun Cho (Eds.). https:\/\/openreview.net\/forum?id=iBBcRUlOAPR"},{"key":"e_1_3_2_1_49_1","unstructured":"Saffron Huang and Divya Siddarth. 2023. Generative AI and the Digital Commons. https:\/\/cip.org\/research\/generative-ai-digital-commons"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2010.13561"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","unstructured":"Tim Hwang. 2018. Computational Power and the Social Impact of Artificial Intelligence. https:\/\/doi.org\/10.48550\/arXiv.1803.08971","DOI":"10.48550\/arXiv.1803.08971"},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/2470654.2470742"},{"key":"e_1_3_2_1_53_1","unstructured":"janus and jdp. 2023. Anomalous tokens reveal the original identities of Instruct models. 
https:\/\/generative.ink\/posts\/anomalous-tokens-reveal-the-original-identities-of-instruct-models\/"},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3531146.3534637"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3571730"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP40001.2021.00106"},{"key":"e_1_3_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-019-0088-2"},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","unstructured":"Jared Kaplan Sam McCandlish Tom Henighan Tom\u00a0B. Brown Benjamin Chess Rewon Child Scott Gray Alec Radford Jeffrey Wu and Dario Amodei. 2020. Scaling Laws for Neural Language Models. https:\/\/doi.org\/10.48550\/arXiv.2001.08361 arXiv:2001.08361 [cs stat].","DOI":"10.48550\/arXiv.2001.08361"},{"key":"e_1_3_2_1_59_1","volume-title":"TikTok and Instagram with their data, poll finds. Washington Post (Dec.","author":"Kelly Heather","year":"2021","unstructured":"Heather Kelly and Emily Guskin. 2021. Americans widely distrust Facebook, TikTok and Instagram with their data, poll finds. Washington Post (Dec. 2021). https:\/\/www.washingtonpost.com\/technology\/2021\/12\/22\/tech-trust-survey\/"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","unstructured":"John Kirchenbauer Jonas Geiping Yuxin Wen Jonathan Katz Ian Miers and Tom Goldstein. 2023. A Watermark for Large Language Models. https:\/\/doi.org\/10.48550\/arXiv.2301.10226 arXiv:2301.10226 [cs].","DOI":"10.48550\/arXiv.2301.10226"},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","unstructured":"Anton Korinek and Megan Juelfs. 2022. Preparing for the (Non-Existent?) Future of Work. https:\/\/doi.org\/10.2139\/ssrn.4147243","DOI":"10.2139\/ssrn.4147243"},{"key":"e_1_3_2_1_62_1","unstructured":"Yiming Li Yang Bai Yong Jiang Yong Yang Shu-Tao Xia and Bo Li. 2022. Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection. 
In Advances in Neural Information Processing Systems Alice\u00a0H. Oh Alekh Agarwal Danielle Belgrave and Kyunghyun Cho (Eds.). https:\/\/openreview.net\/forum?id=kcQiIrvA_nz"},{"key":"e_1_3_2_1_63_1","doi-asserted-by":"publisher","unstructured":"Yiming Li Ziqi Zhang Jiawang Bai Baoyuan Wu Yong Jiang and Shu-Tao Xia. 2020. Open-sourced Dataset Protection via Backdoor Watermarking. https:\/\/doi.org\/10.48550\/arXiv.2010.05821 arXiv:2010.05821 [cs].","DOI":"10.48550\/arXiv.2010.05821"},{"key":"e_1_3_2_1_64_1","volume-title":"TruthfulQA: Measuring How Models Mimic Human Falsehoods. arXiv:2109.07958 [cs] (Sept","author":"Lin Stephanie","year":"2021","unstructured":"Stephanie Lin, Jacob Hilton, and Owain Evans. 2021. TruthfulQA: Measuring How Models Mimic Human Falsehoods. arXiv:2109.07958 [cs] (Sept. 2021). http:\/\/arxiv.org\/abs\/2109.07958 arXiv:2109.07958."},{"key":"e_1_3_2_1_65_1","doi-asserted-by":"publisher","unstructured":"Chris Lu Timon Willi Alistair Letcher and Jakob Foerster. 2022. Adversarial Cheap Talk. https:\/\/doi.org\/10.48550\/arXiv.2211.11030 arXiv:2211.11030 [cs].","DOI":"10.48550\/arXiv.2211.11030"},{"key":"e_1_3_2_1_66_1","volume-title":"Meta settles Cambridge Analytica scandal case for $725m. BBC News (Dec","author":"McCallum Shiona","year":"2022","unstructured":"Shiona McCallum. 2022. Meta settles Cambridge Analytica scandal case for $725m. BBC News (Dec. 2022). https:\/\/www.bbc.com\/news\/technology-64075067"},{"key":"e_1_3_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1002\/asi.23172"},{"key":"e_1_3_2_1_68_1","doi-asserted-by":"publisher","unstructured":"Margaret Mitchell Alexandra\u00a0Sasha Luccioni Nathan Lambert Marissa Gerchick Angelina McMillan-Major Ezinwanne Ozoani Nazneen Rajani Tristan Thrush Yacine Jernite and Douwe Kiela. 2023. Measuring Data. 
https:\/\/doi.org\/10.48550\/arXiv.2212.05129 arXiv:2212.05129 [cs].","DOI":"10.48550\/arXiv.2212.05129"},{"key":"e_1_3_2_1_69_1","doi-asserted-by":"publisher","unstructured":"Reiichiro Nakano Jacob Hilton Suchir Balaji Jeff Wu Long Ouyang Christina Kim Christopher Hesse Shantanu Jain Vineet Kosaraju William Saunders Xu Jiang Karl Cobbe Tyna Eloundou Gretchen Krueger Kevin Button Matthew Knight Benjamin Chess and John Schulman. 2022. WebGPT: Browser-assisted question-answering with human feedback. https:\/\/doi.org\/10.48550\/arXiv.2112.09332 arXiv:2112.09332 [cs].","DOI":"10.48550\/arXiv.2112.09332"},{"key":"e_1_3_2_1_70_1","doi-asserted-by":"publisher","unstructured":"Richard Ngo Lawrence Chan and S\u00f6ren Mindermann. 2023. The alignment problem from a deep learning perspective. https:\/\/doi.org\/10.48550\/arXiv.2209.00626 arXiv:2209.00626 [cs].","DOI":"10.48550\/arXiv.2209.00626"},{"key":"e_1_3_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2209.02299"},{"key":"e_1_3_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1080\/1369118X.2018.1486870"},{"key":"e_1_3_2_1_73_1","volume-title":"ICO fines facial recognition database company Clearview AI Inc more than \u00a37.5m and orders UK data to be deleted. (May","author":"Information\u00a0Commission","year":"2022","unstructured":"Information\u00a0Commissioner\u2019s Office. 2022. ICO fines facial recognition database company Clearview AI Inc more than \u00a37.5m and orders UK data to be deleted. (May 2022). https:\/\/ico.org.uk\/about-the-ico\/media-centre\/news-and-blogs\/2022\/05\/ico-fines-facial-recognition-database-company-clearview-ai-inc\/ Publisher: ICO."},{"key":"e_1_3_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1145\/3375627.3375842"},{"key":"e_1_3_2_1_76_1","volume-title":"The rise of social media. Our World in Data","author":"Ortiz-Ospina Esteban","year":"2019","unstructured":"Esteban Ortiz-Ospina. 2019. The rise of social media. 
Our World in Data (2019)."},{"key":"e_1_3_2_1_77_1","unstructured":"Dylan Patel and Afzal Ahmad. 2023. The Inference Cost Of Search Disruption \u2013 Large Language Model Cost Analysis. https:\/\/www.semianalysis.com\/p\/the-inference-cost-of-search-disruption"},{"key":"e_1_3_2_1_78_1","volume-title":"How Frances Haugen\u2019s Team Forced a Facebook Reckoning. Time (Oct","author":"Perrigo Billy","year":"2021","unstructured":"Billy Perrigo. 2021. How Frances Haugen\u2019s Team Forced a Facebook Reckoning. Time (Oct. 2021). https:\/\/time.com\/6104899\/facebook-reckoning-frances-haugen\/"},{"key":"e_1_3_2_1_79_1","volume-title":"Exclusive: The $2 Per Hour Workers Who Made ChatGPT Safer. Time (Jan.","author":"Perrigo Billy","year":"2023","unstructured":"Billy Perrigo. 2023. Exclusive: The $2 Per Hour Workers Who Made ChatGPT Safer. Time (Jan. 2023). https:\/\/time.com\/6247678\/openai-chatgpt-kenya-workers\/"},{"key":"e_1_3_2_1_80_1","doi-asserted-by":"publisher","unstructured":"Jason Phang Herbie Bradley Leo Gao Louis Castricato and Stella Biderman. 2022. EleutherAI: Going Beyond \"Open Science\" to \"Science in the Open\". https:\/\/doi.org\/10.48550\/arXiv.2210.06413 arXiv:2210.06413 [cs].","DOI":"10.48550\/arXiv.2210.06413"},{"key":"e_1_3_2_1_81_1","unstructured":"The\u00a0Consilience Project. 2021. Democracy and the Epistemic Commons. https:\/\/consilienceproject.org\/democracy-and-the-epistemic-commons\/"},{"key":"e_1_3_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.1145\/3351095.3372873"},{"key":"e_1_3_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514094.3534181"},{"key":"e_1_3_2_1_84_1","first-page":"2640","volume-title":"Proceedings of the 37th International Conference on Machine Learning. PMLR, 7974\u20137984","author":"Rakhsha Amin","year":"2020","unstructured":"Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, and Adish Singla. 2020. 
Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning. In Proceedings of the 37th International Conference on Machine Learning. PMLR, 7974\u20137984. https:\/\/proceedings.mlr.press\/v119\/rakhsha20a.html ISSN: 2640-3498."},{"key":"e_1_3_2_1_85_1","volume-title":"World Population Growth. Our World in Data","author":"Roser Max","year":"2013","unstructured":"Max Roser, Hannah Ritchie, Esteban Ortiz-Ospina, and Lucas Rod\u00e9s-Guirao. 2013. World Population Growth. Our World in Data (2013)."},{"key":"e_1_3_2_1_86_1","doi-asserted-by":"crossref","unstructured":"Teresa Scassa. 2020. Designing Data Governance for Data Sharing: Lessons from Sidewalk Toronto. https:\/\/papers.ssrn.com\/abstract=3722204","DOI":"10.71265\/d7yvsg86"},{"key":"e_1_3_2_1_87_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2302.04761"},{"key":"e_1_3_2_1_88_1","doi-asserted-by":"publisher","unstructured":"Christoph Schuhmann Romain Beaumont Richard Vencu Cade Gordon Ross Wightman Mehdi Cherti Theo Coombes Aarush Katta Clayton Mullis Mitchell Wortsman Patrick Schramowski Srivatsa Kundurthy Katherine Crowson Ludwig Schmidt Robert Kaczmarczyk and Jenia Jitsev. 2022. LAION-5B: An open large-scale dataset for training next generation image-text models. https:\/\/doi.org\/10.48550\/arXiv.2210.08402 arXiv:2210.08402 [cs].","DOI":"10.48550\/arXiv.2210.08402"},{"key":"e_1_3_2_1_89_1","doi-asserted-by":"publisher","unstructured":"Jaime Sevilla Lennart Heim Anson Ho Tamay Besiroglu Marius Hobbhahn and Pablo Villalobos. 2022. Compute Trends Across Three Eras of Machine Learning. https:\/\/doi.org\/10.48550\/arXiv.2202.05924","DOI":"10.48550\/arXiv.2202.05924"},{"key":"e_1_3_2_1_90_1","doi-asserted-by":"publisher","unstructured":"Toby Shevlane. 2022. Structured access: an emerging paradigm for safe AI deployment. 
https:\/\/doi.org\/10.48550\/arXiv.2201.05159 arXiv:2201.05159 [cs].","DOI":"10.48550\/arXiv.2201.05159"},{"key":"e_1_3_2_1_91_1","doi-asserted-by":"publisher","DOI":"10.1145\/3375627.3375815"},{"key":"e_1_3_2_1_92_1","doi-asserted-by":"publisher","unstructured":"Toby Shevlane Sebastian Farquhar Ben Garfinkel Mary Phuong Jess Whittlestone Jade Leung Daniel Kokotajlo Nahema Marchal Markus Anderljung Noam Kolt Lewis Ho Divya Siddarth Shahar Avin Will Hawkins Been Kim Iason Gabriel Vijay Bolina Jack Clark Yoshua Bengio Paul Christiano and Allan Dafoe. 2023. Model evaluation for extreme risks. https:\/\/doi.org\/10.48550\/arXiv.2305.15324 arXiv:2305.15324 [cs].","DOI":"10.48550\/arXiv.2305.15324"},{"key":"e_1_3_2_1_93_1","doi-asserted-by":"publisher","unstructured":"Irene Solaiman. 2023. The Gradient of Generative AI Release: Methods and Considerations. https:\/\/doi.org\/10.48550\/arXiv.2302.04844 arXiv:2302.04844 [cs].","DOI":"10.48550\/arXiv.2302.04844"},{"key":"e_1_3_2_1_94_1","volume-title":"Advances in Neural Information Processing Systems, Vol.\u00a033. Curran Associates","author":"Stiennon Nisan","year":"2020","unstructured":"Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, and Paul\u00a0F Christiano. 2020. Learning to summarize with human feedback. In Advances in Neural Information Processing Systems, Vol.\u00a033. Curran Associates, Inc., 3008\u20133021. https:\/\/proceedings.neurips.cc\/paper\/2020\/hash\/1f89885d556929e98d3ef9b86448f951-Abstract.html"},{"key":"e_1_3_2_1_95_1","volume-title":"Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking. arXiv preprint arXiv:2303.11470","author":"Tang Ruixiang","year":"2023","unstructured":"Ruixiang Tang, Qizhang Feng, Ninghao Liu, Fan Yang, and Xia Hu. 2023. Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking. 
arXiv preprint arXiv:2303.11470 (2023)."},{"key":"e_1_3_2_1_96_1","volume-title":"Stanford Alpaca: An Instruction-following LLaMA model. https:\/\/github.com\/tatsu-lab\/stanford_alpaca Publication Title: GitHub repository.","author":"Taori Rohan","year":"2023","unstructured":"Rohan Taori, Ishaan Gulrajani, Tianyi Zhang, Yann Dubois, Xuechen Li, Carlos Guestrin, Percy Liang, and Tatsunori\u00a0B. Hashimoto. 2023. Stanford Alpaca: An Instruction-following LLaMA model. https:\/\/github.com\/tatsu-lab\/stanford_alpaca Publication Title: GitHub repository."},{"key":"e_1_3_2_1_97_1","doi-asserted-by":"publisher","DOI":"10.1145\/3551636"},{"key":"e_1_3_2_1_98_1","volume-title":"TikTok has been accused of \u2018aggressive","author":"Touma Rafqa","year":"2022","unstructured":"Rafqa Touma. 2022. TikTok has been accused of \u2018aggressive\u2019 data harvesting. Is your information at risk? The Guardian (July 2022). https:\/\/www.theguardian.com\/technology\/2022\/jul\/19\/tiktok-has-been-accused-of-aggressive-data-harvesting-is-your-information-at-risk"},{"key":"e_1_3_2_1_99_1","first-page":"83","article-title":"An FDA for Algorithms","volume":"69","author":"Tutt Andrew","year":"2017","unstructured":"Andrew Tutt. 2017. An FDA for Algorithms. Administrative Law Review 69, 1 (2017), 83\u2013124. https:\/\/heinonline.org\/HOL\/P?h=hein.journals\/admin69&i=95","journal-title":"Administrative Law Review"},{"key":"e_1_3_2_1_100_1","first-page":"573","article-title":"A Relational Theory of Data Governance Feature","volume":"131","author":"Viljoen Salome","year":"2021","unstructured":"Salome Viljoen. 2021. A Relational Theory of Data Governance Feature. Yale Law Journal 131, 2 (2021), 573\u2013654. https:\/\/heinonline.org\/HOL\/P?h=hein.journals\/ylr131&i=595","journal-title":"Yale Law Journal"},{"key":"e_1_3_2_1_101_1","doi-asserted-by":"publisher","unstructured":"Pablo Villalobos Jaime Sevilla Lennart Heim Tamay Besiroglu Marius Hobbhahn and Anson Ho. 2022. 
Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning. https:\/\/doi.org\/10.48550\/arXiv.2211.04325 arXiv:2211.04325 [cs].","DOI":"10.48550\/arXiv.2211.04325"},{"key":"e_1_3_2_1_102_1","volume-title":"Microsoft\u2019s Bing is an emotionally manipulative liar, and people love it. The Verge (Feb","author":"Vincent James","year":"2023","unstructured":"James Vincent. 2023. Microsoft\u2019s Bing is an emotionally manipulative liar, and people love it. The Verge (Feb. 2023). https:\/\/www.theverge.com\/2023\/2\/15\/23599072\/microsoft-ai-bing-personality-conversations-spy-employees-webcams"},{"key":"e_1_3_2_1_103_1","doi-asserted-by":"publisher","unstructured":"Leandro von Werra Lewis Tunstall Abhishek Thakur Alexandra\u00a0Sasha Luccioni Tristan Thrush Aleksandra Piktus Felix Marty Nazneen Rajani Victor Mustar Helen Ngo Omar Sanseviero Mario \u0160a\u0161ko Albert Villanova Quentin Lhoest Julien Chaumond Margaret Mitchell Alexander\u00a0M. Rush Thomas Wolf and Douwe Kiela. 2022. Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements. https:\/\/doi.org\/10.48550\/arXiv.2210.01970 arXiv:2210.01970 [cs].","DOI":"10.48550\/arXiv.2210.01970"},{"key":"e_1_3_2_1_104_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_3_2_1_105_1","doi-asserted-by":"publisher","unstructured":"Jason Wei Yi Tay Rishi Bommasani Colin Raffel Barret Zoph Sebastian Borgeaud Dani Yogatama Maarten Bosma Denny Zhou Donald Metzler Ed\u00a0H. Chi Tatsunori Hashimoto Oriol Vinyals Percy Liang Jeff Dean and William Fedus. 2022. Emergent Abilities of Large Language Models. https:\/\/doi.org\/10.48550\/arXiv.2206.07682 arXiv:2206.07682 [cs].","DOI":"10.48550\/arXiv.2206.07682"},{"key":"e_1_3_2_1_106_1","volume-title":"Facebook Knows Instagram Is Toxic for Teen Girls","author":"Wells Georgia","year":"2021","unstructured":"Georgia Wells, Jeff Horwitz, and Deepa Seetharaman. 2021. 
Facebook Knows Instagram Is Toxic for Teen Girls, Company Documents Show. Wall Street Journal (Sept. 2021). https:\/\/www.wsj.com\/articles\/facebook-knows-instagram-is-toxic-for-teen-girls-company-documents-show-11631620739"},{"key":"e_1_3_2_1_107_1","unstructured":"Nicole Wetsman. 2021. Facebook\u2019s whistleblower report confirms what researchers have known for years. https:\/\/www.theverge.com\/2021\/10\/6\/22712927\/facebook-instagram-teen-mental-health-research"},{"key":"e_1_3_2_1_108_1","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2108.12427"},{"key":"e_1_3_2_1_109_1","doi-asserted-by":"publisher","DOI":"10.1145\/3514094.3534136"},{"key":"e_1_3_2_1_110_1","doi-asserted-by":"publisher","unstructured":"Yuhuai Wu Felix Li and Percy Liang. 2022. Insights into Pre-training via Simpler Synthetic Tasks. https:\/\/doi.org\/10.48550\/arXiv.2206.10139 arXiv:2206.10139 [cs].","DOI":"10.48550\/arXiv.2206.10139"},{"key":"e_1_3_2_1_111_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP46214.2022.9833596"},{"key":"e_1_3_2_1_112_1","unstructured":"Remco Zwetsloot and Allan Dafoe. 2019. Thinking About Risks From AI: Accidents Misuse and Structure. 
https:\/\/www.lawfareblog.com\/thinking-about-risks-ai-accidents-misuse-and-structure"},{"key":"e_1_3_2_1_113_1","doi-asserted-by":"publisher","DOI":"10.14763\/2021.3.1572"},{"key":"e_1_3_2_1_114_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00146-022-01480-5"}],"event":{"name":"AIES '23: AAAI\/ACM Conference on AI, Ethics, and Society","location":"Montr\u00e9al QC Canada","acronym":"AIES '23","sponsor":["SIGAI ACM Special Interest Group on Artificial Intelligence"]},"container-title":["Proceedings of the 2023 AAAI\/ACM Conference on AI, Ethics, and Society"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3600211.3604658","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3600211.3604658","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:37:39Z","timestamp":1750178259000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3600211.3604658"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,8]]},"references-count":111,"alternative-id":["10.1145\/3600211.3604658","10.1145\/3600211"],"URL":"https:\/\/doi.org\/10.1145\/3600211.3604658","relation":{},"subject":[],"published":{"date-parts":[[2023,8,8]]},"assertion":[{"value":"2023-08-29","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}