{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,12]],"date-time":"2025-12-12T13:36:05Z","timestamp":1765546565023,"version":"3.41.0"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2020,5,13]],"date-time":"2020-05-13T00:00:00Z","timestamp":1589328000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100005156","name":"Alexander von Humboldt-Stiftung","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100005156","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Direction G\u00e9n\u00e9rale de l\u2019Armement"},{"DOI":"10.13039\/100014718","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CCF-1750539"],"award-info":[{"award-number":["CCF-1750539"]}],"id":[{"id":"10.13039\/100014718","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100005304","name":"Agence Nationale de la Recherche","doi-asserted-by":"publisher","award":["ANR-15-IDEX-02"],"award-info":[{"award-number":["ANR-15-IDEX-02"]}],"id":[{"id":"10.13039\/501100005304","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Econ. Comput."],"published-print":{"date-parts":[[2020,5,31]]},"abstract":"<jats:p>Linear regression is a fundamental building block of statistical data analysis. It amounts to estimating the parameters of a linear model that maps input features to corresponding outputs. In the classical setting where the precision of each data point is fixed, the famous Aitken\/Gauss-Markov theorem in statistics states that generalized least squares (GLS) is a so-called \u201cBest Linear Unbiased Estimator\u201d (BLUE). 
In modern data science, however, one often faces <jats:italic>strategic data sources<\/jats:italic>; namely, individuals who incur a cost for providing high-precision data. For instance, this is the case for personal data, whose revelation may affect an individual\u2019s privacy\u2014which can be modeled as a cost\u2014or in applications such as recommender systems, where producing an accurate estimate entails effort.<\/jats:p><jats:p>In this article, we study a setting in which features are public but individuals choose the precision of the outputs they reveal to an analyst. We assume that the analyst performs linear regression on this dataset, and individuals benefit from the outcome of this estimation. We model this scenario as a game where individuals minimize a cost composed of two components: (a) an (agent-specific) disclosure cost for providing high-precision data; and (b) a (global) estimation cost representing the inaccuracy in the linear model estimate. In this game, the linear model estimate is a public good that benefits all individuals. We establish that this game has a unique non-trivial Nash equilibrium. We study the efficiency of this equilibrium and we prove tight bounds on the price of stability for a large class of disclosure and estimation costs. Finally, we study the estimator accuracy achieved at equilibrium. We show that, in general, Aitken\u2019s theorem does not hold under strategic data sources, though it does hold if individuals have identical disclosure costs (up to a multiplicative factor). 
When individuals have non-identical costs, we derive a bound on the improvement of the equilibrium estimation cost that can be achieved by deviating from GLS, under mild assumptions on the disclosure cost functions.<\/jats:p>","DOI":"10.1145\/3391436","type":"journal-article","created":{"date-parts":[[2020,5,19]],"date-time":"2020-05-19T10:16:59Z","timestamp":1589883419000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Linear Regression from Strategic Data Sources"],"prefix":"10.1145","volume":"8","author":[{"given":"Nicolas","family":"Gast","sequence":"first","affiliation":[{"name":"Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stratis","family":"Ioannidis","sequence":"additional","affiliation":[{"name":"Northeastern University, Boston, MA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Patrick","family":"Loiseau","sequence":"additional","affiliation":[{"name":"Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG France and MPI-SWS, Saarbr\u00fccken, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Benjamin","family":"Roussillon","sequence":"additional","affiliation":[{"name":"Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG, Grenoble, France"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,5,13]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2764468.2764519"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/342009.335438"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1017\/S0370164600014346"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/519168.788219"},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","unstructured":"A. C. Atkinson A. N. Donev and R. D. Tobias. 2007. 
Optimum Experimental Designs with SAS. Oxford University Press New York.","DOI":"10.1093\/oso\/9780199296590.001.0001"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3328526.3329560"},{"key":"e_1_2_1_7_1","doi-asserted-by":"crossref","unstructured":"S. Boyd and L. Vandenberghe. 2004. Convex Optimization. Cambridge University Press.","DOI":"10.1017\/CBO9780511804441"},{"volume-title":"Proceedings of the 28th Conference on Learning Theory (COLT\u201915)","author":"Cai Y.","key":"e_1_2_1_8_1","unstructured":"Y. Cai, C. Daskalakis, and C. H. Papadimitriou. 2015. Optimum statistical estimation with strategic data sources. In Proceedings of the 28th Conference on Learning Theory (COLT\u201915). 40.1\u201340.40."},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the 33rd International Conference on Machine Learning (ICML\u201916)","author":"Caragiannis Ioannis","year":"2016","unstructured":"Ioannis Caragiannis, Ariel D. Procaccia, and Nisarg Shah. 2016. Truthful univariate estimators. 
In Proceedings of the 33rd International Conference on Machine Learning (ICML\u201916)."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219166.3219195"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219166.3219175"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CSF.2015.14"},{"key":"e_1_2_1_13_1","unstructured":"Michela Chessa and Patrick Loiseau. 2017. On non-monetary incentives for the provision of public goods. Retrieved from https:\/\/ideas.repec.org\/p\/gre\/wpaper\/2017-24.html."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.pmcj.2012.07.011"},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the 28th Conference on Learning Theory (COLT\u201915)","volume":"40","author":"Cummings Rachel","year":"2015","unstructured":"Rachel Cummings, Stratis Ioannidis, and Katrina Ligett. 2015. Truthful linear regression. In Proceedings of the 28th Conference on Learning Theory (COLT\u201915), Vol. 40. 1--36."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2629665"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2488388.2488417"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2010.03.003"},{"volume-title":"Privacy-preserving Data Mining","author":"Domingo-Ferrer Josep","key":"e_1_2_1_19_1","unstructured":"Josep Domingo-Ferrer. 2008. 
A survey of inference control methods for privacy-preserving data mining. In Privacy-preserving Data Mining. Springer, 53--80."},{"volume-title":"Proceedings of the 54th IEEE Symposium on Foundations of Computer Science (FOCS\u201913)","author":"Duchi J. C.","key":"e_1_2_1_20_1","unstructured":"J. C. Duchi, M. I. Jordan, and M. J. Wainwright. 2013. Local privacy and statistical minimax rates. In Proceedings of the 54th IEEE Symposium on Foundations of Computer Science (FOCS\u201913). 429--438."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.2000.10474260"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/11787006_1"},{"volume-title":"Proceedings of the 29th Conference on Artificial Intelligence (AAAI\u201915)","author":"Frongillo Rafael M.","key":"e_1_2_1_23_1","unstructured":"Rafael M. Frongillo, Yiling Chen, and Ian A. Kash. 2015. Elicitation for aggregation. In Proceedings of the 29th Conference on Artificial Intelligence (AAAI\u201915)."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1993574.1993605"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2840728.2840730"},{"volume-title":"The Elements of Statistical Learning: Data Mining, Inference and Prediction (2 ed.)","author":"Hastie Trevor","key":"e_1_2_1_26_1","unstructured":"Trevor Hastie, Robert Tibshirani, and Jerome Friedman. 2009. 
The Elements of Statistical Learning: Data Mining, Inference and Prediction (2 ed.). Springer."},{"volume-title":"Proceedings of the 11th Latin American Theoretical INformatics Symposium (LATIN\u201914)","author":"Horel Thibaut","key":"e_1_2_1_27_1","unstructured":"Thibaut Horel, Stratis Ioannidis, and S. Muthukrishnan. 2014. Budget feasible mechanisms for experimental design. In Proceedings of the 11th Latin American Theoretical INformatics Symposium (LATIN\u201914). 719--730."},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 19th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS'20)","author":"Hossain Safwan","year":"2020","unstructured":"Safwan Hossain and Nisarg Shah. 2020. Pure Nash equilibria in linear regression. In Proceedings of the 19th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS'20)."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-45046-4_23"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1287\/moor.1040.0091"},{"key":"e_1_2_1_31_1","first-page":"1","article-title":"Extremal mechanisms for local differential privacy","volume":"17","author":"Kairouz Peter","year":"2016","unstructured":"Peter Kairouz, Sewoong Oh, and Pramod Viswanath. 2016. Extremal mechanisms for local differential privacy. J. Mach. Learn. Res. 17, 17 (2016), 1--51.","journal-title":"J. Mach. Learn. 
Res."},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the 25th Conference on Learning Theory (COLT\u201912)","author":"Kifer Daniel","year":"2012","unstructured":"Daniel Kifer, Adam Smith, and Abhradeep Thakurta. 2012. Private convex empirical risk minimization and high-dimensional regression. In Proceedings of the 25th Conference on Learning Theory (COLT\u201912). 25.1\u201325.40."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-35311-6_28"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the International Conference on Advances in Neural Information Processing Systems (NIPS\u201916)","author":"Liu Yang","year":"2016","unstructured":"Yang Liu and Yiling Chen. 2016. A bandit framework for strategic regression. In Proceedings of the International Conference on Advances in Neural Information Processing Systems (NIPS\u201916). 1821--1829."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2847220.2847238"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2012.03.008"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1006\/game.1996.0044"},{"key":"e_1_2_1_38_1","first-page":"4","article-title":"Financing public goods by means of lotteries","volume":"67","author":"Morgan John","year":"2000","unstructured":"John Morgan. 2000. Financing public goods by means of lotteries. Rev. Econ. Stud. 67, 4 (Oct. 
2000), 761--84.","journal-title":"Rev. Econ. Stud."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2090236.2090254"},{"volume-title":"Proceedings of the Brazilian Symposium on Databases (SBBD\u201903)","author":"Stanley R.","key":"e_1_2_1_40_1","unstructured":"Stanley R. M. Oliveira and Osmar R. Zaiane. 2003. Privacy preserving clustering by data transformation. In Proceedings of the Brazilian Symposium on Databases (SBBD\u201903). 304--318."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0165-4896(03)00085-4"},{"key":"e_1_2_1_42_1","unstructured":"F. Pukelsheim. 2006. Opt. Des. Exper. Vol. 50. Society for Industrial Mathematics."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/506147.506153"},{"volume-title":"Population Games and Evolutionary Dynamics","author":"Sandholm William H.","key":"e_1_2_1_44_1","unstructured":"William H. Sandholm. 2010. Population Games and Evolutionary Dynamics. The MIT Press."},{"volume-title":"Online Social Networks and Network Economics. Lecture notes","author":"Sch\u00e4fer Guido","key":"e_1_2_1_45_1","unstructured":"Guido Sch\u00e4fer. 2011. Online Social Networks and Network Economics. Lecture notes, Sapienza University of Rome."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1994.383392"},{"volume-title":"Privacy Preserving Data Mining","author":"Vaidya Jaideep","key":"e_1_2_1_47_1","unstructured":"Jaideep Vaidya, Christopher W. Clifton, and Yu Michael Zhu. 2006. 
Privacy Preserving Data Mining. Springer."},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAC.2019.2922190"}],"container-title":["ACM Transactions on Economics and Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3391436","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3391436","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:41:41Z","timestamp":1750200101000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3391436"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,13]]},"references-count":48,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2020,5,31]]}},"alternative-id":["10.1145\/3391436"],"URL":"https:\/\/doi.org\/10.1145\/3391436","relation":{},"ISSN":["2167-8375","2167-8383"],"issn-type":[{"type":"print","value":"2167-8375"},{"type":"electronic","value":"2167-8383"}],"subject":[],"published":{"date-parts":[[2020,5,13]]},"assertion":[{"value":"2019-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-05-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}