{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T20:21:49Z","timestamp":1767990109323,"version":"3.49.0"},"publisher-location":"New York, NY, USA","reference-count":63,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,4,20]],"date-time":"2020-04-20T00:00:00Z","timestamp":1587340800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,4,20]]},"DOI":"10.1145\/3366423.3380104","type":"proceedings-article","created":{"date-parts":[[2020,5,4]],"date-time":"2020-05-04T08:11:44Z","timestamp":1588579904000},"page":"167-178","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":26,"title":["The Representativeness of Automated Web Crawls as a Surrogate for Human Browsing"],"prefix":"10.1145","author":[{"given":"David","family":"Zeber","sequence":"first","affiliation":[{"name":"Mozilla"}]},{"given":"Sarah","family":"Bird","sequence":"additional","affiliation":[{"name":"Mozilla"}]},{"given":"Camila","family":"Oliveira","sequence":"additional","affiliation":[{"name":"Mozilla"}]},{"given":"Walter","family":"Rudametkin","sequence":"additional","affiliation":[{"name":"Univ. Lille \/ Inria"}]},{"given":"Ilana","family":"Segall","sequence":"additional","affiliation":[{"name":"Mozilla"}]},{"given":"Fredrik","family":"Wolls\u00e9n","sequence":"additional","affiliation":[{"name":"Mozilla"}]},{"given":"Martin","family":"Lopatka","sequence":"additional","affiliation":[{"name":"Mozilla"}]}],"member":"320","published-online":{"date-parts":[[2020,4,20]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"a\u00a0data company","year":"2019"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2660267.2660347"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2508859.2516674"},{"key":"e_1_3_2_1_4_1","volume-title":"an Amazon.com\u00a0company","author":"Alexa","year":"2019"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3336937.3336940"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1062745.1062768"},{"key":"e_1_3_2_1_7_1","unstructured":"Benoit Bernard. 2018. Web Scraping and Crawling Are Perfectly Legal Right?https:\/\/benbernardblog.com\/web-scraping-and-crawling-are-perfectly-legal-right\/  Benoit Bernard. 2018. Web Scraping and Crawling Are Perfectly Legal Right?https:\/\/benbernardblog.com\/web-scraping-and-crawling-are-perfectly-legal-right\/"},{"key":"e_1_3_2_1_8_1","volume-title":"Privacy Technologies and Policy, Maurizio Naldi, Giuseppe\u00a0F. Italiano","author":"Boniface Coline"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1389-1286(00)00083-9"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/WEBMED.2004.1348139"},{"key":"e_1_3_2_1_11_1","unstructured":"Software\u00a0Freedom Conservancy. 2019. SeleniumHQ Browser Automation.  Software\u00a0Freedom Conservancy. 2019. SeleniumHQ Browser Automation."},{"key":"e_1_3_2_1_12_1","unstructured":"World Wide\u00a0Web Consortium. 2019. W3C Webdriver Standard. https:\/\/w3c.github.io\/webdriver\/  World Wide\u00a0Web Consortium. 2019. W3C Webdriver Standard. https:\/\/w3c.github.io\/webdriver\/"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243734.3243860"},{"key":"e_1_3_2_1_14_1","unstructured":"Disconnect. 2019. Disconnect Tracking Protection List. https:\/\/github.com\/disconnectme\/disconnect-tracking-protection  Disconnect. 2019. Disconnect Tracking Protection List. https:\/\/github.com\/disconnectme\/disconnect-tracking-protection"},{"key":"e_1_3_2_1_15_1","volume-title":"Web robot detection techniques: overview and limitations. Data Mining and Knowledge Discovery 22, 1 (01","author":"Doran Derek","year":"2011"},{"key":"e_1_3_2_1_16_1","volume-title":"Privacy Enhancing Technologies (2010-07-21) (Lecture Notes in Computer Science)","author":"Eckersley Peter"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2976749.2978313"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/316194.316229"},{"key":"e_1_3_2_1_19_1","volume-title":"SpeedReader: Reader Mode Made Fast and Private. In The World Wide Web Conference","author":"Ghasemisharif Mohammad","year":"2019"},{"key":"e_1_3_2_1_20_1","unstructured":"Google. 2019. Puppeteer. https:\/\/pptr.dev\/  Google. 2019. Puppeteer. https:\/\/pptr.dev\/"},{"key":"e_1_3_2_1_21_1","volume-title":"Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC","author":"Grave Edouard","year":"2018"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1515\/popets-2015-0018"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"crossref","unstructured":"Felix Hern\u00e1ndez-Campos Kevin Jeffay and F.D. Smith. 2003. Tracking the evolution of Web traffic: 1995-2003. In in: Proceedings of the 11th IEEE\/ACM International Symposium on Modeling Analysis and Simulation of Computer Telecommunication Systems (MASCOTS. IEEE\/ACM International Orland Florida USA 16\u201325. https:\/\/doi.org\/10.1109\/MASCOT.2003.1240638  Felix Hern\u00e1ndez-Campos Kevin Jeffay and F.D. Smith. 2003. Tracking the evolution of Web traffic: 1995-2003. In in: Proceedings of the 11th IEEE\/ACM International Symposium on Modeling Analysis and Simulation of Computer Telecommunication Systems (MASCOTS. IEEE\/ACM International Orland Florida USA 16\u201325. https:\/\/doi.org\/10.1109\/MASCOT.2003.1240638","DOI":"10.1109\/MASCOT.2003.1240638"},{"key":"e_1_3_2_1_24_1","volume-title":"Passive and Active Measurement (2016-03-31)","author":"Kalavri Vasiliki"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1002\/widm.1218"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP.2016.57"},{"key":"e_1_3_2_1_27_1","volume-title":"Tranco: A Research-Oriented Top Sites Ranking Hardened Against Manipulation. arxiv:1806.01156\u00a0[cs.CR]","author":"Pochat V. Le","year":"2018"},{"key":"e_1_3_2_1_28_1","unstructured":"Arunesh Mathur Gunes Acar Michael Friedman Elena Lucherini Jonathan\u00a0R. Mayer Marshini Chetty and Arvind Narayanan. 2019. Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites. CoRR abs\/1907.07032(2019) 1\u201332. arxiv:1907.07032http:\/\/arxiv.org\/abs\/1907.07032  Arunesh Mathur Gunes Acar Michael Friedman Elena Lucherini Jonathan\u00a0R. Mayer Marshini Chetty and Arvind Narayanan. 2019. Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites. CoRR abs\/1907.07032(2019) 1\u201332. arxiv:1907.07032http:\/\/arxiv.org\/abs\/1907.07032"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.14722\/ndss.2016.23353"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/EuroSP.2017.26"},{"key":"e_1_3_2_1_31_1","volume-title":"Proceedings of the 2012 World Wide Web conference, WebScience Track","author":"Metaxas Panagiotis","year":"2012"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1561\/106.00000003"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1298306.1298311"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3319535.3354198"},{"key":"e_1_3_2_1_35_1","unstructured":"Mozilla. 2019. Firefox Now Available with Enhanced Tracking Protection by Default Plus Updates to Facebook Container Firefox Monitor and Lockwise. Accessed: 14-Oct-2019.  Mozilla. 2019. Firefox Now Available with Enhanced Tracking Protection by Default Plus Updates to Facebook Container Firefox Monitor and Lockwise. Accessed: 14-Oct-2019."},{"key":"e_1_3_2_1_36_1","unstructured":"Mozilla. 2019. JESTr Pioneer Shield Study.  Mozilla. 2019. JESTr Pioneer Shield Study."},{"key":"e_1_3_2_1_37_1","unstructured":"Mozilla. 2019. Mozilla Privacy Policy. https:\/\/www.mozilla.org\/en-US\/privacy\/  Mozilla. 2019. Mozilla Privacy Policy. https:\/\/www.mozilla.org\/en-US\/privacy\/"},{"key":"e_1_3_2_1_38_1","unstructured":"Mozilla. 2019. Security\/Anti tracking policy. Accessed: 29-July-2019.  Mozilla. 2019. Security\/Anti tracking policy. Accessed: 29-July-2019."},{"key":"e_1_3_2_1_39_1","unstructured":"Mozilla. 2019. Study Companion Repository.  Mozilla. 2019. Study Companion Repository."},{"key":"e_1_3_2_1_40_1","volume-title":"Cookieless Monster: Exploring the Ecosystem of Web-Based Device Fingerprinting. In 2013 IEEE Symposium on Security and Privacy. IEEE Computer Society","author":"Nikiforakis N.","year":"2013"},{"key":"e_1_3_2_1_41_1","unstructured":"Inria &\u00a0University of Lille. 2019. AmIUnique.  Inria &\u00a0University of Lille. 2019. AmIUnique."},{"key":"e_1_3_2_1_42_1","volume-title":"Why Johnny Can\u2019t Browse in Peace: On the Uniqueness of Web Browsing History Patterns. In 5th Workshop on Hot Topics in Privacy Enhancing Technologies (HotPETs","author":"Olejnik Lukasz","year":"2012"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133956.3134005"},{"key":"e_1_3_2_1_44_1","volume-title":"Kraaler: A User-Perspective Web Crawler. In 2019 Network Traffic Measurement and Analysis Conference (TMA). ACM SIIGCOMM","author":"Panum K.","year":"2019"},{"key":"e_1_3_2_1_45_1","volume-title":"Cookie Synchronization: Everything You Always Wanted to Know But Were Afraid to Ask. arXiv e-prints v2(2018), 1\u201311. arxiv:1805.10505\u00a0[cs] http:\/\/arxiv.org\/abs\/1805.10505","author":"Papadopoulos Panagiotis","year":"2018"},{"key":"e_1_3_2_1_46_1","article-title":"MyAdChoices: Bringing Transparency and Control to Online Advertising","volume":"11","author":"Parra-Arnau Javier","year":"2017","journal-title":"ACM Transactions on the Web (TWEB)"},{"key":"e_1_3_2_1_47_1","volume-title":"Evaluating the Long-term Effects of Parameters on the Characteristics of the Tranco Top Sites Ranking. In 12th USENIX Workshop on Cyber Security Experimentation and Test (CSET 19)","author":"Pochat Victor\u00a0Le","year":"2019"},{"key":"e_1_3_2_1_48_1","unstructured":"Quora. 2018. Is scraping and crawling to collect data illegal?https:\/\/www.quora.com\/Is-scraping-and-crawling-to-collect-data-illegal  Quora. 2018. Is scraping and crawling to collect data illegal?https:\/\/www.quora.com\/Is-scraping-and-crawling-to-collect-data-illegal"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"crossref","unstructured":"A. Saverimoutou B. Mathieu and S. Vaton. 2019. Web View: Measuring Monitoring Representative Information on Websites. In 2019 22nd Conference on Innovation in Clouds Internet and Networks and Workshops (ICIN) Vol.\u00a04. ACM SIGCOMM Computer Communication Review Paris France 133\u2013138. https:\/\/doi.org\/10.1109\/ICIN.2019.8685876  A. Saverimoutou B. Mathieu and S. Vaton. 2019. Web View: Measuring Monitoring Representative Information on Websites. In 2019 22nd Conference on Innovation in Clouds Internet and Networks and Workshops (ICIN) Vol.\u00a04. ACM SIGCOMM Computer Communication Review Paris France 133\u2013138. https:\/\/doi.org\/10.1109\/ICIN.2019.8685876","DOI":"10.1109\/ICIN.2019.8685876"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1561\/106.00000014"},{"key":"e_1_3_2_1_51_1","unstructured":"Scrapinghub. 2019. Scrapy. https:\/\/scrapy.org  Scrapinghub. 2019. Scrapy. https:\/\/scrapy.org"},{"key":"e_1_3_2_1_52_1","unstructured":"Majestic SEO. 2019. The Majestic Million: The million domains we find with the most referring subnets. Accessed: 29-July-2019.  Majestic SEO. 2019. The Majestic Million: The million domains we find with the most referring subnets. Accessed: 29-July-2019."},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1515\/popets-2018-0035"},{"key":"e_1_3_2_1_54_1","volume-title":"Jellyfish: A conceptual model for the AS internet topology. Journal of Communications and Networks - JCN 8 (01","author":"Siganos Georgos","year":"2005"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"crossref","unstructured":"Doli\u00e8re\u00a0Francis Some Nataliia Bielova and Tamara Rezk. 2017. On the Content Security Policy Violations Due to the Same-Origin Policy. In Proceedings of the 26th International Conference on World Wide Web (Perth Australia) (WWW \u201917). International World Wide Web Conferences Steering Committee Republic and Canton of Geneva Switzerland 877\u2013886. https:\/\/doi.org\/10.1145\/3038912.3052634  Doli\u00e8re\u00a0Francis Some Nataliia Bielova and Tamara Rezk. 2017. On the Content Security Policy Violations Due to the Same-Origin Policy. In Proceedings of the 26th International Conference on World Wide Web (Perth Australia) (WWW \u201917). International World Wide Web Conferences Steering Committee Republic and Canton of Geneva Switzerland 877\u2013886. https:\/\/doi.org\/10.1145\/3038912.3052634","DOI":"10.1145\/3038912.3052634"},{"key":"e_1_3_2_1_56_1","unstructured":"Nikolai Tschacher. 2019. Scraping 1 million keywords on the Google Search Engine. Accessed: 13-October-2019.  Nikolai Tschacher. 2019. Scraping 1 million keywords on the Google Search Engine. Accessed: 13-October-2019."},{"key":"e_1_3_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3321705.3329855"},{"key":"e_1_3_2_1_58_1","unstructured":"MDN web\u00a0docs contributors. 2019. webNavigation. Accessed: 29-July-2019.  MDN web\u00a0docs contributors. 2019. webNavigation. Accessed: 29-July-2019."},{"key":"e_1_3_2_1_59_1","unstructured":"Princeton\u00a0University WebTAP\u00a0research group. 2019. Studies using OpenWPM. https:\/\/webtap.princeton.edu\/software\/  Princeton\u00a0University WebTAP\u00a0research group. 2019. Studies using OpenWPM. https:\/\/webtap.princeton.edu\/software\/"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/2187980.2188068"},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"crossref","unstructured":"Zhonghao Yu Sam Macbeth Konark Modi and Josep\u00a0M. Pujol. 2016. Tracking the Trackers. In Proceedings of the 25th International Conference on World Wide Web (Montr\u00e9al Qu\u00e9bec Canada) (WWW \u201916). International World Wide Web Conferences Steering Committee Republic and Canton of Geneva CHE 121\u2013132. https:\/\/doi.org\/10.1145\/2872427.2883028  Zhonghao Yu Sam Macbeth Konark Modi and Josep\u00a0M. Pujol. 2016. Tracking the Trackers. In Proceedings of the 25th International Conference on World Wide Web (Montr\u00e9al Qu\u00e9bec Canada) (WWW \u201916). International World Wide Web Conferences Steering Committee Republic and Canton of Geneva CHE 121\u2013132. https:\/\/doi.org\/10.1145\/2872427.2883028","DOI":"10.1145\/2872427.2883028"},{"key":"e_1_3_2_1_62_1","volume-title":"26th USENIX Security Symposium (USENIX Security 17)","author":"Zimmeck Sebastian","year":"2017"},{"key":"e_1_3_2_1_63_1","unstructured":"Aram Zucker-Scharff. 2019. Understanding the Unplanned Internet - How Ad Tech is Broken By Design 101. https:\/\/youtu.be\/QIyxmSfKGbw?t=7907  Aram Zucker-Scharff. 2019. Understanding the Unplanned Internet - How Ad Tech is Broken By Design 101. https:\/\/youtu.be\/QIyxmSfKGbw?t=7907"}],"event":{"name":"WWW '20: The Web Conference 2020","location":"Taipei Taiwan","acronym":"WWW '20","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web"]},"container-title":["Proceedings of The Web Conference 2020"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3366423.3380104","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3366423.3380104","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:32:59Z","timestamp":1750199579000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3366423.3380104"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,20]]},"references-count":63,"alternative-id":["10.1145\/3366423.3380104","10.1145\/3366423"],"URL":"https:\/\/doi.org\/10.1145\/3366423.3380104","relation":{},"subject":[],"published":{"date-parts":[[2020,4,20]]},"assertion":[{"value":"2020-04-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}