{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T09:39:41Z","timestamp":1774604381770,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":83,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,4,20]],"date-time":"2020-04-20T00:00:00Z","timestamp":1587340800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,4,20]]},"DOI":"10.1145\/3366423.3380113","type":"proceedings-article","created":{"date-parts":[[2020,5,4]],"date-time":"2020-05-04T08:11:44Z","timestamp":1588579904000},"page":"271-280","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":20,"title":["Apophanies or Epiphanies? How Crawlers Impact Our Understanding of the Web"],"prefix":"10.1145","author":[{"given":"Syed Suleman","family":"Ahmad","sequence":"first","affiliation":[{"name":"University of Wisconsin-Madison"}]},{"given":"Muhammad Daniyal","family":"Dar","sequence":"additional","affiliation":[{"name":"University of Iowa"}]},{"given":"Muhammad Fareed","family":"Zaffar","sequence":"additional","affiliation":[{"name":"LUMS"}]},{"given":"Narseo","family":"Vallina-Rodriguez","sequence":"additional","affiliation":[{"name":"IMDEA Networks\/ICSI"}]},{"given":"Rishab","family":"Nithyanand","sequence":"additional","affiliation":[{"name":"University of Iowa"}]}],"member":"320","published-online":{"date-parts":[[2020,4,20]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"2008. Blockpages Archived. https:\/\/advox.globalvoices.org\/past-projects\/blockpages\/  2008. Blockpages Archived. https:\/\/advox.globalvoices.org\/past-projects\/blockpages\/"},{"key":"e_1_3_2_1_2_1","unstructured":"2015. Collection of censorship blockpages as collected by various sources. https:\/\/github.com\/citizenlab\/blockpages  2015. Collection of censorship blockpages as collected by various sources. https:\/\/github.com\/citizenlab\/blockpages"},{"key":"e_1_3_2_1_3_1","unstructured":"2015. ghost.py - Ghost.py 0.2.2 documentation. https:\/\/ghost-py.readthedocs.io\/en\/latest\/  2015. ghost.py - Ghost.py 0.2.2 documentation. https:\/\/ghost-py.readthedocs.io\/en\/latest\/"},{"key":"e_1_3_2_1_4_1","unstructured":"2015. grub.org - Distributed Internet Crawler download | SourceForge.net. https:\/\/sourceforge.net\/projects\/grub\/  2015. grub.org - Distributed Internet Crawler download | SourceForge.net. https:\/\/sourceforge.net\/projects\/grub\/"},{"key":"e_1_3_2_1_5_1","unstructured":"2016. Fast and powerful scraping and web crawling framework. https:\/\/scrapy.org\/  2016. Fast and powerful scraping and web crawling framework. https:\/\/scrapy.org\/"},{"key":"e_1_3_2_1_6_1","unstructured":"2016. PhantomJS v2.1.1. https:\/\/phantomjs.org\/release-2.1.html  2016. PhantomJS v2.1.1. https:\/\/phantomjs.org\/release-2.1.html"},{"key":"e_1_3_2_1_7_1","unstructured":"2017. Wget v1.17.1. https:\/\/packages.ubuntu.com\/xenial\/wget  2017. Wget v1.17.1. https:\/\/packages.ubuntu.com\/xenial\/wget"},{"key":"e_1_3_2_1_8_1","unstructured":"2018. OWASP ZAP Spider. https:\/\/www.owasp.org\/index.php\/OWASP_Zed_Attack_Proxy_Project  2018. OWASP ZAP Spider. https:\/\/www.owasp.org\/index.php\/OWASP_Zed_Attack_Proxy_Project"},{"key":"e_1_3_2_1_9_1","unstructured":"2018. Raccoon: reconnaissance and vulnerability scanning. https:\/\/www.prodefence.org\/raccoon-reconnaissance-and-vulnerability-scanning\/  2018. Raccoon: reconnaissance and vulnerability scanning. https:\/\/www.prodefence.org\/raccoon-reconnaissance-and-vulnerability-scanning\/"},{"key":"e_1_3_2_1_10_1","unstructured":"2018. Selenium - Web Browser Automation. http:\/\/www.seleniumhq.org\/  2018. Selenium - Web Browser Automation. http:\/\/www.seleniumhq.org\/"},{"key":"e_1_3_2_1_11_1","unstructured":"2018. Selenium v3.141.0. https:\/\/rubygems.org\/gems\/selenium-webdriver\/versions\/3.141.0  2018. Selenium v3.141.0. https:\/\/rubygems.org\/gems\/selenium-webdriver\/versions\/3.141.0"},{"key":"e_1_3_2_1_12_1","unstructured":"2018. Tor-Browser-Crawler. https:\/\/github.com\/webfp\/tor-browser-crawler  2018. Tor-Browser-Crawler. https:\/\/github.com\/webfp\/tor-browser-crawler"},{"key":"e_1_3_2_1_13_1","unstructured":"2019. ACM CCS. https:\/\/www.sigsac.org\/ccs.html  2019. ACM CCS. https:\/\/www.sigsac.org\/ccs.html"},{"key":"e_1_3_2_1_14_1","unstructured":"2019. Alexa Top Sites. https:\/\/docs.aws.amazon.com\/AlexaTopSites\/latest\/ApiReference_TopSitesAction.html  2019. Alexa Top Sites. https:\/\/docs.aws.amazon.com\/AlexaTopSites\/latest\/ApiReference_TopSitesAction.html"},{"key":"e_1_3_2_1_15_1","unstructured":"2019. Block Bots with Bot Mitigation and Detection Tools | Distil Networks. https:\/\/www.distilnetworks.com\/block-bot-detection\/  2019. Block Bots with Bot Mitigation and Detection Tools | Distil Networks. https:\/\/www.distilnetworks.com\/block-bot-detection\/"},{"key":"e_1_3_2_1_16_1","unstructured":"2019. Cisco Talos Intelligence Group - Comprehensive Threat Intelligence. https:\/\/www.talosintelligence.com\/  2019. Cisco Talos Intelligence Group - Comprehensive Threat Intelligence. https:\/\/www.talosintelligence.com\/"},{"key":"e_1_3_2_1_17_1","unstructured":"2019. Cisco Umbrella 1 Million - OpenDNS Umbrella Blog. https:\/\/umbrella.cisco.com\/blog\/2016\/12\/14\/cisco-umbrella-1-million\/  2019. Cisco Umbrella 1 Million - OpenDNS Umbrella Blog. https:\/\/umbrella.cisco.com\/blog\/2016\/12\/14\/cisco-umbrella-1-million\/"},{"key":"e_1_3_2_1_18_1","unstructured":"2019. Curl - Command Line Tool. https:\/\/curl.haxx.se\/  2019. Curl - Command Line Tool. https:\/\/curl.haxx.se\/"},{"key":"e_1_3_2_1_19_1","unstructured":"2019. cURL v7.64.0. https:\/\/curl.haxx.se\/changes.html#7_64_0  2019. cURL v7.64.0. https:\/\/curl.haxx.se\/changes.html#7_64_0"},{"key":"e_1_3_2_1_20_1","unstructured":"2019. EasyList - Overview. https:\/\/easylist.to\/  2019. EasyList - Overview. https:\/\/easylist.to\/"},{"key":"e_1_3_2_1_21_1","unstructured":"2019. GNU Wget 1.20 Manual. https:\/\/www.gnu.org\/software\/wget\/manual\/wget.html#Robot-Exclusion  2019. GNU Wget 1.20 Manual. https:\/\/www.gnu.org\/software\/wget\/manual\/wget.html#Robot-Exclusion"},{"key":"e_1_3_2_1_22_1","volume-title":"IEEE Symposium on Security and Privacy 2019","year":"2019","unstructured":"2019. IEEE Symposium on Security and Privacy 2019 . https:\/\/www.ieee-security.org\/TC\/SP 2019 \/ 2019. IEEE Symposium on Security and Privacy 2019. https:\/\/www.ieee-security.org\/TC\/SP2019\/"},{"key":"e_1_3_2_1_23_1","volume-title":"IMC Conference | acm sigcomm. https:\/\/www.sigcomm.org\/events\/imc-conference","unstructured":"2019. IMC Conference | acm sigcomm. https:\/\/www.sigcomm.org\/events\/imc-conference 2019. IMC Conference | acm sigcomm. https:\/\/www.sigcomm.org\/events\/imc-conference"},{"key":"e_1_3_2_1_24_1","volume-title":"International Conference on Web and Social Media Papers (ICWSM). http:\/\/www.aaai.org\/Library\/ICWSM\/icwsm-library.php","unstructured":"2019. International Conference on Web and Social Media Papers (ICWSM). http:\/\/www.aaai.org\/Library\/ICWSM\/icwsm-library.php 2019. International Conference on Web and Social Media Papers (ICWSM). http:\/\/www.aaai.org\/Library\/ICWSM\/icwsm-library.php"},{"key":"e_1_3_2_1_25_1","volume-title":"International World Wide Web Conference Committee (WWW). https:\/\/www.iw3c2.org","unstructured":"2019. International World Wide Web Conference Committee (WWW). https:\/\/www.iw3c2.org 2019. International World Wide Web Conference Committee (WWW). https:\/\/www.iw3c2.org"},{"key":"e_1_3_2_1_26_1","unstructured":"2019. internetarchive\/heritrix3: Heritrix is the Internet Archive\u2019s open-source extensible web-scale archival-quality web crawler project.https:\/\/github.com\/internetarchive\/heritrix3  2019. internetarchive\/heritrix3: Heritrix is the Internet Archive\u2019s open-source extensible web-scale archival-quality web crawler project.https:\/\/github.com\/internetarchive\/heritrix3"},{"key":"e_1_3_2_1_27_1","volume-title":"NDSS Symposium - The Network and Distributed System Security Symposium (NDSS). https:\/\/www.ndss-symposium.org\/","unstructured":"2019. NDSS Symposium - The Network and Distributed System Security Symposium (NDSS). https:\/\/www.ndss-symposium.org\/ 2019. NDSS Symposium - The Network and Distributed System Security Symposium (NDSS). https:\/\/www.ndss-symposium.org\/"},{"key":"e_1_3_2_1_28_1","unstructured":"2019. OpemWPM v0.8.0. https:\/\/github.com\/mozilla\/OpenWPM\/blob\/b3ead7e38892095950806e8bcbb2e1129c27ca96\/VERSION  2019. OpemWPM v0.8.0. https:\/\/github.com\/mozilla\/OpenWPM\/blob\/b3ead7e38892095950806e8bcbb2e1129c27ca96\/VERSION"},{"key":"e_1_3_2_1_29_1","unstructured":"2019. OWASP ZAP v2.7.0. https:\/\/github.com\/zaproxy\/zap-core-help\/wiki\/HelpReleases2_7_0  2019. OWASP ZAP v2.7.0. https:\/\/github.com\/zaproxy\/zap-core-help\/wiki\/HelpReleases2_7_0"},{"key":"e_1_3_2_1_30_1","unstructured":"2019. OWASP Zed Attack Proxy Project - OWASP. https:\/\/www.owasp.org\/index.php\/OWASP_Zed_Attack_Proxy_Project  2019. OWASP Zed Attack Proxy Project - OWASP. https:\/\/www.owasp.org\/index.php\/OWASP_Zed_Attack_Proxy_Project"},{"key":"e_1_3_2_1_31_1","unstructured":"2019. PoPETs\/PETS. https:\/\/petsymposium.org\/  2019. PoPETs\/PETS. https:\/\/petsymposium.org\/"},{"key":"e_1_3_2_1_32_1","unstructured":"2019. Puppeteer v1.12.1. https:\/\/github.com\/GoogleChrome\/puppeteer\/releases  2019. Puppeteer v1.12.1. https:\/\/github.com\/GoogleChrome\/puppeteer\/releases"},{"key":"e_1_3_2_1_33_1","unstructured":"2019. RedTeam Pentesting on Twitter. https:\/\/twitter.com\/RedTeamPT\/status\/1110843396657238016  2019. RedTeam Pentesting on Twitter. https:\/\/twitter.com\/RedTeamPT\/status\/1110843396657238016"},{"key":"e_1_3_2_1_34_1","unstructured":"2019. USENIX Security Symposia | USENIX. https:\/\/www.usenix.org\/conferences\/byname\/108  2019. USENIX Security Symposia | USENIX. https:\/\/www.usenix.org\/conferences\/byname\/108"},{"key":"e_1_3_2_1_35_1","unstructured":"2019. Wget - Command Line Tool. https:\/\/www.gnu.org\/software\/wget\/  2019. Wget - Command Line Tool. https:\/\/www.gnu.org\/software\/wget\/"},{"key":"e_1_3_2_1_36_1","unstructured":"2019. zmap\/zgrab2: Go Application Layer Scanner. https:\/\/github.com\/zmap\/zgrab2  2019. zmap\/zgrab2: Go Application Layer Scanner. https:\/\/github.com\/zmap\/zgrab2"},{"key":"e_1_3_2_1_37_1","unstructured":"2020. Javascript Usage Statistics. https:\/\/trends.builtwith.com\/docinfo\/Javascript  2020. Javascript Usage Statistics. https:\/\/trends.builtwith.com\/docinfo\/Javascript"},{"key":"#cr-split#-e_1_3_2_1_38_1.1","doi-asserted-by":"crossref","unstructured":"Athanasios Andreou Giridhari Venkatadri Oana Goga Krishna P.\u00a0Gummadi Patrick Loiseau and Alan Mislove. 2018. Investigating Ad Transparency Mechanisms in Social Media: A Case Study of Facebook's Explanations. https:\/\/doi.org\/10.14722\/ndss.2018.23204 10.14722\/ndss.2018.23204","DOI":"10.14722\/ndss.2018.23191"},{"key":"#cr-split#-e_1_3_2_1_38_1.2","doi-asserted-by":"crossref","unstructured":"Athanasios Andreou Giridhari Venkatadri Oana Goga Krishna P.\u00a0Gummadi Patrick Loiseau and Alan Mislove. 2018. Investigating Ad Transparency Mechanisms in Social Media: A Case Study of Facebook's Explanations. https:\/\/doi.org\/10.14722\/ndss.2018.23204","DOI":"10.14722\/ndss.2018.23191"},{"key":"e_1_3_2_1_39_1","unstructured":"Anu. 2015. A Survey on Web Forum Crawling Techniques.  Anu. 2015. A Survey on Web Forum Crawling Techniques."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/PCI.2011.53"},{"key":"e_1_3_2_1_41_1","unstructured":"Luca Becchetti Luca Carlos Castillo Carlos Debora Donato Debora Fazzone and Adriano. 2006. A comparison of sampling techniques for Web characterization.  Luca Becchetti Luca Carlos Castillo Carlos Debora Donato Debora Fazzone and Adriano. 2006. A comparison of sampling techniques for Web characterization."},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2660267.2660362"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2815675.2815720"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2812803"},{"key":"e_1_3_2_1_45_1","unstructured":"John Cook Rishab Nithyanand and Zubair Shafiq. 2019. Inferring Tracker-Advertiser Relationships in the Online Advertising Ecosystem using Header Bidding. CoRR abs\/1907.07275(2019). arxiv:1907.07275http:\/\/arxiv.org\/abs\/1907.07275  John Cook Rishab Nithyanand and Zubair Shafiq. 2019. Inferring Tracker-Advertiser Relationships in the Online Advertising Ecosystem using Header Bidding. CoRR abs\/1907.07275(2019). arxiv:1907.07275http:\/\/arxiv.org\/abs\/1907.07275"},{"key":"e_1_3_2_1_46_1","volume-title":"Selendroid: Selenium for Android","author":"Dary Dominik","year":"2019","unstructured":"Dominik Dary . 2019 . Selendroid: Selenium for Android . http:\/\/selendroid.io\/ Dominik Dary. 2019. Selendroid: Selenium for Android. http:\/\/selendroid.io\/"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243734.3243860"},{"key":"e_1_3_2_1_48_1","volume-title":"Making simulation results reproducible-Survey, guidelines, and examples based on Gradle and Docker. PeerJ Computer Science 5 (12","author":"Elmenreich Wilfried","year":"2019","unstructured":"Wilfried Elmenreich , Philipp Moll , Sebastian Theuermann , and Mathias Lux . 2019. Making simulation results reproducible-Survey, guidelines, and examples based on Gradle and Docker. PeerJ Computer Science 5 (12 2019 ), e240. https:\/\/doi.org\/10.7717\/peerj-cs.240 10.7717\/peerj-cs.240 Wilfried Elmenreich, Philipp Moll, Sebastian Theuermann, and Mathias Lux. 2019. Making simulation results reproducible-Survey, guidelines, and examples based on Gradle and Docker. PeerJ Computer Science 5 (12 2019), e240. https:\/\/doi.org\/10.7717\/peerj-cs.240"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2976749.2978313"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"crossref","unstructured":"Steven Englehardt and Arvind Narayanan. 2018. OpenWPM - Web Privacy Measurement Framework. https:\/\/github.com\/mozilla\/OpenWPM  Steven Englehardt and Arvind Narayanan. 2018. OpenWPM - Web Privacy Measurement Framework. https:\/\/github.com\/mozilla\/OpenWPM","DOI":"10.1515\/popets-2018-0006"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/1162678.1162679"},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/3211852.3211864"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.14722\/ndss.2019.23511"},{"key":"e_1_3_2_1_54_1","volume-title":"Web scraping technologies in an API world. Briefings in Bioinformatics 15, 5 (04","author":"Glez-Pe\u00f1a Daniel","year":"2013","unstructured":"Daniel Glez-Pe\u00f1a , An\u00e1lia Louren\u00e7o , Hugo L\u00f3pez-Fern\u00e1ndez , Miguel Reboiro-Jato , and Florentino Fdez-Riverola . 2013. Web scraping technologies in an API world. Briefings in Bioinformatics 15, 5 (04 2013 ), 788\u2013797. https:\/\/doi.org\/10.1093\/bib\/bbt026 arXiv:https:\/\/academic.oup.com\/bib\/article-pdf\/15\/5\/788\/17488715\/bbt026.pdf 10.1093\/bib Daniel Glez-Pe\u00f1a, An\u00e1lia Louren\u00e7o, Hugo L\u00f3pez-Fern\u00e1ndez, Miguel Reboiro-Jato, and Florentino Fdez-Riverola. 2013. Web scraping technologies in an API world. Briefings in Bioinformatics 15, 5 (04 2013), 788\u2013797. https:\/\/doi.org\/10.1093\/bib\/bbt026 arXiv:https:\/\/academic.oup.com\/bib\/article-pdf\/15\/5\/788\/17488715\/bbt026.pdf"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"crossref","unstructured":"Roberto Gonzalez Claudio Soriente and Nikolaos Laoutaris. 2016. User Profiling in the Time of HTTPS. In IMC \u201916.  Roberto Gonzalez Claudio Soriente and Nikolaos Laoutaris. 2016. User Profiling in the Time of HTTPS. In IMC \u201916.","DOI":"10.1145\/2987443.2987451"},{"key":"e_1_3_2_1_56_1","unstructured":"Google. 2019. Chrome Puppeteer. https:\/\/github.com\/GoogleChrome\/puppeteer  Google. 2019. Chrome Puppeteer. https:\/\/github.com\/GoogleChrome\/puppeteer"},{"key":"e_1_3_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/1655008.1655013"},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/1655008.1655013"},{"key":"e_1_3_2_1_59_1","unstructured":"Ariya Hidayat. 2018. PhantomJS - Headless Web Browser. http:\/\/phantomjs.org  Ariya Hidayat. 2018. PhantomJS - Headless Web Browser. http:\/\/phantomjs.org"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/2663716.2663722"},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1002\/widm.1218"},{"key":"e_1_3_2_1_62_1","volume-title":"25th USENIX Security Symposium (USENIX Security 16)","author":"Lerner Ada","year":"2016","unstructured":"Ada Lerner , Anna\u00a0Kornfeld Simpson , Tadayoshi Kohno , and Franziska Roesner . 2016 . Internet Jones and the Raiders of the Lost Trackers: An Archaeological Study of Web Tracking from 1996 to 2016 . In 25th USENIX Security Symposium (USENIX Security 16) . USENIX Association, Austin, TX. https:\/\/www.usenix.org\/conference\/usenixsecurity16\/technical-sessions\/presentation\/lerner Ada Lerner, Anna\u00a0Kornfeld Simpson, Tadayoshi Kohno, and Franziska Roesner. 2016. Internet Jones and the Raiders of the Lost Trackers: An Archaeological Study of Web Tracking from 1996 to 2016. In 25th USENIX Security Symposium (USENIX Security 16). USENIX Association, Austin, TX. https:\/\/www.usenix.org\/conference\/usenixsecurity16\/technical-sessions\/presentation\/lerner"},{"key":"e_1_3_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1145\/2420950.2420953"},{"key":"e_1_3_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/1180405.1180437"},{"key":"e_1_3_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/3278532.3278552"},{"key":"e_1_3_2_1_66_1","doi-asserted-by":"crossref","unstructured":"Allison McDonald Matthew Bernhard Luke Valenta Benjamin VanderSloot Will Scott Nick Sullivan J.\u00a0Alex Halderman and Roya Ensafi. 2018. 403 Forbidden: A Global View of CDN Geoblocking. In IMC \u201918.  Allison McDonald Matthew Bernhard Luke Valenta Benjamin VanderSloot Will Scott Nick Sullivan J.\u00a0Alex Halderman and Roya Ensafi. 2018. 403 Forbidden: A Global View of CDN Geoblocking. In IMC \u201918.","DOI":"10.1145\/3278532.3278552"},{"key":"e_1_3_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCC.2011.5983970"},{"key":"e_1_3_2_1_68_1","unstructured":"Arian\u00a0Akhavan Niaki Shinyoung Cho Zachary Weinberg Nguyen\u00a0Phong Hoang Abbas Razaghpanah Nicolas Christin and Phillipa Gill. 2019. ICLab: A Global Longitudinal Internet Censorship Measurement Platform. CoRR abs\/1907.04245(2019). arxiv:1907.04245http:\/\/arxiv.org\/abs\/1907.04245  Arian\u00a0Akhavan Niaki Shinyoung Cho Zachary Weinberg Nguyen\u00a0Phong Hoang Abbas Razaghpanah Nicolas Christin and Phillipa Gill. 2019. ICLab: A Global Longitudinal Internet Censorship Measurement Platform. CoRR abs\/1907.04245(2019). arxiv:1907.04245http:\/\/arxiv.org\/abs\/1907.04245"},{"key":"e_1_3_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/3131365.3131397"},{"key":"e_1_3_2_1_70_1","unstructured":"Watir Project. 2019. Watir Project. http:\/\/watir.com\/  Watir Project. 2019. Watir Project. http:\/\/watir.com\/"},{"key":"e_1_3_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.5555\/829514.830535"},{"key":"e_1_3_2_1_72_1","unstructured":"Abbas Razaghpanah Anke Li Arturo Filasto Rishab Nithyanand Vasilis Ververis Will Scott and Phillipa Gill. 2016. Exploring the design space of longitudinal censorship measurement platforms. arXiv preprint arXiv:1606.01979(2016).  Abbas Razaghpanah Anke Li Arturo Filasto Rishab Nithyanand Vasilis Ververis Will Scott and Phillipa Gill. 2016. Exploring the design space of longitudinal censorship measurement platforms. arXiv preprint arXiv:1606.01979(2016)."},{"key":"e_1_3_2_1_73_1","volume-title":"Journal of Telecommunications","author":"Ricardo Andre","unstructured":"Andre Ricardo and Carlos Serrao . 2013. Comparison of existing open source tools for web crawling and indexing of free music . In Journal of Telecommunications Vol. 18 . Andre Ricardo and Carlos Serrao. 2013. Comparison of existing open source tools for web crawling and indexing of free music. In Journal of Telecommunications Vol. 18."},{"key":"e_1_3_2_1_74_1","first-page":"318","article-title":"Method for performing user profiling from encrypted network traffic flows","volume":"15","author":"Sanchez Roberto\u00a0Gonzalez","year":"2017","unstructured":"Roberto\u00a0Gonzalez Sanchez , Claudio Soriente , and Nikolaos Laoutaris . 2017 . Method for performing user profiling from encrypted network traffic flows . US Patent App. 15\/486 , 318 . Roberto\u00a0Gonzalez Sanchez, Claudio Soriente, and Nikolaos Laoutaris. 2017. Method for performing user profiling from encrypted network traffic flows. US Patent App. 15\/486,318.","journal-title":"US Patent App."},{"key":"e_1_3_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1145\/3278532.3278574"},{"key":"e_1_3_2_1_76_1","volume-title":"Characterizing the Nature and Dynamics of Tor Exit Blocking. In 26th USENIX Security Symposium (USENIX Security 17)","author":"Singh Rachee","year":"2017","unstructured":"Rachee Singh , Rishab Nithyanand , Sadia Afroz , Paul Pearce , Michael\u00a0Carl Tschantz , Phillipa Gill , and Vern Paxson . 2017 . Characterizing the Nature and Dynamics of Tor Exit Blocking. In 26th USENIX Security Symposium (USENIX Security 17) . USENIX Association, Vancouver, BC, 325\u2013341. https:\/\/www.usenix.org\/conference\/usenixsecurity17\/technical-sessions\/presentation\/singh Rachee Singh, Rishab Nithyanand, Sadia Afroz, Paul Pearce, Michael\u00a0Carl Tschantz, Phillipa Gill, and Vern Paxson. 2017. Characterizing the Nature and Dynamics of Tor Exit Blocking. In 26th USENIX Security Symposium (USENIX Security 17). USENIX Association, Vancouver, BC, 325\u2013341. https:\/\/www.usenix.org\/conference\/usenixsecurity17\/technical-sessions\/presentation\/singh"},{"key":"e_1_3_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1145\/2591971.2592003"},{"key":"e_1_3_2_1_78_1","unstructured":"Tao Wang. 2017. File directory: Website fingerprinting attacks. https:\/\/www.cse.ust.hk\/~taow\/wf\/attacks\/  Tao Wang. 2017. File directory: Website fingerprinting attacks. https:\/\/www.cse.ust.hk\/~taow\/wf\/attacks\/"},{"key":"e_1_3_2_1_79_1","volume-title":"23rd {USENIX} Security Symposium ({USENIX} Security 14). 143\u2013157.","author":"Wang Tao","unstructured":"Tao Wang , Xiang Cai , Rishab Nithyanand , Rob Johnson , and Ian Goldberg . 2014. Effective attacks and provable defenses for website fingerprinting . In 23rd {USENIX} Security Symposium ({USENIX} Security 14). 143\u2013157. Tao Wang, Xiang Cai, Rishab Nithyanand, Rob Johnson, and Ian Goldberg. 2014. Effective attacks and provable defenses for website fingerprinting. In 23rd {USENIX} Security Symposium ({USENIX} Security 14). 143\u2013157."},{"key":"e_1_3_2_1_80_1","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1515\/popets-2017-0004","article-title":"Topics of Controversy: An Empirical Analysis of Web Censorship Lists","volume":"2017","author":"Weinberg Zachary","year":"2017","unstructured":"Zachary Weinberg , Mahmood Sharif , Janos Szurdi , and Nicolas Christin . 2017 . Topics of Controversy: An Empirical Analysis of Web Censorship Lists . PoPETs 2017 , 1 (2017), 42 \u2013 61 . Zachary Weinberg, Mahmood Sharif, Janos Szurdi, and Nicolas Christin. 2017. Topics of Controversy: An Empirical Analysis of Web Censorship Lists. PoPETs 2017, 1 (2017), 42\u201361.","journal-title":"PoPETs"},{"key":"e_1_3_2_1_81_1","unstructured":"Monika Yadav and Neha Goyal. 2015. Comparison of Open Source Crawlers-A Review.  Monika Yadav and Neha Goyal. 2015. Comparison of Open Source Crawlers-A Review."},{"key":"e_1_3_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.1145\/2512938.2512945"}],"event":{"name":"WWW '20: The Web Conference 2020","location":"Taipei Taiwan","acronym":"WWW '20","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web"]},"container-title":["Proceedings of The Web Conference 2020"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3366423.3380113","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3366423.3380113","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:32:59Z","timestamp":1750199579000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3366423.3380113"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,20]]},"references-count":83,"alternative-id":["10.1145\/3366423.3380113","10.1145\/3366423"],"URL":"https:\/\/doi.org\/10.1145\/3366423.3380113","relation":{},"subject":[],"published":{"date-parts":[[2020,4,20]]},"assertion":[{"value":"2020-04-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}