{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,30]],"date-time":"2025-12-30T08:47:35Z","timestamp":1767084455113,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":57,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,12,2]],"date-time":"2021-12-02T00:00:00Z","timestamp":1638403200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"the National Natural Science Foundation of China","award":["62072269"],"award-info":[{"award-number":["62072269"]}]},{"name":"the National Key Research and Development Program of China","award":["2020YFE0200500"],"award-info":[{"award-number":["2020YFE0200500"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,12,2]]},"DOI":"10.1145\/3485983.3494857","type":"proceedings-article","created":{"date-parts":[[2021,12,3]],"date-time":"2021-12-03T22:36:56Z","timestamp":1638571016000},"page":"426-439","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":12,"title":["Discovering obscure looking glass sites on the web to facilitate internet measurement research"],"prefix":"10.1145","author":[{"given":"Shuying","family":"Zhuang","sequence":"first","affiliation":[{"name":"Tsinghua University"}]},{"given":"Jessie Hui","family":"Wang","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Jilong","family":"Wang","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Zujiang","family":"Pan","sequence":"additional","affiliation":[{"name":"Tencent Technology"}]},{"given":"Tianhao","family":"Wu","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Fenghua","family":"Li","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Zhiyong","family":"Zhang","sequence":"additional","affiliation":[{"name":"CETCSC"}]}],"member":"320","published-online":{"date-parts":[[2021,12,3]]},"reference":[{"key":"e_1_3_2_2_1_1","unstructured":"[n.d.]. Beautiful Soup. Retrieved August 2020 from https:\/\/pypi.org\/project\/beautifulsoup4\/  [n.d.]. Beautiful Soup. Retrieved August 2020 from https:\/\/pypi.org\/project\/beautifulsoup4\/"},{"key":"e_1_3_2_2_2_1","unstructured":"[n.d.]. BGP4.as. Retrieved April 2020 from http:\/\/www.bgp4.as\/looking-glasses  [n.d.]. BGP4.as. Retrieved April 2020 from http:\/\/www.bgp4.as\/looking-glasses"},{"key":"e_1_3_2_2_3_1","unstructured":"[n.d.]. BGPlookingglass.com. Retrieved April 2020 from http:\/\/www.bgplookingglass.com  [n.d.]. BGPlookingglass.com. Retrieved April 2020 from http:\/\/www.bgplookingglass.com"},{"volume-title":"Retrieved","year":"2020","key":"e_1_3_2_2_4_1","unstructured":"[n.d.]. CAIDA AS Rank . Retrieved October , 2020 from http:\/\/as-rank.caida.org\/ [n.d.]. CAIDA AS Rank. Retrieved October, 2020 from http:\/\/as-rank.caida.org\/"},{"volume-title":"Retrieved","year":"2020","key":"e_1_3_2_2_5_1","unstructured":"[n.d.]. The CAIDA UCSD AS to Organization Mapping Dataset . Retrieved April , 2020 from https:\/\/www.caida.org\/data\/as_organizations.xml [n.d.]. The CAIDA UCSD AS to Organization Mapping Dataset. Retrieved April, 2020 from https:\/\/www.caida.org\/data\/as_organizations.xml"},{"volume-title":"Retrieved","year":"2020","key":"e_1_3_2_2_6_1","unstructured":"[n.d.]. Cougar Looking Glass . Retrieved September , 2020 from https:\/\/github.com\/Cougar\/lg [n.d.]. Cougar Looking Glass. Retrieved September, 2020 from https:\/\/github.com\/Cougar\/lg"},{"volume-title":"Retrieved","year":"2020","key":"e_1_3_2_2_7_1","unstructured":"[n.d.]. HSDN Looking Glass . Retrieved September , 2020 from https:\/\/github.com\/hsdn\/lg [n.d.]. HSDN Looking Glass. Retrieved September, 2020 from https:\/\/github.com\/hsdn\/lg"},{"key":"e_1_3_2_2_8_1","unstructured":"[n.d.]. PeeringDB. Retrieved April 2020 from http:\/\/www.peeringdb.com  [n.d.]. PeeringDB. Retrieved April 2020 from http:\/\/www.peeringdb.com"},{"key":"e_1_3_2_2_9_1","unstructured":"[n.d.]. Requests. Retrieved June 2020 from https:\/\/pypi.org\/project\/requests\/  [n.d.]. Requests. Retrieved June 2020 from https:\/\/pypi.org\/project\/requests\/"},{"volume-title":"Retrieved","year":"2020","key":"e_1_3_2_2_10_1","unstructured":"[n.d.]. Routeviews Prefix to AS mappings Dataset for IPv4 and IPv6 . Retrieved September , 2020 from https:\/\/www.caida.org\/data\/routing\/routeviews-prefix2as.xml [n.d.]. Routeviews Prefix to AS mappings Dataset for IPv4 and IPv6. Retrieved September, 2020 from https:\/\/www.caida.org\/data\/routing\/routeviews-prefix2as.xml"},{"volume-title":"Retrieved","year":"2020","key":"e_1_3_2_2_11_1","unstructured":"[n.d.]. Telephone Looking Glass . Retrieved September , 2020 from https:\/\/github.com\/telephone\/LookingGlass [n.d.]. Telephone Looking Glass. Retrieved September, 2020 from https:\/\/github.com\/telephone\/LookingGlass"},{"volume-title":"Retrieved","year":"2019","key":"e_1_3_2_2_12_1","unstructured":"[n.d.]. The CAIDA UCSD IPv4 Routed \/24 Topology Dataset . Retrieved December , 2019 from https:\/\/www.caida.org\/data\/active\/ipv4_routed_24_topology_dataset.xml [n.d.]. The CAIDA UCSD IPv4 Routed \/24 Topology Dataset. Retrieved December, 2019 from https:\/\/www.caida.org\/data\/active\/ipv4_routed_24_topology_dataset.xml"},{"key":"e_1_3_2_2_13_1","unstructured":"[n.d.]. Tldextract. Retrieved June 2020 from https:\/\/pypi.org\/project\/tldextract\/  [n.d.]. Tldextract. Retrieved June 2020 from https:\/\/pypi.org\/project\/tldextract\/"},{"key":"e_1_3_2_2_14_1","unstructured":"[n.d.]. Traceroute.org. Retrieved April 2020 from http:\/\/www.traceroute.org  [n.d.]. Traceroute.org. Retrieved April 2020 from http:\/\/www.traceroute.org"},{"key":"e_1_3_2_2_15_1","unstructured":"[n.d.]. Wiki. Retrieved September 2020 from https:\/\/en.wikipedia.org\/wiki\/Tier_1_network  [n.d.]. Wiki. Retrieved September 2020 from https:\/\/en.wikipedia.org\/wiki\/Tier_1_network"},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/SocialCom.2013.43"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/170035.170072"},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1186\/s13388-017-0029-8"},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICODSE.2016.7936110"},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1644893.1644934"},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2805789.2805796"},{"volume-title":"Inventive Communication and Computational Technologies","author":"Baweja Vanshita R","key":"e_1_3_2_2_22_1","unstructured":"Vanshita R Baweja , Rajesh Bhatia , and Manish Kumar . 2020. Support Vector Machine-Based Focused Crawler . In Inventive Communication and Computational Technologies . Springer , 673--686. Vanshita R Baweja, Rajesh Bhatia, and Manish Kumar. 2020. Support Vector Machine-Based Focused Crawler. In Inventive Communication and Computational Technologies. Springer, 673--686."},{"key":"e_1_3_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45747-X_7"},{"key":"e_1_3_2_2_24_1","volume-title":"8th USENIX Workshop on Offensive Technologies (WOOT 14)","author":"Bruno Luca","year":"2014","unstructured":"Luca Bruno , Mariano Graziano , Davide Balzarotti , and Aur\u00e9lien Francillon . 2014 . Through the looking-glass, and what eve found there . In 8th USENIX Workshop on Offensive Technologies (WOOT 14) . Luca Bruno, Mariano Graziano, Davide Balzarotti, and Aur\u00e9lien Francillon. 2014. Through the looking-glass, and what eve found there. In 8th USENIX Workshop on Offensive Technologies (WOOT 14)."},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/345508.345597"},{"key":"e_1_3_2_2_26_1","volume-title":"Gray Tunneling Based on Joint Link for Focused Crawling. In 3rd International Conference on Mechatronics, Robotics and Automation. Atlantis Press, 859--862","author":"Dong Wei","year":"2015","unstructured":"Wei Dong , Hong Ni , Haojiang Deng , and Liheng Tuo . 2015 . Gray Tunneling Based on Joint Link for Focused Crawling. In 3rd International Conference on Mechatronics, Robotics and Automation. Atlantis Press, 859--862 . Wei Dong, Hong Ni, Haojiang Deng, and Liheng Tuo. 2015. Gray Tunneling Based on Joint Link for Focused Crawling. In 3rd International Conference on Mechatronics, Robotics and Automation. Atlantis Press, 859--862."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1401890.1401920"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00799-016-0207-1"},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-30505-9_14"},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2398776.2398803"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2014.2323128"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.comcom.2017.08.015"},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-91662-0_20"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP.2012.33"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3419394.3423627"},{"key":"e_1_3_2_2_36_1","volume-title":"Scalable Anti-TrustRank with Qualified Site-level Seeds for Link-based Web Spam Detection. In Companion Proceedings of the Web Conference","author":"Whang Joyce Jiyoung","year":"2020","unstructured":"Joyce Jiyoung Whang , Yeonsung Jung , Seonggoo Kang , Dongho Yoo , and Inderjit S. Dhillon . 2020 . Scalable Anti-TrustRank with Qualified Site-level Seeds for Link-based Web Spam Detection. In Companion Proceedings of the Web Conference 2020 . 593--602. Joyce Jiyoung Whang, Yeonsung Jung, Seonggoo Kang, Dongho Yoo, and Inderjit S. Dhillon. 2020. Scalable Anti-TrustRank with Qualified Site-level Seeds for Link-based Web Spam Detection. In Companion Proceedings of the Web Conference 2020. 593--602."},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2504730.2504758"},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.5555\/3091622.3091662"},{"key":"e_1_3_2_2_39_1","volume-title":"An effective approach to enhancing a focused crawler using Google. The Journal of Supercomputing","author":"Lee Jae-Gil","year":"2019","unstructured":"Jae-Gil Lee , Donghwan Bae , Sansung Kim , Jungeun Kim , and Mun Yong Yi. 2019. An effective approach to enhancing a focused crawler using Google. The Journal of Supercomputing ( 2019 ), 1--18. Jae-Gil Lee, Donghwan Bae, Sansung Kim, Jungeun Kim, and Mun Yong Yi. 2019. An effective approach to enhancing a focused crawler using Google. The Journal of Supercomputing (2019), 1--18."},{"volume-title":"Special interest tracks and posters of the 14th international conference on World Wide Web (WWW). 1190--1191.","author":"Li Jun","key":"e_1_3_2_2_40_1","unstructured":"Jun Li , Kazutaka Furuse , and Kazunori Yamaguchi . 2005. Focused crawling by exploiting anchor text using decision tree . In Special interest tracks and posters of the 14th international conference on World Wide Web (WWW). 1190--1191. Jun Li, Kazutaka Furuse, and Kazunori Yamaguchi. 2005. Focused crawling by exploiting anchor text using decision tree. In Special interest tracks and posters of the 14th international conference on World Wide Web (WWW). 1190--1191."},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2504730.2504735"},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3278532.3278538"},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3419394.3423653"},{"key":"e_1_3_2_2_44_1","volume-title":"Workshop on Formal Methods Integration. Springer, 227--256","author":"McDowell Luke K","year":"2014","unstructured":"Luke K McDowell , Aaron Fleming , and Zane Markel . 2014 . Evaluating and extending latent methods for link-based classification . In Workshop on Formal Methods Integration. Springer, 227--256 . Luke K McDowell, Aaron Fleming, and Zane Markel. 2014. Evaluating and extending latent methods for link-based classification. In Workshop on Formal Methods Integration. Springer, 227--256."},{"key":"e_1_3_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2013.06.010"},{"key":"e_1_3_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2019.2940369"},{"key":"e_1_3_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3278532.3278556"},{"key":"e_1_3_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2009.2020798"},{"key":"e_1_3_2_2_49_1","article-title":"Study of different focused web crawler to search domain specific information","volume":"2016","author":"Pawar Nisha N","year":"2016","unstructured":"Nisha N Pawar and K Rajeswari . 2016 . Study of different focused web crawler to search domain specific information . International Journal of Computer Applications , 2016 , 136, 11 (2016). Nisha N Pawar and K Rajeswari. 2016. Study of different focused web crawler to search domain specific information. International Journal of Computer Applications, 2016, 136, 11 (2016).","journal-title":"International Journal of Computer Applications"},{"key":"e_1_3_2_2_50_1","volume-title":"Proceedings of the first instructional conference on machine learning","volume":"242","author":"Juan","unstructured":"Juan Ramos et al. 2003. Using TF-IDF to determine word relevance in document queries . In Proceedings of the first instructional conference on machine learning , Vol. 242 . New Jersey, USA, 133--142. Juan Ramos et al. 2003. Using TF-IDF to determine word relevance in document queries. In Proceedings of the first instructional conference on machine learning, Vol. 242. New Jersey, USA, 133--142."},{"key":"e_1_3_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFCOM.2009.5061988"},{"key":"e_1_3_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-50472-8_17"},{"key":"e_1_3_2_2_53_1","volume-title":"Everything Is in the Name-A URL Based Approach for Phishing Detection. In International Symposium on Cyber Security Cryptography and Machine Learning. Springer, 231--248","author":"Tupsamudre Harshal","year":"2019","unstructured":"Harshal Tupsamudre , Ajeet Kumar Singh , and Sachin Lodha . 2019 . Everything Is in the Name-A URL Based Approach for Phishing Detection. In International Symposium on Cyber Security Cryptography and Machine Learning. Springer, 231--248 . Harshal Tupsamudre, Ajeet Kumar Singh, and Sachin Lodha. 2019. Everything Is in the Name-A URL Based Approach for Phishing Detection. In International Symposium on Cyber Security Cryptography and Machine Learning. Springer, 231--248."},{"key":"e_1_3_2_2_54_1","volume-title":"TEDM-PU: A Tax Evasion Detection Method Based on Positive and Unlabeled Learning. In 2019 IEEE International Conference on Big Data (Big Data). IEEE, 1681--1686","author":"Wu Yingchao","year":"2019","unstructured":"Yingchao Wu , Qinghua Zheng , Yuda Gao , Bo Dong , Rongzhe Wei , Fa Zhang , and Huan He . 2019 . TEDM-PU: A Tax Evasion Detection Method Based on Positive and Unlabeled Learning. In 2019 IEEE International Conference on Big Data (Big Data). IEEE, 1681--1686 . Yingchao Wu, Qinghua Zheng, Yuda Gao, Bo Dong, Rongzhe Wei, Fa Zhang, and Huan He. 2019. TEDM-PU: A Tax Evasion Detection Method Based on Positive and Unlabeled Learning. In 2019 IEEE International Conference on Big Data (Big Data). IEEE, 1681--1686."},{"key":"e_1_3_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3326285.3329049"},{"key":"e_1_3_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0097079"},{"key":"e_1_3_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133956.3138825"}],"event":{"name":"CoNEXT '21: The 17th International Conference on emerging Networking EXperiments and Technologies","sponsor":["SIGCOMM ACM Special Interest Group on Data Communication"],"location":"Virtual Event Germany","acronym":"CoNEXT '21"},"container-title":["Proceedings of the 17th International Conference on emerging Networking EXperiments and Technologies"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3485983.3494857","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3485983.3494857","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:12:31Z","timestamp":1750191151000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3485983.3494857"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,2]]},"references-count":57,"alternative-id":["10.1145\/3485983.3494857","10.1145\/3485983"],"URL":"https:\/\/doi.org\/10.1145\/3485983.3494857","relation":{},"subject":[],"published":{"date-parts":[[2021,12,2]]},"assertion":[{"value":"2021-12-03","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}