{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T06:11:20Z","timestamp":1775283080783,"version":"3.50.1"},"reference-count":57,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2005,8,1]],"date-time":"2005-08-01T00:00:00Z","timestamp":1122854400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Internet Technol."],"published-print":{"date-parts":[[2005,8]]},"abstract":"<jats:p>This article presents a characterization of the community Web of the people of Portugal. We defined criteria for delimiting this Web based on our past experience of crawling pages related to Portugal and collected over 3.2 million documents from 46,000 sites satisfying those criteria. Our characterization was derived from this crawl. We describe the rules that we established for defining the boundaries of this community Web and the methodology used to gather statistics. Statistics cover the number and domain distribution of sites; the number, type and size distribution of text documents; and the linkage structure of this Web. 
We also show how crawling constraints and abnormal situations on the Web can influence the statistics.<\/jats:p>","DOI":"10.1145\/1084772.1084775","type":"journal-article","created":{"date-parts":[[2005,11,7]],"date-time":"2005-11-07T16:00:45Z","timestamp":1131379245000},"page":"508-531","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":29,"title":["Characterizing a national community web"],"prefix":"10.1145","volume":"5","author":[{"given":"Daniel","family":"Gomes","sequence":"first","affiliation":[{"name":"University of Lisbon, Lisboa, Portugal"}]},{"given":"M\u00e1rio J.","family":"Silva","sequence":"additional","affiliation":[{"name":"University of Lisbon, Lisboa, Portugal"}]}],"member":"320","published-online":{"date-parts":[[2005,8]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the Euroweb Conference. B. Matthews, B. Hopgood, and M. Wilson, Eds","author":"Aires R.","unstructured":"Aires, R. and Santos, D. 2002. Measuring the Web in Portuguese. In Proceedings of the Euroweb Conference. B. Matthews, B. Hopgood, and M. Wilson, Eds. Oxford, UK, 198--199."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of 3rd ECDL Workshop on Web Archives","author":"Albertsen K.","year":"2003","unstructured":"Albertsen, K. 2003. The paradigma Web harvesting environment. In Proceedings of 3rd ECDL Workshop on Web Archives. Trondheim, Norway."},{"key":"e_1_2_1_3_1","volume-title":"RFC","author":"Barr D.","year":"1996","unstructured":"Barr , D. 1996 . RFC 1912. IETF.]] Barr, D. 1996. RFC 1912. 
IETF.]]"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 8th International Conference on the World Wide Web. Elsevier, 1579--1590","author":"Bharat K.","unstructured":"Bharat, K. and Broder, A. 1999. Mirror, mirror on the Web: A study of host pairs with replicated content. In Proceedings of the 8th International Conference on the World Wide Web. Elsevier, 1579--1590."},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the IEEE International Conference on Data Mining. IEEE Computer Society, 51--58","author":"Bharat K.","unstructured":"Bharat, K., Chang, B.-W., Henzinger, M. R., and Ruhl, M. 2001. Who links to whom: Mining linkage between Web sites. In Proceedings of the IEEE International Conference on Data Mining. IEEE Computer Society, 51--58."},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the 11th International World Wide Web Conference","author":"Boldi P.","unstructured":"Boldi, P., Codenotti, B., Santini, M., and Vigna, S. 2002. Structural properties of the African Web. In Proceedings of the 11th International World Wide Web Conference. Honolulu, Hawaii."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0169-7552(98)00110-X"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the 9th International World Wide Web Conference on Computer Networks. 
North-Holland Publishing Co., 309--320","author":"Broder A.","unstructured":"Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A., and Wiener, J. 2000. Graph structure in the Web. In Proceedings of the 9th International World Wide Web Conference on Computer Networks. North-Holland Publishing Co., 309--320."},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the 6th International Conference on the World Wide Web. Elsevier, 1157--1166","author":"Broder A. Z.","unstructured":"Broder, A. Z., Glassman, S. C., Manasse, M. S., and Zweig, G. 1997. Syntactic clustering of the Web. In Proceedings of the 6th International Conference on the World Wide Web. Elsevier, 1157--1166."},{"key":"e_1_2_1_10_1","volume-title":"the 3rd Annual Symposium on Document Analysis and Information Retrieval. 161--175","author":"Cavnar W.","unstructured":"Cavnar, W. and Trenkle, J. 1994. N-gram-based text categorization. In the 3rd Annual Symposium on Document Analysis and Information Retrieval. 161--175."},{"key":"e_1_2_1_11_1","unstructured":"Center H. S. D. 2003. Geo targeting IP address to country city region ISP latitude longitude database for Internet developers---ip2location. Available at http:\/\/www.ip2location.com\/.]]  Center H. S. D. 2003. 
Geo targeting IP address to country city region ISP latitude longitude database for Internet developers---ip2location. Available at http:\/\/www.ip2location.com\/.]]"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the 26th International Conference on Very Large Data Bases. (Sept.) 10--14","author":"Cho J.","unstructured":"Cho, J. and Garcia-Molina, H. 2000. The evolution of the Web and implications for an incremental crawler. In Proceedings of the 26th International Conference on Very Large Data Bases. (Sept.) 10--14, 200--209."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0169-7552(98)00108-1"},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Davis C. Vixie P. Goodwin T. and Dickinson I. 1996. A means for expressing location information in the domain name system. RFC 1876. IETF.","DOI":"10.17487\/rfc1876"},{"key":"e_1_2_1_15_1","unstructured":"Day M. 2003. Collecting and preserving the World Wide Web. Available at http:\/\/www.jisc.ac.uk\/uploaded_documents\/archiving_feasibility.pdf."},{"key":"e_1_2_1_16_1","volume-title":"the USENIX Symposium on Internet Technologies and Systems.","author":"Douglis F.","unstructured":"Douglis , F. , Feldmann , A. , Krishnamurthy , B. , and Mogul , J. C . 1997. Rate of change and other metrics: A live study of the World Wide Web . In the USENIX Symposium on Internet Technologies and Systems.]] Douglis, F., Feldmann, A., Krishnamurthy, B., and Mogul, J. C. 1997. 
Rate of change and other metrics: A live study of the World Wide Web. In the USENIX Symposium on Internet Technologies and Systems.]]"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 12th International World Wide Web Conference","author":"Fetterly D.","unstructured":"Fetterly, D., Manasse, M., Najork, M., and Wiener, J. L. 2003. A large-scale study of the evolution of Web pages. In Proceedings of the 12th International World Wide Web Conference. Budapest, Hungary. 10.1145\/775152.775246"},{"key":"e_1_2_1_18_1","volume-title":"the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Flake G.","unstructured":"Flake, G., Lawrence, S., and Giles, C. L. 2000. Efficient identification of Web communities. In the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Boston, MA. 150--160. 10.1145\/347090.347121"},{"key":"e_1_2_1_19_1","unstructured":"Funredes. 2001. The place of latin languages on the Internet. Available at http:\/\/funredes.org\/lc"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 9th ACM Conference on Hypertext and Hypermedia","author":"Gibson D.","unstructured":"Gibson , D. , Kleinberg , J. M. , and Raghavan , P . 1998. Inferring Web communities from link topology . In Proceedings of the 9th ACM Conference on Hypertext and Hypermedia . Pittsburgh, PA. 
225--234.]] 10.1145\/276627.276652 Gibson, D., Kleinberg, J. M., and Raghavan, P. 1998. Inferring Web communities from link topology. In Proceedings of the 9th ACM Conference on Hypertext and Hypermedia. Pittsburgh, PA. 225--234.]] 10.1145\/276627.276652"},{"key":"e_1_2_1_21_1","unstructured":"Gomes D. 2003. Vi\u00fava negra. Available at www.tumba.pt\/english\/crawler.html."},{"key":"e_1_2_1_22_1","unstructured":"Google. 2003. Google Web search features. Available at www.google.com\/help\/features.html#link."},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of RIAO'2000---Content-Based Multimedia Information Access","author":"Grefenstette G.","unstructured":"Grefenstette, G. and Nioche, J. 2000. Estimation of english and non-english language use on the WWW. In Proceedings of RIAO'2000---Content-Based Multimedia Information Access. Paris, France. 237--246."},{"key":"e_1_2_1_24_1","doi-asserted-by":"crossref","unstructured":"Harrenstien K. Stahl M. K. and Feinler E. J. 1985. NICNAME\/WHOIS. RFC 954. IETF.","DOI":"10.17487\/rfc0954"},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1080\/15427951.2004.10129079","article-title":"Algorithmic challenges in Web search engines","volume":"1","author":"Henzinger M.","year":"2003","unstructured":"Henzinger , M. 2003 . Algorithmic challenges in Web search engines . J. Internet Math. 1 , 1, 115 -- 126 .]] Henzinger, M. 2003. Algorithmic challenges in Web search engines. J. Internet Math. 
1, 1, 115--126.]]","journal-title":"J. Internet Math."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1019213109274"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 11th International World Wide Web Conference","author":"Kelly T.","unstructured":"Kelly, T. and Mogul, J. 2002. Aliasing on the World Wide Web: Prevalence and performance implications. In Proceedings of the 11th International World Wide Web Conference. Honolulu, Hawaii. 10.1145\/511446.511484"},{"key":"e_1_2_1_28_1","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1038\/21987","article-title":"Accessibility of information on the Web","volume":"400","author":"Lawrence S.","year":"1999","unstructured":"Lawrence, S. and Giles, C. L. 1999. Accessibility of information on the Web. Nature 400, 107--109.","journal-title":"Nature"},{"key":"e_1_2_1_29_1","unstructured":"Leung S.-T. A. Perl S. E. Stata R. and Wiener J. L. 2001. Towards Web-scale Web archeology. Tech. rep. 174 (Sept.) Compaq Research Center Palo Alto CA."},{"key":"e_1_2_1_30_1","volume-title":"Maxmind: How to locate your Internet visitors geotargeting IP address to country state city ISP organization latitude longitude.","year":"2003","unstructured":"LLC, M. 2003 . Maxmind: How to locate your Internet visitors geotargeting IP address to country state city ISP organization latitude longitude. Available at http:\/\/www.maxmind.com\/.]] LLC, M. 2003. 
Maxmind: How to locate your Internet visitors geotargeting IP address to country state city ISP organization latitude longitude. Available at http:\/\/www.maxmind.com\/.]]"},{"key":"e_1_2_1_31_1","unstructured":"Marktest. 2003. Netpanel. Available at netpanel.marktest.pt\/."},{"key":"e_1_2_1_32_1","volume-title":"A trace-based analysis of duplicate suppression in HTTP. Tech. rep. 99\/2, (Nov","author":"Mogul J.","unstructured":"Mogul, J. 1999a. A trace-based analysis of duplicate suppression in HTTP. Tech. rep. 99\/2, (Nov.) Compaq Computer Corporation, Western Research Laboratory."},{"key":"e_1_2_1_33_1","volume-title":"Errors in timestamp-based HTTP header values. Tech. rep. 99\/3, (Dec","author":"Mogul J.","unstructured":"Mogul, J. 1999b. Errors in timestamp-based HTTP header values. Tech. rep. 99\/3, (Dec.) Compaq Computer Corporation, Western Research Laboratory."},{"key":"e_1_2_1_34_1","doi-asserted-by":"crossref","unstructured":"Najork M. and Heydon A. 2001. On high-performance Web crawling. SR Tech A68 Compaq Research Center Palo Alto CA.","DOI":"10.1007\/978-1-4615-0005-6_2"},{"key":"e_1_2_1_35_1","volume-title":"Netcraft","author":"Netcraft Ltd","year":"2004","unstructured":"Netcraft Ltd . 2004 . Netcraft : April 2003 archives. Available at http:\/\/news.netcraft.com\/archives\/2003\/04\/index.html.]] Netcraft Ltd. 2004. Netcraft: April 2003 archives. 
Available at http:\/\/news.netcraft.com\/archives\/2003\/04\/index.html.]]"},{"key":"e_1_2_1_36_1","unstructured":"Nicolau M. J. Macedo J. and Costa A. 1997. Caracteriza\u00e7\u00e3o da informa\u00e7\u00e3o WWW na RCCN. Tech. Rep. Universidade do Minho Portugal."},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the 5th European Conference on Research and Advanced Technology for Digital Libraries. Springer-Verlag, 200--212","author":"Noronha N.","unstructured":"Noronha, N., Campos, J. P., Gomes, D., Silva, M. J., and Borbinha, J. 2001. A deposit for digital collections. In Proceedings of the 5th European Conference on Research and Advanced Technology for Digital Libraries. Springer-Verlag, 200--212."},{"key":"e_1_2_1_38_1","unstructured":"OCLC. 2003. Web characterization. Available at http:\/\/wcp.oclc.org\/."},{"key":"e_1_2_1_39_1","unstructured":"O'Neill E. T. 1999. Web sites: Concepts issues and definitions. Available at http:\/\/wcp.oclc.org\/pubs\/rn1-websites.html."},{"key":"e_1_2_1_40_1","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1045\/april2003-lavoie","article-title":"Trends in the evolution of the public Web","volume":"9","author":"O'Neill E. T.","year":"2003","unstructured":"O'Neill , E. T. , Lavoie , B. F. , and Bennett , R. 2003 . Trends in the evolution of the public Web . D-Lib Magazine 9 , 4 (April).]] O'Neill, E. 
T., Lavoie, B. F., and Bennett, R. 2003. Trends in the evolution of the public Web. D-Lib Magazine 9, 4 (April).]]","journal-title":"D-Lib Magazine"},{"key":"e_1_2_1_41_1","unstructured":"Overture Services I. 2003. Alltheweb.com: Frequently asked questions---URL investigator. Available at www.alltheweb.com\/help\/faqs\/url_investigator."},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the 13th USENIX Conference on System Administration. 69--78","author":"Periakaruppan R.","unstructured":"Periakaruppan, R. and Nemeth, E. 1999. GTrace: A graphical traceroute tool. In Proceedings of the 13th USENIX Conference on System Administration. 69--78."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0169-7552(98)00066-X"},{"key":"e_1_2_1_44_1","doi-asserted-by":"crossref","unstructured":"Postel J. 1994. Domain name system structure and delegation. RFC 1591. IETF.","DOI":"10.17487\/rfc1591"},{"key":"e_1_2_1_45_1","volume-title":"Proceedings of the Asia Pacific Advance Network. 225--230","author":"Punpiti S. S.","year":"2000","unstructured":"Punpiti, S. S. 2000. Measuring and analysis of the Thai World Wide Web. In Proceedings of the Asia Pacific Advance Network. 225--230."},{"key":"e_1_2_1_46_1","doi-asserted-by":"crossref","unstructured":"Rivest R. 1992. The MD5 message-digest algorithm. RFC 1321. IETF.]]   Rivest R. 1992. The MD5 message-digest algorithm. RFC 1321. 
IETF.]]","DOI":"10.17487\/rfc1321"},{"key":"e_1_2_1_47_1","doi-asserted-by":"crossref","first-page":"204","DOI":"10.1007\/10704656_13","article-title":"Finding near-replicas of documents on the Web. In the International Workshop on the World Wide Web and Web Databases","volume":"1590","author":"Shivakumar","year":"1998","unstructured":"Shivakumar and Garcia-Molina. 1998. Finding near-replicas of documents on the Web. In the International Workshop on the World Wide Web and Web Databases. Lecture Notes in Computer Science, vol. 1590, 204--212.","journal-title":"Lecture Notes in Computer Science"},{"key":"e_1_2_1_48_1","volume-title":"Netcensus: Medi\u00e7\u00e3o da evolu\u00e7\u00e3o dos conte\u00fados na web. Tech. rep. Departamento de Inform\u00e1tica, Universidade do Minho, Portugal.","author":"Silva L. O.","year":"2002","unstructured":"Silva, L. O., Macedo, J., Costa, A., Belo, O., and Santos, A. 2002a. Netcensus: Medi\u00e7\u00e3o da evolu\u00e7\u00e3o dos conte\u00fados na web. Tech. rep. Departamento de Inform\u00e1tica, Universidade do Minho, Portugal."},{"key":"e_1_2_1_49_1","unstructured":"Silva L. O. Macedo J. Costa A. Belo O. and Santos A. 2002b. Obten\u00e7\u00e3o de estat\u00edsticas do www em Portugal. Tech. rep. Universidade do Minho Portugal.]]  Silva L. O. Macedo J. Costa A. Belo O. and Santos A. 2002b. Obten\u00e7\u00e3o de estat\u00edsticas do www em Portugal. Tech. rep. 
Universidade do Minho Portugal.]]"},{"key":"e_1_2_1_50_1","volume-title":"Proceedings of IADIS International Conference WWW\/Internet","author":"Silva M. J.","year":"2003","unstructured":"Silva, M. J. 2003. The case for a portuguese Web search engine. In Proceedings of IADIS International Conference WWW\/Internet. Algarve, Portugal."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/602421.602422"},{"key":"e_1_2_1_52_1","unstructured":"W3C. 1999. HTML 4.01 specification. Available at http:\/\/www.w3.org\/TR\/html401\/."},{"key":"e_1_2_1_53_1","unstructured":"W3C. 1999. Web characterization terminology and definitions sheet. Available at http:\/\/www.w3.org\/1999\/05\/WCA-terms\/."},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the Preservation Conference","author":"Webb C.","year":"2000","unstructured":"Webb, C. 2000. Towards a preserved national collection of selected Australian digital publications. In Proceedings of the Preservation Conference. York, UK."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1389-1286(99)00037-7"},{"key":"e_1_2_1_56_1","volume-title":"Proceedings of the 3rd ECDL Workshop on Web Archives","author":"Zabicka P.","year":"2003","unstructured":"Zabicka , P. 2003 . Archiving the Czech Web: Issues and challenges . In Proceedings of the 3rd ECDL Workshop on Web Archives . Trondheim, Norway.]] Zabicka, P. 2003. Archiving the Czech Web: Issues and challenges. 
In Proceedings of the 3rd ECDL Workshop on Web Archives. Trondheim, Norway.]]"},{"key":"e_1_2_1_57_1","doi-asserted-by":"crossref","unstructured":"Zook M. 2000. Internet metrics: Using host and domain counts to map the Internet. Telecomm. Policy 24 6\/7 613--620.","DOI":"10.1016\/S0308-5961(00)00039-2"}],"container-title":["ACM Transactions on Internet Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1084772.1084775","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1084772.1084775","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T16:08:24Z","timestamp":1750262904000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1084772.1084775"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,8]]},"references-count":57,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2005,8]]}},"alternative-id":["10.1145\/1084772.1084775"],"URL":"https:\/\/doi.org\/10.1145\/1084772.1084775","relation":{},"ISSN":["1533-5399","1557-6051"],"issn-type":[{"value":"1533-5399","type":"print"},{"value":"1557-6051","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,8]]},"assertion":[{"value":"2005-08-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}