{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T12:31:00Z","timestamp":1763469060561},"publisher-location":"Berlin, Heidelberg","reference-count":31,"publisher":"Springer Berlin Heidelberg","isbn-type":[{"type":"print","value":"9783642405006"},{"type":"electronic","value":"9783642405013"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2013]]},"DOI":"10.1007\/978-3-642-40501-3_33","type":"book-chapter","created":{"date-parts":[[2013,8,30]],"date-time":"2013-08-30T04:37:43Z","timestamp":1377837463000},"page":"321-332","source":"Crossref","is-referenced-by-count":0,"title":["Restoring Semantically Incomplete Document Collections Using Lexical Signatures"],"prefix":"10.1007","author":[{"given":"Luis","family":"Meneses","sequence":"first","affiliation":[]},{"given":"Himanshu","family":"Barthwal","sequence":"additional","affiliation":[]},{"given":"Sanjeev","family":"Singh","sequence":"additional","affiliation":[]},{"given":"Richard","family":"Furuta","sequence":"additional","affiliation":[]},{"given":"Frank","family":"Shipman","sequence":"additional","affiliation":[]}],"member":"297","reference":[{"key":"33_CR1","doi-asserted-by":"crossref","unstructured":"Bogen, P.L., Pogue, D., Poursardar, F., Li, Y., Furuta, R., Shipman, F.: WPv4: a re-imagined Walden\u2019s paths to support diverse user communities. In: Proc. of the 11th Annual International ACM\/IEEE Joint Conference on Digital Libraries, Ottawa, Ontario, Canada, pp. 419\u2013420 (2011)","DOI":"10.1145\/1998076.1998164"},{"key":"33_CR2","first-page":"224","volume":"25","author":"L. Cassel","year":"2010","unstructured":"Cassel, L., Fox, E., Shipman, F., Brusilovsky, P., Fax, W., Garcia, D., Hislop, G., Furuta, R., Delcambre, L., Potluri, S.: Ensemble: enriching communities and collections to support education in computing: poster session. Journal of Computing Sciences in Colleges\u00a025, 224\u2013226 (2010)","journal-title":"Journal of Computing Sciences in Colleges"},{"key":"33_CR3","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1145\/1592761.1592794","volume":"52","author":"F. McCown","year":"2009","unstructured":"McCown, F., Marshall, C.C., Nelson, M.L.: Why web sites are lost (and how they\u2019re sometimes found). Communications of the ACM\u00a052, 141\u2013145 (2009)","journal-title":"Communications of the ACM"},{"key":"33_CR4","doi-asserted-by":"crossref","unstructured":"Klein, M., Ware, J., Nelson, M.L.: Rediscovering missing web pages using link neighborhood lexical signatures. In: Proc. of the 11th Annual International ACM\/IEEE Joint Conference on Digital libraries, Ottawa, Ontario, Canada (2011)","DOI":"10.1145\/1998076.1998101"},{"key":"33_CR5","doi-asserted-by":"crossref","unstructured":"Klein, M., Nelson, M.L.: Evaluating methods to rediscover missing web pages from the web infrastructure. In: Proc. Of The 10th Annual Joint Conference on Digital Libraries, Gold Coast, Queensland, Australia (2010)","DOI":"10.1145\/1816123.1816133"},{"key":"33_CR6","doi-asserted-by":"crossref","unstructured":"Bar-Yossef, Z., Broder, A.Z., Kumar, R., Tomkins, A.: Sic transit gloria telae: towards an understanding of the web\u2019s decay. In: Proc. of the 13th International Conference on World Wide Web, New York, NY, USA (2004)","DOI":"10.1145\/988672.988716"},{"key":"33_CR7","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"125","DOI":"10.1007\/978-3-642-33290-6_14","volume-title":"Theory and Practice of Digital Libraries","author":"H.M. SalahEldeen","year":"2012","unstructured":"SalahEldeen, H.M., Nelson, M.L.: Losing My Revolution: How Many Resources Shared on Social Media Have Been Lost? In: Zaphiris, P., Buchanan, G., Rasmussen, E., Loizides, F. (eds.) TPDL 2012. LNCS, vol.\u00a07489, pp. 125\u2013137. Springer, Heidelberg (2012)"},{"key":"33_CR8","doi-asserted-by":"crossref","unstructured":"Francisco-Revilla, L., Shipman, F., Furuta, R., Karadkar, U., Arora, A.: Managing change on the web. In: Proc. of the 1st ACM\/IEEE-CS Joint Conference on Digital Libraries, Roanoke, Virginia, United States (2001)","DOI":"10.1145\/379437.379973"},{"key":"33_CR9","doi-asserted-by":"crossref","unstructured":"Francisco-Revilla, L., Shipman, F., Furuta, R., Karadkar, U., Arora, A.: Perception of content, structure, and presentation changes in Web-based hypertext. In: Proc. of the 12th ACM Conference on Hypertext and Hypermedia, Arhus, Denmark (2001)","DOI":"10.1145\/504216.504266"},{"key":"33_CR10","doi-asserted-by":"crossref","unstructured":"Logasa Bogen, P., Francisco-Revilla, L., Furuta, R., Hubbard, T., Karadkar, U.P., Shipman, F.: Longitudinal study of changes in blogs. In: Proc. of the 7th ACM\/IEEE-CS Joint Conference on Digital Libraries, Vancouver, BC, Canada (2007)","DOI":"10.1145\/1255175.1255201"},{"key":"33_CR11","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"197","DOI":"10.1007\/978-3-642-33290-6_22","volume-title":"Theory and Practice of Digital Libraries","author":"L. Meneses","year":"2012","unstructured":"Meneses, L., Furuta, R., Shipman, F.: Identifying \u201cSoft 404\u201d Error Pages: Analyzing the Lexical Signatures of Documents in Distributed Collections. In: Zaphiris, P., Buchanan, G., Rasmussen, E., Loizides, F. (eds.) TPDL 2012. LNCS, vol.\u00a07489, pp. 197\u2013208. Springer, Heidelberg (2012)"},{"key":"33_CR12","doi-asserted-by":"crossref","unstructured":"Dalal, Z., Dash, S., Dave, P., Francisco-Revilla, L., Furuta, R., Karadkar, U., Shipman, F.: Managing distributed collections: evaluating web page changes, movement, and replacement. In: Proc. of the 4th ACM\/IEEE-CS Joint Conference on Digital Libraries, Tuscon, AZ, USA, pp. 160\u2013168 (2004)","DOI":"10.1145\/996350.996387"},{"key":"33_CR13","doi-asserted-by":"crossref","unstructured":"Baeza-Yates, R., Pereira, I., Ziviani, N.: Genealogical trees on the web: a search engine user perspective. In: Proc. of the 17th International Conference on World Wide Web, Beijing, China (2008)","DOI":"10.1145\/1367497.1367548"},{"key":"33_CR14","doi-asserted-by":"publisher","first-page":"201","DOI":"10.1145\/367701.367702","volume":"32","author":"H. Ashman","year":"2000","unstructured":"Ashman, H.: Electronic document addressing: dealing with change. ACM Computing Surveys\u00a032, 201\u2013212 (2000)","journal-title":"ACM Computing Surveys"},{"key":"33_CR15","doi-asserted-by":"crossref","unstructured":"Ashman, H., Davis, H., Whitehead, J., Caughey, S.: Missing the 404: link integrity on the World Wide Web. In: Proc. of the Seventh International Conference on World Wide Web, Brisbane, Australia (1998)","DOI":"10.1016\/S0169-7552(98)00131-7"},{"key":"33_CR16","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1145\/345966.346026","volume":"31","author":"H.C. Davis","year":"1999","unstructured":"Davis, H.C.: Hypertext link integrity. ACM Computing Surveys\u00a031, 28 (1999)","journal-title":"ACM Computing Surveys"},{"key":"33_CR17","doi-asserted-by":"crossref","unstructured":"Davis, H.C.: Referential integrity of links in open hypermedia systems. In: Proc. of the Ninth ACM Conference on Hypertext and Hypermedia, Pittsburgh, Pennsylvania, United States (1998)","DOI":"10.1145\/276627.276650"},{"key":"33_CR18","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1038\/scientificamerican0397-82","volume":"276","author":"B. Kahle","year":"1997","unstructured":"Kahle, B.: Preserving the Internet. Scientific American\u00a0276, 82\u201383 (1997)","journal-title":"Scientific American"},{"key":"33_CR19","doi-asserted-by":"publisher","first-page":"162","DOI":"10.1002\/asi.10018","volume":"53","author":"W. Koehler","year":"2002","unstructured":"Koehler, W.: Web page change and persistence\u2014a four-year longitudinal study. Journal of the American Society for Information Science and Technology\u00a053, 162\u2013171 (2002)","journal-title":"Journal of the American Society for Information Science and Technology"},{"key":"33_CR20","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1145\/602421.602422","volume":"46","author":"D. Spinellis","year":"2003","unstructured":"Spinellis, D.: The decay and failures of web references. Communications of the ACM\u00a046, 71\u201377 (2003)","journal-title":"Communications of the ACM"},{"key":"33_CR21","unstructured":"Phelps, T.A., Wilensky, R.: Robust Hyperlinks Cost Just Five Words Each. University of California at Berkeley (2000)"},{"key":"33_CR22","doi-asserted-by":"publisher","first-page":"540","DOI":"10.1145\/1028099.1028101","volume":"22","author":"S.-T. Park","year":"2004","unstructured":"Park, S.-T., Pennock, D.M., Giles, C.L., Krovetz, R.: Analysis of lexical signatures for improving information persistence on the World Wide Web. Transactions on Information Systems\u00a022, 540\u2013572 (2004)","journal-title":"Transactions on Information Systems"},{"key":"33_CR23","doi-asserted-by":"crossref","unstructured":"Klein, M., Shipman, J., Nelson, M.L.: Is this a good title? In: Proc. of the 21st ACM Conference on Hypertext and Hypermedia, Toronto, Ontario, Canada (2010)","DOI":"10.1145\/1810617.1810621"},{"key":"33_CR24","doi-asserted-by":"crossref","unstructured":"McCown, F., Smith, J.A., Nelson, M.L.: Lazy preservation: reconstructing websites by crawling the crawlers. In: Proc. of the 8th Annual ACM International Workshop on Web Information and Data Management, Arlington, Virginia, USA, pp. 67\u201374 (2006)","DOI":"10.1145\/1183550.1183564"},{"key":"33_CR25","first-page":"1157","volume":"29","author":"A.Z. Broder","year":"1997","unstructured":"Broder, A.Z., Glassman, S.C., Manasse, M.S., Zweig, G.: Syntactic clustering of the Web. Computer Networks\u00a029, 1157\u20131166 (1997)","journal-title":"Computer Networks"},{"key":"33_CR26","doi-asserted-by":"crossref","unstructured":"Charikar, M.S.: Similarity estimation techniques from rounding algorithms. In: Proc. of the Thiry-fourth Annual ACM Symposium on Theory of Computing, Montreal, Quebec, Canada (2002)","DOI":"10.1145\/509961.509965"},{"key":"33_CR27","unstructured":"Manber, U.: Finding similar files in a large file system. In: Proc. of the USENIX Winter 1994 Technical Conference, San Francisco, California (1994)"},{"key":"33_CR28","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"204","DOI":"10.1007\/10704656_13","volume-title":"The World Wide Web and Databases","author":"N. Shivakumar","year":"1999","unstructured":"Shivakumar, N., Garcia-Molina, H.: Finding Near-Replicas of Documents and Servers on the Web. In: Atzeni, P., Mendelzon, A.O., Mecca, G. (eds.) WebDB 1998. LNCS, vol.\u00a01590, pp. 204\u2013212. Springer, Heidelberg (1999)"},{"key":"33_CR29","doi-asserted-by":"crossref","unstructured":"Brin, S., Davis, J., Garcia-Molina, H.: Copy detection mechanisms for digital documents. In: Proc. of the 1995 ACM SIGMOD International Conference on Management of Data, San Jose, California, USA, pp. 398\u2013409 (1995)","DOI":"10.1145\/568271.223855"},{"key":"33_CR30","doi-asserted-by":"crossref","unstructured":"Forman, G., Eshghi, K., Chiocchetti, S.: Finding similar files in large document repositories. In: Proc. of the eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, Chicago, Illinois, USA (2005)","DOI":"10.1145\/1081870.1081916"},{"key":"33_CR31","doi-asserted-by":"crossref","unstructured":"McCown, F., Nelson, M.L.: Search engines and their public interfaces: which apis are the most synchronized? In: Proc. of the 16th International Conference on World Wide Web, Banff, Alberta, Canada (2007)","DOI":"10.1145\/1242572.1242763"}],"container-title":["Lecture Notes in Computer Science","Research and Advanced Technology for Digital Libraries"],"original-title":[],"link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-642-40501-3_33","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,5,17]],"date-time":"2019-05-17T00:42:35Z","timestamp":1558053755000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/978-3-642-40501-3_33"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013]]},"ISBN":["9783642405006","9783642405013"],"references-count":31,"URL":"https:\/\/doi.org\/10.1007\/978-3-642-40501-3_33","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2013]]}}}