{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T08:38:11Z","timestamp":1774600691683,"version":"3.50.1"},"reference-count":37,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2013,11,1]],"date-time":"2013-11-01T00:00:00Z","timestamp":1383264000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Pervasive Computing"},{"DOI":"10.13039\/100002418","name":"Intel Corporation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100002418","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000145","name":"Division of Information and Intelligent Systems","doi-asserted-by":"publisher","award":["IIS-1250793"],"award-info":[{"award-number":["IIS-1250793"]}],"id":[{"id":"10.13039\/100000145","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006785","name":"Google","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006785","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2013,11]]},"abstract":"<jats:p>\n            We introduce an approach for analyzing Wikipedia and other text, together with online photos, to produce\n            <jats:italic>annotated<\/jats:italic>\n            3D models of famous tourist sites. The approach is completely automated, and leverages online text and photo co-occurrences via Google Image Search. It enables a number of new interactions, which we demonstrate in a new 3D visualization tool. 
Text can be selected to move the camera to the corresponding objects, 3D bounding boxes provide anchors back to the text describing them, and the overall narrative of the text provides a temporal guide for automatically flying through the scene to visualize the world as you read about it. We show compelling results on several major tourist sites.\n          <\/jats:p>","DOI":"10.1145\/2508363.2508425","type":"journal-article","created":{"date-parts":[[2013,11,6]],"date-time":"2013-11-06T14:09:19Z","timestamp":1383746959000},"page":"1-10","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":35,"title":["3D Wikipedia"],"prefix":"10.1145","volume":"32","author":[{"given":"Bryan C.","family":"Russell","sequence":"first","affiliation":[{"name":"Intel Labs"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ricardo","family":"Martin-Brualla","sequence":"additional","affiliation":[{"name":"University of Washington"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel J.","family":"Butler","sequence":"additional","affiliation":[{"name":"University of Washington"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Steven M.","family":"Seitz","sequence":"additional","affiliation":[{"name":"University of Washington"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Luke","family":"Zettlemoyer","sequence":"additional","affiliation":[{"name":"University of Washington"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2013,11]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2001269.2001293"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/944919.944965"},{"key":"e_1_2_1_3_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3562--3569","author":"Berg A. C.","unstructured":"Berg , A. C. , Berg , T. L., III, H. D. , Dodge , J. 
, Goyal , A. , Han , X. , Mensch , A. , Mitchell , M. , Sood , A. , Stratos , K. , and Yamaguchi , K . 2012. Understanding and predicting importance in images . In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3562--3569 . Berg, A. C., Berg, T. L., III, H. D., Dodge, J., Goyal, A., Han, X., Mensch, A., Mitchell, M., Sood, A., Stratos, K., and Yamaguchi, K. 2012. Understanding and predicting importance in images. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3562--3569."},{"key":"e_1_2_1_4_1","unstructured":"Berlitz International I. 2003. Berlitz Rome Pocket Guide. Berlitz Pocket Guides Series. Berlitz International Incorporated.  Berlitz International I. 2003. Berlitz Rome Pocket Guide . Berlitz Pocket Guides Series. Berlitz International Incorporated."},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the third Text REtrieval Conference (TREC-3), 69--80","author":"Buckley C.","year":"1995","unstructured":"Buckley , C. 1995 . Automatic query expansion using SMART: TREC 3 . In Proceedings of the third Text REtrieval Conference (TREC-3), 69--80 . Buckley, C. 1995. Automatic query expansion using SMART: TREC 3. In Proceedings of the third Text REtrieval Conference (TREC-3), 69--80."},{"key":"e_1_2_1_6_1","volume-title":"IEEE 11th International Conference on Computer Vision (ICCV), 1--8.","author":"Chum O.","unstructured":"Chum , O. , Philbin , J. , Sivic , J. , Isard , M. , and Zisserman , A . 2007. Total recall: Automatic query expansion with a generative feature model for object retrieval . In IEEE 11th International Conference on Computer Vision (ICCV), 1--8. Chum, O., Philbin, J., Sivic, J., Isard, M., and Zisserman, A. 2007. Total recall: Automatic query expansion with a generative feature model for object retrieval. 
In IEEE 11th International Conference on Computer Vision (ICCV), 1--8."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2021049"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526812"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-009-0275-4"},{"key":"e_1_2_1_10_1","volume-title":"European Conference on Computer Vision (ECCV), 15--29","author":"Farhadi A.","unstructured":"Farhadi , A. , Hejrati , M. , Sadeghi , M. A. , Young , P. , Rashtchian , C. , Hockenmaier , J. , and Forsyth , D . 2010. Every picture tells a story: Generating sentences from images . In European Conference on Computer Vision (ECCV), 15--29 . Farhadi, A., Hejrati, M., Sadeghi, M. A., Young, P., Rashtchian, C., Hockenmaier, J., and Forsyth, D. 2010. Every picture tells a story: Generating sentences from images. In European Conference on Computer Vision (ECCV), 15--29."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2009.161"},{"key":"e_1_2_1_12_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1434--1441","author":"Furukawa Y.","unstructured":"Furukawa , Y. , Curless , B. , Seitz , S. M. , and Szeliski , R . 2010. Towards internet-scale multi-view stereo . In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1434--1441 . Furukawa, Y., Curless, B., Seitz, S. M., and Szeliski, R. 2010. Towards internet-scale multi-view stereo. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1434--1441."},{"key":"e_1_2_1_13_1","unstructured":"Garwood D. and Hole A. 2012. Lonely Planet Rome. Travel Guide. Lonely Planet Publications.  Garwood D. and Hole A. 2012. Lonely Planet Rome . Travel Guide. Lonely Planet Publications."},{"key":"e_1_2_1_14_1","volume-title":"IEEE 11th International Conference on Computer Vision (ICCV), 1--8.","author":"Goesele M.","unstructured":"Goesele , M. , Snavely , N. , Curless , B. , Hoppe , H. 
, and Seitz , S. M . 2007. Multi-view stereo for community photo collections . In IEEE 11th International Conference on Computer Vision (ICCV), 1--8. Goesele, M., Snavely, N., Curless, B., Hoppe, H., and Seitz, S. M. 2007. Multi-view stereo for community photo collections. In IEEE 11th International Conference on Computer Vision (ICCV), 1--8."},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","unstructured":"Hartley R. I. and Zisserman A. 2004. Multiple View Geometry in Computer Vision second ed. Cambridge University Press ISBN: 0521540518.   Hartley R. I. and Zisserman A. 2004. Multiple View Geometry in Computer Vision second ed. Cambridge University Press ISBN: 0521540518.","DOI":"10.1017\/CBO9780511811685"},{"key":"e_1_2_1_16_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1--8.","author":"Hays J.","unstructured":"Hays , J. , and Efros , A. A . 2008. IM2GPS: estimating geographic information from a single image . In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1--8. Hays, J., and Efros, A. A. 2008. IM2GPS: estimating geographic information from a single image. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1--8."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 4th Eurographics Symposium on Geometry Processing (SGP), 61--70","author":"Kazhdan M.","unstructured":"Kazhdan , M. , Bolitho , M. , and Hoppe , H . 2006. Poisson surface reconstruction . In Proceedings of the 4th Eurographics Symposium on Geometry Processing (SGP), 61--70 . Kazhdan, M., Bolitho, M., and Hoppe, H. 2006. Poisson surface reconstruction. 
In Proceedings of the 4th Eurographics Symposium on Geometry Processing (SGP), 61--70."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.3115\/1075096.1075150"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-011-0489-0"},{"key":"e_1_2_1_20_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1--8.","author":"Laptev I.","unstructured":"Laptev , I. , Marszalek , M. , Schmid , C. , and Rozenfeld , B . 2008. Learning realistic human actions from movies . In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1--8. Laptev, I., Marszalek, M., Schmid, C., and Rozenfeld, B. 2008. Learning realistic human actions from movies. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1--8."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:VISI.0000029664.99615.94"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 747--756","author":"Mitchell M.","year":"2012","unstructured":"Mitchell , M. , Dodge , J. , Goyal , A. , Yamaguchi , K. , Stratos , K. , Han , X. , Mensch , A. , Berg , A. C. , Berg , T. L. , and Daum\u00e9 III, H . 2012 . Midge: Generating image descriptions from computer vision detections . In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 747--756 . Mitchell, M., Dodge, J., Goyal, A., Yamaguchi, K., Stratos, K., Han, X., Mensch, A., Berg, A. C., Berg, T. L., and Daum\u00e9 III, H. 2012. Midge: Generating image descriptions from computer vision detections. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 747--756."},{"key":"e_1_2_1_23_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1--8.","author":"Philbin J.","unstructured":"Philbin , J. , Chum , O. , Isard , M. 
, Sivic , J. , and Zisserman , A . 2008. Lost in quantization: Improving particular object retrieval in large scale image databases . In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1--8. Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A. 2008. Lost in quantization: Improving particular object retrieval in large scale image databases. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1--8."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-011-0445-z"},{"key":"e_1_2_1_25_1","volume-title":"IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2759--2766","author":"Ren X.","unstructured":"Ren , X. , Bo , L. , and Fox , D . 2012. RGB-(D) Scene labeling: Features and algorithms . In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2759--2766 . Ren, X., Bo, L., and Fox, D. 2012. RGB-(D) Scene labeling: Features and algorithms. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2759--2766."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-007-0090-8"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-4571(199006)41:4<288::AID-ASI8>3.0.CO;2-H"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33715-4_54"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-88688-4_40"},{"key":"e_1_2_1_30_1","volume-title":"IEEE 9th International Conference on Computer Vision (ICCV), 1470--1477","author":"Sivic J.","unstructured":"Sivic , J. , and Zisserman , A . 2003. Video Google: A text retrieval approach to object matching in videos . In IEEE 9th International Conference on Computer Vision (ICCV), 1470--1477 . Sivic, J., and Zisserman, A. 2003. Video Google: A text retrieval approach to object matching in videos. 
In IEEE 9th International Conference on Computer Vision (ICCV), 1470--1477."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1141911.1141964"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-007-0107-3"},{"key":"e_1_2_1_33_1","unstructured":"Stop words list. http:\/\/norm.al\/2009\/04\/14\/list-of-english-stop-words\/.  Stop words list. http:\/\/norm.al\/2009\/04\/14\/list-of-english-stop-words\/."},{"key":"e_1_2_1_34_1","unstructured":"Wikipedia. http:\/\/www.wikipedia.org.  Wikipedia. http:\/\/www.wikipedia.org."},{"key":"e_1_2_1_35_1","unstructured":"Wu C. SiftGPU: A GPU implementation of scale invariant feature transform (SIFT). http:\/\/cs.unc.edu\/~ccwu\/siftgpu.  Wu C. SiftGPU: A GPU implementation of scale invariant feature transform (SIFT). http:\/\/cs.unc.edu\/~ccwu\/siftgpu."},{"key":"e_1_2_1_36_1","unstructured":"Wu C. VisualSFM: A visual structure from motion system. http:\/\/homes.cs.washington.edu\/~ccwu\/vsfm\/.  Wu C. VisualSFM: A visual structure from motion system. 
http:\/\/homes.cs.washington.edu\/~ccwu\/vsfm\/."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2011.5995552"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2508363.2508425","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2508363.2508425","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T07:28:39Z","timestamp":1750231719000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2508363.2508425"}},"subtitle":["using online text to automatically label and navigate reconstructed geometry"],"short-title":[],"issued":{"date-parts":[[2013,11]]},"references-count":37,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2013,11]]}},"alternative-id":["10.1145\/2508363.2508425"],"URL":"https:\/\/doi.org\/10.1145\/2508363.2508425","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,11]]},"assertion":[{"value":"2013-11-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}