{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T09:35:07Z","timestamp":1763458507731,"version":"3.45.0"},"reference-count":39,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2017,3,31]],"date-time":"2017-03-31T00:00:00Z","timestamp":1490918400000},"content-version":"vor","delay-in-days":365,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000006","name":"Office of Naval Research","doi-asserted-by":"publisher","award":["N00014-12-1-0486"],"award-info":[{"award-number":["N00014-12-1-0486"]}],"id":[{"id":"10.13039\/100000006","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["NSF 14-19297"],"award-info":[{"award-number":["NSF 14-19297"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2016,7,14]]},"abstract":"<jats:p>The process of generating schematic maps of salient objects from a set of pictures of an indoor environment is challenging. It has been an active area of research as it is crucial to a wide range of context- and location-aware services, as well as for general scene understanding. Although many automated systems have been developed to solve the problem, most of them either require predefining labels or expensive equipment, such as RGBD sensors or lasers, to scan the environment. In this article, we introduce a prototype system to show how human computations can be utilized to generate schematic maps from a set of pictures, without making strong assumptions or demanding extra devices. 
The system requires humans (crowd workers from Amazon Mechanical Turk) to do simple spatial mapping tasks in various conditions, and their data are aggregated by filtering and clustering techniques that allow salient cues to be identified in the pictures and their spatial relations to be inferred and projected on a two-dimensional map. In particular, we tested and demonstrated the effectiveness of two methods that improved the quality of the generated schematic map: (1) We encouraged humans to adopt an allocentric representation of salient objects by guiding them to perform mental rotations of these objects and (2) we sensitized human perception by guided arrows superimposed on the imagery to improve the accuracy of depth and width estimation. We demonstrated the feasibility of our system by evaluating the results of schematic maps generated from indoor pictures taken from an office building. By calculating Riemannian shape distances between the generated maps and the ground truth, we found that the generated schematic maps captured the spatial relations well. Our results showed that the combination of human computations and machine clustering could lead to more accurate schematized maps from imagery. We also discuss how our approach may offer important insights into methods that leverage human computations in other areas.<\/jats:p>","DOI":"10.1145\/2873065","type":"journal-article","created":{"date-parts":[[2016,3,31]],"date-time":"2016-03-31T07:55:39Z","timestamp":1459410939000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Leveraging Human Computations to Improve Schematization of Spatial Relations from Imagery"],"prefix":"10.1145","volume":"7","author":[{"given":"Huaming","family":"Rao","sequence":"first","affiliation":[{"name":"Nanjing University of Science and Technology, University of Illinois at Urbana-Champaign, Jiangsu, China P. 
R"}]},{"given":"Shih-Wen","family":"Huang","sequence":"additional","affiliation":[{"name":"University of Washington, WA, US"}]},{"given":"Wai-Tat","family":"Fu","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign, IL, US"}]}],"member":"320","published-online":{"date-parts":[[2016,3,31]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2424321.2424335"},{"key":"e_1_2_1_2_1","unstructured":"Anthony J. Aretz. 1990. Cognitive Requirements for Aircraft Navigation. Technical Report. DTIC Document."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/2590208.2590211"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/11744023_32"},{"key":"e_1_2_1_5_1","volume-title":"IFC BIM-based methodology for semi-automated building energy performance simulation. Lawrence Berkeley National Laboratory","author":"Bazjanac Vladimir","year":"2008","unstructured":"Vladimir Bazjanac. 2008. IFC BIM-based methodology for semi-automated building energy performance simulation. 
Lawrence Berkeley National Laboratory (2008)."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1866029.1866078"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1866029.1866080"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2461381.2461388"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370216.2370288"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.3758\/BF03204234"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/2898607.2898793"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.81"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2632048.2632051"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2010.5509682"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/3001460.3001507"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the 22nd Australasian Conference on Information Systems (ACIS\u201911)","author":"Geiger David","year":"2011","unstructured":"David Geiger, Michael Rosemann, and Erwin Fielt. 2011. Crowdsourcing information systems: A systems theory perspective. In Proceedings of the 22nd Australasian Conference on Information Systems (ACIS\u201911)."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2470654.2470744"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2470654.2470743"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1837885.1837906"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1960.tb03954.x"},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the 2002 IEEE Intelligent Vehicle Symposium","volume":"2","author":"Labayrade Raphael","year":"2002","unstructured":"Raphael Labayrade, Didier Aubert, and J.-P. Tarel. 2002. Real time obstacle detection in stereovision on non flat road geometry through \u201cv-disparity\u201d representation. 
In Proceedings of the 2002 IEEE Intelligent Vehicle Symposium, Vol. 2. IEEE, 646--651."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.12672\/ksis.2013.21.3.001"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2010.5539823"},{"key":"e_1_2_1_24_1","volume-title":"Proc. Roy. Soc. Lond. B: Biol. Sci. 204","author":"Marr David","year":"1979","unstructured":"David Marr and Tomaso Poggio. 1979. A computational theory of human stereo vision. Proc. Roy. Soc. Lond. B: Biol. Sci. 204, 1156 (1979), 301--328."},{"key":"e_1_2_1_25_1","volume-title":"Programmatic gold: Targeted and scalable quality assurance in crowdsourcing. Hum. Comput. 11, 11","author":"Oleson David","year":"2011","unstructured":"David Oleson, Alexander Sorokin, Greg P. Laughlin, Vaughn Hester, John Le, and Lukas Biewald. 2011. Programmatic gold: Targeted and scalable quality assurance in crowdsourcing. Hum. Comput. 11, 11 (2011)."},{"key":"e_1_2_1_26_1","volume-title":"Factors that affect depth perception in stereoscopic displays. Hum. Factors","author":"Patterson Robert","year":"1992","unstructured":"Robert Patterson, Linda Moe, and Tiger Hewitt. 1992. Factors that affect depth perception in stereoscopic displays. Hum. Factors (1992)."},{"volume-title":"Why We See What We Do Redux: A Wholly Empirical Theory of Vision","author":"Purves Dale","key":"e_1_2_1_27_1","unstructured":"Dale Purves and R. Beau Lotto. 2011. Why We See What We Do Redux: A Wholly Empirical Theory of Vision. Sinauer Associates."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1978942.1979148"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1609\/hcomp.v1i1.13082"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPIN.2011.6071951"},{"key":"e_1_2_1_31_1","volume-title":"Cooper","author":"Shepard Roger N.","year":"1986","unstructured":"Roger N. Shepard and Lynn A. Cooper. 1986. Mental Images and Their Transformations. 
MIT Press."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.171.3972.701"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.5555\/1613715.1613751"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2008.4562953"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","unstructured":"James Surowiecki. 2005. The Wisdom of Crowds. Anchor. 10.5555\/1095645","DOI":"10.5555\/1095645"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.5555\/1965992.1966009"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of International Conference on Computer Graphics Theory and Applications.","author":"Turner Eric","year":"2014","unstructured":"Eric Turner and Avideh Zakhor. 2014. Floor plan generation and room labeling of indoor environments from laser range data. In Proceedings of International Conference on Computer Graphics Theory and Applications."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/985692.985733"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2531602.2531604"}],"container-title":["ACM Transactions on Intelligent Systems and 
Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2873065","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2873065","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2873065","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T09:25:58Z","timestamp":1763457958000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2873065"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,3,31]]},"references-count":39,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2016,7,14]]}},"alternative-id":["10.1145\/2873065"],"URL":"https:\/\/doi.org\/10.1145\/2873065","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"type":"print","value":"2157-6904"},{"type":"electronic","value":"2157-6912"}],"subject":[],"published":{"date-parts":[[2016,3,31]]},"assertion":[{"value":"2015-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-12-01","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-03-31","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}