{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T15:38:50Z","timestamp":1775230730086,"version":"3.50.1"},"reference-count":39,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2018,12,4]],"date-time":"2018-12-04T00:00:00Z","timestamp":1543881600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada","doi-asserted-by":"publisher","award":["611370"],"award-info":[{"award-number":["611370"]}],"id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["IIS-1528025"],"award-info":[{"award-number":["IIS-1528025"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001459","name":"Ministry of Education - Singapore","doi-asserted-by":"publisher","award":["MOE2016-T2-2-154"],"award-info":[{"award-number":["MOE2016-T2-2-154"]}],"id":[{"id":"10.13039\/501100001459","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100008536","name":"Amazon Web Services","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100008536","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100004344","name":"Adobe Systems","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100004344","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100005950","name":"Hong Kong University of Science and Technology","doi-asserted-by":"publisher","award":["R9429"],"award-info":[{"award-number":["R9429"]}],"id":[{"id":"10.13039\/501100005950","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Stanford AI Lab-Toyota Center for Artificial Intelligence 
Research"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2018,12,31]]},"abstract":"<jats:p>\n            We introduce a novel framework for using natural language to generate and edit 3D indoor scenes, harnessing scene semantics and text-scene grounding knowledge learned from large annotated 3D scene databases. The advantage of natural language editing interfaces is strongest when performing semantic operations at the\n            <jats:italic>sub-scene<\/jats:italic>\n            level, acting on\n            <jats:italic>groups<\/jats:italic>\n            of objects. We learn how to manipulate these sub-scenes by analyzing existing 3D scenes. We perform edits by first parsing a natural language command from the user and transforming it into a\n            <jats:italic>semantic scene graph<\/jats:italic>\n            that is used to retrieve corresponding sub-scenes from the databases that match the command. We then augment this retrieved sub-scene by incorporating other objects that may be implied by the scene context. Finally, a new 3D scene is synthesized by aligning the augmented sub-scene with the user's current scene, where new objects are spliced into the environment, possibly triggering appropriate adjustments to the existing scene arrangement. A suggestive modeling interface with multiple interpretations of user commands is used to alleviate ambiguities in natural language. 
We conduct studies comparing our approach against both prior text-to-scene work and artist-made scenes and find that our method significantly outperforms prior work and is comparable to handmade scenes even when complex and varied natural sentences are used.\n          <\/jats:p>","DOI":"10.1145\/3272127.3275035","type":"journal-article","created":{"date-parts":[[2018,11,28]],"date-time":"2018-11-28T19:16:10Z","timestamp":1543432570000},"page":"1-16","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":78,"title":["Language-driven synthesis of 3D scenes from scene databases"],"prefix":"10.1145","volume":"37","author":[{"given":"Rui","family":"Ma","sequence":"first","affiliation":[{"name":"Simon Fraser University and AltumView Systems Inc."}]},{"given":"Akshay Gadi","family":"Patil","sequence":"additional","affiliation":[{"name":"Simon Fraser University"}]},{"given":"Matthew","family":"Fisher","sequence":"additional","affiliation":[{"name":"Adobe Research"}]},{"given":"Manyi","family":"Li","sequence":"additional","affiliation":[{"name":"Shandong University and Simon Fraser University"}]},{"given":"S\u00f6ren","family":"Pirk","sequence":"additional","affiliation":[{"name":"Stanford University"}]},{"given":"Binh-Son","family":"Hua","sequence":"additional","affiliation":[{"name":"University of Tokyo"}]},{"given":"Sai-Kit","family":"Yeung","sequence":"additional","affiliation":[{"name":"Hong Kong University of Science and Technology"}]},{"given":"Xin","family":"Tong","sequence":"additional","affiliation":[{"name":"Microsoft Research Asia"}]},{"given":"Leonidas","family":"Guibas","sequence":"additional","affiliation":[{"name":"Stanford University"}]},{"given":"Hao","family":"Zhang","sequence":"additional","affiliation":[{"name":"Simon Fraser University"}]}],"member":"320","published-online":{"date-parts":[[2018,12,4]]},"reference":[{"key":"e_1_2_2_1_1","first-page":"119","article-title":"Approximate is better than 
exact for interval estimation of binomial proportions","volume":"52","author":"Agresti Alan","year":"1998","unstructured":"Alan Agresti and Brent A Coull. 1998. Approximate is better than exact for interval estimation of binomial proportions. The American Statistician 52, 2 (1998), 119--126.","journal-title":"The American Statistician"},{"key":"e_1_2_2_2_1","volume-title":"Association for Computational Linguistics and International Joint Conference on Natural Language Processing (ACL-IJCNLP).","author":"Chang Angel","unstructured":"Angel Chang, Will Monroe, Manolis Savva, Christopher Potts, and Christopher D. Manning. 2015b. Text to 3D Scene Generation with Rich Lexical Grounding. In Association for Computational Linguistics and International Joint Conference on Natural Language Processing (ACL-IJCNLP)."},{"key":"e_1_2_2_3_1","volume-title":"Manning","author":"Chang Angel X.","year":"2017","unstructured":"Angel X. Chang, Mihail Eric, Manolis Savva, and Christopher D. Manning. 2017. SceneSeer: 3D Scene Design with Natural Language. CoRR abs\/1703.00050 (2017). http:\/\/arxiv.org\/abs\/1703.00050"},{"key":"e_1_2_2_4_1","unstructured":"Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, Jianxiong Xiao, Li Yi, and Fisher Yu. 2015a. 
ShapeNet: An Information-Rich 3D Model Repository. (2015)."},{"key":"e_1_2_2_5_1","volume-title":"Proc. ACL Workshop on Interactive Language Learning, Visualization, and Interfaces (ILLVI).","author":"Chang Angel X.","unstructured":"Angel X. Chang, Manolis Savva, and Christopher D. Manning. 2014a. Interactive Learning of Spatial Knowledge for Text to 3D Scene Generation. In Proc. ACL Workshop on Interactive Language Learning, Visualization, and Interfaces (ILLVI)."},{"key":"e_1_2_2_6_1","volume-title":"Manning","author":"Chang Angel X.","year":"2014","unstructured":"Angel X. Chang, Manolis Savva, and Christopher D. Manning. 2014b. Learning Spatial Knowledge for Text to 3D Scene Generation. In Empirical Methods in Natural Language Processing (EMNLP)."},{"key":"e_1_2_2_7_1","unstructured":"Bob Coyne, Alex Klapheke, Masoud Rouhizadeh, Richard Sproat, and Daniel Bauer. 2012. Annotation Tools and Knowledge Representation for a Text-To-Scene System. In COLING. 
679--694."},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/383259.383316"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2816795.2818057"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366154"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964929"},{"key":"e_1_2_2_12_1","volume-title":"Proc. IEEE Int. Conf. on Intelligent Robots & Systems. 1640--1647","author":"Guadarrama S.","unstructured":"S. Guadarrama, L. Riano, D. Golland, D. G\u00f6hring, Y. Jia, D. Klein, P. Abbeel, and T. Darrell. 2013. Grounding Spatial Relations for Human-Robot Interaction. In Proc. IEEE Int. Conf. on Intelligent Robots & Systems. 1640--1647."},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766914"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/3DV.2016.18"},{"key":"e_1_2_2_15_1","volume-title":"Proc. Int. Conf. on Machine Learning (ICML).","author":"Jiang Yun","year":"2012","unstructured":"Yun Jiang, Marcus Lim, and Ashutosh Saxena. 2012. Learning Object Arrangements in 3D Scenes using Human Context. In Proc. Int. Conf. 
on Machine Learning (ICML)."},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366157"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766898"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2980179.2980223"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2013.245"},{"key":"e_1_2_2_20_1","volume-title":"The Stanford CoreNLP Natural Language Processing Toolkit","author":"Manning Christopher D.","unstructured":"Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In Association for Computational Linguistics (ACL) System Demonstrations. 55--60. http:\/\/www.aclweb.org\/anthology\/P\/P14\/P14-5010"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964982"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2014.X.005"},{"key":"e_1_2_2_23_1","volume-title":"Learning 3D Scene Synthesis from Annotated RGB-D Images. Computer Graphics Forum (SGP) 35, 5","author":"Sadeghipour Zeinab","year":"2016","unstructured":"Zeinab Sadeghipour, Zicheng Liao, Ping Tan, and Hao Zhang. 2016. Learning 3D Scene Synthesis from Annotated RGB-D Images. 
Computer Graphics Forum (SGP) 35, 5 (2016)."},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2015.7301289"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897824.2925867"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2006.19"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1180639.1180660"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366155"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33715-4_54"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/2386438.2386446"},{"key":"e_1_2_2_31_1","volume-title":"SUN RGB-D: A RGB-D scene understanding benchmark suite","author":"Song Shuran","unstructured":"Shuran Song, Samuel P. Lichtenberg, and Jianxiong Xiao. 2015. SUN RGB-D: A RGB-D scene understanding benchmark suite. In IEEE CVPR. 
567--576."},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.28"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364913481635"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3197517.3201362"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601109"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2980179.2980224"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964981"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2015.2417575"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.211"}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3272127.3275035","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3272127.3275035","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3272127.3275035","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T00:44:04Z","timestamp":1750207444000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3272127.3275035"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,12,4]]},"references-count":39,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2018,12,31]]}},"alternative-id":["10.1145\/3272127.3275035"],"URL":"https:\/\/doi.org\/10.1145\/3272127.3275035","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,12,4]]},"assertion":[{"value":"2018-12-04","order":2,"name":"published","lab
el":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}