{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T19:44:21Z","timestamp":1770493461729,"version":"3.49.0"},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2022,8,12]],"date-time":"2022-08-12T00:00:00Z","timestamp":1660262400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,8,12]],"date-time":"2022-08-12T00:00:00Z","timestamp":1660262400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61802344"],"award-info":[{"award-number":["61802344"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100007928","name":"Ningbo Municipal Bureau of Science and Technology","doi-asserted-by":"publisher","award":["2021Z019"],"award-info":[{"award-number":["2021Z019"]}],"id":[{"id":"10.13039\/501100007928","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Data Sci. Eng."],"published-print":{"date-parts":[[2022,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>As a fundamental part of indoor scene understanding, research on indoor room layout estimation has attracted much attention recently. The task is to predict the structure of a room from a single image. In this paper, we show that this task can be solved well even without a sophisticated post-processing program, by adopting Feature Pyramid Networks (FPN) with adaptive changes. The proposed model employs two strategies to deliver quality output. 
First, it predicts the coarse positions of key points correctly by preserving the order of these key points in the data augmentation stage. Then the coordinates of each corner point are refined by moving the point to its nearest image boundary before output. Our method demonstrates strong performance on the benchmark LSUN dataset in both processing efficiency and accuracy. Compared with the state-of-the-art end-to-end method, our method is more than twice as fast (32\u00a0ms vs. 86\u00a0ms), with a 0.71% lower key point error and a 0.2% higher pixel error. Moreover, the advanced two-step method is only 0.02% better than ours on key point error. The high efficiency and accuracy together make our method a good choice for real-time room layout estimation tasks.<\/jats:p>","DOI":"10.1007\/s41019-022-00192-6","type":"journal-article","created":{"date-parts":[[2022,8,12]],"date-time":"2022-08-12T16:02:52Z","timestamp":1660320172000},"page":"213-224","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Toward Enhancing Room Layout Estimation by Feature Pyramid Networks"],"prefix":"10.1007","volume":"7","author":[{"given":"Aopeng","family":"Wang","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2055-2553","authenticated-orcid":false,"given":"Shiting","family":"Wen","sequence":"additional","affiliation":[]},{"given":"Yunjun","family":"Gao","sequence":"additional","affiliation":[]},{"given":"Qing","family":"Li","sequence":"additional","affiliation":[]},{"given":"Ke","family":"Deng","sequence":"additional","affiliation":[]},{"given":"Chaoyi","family":"Pang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,8,12]]},"reference":[{"key":"192_CR1","doi-asserted-by":"crossref","unstructured":"Lin T, Doll\u00e1r P, Girshick RB, He K, Hariharan B, Belongie SJ (2017) Feature pyramid 
networks for object detection. In: 2017 IEEE conference on computer vision and pattern recognition, CVPR 2017, pp 936\u2013944","DOI":"10.1109\/CVPR.2017.106"},{"key":"192_CR2","doi-asserted-by":"crossref","unstructured":"Kirillov A, Girshick RB, He K, Doll\u00e1r P (2019) Panoptic feature pyramid networks. In: IEEE conference on computer vision and pattern recognition, CVPR 2019, pp 6399\u20136408","DOI":"10.1109\/CVPR.2019.00656"},{"issue":"1\u20133","key":"192_CR3","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1007\/s11263-007-0090-8","volume":"77","author":"BC Russell","year":"2008","unstructured":"Russell BC, Torralba A, Murphy KP, Freeman WT (2008) Labelme: a database and web-based tool for image annotation. Int J Comput Vis 77(1\u20133):157\u2013173","journal-title":"Int J Comput Vis"},{"key":"192_CR4","doi-asserted-by":"crossref","unstructured":"Hedau V, Hoiem D, Forsyth DA (2009) Recovering the spatial layout of cluttered rooms. In: IEEE 12th international conference on computer vision, ICCV 2009, pp 1849\u20131856","DOI":"10.1109\/ICCV.2009.5459411"},{"key":"192_CR5","unstructured":"Lee DC, Gupta A, Hebert M, Kanade T (2010) Estimating spatial layout of rooms using volumetric reasoning about objects and surfaces. In: Proceedings of the 23rd international conference on neural information processing systems-volume 1, pp 1288\u20131296"},{"key":"192_CR6","doi-asserted-by":"crossref","unstructured":"Hedau V, Hoiem D, Forsyth DA (2012) Recovering free space of indoor scenes from a single image. In: 2012 IEEE conference on computer vision and pattern recognition, pp 2807\u20132814","DOI":"10.1109\/CVPR.2012.6248005"},{"key":"192_CR7","doi-asserted-by":"crossref","unstructured":"Zhang J, Kan C, Schwing AG, Urtasun R (2013) Estimating the 3d layout of indoor scenes and its clutter from depth sensors. 
In: IEEE international conference on computer vision, ICCV 2013, pp 1273\u20131280","DOI":"10.1109\/ICCV.2013.161"},{"key":"192_CR8","doi-asserted-by":"crossref","unstructured":"Ramalingam S, Pillai JK, Jain A, Taguchi Y (2013) Manhattan junction catalogue for spatial reasoning of indoor scenes. In: 2013 IEEE conference on computer vision and pattern recognition, pp 3065\u20133072","DOI":"10.1109\/CVPR.2013.394"},{"key":"192_CR9","doi-asserted-by":"crossref","unstructured":"Mallya A, Lazebnik S (2015) Learning informative edge maps for indoor scene layout prediction. In: 2015 IEEE international conference on computer vision, ICCV 2015, pp 936\u2013944","DOI":"10.1109\/ICCV.2015.113"},{"key":"192_CR10","doi-asserted-by":"crossref","unstructured":"Dasgupta S, Fang K, Chen K, Savarese S (2016) Delay: robust spatial layout estimation for cluttered indoor scenes. In: 2016 IEEE conference on computer vision and pattern recognition, CVPR 2016, pp 616\u2013624","DOI":"10.1109\/CVPR.2016.73"},{"key":"192_CR11","doi-asserted-by":"crossref","unstructured":"Ren Y, Li S, Chen C, Kuo CJ (2016) A coarse-to-fine indoor layout estimation (CFILE) method. In: Lai S, Lepetit V, Nishino K, Sato Y (eds) Computer vision\u2014ACCV 2016\u201413th Asian conference on computer vision, revised selected papers, Part V. Lecture notes in computer science, vol 10115, pp 36\u201351","DOI":"10.1007\/978-3-319-54193-8_3"},{"key":"192_CR12","doi-asserted-by":"crossref","unstructured":"Zhang W, Zhang W, Gu J (2019) Edge-semantic learning strategy for layout estimation in indoor environment. CoRR arXiv:1901.00621","DOI":"10.1109\/TCYB.2019.2895837"},{"key":"192_CR13","doi-asserted-by":"crossref","unstructured":"Kruzhilov I, Romanov M, Babichev D, Konushin A (2019) Double refinement network for room layout estimation. In: Palaiahnakote S, di Baja GS, Wang L, Yan WQ (eds) Pattern recognition\u20145th Asian Conference, ACPR 2019, Revised selected papers, Part I. 
Lecture notes in computer science, vol 12046, pp 557\u2013568","DOI":"10.1007\/978-3-030-41404-7_39"},{"key":"192_CR14","doi-asserted-by":"crossref","unstructured":"Lee C, Badrinarayanan V, Malisiewicz T, Rabinovich A (2017) Roomnet: end-to-end room layout estimation. In: IEEE international conference on computer vision, ICCV 2017, pp 4875\u20134884","DOI":"10.1109\/ICCV.2017.521"},{"key":"192_CR15","doi-asserted-by":"crossref","unstructured":"Hirzer M, Roth PM, Lepetit V (2020) Smart hypothesis generation for efficient and robust room layout estimation. In: IEEE winter conference on applications of computer vision, WACV 2020, pp 2901\u20132909","DOI":"10.1109\/WACV45572.2020.9093451"},{"key":"192_CR16","doi-asserted-by":"crossref","unstructured":"Zou C, Colburn A, Shan Q, Hoiem D (2018) Layoutnet: reconstructing the 3d room layout from a single RGB image. In: 2018 IEEE conference on computer vision and pattern recognition, CVPR 2018, pp 2051\u20132059","DOI":"10.1109\/CVPR.2018.00219"},{"issue":"12","key":"192_CR17","doi-asserted-by":"publisher","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","volume":"39","author":"V Badrinarayanan","year":"2017","unstructured":"Badrinarayanan V, Kendall A, Cipolla R (2017) Segnet: a deep convolutional encoder\u2013decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39(12):2481\u20132495","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"1","key":"192_CR18","first-page":"1929","volume":"15","author":"N Srivastava","year":"2014","unstructured":"Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929\u20131958","journal-title":"J Mach Learn Res"},{"key":"192_CR19","doi-asserted-by":"crossref","unstructured":"Lin T, Goyal P, Girshick RB, He K, Doll\u00e1r P (2017) Focal loss for dense object detection. 
In: IEEE international conference on computer vision, ICCV 2017, pp 2999\u20133007","DOI":"10.1109\/ICCV.2017.324"},{"key":"192_CR20","unstructured":"Zhang Y, Yu F, Song S, Xu P, Seff A, Xiao J (2015) Large-scale scene understanding challenge: room layout estimation. In: CVPR Workshop"},{"key":"192_CR21","doi-asserted-by":"crossref","unstructured":"Xiao J, Hays J, Ehinger KA, Oliva A, Torralba A (2010) SUN database: large-scale scene recognition from abbey to zoo. In: The twenty-third IEEE conference on computer vision and pattern recognition, CVPR 2010, pp 3485\u20133492","DOI":"10.1109\/CVPR.2010.5539970"},{"key":"192_CR22","unstructured":"Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: Bengio Y, LeCun Y (eds) Proceedings of 3rd international conference on learning representations, ICLR 2015"},{"key":"192_CR23","doi-asserted-by":"crossref","unstructured":"Xie S, Girshick RB, Doll\u00e1r P, Tu Z, He K (2017) Aggregated residual transformations for deep neural networks. In: 2017 IEEE conference on computer vision and pattern recognition, CVPR 2017, pp 5987\u20135995","DOI":"10.1109\/CVPR.2017.634"},{"key":"192_CR24","unstructured":"Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, K\u00f6pf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) Pytorch: an imperative style, high-performance deep learning library. In: Advances in neural information processing systems 32: annual conference on neural information processing systems 2019, NeurIPS 2019, pp 8024\u20138035"},{"key":"192_CR25","doi-asserted-by":"crossref","unstructured":"Zhao H, Lu M, Yao A, Guo Y, Chen Y, Zhang L (2017) Physics inspired optimization on semantic transfer features: an alternative method for room layout estimation. 
In: 2017 IEEE conference on computer vision and pattern recognition, CVPR 2017, pp 870\u2013878","DOI":"10.1109\/CVPR.2017.99"}],"container-title":["Data Science and Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-022-00192-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s41019-022-00192-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-022-00192-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,9,8]],"date-time":"2022-09-08T17:17:09Z","timestamp":1662657429000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s41019-022-00192-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,12]]},"references-count":25,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,9]]}},"alternative-id":["192"],"URL":"https:\/\/doi.org\/10.1007\/s41019-022-00192-6","relation":{},"ISSN":["2364-1185","2364-1541"],"issn-type":[{"value":"2364-1185","type":"print"},{"value":"2364-1541","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,12]]},"assertion":[{"value":"2 March 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 July 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 August 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 August 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article 
History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"There is no conflict of interest in this manuscript.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval"}},{"value":"All authors agreed to participate","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}},{"value":"Not applicable.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}]}}