{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T16:30:52Z","timestamp":1753893052729,"version":"3.41.2"},"reference-count":42,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2023,5,2]],"date-time":"2023-05-02T00:00:00Z","timestamp":1682985600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Robot. AI"],"abstract":"<jats:p>We present Affordance Recognition with One-Shot Human Stances (AROS), a one-shot learning approach that uses an explicit representation of interactions between highly articulated human poses and 3D scenes. The approach is one-shot since it does not require iterative training or retraining to add new affordance instances. Furthermore, only one or a small handful of examples of the target pose are needed to describe the interactions. Given a 3D mesh of a previously unseen scene, we can predict affordance locations that support the interactions and generate corresponding articulated 3D human bodies around them. We evaluate the performance of our approach on three public datasets of scanned real environments with varied degrees of noise. 
Through rigorous statistical analysis of crowdsourced evaluations, our results show that our one-shot approach is preferred up to 80% of the time over data-intensive baselines.<\/jats:p>","DOI":"10.3389\/frobt.2023.1076780","type":"journal-article","created":{"date-parts":[[2023,5,2]],"date-time":"2023-05-02T04:44:42Z","timestamp":1683002682000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["AROS: Affordance Recognition with One-Shot Human Stances"],"prefix":"10.3389","volume":"10","author":[{"given":"Abel","family":"Pacheco-Ortega","sequence":"first","affiliation":[]},{"given":"Walterio","family":"Mayol-Cuevas","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2023,5,2]]},"reference":[{"key":"B1","first-page":"39","volume-title":"An introduction to categorical data analysis","author":"Agresti","year":"2018"},{"key":"B2","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1080\/00220973.1995.9943797","article-title":"Multiple regression approach to analyzing contingency tables: Post hoc and planned comparison procedures","volume":"64","author":"Beasley","year":"1995","journal-title":"J. Exp. 
Educ."},{"article-title":"YOLOv4: Optimal speed and accuracy of object detection","year":"2020","author":"Bochkovskiy","key":"B3"},{"key":"B4","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1007\/978-3-030-58452-8_13","article-title":"End-to-End object detection with transformers","volume-title":"Computer Vision \u2013 ECCV 2020","author":"Carion","year":"2020"},{"key":"B5","first-page":"667","article-title":"Matterport3D: Learning from RGB-D data in indoor environments","volume-title":"International conference on 3D vision (3DV)","author":"Chang","year":"2017"},{"key":"B6","first-page":"260","article-title":"The two-dimensional case","volume-title":"Mathematical methods of statistics","author":"Cramer","year":"1946"},{"key":"B7","first-page":"11589","article-title":"SpineNet: Learning scale-permuted backbone for recognition and localization","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR)","author":"Du","year":"2020"},{"key":"B8","doi-asserted-by":"publisher","DOI":"10.1002\/eji.201445290","article-title":"In Defense of the direct perception of affordances","author":"Fouhey","year":"2015"},{"key":"B9","doi-asserted-by":"publisher","first-page":"448","DOI":"10.1177\/1098214011426594","article-title":"The chi-square test: Often used and more often misinterpreted","volume":"33","author":"Franke","year":"2012","journal-title":"Am. J. Eval."},{"key":"B10","article-title":"The theory of affordances","volume-title":"Perceiving, acting and knowing. 
Toward an ecological psychology","author":"Gibson","year":"1977"},{"key":"B11","first-page":"1529","article-title":"What makes a chair a chair?","volume-title":"2011 IEEE conference on computer vision and pattern recognition (CVPR)","author":"Grabner","year":"2011"},{"key":"B12","doi-asserted-by":"crossref","first-page":"1961","DOI":"10.1109\/CVPR.2011.5995448","article-title":"From 3D scene geometry to human workspace","volume-title":"CVPR 2011","author":"Gupta","year":"2011"},{"key":"B13","first-page":"2282","article-title":"Resolving 3D human pose ambiguities with 3D scene constraints","volume-title":"Proceedings of the IEEE\/CVF international conference on computer vision","author":"Hassan","year":"2019"},{"key":"B14","first-page":"14708","article-title":"Populating 3D scenes by learning human-scene interaction","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"Hassan","year":"2021"},{"key":"B15","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2897824.2925870","article-title":"Learning how objects function via co-analysis of interactions","volume":"35","author":"Hu","year":"2016","journal-title":"ACM Trans. Graph."},{"key":"B16","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2766914","article-title":"Interaction context (ICON): Towards a geometric functionality descriptor","volume":"34","author":"Hu","year":"2015","journal-title":"ACM Trans. Graph."},{"key":"B17","doi-asserted-by":"publisher","first-page":"2040","DOI":"10.1109\/TPAMI.2015.2501811","article-title":"Modeling 3d environments through hidden human context","volume":"38","author":"Jiang","year":"2016","journal-title":"IEEE Trans. Pattern Analysis Mach. 
Intell."},{"key":"B18","first-page":"12368","article-title":"Putting humans in a scene: Learning affordance in 3d indoor environments","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"Li","year":"2019"},{"key":"B19","first-page":"769","article-title":"Learning to segment affordances","volume-title":"The IEEE international conference on computer vision (ICCV) workshops","author":"Luddecke","year":"2017"},{"key":"B20","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1109\/3DV53792.2021.00022","article-title":"Mix3D: Out-of-Context data augmentation for 3D scenes","volume-title":"2021 international conference on 3D vision (3DV)","author":"Nekrasov","year":"2021"},{"key":"B21","first-page":"10967","article-title":"Expressive body capture: 3d hands, face, and body from a single image","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"Pavlakos","year":"2019"},{"key":"B22","doi-asserted-by":"publisher","first-page":"202","DOI":"10.1006\/gmod.1999.0521","article-title":"Geometric properties of bisector surfaces","volume":"62","author":"Peternell","year":"2000","journal-title":"Graph. 
Models"},{"key":"B23","doi-asserted-by":"crossref","first-page":"2035","DOI":"10.1109\/ROBIO.2015.7419073","article-title":"Affordance-map: Mapping human context in 3D scenes using cost-sensitive SVM and virtual human models","volume-title":"2015 IEEE international conference on Robotics and biomimetics (ROBIO)","author":"Piyathilaka","year":"2015"},{"key":"B24","doi-asserted-by":"crossref","first-page":"580","DOI":"10.1109\/CVPR.2016.69","article-title":"Learning action maps of large environments via first-person vision","volume-title":"2016 IEEE conference on computer vision and pattern recognition (CVPR)","author":"Rhinehart","year":"2016"},{"key":"B25","first-page":"186","article-title":"A multi-scale CNN for affordance segmentation in RGB images","volume-title":"European conference on computer vision","author":"Roy","year":"2016"},{"key":"B26","doi-asserted-by":"publisher","first-page":"45","DOI":"10.3389\/fnbot.2020.00045","article-title":"Geometric affordance perception: Leveraging deep 3D saliency with the interaction tensor","volume":"14","author":"Ruiz","year":"2020","journal-title":"Front. Neurorobotics"},{"key":"B27","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2661229.2661230","article-title":"SceneGrok: Inferring action maps in 3D environments","volume":"33","author":"Savva","year":"2014","journal-title":"ACM Trans. Graph. (TOG)"},{"key":"B28","doi-asserted-by":"publisher","first-page":"591","DOI":"10.2307\/2333709","article-title":"An analysis of variance test for normality (complete samples)","volume":"52","author":"Shapiro","year":"1965","journal-title":"Biometrika"},{"key":"B29","doi-asserted-by":"publisher","first-page":"626","DOI":"10.2307\/2283989","article-title":"Rectangular confidence regions for the means of multivariate normal distributions","volume":"62","author":"\u0160id\u00e1k","year":"1967","journal-title":"J. Am. Stat. 
Assoc."},{"key":"B30","first-page":"746","article-title":"Indoor segmentation and support inference from RGBD images","volume-title":"European conference on computer vision","author":"Silberman","year":"2012"},{"article-title":"The Replica dataset: A digital Replica of indoor spaces","year":"2019","author":"Straub","key":"B31"},{"key":"B32","first-page":"2596","article-title":"Binge watching: Scaling affordance learning from sitcoms","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"Wang","year":"2017"},{"key":"B33","doi-asserted-by":"crossref","first-page":"7240","DOI":"10.1109\/ICRA40945.2020.9197384","article-title":"Is that a chair? Imagining affordances using simulations of an articulated human body","volume-title":"2020 IEEE international conference on Robotics and automation (ICRA)","author":"Wu","year":"2020"},{"key":"B34","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1111\/cgf.12538","article-title":"Sample elimination for generating Poisson disk sample sets","volume":"34","author":"Yuksel","year":"2015","journal-title":"Comput. Graph. Forum"},{"key":"B35","doi-asserted-by":"crossref","DOI":"10.1109\/CVPRW56347.2022.00309","article-title":"ResNeSt: Split-Attention networks","author":"Zhang","year":""},{"key":"B36","first-page":"642","article-title":"Place: Proximity learning of articulation and contact in 3D environments","volume-title":"8th international conference on 3D Vision (3DV 2020)","author":"Zhang","year":""},{"key":"B37","first-page":"6193","article-title":"Generating 3D people in scenes without people","volume-title":"The IEEE\/CVF conference on computer vision and pattern recognition (CVPR)","author":"Zhang","year":""},{"key":"B38","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1111\/cgf.13112","article-title":"Character-object interaction retrieval using the interaction bisector surface","volume":"36","author":"Zhao","year":"2017","journal-title":"Eurogr. 
Symposium Geometry Process."},{"key":"B39","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2980179.2982410","article-title":"Relationship templates for creating scene variations","volume":"35","author":"Zhao","year":"2016","journal-title":"ACM Trans. Graph."},{"key":"B40","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2574860","article-title":"Indexing 3D scenes using the interaction bisector surface","volume":"33","author":"Zhao","year":"2014","journal-title":"ACM Trans. Graph."},{"key":"B41","first-page":"5122","article-title":"Scene parsing through ADE20K dataset","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"Zhou","year":"2017"},{"key":"B42","first-page":"3823","article-title":"Inferring forces and learning human utilities from videos","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"Zhu","year":"2016"}],"container-title":["Frontiers in Robotics and AI"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frobt.2023.1076780\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,2]],"date-time":"2023-05-02T04:44:59Z","timestamp":1683002699000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frobt.2023.1076780\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,2]]},"references-count":42,"alternative-id":["10.3389\/frobt.2023.1076780"],"URL":"https:\/\/doi.org\/10.3389\/frobt.2023.1076780","relation":{},"ISSN":["2296-9144"],"issn-type":[{"type":"electronic","value":"2296-9144"}],"subject":[],"published":{"date-parts":[[2023,5,2]]},"article-number":"1076780"}}