{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T09:58:57Z","timestamp":1777715937979,"version":"3.51.4"},"reference-count":86,"publisher":"SAGE Publications","issue":"6","license":[{"start":{"date-parts":[[2019,5,8]],"date-time":"2019-05-08T00:00:00Z","timestamp":1557273600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of Robotics Research"],"published-print":{"date-parts":[[2022,5]]},"abstract":"<jats:p>This paper focuses on vision-based pose estimation for multiple rigid objects placed in clutter, especially in cases involving occlusions and objects resting on each other. Progress has been achieved recently in object recognition given advancements in deep learning. Nevertheless, such tools typically require a large amount of training data and significant manual effort to label objects. This limits their applicability in robotics, where solutions must scale to a large number of objects and variety of conditions. Moreover, the combinatorial nature of the scenes that could arise from the placement of multiple objects is difficult to capture in the training dataset. Thus, the learned models might not produce the desired level of precision required for tasks, such as robotic manipulation. This work proposes an autonomous process for pose estimation that spans from data generation to scene-level reasoning and self-learning. In particular, the proposed framework first generates a labeled dataset for training a convolutional neural network (CNN) for object detection in clutter. 
These detections are used to guide a scene-level optimization process, which considers the interactions between the different objects present in the clutter to output pose estimates of high precision. Furthermore, confident estimates are used to label online real images from multiple views and re-train the process in a self-learning pipeline. Experimental results indicate that this process is quickly able to identify in cluttered scenes physically consistent object poses that are more precise than those found by reasoning over individual instances of objects. Furthermore, the quality of pose estimates increases over time given the self-learning process.<\/jats:p>","DOI":"10.1177\/0278364919846551","type":"journal-article","created":{"date-parts":[[2019,5,8]],"date-time":"2019-05-08T03:11:26Z","timestamp":1557285086000},"page":"615-636","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":16,"title":["Physics-based scene-level reasoning for object pose estimation in clutter"],"prefix":"10.1177","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8547-2634","authenticated-orcid":false,"given":"Chaitanya","family":"Mitash","sequence":"first","affiliation":[{"name":"Computer Science Department, Rutgers University, Piscataway, NJ, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Abdeslam","family":"Boularias","sequence":"additional","affiliation":[{"name":"Computer Science Department, Rutgers University, Piscataway, NJ, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0675-3324","authenticated-orcid":false,"given":"Kostas","family":"Bekris","sequence":"additional","affiliation":[{"name":"Computer Science Department, Rutgers University, Piscataway, NJ, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2019,5,8]]},"reference":[{"key":"bibr1-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1145\/1360612.1360684"},{"key":"bibr2-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-49409-8_51"},{"key":"bibr3-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/MRA.2012.2206675"},{"key":"bibr4-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33712-3_37"},{"key":"bibr5-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2013.6630859"},{"key":"bibr6-0278364919846551","first-page":"1027","volume-title":"Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms","author":"Arthur D","year":"2007"},{"key":"bibr7-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1016\/0031-3203(81)90009-1"},{"key":"bibr8-0278364919846551","author":"Besl PJ","year":"1992","journal-title":"Method for Registration of 3D Shapes"},{"key":"bibr9-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/3DV.2015.65"},{"key":"bibr10-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1177\/0278364913514283"},{"issue":"5","key":"bibr11-0278364919846551","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/cgf.12167","volume":"32","author":"Bouaziz S","year":"2013","journal-title":"Computer Graphics Forum"},{"key":"bibr12-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10605-2_35"},{"key":"bibr13-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.366"},{"key":"bibr14-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.443"},{"key":"bibr15-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1186\/s40064-016-1906-1"},{"key":"bibr16-0278364919846551","first-page":"2441","volume-title":"IEEE International Conference on Robotics and Automation (ICRA)","author":"Cao 
Z","year":"2016"},{"key":"bibr17-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2013.15"},{"key":"bibr18-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2002.1047997"},{"key":"bibr19-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2012.6386067"},{"key":"bibr20-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1177\/0278364911401765"},{"key":"bibr21-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/TASE.2016.2600527"},{"key":"bibr22-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1145\/1014052.1014118"},{"key":"bibr23-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/3DIMPVT.2012.53"},{"key":"bibr24-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2010.5540108"},{"key":"bibr25-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/MRA.2006.1638022"},{"key":"bibr26-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46478-7_10"},{"key":"bibr27-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1145\/358669.358692"},{"key":"bibr28-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1145\/358669.358692"},{"key":"bibr29-0278364919846551","volume-title":"Proceedings of the Third Eurographics Symposium on Geometry Processing","author":"Gelfand N","year":"2005"},{"key":"bibr30-0278364919846551","first-page":"613","volume-title":"Robot World Cup","author":"Hernandez 
C","year":"2016"},{"key":"bibr31-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33885-4_60"},{"key":"bibr32-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46487-9_51"},{"key":"bibr33-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01249-6_2"},{"key":"bibr34-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2015.7354005"},{"key":"bibr35-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1145\/237218.237240"},{"key":"bibr36-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1145\/2047196.2047270"},{"key":"bibr37-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/34.765655"},{"key":"bibr38-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.169"},{"key":"bibr39-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46487-9_13"},{"key":"bibr40-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2011.6094527"},{"key":"bibr41-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-34327-8_15"},{"key":"bibr42-0278364919846551","first-page":"282","volume":"6","author":"Kocsis L","year":"2006","journal-title":"ECML"},{"key":"bibr43-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.115"},{"key":"bibr44-0278364919846551","volume-title":"International Conference on Simulation, Modeling and Programming for Autonomous Robots (SIMPAR)","author":"Littlefield 
Z","year":"2015"},{"key":"bibr45-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"bibr46-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.1999.790410"},{"key":"bibr47-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/3DV.2018.00015"},{"key":"bibr48-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.292"},{"key":"bibr49-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.12446"},{"key":"bibr50-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.20"},{"key":"bibr51-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2017.8202206"},{"key":"bibr52-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8461163"},{"key":"bibr53-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1145\/1057432.1057435"},{"key":"bibr54-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-49409-8_18"},{"key":"bibr55-0278364919846551","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2016.XII.023"},{"key":"bibr56-0278364919846551","first-page":"135","volume-title":"Asian Conference on Computer Vision","author":"Papazov C","year":"2010"},{"key":"bibr57-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2017.7989233"},{"key":"bibr58-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.151"},{"key":"bibr59-0278364919846551","author":"Pillai S","year":"2015","journal-title":"Robotics: Science and Systems"},{"key":"bibr60-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.91"},{"key":"bibr61-0278364919846551","first-page":"91","author":"Ren S","year":"2015","journal-title":"Advances in Neural Information Processing 
Systems"},{"key":"bibr62-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2016.2532924"},{"key":"bibr63-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-005-3674-1"},{"key":"bibr64-0278364919846551","first-page":"145","author":"Rusinkiewicz S","year":"2001","journal-title":"IEEE Proceedings of 3DIM"},{"key":"bibr65-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.2009.5152473"},{"key":"bibr66-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.178"},{"key":"bibr67-0278364919846551","first-page":"4","volume":"2","author":"Segal A","year":"2009","journal-title":"Robotics: Science and Systems"},{"key":"bibr68-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.241"},{"key":"bibr69-0278364919846551","volume-title":"International Conference on Learning Representations (ICLR)","author":"Simonyan K","year":"2015"},{"key":"bibr70-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2014.6906903"},{"key":"bibr71-0278364919846551","volume-title":"International Symposium on Robotics Research (ISRR)","author":"Srivatsan RA","year":"2017"},{"key":"bibr72-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8462971"},{"key":"bibr73-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.308"},{"key":"bibr74-0278364919846551","doi-asserted-by":"publisher","DOI":"10.5244\/C.28.82"},{"key":"bibr75-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10599-4_30"},{"key":"bibr76-0278364919846551","volume-title":"Probabilistic Robotics","author":"Thrun 
S","year":"2005"},{"key":"bibr77-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2017.8202133"},{"key":"bibr78-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/PSIVT.2010.65"},{"key":"bibr79-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15558-1_26"},{"key":"bibr80-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAR.2018.8384709"},{"key":"bibr81-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298930"},{"key":"bibr82-0278364919846551","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2018.XIV.019"},{"key":"bibr83-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2017.7989165"},{"key":"bibr84-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1145\/1236246.1236270"},{"key":"bibr85-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46475-6_47"},{"key":"bibr86-0278364919846551","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.244"}],"container-title":["The International Journal of Robotics 
Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0278364919846551","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/0278364919846551","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0278364919846551","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T10:16:55Z","timestamp":1777457815000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0278364919846551"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,5,8]]},"references-count":86,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2022,5]]}},"alternative-id":["10.1177\/0278364919846551"],"URL":"https:\/\/doi.org\/10.1177\/0278364919846551","relation":{},"ISSN":["0278-3649","1741-3176"],"issn-type":[{"value":"0278-3649","type":"print"},{"value":"1741-3176","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,5,8]]}}}