{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,13]],"date-time":"2026-05-13T04:09:36Z","timestamp":1778645376851,"version":"3.51.4"},"reference-count":73,"publisher":"SAGE Publications","issue":"8","license":[{"start":{"date-parts":[[2013,7,1]],"date-time":"2013-07-01T00:00:00Z","timestamp":1372636800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of Robotics Research"],"published-print":{"date-parts":[[2013,7]]},"abstract":"<jats:p>Understanding human activities and object affordances are two very important skills, especially for personal robots which operate in human environments. In this work, we consider the problem of extracting a descriptive labeling of the sequence of sub-activities being performed by a human, and more importantly, of their interactions with the objects in the form of associated affordances. Given a RGB-D video, we jointly model the human activities and object affordances as a Markov random field where the nodes represent objects and sub-activities, and the edges represent the relationships between object affordances, their relations with sub-activities, and their evolution over time. We formulate the learning problem using a structural support vector machine (SSVM) approach, where labelings over various alternate temporal segmentations are considered as latent variables. We tested our method on a challenging dataset comprising 120 activity videos collected from 4 subjects, and obtained an accuracy of 79.4% for affordance, 63.4% for sub-activity and 75.0% for high-level activity labeling. 
We then demonstrate the use of such descriptive labeling in performing assistive tasks by a PR2 robot.<\/jats:p>","DOI":"10.1177\/0278364913478446","type":"journal-article","created":{"date-parts":[[2013,7,11]],"date-time":"2013-07-11T06:08:48Z","timestamp":1373522928000},"page":"951-970","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":535,"title":["Learning human activities and object affordances from RGB-D videos"],"prefix":"10.1177","volume":"32","author":[{"given":"Hema Swetha","family":"Koppula","sequence":"first","affiliation":[{"name":"Department of Computer Science, Cornell University, USA"}]},{"given":"Rudhir","family":"Gupta","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Cornell University, USA"}]},{"given":"Ashutosh","family":"Saxena","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Cornell University, USA"}]}],"member":"179","published-online":{"date-parts":[[2013,7,11]]},"reference":[{"key":"bibr1-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1145\/1922649.1922653"},{"key":"bibr2-0278364913478446","author":"Aksoy E","year":"2010","journal-title":"Proceedings of ICRA"},{"key":"bibr3-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1177\/0278364911410459"},{"key":"bibr4-0278364913478446","author":"Aldoma A","year":"2012","journal-title":"Proceedings of ICRA"},{"key":"bibr5-0278364913478446","author":"Anand A","year":"2012","journal-title":"The International Journal of Robotics Research"},{"key":"bibr6-0278364913478446","author":"Bollini M","year":"2012","journal-title":"Proceedings of ISER"},{"key":"bibr7-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1177\/0278364912437213"},{"key":"bibr8-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1177\/0278364911401765"},{"key":"bibr9-0278364913478446","author":"Dalal 
N","year":"2005","journal-title":"Proceedings of CVPR"},{"key":"bibr10-0278364913478446","unstructured":"Diankov R (2010) Automated Construction of Robotic Manipulation Programs. PhD thesis, Carnegie Mellon University, Robotics Institute. http:\/\/www.programmingvision.com\/rosen_diankov_thesis.pdf."},{"key":"bibr11-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1023\/B:VISI.0000022288.19776.77"},{"key":"bibr12-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390195"},{"key":"bibr13-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2010.2102756"},{"key":"bibr14-0278364913478446","author":"Gall J","year":"2011","journal-title":"Proceedings of CVPR"},{"key":"bibr15-0278364913478446","volume-title":"The Ecological Approach to Visual Perception","author":"Gibson J","year":"1979"},{"key":"bibr16-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2009.83"},{"key":"bibr17-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1007\/BF02612354"},{"key":"bibr18-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1177\/0278364911434148"},{"key":"bibr19-0278364913478446","volume-title":"ICRA: Workshop on Semantic Perception, Mapping, and Exploration","author":"Hermans T","year":"2011"},{"key":"bibr20-0278364913478446","volume-title":"Proceedings of International Conference on Artificial Intelligence and Statistics","author":"Hoai M","year":"2012"},{"key":"bibr21-0278364913478446","author":"Hoai M","year":"2011","journal-title":"Proceedings of CVPR"},{"key":"bibr22-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2011.2129870"},{"key":"bibr23-0278364913478446","author":"Jiang Y","year":"2012","journal-title":"Proceedings of ICML"},{"key":"bibr24-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1177\/0278364912438781"},{"key":"bibr25-0278364913478446","author":"Jiang Y","year":"2011","journal-title":"Proceedings of 
ICRA"},{"key":"bibr26-0278364913478446","author":"Jiang Y","year":"2012","journal-title":"Proceedings of ISER"},{"key":"bibr27-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-009-5108-8"},{"key":"bibr28-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2010.08.002"},{"key":"bibr29-0278364913478446","volume-title":"Probabilistic Graphical Models: Principles and Techniques","author":"Koller D","year":"2009"},{"key":"bibr30-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1177\/0278364911428653"},{"key":"bibr31-0278364913478446","author":"Koppula H","year":"2011","journal-title":"Proceedings of NIPS"},{"key":"bibr32-0278364913478446","author":"Koppula HS","year":"2012","journal-title":"CoRR abs\/1208.0967"},{"key":"bibr33-0278364913478446","author":"Kormushev P","year":"2010","journal-title":"Proceedings of IROS"},{"key":"bibr34-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1177\/0278364911403178"},{"key":"bibr35-0278364913478446","author":"Lai K","year":"2011","journal-title":"Proceedings of ICRA"},{"key":"bibr36-0278364913478446","author":"Lai K","year":"2011","journal-title":"Proceedings of ICRA"},{"key":"bibr37-0278364913478446","author":"Laptev I","year":"2008","journal-title":"Proceedings of CVPR"},{"key":"bibr38-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2011.232"},{"key":"bibr39-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2010.5543273"},{"key":"bibr40-0278364913478446","author":"Liu J","year":"2009","journal-title":"Proceedings of CVPR"},{"key":"bibr41-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1109\/TSMCB.2005.846654"},{"key":"bibr42-0278364913478446","author":"Matikainen P","year":"2012","journal-title":"Proceedings of CVPR"},{"key":"bibr43-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1177\/0278364911430417"},{"key":"bibr44-0278364913478446","author":"Moldovan B","year":"2012","journal-title":"Latest Advances in 
Inductive Logic Programming"},{"key":"bibr45-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2007.914848"},{"key":"bibr46-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2011.6130379"},{"key":"bibr47-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1007\/s12369-009-0043-1"},{"key":"bibr48-0278364913478446","first-page":"791","author":"Pandey A","year":"2012","journal-title":"IEEE RO-MAN"},{"key":"bibr49-0278364913478446","author":"Pandey AK","year":"2010","journal-title":"Proceedings of IROS"},{"key":"bibr50-0278364913478446","author":"Pele O","year":"2008","journal-title":"Proceedings of ECCV"},{"key":"bibr51-0278364913478446","author":"Pirsiavash H","year":"2012","journal-title":"Proceedings of CVPR"},{"key":"bibr52-0278364913478446","volume-title":"Proceedings of the Fourteenth Computer Vision Winter Workshop (CVWW)","author":"Ridge B","year":"2009"},{"key":"bibr53-0278364913478446","author":"Rohrbach M","year":"2012","journal-title":"Proceedings of CVPR"},{"key":"bibr54-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1177\/0278364911408155"},{"key":"bibr55-0278364913478446","author":"Rother C","year":"2007","journal-title":"Proceedings of CVPR"},{"key":"bibr56-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2010.5651280"},{"key":"bibr57-0278364913478446","author":"Rusu RB","year":"2009","journal-title":"Proceedings of IROS"},{"key":"bibr58-0278364913478446","author":"Sadanand S","year":"2012","journal-title":"Proceedings of CVPR"},{"key":"bibr59-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1177\/0278364907087172"},{"key":"bibr60-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2008.132"},{"key":"bibr61-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-010-0384-0"},{"key":"bibr62-0278364913478446","author":"Shotton J","year":"2011","journal-title":"Proceedings of 
CVPR"},{"key":"bibr63-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1177\/0278364909356602"},{"key":"bibr64-0278364913478446","author":"Sung J","year":"2012","journal-title":"Proceedings of ICRA"},{"key":"bibr65-0278364913478446","volume-title":"AAAI workshop on Pattern, Activity and Intent Recognition (PAIR)","author":"Sung JY","year":"2011"},{"key":"bibr66-0278364913478446","author":"Tang K","year":"2012","journal-title":"Proceedings of CVPR"},{"key":"bibr67-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015444"},{"key":"bibr68-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015341"},{"key":"bibr69-0278364913478446","author":"Yang W","year":"2010","journal-title":"Proceedings of CVPR"},{"key":"bibr70-0278364913478446","author":"Yao B","year":"2010","journal-title":"Proceedings of CVPR"},{"key":"bibr71-0278364913478446","author":"Yao B","year":"2011","journal-title":"Proceedings of ICCV"},{"key":"bibr72-0278364913478446","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553523"},{"key":"bibr73-0278364913478446","author":"Zhang H","year":"2011","journal-title":"Proceedings of IROS"}],"container-title":["The International Journal of Robotics 
Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0278364913478446","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0278364913478446","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T10:18:10Z","timestamp":1777457890000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0278364913478446"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,7]]},"references-count":73,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2013,7]]}},"alternative-id":["10.1177\/0278364913478446"],"URL":"https:\/\/doi.org\/10.1177\/0278364913478446","relation":{},"ISSN":["0278-3649","1741-3176"],"issn-type":[{"value":"0278-3649","type":"print"},{"value":"1741-3176","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,7]]}}}