{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,26]],"date-time":"2026-04-26T00:47:48Z","timestamp":1777164468268,"version":"3.51.4"},"reference-count":37,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2021,11,12]],"date-time":"2021-11-12T00:00:00Z","timestamp":1636675200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61702073"],"award-info":[{"award-number":["61702073"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"crossref","award":["2019M661079"],"award-info":[{"award-number":["2019M661079"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61772102"],"award-info":[{"award-number":["61772102"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Liaoning Collaborative Fund","award":["2020-HYLH-17"],"award-info":[{"award-number":["2020-HYLH-17"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2021,11,30]]},"abstract":"<jats:p>In this article, we propose a framework for crowd behavior prediction in complicated scenarios. The fundamental framework is designed using the standard encoder-decoder scheme, which is built upon the long short-term memory module to capture the temporal evolution of crowd behaviors. To model interactions among humans and environments, we embed both the social and the physical attention mechanisms into the long short-term memory. The social attention component can model the interactions among different pedestrians, whereas the physical attention component helps to understand the spatial configurations of the scene. Since pedestrians\u2019 behaviors demonstrate multi-modal properties, we use the generative model to produce multiple acceptable future paths. The proposed framework not only predicts an individual\u2019s trajectory accurately but also forecasts the ongoing group behaviors by leveraging on the coherent filtering approach. Experiments are carried out on the standard crowd benchmarks (namely, the ETH, the UCY, the CUHK crowd, and the CrowdFlow datasets), which demonstrate that the proposed framework is effective in forecasting crowd behaviors in complex scenarios.<\/jats:p>","DOI":"10.1145\/3449359","type":"journal-article","created":{"date-parts":[[2021,11,12]],"date-time":"2021-11-12T21:16:06Z","timestamp":1636751766000},"page":"1-19","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["Where Are They Going? Predicting Human Behaviors in Crowded Scenes"],"prefix":"10.1145","volume":"17","author":[{"given":"Bo","family":"Zhang","sequence":"first","affiliation":[{"name":"Dalian Maritime University, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rui","family":"Zhang","sequence":"additional","affiliation":[{"name":"Dalian Maritime University, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Niccolo","family":"Bisagno","sequence":"additional","affiliation":[{"name":"DISI, University of Trento, Trento, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nicola","family":"Conci","sequence":"additional","affiliation":[{"name":"DISI, University of Trento, Trento, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Francesco G. B.","family":"De Natale","sequence":"additional","affiliation":[{"name":"DISI, University of Trento, Trento, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hongbo","family":"Liu","sequence":"additional","affiliation":[{"name":"Dalian Maritime University, Dalian, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,11,12]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.110"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.283"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2007.382977"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-12304-7_27"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.365"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_42"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2018.8545447"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2018.09.002"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2011.176"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3052930"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00240"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2012.6239348"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.51.4282"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33765-9_15"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2015.11.021"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2016.2580401"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.233"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2014.2358029"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.111"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.5555\/1623264.1623280"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2011.64"},{"key":"e_1_3_1_23_2","unstructured":"Alexandre Robicquet Alexandre Alahi Amir Sadeghian Bryan Anenberg John Doherty Eli Wu and Silvio Savarese. 2016. Forecasting social navigation in crowded complex scenes. arXiv:1601.00998."},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46484-8_33"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1177\/0278364920917446"},{"key":"e_1_3_1_26_2","doi-asserted-by":"crossref","unstructured":"Amir Sadeghian Vineet Kosaraju Ali Sadeghian Noriaki Hirose and Silvio Savarese. 2018. Sophie: An attentive GAN for predicting paths compliant to social and physical constraints. arXiv:1806.01482.","DOI":"10.1109\/CVPR.2019.00144"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/AVSS.2018.8639113"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2016.2539878"},{"key":"e_1_3_1_29_2","first-page":"593","volume-title":"Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition","author":"Shi Jianbo","year":"1994","unstructured":"Jianbo Shi and Carlo Tomasi. 1994. Good features to track. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition. IEEE, Los Alamitos, CA, 593\u2013600."},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2012.123"},{"key":"e_1_3_1_31_2","first-page":"1928","volume-title":"Proceedings of the IEEE International Conference on Robotics and Automation","author":"Berg Jur Van den","year":"2008","unstructured":"Jur Van den Berg, Ming Lin, and Dinesh Manocha. 2008. Reciprocal velocity obstacles for real-time multi-agent navigation. In Proceedings of the IEEE International Conference on Robotics and Automation. IEEE, Los Alamitos, CA, 1928\u20131935."},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8460504"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/2856400.2856410"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298971"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_16"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.5555\/2772879.2773256"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.5555\/2964398.2964462"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-014-0735-3"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3449359","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3449359","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:01:56Z","timestamp":1750197716000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3449359"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,12]]},"references-count":37,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2021,11,30]]}},"alternative-id":["10.1145\/3449359"],"URL":"https:\/\/doi.org\/10.1145\/3449359","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,12]]},"assertion":[{"value":"2020-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-11-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}