{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T15:03:57Z","timestamp":1775228637006,"version":"3.50.1"},"reference-count":48,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2023,8,3]],"date-time":"2023-08-03T00:00:00Z","timestamp":1691020800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Federal Ministry for Economic Affairs and Climate Action (BMWK)"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>With the ongoing development of automated driving systems, the crucial task of predicting pedestrian behavior is attracting growing attention. The prediction of future pedestrian trajectories from the ego-vehicle camera perspective is particularly challenging due to the dynamically changing scene. Therefore, we present Behavior-Aware Pedestrian Trajectory Prediction (BA-PTP), a novel approach to pedestrian trajectory prediction for ego-centric camera views. It incorporates behavioral features extracted from real-world traffic scene observations such as the body and head orientation of pedestrians, as well as their pose, in addition to positional information from body and head bounding boxes. For each input modality, we employed independent encoding streams that are combined through a modality attention mechanism. To account for the ego-motion of the camera in an ego-centric view, we introduced Spatio-Temporal Ego-Motion Module (STEMM), a novel approach to ego-motion prediction. Compared to the related works, it utilizes spatial goal points of the ego-vehicle that are sampled from its intended route. We experimentally validated the effectiveness of our approach using two datasets for pedestrian behavior prediction in urban traffic scenes. Based on ablation studies, we show the advantages of incorporating different behavioral features for pedestrian trajectory prediction in the image plane. Moreover, we demonstrate the benefit of integrating STEMM into our pedestrian trajectory prediction method, BA-PTP. BA-PTP achieves state-of-the-art performance on the PIE dataset, outperforming prior work by 7% in MSE-1.5 s and CMSE as well as 9% in CFMSE.<\/jats:p>","DOI":"10.3390\/make5030050","type":"journal-article","created":{"date-parts":[[2023,8,3]],"date-time":"2023-08-03T11:35:21Z","timestamp":1691062521000},"page":"957-978","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Behavior-Aware Pedestrian Trajectory Prediction in Ego-Centric Camera Views with Spatio-Temporal Ego-Motion Estimation"],"prefix":"10.3390","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9598-1125","authenticated-orcid":false,"given":"Phillip","family":"Czech","sequence":"first","affiliation":[{"name":"Perception & Maps Department, Mercedes-Benz AG, 71063 Sindelfingen, Germany"},{"name":"Institute of Signal Processing and System Theory, University of Stuttgart, 70550 Stuttgart, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1439-850X","authenticated-orcid":false,"given":"Markus","family":"Braun","sequence":"additional","affiliation":[{"name":"Perception & Maps Department, Mercedes-Benz AG, 71063 Sindelfingen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ulrich","family":"Kre\u00dfel","sequence":"additional","affiliation":[{"name":"Perception & Maps Department, Mercedes-Benz AG, 71063 Sindelfingen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8322-117X","authenticated-orcid":false,"given":"Bin","family":"Yang","sequence":"additional","affiliation":[{"name":"Institute of Signal Processing and System Theory, University of Stuttgart, 70550 Stuttgart, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,8,3]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Zhang, C., and Berger, C. (2023). Pedestrian Behavior Prediction Using Deep Learning Methods for Urban Scenarios: A Review. IEEE Trans. Intell. Transp. Syst., early access.","DOI":"10.1109\/TITS.2023.3281393"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Rasouli, A., Kotseruba, I., Kunic, T., and Tsotsos, J.K. (November, January 27). PIE: A Large-Scale Dataset and Models for Pedestrian Intention Estimation and Trajectory Prediction. Proceedings of the IEEE\/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00636"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Bhattacharyya, A., Fritz, M., and Schiele, B. (2018, January 18\u201322). Long-term on-board prediction of people in traffic scenes under uncertainty. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00441"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"2716","DOI":"10.1109\/LRA.2022.3145090","article-title":"Stepwise goal-driven networks for trajectory prediction","volume":"7","author":"Wang","year":"2022","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1463","DOI":"10.1109\/LRA.2021.3056339","article-title":"Bitrap: Bi-directional pedestrian trajectory prediction with multi-modal goal estimation","volume":"6","author":"Yao","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"012047","DOI":"10.1088\/1742-6596\/1621\/1\/012047","article-title":"Using graph convolutional networks skeleton-based pedestrian intention estimation models for trajectory prediction","volume":"1621","author":"Cao","year":"2020","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Sui, Z., Zhou, Y., Zhao, X., Chen, A., and Ni, Y. (October, January 27). Joint Intention and Trajectory Prediction Based on Transformer. Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems, IROS 2021, Prague, Czech Republic.","DOI":"10.1109\/IROS51168.2021.9636241"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Su, Z., Huang, G., Zhang, S., and Hua, W. (2022, January 23\u201327). Crossmodal transformer based generative framework for pedestrian trajectory prediction. Proceedings of the International Conference on Robotics and Automation, ICRA 2022, Philadelphia, PA, USA.","DOI":"10.1109\/ICRA46639.2022.9812226"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1007\/s12204-023-2565-3","article-title":"Action-Aware Encoder-Decoder Network for Pedestrian Trajectory Prediction","volume":"28","author":"Fu","year":"2023","journal-title":"J. Shanghai Jiaotong Univ. (Sci.)"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Malla, S., Dariush, B., and Choi, C. (2020, January 13\u201319). Titan: Future forecast using action priors. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01120"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Czech, P., Braun, M., Kre\u00dfel, U., and Yang, B. (2022, January 12\u201314). On-Board Pedestrian Trajectory Prediction Using Behavioral Features. Proceedings of the 21st IEEE International Conference on Machine Learning and Applications (ICMLA), Nassau, Bahamas.","DOI":"10.1109\/ICMLA55696.2022.00070"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Lorenzo, J., Parra, I., Wirth, F., Stiller, C., Llorca, D.F., and Sotelo, M.A. (November, January 19). Rnn-based pedestrian crossing prediction using activity and pose-related features. Proceedings of the IEEE Intelligent Vehicles Symposium, IV 2020, Las Vegas, NV, USA.","DOI":"10.1109\/IV47402.2020.9304652"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Yau, T., Malekmohammadi, S., Rasouli, A., Lakner, P., Rohani, M., and Luo, J. (June, January 30). Graph-sim: A graph-based spatiotemporal interaction modelling for pedestrian action prediction. Proceedings of the IEEE International Conference on Robotics and Automation, ICRA 2021, Xi\u2019an, China.","DOI":"10.1109\/ICRA48506.2021.9561107"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Rasouli, A., Rohani, M., and Luo, J. (2021, January 10\u201317). Bifold and semantic reasoning for pedestrian behavior prediction. Proceedings of the IEEE\/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.01531"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Kotseruba, I., Rasouli, A., and Tsotsos, J.K. (2021, January 3\u20138). Benchmark for evaluating pedestrian action prediction. Proceedings of the IEEE Winter Conference on Applications of Computer Vision, WACV 2021, Waikoloa, HI, USA.","DOI":"10.1109\/WACV48630.2021.00130"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Yao, Y., Atkins, E., Johnson-Roberson, M., Vasudevan, R., and Du, X. (2021, January 19\u201326). Coupling Intent and Action for Pedestrian Crossing Behavior Prediction. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI 2021, Virtual Event.","DOI":"10.24963\/ijcai.2021\/171"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Rasouli, A., Kotseruba, I., and Tsotsos, J.K. (2017, January 22\u201329). Are they going to cross? A benchmark dataset and baseline for pedestrian crosswalk behavior. Proceedings of the IEEE International Conference on Computer Vision Workshops, ICCV Workshops 2017, Venice, Italy.","DOI":"10.1109\/ICCVW.2017.33"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Fang, Z., and L\u00f3pez, A.M. (2018, January 26\u201330). Is the pedestrian going to cross? Answering by 2d pose estimation. Proceedings of the IEEE Intelligent Vehicles Symposium, IV 2018, Changshu, Suzhou, China.","DOI":"10.1109\/IVS.2018.8500413"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"21050","DOI":"10.1109\/TITS.2022.3173537","article-title":"Pedestrian graph+: A fast pedestrian crossing prediction model based on graph convolutional networks","volume":"23","author":"Cadena","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"3485","DOI":"10.1109\/LRA.2020.2976305","article-title":"Spatiotemporal relationship reasoning for pedestrian intent prediction","volume":"5","author":"Liu","year":"2020","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Varytimidis, D., Alonso-Fernandez, F., Duran, B., and Englund, C. (2018, January 26\u201329). Action and intention recognition of pedestrians in urban traffic. Proceedings of the 14th International Conference on Signal-Image Technology & Internet-Based Systems, SITIS 2018, Las Palmas de Gran Canaria, Spain.","DOI":"10.1109\/SITIS.2018.00109"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Yagi, T., Mangalam, K., Yonetani, R., and Sato, Y. (2018, January 18\u201323). Future person localization in first-person videos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00792"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1109\/TIV.2022.3162719","article-title":"Predicting Pedestrian Crossing Intention with Feature Fusion and Spatio-Temporal Attention","volume":"7","author":"Yang","year":"2022","journal-title":"IEEE Trans. Intell. Veh."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Ham, J.S., Kim, D.H., Jung, N., and Moon, J. (2023, January 18\u201322). CIPF: Crossing Intention Prediction Network Based on Feature Fusion Modules for Improving Pedestrian Safety. Proceedings of the Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada.","DOI":"10.1109\/CVPRW59228.2023.00374"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Tian, R., and Ding, Z. (2023, January 7\u201314). TrEP: Transformer-based Evidential Prediction for Pedestrian Intention with Uncertainty. Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA.","DOI":"10.1609\/aaai.v37i3.25463"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"14922","DOI":"10.1109\/TITS.2021.3135136","article-title":"Pedestrian behavior prediction for automated driving: Requirements, metrics, and relevant features","volume":"23","author":"Herman","year":"2021","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1007\/s11263-018-1104-4","article-title":"Context-based path prediction for targets with switching dynamics","volume":"127","author":"Kooij","year":"2019","journal-title":"Int. J. Comput. Vis."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Ridel, D.A., Deo, N., Wolf, D., and Trivedi, M. (2019, January 9\u201312). Understanding pedestrian-vehicle interactions with vehicle mounted vision: An LSTM model and empirical analysis. Proceedings of the IEEE Intelligent Vehicles Symposium, IV 2019, Paris, France.","DOI":"10.1109\/IVS.2019.8813798"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Alahi, A., Goel, K., Ramanathan, V., Robicquet, A., Fei-Fei, L., and Savarese, S. (2016, January 27\u201330). Social lstm: Human trajectory prediction in crowded spaces. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.110"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Sadeghian, A., Kosaraju, V., Sadeghian, A., Hirose, N., Rezatofighi, H., and Savarese, S. (2019, January 15\u201320). Sophie: An attentive gan for predicting paths compliant to social and physical constraints. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00144"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Lee, N., Choi, W., Vernaza, P., Choy, C.B., Torr, P.H., and Chandraker, M. (2017, January 21\u201326). Desire: Distant future prediction in dynamic scenes with interacting agents. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.233"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Salzmann, T., Ivanovic, B., Chakravarty, P., and Pavone, M. (2020, January 23\u201328). Trajectron++: Dynamically-feasible trajectory forecasting with heterogeneous data. Proceedings of the 16th European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58523-5_40"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhao, H., and Wildes, R.P. (2021, January 10\u201317). Where are you heading? Dynamic trajectory prediction with expert goal examples. Proceedings of the IEEE\/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00753"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Neumann, L., and Vedaldi, A. (2021, January 20\u201325). Pedestrian and Ego-vehicle Trajectory Prediction from Monocular Camera. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01007"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Yin, Z., Liu, R., Xiong, Z., and Yuan, Z. (2021, January 19\u201326). Multimodal Transformer Networks for Pedestrian Trajectory Prediction. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI 2021, Virtual Event.","DOI":"10.24963\/ijcai.2021\/174"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Song, X., Kang, M., Zhou, S., Wang, J., Mao, Y., and Zheng, N. (2022, January 23\u201327). Pedestrian Intention Prediction Based on Traffic-Aware Scene Graph Model. Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems, IROS 2022, Kyoto, Japan.","DOI":"10.1109\/IROS47612.2022.9981690"},{"key":"ref_37","unstructured":"Zhai, X., Hu, Z., Yang, D., Zhou, L., and Liu, J. (2022, January 4\u20138). Social Aware Multi-Modal Pedestrian Crossing Behavior Prediction. Proceedings of the 16th Asian Conference on Computer Vision, Macao, China."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Achaji, L., Moreau, J., Fouqueray, T., Aioun, F., and Charpillet, F. (2022, January 4\u20139). Is attention to bounding boxes all you need for pedestrian action prediction?. Proceedings of the IEEE Intelligent Vehicles Symposium, IV 2022, Aachen, Germany.","DOI":"10.1109\/IV51971.2022.9827084"},{"key":"ref_39","unstructured":"Weng, J.J., Ahuja, N., and Huang, T.S. (1993, January 11\u201314). Learning recognition and segmentation of 3-D objects from 2-D images. Proceedings of the Fourth International Conference on Computer Vision, ICCV 1993, Berlin, Germany."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Chitta, K., Prakash, A., and Geiger, A. (2021, January 10\u201317). Neat: Neural attention fields for end-to-end autonomous driving. Proceedings of the IEEE\/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.01550"},{"key":"ref_41","unstructured":"Renz, K., Chitta, K., Mercea, O.B., Koepke, A., Akata, Z., and Geiger, A. (2022). PlanT: Explainable Planning Transformers via Object-Level Representations. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Chitta, K., Prakash, A., Jaeger, B., Yu, Z., Renz, K., and Geiger, A. (2022). Transfuser: Imitation with transformer-based sensor fusion for autonomous driving. IEEE Trans. Pattern Anal. Mach. Intell., early access.","DOI":"10.1109\/TPAMI.2022.3200245"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1844","DOI":"10.1109\/TPAMI.2019.2897684","article-title":"EuroCity Persons: A Novel Benchmark for Person Detection in Traffic Scenes","volume":"41","author":"Braun","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_44","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_45","first-page":"15331","article-title":"Vru pose-ssd: Multiperson pose estimation for automated driving","volume":"35","author":"Kumar","year":"2021","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Braun, M., Rao, Q., Wang, Y., and Flohr, F. (2016, January 1\u20134). Pose-rcnn: Joint object detection and pose estimation using 3d object proposals. Proceedings of the 19th IEEE International Conference on Intelligent Transportation Systems, ITSC 2016, Rio de Janeiro, Brazil.","DOI":"10.1109\/ITSC.2016.7795763"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Braun, M., Flohr, F.B., Krebs, S., Kre\u00dfel, U., and Gavrila, D.M. (2021, January 11\u201317). Simple Pair Pose-Pairwise Human Pose Estimation in Dense Urban Traffic Scenes. Proceedings of the IEEE Intelligent Vehicles Symposium, IV 2021, Nagoya, Japan.","DOI":"10.1109\/IV48863.2021.9575435"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Wang, S., Yang, D., Wang, B., Guo, Z., Verma, R., Ramesh, J., Weinrich, C., Kre\u00dfel, U., and Flohr, F.B. (2021, January 11\u201317). UrbanPose: A new benchmark for VRU pose estimation in urban traffic scenes. Proceedings of the IEEE Intelligent Vehicles Symposium, IV 2021, Nagoya, Japan.","DOI":"10.1109\/IV48863.2021.9575469"}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/5\/3\/50\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:25:15Z","timestamp":1760127915000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/5\/3\/50"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,3]]},"references-count":48,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,9]]}},"alternative-id":["make5030050"],"URL":"https:\/\/doi.org\/10.3390\/make5030050","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,3]]}}}