{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,21]],"date-time":"2026-05-21T10:26:32Z","timestamp":1779359192220,"version":"3.51.4"},"reference-count":76,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2023,2,16]],"date-time":"2023-02-16T00:00:00Z","timestamp":1676505600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,2,16]],"date-time":"2023-02-16T00:00:00Z","timestamp":1676505600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"crossref","award":["336116, 345122, 316765"],"award-info":[{"award-number":["336116, 345122, 316765"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100018948","name":"Infotech Oulu","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100018948","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Vis"],"published-print":{"date-parts":[[2023,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We explore using body gestures for hidden emotional state analysis. As an important non-verbal communicative fashion, human body gestures are capable of conveying emotional information during social communication. In previous works, efforts have been made mainly on facial expressions, speech, or expressive body gestures to interpret classical expressive emotions. Differently, we focus on a specific group of body gestures, called micro-gestures (MGs), used in the psychology research field to interpret inner human feelings. MGs are subtle and spontaneous body movements that are proven, together with micro-expressions, to be more reliable than normal facial expressions for conveying hidden emotional information. In this work, a comprehensive study of MGs is presented from the computer vision aspect, including a novel spontaneous micro-gesture (SMG) dataset with two emotional stress states and a comprehensive statistical analysis indicating the correlations between MGs and emotional states. Novel frameworks are further presented together with various state-of-the-art methods as benchmarks for automatic classification, online recognition of MGs, and emotional stress state recognition. The dataset and methods presented could inspire a new way of utilizing body gestures for human emotion understanding and bring a new direction to the emotion AI community. The source code and dataset are made available:<jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/mikecheninoulu\/SMG\">https:\/\/github.com\/mikecheninoulu\/SMG<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s11263-023-01761-6","type":"journal-article","created":{"date-parts":[[2023,2,16]],"date-time":"2023-02-16T06:29:06Z","timestamp":1676528946000},"page":"1346-1366","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":58,"title":["SMG: A Micro-gesture Dataset Towards Spontaneous Body Gestures for Emotional Stress State Analysis"],"prefix":"10.1007","volume":"131","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3267-2664","authenticated-orcid":false,"given":"Haoyu","family":"Chen","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Henglin","family":"Shi","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xin","family":"Liu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaobai","family":"Li","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guoying","family":"Zhao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,2,16]]},"reference":[{"issue":"6111","key":"1761_CR1","doi-asserted-by":"publisher","first-page":"1225","DOI":"10.1126\/science.1224313","volume":"338","author":"H Aviezer","year":"2012","unstructured":"Aviezer, H., Trope, Y., & Todorov, A. (2012). Body cues, not facial expressions, discriminate between intense positive and negative emotions. Science, 338(6111), 1225\u20131229.","journal-title":"Science"},{"key":"1761_CR2","volume-title":"Nonverbal communication: The unspoken dialogue","author":"J Burgoon","year":"1994","unstructured":"Burgoon, J., Buller, D., & WG, W. (1994). Nonverbal communication: The unspoken dialogue. Greyden Press."},{"issue":"1","key":"1761_CR3","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1109\/TPAMI.2019.2929257","volume":"43","author":"Z Cao","year":"2019","unstructured":"Cao, Z., Hidalgo, G., Simon, T., Wei, S. E., & Sheikh, Y. (2019). Openpose: realtime multi-person 2d pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(1), 172\u2013186.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"1761_CR4","doi-asserted-by":"crossref","unstructured":"Carreira, J., & Zisserman, A. (2017). Quo vadis, action recognition? a new model and the kinetics dataset. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 6299\u20136308).","DOI":"10.1109\/CVPR.2017.502"},{"key":"1761_CR5","doi-asserted-by":"crossref","unstructured":"Chen, H., Liu, X., Li, X., Shi, H., & Zhao, G. (2019). Analyze spontaneous gestures for emotional stress state recognition: A micro-gesture dataset and analysis with deep learning. In Proceedings of the IEEE international conference on automatic face & gesture recognition (pp. 1\u20138).","DOI":"10.1109\/FG.2019.8756513"},{"key":"1761_CR6","doi-asserted-by":"publisher","first-page":"9689","DOI":"10.1109\/TIP.2020.3028962","volume":"29","author":"H Chen","year":"2020","unstructured":"Chen, H., Liu, X., Shi, J., & Zhao, G. (2020). Temporal hierarchical dictionary guided decoding for online gesture segmentation and recognition. IEEE Transactions on Image Processing, 29, 9689\u20139702.","journal-title":"IEEE Transactions on Image Processing"},{"key":"1761_CR7","doi-asserted-by":"crossref","unstructured":"Cheng, K., Zhang, Y., He, X., Chen, W., Cheng, J., & Lu, H. (2020). Skeleton-based action recognition with shift graph convolutional network. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 183\u2013192).","DOI":"10.1109\/CVPR42600.2020.00026"},{"key":"1761_CR8","doi-asserted-by":"crossref","unstructured":"Crasto, N., Weinzaepfel, P., Alahari, K., & Schmid, C. (2019). MARS: Motion-augmented RGB stream for action recognition. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition.","DOI":"10.1109\/CVPR.2019.00807"},{"key":"1761_CR9","volume-title":"The gift of fear","author":"G de Becker","year":"1997","unstructured":"de Becker, G. (1997). The gift of fear. Dell Publishing."},{"key":"1761_CR10","unstructured":"de\u00a0Lara, N., & Pineau, E. (2018). A simple baseline algorithm for graph classification. In Relational representation learning workshop, the conference on neural information processing systems."},{"key":"1761_CR11","doi-asserted-by":"publisher","first-page":"205","DOI":"10.1196\/annals.1280.010","volume":"1000","author":"P Ekman","year":"2004","unstructured":"Ekman, P. (2004). Darwin, deception, and facial expression. Annals of the New York Academy of Sciences, 1000, 205\u201321.","journal-title":"Annals of the New York Academy of Sciences"},{"key":"1761_CR12","volume-title":"What the face reveals: Basic and applied studies of spontaneous expression using the Facial Action Coding System (FACS)","author":"R Ekman","year":"1997","unstructured":"Ekman, R. (1997). What the face reveals: Basic and applied studies of spontaneous expression using the Facial Action Coding System (FACS). Oxford University Press."},{"issue":"3","key":"1761_CR13","doi-asserted-by":"publisher","first-page":"572","DOI":"10.1016\/j.patcog.2010.09.020","volume":"44","author":"M El Ayadi","year":"2011","unstructured":"El Ayadi, M., Kamel, M. S., & Karray, F. (2011). Survey on speech emotion recognition: Features, classification schemes, and databases. Pattern Recognition, 44(3), 572\u2013587.","journal-title":"Pattern Recognition"},{"key":"1761_CR14","doi-asserted-by":"crossref","unstructured":"Escalera, S., Bar\u00f3, X., Gonz\u00e0lez, J., Bautista, M.A., Madadi, M., Reyes, M., Ponce-L\u00f3pez, V., Escalante, H.J., Shotton, J., & Guyon, I. (2015). Chalearn looking at people challenge 2014: Dataset and results. In Proceedings of the European conference on computer vision (pp. 459\u2013473).","DOI":"10.1007\/978-3-319-16178-5_32"},{"issue":"3","key":"1761_CR15","first-page":"238","volume":"57","author":"E Fix","year":"1989","unstructured":"Fix, E., & Hodges, J. L. (1989). Discriminatory analysis. nonparametric discrimination: Consistency properties. International Statistical Review\/Revue Internationale de Statistique, 57(3), 238\u2013247.","journal-title":"International Statistical Review\/Revue Internationale de Statistique"},{"key":"1761_CR16","unstructured":"Ginevra, C., Loic, K., & George, C. (2008). Emotion recognition through multiple modalities: Face, body gesture, speech (pp. 92\u2013103). Springer."},{"key":"1761_CR17","doi-asserted-by":"crossref","unstructured":"Goyal, R., Ebrahimi\u00a0Kahou, S., Michalski, V., Materzynska, J., Westphal, S., Kim, H., Haenel, V., Fruend, I., Yianilos, P., & Mueller-Freitag, M., et\u00a0al. (2017). The\u201d something something\u201d video database for learning and evaluating visual common sense. In: Proceedings of the IEEE international conference on computer vision (pp. 5842\u20135850).","DOI":"10.1109\/ICCV.2017.622"},{"issue":"3","key":"1761_CR18","doi-asserted-by":"publisher","first-page":"469","DOI":"10.1017\/S0140525X00013066","volume":"5","author":"JA Gray","year":"1982","unstructured":"Gray, J. A. (1982). Pr\u00e9cis of the neuropsychology of anxiety: An enquiry into the functions of the septo-hippocampal system. Behavioral and Brain Sciences, 5(3), 469\u2013484.","journal-title":"Behavioral and Brain Sciences"},{"issue":"7","key":"1761_CR19","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pone.0066762","volume":"8","author":"Y Gu","year":"2013","unstructured":"Gu, Y., Mai, X., & Luo, Y. (2013). Do bodily expressions compete with facial expressions? Time course of integration of emotional signals from the face and the body. PLOS ONE, 8(7), 1\u20139.","journal-title":"PLOS ONE"},{"key":"1761_CR20","doi-asserted-by":"crossref","unstructured":"Gunes, H., & Piccardi, M. (2006). A bimodal face and body gesture database for automatic analysis of human nonverbal affective behavior. In 18th international conference on pattern recognition (vol.\u00a01, pp. 1148\u20131153).","DOI":"10.1109\/ICPR.2006.39"},{"key":"1761_CR21","doi-asserted-by":"crossref","unstructured":"Hara, K., Kataoka, H., & Satoh, Y. (2018). Can spatiotemporal 3d cnns retrace the history of 2D CNNs and imagenet? In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 6546\u20136555).","DOI":"10.1109\/CVPR.2018.00685"},{"key":"1761_CR22","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770\u2013778).","DOI":"10.1109\/CVPR.2016.90"},{"key":"1761_CR23","unstructured":"Ho, T. K. (1995). Random decision forests. In Proceedings of international conference on document analysis and recognition (vol.\u00a01, pp. 278\u2013282)."},{"issue":"4","key":"1761_CR24","doi-asserted-by":"publisher","first-page":"161","DOI":"10.5121\/ijaia.2012.3412","volume":"3","author":"RZ Khan","year":"2012","unstructured":"Khan, R. Z., & Ibraheem, N. A. (2012). Hand gesture recognition: A literature review. International Journal of Artificial Intelligence & Applications, 3(4), 161.","journal-title":"International Journal of Artificial Intelligence & Applications"},{"key":"1761_CR25","doi-asserted-by":"crossref","unstructured":"Kipp, M., & Martin, J. C. (2009). Gesture and emotion: Can basic gestural form features discriminate emotions? In International conference on affective computing and intelligent interaction and workshops (pp. 1\u20138).","DOI":"10.1109\/ACII.2009.5349544"},{"key":"1761_CR26","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1037\/rev0000059","volume":"124","author":"S Kita","year":"2017","unstructured":"Kita, S., Alibali, M., & Chu, M. (2017). How do gestures influence thinking and speaking? the gesture-for-conceptualization hypothesis. Psychological Review, 124, 245\u2013266.","journal-title":"Psychological Review"},{"issue":"4","key":"1761_CR27","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1145\/3185521","volume":"61","author":"M Krakovsky","year":"2018","unstructured":"Krakovsky, M. (2018). Artificial (emotional) intelligence. Communications of the ACM, 61(4), 18\u201319.","journal-title":"Communications of the ACM"},{"key":"1761_CR28","doi-asserted-by":"crossref","unstructured":"Kuehne, H., Richard, A., & Gall, J. (2019). A hybrid RNN-HMM approach for weakly supervised temporal action segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence.","DOI":"10.1109\/TPAMI.2018.2884469"},{"key":"1761_CR29","volume-title":"Body language for dummies","author":"E Kuhnke","year":"2009","unstructured":"Kuhnke, E. (2009). Body language for dummies. Wiley."},{"key":"1761_CR30","unstructured":"Li, S., & Deng, W. (2020). Deep facial expression recognition: A survey. IEEE Transactions on Affective Computing."},{"key":"1761_CR31","doi-asserted-by":"crossref","unstructured":"Li, Y., Lan, C., Xing, J., Zeng, W., Yuan, C., & Liu, J. (2016). Online human action detection using joint classification-regression recurrent neural networks. In Proceedings of the European conference on computer vision.","DOI":"10.1007\/978-3-319-46478-7_13"},{"key":"1761_CR32","doi-asserted-by":"crossref","unstructured":"Lin, J., Gan, C., & Han, S. (2019). TSM: Temporal shift module for efficient video understanding. In Proceedings of the IEEE\/CVF international conference on computer vision (pp. 7083\u20137093).","DOI":"10.1109\/ICCV.2019.00718"},{"key":"1761_CR33","doi-asserted-by":"crossref","unstructured":"Liu, J., Shahroudy, A., Xu, D., & Wang, G. (2016). Spatio-temporal LSTM with trust gates for 3D human action recognition. In: Proceedings of the European conference on computer vision.","DOI":"10.1007\/978-3-319-46487-9_50"},{"key":"1761_CR34","doi-asserted-by":"crossref","unstructured":"Liu, J., Shahroudy, A., Wang, G., Duan, L.Y., & Kot, A.C. (2018). Ssnet: Scale selection network for online 3D action prediction. In: Proceedings of the IEEE conference on computer vision and pattern recognition.","DOI":"10.1109\/CVPR.2018.00871"},{"key":"1761_CR35","doi-asserted-by":"crossref","unstructured":"Liu, J., Wang, G., Hu, P., Duan, L.Y., & Kot, A. C. (2017). Global context-aware attention LSTM networks for 3d action recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition.","DOI":"10.1109\/CVPR.2017.391"},{"key":"1761_CR36","doi-asserted-by":"crossref","unstructured":"Liu, X., Shi, H., Chen, H., Yu, Z., Li, X., & Zhao, G. (2021). imigue: An identity-free video dataset for micro-gesture understanding and emotion analysis. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 10631\u201310642).","DOI":"10.1109\/CVPR46437.2021.01049"},{"key":"1761_CR37","doi-asserted-by":"crossref","unstructured":"Liu, Z., Zhang, H., Chen, Z., Wang, Z., & Ouyang, W. (2020). Disentangling and unifying graph convolutions for skeleton-based action recognition. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 143\u2013152).","DOI":"10.1109\/CVPR42600.2020.00022"},{"issue":"1","key":"1761_CR38","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s11263-019-01215-y","volume":"128","author":"Y Luo","year":"2020","unstructured":"Luo, Y., Ye, J., Adams, R. B., Li, J., Newman, M. G., & Wang, J. Z. (2020). Arbee: Towards automated recognition of bodily expression of emotion in the wild. International Journal of Computer Vision, 128(1), 1\u201325.","journal-title":"International Journal of Computer Vision"},{"key":"1761_CR39","doi-asserted-by":"crossref","unstructured":"Mahmoud, M., Baltru\u0161aitis, T., Robinson, P., & Riek, L.D. (2011). 3D corpus of spontaneous complex mental states. In International conference on affective computing and intelligent interaction (pp. 205\u2013214).","DOI":"10.1007\/978-3-642-24600-5_24"},{"key":"1761_CR40","volume-title":"What every BODY is saying: An ex-FBI agent\u2019s guide to speed reading people","author":"J Navarro","year":"2008","unstructured":"Navarro, J., & Karlins, M. (2008). What every BODY is saying: An ex-FBI agent\u2019s guide to speed reading people. Collins."},{"key":"1761_CR41","doi-asserted-by":"crossref","unstructured":"Neverova, N., Wolf, C., Taylor, G., & Nebout, F. (2016). Moddrop: Adaptive multi-modal gesture recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 38(8).","DOI":"10.1109\/TPAMI.2015.2461544"},{"key":"1761_CR42","unstructured":"Noroozi, F., Kaminska, D., Corneanu, C., Sapinski, T., Escalera, S., & Anbarjafari, G. (2018). Survey on emotional body gesture recognition. IEEE Transactions on Affective Computing."},{"key":"1761_CR43","doi-asserted-by":"crossref","unstructured":"Oh, S. J., Benenson, R., Fritz, M., & Schiele, B. (2016). Faceless person recognition: Privacy implications in social media. In Proceedings of the European conference on computer vision (pp. 19\u201335).","DOI":"10.1007\/978-3-319-46487-9_2"},{"key":"1761_CR44","doi-asserted-by":"crossref","unstructured":"Palena, N., Caso, L., Vrij, A., & Orthey, R. (2018). Detecting deception through small talk and comparable truth baselines. Journal of Investigative Psychology and Offender Profiling 15.","DOI":"10.1002\/jip.1495"},{"key":"1761_CR45","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780195096736.001.0001","volume-title":"Affective neuroscience: The foundations of human and animal emotions","author":"J Panksepp","year":"1998","unstructured":"Panksepp, J. (1998). Affective neuroscience: The foundations of human and animal emotions. Oxford University Press."},{"key":"1761_CR46","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et\u00a0al. (2019) Pytorch: An imperative style, high-performance deep learning library. In Advances in neural information processing systems."},{"key":"1761_CR47","doi-asserted-by":"crossref","unstructured":"Peng, W., Hong, X., Chen, H., & Zhao, G. (2020). Learning graph convolutional network for skeleton-based human action recognition by neural searching. In Proceedings of the AAAI conference on artificial intelligence.","DOI":"10.1609\/aaai.v34i03.5652"},{"key":"1761_CR48","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/8022.001.0001","volume-title":"Honest signals: How they shape our world","author":"A Pentland","year":"2008","unstructured":"Pentland, A. (2008). Honest signals: How they shape our world. MIT Press."},{"issue":"3","key":"1761_CR49","doi-asserted-by":"publisher","first-page":"269","DOI":"10.1007\/s10339-016-0757-6","volume":"17","author":"WT Pouw","year":"2016","unstructured":"Pouw, W. T., Mavilidi, M. F., Van Gog, T., & Paas, F. (2016). Gesturing during mental problem solving reduces eye movements, especially for individuals with lower visual working memory capacity. Cognitive Processing, 17(3), 269\u2013277.","journal-title":"Cognitive Processing"},{"key":"1761_CR50","doi-asserted-by":"crossref","unstructured":"Richard, A., Kuehne, H., Iqbal, A., & Gall, J. (2018). Neuralnetwork-viterbi: A framework for weakly supervised video learning. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition.","DOI":"10.1109\/CVPR.2018.00771"},{"issue":"6088","key":"1761_CR51","doi-asserted-by":"publisher","first-page":"533","DOI":"10.1038\/323533a0","volume":"323","author":"DE Rumelhart","year":"1986","unstructured":"Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533\u2013536.","journal-title":"Nature"},{"issue":"1","key":"1761_CR52","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1109\/TNN.2008.2005605","volume":"20","author":"F Scarselli","year":"2008","unstructured":"Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., & Monfardini, G. (2008). The graph neural network model. IEEE Transactions on Neural Networks, 20(1), 61\u201380.","journal-title":"IEEE Transactions on Neural Networks"},{"key":"1761_CR53","doi-asserted-by":"crossref","unstructured":"Schapire, R. E. (2013). Explaining adaboost. In Empirical inference (pp. 37\u201352). Springer.","DOI":"10.1007\/978-3-642-41136-6_5"},{"issue":"9","key":"1761_CR54","doi-asserted-by":"publisher","first-page":"1238","DOI":"10.1016\/j.neunet.2008.05.003","volume":"21","author":"K Schindler","year":"2008","unstructured":"Schindler, K., Van Gool, L., & De Gelder, B. (2008). Recognizing emotions expressed by body pose: A biologically inspired neural model. Neural Networks, 21(9), 1238\u20131246.","journal-title":"Neural Networks"},{"key":"1761_CR55","unstructured":"Serge, G. (1995). International Glossary of Gestalt Psychotherapy. FORGE."},{"key":"1761_CR56","doi-asserted-by":"crossref","unstructured":"Shahroudy, A., Liu, J., Ng, T. T., & Wang, G. (2016). Ntu rgb+d: A large scale dataset for 3D human activity analysis. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition.","DOI":"10.1109\/CVPR.2016.115"},{"key":"1761_CR57","doi-asserted-by":"crossref","unstructured":"Shi, L., Zhang, Y., Cheng, J., & Lu, H. (2019). Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 12026\u201312035).","DOI":"10.1109\/CVPR.2019.01230"},{"key":"1761_CR58","doi-asserted-by":"crossref","unstructured":"Shiffrar, M., Kaiser, M., & Chouchourelou, A. (2011). Seeing human movement as inherently social. The Science of Social Vision.","DOI":"10.1093\/acprof:oso\/9780195333176.003.0015"},{"key":"1761_CR59","doi-asserted-by":"crossref","unstructured":"Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., & Blake, A. (2011). Real-time human pose recognition in parts from single depth images. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 1297\u20131304).","DOI":"10.1109\/CVPR.2011.5995316"},{"key":"1761_CR60","unstructured":"Soomro, K., Zamir, A. R., & Shah, M. (2012). Ucf101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402."},{"key":"1761_CR61","doi-asserted-by":"crossref","unstructured":"Sun, S., Kuang, Z., Sheng, L., Ouyang, W., & Zhang, W. (2018). Optical flow guided feature: A fast and robust motion representation for video action recognition. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition.","DOI":"10.1109\/CVPR.2018.00151"},{"key":"1761_CR62","doi-asserted-by":"crossref","unstructured":"Tran, D., Bourdev, L., Fergus, R., Torresani, L., & Paluri, M. (2015). Learning spatiotemporal features with 3D convolutional networks. In Proceedings of the IEEE\/CVF international conference on computer vision (pp. 4489\u20134497).","DOI":"10.1109\/ICCV.2015.510"},{"key":"1761_CR63","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems 30."},{"issue":"2","key":"1761_CR64","doi-asserted-by":"publisher","first-page":"265","DOI":"10.1111\/lcrp.12126","volume":"23","author":"A Vrij","year":"2018","unstructured":"Vrij, A., Leal, S., Jupe, L., & Harvey, A. (2018). Within-subjects verbal lie detection measures: A comparison between total detail and proportion of complications. Legal and Criminological Psychology, 23(2), 265\u2013279.","journal-title":"Legal and Criminological Psychology"},{"issue":"1","key":"1761_CR65","doi-asserted-by":"publisher","first-page":"9","DOI":"10.5093\/ejpalc2021a2","volume":"13","author":"A Vrij","year":"2020","unstructured":"Vrij, A., Mann, S., Leal, S., & Fisher, R. P. (2020). Combining verbal veracity assessment techniques to distinguish truth tellers from lie tellers. European Journal of Psychology Applied to Legal Context, 13(1), 9\u201319.","journal-title":"European Journal of Psychology Applied to Legal Context"},{"key":"1761_CR66","doi-asserted-by":"crossref","unstructured":"Wallbott, H. G. (1998). Bodily expression of emotion. European Journal of Social Psychology, 28(6), 879\u2013896.","DOI":"10.1002\/(SICI)1099-0992(1998110)28:6<879::AID-EJSP901>3.0.CO;2-W"},{"issue":"11","key":"1761_CR67","doi-asserted-by":"publisher","first-page":"2740","DOI":"10.1109\/TPAMI.2018.2868668","volume":"41","author":"L Wang","year":"2018","unstructured":"Wang, L., Xiong, Y., Wang, Z., Qiao, Y., Lin, D., Tang, X., & Van Gool, L. (2018). Temporal segment networks for action recognition in videos. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(11), 2740\u20132755.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"1761_CR68","doi-asserted-by":"crossref","unstructured":"Wu, D., Pigou, L., Kindermans, P.J., Le, N.D.H., Shao, L., Dambre, J., & Odobez, J.M. (2016). Deep dynamic neural networks for multimodal gesture segmentation and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 38(8).","DOI":"10.1109\/TPAMI.2016.2537340"},{"key":"1761_CR69","doi-asserted-by":"crossref","unstructured":"Xu, M., Gao, M., Chen, Y. T., Davis, L. S., & Crandall, D. J. (2019a). Temporal recurrent networks for online action detection. In Proceedings of the IEEE\/CVF international conference on computer vision (pp. 5532\u20135541).","DOI":"10.1109\/ICCV.2019.00563"},{"key":"1761_CR70","doi-asserted-by":"crossref","unstructured":"Xu, M., Gao, M., Chen, Y.T., Davis, L. S., & Crandall, D. J. (2019b). Temporal recurrent networks for online action detection. In Proceedings of the IEEE\/CVF international conference on computer vision.","DOI":"10.1109\/ICCV.2019.00563"},{"key":"1761_CR71","doi-asserted-by":"crossref","unstructured":"Yan, S., Xiong, Y., & Lin, D. (2018). Spatial temporal graph convolutional networks for skeleton-based action recognition. In Proceedings of the AAAI conference on artificial intelligence (vol.\u00a032).","DOI":"10.1609\/aaai.v32i1.12328"},{"key":"1761_CR72","doi-asserted-by":"crossref","unstructured":"You, Y., Chen, T., Wang, Z., & Shen, Y. (2020). L2-gcn: Layer-wise and learned efficient training of graph convolutional networks. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 2127\u20132135).","DOI":"10.1109\/CVPR42600.2020.00220"},{"key":"1761_CR73","doi-asserted-by":"crossref","unstructured":"Yu, N. (2008). Metaphor from body and culture. The Cambridge handbook of metaphor and thought (pp. 247\u2013261).","DOI":"10.1017\/CBO9780511816802.016"},{"key":"1761_CR74","doi-asserted-by":"crossref","unstructured":"Yu, Z., Zhou, B., Wan, J., Wang, P., Chen, H., Liu, X., Li, S. Z., & Zhao, G. (2020). Searching multi-rate and multi-modal temporal enhanced networks for gesture recognition. IEEE Transactions on Image Processing.","DOI":"10.1109\/TIP.2021.3087348"},{"key":"1761_CR75","doi-asserted-by":"crossref","unstructured":"Zanfir, M., Leordeanu, M., & Sminchisescu, C. (2013). The moving pose: An efficient 3D kinematics descriptor for low-latency action recognition and detection. In Proceedings of the IEEE\/CVF international conference on computer vision.","DOI":"10.1109\/ICCV.2013.342"},{"key":"1761_CR76","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Pal, S., Coates, M., & Ustebay, D. (2019). Bayesian graph convolutional neural networks for semi-supervised classification. In Proceedings of the AAAI conference on artificial intelligence (vol.\u00a033, pp. 5829\u20135836).","DOI":"10.1609\/aaai.v33i01.33015829"}],"container-title":["International Journal of Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-023-01761-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11263-023-01761-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-023-01761-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,6]],"date-time":"2023-12-06T23:50:44Z","timestamp":1701906644000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11263-023-01761-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,16]]},"references-count":76,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,6]]}},"alternative-id":["1761"],"URL":"https:\/\/doi.org\/10.1007\/s11263-023-01761-6","relation":{},"ISSN":["0920-5691","1573-1405"],"issn-type":[{"value":"0920-5691","type":"print"},{"value":"1573-1405","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,16]]},"assertion":[{"value":"4 May 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 January 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 February 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}