{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T15:05:01Z","timestamp":1773414301917,"version":"3.50.1"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2021,7,8]],"date-time":"2021-07-08T00:00:00Z","timestamp":1625702400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,7,8]],"date-time":"2021-07-08T00:00:00Z","timestamp":1625702400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100014440","name":"Ministerio de Ciencia, Innovaci\u00f3n y Universidades","doi-asserted-by":"publisher","award":["RTI2018-096652-B-I00"],"award-info":[{"award-number":["RTI2018-096652-B-I00"]}],"id":[{"id":"10.13039\/100014440","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Programa de Apoyo a Proyectos de Investigaci\u00f3n de la Junta de Castilla y Le\u00f3n","award":["VA233P18"],"award-info":[{"award-number":["VA233P18"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"published-print":{"date-parts":[[2022,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Gaze control represents an important issue in the interaction between a robot and humans. Specifically, deciding who to pay attention to in a multi-party conversation is one way to improve the naturalness of a robot in human-robot interaction. This control can be carried out by means of two different models that receive the stimuli produced by the participants in an interaction, either an on-center off-surround competitive network or a recurrent neural network. A system based on a competitive neural network is able to decide who to look at with a smooth transition in the focus of attention when significant changes in stimuli occur. An important aspect in this process is the configuration of the different parameters of such neural network. The weights of the different stimuli have to be computed to achieve human-like behavior. This article explains how these weights can be obtained by solving an optimization problem. In addition, a new model using a recurrent neural network with LSTM layers is presented. This model uses the same set of stimuli but does not require its weighting. This new model is easier to train, avoiding manual configurations, and offers promising results in robot gaze control. The experiments carried out and some results are also presented.<\/jats:p>","DOI":"10.1007\/s11042-021-11112-7","type":"journal-article","created":{"date-parts":[[2021,7,8]],"date-time":"2021-07-08T02:02:38Z","timestamp":1625709758000},"page":"3351-3368","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Optimization and improvement of a robotics gaze control system using LSTM networks"],"prefix":"10.1007","volume":"81","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6649-5550","authenticated-orcid":false,"given":"Jaime Duque","family":"Domingo","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jaime","family":"G\u00f3mez-Garc\u00eda-Bermejo","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eduardo","family":"Zalama","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,7,8]]},"reference":[{"issue":"33","key":"11112_CR1","doi-asserted-by":"publisher","first-page":"24,013","DOI":"10.1007\/s11042-019-08293-7","volume":"79","author":"S Abd El-Moneim","year":"2020","unstructured":"Abd El-Moneim S, Nassar M, Dessouky MI, Ismail NA, El-Fishawy AS, Abd El-Samie FE (2020) Text-independent speaker recognition using lstm-rnn and speech enhancement. Mult Tools Appl 79(33):24,013\u201324,028","journal-title":"Mult Tools Appl"},{"issue":"1","key":"11112_CR2","doi-asserted-by":"publisher","first-page":"25","DOI":"10.5898\/JHRI.6.1.Admoni","volume":"6","author":"H Admoni","year":"2017","unstructured":"Admoni H, Scassellati B (2017) Social eye gaze in human-robot interaction: a review. J Human Robot Interact 6(1):25\u201363","journal-title":"J Human Robot Interact"},{"issue":"7","key":"11112_CR3","doi-asserted-by":"publisher","first-page":"9913","DOI":"10.3390\/s120709913","volume":"12","author":"F Alonso-Mart\u00edn","year":"2012","unstructured":"Alonso-Mart\u00edn F, Gorostiza JF, Malfaz M, Salichs MA (2012) User localization during human-robot interaction. Sensors 12(7):9913\u20139935","journal-title":"Sensors"},{"key":"11112_CR4","doi-asserted-by":"crossref","unstructured":"Andrist S, Mutlu B, Tapus A (2015) Look like me: matching robot personality via gaze to increase motivation. In: Proceedings of the 33rd annual ACM conference on human factors in computing systems, pp 3603\u20133612. ACM","DOI":"10.1145\/2702123.2702592"},{"key":"11112_CR5","unstructured":"Bendris M, Charlet D, Chollet G (2010) Lip activity detection for talking faces classification in tv-content. In: International conference on machine vision, pp 187\u2013190"},{"key":"11112_CR6","doi-asserted-by":"crossref","unstructured":"Benrachou DE, dos Santos FN, Boulebtateche B, Bensaoula S (2015) Online vision-based eye detection: Lbp\/svm vs lbp\/lstm-rnn. In: CONTROLO\u20192014\u2013proceedings of the 11th Portuguese conference on automatic control, pp 659\u2013668. Springer","DOI":"10.1007\/978-3-319-10380-8_63"},{"issue":"19","key":"11112_CR7","doi-asserted-by":"publisher","first-page":"27,309","DOI":"10.1007\/s11042-019-07827-3","volume":"78","author":"F Carrara","year":"2019","unstructured":"Carrara F, Elias P, Sedmidubsky J, Zezula P (2019) Lstm-based real-time action detection and prediction in human motion streams. Multimed Tools Appl 78(19):27,309\u201327,331","journal-title":"Multimed Tools Appl"},{"issue":"2","key":"11112_CR8","doi-asserted-by":"publisher","first-page":"2754","DOI":"10.1109\/LRA.2020.2972868","volume":"5","author":"Y Chen","year":"2020","unstructured":"Chen Y, Liu C, Shi BE, Liu M (2020) Robot navigation in crowds by graph convolutional networks with attention learned from human gaze. IEEE Robot Auto Lett 5(2):2754\u20132761","journal-title":"IEEE Robot Auto Lett"},{"key":"11112_CR9","unstructured":"Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection"},{"key":"11112_CR10","doi-asserted-by":"crossref","unstructured":"Domingo JD, G\u00f3mez-Garc\u00eda-Bermejo J, Zalama E (2020) Optimization of a robotics gaze control system. In: Workshop of physical agents, pp 213\u2013226. Springer","DOI":"10.1007\/978-3-030-62579-5_15"},{"key":"11112_CR11","doi-asserted-by":"publisher","first-page":"34","DOI":"10.3389\/fnbot.2020.00034","volume":"14","author":"J Duque-Domingo","year":"2020","unstructured":"Duque-Domingo J, G\u00f3mez-Garc\u00eda-Bermejo J, Zalama E (2020) Gaze control of a robotic head for realistic interaction with humans. Front Neurorobot 14:34","journal-title":"Front Neurorobot"},{"key":"11112_CR12","unstructured":"King E (2015) D.: Max-margin object detection. arXiv:1502.00046"},{"issue":"6","key":"11112_CR13","doi-asserted-by":"publisher","first-page":"581","DOI":"10.1016\/S0149-7634(00)00025-7","volume":"24","author":"NJ Emery","year":"2000","unstructured":"Emery NJ (2000) The eyes have it: the neuroethology, function and evolution of social gaze. Neurosci Biobehav Rev 24(6):581\u2013604","journal-title":"Neurosci Biobehav Rev"},{"key":"11112_CR14","doi-asserted-by":"crossref","unstructured":"Fan L, Wang W, Huang S, Tang X, Zhu SC (2019) Understanding human gaze communication by spatio-temporal graph reasoning. In: Proceedings of the IEEE international conference on computer vision, pp 5724\u20135733","DOI":"10.1109\/ICCV.2019.00582"},{"key":"11112_CR15","doi-asserted-by":"crossref","unstructured":"Garau M, Slater M, Bee S, Sasse MA (2001) The impact of eye gaze on communication using humanoid avatars. In: Proceedings of the SIGCHI conference on Human factors in computing systems, pp 309\u2013316. ACM","DOI":"10.1145\/365024.365121"},{"issue":"1","key":"11112_CR16","first-page":"1","volume":"28","author":"D Gergle","year":"2013","unstructured":"Gergle D, Kraut RE, Fussell SR (2013) Using visual information for grounding and awareness in collaborative tasks. Human Comput Interact 28(1):1\u201339","journal-title":"Human Comput Interact"},{"key":"11112_CR17","doi-asserted-by":"crossref","unstructured":"Grossberg S (1982) Contour enhancement, short term memory, and constancies in reverberating neural networks. In: Studies of mind and brain, pp 332\u2013378. Springer","DOI":"10.1007\/978-94-009-7758-7_8"},{"issue":"2\/3","key":"11112_CR18","doi-asserted-by":"publisher","first-page":"83","DOI":"10.1086\/200975","volume":"9","author":"ET Hall","year":"1968","unstructured":"Hall ET, Birdwhistell RL, Bock B, Bohannan P, Diebold JrAR, Durbin M, Edmonson MS, Fischer J, Hymes D, Kimball ST et al (1968) Proxemics [and comments and replies]. Curr Anthropol 9(2\/3):83\u2013108","journal-title":"Curr Anthropol"},{"issue":"8","key":"11112_CR19","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735\u20131780","journal-title":"Neural Comput"},{"key":"11112_CR20","doi-asserted-by":"crossref","unstructured":"Kazemi V, Sullivan J (2014) One millisecond face alignment with an ensemble of regression trees. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1867\u20131874","DOI":"10.1109\/CVPR.2014.241"},{"issue":"1-2","key":"11112_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1207\/s15327051hci1901&2_1","volume":"19","author":"S Kiesler","year":"2004","unstructured":"Kiesler S, Hinds P (2004) Introduction to this special issue on human-robot interaction. Human Comput Interact 19(1-2):1\u20138","journal-title":"Human Comput Interact"},{"issue":"Jul","key":"11112_CR22","first-page":"1755","volume":"10","author":"DE King","year":"2009","unstructured":"King DE (2009) Dlib-ml: A machine learning toolkit. J Mach Learn Res 10(Jul):1755\u20131758","journal-title":"J Mach Learn Res"},{"key":"11112_CR23","doi-asserted-by":"crossref","unstructured":"Koochaki F, Najafizadeh L (2019) Eye gaze-based early intent prediction utilizing cnn-lstm. In: 2019 41st Annual international conference of the IEEE engineering in medicine and biology society (EMBC), pp 1310\u20131313. IEEE","DOI":"10.1109\/EMBC.2019.8857054"},{"key":"11112_CR24","unstructured":"Kousidis S, Schlangen D (2015) The power of a glance: Evaluating embodiment and turn-tracking strategies of an active robotic overhearer. In: 2015 AAAI Spring symposium series"},{"key":"11112_CR25","unstructured":"Kraft D, Schnepper K (1989) Slsqp\u2014a nonlinear programming method with quadratic programming subproblems. DLR Oberpfaffenhofen"},{"key":"11112_CR26","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1016\/j.patrec.2018.05.023","volume":"118","author":"S Lathuili\u00e8re","year":"2019","unstructured":"Lathuili\u00e8re S, Mass\u00e9 B, Mesejo P, Horaud R (2019) Neural network based reinforcement learning for audio\u2013visual gaze control in human\u2013robot interaction. Pattern Recogn. Lett. 118:61\u201371","journal-title":"Pattern Recogn. Lett."},{"issue":"4","key":"11112_CR27","doi-asserted-by":"publisher","first-page":"4527","DOI":"10.1007\/s11042-018-6058-6","volume":"78","author":"F Liu","year":"2019","unstructured":"Liu F, Chen Z, Wang J (2019) Video image target monitoring based on rnn-lstm. Multimed Tools Appl 78(4):4527\u20134544","journal-title":"Multimed Tools Appl"},{"key":"11112_CR28","unstructured":"Mass\u00e9 B (2018) Gaze direction in the context of social human-robot interaction. Ph.D thesis"},{"issue":"20","key":"11112_CR29","doi-asserted-by":"publisher","first-page":"26,901","DOI":"10.1007\/s11042-018-5893-9","volume":"77","author":"B Meng","year":"2018","unstructured":"Meng B, Liu X, Wang X (2018) Human action recognition based on quaternion spatial-temporal convolutional neural network and lstm in rgb videos. Multimed Tools Appl 77(20):26,901\u201326,918","journal-title":"Multimed Tools Appl"},{"key":"11112_CR30","doi-asserted-by":"crossref","unstructured":"Nguyen DC, Bailly G, Elisei F (2018) Comparing cascaded lstm architectures for generating head motion from speech in task-oriented dialogs. In: International conference on human-computer interaction, pp 164\u2013175. Springer","DOI":"10.1007\/978-3-319-91250-9_13"},{"key":"11112_CR31","unstructured":"Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: Towards real-time object detection with region proposal networks. In: Advances in neural information processing systems, pp 91\u201399"},{"key":"11112_CR32","unstructured":"Rosales R, Sclaroff S (1998) Improved tracking of multiple humans with trajectory prediction and occlusion modeling. Tech. rep. Boston University Computer Science Department"},{"issue":"5","key":"11112_CR33","doi-asserted-by":"publisher","first-page":"72","DOI":"10.5772\/58402","volume":"11","author":"J Saldien","year":"2014","unstructured":"Saldien J, Vanderborght B, Goris K, Van Damme M, Lefeber D (2014) A motion system for social and animated robots. Int J Adv Robot Syst 11 (5):72","journal-title":"Int J Adv Robot Syst"},{"key":"11112_CR34","doi-asserted-by":"crossref","unstructured":"Shiomi M, Kanda T, Miralles N, Miyashita T, Fasel I, Movellan J, Ishiguro H (2004) Face-to-face interactive humanoid robot. In: 2004 IEEE\/RSJ International conference on intelligent robots and systems (IROS)(IEEE Cat. No. 04CH37566), vol 2. IEEE, pp 1340\u20131346","DOI":"10.1109\/IROS.2004.1389582"},{"issue":"1","key":"11112_CR35","doi-asserted-by":"publisher","first-page":"133","DOI":"10.1109\/TCSVT.2008.2009262","volume":"19","author":"S Siatras","year":"2008","unstructured":"Siatras S, Nikolaidis N, Krinidis M, Pitas I (2008) Visual lip activity detection and speaker detection using mouth region intensities. IEEE Trans Circ Syst Video Technol 19(1):133\u2013137","journal-title":"IEEE Trans Circ Syst Video Technol"},{"key":"11112_CR36","doi-asserted-by":"crossref","unstructured":"Sidner CL, Kidd CD, Lee C, Lesh N (2004) Where to look: a study of human-robot engagement. In: Proceedings of the 9th international conference on Intelligent user interfaces, pp 78\u201384. ACM","DOI":"10.1145\/964442.964458"},{"issue":"1","key":"11112_CR37","doi-asserted-by":"publisher","first-page":"9","DOI":"10.1207\/s15327051hci1901&2_2","volume":"19","author":"S Thrun","year":"2004","unstructured":"Thrun S (2004) Toward a framework for human-robot interaction. Human Comput Int 19(1):9\u201324","journal-title":"Human Comput Int"},{"key":"11112_CR38","doi-asserted-by":"publisher","first-page":"1155","DOI":"10.1109\/ACCESS.2017.2778011","volume":"6","author":"A Ullah","year":"2017","unstructured":"Ullah A, Ahmad J, Muhammad K, Sajjad M, Baik SW (2017) Action recognition in video sequences using deep bi-directional lstm with cnn features. IEEE Access 6:1155\u20131166","journal-title":"IEEE Access"},{"issue":"1","key":"11112_CR39","doi-asserted-by":"publisher","first-page":"1268","DOI":"10.3390\/s130101268","volume":"13","author":"J Vega","year":"2013","unstructured":"Vega J, Perdices E, Ca\u00f1as J (2013) Robot evolutionary localization based on attentive visual short-term memory. Sensors 13(1):1268\u20131299","journal-title":"Sensors"},{"issue":"6","key":"11112_CR40","doi-asserted-by":"publisher","first-page":"9522","DOI":"10.3390\/s140609522","volume":"14","author":"R Viciana-Abad","year":"2014","unstructured":"Viciana-Abad R, Marfil R, Perez-Lorenzo J, Bandera J, Romero-Garces A, Reche-Lopez P (2014) Audio-visual perception system for a humanoid robotic head. Sensors 14(6):9522\u20139545","journal-title":"Sensors"},{"key":"11112_CR41","first-page":"511","volume":"1","author":"P Viola","year":"2001","unstructured":"Viola P, Jones M, et al. (2001) Rapid object detection using a boosted cascade of simple features. CVPR (1) 1:511\u2013518","journal-title":"CVPR (1)"},{"issue":"2","key":"11112_CR42","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1109\/THMS.2014.2303083","volume":"44","author":"A Zaraki","year":"2014","unstructured":"Zaraki A, Mazzei D, Giuliani M, De Rossi D (2014) Designing and evaluating a social gaze-control system for a humanoid robot. IEEE Trans Human Mach Syst 44(2):157\u2013168","journal-title":"IEEE Trans Human Mach Syst"}],"container-title":["Multimedia Tools and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-021-11112-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11042-021-11112-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-021-11112-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,2,21]],"date-time":"2022-02-21T19:38:24Z","timestamp":1645472304000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11042-021-11112-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,8]]},"references-count":42,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,1]]}},"alternative-id":["11112"],"URL":"https:\/\/doi.org\/10.1007\/s11042-021-11112-7","relation":{},"ISSN":["1380-7501","1573-7721"],"issn-type":[{"value":"1380-7501","type":"print"},{"value":"1573-7721","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,7,8]]},"assertion":[{"value":"26 January 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 May 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 May 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 July 2021","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}