{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,18]],"date-time":"2026-02-18T23:46:07Z","timestamp":1771458367545,"version":"3.50.1"},"reference-count":45,"publisher":"Institute of Electronics, Information and Communications Engineers (IEICE)","issue":"6","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IEICE Trans. Fundamentals"],"published-print":{"date-parts":[[2023,6,1]]},"DOI":"10.1587\/transfun.2022eap1068","type":"journal-article","created":{"date-parts":[[2022,11,30]],"date-time":"2022-11-30T06:33:25Z","timestamp":1669790005000},"page":"938-946","source":"Crossref","is-referenced-by-count":1,"title":["GazeFollowTR: A Method of Gaze Following with Reborn Mechanism"],"prefix":"10.1587","volume":"E106.A","author":[{"given":"Jingzhao","family":"DAI","sequence":"first","affiliation":[{"name":"School of Electronic Science and Engineering, Nanjing University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ming","family":"LI","sequence":"additional","affiliation":[{"name":"School of Electronic Science and Engineering, Nanjing University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xuejiao","family":"HU","sequence":"additional","affiliation":[{"name":"School of Electronic Science and Engineering, Nanjing University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yang","family":"LI","sequence":"additional","affiliation":[{"name":"School of Electronic Science and Engineering, Nanjing University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sidan","family":"DU","sequence":"additional","affiliation":[{"name":"School of Electronic Science and Engineering, Nanjing University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"532","reference":[{"key":"1","doi-asserted-by":"crossref","unstructured":"[1] P. Wei, Y. Liu, T. Shu, N. Zheng, and S. Zhu, \u201cWhere and why are they looking? 
Jointly inferring human attention and intentions in complex tasks,\u201d 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp.6801-6809, June 2018, doi: 10.1109\/CVPR.2018.00711. 10.1109\/cvpr.2018.00711","DOI":"10.1109\/CVPR.2018.00711"},{"key":"2","doi-asserted-by":"crossref","unstructured":"[2] R.F. Ribeiro and P.D.P. Costa, \u201cDriver gaze zone dataset with depth data,\u201d 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019), pp.1-5, May 2019, doi: 10.1109\/FG.2019.8756592. 10.1109\/fg.2019.8756592","DOI":"10.1109\/FG.2019.8756592"},{"key":"3","doi-asserted-by":"crossref","unstructured":"[3] P.A. Dias, D. Malafronte, H. Medeiros, and F. Odone, \u201cGaze estimation for assisted living environments,\u201d 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), pp.279-288, March 2020, doi: 10.1109\/WACV45572.2020.9093439. 10.1109\/wacv45572.2020.9093439","DOI":"10.1109\/WACV45572.2020.9093439"},{"key":"4","doi-asserted-by":"crossref","unstructured":"[4] Y. Fang, H. Duan, F. Shi, X. Min, and G. Zhai, \u201cIdentifying children with autism spectrum disorder based on gaze-following,\u201d 2020 IEEE International Conference on Image Processing (ICIP), pp.423-427, Oct. 2020, doi: 10.1109\/ICIP40778.2020.9190831. 10.1109\/icip40778.2020.9190831","DOI":"10.1109\/ICIP40778.2020.9190831"},{"key":"5","doi-asserted-by":"publisher","unstructured":"[5] K. Kikuchi, H. Takahira, R. Ishikawa, E. Wakamatsu, T. Shinkawa, and M. Yamada, \u201cDevelopment of a device to measure movement of gaze and hand,\u201d IEICE Trans. Fundamentals, vol.E97-A, no.2, pp.534-537, Feb. 2014, doi: 10.1587\/transfun.E97.A.534. 10.1587\/transfun.e97.a.534","DOI":"10.1587\/transfun.E97.A.534"},{"key":"6","doi-asserted-by":"publisher","unstructured":"[6] M. Cornia, L. Baraldi, G. Serra, and R. Cucchiara, \u201cPredicting human eye fixations via an LSTM-based saliency attentive model,\u201d IEEE Trans. 
Image Process., vol.27, no.10, pp.5142-5154, 2018, doi: 10.1109\/TIP.2018.2851672. 10.1109\/tip.2018.2851672","DOI":"10.1109\/TIP.2018.2851672"},{"key":"7","doi-asserted-by":"crossref","unstructured":"[7] A. Volokitin, M. Gygli, and X. Boix, \u201cPredicting when saliency maps are accurate and eye fixations consistent,\u201d 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. 10.1109\/cvpr.2016.65","DOI":"10.1109\/CVPR.2016.65"},{"key":"8","doi-asserted-by":"publisher","unstructured":"[8] H. Takahira, R. Ishikawa, K. Kikuchi, T. Shinkawa, and M. Yamada, \u201cAnalysis of gaze movement while reading E-books,\u201d IEICE Trans Fundamentals, vol.E97-A, no.2, pp.530-533, Feb. 2014, doi: 10.1587\/transfun.E97.A.530. 10.1587\/transfun.e97.a.530","DOI":"10.1587\/transfun.E97.A.530"},{"key":"9","unstructured":"[9] S. Yeamkuan and K. Chamnongthai, \u201cFixational feature-based gaze pattern recognition using long short-term memory,\u201d 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp.1-4, Dec. 2020."},{"key":"10","doi-asserted-by":"crossref","unstructured":"[10] D. Lian, Z. Yu, and S. Gao, \u201cBelieve it or not, we know what you are looking at!,\u201d Computer Vision-ACCV 2018, pp.35-50, Springer International Publishing, Cham, 2019. 10.1007\/978-3-030-20893-6_3","DOI":"10.1007\/978-3-030-20893-6_3"},{"key":"11","doi-asserted-by":"publisher","unstructured":"[11] B. Liu and K. Arakawa, \u201cA method for generating color palettes with deep neural networks considering human perception,\u201d IEICE Trans. Fundamentals, vol.E105-A, no.4, pp.639-646, April 2022, doi: 10.1587\/transfun.2021SMP0011. 10.1587\/transfun.2021smp0011","DOI":"10.1587\/transfun.2021SMP0011"},{"key":"12","doi-asserted-by":"crossref","unstructured":"[12] L. Fan, W. Wang, S.-C. Zhu, X. Tang, and S. 
Huang, \u201cUnderstanding human gaze communication by spatio-temporal graph reasoning,\u201d 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), 2019. 10.1109\/iccv.2019.00582","DOI":"10.1109\/ICCV.2019.00582"},{"key":"13","doi-asserted-by":"publisher","unstructured":"[13] N. Zhuang, B. Ni, Y. Xu, X. Yang, W. Zhang, Z. Li, and W. Gao, \u201cMUGGLE: MUlti-stream group gaze learning and estimation,\u201d IEEE Trans. Circuits Syst. Video Technol., vol.30, no.10, pp.3637-3650, 2020, doi: 10.1109\/TCSVT.2019.2940479. 10.1109\/tcsvt.2019.2940479","DOI":"10.1109\/TCSVT.2019.2940479"},{"key":"14","doi-asserted-by":"crossref","unstructured":"[14] Y. Fang, J. Tang, W. Shen, W. Shen, X. Gu, L. Song, and G. Zhai, \u201cDual attention guided gaze target detection in the wild,\u201d 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.11385-11394, June 2021, doi: 10.1109\/CVPR46437.2021.01123. 10.1109\/cvpr46437.2021.01123","DOI":"10.1109\/CVPR46437.2021.01123"},{"key":"15","doi-asserted-by":"crossref","unstructured":"[15] E. Chong, N. Ruiz, Y. Wang, Y. Zhang, A. Rozga, and J.M. Rehg, \u201cConnecting gaze, scene, and attention: Generalized attention estimation via joint modeling of gaze and scene saliency,\u201d Computer Vision-ECCV 2018, pp.397-412, Springer International Publishing, Cham, 2018. 10.1007\/978-3-030-01228-1_24","DOI":"10.1007\/978-3-030-01228-1_24"},{"key":"16","doi-asserted-by":"crossref","unstructured":"[16] Y. Cheng and F. Lu, \u201cGaze estimation using Transformer,\u201d 2022 26th International Conference on Pattern Recognition (ICPR), pp.3341-3347, Aug. 2022. 10.1109\/icpr56361.2022.9956687","DOI":"10.1109\/ICPR56361.2022.9956687"},{"key":"17","unstructured":"[18] Z. Cai, K. Huang, and C. Peng, \u201cReborn mechanism: Rethinking the negative phase information flow in convolutional neural network,\u201d arXiv preprint arXiv:2106.07026v2, 2021. 
10.48550\/arXiv.2106.07026"},{"key":"18","doi-asserted-by":"crossref","unstructured":"[19] M. Tonsen, X. Zhang, Y. Sugano, and A. Bulling, \u201cLabeled pupils in the wild: A dataset for studying pupil detection in unconstrained environments,\u201d ACM, pp.139-142, 2016. 10.1145\/2857491.2857520","DOI":"10.1145\/2857491.2857520"},{"key":"19","doi-asserted-by":"crossref","unstructured":"[20] T. Fischer, H.J. Chang, and Y. Demiris, \u201cRT-GENE: Real-time eye gaze estimation in natural environments,\u201d European Conference on Computer Vision, pp.339-357, 2018. 10.1007\/978-3-030-01249-6_21","DOI":"10.1007\/978-3-030-01249-6_21"},{"key":"20","doi-asserted-by":"crossref","unstructured":"[21] Y. Ganin, D. Kononenko, D. Sungatullina, and V. Lempitsky, \u201cDeepWarp: Photorealistic image resynthesis for gaze manipulation,\u201d ECCV, pp.311-326, 2016. 10.1007\/978-3-319-46475-6_20","DOI":"10.1007\/978-3-319-46475-6_20"},{"key":"21","unstructured":"[22] A. Recasens Continente, A. Khosla, C. Vondrick, and A. Torralba, Where are They Looking?, Advances in Neural Information Processing Systems (NIPS), 2015."},{"key":"22","doi-asserted-by":"crossref","unstructured":"[23] P. Kellnhofer, A. Recasens, S. Stent, W. Matusik, and A. Torralba, \u201cGaze360: Physically unconstrained gaze estimation in the wild,\u201d 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), pp.6911-6920, Oct.-Nov. 2019, doi: 10.1109\/ICCV.2019.00701. 10.1109\/iccv.2019.00701","DOI":"10.1109\/ICCV.2019.00701"},{"key":"23","doi-asserted-by":"crossref","unstructured":"[24] E. Chong, Y. Wang, N. Ruiz, and J.M. Rehg, \u201cDetecting attended visual targets in video,\u201d 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.5395-5405, June 2020, doi: 10.1109\/CVPR42600.2020.00544. 10.1109\/cvpr42600.2020.00544","DOI":"10.1109\/CVPR42600.2020.00544"},{"key":"24","doi-asserted-by":"crossref","unstructured":"[25] H.R. Tavakoli, F. Ahmed, A. Borji, and J. 
Laaksonen, \u201cSaliency revisited: Analysis of mouse movements versus fixations,\u201d 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.6354-6362, July 2017, doi: 10.1109\/CVPR.2017.673. 10.1109\/cvpr.2017.673","DOI":"10.1109\/CVPR.2017.673"},{"key":"25","doi-asserted-by":"crossref","unstructured":"[26] W. Jian and Z. Xinbo, \u201cAnalysis of eye gaze points based on visual search,\u201d 2014 International Conference on Orange Technologies, pp.13-16, Sept. 2014, doi: 10.1109\/ICOT.2014.6954665. 10.1109\/icot.2014.6954665","DOI":"10.1109\/ICOT.2014.6954665"},{"key":"26","doi-asserted-by":"crossref","unstructured":"[27] H. Tomas, M. Reyes, R. Dionido, M. Ty, J. Mirando, J. Casimiro, R. Atienza, and R. Guinto, \u201cGOO: A dataset for gaze object prediction in retail environments,\u201d 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp.3119-3127, June 2021, doi: 10.1109\/CVPRW53098.2021.00349. 10.1109\/cvprw53098.2021.00349","DOI":"10.1109\/CVPRW53098.2021.00349"},{"key":"27","doi-asserted-by":"crossref","unstructured":"[28] Z. Bylinskii, A. Recasens, A. Borji, A. Oliva, A. Torralba, and F. Durand, \u201cWhere should saliency models look next?,\u201d Computer Vision-ECCV 2016, pp.809-824, Springer International Publishing, Cham, 2016. 10.1007\/978-3-319-46454-1_49","DOI":"10.1007\/978-3-319-46454-1_49"},{"key":"28","unstructured":"[29] A. Borji and L. Itti, \u201cCAT2000: A large scale fixation dataset for boosting saliency research,\u201d ArXiv, vol.abs\/1505.03581, 2015. 10.48550\/arXiv.1505.03581"},{"key":"29","unstructured":"[30] M. K\u00fcmmerer, L. Theis, and M. Bethge, \u201cDeep gaze I: Boosting saliency prediction with feature maps trained on ImageNet,\u201d Computer Science, 2014."},{"key":"30","unstructured":"[31] J. Pan, C. Canton, K. Mcguinness, N.E. O'Connor, and X. 
Giro-I-Nieto, \u201cSalGAN: Visual saliency prediction with generative adversarial networks,\u201d arXiv preprint arXiv:1701.01081v3, 2017. 10.48550\/arXiv.1701.01081"},{"key":"31","doi-asserted-by":"crossref","unstructured":"[32] T. Pfister, J. Charles, and A. Zisserman, \u201cFlowing ConvNets for human pose estimation in videos,\u201d 2015 IEEE International Conference on Computer Vision (ICCV), pp.1913-1921, Dec. 2015, doi: 10.1109\/ICCV.2015.222. 10.1109\/iccv.2015.222","DOI":"10.1109\/ICCV.2015.222"},{"key":"32","doi-asserted-by":"crossref","unstructured":"[33] Z. Hu, Y. Li, and Z. Yang, \u201cImproving convolutional neural network using pseudo derivative ReLU,\u201d 2018 5th International Conference on Systems and Informatics (ICSAI), pp.283-287, Nov. 2018, doi: 10.1109\/ICSAI.2018.8599372. 10.1109\/icsai.2018.8599372","DOI":"10.1109\/ICSAI.2018.8599372"},{"key":"33","unstructured":"[34] A.L. Maas, \u201cRectifier nonlinearities improve neural network acoustic models,\u201d International Conference on Machine Learning (ICML), vol.30, 2013."},{"key":"34","unstructured":"[35] D.-A. Clevert, T. Unterthiner, and S. Hochreiter, \u201cFast and accurate deep network learning by exponential linear units (ELUs),\u201d Computer Science, 2015."},{"key":"35","unstructured":"[36] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, and N. Houlsby, \u201cAn image is worth 16x16 words: Transformers for image recognition at scale,\u201d arXiv preprint arXiv:2010.11929v2, 2020. 10.48550\/arXiv.2010.11929"},{"key":"36","unstructured":"[37] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, L. Kaiser, I. Polosukhin, \u201cAttention is all you need,\u201d arXiv:1706.03762, 2017. 10.48550\/arXiv.1706.03762"},{"key":"37","doi-asserted-by":"crossref","unstructured":"[38] Y. Chen, Y. Bai, W. Zhang, and T. 
Mei, \u201cDestruction and construction learning for fine-grained image recognition,\u201d 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019. 10.1109\/cvpr.2019.00530","DOI":"10.1109\/CVPR.2019.00530"},{"key":"38","doi-asserted-by":"crossref","unstructured":"[39] J. Fu, H. Zheng, and T. Mei, \u201cLook closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition,\u201d 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.4476-4484, July 2017, doi: 10.1109\/CVPR.2017.476. 10.1109\/cvpr.2017.476","DOI":"10.1109\/CVPR.2017.476"},{"key":"39","doi-asserted-by":"publisher","unstructured":"[40] S. Jiang, W. Min, L. Liu, and Z. Luo, \u201cMulti-scale multi-view deep feature aggregation for food recognition,\u201d IEEE Trans. Image Process., vol.29, pp.265-276, 2020, doi: 10.1109\/TIP.2019.2929447. 10.1109\/tip.2019.2929447","DOI":"10.1109\/TIP.2019.2929447"},{"key":"40","doi-asserted-by":"crossref","unstructured":"[41] J. He, Z. Shao, J. Wright, D. Kerr, C. Boushey, and F. Zhu, \u201cMulti-task image-based dietary assessment for food recognition and portion size estimation,\u201d 2020 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), pp.49-54, Aug. 2020, doi: 10.1109\/MIPR49039.2020.00018. 10.1109\/mipr49039.2020.00018","DOI":"10.1109\/MIPR49039.2020.00018"},{"key":"41","doi-asserted-by":"publisher","unstructured":"[42] S. Jiang, W. Min, Y. Lyu, and L. Liu, \u201cFew-shot food recognition via multi-view representation learning,\u201d ACM Trans. Multimedia Comput. Commun. Appl., vol.16, no.3, pp.1-20, 2020, doi: 10.1145\/3391624. 10.1145\/3391624","DOI":"10.1145\/3391624"},{"key":"42","doi-asserted-by":"publisher","unstructured":"[43] H. Liang, G. Wen, Y. Hu, M. Luo, P. Yang, and Y. Xu, \u201cMVANet: Multi-tasks guided multi-view attention network for Chinese food recognition,\u201d IEEE Trans. 
Multimedia, vol.23, pp.3551-3561, 2021, doi: 10.1109\/TMM.2020.3028478. 10.1109\/tmm.2020.3028478","DOI":"10.1109\/TMM.2020.3028478"},{"key":"43","doi-asserted-by":"publisher","unstructured":"[44] D. Lian, L. Hu, W. Luo, Y. Xu, L. Duan, J. Yu, and S. Gao, \u201cMultiview multitask gaze estimation with deep convolutional neural networks,\u201d IEEE Trans. Neural Netw. Learn. Syst., vol.30, no.10, pp.3010-3023, Oct. 2019, doi: 10.1109\/TNNLS.2018.2865525. 10.1109\/tnnls.2018.2865525","DOI":"10.1109\/TNNLS.2018.2865525"},{"key":"44","doi-asserted-by":"crossref","unstructured":"[45] T.-Y. Lin, P. Doll\u00e1r, R. Girshick, K. He, B. Hariharan, and S. Belongie, \u201cFeature pyramid networks for object detection,\u201d CoRR, vol.abs\/1612.03144, 2016, doi: 10.48550\/arXiv.1612.03144.","DOI":"10.1109\/CVPR.2017.106"},{"key":"45","doi-asserted-by":"publisher","unstructured":"[46] S. Mochiduki, R. Watanabe, H. Takahira, and M. Yamada, \u201cAnalysis of head movement during gaze movement with varied viewing distances and positions,\u201d IEICE Trans. Fundamentals, vol.E101-A, no.6, pp.892-899, June 2018, doi: 10.1587\/transfun.E101.A.892. 
10.1587\/transfun.e101.a.892","DOI":"10.1587\/transfun.E101.A.892"}],"container-title":["IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transfun\/E106.A\/6\/E106.A_2022EAP1068\/_pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,3]],"date-time":"2023-06-03T04:05:34Z","timestamp":1685765134000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.jstage.jst.go.jp\/article\/transfun\/E106.A\/6\/E106.A_2022EAP1068\/_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,1]]},"references-count":45,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023]]}},"URL":"https:\/\/doi.org\/10.1587\/transfun.2022eap1068","relation":{},"ISSN":["0916-8508","1745-1337"],"issn-type":[{"value":"0916-8508","type":"print"},{"value":"1745-1337","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,1]]},"article-number":"2022EAP1068"}}