{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,9]],"date-time":"2026-06-09T14:50:15Z","timestamp":1781016615186,"version":"3.54.1"},"reference-count":64,"publisher":"Emerald","issue":"3","license":[{"start":{"date-parts":[[2023,2,3]],"date-time":"2023-02-03T00:00:00Z","timestamp":1675382400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["DTA"],"published-print":{"date-parts":[[2023,6,14]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>Student engagement is a key factor that connects with student achievement and retention. This paper aims to identify individuals' engagement automatically in the classroom with multimodal data for supporting educational research.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>The video and electroencephalogram data of 36 undergraduates were collected to represent observable and internal information. Since different modal data have different granularity, this study proposed the Fast\u2013Slow Neural Network (FSNN) to detect engagement through both observable and internal information, with an asynchrony structure to preserve the sequence information of data with different granularity.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>Experimental results show that the proposed algorithm can recognize engagement better than the traditional data fusion methods. The results are also analyzed to figure out the reasons for the better performance of the proposed FSNN.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>This study combined multimodal data from observable and internal aspects to improve the accuracy of engagement detection in the classroom. The proposed FSNN used the asynchronous process to deal with the problem of remaining sequential information when facing multimodal data with different granularity.<\/jats:p><\/jats:sec>","DOI":"10.1108\/dta-05-2022-0199","type":"journal-article","created":{"date-parts":[[2023,2,3]],"date-time":"2023-02-03T08:12:20Z","timestamp":1675411940000},"page":"418-435","source":"Crossref","is-referenced-by-count":5,"title":["Multimodal Fast\u2013Slow Neural Network for learning engagement evaluation"],"prefix":"10.1108","volume":"57","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2510-1575","authenticated-orcid":false,"given":"Lizhao","family":"Zhang","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7710-7231","authenticated-orcid":false,"given":"Jui-Long","family":"Hung","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9069-6109","authenticated-orcid":false,"given":"Xu","family":"Du","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hao","family":"Li","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zhuang","family":"Hu","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"140","published-online":{"date-parts":[[2023,2,3]]},"reference":[{"issue":"5","key":"key2023062207415453300_ref001","doi-asserted-by":"crossref","first-page":"369","DOI":"10.1002\/pits.20303","article-title":"Student engagement with school: critical conceptual and methodological issues of the construct","volume":"45","year":"2008","journal-title":"Psychology in the Schools"},{"issue":"5","key":"key2023062207415453300_ref002","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1016\/j.jsp.2006.04.002","article-title":"Measuring cognitive and psychological engagement: validation of the student engagement instrument","volume":"44","year":"2006","journal-title":"Journal of School Psychology"},{"key":"key2023062207415453300_ref003","doi-asserted-by":"publisher","first-page":"334","DOI":"10.1016\/j.future.2020.02.075","article-title":"Affective database for e-learning and classroom environments using Indian students' faces, hand gestures and body postures","volume":"108","year":"2020","journal-title":"Future Generation Computer Systems \u2013 The International Journal of Escience"},{"issue":"2","key":"key2023062207415453300_ref004","doi-asserted-by":"publisher","first-page":"1387","DOI":"10.1007\/s10639-019-10004-6","article-title":"Automatic detection of students' affective states in classroom environment using hybrid convolutional neural networks","volume":"25","year":"2020","journal-title":"Education and Information Technologies"},{"issue":"2","key":"key2023062207415453300_ref005","first-page":"423","article-title":"Multimodal machine learning: a survey and taxonomy","volume":"41","year":"2018","journal-title":"IEEE Transactions on Pattern Analysis Machine Intelligence"},{"issue":"8","key":"key2023062207415453300_ref006","doi-asserted-by":"publisher","first-page":"1798","DOI":"10.1109\/tpami.2013.50","article-title":"Representation learning: a review and new perspectives","volume":"35","year":"2013","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"issue":"8","key":"key2023062207415453300_ref007","doi-asserted-by":"crossref","first-page":"1361","DOI":"10.1177\/0735633117744346","article-title":"On-task and off-task behavior in the classroom: a study on mathematics learning with educational video games","volume":"56","year":"2019","journal-title":"Journal of Educational Computing Research"},{"issue":"1","key":"key2023062207415453300_ref008","first-page":"172","article-title":"OpenPose: realtime multi-person 2D pose estimation using part affinity fields","volume":"43","year":"2019","journal-title":"IEEE Transactions on Pattern Analysis Machine Intelligence"},{"issue":"5","key":"key2023062207415453300_ref009","doi-asserted-by":"publisher","first-page":"534","DOI":"10.1093\/iwc\/iwv013","article-title":"Connecting brains and bodies: applying physiological computing to support social interaction","volume":"27","year":"2015","journal-title":"Interacting with Computers"},{"issue":"4","key":"key2023062207415453300_ref010","doi-asserted-by":"publisher","first-page":"427","DOI":"10.1080\/10494820.2017.1341938","article-title":"Effects of online synchronous instruction with an attention monitoring and alarm mechanism on sustained attention and learning performance","volume":"26","year":"2018","journal-title":"Interactive Learning Environments"},{"issue":"2","key":"key2023062207415453300_ref011","doi-asserted-by":"crossref","first-page":"348","DOI":"10.1111\/bjet.12359","article-title":"Assessing the attention levels of students by using a novel attention aware system based on brainwave signals","volume":"48","year":"2017","journal-title":"British Journal of Educational Technology"},{"issue":"5","key":"key2023062207415453300_ref012","doi-asserted-by":"publisher","first-page":"1441","DOI":"10.1111\/bjet.13015","article-title":"The promise and challenges of multimodal learning analytics","volume":"51","year":"2020","journal-title":"British Journal of Educational Technology"},{"issue":"3","key":"key2023062207415453300_ref013","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1145\/2682899","article-title":"A review and meta-analysis of multimodal affect detection systems","volume":"47","year":"2015","journal-title":"ACM Computing Surveys"},{"issue":"1","key":"key2023062207415453300_ref014","doi-asserted-by":"publisher","first-page":"84","DOI":"10.1111\/jcal.12216","article-title":"Reduced mental load in learning a motor visual task with virtual 3D method","volume":"34","year":"2018","journal-title":"Journal of Computer Assisted Learning"},{"issue":"1","key":"key2023062207415453300_ref015","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40561-018-0080-z","article-title":"Engagement detection in online learning: a review","volume":"6","year":"2019","journal-title":"Smart Learning Environments"},{"issue":"9","key":"key2023062207415453300_ref016","doi-asserted-by":"publisher","first-page":"1375","DOI":"10.1016\/j.cub.2017.04.002","article-title":"Brain-to-brain synchrony tracks real-world dynamic group interactions in the classroom","volume":"27","year":"2017","journal-title":"Current Biology"},{"key":"key2023062207415453300_ref017","doi-asserted-by":"publisher","first-page":"104","DOI":"10.1016\/j.compedu.2019.04.004","article-title":"Technology enhanced learning in higher education; motivations, engagement and academic achievement","volume":"137","year":"2019","journal-title":"Computers & Education"},{"issue":"5","key":"key2023062207415453300_ref018","doi-asserted-by":"publisher","first-page":"1505","DOI":"10.1111\/bjet.12992","article-title":"Multimodal learning analytics for game-based learning","volume":"51","year":"2020","journal-title":"British Journal of Educational Technology"},{"issue":"7","key":"key2023062207415453300_ref019","doi-asserted-by":"publisher","first-page":"1553","DOI":"10.1109\/tmm.2013.2267205","article-title":"Multimodal saliency and fusion for movie summarization based on aural, visual, and textual attention","volume":"15","year":"2013","journal-title":"IEEE Transactions on Multimedia"},{"issue":"1","key":"key2023062207415453300_ref020","doi-asserted-by":"crossref","first-page":"59","DOI":"10.3102\/00346543074001059","article-title":"School engagement: potential of the concept, state of the evidence","volume":"74","year":"2004","journal-title":"Review of Educational Research"},{"issue":"1","key":"key2023062207415453300_ref021","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1111\/jcal.12020","article-title":"A dynamic analysis of the interplay between asynchronous and synchronous communication in online learning: the impact of motivation","volume":"30","year":"2014","journal-title":"Journal of Computer Assisted Learning"},{"key":"key2023062207415453300_ref022","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1007\/s10648-019-09514-z","article-title":"Attentive or not? Toward a machine learning approach to assessing students' visible engagement in classroom instruction","volume":"33","year":"2021","journal-title":"Educational Psychology Review"},{"issue":"18","key":"key2023062207415453300_ref023","doi-asserted-by":"publisher","first-page":"25321","DOI":"10.1007\/s11042-019-7651-z","article-title":"Students' affective content analysis in smart classroom environment using deep learning techniques","volume":"78","year":"2019","journal-title":"Multimedia Tools and Applications"},{"key":"key2023062207415453300_ref024","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1016\/j.compedu.2015.09.005","article-title":"Measuring student engagement in technology-mediated learning: a review","volume":"90","year":"2015","journal-title":"Computers Education"},{"key":"key2023062207415453300_ref025","doi-asserted-by":"crossref","first-page":"639","DOI":"10.1100\/tsw.2009.83","article-title":"Estimating brain load from the EEG","volume":"9","year":"2009","journal-title":"TheScientificWorldJournal"},{"issue":"7","key":"key2023062207415453300_ref026","first-page":"4","article-title":"What multimodal data can tell us about the students' regulation of their learning process","volume":"72","year":"2019","journal-title":"Learning Instruction"},{"issue":"1","key":"key2023062207415453300_ref027","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1007\/s10462-018-09677-1","article-title":"Survey on supervised machine learning techniques for automatic text classification","volume":"52","year":"2019","journal-title":"Artificial Intelligence Review"},{"issue":"1","key":"key2023062207415453300_ref028","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1080\/07294360.2017.1344197","article-title":"Student engagement in the educational interface: understanding the mechanisms of student success","volume":"37","year":"2018","journal-title":"Higher Education Research Development"},{"issue":"6","key":"key2023062207415453300_ref029","doi-asserted-by":"crossref","first-page":"844","DOI":"10.1037\/0022-0663.74.6.844","article-title":"Time-on-task: issues of timing, sampling, and definition","volume":"74","year":"1982","journal-title":"Journal of Educational Psychology"},{"key":"key2023062207415453300_ref030","volume-title":"Piecing Together the Student Success Puzzle: Research, Propositions, and Recommendations: ASHE Higher Education Report","year":"2011"},{"key":"key2023062207415453300_ref031","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1016\/j.chb.2017.05.017","article-title":"Effects of an integrated physiological signal-based attention-promoting and English listening system on students' learning performance and behavioral patterns","volume":"75","year":"2017","journal-title":"Computers in Human Behavior"},{"issue":"1","key":"key2023062207415453300_ref032","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1007\/s11042-013-1391-2","article-title":"Multimedia classification and event detection using double fusion","volume":"71","year":"2014","journal-title":"Multimedia Tools and Applications"},{"issue":"7","key":"key2023062207415453300_ref033","doi-asserted-by":"publisher","first-page":"838","DOI":"10.1080\/01443410.2013.860217","article-title":"Measuring cognitive load with electroencephalography and self-report: focus on the effect of English-medium learning for Korean students","volume":"34","year":"2014","journal-title":"Educational Psychology"},{"key":"key2023062207415453300_ref034","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1016\/j.compedu.2021.104283","article-title":"Impact of the provision of PowerPoint slides on learning","volume":"173","year":"2021","journal-title":"Computers & Education"},{"key":"key2023062207415453300_ref035","doi-asserted-by":"crossref","unstructured":"Li, Q., Ren, Y., Wei, T., Wang, C., Liu, Z. and Yue, J. (2020), \u201cA learning attention monitoring system via photoplethysmogram using wearable wrist devices\u201d, in Pinkwart, N. and Liu, S. (Eds), Artificial Intelligence Supported Educational Technologies, Springer, Cham, pp. 133-150.","DOI":"10.1007\/978-3-030-41099-5_8"},{"issue":"2","key":"key2023062207415453300_ref036","first-page":"132","article-title":"Construction of multi-mode affective learning system: taking affective design as an example","volume":"19","year":"2016","journal-title":"Educational Technology & Society"},{"issue":"1","key":"key2023062207415453300_ref037","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1080\/10494820.2018.1451899","article-title":"Improving effectiveness of learners' review of video lectures by using an attention-based video lecture review mechanism based on brainwave signals","volume":"27","year":"2019","journal-title":"Interactive Learning Environments"},{"issue":"17","key":"key2023062207415453300_ref038","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1249\/TJX.0000000000000099","article-title":"Classroom-based physical activity and on-task behavior","volume":"4","year":"2019","journal-title":"Translational Journal of the American College of Sports Medicine"},{"issue":"1","key":"key2023062207415453300_ref039","doi-asserted-by":"crossref","first-page":"153","DOI":"10.3102\/00028312037001153","article-title":"Student engagement in instructional activity: patterns in the elementary, middle, and high school years","volume":"37","year":"2000","journal-title":"American Educational Research Journal"},{"issue":"1","key":"key2023062207415453300_ref040","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1007\/s10548-014-0361-y","article-title":"Frontal midline theta reflects individual task performance in a working memory task","volume":"28","year":"2015","journal-title":"Brain Topography"},{"key":"key2023062207415453300_ref041","doi-asserted-by":"publisher","first-page":"14819","DOI":"10.1109\/access.2017.2731784","article-title":"An EEG-based cognitive load assessment in multimedia learning using feature extraction and partial directed coherence","volume":"5","year":"2017","journal-title":"IEEE Access"},{"key":"key2023062207415453300_ref042","doi-asserted-by":"publisher","first-page":"153","DOI":"10.1007\/978-3-662-44415-3_16","article-title":"Majority vote of diverse classifiers for late fusion","year":"2014"},{"issue":"22","key":"key2023062207415453300_ref043","doi-asserted-by":"crossref","first-page":"4729","DOI":"10.3390\/app9224729","article-title":"A computer-vision based application for student behavior monitoring in classroom","volume":"9","year":"2019","journal-title":"Applied Sciences"},{"key":"key2023062207415453300_ref044","doi-asserted-by":"crossref","first-page":"5499","DOI":"10.1007\/s10639-020-10229-w","article-title":"Multimodal data indicators for capturing cognitive, motivational, and emotional learning processes: a systematic literature review","volume":"25","year":"2020","journal-title":"Education Information Technologies"},{"issue":"11","key":"key2023062207415453300_ref045","doi-asserted-by":"crossref","first-page":"2424","DOI":"10.1016\/j.clinph.2006.06.754","article-title":"A theoretical basis for standing and traveling brain waves measured with human EEG with implications for an integrated consciousness","volume":"117","year":"2006","journal-title":"Clinical Neurophysiology"},{"key":"key2023062207415453300_ref046","doi-asserted-by":"crossref","unstructured":"Pekrun, R. and Linnenbrink-Garcia, L. (2012), \u201cAcademic emotions and student engagement\u201d, in Christenson, S.L., Reschly, A.L. and Wylie, C. (Eds), Handbook of Research on Student Engagement, Springer, Boston, MA, pp. 259-282.","DOI":"10.1007\/978-1-4614-2018-7_12"},{"issue":"1","key":"key2023062207415453300_ref047","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/srep43916","article-title":"EEG in the classroom: synchronised neural recordings during video presentation","volume":"7","year":"2017","journal-title":"Scientific Reports"},{"key":"key2023062207415453300_ref048","first-page":"149","article-title":"Assessing neurosky's usability to detect attention levels in an assessment exercise","year":"2009"},{"issue":"10","key":"key2023062207415453300_ref049","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1177\/016146811311501007","article-title":"Student off-task behavior in computer-based learning in the Philippines: comparison to prior research in the USA","volume":"115","year":"2013","journal-title":"Teachers College Record"},{"issue":"5","key":"key2023062207415453300_ref050","doi-asserted-by":"crossref","first-page":"125","DOI":"10.19173\/irrodl.v12i5.999","article-title":"Quality of learners' time and learning performance beyond quantitative time-on-task","volume":"12","year":"2011","journal-title":"International Review of Research in Open Distributed Learning"},{"issue":"6088","key":"key2023062207415453300_ref051","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","year":"1986","journal-title":"Nature"},{"key":"key2023062207415453300_ref052","doi-asserted-by":"publisher","first-page":"238","DOI":"10.1109\/acsat.2012.50","article-title":"Hybrid intelligent technique for text categorization","volume-title":"2012 International Conference on Advanced Computer Science Applications and Technologies (ACSAT)","year":"2012"},{"issue":"5","key":"key2023062207415453300_ref053","doi-asserted-by":"publisher","first-page":"1450","DOI":"10.1111\/bjet.12993","article-title":"Multimodal data capabilities for learning: what can multimodal data tell us about learning?","volume":"51","year":"2020","journal-title":"British Journal of Educational Technology"},{"key":"key2023062207415453300_ref054","first-page":"1257","article-title":"Physiological synchrony in EEG, electrodermal activity and heart rate detects attentionally relevant events in time","volume":"14","year":"2020","journal-title":"Frontiers in Neuroscience"},{"issue":"4","key":"key2023062207415453300_ref055","doi-asserted-by":"crossref","first-page":"046028","DOI":"10.1088\/1741-2552\/aba87d","article-title":"Physiological synchrony in EEG, electrodermal activity and heart rate reflects shared selective auditory attention","volume":"17","year":"2020","journal-title":"Journal of Neural Engineering"},{"issue":"2","key":"key2023062207415453300_ref056","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1353\/rhe.1982.0017","article-title":"An assessment of the academic and social influences on freshman year educational outcomes","volume":"5","year":"1982","journal-title":"The Review of Higher Education"},{"key":"key2023062207415453300_ref057","first-page":"33","article-title":"Predicting student engagement in classrooms using facial behavioral cues","year":"2017"},{"issue":"1","key":"key2023062207415453300_ref058","doi-asserted-by":"crossref","first-page":"89","DOI":"10.3102\/00346543045001089","article-title":"Dropout from higher education: a theoretical synthesis of recent research","volume":"45","year":"1975","journal-title":"Review of Educational Research"},{"key":"key2023062207415453300_ref059","unstructured":"Usart, M., Romero, M. and Barber\u00e0, E. (2013), \u201cMeasuring students' Time Perspective and Time on Task in GBL activities\u201d, ELearn Center research paper series, Universitat Oberta de Catalunya, Barcelona, Spain, pp. 40-51."},{"key":"key2023062207415453300_ref060","doi-asserted-by":"crossref","unstructured":"Wu, Z.Y., Cai, L.H. and Meng, H. (2006), \u201cMulti-level fusion of audio and visual features for speaker identification\u201d, in Zhang, D. and Jain, A.K. (Eds), Advances in Biometrics, Proceedings, Vol. 3832, Springer-Verlag Berlin, Berlin, pp. 493-499.","DOI":"10.1007\/11608288_66"},{"key":"key2023062207415453300_ref061","doi-asserted-by":"publisher","first-page":"340","DOI":"10.1016\/j.chb.2017.12.037","article-title":"Review on portable EEG technology in educational research","volume":"81","year":"2018","journal-title":"Computers in Human Behavior"},{"issue":"3","key":"key2023062207415453300_ref062","doi-asserted-by":"crossref","first-page":"145","DOI":"10.2114\/jpa2.28.145","article-title":"A brainwave signal measurement and data processing technique for daily life applications","volume":"28","year":"2009","journal-title":"Journal of Physiological Anthropology"},{"issue":"11","key":"key2023062207415453300_ref063","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1109\/35.41402","article-title":"Integration of acoustic and visual speech signals using neural networks","volume":"27","year":"1989","journal-title":"IEEE Communications Magazine"},{"issue":"1","key":"key2023062207415453300_ref064","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1109\/tpami.2008.52","article-title":"A survey of affect recognition methods: audio, visual, and spontaneous expressions","volume":"31","year":"2009","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"}],"container-title":["Data Technologies and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/DTA-05-2022-0199\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/DTA-05-2022-0199\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T23:15:09Z","timestamp":1753398909000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/dta\/article\/57\/3\/418-435\/38843"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,3]]},"references-count":64,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,2,3]]},"published-print":{"date-parts":[[2023,6,14]]}},"alternative-id":["10.1108\/DTA-05-2022-0199"],"URL":"https:\/\/doi.org\/10.1108\/dta-05-2022-0199","relation":{},"ISSN":["2514-9288","2514-9288"],"issn-type":[{"value":"2514-9288","type":"print"},{"value":"2514-9288","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,3]]}}}