{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,29]],"date-time":"2025-12-29T18:48:20Z","timestamp":1767034100722,"version":"3.37.3"},"reference-count":48,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2016,10,1]],"date-time":"2016-10-01T00:00:00Z","timestamp":1475280000000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"funder":[{"DOI":"10.13039\/501100003593","name":"CNPq","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100003593","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100004586","name":"FAPERJ","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100004586","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"published-print":{"date-parts":[[2017,2]]},"DOI":"10.1007\/s11042-016-3846-8","type":"journal-article","created":{"date-parts":[[2016,10,1]],"date-time":"2016-10-01T00:25:56Z","timestamp":1475281556000},"page":"5691-5720","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Extending multimedia languages to support multimodal user interactions"],"prefix":"10.1007","volume":"76","author":[{"given":"\u00c1lan L\u00edvio Vasconcelos","family":"Guedes","sequence":"first","affiliation":[]},{"given":"Roberto Gerson de Albuquerque","family":"Azevedo","sequence":"additional","affiliation":[]},{"given":"Simone Diniz Junqueira","family":"Barbosa","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2016,10,1]]},"reference":[{"key":"3846_CR1","unstructured":"ABNT (2008) ABNT NBR 15606-2: Televis\u00e3o digital terrestre \u2013 Codifica\u00e7\u00e3o de dados e especifica\u00e7\u00f5es de transmiss\u00e3o para radiodifus\u00e3o digital Parte 2: Ginga-NCL para receptores fixos e m\u00f3veis \u2013 Linguagem de aplica\u00e7\u00e3o XML para codifica\u00e7\u00e3o de aplica\u00e7\u00f5es. http:\/\/forumsbtvd.org.br\/acervo-online\/normas-brasileiras-de-tv-digital\/ . Accessed 3 Mar 2016"},{"key":"3846_CR2","doi-asserted-by":"publisher","first-page":"832","DOI":"10.1145\/182.358434","volume":"26","author":"JF Allen","year":"1983","unstructured":"Allen JF (1983) Maintaining knowledge about temporal intervals. Commun ACM 26:832\u2013843. doi: 10.1145\/182.358434","journal-title":"Commun ACM"},{"key":"3846_CR3","unstructured":"Angeluci ACB, de Albuquerque Azevedo RG, Soares LFG (2009) O uso da linguagem declarativa do Ginga-NCL na constru\u00e7\u00e3o de conte\u00fados audiovisuais interativos: a experi\u00eancia do \u201cRoteiros do Dia.\u201d 1o Simp\u00f3sio Int Telev Digit SIMTVD 91"},{"key":"3846_CR4","doi-asserted-by":"crossref","unstructured":"Beckham JL, Fabbrizio GD, Klarlund N (2001) Towards SMIL as a foundation for multimodal, multimedia applications. In: Dalsgaard P, Lindberg B, Benner H, Tan Z-H (eds) EUROSPEECH 2001 Scand. 7th Eur Conf Speech Commun Technol ISCA, 1363\u20131366","DOI":"10.21437\/Eurospeech.2001-353"},{"key":"3846_CR5","doi-asserted-by":"crossref","unstructured":"Bolt RA (1980) Put-that-there: voice and gesture at the graphics interface. Proc 7th Annu Conf Comput Graph Interact Tech. ACM, New York, NY, USA, 262\u2013270","DOI":"10.1145\/800250.807503"},{"key":"3846_CR6","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1145\/1047936.1047943","volume":"1","author":"DCA Bulterman","year":"2005","unstructured":"Bulterman DCA, Hardman L (2005) Structured multimedia authoring. ACM Trans Multimed Comput Commun Appl 1:89\u2013109. doi: 10.1145\/1047936.1047943","journal-title":"ACM Trans Multimed Comput Commun Appl"},{"key":"3846_CR7","unstructured":"Bulterman DCA, Rutledge LW (2008) SMIL 3.0: Flexible multimedia for web, mobile devices and daisy talking books, 2nd ed. Springer Publishing Company, Incorporated"},{"key":"3846_CR8","doi-asserted-by":"crossref","unstructured":"Carvalho LAMC, Guimar\u00e3es AP, Mac\u00eado HT (2008) Architectures for interactive vocal environment to Brazilian digital TV middleware. Proc 2008 Euro Am Conf Telemat Inf Syst ACM, New York, NY, USA, 22:1\u201322:8","DOI":"10.1145\/1621087.1621109"},{"key":"3846_CR9","unstructured":"Carvalho L, Macedo H (2010) Estendendo a NCL para promover interatividade vocal em Aplica\u00e7\u00f5es Ginga na TVDi Brasileira. WebMedia 10 Proc. 16th Braz Symp Multimed Web"},{"key":"3846_CR10","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1007\/978-3-642-21672-5_38","volume-title":"Univers. Access Hum.-Comput. Interact. Des. EInclusion","author":"D Costa","year":"2011","unstructured":"Costa D, Duarte C (2011) Adapting multimodal fission to user\u2019s abilities. In: Stephanidis C (ed) Univers. Access Hum.-Comput. Interact. Des. EInclusion. Springer, Berlin Heidelberg, pp 347\u2013356"},{"key":"3846_CR11","doi-asserted-by":"crossref","unstructured":"Coutaz J, Nigay L, Salber D, Blandford A, May J, Young RM (1995) Four easy pieces for assessing the usability of multimodal interaction: the CARE properties. In: InterAct, 115\u2013120","DOI":"10.1007\/978-1-5041-2896-4_19"},{"key":"3846_CR12","doi-asserted-by":"crossref","unstructured":"Dumas B, Lalanne D, Ingold R (2009) HephaisTK: a toolkit for rapid prototyping of multimodal interfaces. Proc 2009 Int Conf. Multimodal Interfaces. ACM, New York, NY, USA, 231\u2013232","DOI":"10.1145\/1647314.1647360"},{"key":"3846_CR13","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1007\/s12193-010-0043-3","volume":"3","author":"B Dumas","year":"2010","unstructured":"Dumas B, Lalanne D, Ingold R (2010) Description languages for multimodal interaction: a set of guidelines and its illustration with SMUIML. J Multimodal User Interfaces 3:237\u2013247. doi: 10.1007\/s12193-010-0043-3","journal-title":"J Multimodal User Interfaces"},{"key":"3846_CR14","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/978-3-642-00437-7_1","volume-title":"Hum. Mach. Interact","author":"B Dumas","year":"2009","unstructured":"Dumas B, Lalanne D, Oviatt S (2009) Multimodal Interfaces: A Survey of Principles, Models and Frameworks. In: Kohlas J, Lalanne D (eds) Hum. Mach. Interact. Springer, Berlin Heidelberg, pp 3\u201326"},{"key":"3846_CR15","doi-asserted-by":"publisher","unstructured":"Ghinea G, Timmerer C, Lin W, Gulliver SR (2014) Mulsemedia: state of the art, perspectives, and challenges. ACM Trans Multimed Comput Commun Appl 11:17:1\u201317:23. doi: 10.1145\/2617994","DOI":"10.1145\/2617994"},{"key":"3846_CR16","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1007\/s00530-013-0332-2","volume":"20","author":"T Hachaj","year":"2014","unstructured":"Hachaj T, Ogiela MR (2014) Rule-based approach to recognizing human body poses and gestures in real time. Multimed Syst 20:81\u201399. doi: 10.1007\/s00530-013-0332-2","journal-title":"Multimed Syst"},{"key":"3846_CR17","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1109\/93.735868","volume":"5","author":"C-M Huang","year":"1998","unstructured":"Huang C-M, Wang C (1998) Synchronization for interactive multimedia presentations. IEEE Multimed 5:44\u201362. doi: 10.1109\/93.735868","journal-title":"IEEE Multimed"},{"key":"3846_CR18","unstructured":"Ideum Inc (2016) Gesture markup language. http:\/\/www.gestureml.org\/ . Accessed 3 Mar 2016"},{"key":"3846_CR19","unstructured":"ISO\/IEC (2013) ISO\/IEC 23005-3:2013 Information Technology - Media Context and Control - Part 3: Sensory Information. http:\/\/www.iso.org\/iso\/home\/store\/catalogue_ics\/catalogue_detail_ics.htm?csnumber=60391 . Accessed 3 Mar 2016"},{"key":"3846_CR20","unstructured":"ISO\/IEC (2014) ISO\/IEC 23005-1:2014 Information technology - Media context and control - Part 1: Architecture. http:\/\/www.iso.org\/iso\/home\/store\/catalogue_ics\/catalogue_detail_ics.htm?csnumber=60359 . Accessed 3 Mar 2016"},{"key":"3846_CR21","unstructured":"ITU (2015) ITU Recommendation H.761: Nested context language (NCL) and Ginga-NCL for IPTV services. http:\/\/handle.itu.int\/11.1002\/1000\/12237 . Accessed 3 Mar 2016"},{"key":"3846_CR22","unstructured":"Jonathan Duddington (2016) eSpeak text to speech engine. http:\/\/espeak.sourceforge.net\/ . Accessed 3 Mar 2016"},{"key":"3846_CR23","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1007\/1-4020-3075-4_8","volume-title":"Spok. Multimodal Hum.-Comput. Dialogue Mob. Environ","author":"K Katsurada","year":"2005","unstructured":"Katsurada K, Yamada H, Nakamura Y, Kobayashi S, Nitta T (2005) XISL: A Modality-Independent MMI Description Language. In: B\u00fchler D, Dybkj\u00e6r L, Minker W (eds) Spok. Multimodal Hum.-Comput. Dialogue Mob. Environ. Springer, Netherlands, pp 133\u2013148"},{"key":"3846_CR24","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1007\/11821830_17","volume-title":"Intell. virtual agents","author":"S Kopp","year":"2006","unstructured":"Kopp S, Krenn B, Marsella S, Marshall AN, Pelachaud C, Pirker H, Th\u00f3risson KR, Vilhj\u00e1lmsson H (2006) Towards a common framework for multimodal generation: the behavior markup language. In: Gratch J, Young M, Aylett R, Ballin D, Olivier P (eds) Intell. virtual agents. Springer Berlin Heidelberg, Berlin, pp 205\u2013217"},{"key":"3846_CR25","unstructured":"Lazar J, Feng JH, Hochheiser H (2010) Research methods in human-computer interaction. Wiley Publishing"},{"key":"3846_CR26","unstructured":"Leap Motion Inc (2016) Leap motion controller. https:\/\/www.leapmotion.com\/ . Accessed 3 Mar 2016"},{"key":"3846_CR27","unstructured":"Lee Laboratory of Nagoya Institute of Technology (2016) Julius speech recognition engine. http:\/\/julius.osdn.jp\/ . Accessed 3 Mar 2016"},{"key":"3846_CR28","doi-asserted-by":"crossref","unstructured":"Meixner B, Kosch H (2012) Interactive non-linear video: definition and XML structure. Proc 2012 ACM Symp Doc Eng ACM, New York, NY, USA, 49\u201358","DOI":"10.1145\/2361354.2361367"},{"key":"3846_CR29","doi-asserted-by":"crossref","unstructured":"Oviatt S (2007) Multimodal Interfaces. Hum-Comput Interact Handb. CRC Press, 413\u2013432","DOI":"10.1201\/9781410615862.ch21"},{"key":"3846_CR30","doi-asserted-by":"publisher","unstructured":"Rainer B, Timmerer C (2014) a generic utility model representing the quality of sensory experience. ACM Trans Multimed Comput Commun Appl 11:14:1\u201314:17. doi: 10.1145\/2648429","DOI":"10.1145\/2648429"},{"key":"3846_CR31","doi-asserted-by":"publisher","unstructured":"Rowe LA (2013) Looking forward 10\u00a0years to multimedia successes. ACM Trans Multimed Comput Commun Appl 9:37:1\u201337:7. doi: 10.1145\/2490825","DOI":"10.1145\/2490825"},{"key":"3846_CR32","unstructured":"Salt Forum Speech Application Language Tags Specification. http:\/\/www.saltforum.org . Accessed 3 Mar 2016"},{"key":"3846_CR33","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1007\/s12193-013-0119-y","volume":"7","author":"D Schnelle-Walka","year":"2013","unstructured":"Schnelle-Walka D, Radomski S, M\u00fchlh\u00e4user M (2013) JVoiceXML as a modality component in the W3C multimodal architecture. J Multimodal User Interfaces 7:183\u2013194. doi: 10.1007\/s12193-013-0119-y","journal-title":"J Multimodal User Interfaces"},{"key":"3846_CR34","volume-title":"Designing the user interface: strategies for effective human-computer interaction","author":"B Shneiderman","year":"1997","unstructured":"Shneiderman B (1997) Designing the user interface: strategies for effective human-computer interaction, 3rd edn. Addison-Wesley Longman Publishing Co., Inc., Boston","edition":"3"},{"key":"3846_CR35","unstructured":"Soares LFG (2009) Nested context model 3.0: Part 1 \u2013 NCM Core. Monogr Comput Sci. PUC-Rio Inf MCC1805. ftp:\/\/obaluae.inf.puc-rio.br\/pub\/docs\/techreports\/05_18_soares.pdf . Accessed 3 Mar 2016"},{"key":"3846_CR36","unstructured":"Soares LFG, Lima GF (2015) NCL handbook. Monogr Comput Sci. PUC-Rio Inf MCC1813. handbook.ncl.org.br. Accessed 3 Mar 2016"},{"key":"3846_CR37","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1109\/MCOM.2010.5473867","volume":"48","author":"LFG Soares","year":"2010","unstructured":"Soares LFG, Marcio Ferreira M, de Neto CSS, Moreno MF (2010) Ginga-NCL: declarative middleware for multimedia IPTV services. IEEE Commun Mag 48:74\u201381. doi: 10.1109\/MCOM.2010.5473867","journal-title":"IEEE Commun Mag"},{"key":"3846_CR38","doi-asserted-by":"publisher","first-page":"189","DOI":"10.1016\/j.patrec.2013.07.003","volume":"36","author":"M Turk","year":"2014","unstructured":"Turk M (2014) Multimodal interaction: a review. Pattern Recognit Lett 36:189\u2013195. doi: 10.1016\/j.patrec.2013.07.003","journal-title":"Pattern Recognit Lett"},{"key":"3846_CR39","unstructured":"W3C (2001) XHTML\u2009+\u2009Voice Profile 1.0. http:\/\/www.w3.org\/TR\/xhtml+voice\/ . Accessed 3 Mar 2016"},{"key":"3846_CR40","unstructured":"W3C (2003) Multimodal interaction framework. www.w3.org\/TR\/mmi-framework\/ . Accessed 3 Mar 2016"},{"key":"3846_CR41","unstructured":"W3C (2004) Speech recognition grammar specification version 1.0. http:\/\/www.w3.org\/TR\/speech-grammar\/ . Accessed 3 Mar 2016"},{"key":"3846_CR42","unstructured":"W3C (2007) Voice Extensible Markup Language (VoiceXML) 2.1. http:\/\/www.w3.org\/TR\/voicexml21\/ . Accessed 3 Mar 2016"},{"key":"3846_CR43","unstructured":"W3C (2009) EMMA: Extensible MultiModal Annotation markup language. http:\/\/www.w3.org\/TR\/2009\/REC-emma-20090210\/ . Accessed 3 Mar 2016"},{"key":"3846_CR44","unstructured":"W3C (2010) Speech Synthesis Markup Language (SSML) Version 1.1. http:\/\/www.w3.org\/TR\/speech-synthesis11\/ . Accessed 3 Mar 2016"},{"key":"3846_CR45","unstructured":"W3C (2011) Ink Markup Language (InkML). http:\/\/www.w3.org\/TR\/2011\/REC-InkML-20110920\/ . Accessed 3 Mar 2016"},{"key":"3846_CR46","unstructured":"W3C (2012) Multimodal Architecture and Interfaces. http:\/\/www.w3.org\/TR\/mmi-arch\/ . Accessed 3 Mar 2016"},{"key":"3846_CR47","unstructured":"W3C (2012) State Chart XML (SCXML): State Machine Notation for Control Abstraction. http:\/\/www.w3.org\/TR\/scxml\/ . Accessed 3 Mar 2016"},{"key":"3846_CR48","doi-asserted-by":"crossref","unstructured":"Wang K (2002) SALT: a spoken language interface for web-based multimodal dialog systems. Proc Int Conf Spok Lang Process","DOI":"10.3115\/1118808.1118823"}],"container-title":["Multimedia Tools and Applications"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-016-3846-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s11042-016-3846-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-016-3846-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,20]],"date-time":"2023-08-20T09:02:35Z","timestamp":1692522155000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s11042-016-3846-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,10,1]]},"references-count":48,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2017,2]]}},"alternative-id":["3846"],"URL":"https:\/\/doi.org\/10.1007\/s11042-016-3846-8","relation":{},"ISSN":["1380-7501","1573-7721"],"issn-type":[{"type":"print","value":"1380-7501"},{"type":"electronic","value":"1573-7721"}],"subject":[],"published":{"date-parts":[[2016,10,1]]}}}