{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T06:44:41Z","timestamp":1773902681235,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":31,"publisher":"ACM","license":[{"start":{"date-parts":[[2017,11,3]],"date-time":"2017-11-03T00:00:00Z","timestamp":1509667200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2017,11,3]]},"DOI":"10.1145\/3136755.3136769","type":"proceedings-article","created":{"date-parts":[[2017,11,6]],"date-time":"2017-11-06T13:30:29Z","timestamp":1509975029000},"page":"297-301","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Temporal alignment using the incremental unit framework"],"prefix":"10.1145","author":[{"given":"Casey","family":"Kennington","sequence":"first","affiliation":[{"name":"Boise State University, USA"}]},{"given":"Ting","family":"Han","sequence":"additional","affiliation":[{"name":"Bielefeld University, Germany"}]},{"given":"David","family":"Schlangen","sequence":"additional","affiliation":[{"name":"Bielefeld University, Germany"}]}],"member":"320","published-online":{"date-parts":[[2017,11,3]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Proceedings of CSLP. 1922\u2014-1925","author":"Aist Gregory","year":"2006","unstructured":"Gregory Aist , James Allen , Ellen Campana , Lucian Galescu , Carlos Gallo , Scott Stoness , Mary Swift , and Michael Tanenhaus . 2006 . Software architectures for incremental understanding of human speech . In Proceedings of CSLP. 1922\u2014-1925 . Gregory Aist, James Allen, Ellen Campana, Lucian Galescu, Carlos Gallo, Scott Stoness, Mary Swift, and Michael Tanenhaus. 2006. 
Software architectures for incremental understanding of human speech. In Proceedings of CSLP. 1922\u2014-1925."},{"key":"e_1_3_2_1_2_1","volume-title":"Pragmatics","volume":"1","author":"Aist Gregory","year":"2007","unstructured":"Gregory Aist , James Allen , Ellen Campana , Carlos Gomez Gallo , Scott Stoness , and Mary Swift . 2007 . Incremental understanding in human-computer dialogue and experimental evidence for advantages over nonincremental methods . In Pragmatics , Vol. 1 . Trento, Italy, 149\u2013154. Gregory Aist, James Allen, Ellen Campana, Carlos Gomez Gallo, Scott Stoness, and Mary Swift. 2007. Incremental understanding in human-computer dialogue and experimental evidence for advantages over nonincremental methods. In Pragmatics, Vol. 1. Trento, Italy, 149\u2013154."},{"key":"e_1_3_2_1_3_1","unstructured":"James F. Allen. 2013. Maintaining Knowledge about Temporal Intervals. In Readings in Qualitative Reasoning About Physical Systems. 361\u2013372. org\/10.1016\/B978-1-4832-1447-4.50033-X   James F. Allen. 2013. Maintaining Knowledge about Temporal Intervals. In Readings in Qualitative Reasoning About Physical Systems. 361\u2013372. org\/10.1016\/B978-1-4832-1447-4.50033-X"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2070481.2070521"},{"key":"e_1_3_2_1_5_1","volume-title":"Proceedings of LREC. 266\u2013271","author":"Asri Layla El","year":"2014","unstructured":"Layla El Asri , Romain Laroche , Olivier Pietquin , and Hatim Khouzaimi . 2014 . NASTIA: Negotiating Appointment Setting Interface . In Proceedings of LREC. 266\u2013271 . Layla El Asri, Romain Laroche, Olivier Pietquin, and Hatim Khouzaimi. 2014. NASTIA: Negotiating Appointment Setting Interface. In Proceedings of LREC. 
266\u2013271."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00530-010-0182-0"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/2390444.2390464"},{"key":"e_1_3_2_1_9_1","volume-title":"Micro-Structure of Disfluencies: Basics for Conversational Speech Synthesis. Interspeech 2015","author":"Betz Simon","year":"2015","unstructured":"Simon Betz , Petra Wagner , and David Schlangen . 2015. Micro-Structure of Disfluencies: Basics for Conversational Speech Synthesis. Interspeech 2015 ( 2015 ). Simon Betz, Petra Wagner, and David Schlangen. 2015. Micro-Structure of Disfluencies: Basics for Conversational Speech Synthesis. Interspeech 2015 (2015)."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2008.04.002"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSMC.2005.1571679"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2909824.3020214"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.3115\/976909.979653"},{"key":"e_1_3_2_1_14_1","volume-title":"Proceedings of SigDial","author":"Kennington Casey","year":"2013","unstructured":"Casey Kennington , Spyros Kousidis , and David Schlangen . 2013 . Interpreting Situated Dialogue Utterances: an Update Model that Uses Speech, Gaze, and Gesture Information . In Proceedings of SigDial 2013. 173\u2013182. Casey Kennington, Spyros Kousidis, and David Schlangen. 2013. Interpreting Situated Dialogue Utterances: an Update Model that Uses Speech, Gaze, and Gesture Information. In Proceedings of SigDial 2013. 
173\u2013182."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/W14-4312"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W16-3631"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/W14-0212"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/319382.319398"},{"key":"e_1_3_2_1_19_1","volume-title":"Multimodal interfaces. The human-computer interaction handbook: Fundamentals, evolving technologies and emerging applications 14","author":"Oviatt Sharon","year":"2003","unstructured":"Sharon Oviatt . 2003. Multimodal interfaces. The human-computer interaction handbook: Fundamentals, evolving technologies and emerging applications 14 ( 2003 ), 286\u2013304. Sharon Oviatt. 2003. Multimodal interfaces. The human-computer interaction handbook: Fundamentals, evolving technologies and emerging applications 14 (2003), 286\u2013304."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2017.02.003"},{"key":"e_1_3_2_1_21_1","unstructured":"Gerhard Russ Brian Sallans and Harald Hareter. 2005. Semantic Based Information Fusion in a Multimodal Interface.. In CSREA HCI. Citeseer 94\u2013102.  Gerhard Russ Brian Sallans and Harald Hareter. 2005. Semantic Based Information Fusion in a Multimodal Interface.. In CSREA HCI. Citeseer 94\u2013102."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/1708376.1708381"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.5087\/dad.2011.105"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2818346.2823301"},{"key":"e_1_3_2_1_25_1","volume-title":"Towards Incremental Speech Production in Dialogue Systems","author":"Skantze Gabriel","unstructured":"Gabriel Skantze and Anna Hjalmarsson . 1991. Towards Incremental Speech Production in Dialogue Systems . In Word Journal Of The International Linguistic Association . Tokyo, Japan, 1\u20138. 
Gabriel Skantze and Anna Hjalmarsson. 1991. Towards Incremental Speech Production in Dialogue Systems. In Word Journal Of The International Linguistic Association. Tokyo, Japan, 1\u20138."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.5555\/1944506.1944507"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/1609067.1609150"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1101149.1101236"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0010-0285(02)00503-0"},{"key":"e_1_3_2_1_30_1","volume-title":"N.Y.) 268, 5217","author":"Tanenhaus Michael","year":"1995","unstructured":"Michael Tanenhaus , Michael Spivey-Knowlton , Kathleen Eberhard , and Julie Sedivy . 1995. Integration of visual and linguistic information in spoken language comprehension. Science (New York , N.Y.) 268, 5217 ( 1995 ), 1632\u20131634. Michael Tanenhaus, Michael Spivey-Knowlton, Kathleen Eberhard, and Julie Sedivy. 1995. Integration of visual and linguistic information in spoken language comprehension. Science (New York, N.Y.) 268, 5217 (1995), 1632\u20131634."},{"key":"e_1_3_2_1_31_1","volume-title":"Proceedings of the Eigth International Conference on Language Resources and Evaluation (LREC","author":"Tokunaga Takenobu","year":"2012","unstructured":"Takenobu Tokunaga , Ryu Iida , Asuka Terai , and Naoko Kuriyama . 2012 . The REX corpora : A collection of multimodal corpora of referring expressions in collaborative problem solving dialogues . In Proceedings of the Eigth International Conference on Language Resources and Evaluation (LREC 2012). 422\u2013429. Takenobu Tokunaga, Ryu Iida, Asuka Terai, and Naoko Kuriyama. 2012. The REX corpora : A collection of multimodal corpora of referring expressions in collaborative problem solving dialogues. In Proceedings of the Eigth International Conference on Language Resources and Evaluation (LREC 2012). 
422\u2013429."},{"key":"e_1_3_2_1_32_1","unstructured":"Minh T Vo and Alex Waibel. 1997. Modeling and interpreting multimodal inputs: A semantic integration approach. Technical Report. DTIC Document."}],"event":{"name":"ICMI '17: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION","location":"Glasgow UK","acronym":"ICMI '17","sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction"]},"container-title":["Proceedings of the 19th ACM International Conference on Multimodal 
Interaction"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3136755.3136769","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3136755.3136769","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T02:13:37Z","timestamp":1750212817000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3136755.3136769"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,11,3]]},"references-count":31,"alternative-id":["10.1145\/3136755.3136769","10.1145\/3136755"],"URL":"https:\/\/doi.org\/10.1145\/3136755.3136769","relation":{},"subject":[],"published":{"date-parts":[[2017,11,3]]},"assertion":[{"value":"2017-11-03","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}