{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T17:03:35Z","timestamp":1776099815498,"version":"3.50.1"},"reference-count":35,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2014,1,1]],"date-time":"2014-01-01T00:00:00Z","timestamp":1388534400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["12070356, 1218209"],"award-info":[{"award-number":["12070356, 1218209"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100015853","name":"Rochester Institute of Technology","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100015853","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Technical Institute for the Deaf"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Access. Comput."],"published-print":{"date-parts":[[2014,1]]},"abstract":"<jats:p>\n            Real-time captioning enables deaf and hard of hearing (DHH) people to follow classroom lectures and other aural speech by converting it into visual text with less than a five-second delay. Keeping the delay short allows end-users to follow and participate in conversations. This article focuses on the fundamental problem that makes real-time captioning difficult: sequential keyboard typing is much slower than speaking. We first surveyed the audio characteristics of 240 one-hour-long captioned lectures on YouTube, such as speed and duration of speaking bursts. 
We then analyzed how these characteristics impact caption generation and readability, considering specifically our human-powered\n            <jats:italic>collaborative captioning<\/jats:italic>\n            approach. We note that most of these characteristics are also present in more general domains. For our caption comparison evaluation, we transcribed a classroom lecture in real time using three captioning approaches: collaborative, automatic speech recognition (ASR), and professional. We recruited 48 participants (24 DHH) to watch these classroom transcripts in an eye-tracking laboratory, presenting the captions in a randomized, balanced order. We show that both hearing and DHH participants preferred collaborative captions to those generated by ASR or professionals, and followed them better, due to the more consistent\n            <jats:italic>flow<\/jats:italic>\n            of the resulting captions. These results show the potential to reliably capture speech even during sudden bursts of speed, as well as for generating \u201cenhanced\u201d captions, unlike other human-powered captioning approaches.\n          <\/jats:p>","DOI":"10.1145\/2543578","type":"journal-article","created":{"date-parts":[[2014,1,28]],"date-time":"2014-01-28T13:49:22Z","timestamp":1390916962000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":61,"title":["Accessibility Evaluation of Classroom Captions"],"prefix":"10.1145","volume":"5","author":[{"given":"Raja S.","family":"Kushalnagar","sequence":"first","affiliation":[{"name":"Rochester Institute of Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Walter S.","family":"Lasecki","sequence":"additional","affiliation":[{"name":"University of Rochester"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jeffrey P.","family":"Bigham","sequence":"additional","affiliation":[{"name":"Carnegie Mellon 
University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2014,1]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIC-STH.2009.5444533"},{"key":"e_1_2_1_2_1","volume-title":"NCES 2011-033","author":"Aud Susan","year":"2011","unstructured":"Susan Aud, William Hussar, Grace Kena, Kevin Bianco, Lauren Frohlich, Jana Kemp, and Kim Tahan. 2011. The condition of education 2011. NCES 2011-033."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/638249.638284"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2047196.2047201"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1866029.1866080"},{"key":"e_1_2_1_6_1","first-page":"34","article-title":"Captioned television for the deaf","volume":"117","author":"Boyd J.","year":"1972","unstructured":"J. Boyd and E. A. Vader. 1972. Captioned television for the deaf. Am. Ann. Deaf 117, 1, 34--37.","journal-title":"Am. Ann. Deaf"},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201908)","author":"Cui Xiaodong","year":"2008","unstructured":"Xiaodong Cui, Liang Gu, Bing Xiang, Wei Zhang, and Yuqing Gao. 2008. 
Developing high performance ASR in the IBM multilingual speech-to-speech translation system. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201908). IEEE, Los Alamitos, CA, 5121--5124. DOI:http:\/\/dx.doi.org\/10.1109\/ICASSP.2008.4518811."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1353\/tech.2006.0068"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1093\/deafed\/6.4.285"},{"key":"e_1_2_1_10_1","volume-title":"Instructional Technology And Education of the Deaf Symposium. NTID","author":"Fifield M. Bryce","year":"2001","unstructured":"M. Bryce Fifield. 2001. Realtime remote online captioning: An effective accommodation for rural schools and colleges. In Instructional Technology And Education of the Deaf Symposium. NTID, Rochester, NY, 1--9."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1598\/RT.59.7.3"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1353\/aad.2012.0377"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1353\/aad.2012.0073"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1353\/aad.2012.0144"},{"key":"e_1_2_1_15_1","volume-title":"The state of closed captioning services in the United States. Tech. rep","author":"Jordan Amy B.","unstructured":"Amy B. Jordan, Anne Albright, Amy Branner, and John Sullivan. 2003. The state of closed captioning services in the United States. Tech. 
rep., National Captioning Institute Foundation, Washington, DC., 47pp."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1268784.1268860"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.380986"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2384916.2384930"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2461121.2461142"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1093\/deafed\/7.4.267"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2384916.2384942"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2047196.2047200"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2380116.2380122"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2470654.2466269"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1279540.1279551"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1093\/deafed\/enn014"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1037\/0022-0663.93.1.187"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2468356.2468360"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1753326.1753584"},{"key":"e_1_2_1_30_1","first-page":"30","article-title":"Recognition means more than just getting the words right: Beyond accuracy to readability","volume":"1","author":"Stuckless Ross","year":"1999","unstructured":"Ross Stuckless. 1999. Recognition means more than just getting the words right: Beyond accuracy to readability. Speech Technol. 1, 30--35.","journal-title":"Speech Technol."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1518\/001872096778702006"},{"key":"e_1_2_1_32_1","volume-title":"Appellants","author":"United States Supreme Court USSC. 
2012.","unstructured":"United States Supreme Court USSC. 2012. Boyer v. Louisiana, 11-9953, Appellants\u2019 Original Transcript."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/FIE.2005.1612286"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1188816.1188819"},{"key":"e_1_2_1_35_1","first-page":"175","article-title":"A historical study of typewriters and typing methods: From the position of planning Japanese parallels","volume":"2","author":"Yamada Hisao","year":"1980","unstructured":"Hisao Yamada. 1980. A historical study of typewriters and typing methods: From the position of planning Japanese parallels. J. Inf. Process. 2, 4, 175--202.","journal-title":"J. Inf. Process."}],"container-title":["ACM Transactions on Accessible 
Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2543578","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2543578","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T08:10:07Z","timestamp":1750234207000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2543578"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,1]]},"references-count":35,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2014,1]]}},"alternative-id":["10.1145\/2543578"],"URL":"https:\/\/doi.org\/10.1145\/2543578","relation":{},"ISSN":["1936-7228","1936-7236"],"issn-type":[{"value":"1936-7228","type":"print"},{"value":"1936-7236","type":"electronic"}],"subject":[],"published":{"date-parts":[[2014,1]]},"assertion":[{"value":"2013-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-01-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}