{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T02:16:41Z","timestamp":1768011401585,"version":"3.49.0"},"reference-count":37,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2011,10,1]],"date-time":"2011-10-01T00:00:00Z","timestamp":1317427200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001381","name":"National Research Foundation-Prime Minister's office, Republic of Singapore","doi-asserted-by":"publisher","award":["NRF2007IDM-IDM002-047, NRF2008IDM-IDM004-029"],"award-info":[{"award-number":["NRF2007IDM-IDM002-047, NRF2008IDM-IDM004-029"]}],"id":[{"id":"10.13039\/501100001381","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2011,10]]},"abstract":"<jats:p>\n            There are more than 66 million people suffering from hearing impairment and this disability brings them difficulty in video content understanding due to the loss of audio information. If the scripts are available, captioning technology can help them in a certain degree by synchronously illustrating the scripts during the playing of videos. However, we show that the existing captioning techniques are far from satisfactory in assisting the hearing-impaired audience to enjoy videos. In this article, we introduce a scheme to enhance video accessibility using a\n            <jats:italic>Dynamic Captioning<\/jats:italic>\n            approach, which explores a rich set of technologies including face detection and recognition, visual saliency analysis, text-speech alignment, etc. 
Different from the existing methods that are categorized as static captioning, dynamic captioning puts scripts at suitable positions to help the hearing-impaired audience better recognize the speaking characters. In addition, it progressively highlights the scripts word-by-word via aligning them with the speech signal and illustrates the variation of voice volume. In this way, the special audience can better track the scripts and perceive the moods that are conveyed by the variation of volume. We implemented the technology on 20 video clips and conducted an in-depth study with 60 real hearing-impaired users. The results demonstrated the effectiveness and usefulness of the video accessibility enhancement scheme.\n          <\/jats:p>","DOI":"10.1145\/2037676.2037681","type":"journal-article","created":{"date-parts":[[2011,11,8]],"date-time":"2011-11-08T13:32:01Z","timestamp":1320759121000},"page":"1-19","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":45,"title":["Video accessibility enhancement for hearing-impaired users"],"prefix":"10.1145","volume":"7S","author":[{"given":"Richang","family":"Hong","sequence":"first","affiliation":[{"name":"Hefei University of Technology, Hefei China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Meng","family":"Wang","sequence":"additional","affiliation":[{"name":"Hefei University of Technology, Hefei China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiao-Tong","family":"Yuan","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mengdi","family":"Xu","sequence":"additional","affiliation":[{"name":"National University of Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianguo","family":"Jiang","sequence":"additional","affiliation":[{"name":"Hefei University of 
Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shuicheng","family":"Yan","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tat-Seng","family":"Chua","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2011,11,4]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding.","author":"Aimera J.","unstructured":"Aimera , J. and Wooters , C . 2003. A robust speaker clustering algorithm . In Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding. Aimera, J. and Wooters, C. 2003. A robust speaker clustering algorithm. In Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the IEEE International Conference on Computer Vision.","author":"Arandjelovic O.","unstructured":"Arandjelovic , O. and Zisserman , A . 2005. Automatic face recognition for film character retrieval in feature-length films . In Proceedings of the IEEE International Conference on Computer Vision. Arandjelovic, O. and Zisserman, A. 2005. Automatic face recognition for film character retrieval in feature-length films. In Proceedings of the IEEE International Conference on Computer Vision."},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the International Conference on Web Engineering.","author":"Arrue M.","unstructured":"Arrue , M. and Vigo , M . 2007. Considering web accessibility in information retrieval systems . In Proceedings of the International Conference on Web Engineering. Arrue, M. and Vigo, M. 2007. Considering web accessibility in information retrieval systems. 
In Proceedings of the International Conference on Web Engineering."},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the Annual Workshop on Human-Computer Interaction and Information Retrieval.","author":"Azzopardi L.","unstructured":"Azzopardi , L. , Glassey , R. , Polajnar , M. , and Ruthven , I . 2009. Puppyir: Designing an open source framework for interactive information services for children . In Proceedings of the Annual Workshop on Human-Computer Interaction and Information Retrieval. Azzopardi, L., Glassey, R., Polajnar, M., and Ruthven, I. 2009. Puppyir: Designing an open source framework for interactive information services for children. In Proceedings of the Annual Workshop on Human-Computer Interaction and Information Retrieval."},{"key":"e_1_2_1_5_1","first-page":"32","article-title":"Captioned television for the deaf","volume":"117","author":"Boyd J.","year":"1972","unstructured":"Boyd , J. and Vade , E. 1972 . Captioned television for the deaf . Am Ann Hear. Impaired 117 , 1, 32 -- 37 . Boyd, J. and Vade, E. 1972. Captioned television for the deaf. Am Ann Hear. Impaired 117, 1, 32--37.","journal-title":"Am Ann Hear. Impaired"},{"key":"e_1_2_1_6_1","first-page":"943","article-title":"The effects of caption rate and language level on comprehension of a captioned video presentation","volume":"125","author":"Braveman B.","year":"1980","unstructured":"Braveman , B. and Hertzog , M. 1980 . The effects of caption rate and language level on comprehension of a captioned video presentation . Am Ann Hear. Impaired 125 , 7, 943 -- 948 . Braveman, B. and Hertzog, M. 1980. The effects of caption rate and language level on comprehension of a captioned video presentation. Am Ann Hear. Impaired 125, 7, 943--948.","journal-title":"Am Ann Hear. 
Impaired"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/638249.638287"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the International Conference on Multi-Media Modeling.","author":"Cu M.","unstructured":"Cu , M. , Chia , L. T. , Yi , H. , and Rajan , D . 2006. Affective content detection in sitcom using subtitle and audio . In Proceedings of the International Conference on Multi-Media Modeling. Cu, M., Chia, L. T., Yi, H., and Rajan, D. 2006. Affective content detection in sitcom using subtitle and audio. In Proceedings of the International Conference on Multi-Media Modeling."},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the European Conference on Speech Communication and Technology.","author":"Daelons W.","unstructured":"Daelons , W. and Bosch , V . 1993. Tabtalk: Reusability in dataoriented grapheme-to-phoneme conversion . In Proceedings of the European Conference on Speech Communication and Technology. Daelons, W. and Bosch, V. 1993. Tabtalk: Reusability in dataoriented grapheme-to-phoneme conversion. In Proceedings of the European Conference on Speech Communication and Technology."},{"key":"e_1_2_1_10_1","volume-title":"Proceedings of the British Machine Vision Conference.","author":"Everinghan M.","unstructured":"Everinghan , M. , Siviv , J. , and Zisserman , A . 2006. Hello&excl; My name is.. Buffy. Automatic naming of characters in TV videos . In Proceedings of the British Machine Vision Conference. Everinghan, M., Siviv, J., and Zisserman, A. 2006. Hello&excl; My name is.. Buffy. Automatic naming of characters in TV videos. In Proceedings of the British Machine Vision Conference."},{"key":"e_1_2_1_11_1","first-page":"6","article-title":"Improving deaf users accessibility in hyptertext information retrieval: are graphical interfaces useful for them&quest;","volume":"26","author":"Fajardo I.","year":"2006","unstructured":"Fajardo , I. , Canas , J.-J. , Salmeron , L. , and Abascal , J. 2006 . 
Improving deaf users accessibility in hyptertext information retrieval: are graphical interfaces useful for them&quest; Behav. Inform. Technol. 26 , 6 . Fajardo, I., Canas, J.-J., Salmeron, L., and Abascal, J. 2006. Improving deaf users accessibility in hyptertext information retrieval: are graphical interfaces useful for them&quest; Behav. Inform. Technol. 26, 6.","journal-title":"Behav. Inform. Technol."},{"key":"e_1_2_1_12_1","volume-title":"Statistical Methods for Research Workers","author":"Fisher R. A.","unstructured":"Fisher , R. A. 1970. Statistical Methods for Research Workers . Macmillan Pub Co. Fisher, R. A. 1970. Statistical Methods for Research Workers. Macmillan Pub Co."},{"key":"e_1_2_1_13_1","first-page":"78","article-title":"Working memory capacity and comprehension processes in hearing impaired reader","volume":"2","author":"Garrison W.","year":"1997","unstructured":"Garrison , W. , Long , G. , and Dowaliby , F. 1997 . Working memory capacity and comprehension processes in hearing impaired reader . J. Hear. Impaired Stud Hear. Impaired Edu. 2 , 2, 78 -- 94 . Garrison, W., Long, G., and Dowaliby, F. 1997. Working memory capacity and comprehension processes in hearing impaired reader. J. Hear. Impaired Stud Hear. Impaired Edu. 2, 2, 78--94.","journal-title":"J. Hear. Impaired Stud Hear. Impaired Edu."},{"key":"e_1_2_1_14_1","first-page":"374","article-title":"How level and type of deafness affect user perception of multimedia video clips","volume":"2","author":"Gulliver S. R.","year":"2003","unstructured":"Gulliver , S. R. and Ghinea , G. 2003 a. How level and type of deafness affect user perception of multimedia video clips . Inform. Soc. J. 2 , 4, 374 -- 386 . Gulliver, S. R. and Ghinea, G. 2003a. How level and type of deafness affect user perception of multimedia video clips. Inform. Soc. J. 2, 4, 374--386.","journal-title":"Inform. Soc. 
J."},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the IEEE International Conference on Multimedia and Expo.","author":"Gulliver S. R.","unstructured":"Gulliver , S. R. and Ghinea , G . 2003b. Impact of captions on hearing impaired and hearing perception of multimedia video clips . In Proceedings of the IEEE International Conference on Multimedia and Expo. Gulliver, S. R. and Ghinea, G. 2003b. Impact of captions on hearing impaired and hearing perception of multimedia video clips. In Proceedings of the IEEE International Conference on Multimedia and Expo."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1873951.1874013"},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","unstructured":"Huang X. Alleva F. Hon H. Hwuang M.-Y. Lee K.-F. and Rosenfeld R. 1993. The Sphinx II speech recognition system: An overview. In Computer Speech and Language.  Huang X. Alleva F. Hon H. Hwuang M.-Y. Lee K.-F. and Rosenfeld R. 1993. The Sphinx II speech recognition system: An overview. In Computer Speech and Language.","DOI":"10.3115\/1075671.1075690"},{"key":"e_1_2_1_18_1","first-page":"43","article-title":"Television literacy: comprehension of program content using closed captions for the deaf","volume":"6","author":"Jelinek L.","year":"2001","unstructured":"Jelinek , L. and Jackson , D. 2001 . Television literacy: comprehension of program content using closed captions for the deaf . J. Hear. Impaired Stud Hear. Impaired Educ. 6 , 1, 43 -- 53 . Jelinek, L. and Jackson, D. 2001. Television literacy: comprehension of program content using closed captions for the deaf. J. Hear. Impaired Stud Hear. Impaired Educ. 6, 1, 43--53.","journal-title":"J. Hear. Impaired Stud Hear. Impaired Educ."},{"key":"e_1_2_1_19_1","volume-title":"-R","author":"Juslin P.-N.","year":"2009","unstructured":"Juslin , P.-N. and Scherer , K . -R . 2009 . Speech emotion analysis. http:\/\/www.scholarpedia.org\/article\/speech emotion analysis. Juslin, P.-N. and Scherer, K.-R. 2009. 
Speech emotion analysis. http:\/\/www.scholarpedia.org\/article\/speech emotion analysis."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/957013.957094"},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the International Conference on Spoken Language Processing.","author":"Moreno P. J.","year":"1998","unstructured":"Moreno , P. J. 1998 . A recursive algorithm for the forced alignment of very long audio segments . In Proceedings of the International Conference on Spoken Language Processing. Moreno, P. J. 1998. A recursive algorithm for the forced alignment of very long audio segments. In Proceedings of the International Conference on Spoken Language Processing."},{"key":"e_1_2_1_22_1","unstructured":"Nielsen J. 1995. Advances in human-computer interaction. http:\/\/en.wikipedia.org\/wiki\/Deaf 5.   Nielsen J. 1995. Advances in human-computer interaction. http:\/\/en.wikipedia.org\/wiki\/Deaf 5."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11222-008-9111-x"},{"key":"e_1_2_1_24_1","volume-title":"-B","author":"Reynolds D.-A.","year":"2000","unstructured":"Reynolds , D.-A. , Quatieri , T.-F. , and Dunn , R . -B . 2000 . Speaker verification using adapted Gaussian mixture models. In Digital Signal Processing , 14--91. Reynolds, D.-A., Quatieri, T.-F., and Dunn, R.-B. 2000. Speaker verification using adapted Gaussian mixture models. In Digital Signal Processing, 14--91."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2005.251"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1631272.1631300"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1631272.1631305"},{"key":"e_1_2_1_28_1","series-title":"SIAM J. Optim","volume-title":"On accelerated proximal gradient methods for convex-concave optimization","author":"Tseng P.","unstructured":"Tseng , P. 2008. On accelerated proximal gradient methods for convex-concave optimization . SIAM J. Optim . Tseng, P. 2008. 
On accelerated proximal gradient methods for convex-concave optimization. SIAM J. Optim."},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition.","author":"Viola P.","unstructured":"Viola , P. and Jones , M . 2001. Rapid object detection using a boosted cascaded of simple features . In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition. Viola, P. and Jones, M. 2001. Rapid object detection using a boosted cascaded of simple features. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition."},{"key":"e_1_2_1_30_1","volume-title":"-M","author":"Wan V.","year":"2000","unstructured":"Wan , V. and Campbell , W . -M . 2000 . Support vector machines for speaker verification and identification. In Proc. IEEE. Wan, V. and Campbell, W.-M. 2000. Support vector machines for speaker verification and identification. In Proc. IEEE."},
{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2009.2017400"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2009.2012919"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1631272.1631314"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.4249\/scholarpedia.9431"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2008.79"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1459359.1459457"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2005.292"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2037676.2037681","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2037676.2037681","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T09:54:28Z","timestamp":1750240468000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2037676.2037681"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,10]]},"references-count":37,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2011,10]]}},"alternative-id":["10.1145\/2037676.2037681"],"URL":"https:\/\/doi.org\/10.1145\/2037676.2037681","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,10]]},"assertion":[{"value":"2011-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2011-11-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}