{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:29:05Z","timestamp":1750307345213,"version":"3.41.0"},"reference-count":20,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2011,8,1]],"date-time":"2011-08-01T00:00:00Z","timestamp":1312156800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Speech Lang. Process."],"published-print":{"date-parts":[[2011,8]]},"abstract":"<jats:p>This article presents a description of the INESC-ID Age and Gender classification systems which were developed for aiding the detection of child abuse material within the scope of the European project I-DASH. The Age and Gender classification systems are composed respectively by the fusion of four and six individual subsystems trained with short- and long-term acoustic and prosodic features, different classification strategies, Gaussian Mixture Models-Universal Background Model (GMM-UBM), Multi-Layer Perceptrons (MLP) and Support Vector Machines (SVM), trained over five different speech corpus. The best results obtained by the calibration and linear logistic regression fusion back-end show an absolute improvement of 2% on the unweighted accuracy value for the Age and 1% for the Gender when compared to the best individual frontend systems in the development set. The final age\/gender detection system evaluated using a six-hour child abuse (CA) test set achieved promising results given the extremely difficult conditions of this type of video material. In order to further improve the performance in the CA domain, the classification modules were adapted using unsupervised selection of training data. 
An automatic data selection algorithm using frame-level posterior probabilities was developed. Performance improvement after adapting the classification modules was around 10% relative when compared with the baseline classifiers.<\/jats:p>","DOI":"10.1145\/1998384.1998387","type":"journal-article","created":{"date-parts":[[2011,8,16]],"date-time":"2011-08-16T19:11:58Z","timestamp":1313521918000},"page":"1-16","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":15,"title":["Age and gender detection in the I-DASH project"],"prefix":"10.1145","volume":"7","author":[{"given":"Hugo","family":"Meinedo","sequence":"first","affiliation":[{"name":"L2F - Spoken Language Systems Lab, INESC-ID, Portugal"}]},{"given":"Isabel","family":"Trancoso","sequence":"additional","affiliation":[{"name":"L2F - Spoken Language Systems Lab, INESC-ID and Instituto Superior T\u00e9cnico, Portugal"}]}],"member":"320","published-online":{"date-parts":[[2011,8,18]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ODYSSEY.2006.248112"},{"volume-title":"Proceedings of Interspeech.","author":"Batliner A.","key":"e_1_2_1_2_1"},{"volume-title":"Proceedings of InterSpeech.","author":"Bugalho M.","key":"e_1_2_1_3_1"},{"volume-title":"Proceedings of the Language and Resources Conference (LREC).","author":"Burkhardt F.","key":"e_1_2_1_4_1"},{"key":"e_1_2_1_5_1","unstructured":"Eskenazi M. Mostow J. and Graff D. 1997. The CMU kids corpus. In Linguistic Data Consortium.
Philadelphia PA."},{"volume-title":"Proceedings of the International Conference on Affective Computing and Intelligent Interaction (ACII).","author":"Eyben F.","key":"e_1_2_1_6_1"},{"volume-title":"Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP) .","author":"Hermansky H.","key":"e_1_2_1_7_1"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-6393(98)00032-6"},{"volume-title":"Proceedings of the EuroSpeech.","author":"Lee S.","key":"e_1_2_1_9_1"},{"volume-title":"Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP).","author":"Ma J.","key":"e_1_2_1_10_1"},{"volume-title":"Proceedings of Interspeech.","author":"Meinedo H.","key":"e_1_2_1_12_1"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSA.2003.818026"},{"volume-title":"Proceedings of the International Workshop on Multimedia Signal Processing (MMSP).","author":"Potamianos A.","key":"e_1_2_1_14_1"},{"key":"e_1_2_1_15_1","article-title":"The effects of bandwidth reduction on human and computer recognition of children's speech. IEEE Signal","author":"Russell M.","year":"2007","journal-title":"Process. Lett., 1044--1046."},{"volume-title":"Proceedings of Interspeech.","author":"Schuller B.","key":"e_1_2_1_16_1"},{"volume-title":"The Science of the Singing Voice","author":"Sundberg J.","key":"e_1_2_1_17_1","doi-asserted-by":"crossref","DOI":"10.1121\/1.399243"},{"volume-title":"Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP).","author":"Wang L.","key":"e_1_2_1_18_1"},{"key":"e_1_2_1_19_1","doi-asserted-by":"crossref","unstructured":"Wessel F. and Ney H. 2005. Unsupervised training of acoustic models for large vocabulary continuous speech recognition. In IEEE Trans.
Speech Audio Process.","DOI":"10.1109\/TSA.2004.838537"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.1996.541104"},{"volume-title":"Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding.","author":"Wu Y.","key":"e_1_2_1_21_1"}],"container-title":["ACM Transactions on Speech and Language Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1998384.1998387","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1998384.1998387","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T11:06:27Z","timestamp":1750244787000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1998384.1998387"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,8]]},"references-count":20,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2011,8]]}},"alternative-id":["10.1145\/1998384.1998387"],"URL":"https:\/\/doi.org\/10.1145\/1998384.1998387","relation":{},"ISSN":["1550-4875","1550-4883"],"issn-type":[{"type":"print","value":"1550-4875"},{"type":"electronic","value":"1550-4883"}],"subject":[],"published":{"date-parts":[[2011,8]]},"assertion":[{"value":"2010-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-08-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}