{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:23:53Z","timestamp":1750220633718,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":40,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,8,20]],"date-time":"2020-08-20T00:00:00Z","timestamp":1597881600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,8,23]]},"DOI":"10.1145\/3394486.3403326","type":"proceedings-article","created":{"date-parts":[[2020,8,20]],"date-time":"2020-08-20T23:03:55Z","timestamp":1597964635000},"page":"2755-2763","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Acoustic Measures for Real-Time Voice Coaching"],"prefix":"10.1145","author":[{"given":"Ying","family":"Li","sequence":"first","affiliation":[{"name":"Giving Tech Labs, Seattle, WA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Abraham","family":"Miller","sequence":"additional","affiliation":[{"name":"Giving Tech Labs, Seattle, WA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Arthur","family":"Liu","sequence":"additional","affiliation":[{"name":"Giving Tech Labs, Seattle, WA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kyle","family":"Coburn","sequence":"additional","affiliation":[{"name":"Giving Tech Labs, Seattle, WA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Luis J.","family":"Salazar","sequence":"additional","affiliation":[{"name":"Giving Tech Labs, Seattle, WA, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,8,20]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"A Real-Time Phoneme Counting Algorithm and Application for Speech Rate Monitoring. Journal of Fluency Disorders 51 (01","author":"Aharonson Vered","year":"2017","unstructured":"Vered Aharonson , Eran Aharonson , Katia Levi , Aviv Sotzianu , Ofer Amir , and Ovadia-Blechman Zehava . 2017. A Real-Time Phoneme Counting Algorithm and Application for Speech Rate Monitoring. Journal of Fluency Disorders 51 (01 2017 ). https:\/\/doi.org\/10.1016\/j.jfludis.2017.01.001 10.1016\/j.jfludis.2017.01.001 Vered Aharonson, Eran Aharonson, Katia Levi, Aviv Sotzianu, Ofer Amir, and Ovadia-Blechman Zehava. 2017. A Real-Time Phoneme Counting Algorithm and Application for Speech Rate Monitoring. Journal of Fluency Disorders 51 (01 2017). https:\/\/doi.org\/10.1016\/j.jfludis.2017.01.001"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2010.09.020"},{"key":"e_1_3_2_2_3_1","volume-title":"Proceedings, Institute of Phonetic Science","author":"Boersma Paul","year":"1993","unstructured":"Paul Boersma . 1993 . ACCURATE SHORT-TERM ANALYSIS OF THE FUNDAMENTAL FREQUENCY AND THE HARMONICS-TO-NOISE RATIO OF A SAMPLED SOUND . In Proceedings, Institute of Phonetic Science . University of Amsterdam, 97--110. Paul Boersma. 1993. ACCURATE SHORT-TERM ANALYSIS OF THE FUNDAMENTAL FREQUENCY AND THE HARMONICS-TO-NOISE RATIO OF A SAMPLED SOUND. In Proceedings, Institute of Phonetic Science. University of Amsterdam, 97--110."},{"key":"e_1_3_2_2_4_1","unstructured":"Paul Boersma and David Weenink. 2020. Praat: doing phonetics by computer. http:\/\/www.praat.org\/  Paul Boersma and David Weenink. 2020. Praat: doing phonetics by computer. http:\/\/www.praat.org\/"},{"key":"e_1_3_2_2_5_1","volume-title":"a fundamental frequency estimator for speech and music. 
The Journal of the Acoustical Society of America 111 4","author":"de Cheveign\u00e9 Alain","year":"2002","unstructured":"Alain de Cheveign\u00e9 and Hideki Kawahara . 2002. YIN , a fundamental frequency estimator for speech and music. The Journal of the Acoustical Society of America 111 4 ( 2002 ), 1917--30. Alain de Cheveign\u00e9 and Hideki Kawahara. 2002. YIN, a fundamental frequency estimator for speech and music. The Journal of the Acoustical Society of America 111 4 (2002), 1917--30."},{"key":"e_1_3_2_2_6_1","volume-title":"Proceedings of 11th Annual Conference of the International Speech Communication Association (INTERSPEECH), K Hirose, S Nakamura, and T Kaboyashi (Eds.). 3110--3113","author":"Dean David","year":"2010","unstructured":"David Dean , Sridha Sridharan , Robert Vogt , and Michael Mason . 2010 . The QUTNOISE-TIMIT corpus for the evaluation of voice activity detection algorithms . In Proceedings of 11th Annual Conference of the International Speech Communication Association (INTERSPEECH), K Hirose, S Nakamura, and T Kaboyashi (Eds.). 3110--3113 . David Dean, Sridha Sridharan, Robert Vogt, and Michael Mason. 2010. The QUTNOISE-TIMIT corpus for the evaluation of voice activity detection algorithms. In Proceedings of 11th Annual Conference of the International Speech Communication Association (INTERSPEECH), K Hirose, S Nakamura, and T Kaboyashi (Eds.). 3110--3113."},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2006-275"},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2006-275"},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"crossref","DOI":"10.1109\/ICASSP.2003.1198819","volume-title":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)","volume":"1","author":"Wang Dong","year":"2003","unstructured":"Dong Wang , Lie Lu , and Hong-Jiang Zhang . 2003 . Speech segmentation without speech recognition . 
In 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03) ., Vol. 1 . I--I. https:\/\/doi.org\/10.1109\/ICASSP. 2003.1198819 10.1109\/ICASSP.2003.1198819 Dong Wang, Lie Lu, and Hong-Jiang Zhang. 2003. Speech segmentation without speech recognition. In 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)., Vol. 1. I--I. https:\/\/doi.org\/10.1109\/ICASSP.2003.1198819"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2006-230"},{"key":"e_1_3_2_2_11_1","unstructured":"Gustav Theodor Fechner Edwin G Boring and Davis H Howes. 1966. Elements of Psychophysics: Transl. by Helmut E. Adler. Holt Rinehart and Winston.  Gustav Theodor Fechner Edwin G Boring and Davis H Howes. 1966. Elements of Psychophysics: Transl. by Helmut E. Adler. Holt Rinehart and Winston."},{"key":"e_1_3_2_2_12_1","volume-title":"On the role of spectral transition for speech perception. The Journal of the Acoustical Society of America 80 (11","author":"Furui Sadaoki","year":"1986","unstructured":"Sadaoki Furui . 1986. On the role of spectral transition for speech perception. The Journal of the Acoustical Society of America 80 (11 1986 ), 1016--25. https:\/\/doi.org\/10.1121\/1.393842 10.1121\/1.393842 Sadaoki Furui. 1986. On the role of spectral transition for speech perception. The Journal of the Acoustical Society of America 80 (11 1986), 1016--25. https:\/\/doi.org\/10.1121\/1.393842"},{"key":"e_1_3_2_2_13_1","unstructured":"John S. Garofolo Lori F. Lamel William M. Fisher Jonathan G. Fiscus David S. Pallett Nancy L. Dahlgren and Victor Zue. 1993. TIMIT Acoustic-Phonetic Continuous Speech Corpus LDC93S1. https:\/\/catalog.ldc.upenn.edu\/LDC93S1  John S. Garofolo Lori F. Lamel William M. Fisher Jonathan G. Fiscus David S. Pallett Nancy L. Dahlgren and Victor Zue. 1993. TIMIT Acoustic-Phonetic Continuous Speech Corpus LDC93S1. 
https:\/\/catalog.ldc.upenn.edu\/LDC93S1"},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2011.2104953"},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.wocn.2003.09.005"},{"volume-title":"Intonation Systems, A Survey of Twenty Languages, Daniel Hirst and Albert di Cristo (Eds.)","author":"Hirst Daniel","key":"e_1_3_2_2_16_1","unstructured":"Daniel Hirst and Albert di Cristo . 1999. A survey of intonation systems . In Intonation Systems, A Survey of Twenty Languages, Daniel Hirst and Albert di Cristo (Eds.) . Cambridge University Press , Chapter 1, 1--44. Daniel Hirst and Albert di Cristo. 1999. A survey of intonation systems. In Intonation Systems, A Survey of Twenty Languages, Daniel Hirst and Albert di Cristo (Eds.). Cambridge University Press, Chapter 1, 1--44."},{"key":"e_1_3_2_2_17_1","volume-title":"Spoken Language Processing: A Guide to Theory, Algorithm, and System Development","author":"Huang Xuedong","unstructured":"Xuedong Huang , Alex Acero , Hsiao-Wuen Hon , and Raj Reddy . 2001. Spoken Language Processing: A Guide to Theory, Algorithm, and System Development ( 1 st ed.). Prentice Hall PTR , USA. Xuedong Huang, Alex Acero, Hsiao-Wuen Hon, and Raj Reddy. 2001. Spoken Language Processing: A Guide to Theory, Algorithm, and System Development (1st ed.). Prentice Hall PTR, USA.","edition":"1"},{"key":"e_1_3_2_2_18_1","unstructured":"Judd Humpherys. 2012. Your Speech Patterns Affect Sales Performance. https:\/\/ezinearticles.com\/?Your-Speech-Patterns-Affect-SalesPerformance&id=7306149  Judd Humpherys. 2012. Your Speech Patterns Affect Sales Performance. 
https:\/\/ezinearticles.com\/?Your-Speech-Patterns-Affect-SalesPerformance&id=7306149"},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1002\/9781118983973"},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177729332"},{"key":"e_1_3_2_2_21_1","volume-title":"The Pin Drop Principle","author":"Lewis David","unstructured":"David Lewis and G. Riley Mills . 2012. The Pin Drop Principle ( 1 st ed.). Jossey-Bass , USA. David Lewis and G. Riley Mills. 2012. The Pin Drop Principle (1st ed.). Jossey-Bass, USA.","edition":"1"},{"key":"e_1_3_2_2_22_1","volume-title":"Russo","author":"Livingstone Steven R.","year":"2018","unstructured":"Steven R. Livingstone and Frank A . Russo . 2018 . The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLOS ONE 13, 5 (05 2018), 1--35. https:\/\/doi.org\/10.1371\/journal.pone.0196391 10.1371\/journal.pone.0196391 Steven R. Livingstone and Frank A. Russo. 2018. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLOS ONE 13, 5 (05 2018), 1--35. https:\/\/doi.org\/10.1371\/journal.pone.0196391"},{"key":"#cr-split#-e_1_3_2_2_23_1.1","doi-asserted-by":"crossref","unstructured":"Lawrence Marks and Mary Florentine. 2010. Measurement of Loudness Part I: Methods Problems and Pitfalls. 17--56. https:\/\/doi.org\/10.1007\/978-1-4419-6712-1_2 10.1007\/978-1-4419-6712-1_2","DOI":"10.1007\/978-1-4419-6712-1_2"},{"key":"#cr-split#-e_1_3_2_2_23_1.2","doi-asserted-by":"crossref","unstructured":"Lawrence Marks and Mary Florentine. 2010. Measurement of Loudness Part I: Methods Problems and Pitfalls. 17--56. 
https:\/\/doi.org\/10.1007\/978-1-4419-6712-1_2","DOI":"10.1007\/978-1-4419-6712-1_2"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.25080\/Majora-7b98e3ed-003"},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178877"},{"key":"e_1_3_2_2_26_1","volume-title":"Speech Recognition Using On-Line Estimation Of Speaking Rate. In Fifth European Conference on Speech Communication and Technology, EUROSPEECH","volume":"4","author":"Morgan Nelson","year":"1997","unstructured":"Nelson Morgan , Eric Fosler-Lussier , and Nikki Mirghafori . 1997 . Speech Recognition Using On-Line Estimation Of Speaking Rate. In Fifth European Conference on Speech Communication and Technology, EUROSPEECH 1997, Vol. 4 . Nelson Morgan, Eric Fosler-Lussier, and Nikki Mirghafori. 1997. Speech Recognition Using On-Line Estimation Of Speaking Rate. In Fifth European Conference on Speech Communication and Technology, EUROSPEECH 1997, Vol. 4."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2011-317"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSC.2010.41"},{"key":"e_1_3_2_2_29_1","unstructured":"The WebRTC project authors. 2011 (accessed Novemver 2019). The WebRTC project. https:\/\/webrtc.org\/  The WebRTC project authors. 2011 (accessed Novemver 2019). The WebRTC project. https:\/\/webrtc.org\/"},{"key":"e_1_3_2_2_30_1","volume-title":"Discrete-Time Speech Signal Processing: Principles and Practice","author":"Quatieri Thomas F.","unstructured":"Thomas F. Quatieri . 2001. Discrete-Time Speech Signal Processing: Principles and Practice ( 1 st ed.). Prentice Hall , USA. Thomas F. Quatieri. 2001. Discrete-Time Speech Signal Processing: Principles and Practice (1st ed.). Prentice Hall, USA.","edition":"1"},{"key":"e_1_3_2_2_31_1","volume-title":"The Pytorch-kaldi Speech Recognition Toolkit. 
ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"Ravanelli Mirco","year":"2018","unstructured":"Mirco Ravanelli , Titouan Parcollet , and Yoshua Bengio . 2018 . The Pytorch-kaldi Speech Recognition Toolkit. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2018), 6465--6469. Mirco Ravanelli, Titouan Parcollet, and Yoshua Bengio. 2018. The Pytorch-kaldi Speech Recognition Toolkit. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2018), 6465--6469."},{"key":"e_1_3_2_2_32_1","volume-title":"A comparative analysis of speech rate and perception in radio bulletins. Text and Talk 32 (05","author":"Rodero Emma","year":"2012","unstructured":"Emma Rodero . 2012. A comparative analysis of speech rate and perception in radio bulletins. Text and Talk 32 (05 2012 ), 391--411. https:\/\/doi.org\/10.1515\/text2012-0019 10.1515\/text2012-0019 Emma Rodero. 2012. A comparative analysis of speech rate and perception in radio bulletins. Text and Talk 32 (05 2012), 391--411. https:\/\/doi.org\/10.1515\/text2012-0019"},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1080\/026992000298931"},{"volume-title":"Computer Analysis of Human Behavior","author":"Schuller B.","key":"e_1_3_2_2_34_1","unstructured":"B. Schuller . 2011. Voice and speech analysis in search of states and traits . In Computer Analysis of Human Behavior . Springer , 227--253. B. Schuller. 2011. Voice and speech analysis in search of states and traits. In Computer Analysis of Human Behavior. Springer, 227--253."},{"volume-title":"The INTERSPEECH 2010 paralinguistic challenge. In INTERSPEECH 2010.","author":"Schuller Bj\u00f6rn W.","key":"e_1_3_2_2_35_1","unstructured":"Bj\u00f6rn W. Schuller , Stefan Steidl , Anton Batliner , Felix Burkhardt , Laurence Devillers , Christian A. M\u00fcller , and Shrikanth S. Narayanan . 2010 . 
The INTERSPEECH 2010 paralinguistic challenge. In INTERSPEECH 2010. Bj\u00f6rn W. Schuller, Stefan Steidl, Anton Batliner, Felix Burkhardt, Laurence Devillers, Christian A. M\u00fcller, and Shrikanth S. Narayanan. 2010. The INTERSPEECH 2010 paralinguistic challenge. In INTERSPEECH 2010."},{"key":"e_1_3_2_2_36_1","volume-title":"Voice of Authority: Professionals Lower Their Vocal Frequencies When Giving Expert Advice. Journal of Nonverbal Behavior (05","author":"Sorokowski Piotr","year":"2019","unstructured":"Piotr Sorokowski , David Puts , Janie Johnson , Olga \u00f3kiewicz , Agnieszka Sorokowska , Marta Kowal , Basia Borkowska , and Katarzyna Pisanski . 2019. Voice of Authority: Professionals Lower Their Vocal Frequencies When Giving Expert Advice. Journal of Nonverbal Behavior (05 2019 ). https:\/\/doi.org\/10.1007\/s10919-019-00307-0 10.1007\/s10919-019-00307-0 Piotr Sorokowski, David Puts, Janie Johnson, Olga \u00f3kiewicz, Agnieszka Sorokowska, Marta Kowal, Basia Borkowska, and Katarzyna Pisanski. 2019. Voice of Authority: Professionals Lower Their Vocal Frequencies When Giving Expert Advice. Journal of Nonverbal Behavior (05 2019). 
https:\/\/doi.org\/10.1007\/s10919-019-00307-0"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472768"},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2012-97"},{"key":"e_1_3_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2006-204"}],"event":{"name":"KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data"],"location":"Virtual Event CA USA","acronym":"KDD '20"},"container-title":["Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394486.3403326","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3394486.3403326","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:01:49Z","timestamp":1750197709000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394486.3403326"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,20]]},"references-count":40,"alternative-id":["10.1145\/3394486.3403326","10.1145\/3394486"],"URL":"https:\/\/doi.org\/10.1145\/3394486.3403326","relation":{},"subject":[],"published":{"date-parts":[[2020,8,20]]},"assertion":[{"value":"2020-08-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}