{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T21:54:52Z","timestamp":1775253292557,"version":"3.50.1"},"reference-count":52,"publisher":"ASME International","issue":"5","license":[{"start":{"date-parts":[[2022,1,13]],"date-time":"2022-01-13T00:00:00Z","timestamp":1642032000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.asme.org\/publications-submissions\/publishing-information\/legal-policies"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["2026276"],"award-info":[{"award-number":["2026276"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["2024656"],"award-info":[{"award-number":["2024656"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["asmedigitalcollection.asme.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Voice recognition has become an integral part of our lives, commonly used in call centers and as part of virtual assistants. However, voice recognition is increasingly applied to more industrial uses. Each of these use cases has unique characteristics that may impact the effectiveness of voice recognition, which could impact industrial productivity, performance, or even safety. One of the most prominent among them is the unique background noises that are dominant in each industry. The existence of different machinery and different work layouts are primary contributors to this. Another important characteristic is the type of communication that is present in these settings. Daily communication often involves longer sentences uttered under relatively silent conditions, whereas communication in industrial settings is often short and conducted in loud conditions. In this study, we demonstrated the importance of taking these two elements into account by comparing the performances of two voice recognition algorithms under several background noise conditions: a regular Convolutional Neural Network (CNN)-based voice recognition algorithm to an Auto Speech Recognition (ASR)-based model with a denoising module. Our results indicate that there is a significant performance drop between the typical background noise use (white noise) and the rest of the background noises. Also, our custom ASR model with the denoising module outperformed the CNN-based model with an overall performance increase between 14\u201335% across all background noises. Both results give proof that specialized voice recognition algorithms need to be developed for these environments to reliably deploy them as control mechanisms.<\/jats:p>","DOI":"10.1115\/1.4053521","type":"journal-article","created":{"date-parts":[[2022,1,13]],"date-time":"2022-01-13T03:12:39Z","timestamp":1642043559000},"update-policy":"https:\/\/doi.org\/10.1115\/crossmarkpolicy-asme","source":"Crossref","is-referenced-by-count":15,"title":["The Effect of Different Occupational Background Noises on Voice Recognition Accuracy"],"prefix":"10.1115","volume":"22","author":[{"given":"Song","family":"Li","sequence":"first","affiliation":[{"name":"University of Florida Department of Computer Information, Science and Engineering, , 303 Weil Hall, Gainesville, FL 32603 ;"},{"name":"Johns Hopkins University Department of Computer Science, , 3400 N Charles Street, Baltimore, MD 21218"}]},{"given":"Mustafa Ozkan","family":"Yerebakan","sequence":"additional","affiliation":[{"name":"University of Florida Department of Industrial and, Systems Engineering, , 303 Weil Hall, Gainesville, FL 32603"}]},{"given":"Yue","family":"Luo","sequence":"additional","affiliation":[{"name":"University of Florida Department of Industrial and, Systems Engineering, , 303 Weil Hall, Gainesville, FL 32603"}]},{"given":"Ben","family":"Amaba","sequence":"additional","affiliation":[{"name":"IBM , Blue Lagoon Drive, Miami, FL 33126"}]},{"given":"William","family":"Swope","sequence":"additional","affiliation":[{"name":"IBM , 410 Robin Hood Cir Unit 102, Naple, FL 34104"}]},{"given":"Boyi","family":"Hu","sequence":"additional","affiliation":[{"name":"University of Florida Department of Industrial and, Systems Engineering, , 303 Weil Hall, Gainesville, FL 32603"}]}],"member":"33","published-online":{"date-parts":[[2022,3,31]]},"reference":[{"issue":"3","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"152","DOI":"10.1080\/10429247.2015.1054752","article-title":"Evaluation of Google\u2019s Voice Recognition and Sentence Classification for Health Care Applications","volume":"27","author":"Uddin","year":"2015","journal-title":"Eng. Manage. J."},{"key":"2023033100104056500_","article-title":"Towards Multimodal Emotion Recognition in German Speech Events in Cars Using Transfer Learning","author":"Cevher","year":"2019"},{"key":"2023033100104056500_","first-page":"1","article-title":"A Voice-Controlled Multi-Functional Smart Home Automation System","author":"Mittal","year":"2015"},{"key":"2023033100104056500_","article-title":"Speech and Voice Recognition Market by Type (SPEECH and Voice Recognition), End User (Automotive, Healthcare, BFSI, EDUCATION, Legal), Technology (Artificial Intelligence and NON-ARTIFICIAL Intelligence), and Geography\u2014Global Forecast to 2025","author":"Meticulous Market Research","year":"2019"},{"issue":"3","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"303","DOI":"10.1016\/j.rcim.2011.09.010","article-title":"Industrially Oriented Voice Control System","volume":"28","author":"Rogowski","year":"2012","journal-title":"Robot. Comput.-Integr. Manuf."},{"key":"2023033100104056500_","volume-title":"Safety and Health for Engineers","author":"Brauer","year":"2016"},{"key":"2023033100104056500_","article-title":"Automation, Robotics, and the Factory of the Future","author":"Tilley","year":"2020"},{"key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"144","DOI":"10.1016\/j.cie.2017.09.016","article-title":"Smart Operators in Industry 4.0: A Human-Centered Approach to Enhance Operators\u2019 Capabilities and Competencies Within the New Smart Factory Context","volume":"113","author":"Longo","year":"2017","journal-title":"Comput. Ind. Eng."},{"issue":"22","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"9921","DOI":"10.1073\/pnas.92.22.9921","article-title":"The Role of Voice Input for Human-Machine Communication","volume":"92","author":"Cohen","year":"1995","journal-title":"Proc. Natl. Acad. Sci. U. S. A."},{"key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1016\/j.mfglet.2020.09.001","article-title":"Voice-Enabled Assistants of the Operator 4.0 in the Social Smart Factory: Prospective Role and Challenges for an Advanced Human\u2013Machine Interaction","volume":"26","author":"Longo","year":"2020","journal-title":"Manuf. Lett."},{"key":"2023033100104056500_","first-page":"1","article-title":"Emergency Tractor Shut-Off Using a Voice Command System","author":"Rains","year":"2014"},{"key":"2023033100104056500_","first-page":"1","article-title":"Voice-Activated System to Remotely Control Industrial and Building Automation Systems Using Cloud Computing","author":"Valenzuela","year":"2013"},{"issue":"6","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"5046","DOI":"10.1109\/JIOT.2018.2854591","article-title":"Voice Activated Semi-autonomous Vehicle Using Off the Shelf Home Automation Hardware","volume":"5","author":"Solorio","year":"2018","journal-title":"IEEE Internet Things J."},{"key":"2023033100104056500_","first-page":"1","article-title":"Novice User Experiences With a Voice-Enabled Human\u2013Robot Interaction Tool","author":"Pleva","year":"2019"},{"issue":"5","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"801","DOI":"10.4218\/etrij.10.1510.0024","article-title":"Statistical Model-Based Noise Reduction Approach for Car Interior Applications to Speech Recognition","volume":"32","author":"Lee","year":"2010","journal-title":"ETRI J."},{"key":"2023033100104056500_","first-page":"158","article-title":"Voice-Controlled In-Vehicle Systems: Effects of Voice-Recognition Accuracy in the Presence of Background Noise","author":"Sokol","year":"2017"},{"key":"2023033100104056500_","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1007\/978-3-319-75677-6_29","volume-title":"Vehicle and Automotive Engineering","author":"Czap","year":"2018"},{"key":"2023033100104056500_","first-page":"336","article-title":"Voice Authentication by Text Dependent Single Utterance for In-Car Environment","author":"Tamoto","year":"2019"},{"key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"712","DOI":"10.1016\/j.procs.2019.11.022","article-title":"Voice-Controlled Autonomous Vehicle Using IoT","volume":"160","author":"Sachdev","year":"2019","journal-title":"Procedia Comput. Sci."},{"issue":"1","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18196\/jet.3147","article-title":"Open Source System for Smart Home Devices Based on Smartphone Virtual Assistant","volume":"3","author":"Susanto","year":"2019","journal-title":"J. Electr. Eng. UMY"},{"issue":"1","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41597-021-00937-4","article-title":"The COUGHVID Crowdsourcing Dataset, a Corpus for the Study of Large-Scale Cough Analysis Algorithms","volume":"8","author":"Orlandic","year":"2021","journal-title":"Sci. Data"},{"issue":"3","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"153","DOI":"10.1080\/00049158.1978.10674186","article-title":"Noise and Vibration Hazards in Chainsaw Operations: A Review","volume":"41","author":"Davis","year":"1978","journal-title":"Aust. For."},{"issue":"8","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"42","DOI":"10.5120\/5565-7646","article-title":"Literature Review on Automatic Speech Recognition","volume":"41","author":"Ghai","year":"2012","journal-title":"Int. J. Comput. Appl."},{"issue":"5","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"1060","DOI":"10.1109\/TASL.2013.2244083","article-title":"\u201cMachine Learning Paradigms for Speech Recognition: An Overview","volume":"21","author":"Deng","year":"2013","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."},{"key":"2023033100104056500_","first-page":"1","article-title":"English Spoken Digits Database Under Noise Conditions for Research: SDDN","author":"Ouisaadane","year":"2019"},{"key":"2023033100104056500_","first-page":"41","article-title":"Modulation-Based Detection of Speech in Real Background Noise: Generalization to Novel Background Classes","author":"Bach","year":"2010"},{"key":"2023033100104056500_","first-page":"2670","article-title":"Dynamic Noise Aware Training for Speech Enhancement Based on Deep Neural Networks","author":"Xu","year":"2014"},{"key":"2023033100104056500_","first-page":"1090","article-title":"Speech Recognition with no Speech or with Noisy Speech","author":"Krishna","year":"2019"},{"key":"2023033100104056500_","first-page":"4960","article-title":"Listen, Attend and Spell: A Neural Network for Large Vocabulary Conversational Speech Recognition","author":"Chan","year":"2016"},{"key":"2023033100104056500_","first-page":"3104","article-title":"Sequence to Sequence Learning with Neural Networks","author":"Sutskever","year":"2014"},{"key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"1724","DOI":"10.3115\/v1\/d14-1179","article-title":"Learning Phrase Representations Using RNN Encoder-Decoder For Statistical Machine Translation","author":"Cho","year":"2014"},{"key":"2023033100104056500_","article-title":"Common Voice by Mozilla","author":"Mozilla","year":"2017"},{"key":"2023033100104056500_","article-title":"Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition","author":"Warden","year":"2018"},{"key":"2023033100104056500_","doi-asserted-by":"publisher","DOI":"10.21437\/interspeech.2019-3087","article-title":"A Scalable Noisy Speech Dataset and Online Subjective Test Framework","author":"Reddy","year":"2019","journal-title":"Interspeech"},{"issue":"1","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"S3","DOI":"10.3109\/14992027.2011.635316","article-title":"Typical Noise Exposure in Daily Life","volume":"51","author":"Flamme","year":"2012","journal-title":"Int. J. Audiol."},{"key":"2023033100104056500_","doi-asserted-by":"publisher","DOI":"10.1177\/09544054211014492","article-title":"Environmental Effects on Reliability and Accuracy of MFCC Based Voice Recognition for Industrial Human-Robot-Interaction","volume":"235","author":"Birch","year":"2021","journal-title":"Proc. Inst. Mech. Eng. B: J. Eng. Manuf."},{"key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"103903","DOI":"10.1016\/j.engappai.2020.103903","article-title":"Performing Predefined Tasks Using the Human\u2013Robot Interaction on Speech Recognition for an Industrial Robot","volume":"95","author":"Bingol","year":"2020","journal-title":"Eng. Appl. Artif. Intell."},{"key":"2023033100104056500_","first-page":"1","article-title":"A Hybrid DSP\/Deep Learning Approach to Real-Time Full-Band Speech Enhancement","author":"Valin","year":"2018"},{"key":"2023033100104056500_","first-page":"5069","article-title":"A Wavenet for Speech Denoising","author":"Rethage","year":"2018"},{"key":"2023033100104056500_","doi-asserted-by":"crossref","DOI":"10.21437\/Interspeech.2017-1428","article-title":"SEGAN: Speech Enhancement Generative Adversarial Network","author":"Pascual","year":"2017"},{"issue":"1","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1136\/oem.2005.025924","article-title":"Do Ambient Noise Exposure Levels Predict Hearing Loss in a Modern Industrial Cohort?","volume":"64","author":"Rabinowitz","year":"2007","journal-title":"Occup. Environ. Med."},{"key":"2023033100104056500_","article-title":"Overall Statistics\u2014All U.S. Industries\u2014Ohl","author":"NIOSH","year":"2019"},{"issue":"6","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"888","DOI":"10.1016\/j.marpolbul.2010.01.003","article-title":"Assessing Underwater Noise Levels During Pile-Driving at an Offshore Windfarm and its Potential Effects on Marine Mammals","volume":"60","author":"Bailey","year":"2010","journal-title":"Mar. Pollut. Bull."},{"key":"2023033100104056500_","doi-asserted-by":"crossref","DOI":"10.1201\/b22272","volume-title":"Piling Engineering","author":"Fleming","year":"2008"},{"key":"2023033100104056500_","first-page":"6341","article-title":"Federated Learning for Keyword Spotting","author":"Leroy","year":"2019"},{"key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"814","DOI":"10.21437\/Interspeech.2019-2396","article-title":"Speech Model Pre-Training for End-to-End Spoken Language Understanding","author":"Lugosch","year":"2019"},{"key":"2023033100104056500_","article-title":"A Neural Attention Model for Speech Command Recognition","author":"de Andrade","year":"2018"},{"issue":"2","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"285","DOI":"10.1109\/JSTSP.2019.2909479","article-title":"Comparison and Analysis of Sample CNN Architectures for Audio Classification","volume":"13","author":"Kim","year":"2019","journal-title":"IEEE J. Sel. Top. Signal Process."},{"issue":"1","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1016\/S0346-251X(98)00049-9","article-title":"Voice Recognition Software Accuracy With Second Language Speakers of English","volume":"27","author":"Coniam","year":"1999","journal-title":"System"},{"key":"2023033100104056500_","first-page":"173","article-title":"Deep Speech 2: End-to-End Speech Recognition in English and Mandarin","author":"Amodei"},{"issue":"4","key":"2023033100104056500_","doi-asserted-by":"publisher","first-page":"3643","DOI":"10.11591\/ijece.v10i4","article-title":"A Novel Automatic Voice Recognition System Based on Text-Independent in a Noisy Environment","volume":"10","author":"Hamza","year":"2020","journal-title":"Int. J. Electr. Comput. Eng."},{"key":"2023033100104056500_","first-page":"052090","article-title":"The Software System Implementation of Speech Command Recognizer Under Intensive Background Noise","author":"Song","year":"2019"}],"container-title":["Journal of Computing and Information Science in Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/asmedigitalcollection.asme.org\/computingengineering\/article-pdf\/22\/5\/050905\/6869354\/jcise_22_5_050905.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/asmedigitalcollection.asme.org\/computingengineering\/article-pdf\/22\/5\/050905\/6869354\/jcise_22_5_050905.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,15]],"date-time":"2023-11-15T19:46:56Z","timestamp":1700077616000},"score":1,"resource":{"primary":{"URL":"https:\/\/asmedigitalcollection.asme.org\/computingengineering\/article\/22\/5\/050905\/1131336\/The-Effect-of-Different-Occupational-Background"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,31]]},"references-count":52,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,10,1]]}},"URL":"https:\/\/doi.org\/10.1115\/1.4053521","relation":{},"ISSN":["1530-9827","1944-7078"],"issn-type":[{"value":"1530-9827","type":"print"},{"value":"1944-7078","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,31]]},"article-number":"050905"}}