{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:41:21Z","timestamp":1760236881929,"version":"build-2065373602"},"reference-count":32,"publisher":"Walter de Gruyter GmbH","issue":"2","license":[{"start":{"date-parts":[[2022,12,1]],"date-time":"2022-12-01T00:00:00Z","timestamp":1669852800000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>The paper proposes a framework for recording meetings that avoids the hassle of writing meeting minutes by hand. The key components of the framework are the \u201cModel Trainer\u201d and the \u201cMeeting Recorder\u201d. In the Model Trainer, we first remove noise from the audio, then oversample the data and extract features from the audio; finally, we train the classification model. The Meeting Recorder is a post-processor that performs sound recognition using the trained model and converts the audio into text. 
Experimental results show the high accuracy and effectiveness of the proposed implementation.<\/jats:p>","DOI":"10.2478\/acss-2022-0019","type":"journal-article","created":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T11:36:19Z","timestamp":1674560179000},"page":"183-189","source":"Crossref","is-referenced-by-count":1,"title":["An Intelligent Framework for Person Identification Using Voice Recognition and Audio Data Classification"],"prefix":"10.2478","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5604-5484","authenticated-orcid":false,"given":"Isra","family":"Khan","sequence":"first","affiliation":[{"name":"College of Computing and Information Sciences, Karachi Institute of Economics and Technology , Karachi , Pakistan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3080-4769","authenticated-orcid":false,"given":"Shah Muhammad","family":"Emaduddin","sequence":"additional","affiliation":[{"name":"College of Computing and Information Sciences, Karachi Institute of Economics and Technology , Karachi , Pakistan"}]},{"given":"Ashhad","family":"Ullah","sequence":"additional","affiliation":[{"name":"College of Computing and Information Sciences, Karachi Institute of Economics and Technology , Karachi , Pakistan"}]},{"given":"A Rafi","family":"Ullah","sequence":"additional","affiliation":[{"name":"Optimizia , Karachi , Pakistan"}]}],"member":"374","published-online":{"date-parts":[[2023,1,24]]},"reference":[{"doi-asserted-by":"crossref","unstructured":"[1] D.S. Park, W. Chan, Y. Zhang, C.C. Chiu, B. Zoph, E.D. Cubuk, and Q.V. Le, \u201cSpecaugment: A simple data augmentation method for automatic speech recognition,\u201d 2019. arXiv preprint arXiv:1904.08779. https:\/\/doi.org\/10.48550\/arXiv.1904.08779","key":"2025101108011115751_j_acss-2022-0019_ref_001","DOI":"10.21437\/Interspeech.2019-2680"},{"doi-asserted-by":"crossref","unstructured":"[2] T. Fukuda, O. Ichikawa, and M. 
Nishimura, \u201cDetecting breathing sounds in realistic Japanese telephone conversations and its application to automatic speech recognition,\u201d Speech Communication, vol. 98, pp. 95\u2013103, Apr. 2018. https:\/\/doi.org\/10.1016\/j.specom.2018.01.008","key":"2025101108011115751_j_acss-2022-0019_ref_002","DOI":"10.1016\/j.specom.2018.01.008"},{"doi-asserted-by":"crossref","unstructured":"[3] M. Wickert, \u201cReal-time digital signal processing using pyaudio_helper and the ipywidgets,\u201d in Proceedings of the 17th Python in Science Conference, Austin, TX, USA, Jul. 2018, pp. 9\u201315. https:\/\/doi.org\/10.25080\/Majora-4af1f417-00e","key":"2025101108011115751_j_acss-2022-0019_ref_003","DOI":"10.25080\/Majora-4af1f417-00e"},{"unstructured":"[4] A. Srivastava and S. Maheshwari, \u201cSignal denoising and multiresolution analysis by discrete wavelet transform,\u201d Innovative Trends in Applied Physical, Chemical, Mathematical Sciences and Emerging Energy Technology for Sustainable Development, 2015.","key":"2025101108011115751_j_acss-2022-0019_ref_004"},{"doi-asserted-by":"crossref","unstructured":"[5] J. P. Dron and F. Bolaers, \u201cImprovement of the sensitivity of the scalar indicators (crest factor, kurtosis) using a de-noising method by spectral subtraction: application to the detection of defects in ball bearings,\u201d Journal of Sound and Vibration, vol. 270, no. 1\u20132, pp. 61\u201373, Feb. 2004. https:\/\/doi.org\/10.1016\/S0022-460X(03)00483-8","key":"2025101108011115751_j_acss-2022-0019_ref_005","DOI":"10.1016\/S0022-460X(03)00483-8"},{"unstructured":"[6] E. Eban, A. Jansen, and S. Chaudhuri, \u201cFiltering wind noises in video content,\u201d U.S. Patent Application 15\/826 622, March 22, 2018.","key":"2025101108011115751_j_acss-2022-0019_ref_006"},{"doi-asserted-by":"crossref","unstructured":"[7] B.B. Ali, W. Wojcik, O. Mamyrbayev, M. Turdalyuly, and N. 
Mekebayev, \u201cSpeech recognizer-based non-uniform spectral compression for robust MFCC feature extraction,\u201d Przeglad Elektrotechniczny, vol. 94, no. 6, pp. 90\u201393, Jun. 2018. https:\/\/doi.org\/10.15199\/48.2018.06.17","key":"2025101108011115751_j_acss-2022-0019_ref_007","DOI":"10.15199\/48.2018.06.17"},{"doi-asserted-by":"crossref","unstructured":"[8] \u00c7.P. Dautov and M.S. \u00d6zerdem, \u201cWavelet transform and signal denoising using Wavelet method,\u201d in 2018 26th Signal Processing and Communications Applications Conference (SIU), Izmir, Turkey, May 2018, pp. 1\u20134. https:\/\/doi.org\/10.1109\/SIU.2018.8404418","key":"2025101108011115751_j_acss-2022-0019_ref_008","DOI":"10.1109\/SIU.2018.8404418"},{"doi-asserted-by":"crossref","unstructured":"[9] R. Liu, L.O. Hall, K.W. Bowyer, D.B. Goldgof, R. Gatenby, and K.B. Ahmed, \u201cSynthetic minority image over-sampling technique: How to improve AUC for glioblastoma patient survival prediction,\u201d in 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Banff, AB, Canada, Oct. 2017, pp. 1357\u20131362. https:\/\/doi.org\/10.1109\/SMC.2017.8122802","key":"2025101108011115751_j_acss-2022-0019_ref_009","DOI":"10.1109\/SMC.2017.8122802"},{"doi-asserted-by":"crossref","unstructured":"[10] R. Lotfian and C. Busso, \u201cOver-sampling emotional speech data based on subjective evaluations provided by multiple individuals,\u201d IEEE Transactions on Affective Computing, vol. 12, no. 4, pp. 870\u2013882, Feb. 2019. https:\/\/doi.org\/10.1109\/TAFFC.2019.2901465","key":"2025101108011115751_j_acss-2022-0019_ref_010","DOI":"10.1109\/TAFFC.2019.2901465"},{"unstructured":"[11] I. Khan, A. Ullah, and S.M. Emad, \u201cRobust feature extraction techniques in speech recognition: A comparative analysis,\u201d KIET Journal of Computing and Information Sciences, vol. 2, no. 2, p. 11, 2019.","key":"2025101108011115751_j_acss-2022-0019_ref_011"},{"doi-asserted-by":"crossref","unstructured":"[12] E. 
Mulyanto, E.M. Yuniarno, and M.H. Purnomo, \u201cAdding an emotions filter to Javanese text-to-speech system,\u201d in 2018 International Conference on Computer Engineering, Network and Intelligent Multimedia (CENIM), Surabaya, Indonesia, Nov. 2018, pp. 142\u2013146. https:\/\/doi.org\/10.1109\/CENIM.2018.8711229","key":"2025101108011115751_j_acss-2022-0019_ref_012","DOI":"10.1109\/CENIM.2018.8711229"},{"doi-asserted-by":"crossref","unstructured":"[13] H. Liao, G. Pundak, O. Siohan, M.K. Carroll, N. Coccaro, Q.M. Jiang, T.N. Sainath, A. Senior, F. Beaufays, and M. Bacchiani, \u201cLarge vocabulary automatic speech recognition for children,\u201d in Sixteenth Annual Conference of the International Speech Communication Association, Dresden, Germany, Sep. 2015, pp. 1611\u20131615. https:\/\/doi.org\/10.21437\/Interspeech.2015-373","key":"2025101108011115751_j_acss-2022-0019_ref_013","DOI":"10.21437\/Interspeech.2015-373"},{"doi-asserted-by":"crossref","unstructured":"[14] K.E. Kafoori and S.M. Ahadi, \u201cRobust recognition of noisy speech through partial imputation of missing data,\u201d Circuits, Systems, and Signal Processing, vol. 37, no. 4, pp. 1625\u20131648, Apr. 2018. https:\/\/doi.org\/10.1007\/s00034-017-0616-4","key":"2025101108011115751_j_acss-2022-0019_ref_014","DOI":"10.1007\/s00034-017-0616-4"},{"doi-asserted-by":"crossref","unstructured":"[15] H.F.C. Chuctaya, R.N.M. Mercado, and J.J.G. Gaona, \u201cIsolated automatic speech recognition of Quechua numbers using MFCC, DTW and KNN,\u201d Int. J. Adv. Comput. Sci. Appl, vol. 9, no. 10, pp. 24\u201329, 2018. https:\/\/doi.org\/10.14569\/IJACSA.2018.091003","key":"2025101108011115751_j_acss-2022-0019_ref_015","DOI":"10.14569\/IJACSA.2018.091003"},{"doi-asserted-by":"crossref","unstructured":"[16] A. Winursito, R. Hidayat, A. Bejo, and M.N.Y. 
Utomo, \u201cFeature data reduction of MFCC using PCA and SVD in speech recognition system,\u201d in 2018 International Conference on Smart Computing and Electronic Enterprise (ICSCEE), Shah Alam, Malaysia, Jul. 2018, pp. 1\u20136. https:\/\/doi.org\/10.1109\/ICSCEE.2018.8538414","key":"2025101108011115751_j_acss-2022-0019_ref_016","DOI":"10.1109\/ICSCEE.2018.8538414"},{"unstructured":"[17] L.N. Thu, A. Win, and H.N. Oo, \u201cA review for reduction of noise by wavelet transform in audio signals,\u201d International Research Journal of Engineering and Technology (IRJET), vol. 6, no. 5, May 2019.","key":"2025101108011115751_j_acss-2022-0019_ref_017"},{"doi-asserted-by":"crossref","unstructured":"[18] Y. Luo and N. Mesgarani, \u201cTasnet: time-domain audio separation network for real-time, single-channel speech separation,\u201d in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, Apr. 2018, pp. 696\u2013700. https:\/\/doi.org\/10.1109\/ICASSP.2018.8462116","key":"2025101108011115751_j_acss-2022-0019_ref_018","DOI":"10.1109\/ICASSP.2018.8462116"},{"doi-asserted-by":"crossref","unstructured":"[19] E. Ramentol, Y. Caballero, R. Bello, and F. Herrera, \u201cSMOTE-RSB*: a hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory,\u201d Knowledge and Information Systems, vol. 33, no. 2, pp. 245\u2013265, Nov. 2012. https:\/\/doi.org\/10.1007\/s10115-011-0465-6","key":"2025101108011115751_j_acss-2022-0019_ref_019","DOI":"10.1007\/s10115-011-0465-6"},{"doi-asserted-by":"crossref","unstructured":"[20] L. Abdi and S. Hashemi, \u201cTo combat multi-class imbalanced problems by means of over-sampling techniques,\u201d IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 1, pp. 238\u2013251, Jul. 2015. 
https:\/\/doi.org\/10.1109\/TKDE.2015.2458858","key":"2025101108011115751_j_acss-2022-0019_ref_020","DOI":"10.1109\/TKDE.2015.2458858"},{"doi-asserted-by":"crossref","unstructured":"[21] A.E. Martin, \u201cA compositional neural architecture for language,\u201d Journal of Cognitive Neuroscience, vol. 32, no. 8, pp. 1407\u20131427, Aug. 2020. https:\/\/doi.org\/10.1162\/jocn_a_01552","key":"2025101108011115751_j_acss-2022-0019_ref_021","DOI":"10.1162\/jocn_a_01552"},{"doi-asserted-by":"crossref","unstructured":"[22] S. B\u00f6ck, F. Korzeniowski, J. Schl\u00fcter, F. Krebs, and G. Widmer, \u201cMadmom: A new Python audio and music signal processing library,\u201d in Proceedings of the 24th ACM International Conference on Multimedia, Oct. 2016, pp. 1174\u20131178. https:\/\/doi.org\/10.1145\/2964284.2973795","key":"2025101108011115751_j_acss-2022-0019_ref_022","DOI":"10.1145\/2964284.2973795"},{"unstructured":"[23] Z. Wang, \u201cAudio processing method and apparatus based on artificial intelligence,\u201d Baidu Online Network Technology Beijing Co Ltd, U.S. Patent Application 10\/192163, 2019.","key":"2025101108011115751_j_acss-2022-0019_ref_023"},{"unstructured":"[24] J.P. Cunningham and Z. Ghahramani, \u201cLinear dimensionality reduction: Survey, insights, and generalizations,\u201d The Journal of Machine Learning Research, vol. 16, no. 1, pp. 2859\u20132900, 2015. https:\/\/stat.columbia.edu\/~cunningham\/pdf\/CunninghamJMLR2015.pdf","key":"2025101108011115751_j_acss-2022-0019_ref_024"},{"doi-asserted-by":"crossref","unstructured":"[25] Y. LeCun, Y. Bengio, and G. Hinton, \u201cDeep learning,\u201d Nature, vol. 521, pp. 436\u2013444, May 2015. https:\/\/doi.org\/10.1038\/nature14539","key":"2025101108011115751_j_acss-2022-0019_ref_025","DOI":"10.1038\/nature14539"},{"doi-asserted-by":"crossref","unstructured":"[26] W. Chan, N. Jaitly, Q. Le, and O. 
Vinyals, \u201cListen, attend and spell: A neural network for large vocabulary conversational speech recognition,\u201d in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, Mar. 2016, pp. 4960\u20134964. https:\/\/doi.org\/10.1109\/ICASSP.2016.7472621","key":"2025101108011115751_j_acss-2022-0019_ref_026","DOI":"10.1109\/ICASSP.2016.7472621"},{"unstructured":"[27] J.K. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho, and Y. Bengio, \u201cAttention-based models for speech recognition,\u201d in Advances in Neural Information Processing Systems, Proceedings of the Annual Conference on Neural Information Processing Systems 2015, Montreal, QC, Canada, 2015, pp. 577\u2013585. https:\/\/proceedings.neurips.cc\/paper\/2015\/file\/1068c6e4c8051cfd4e9ea8072e3189e2-Paper.pdf","key":"2025101108011115751_j_acss-2022-0019_ref_027"},{"doi-asserted-by":"crossref","unstructured":"[28] V.Z. K\u00ebpuska and H.A. Elharati, \u201cRobust speech recognition system using conventional and hybrid features of MFCC, LPCC, PLP, RASTAPLP and hidden Markov model classifier in noisy conditions,\u201d Journal of Computer and Communications, vol. 3, no. 6, pp. 1\u20139, Jun. 2015. https:\/\/doi.org\/10.4236\/jcc.2015.36001","key":"2025101108011115751_j_acss-2022-0019_ref_028","DOI":"10.4236\/jcc.2015.36001"},{"doi-asserted-by":"crossref","unstructured":"[29] \u00c7.P. Dautov and M.S. \u00d6zerdem, \u201cWavelet transform and signal denoising using Wavelet method,\u201d in 2018 26th Signal Processing and Communications Applications Conference (SIU), Izmir, Turkey, May 2018, pp. 1\u20134. https:\/\/doi.org\/10.1109\/SIU.2018.8404418","key":"2025101108011115751_j_acss-2022-0019_ref_029","DOI":"10.1109\/SIU.2018.8404418"},{"doi-asserted-by":"crossref","unstructured":"[30] N.V. Chawla, K.W. Bowyer, L.O. Hall, and W.P. 
Kegelmeyer, \u201cSMOTE: synthetic minority over-sampling technique,\u201d Journal of Artificial Intelligence Research, vol. 16, pp. 321\u2013357, 2002. https:\/\/doi.org\/10.1613\/jair.953","key":"2025101108011115751_j_acss-2022-0019_ref_030","DOI":"10.1613\/jair.953"},{"doi-asserted-by":"crossref","unstructured":"[31] Z. T\u00fcske, R. Schl\u00fcter, and H. Ney, \u201cAcoustic modeling of speech waveform based on multi-resolution, neural network signal processing,\u201d in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, Apr. 2018, pp. 4859\u20134863. https:\/\/doi.org\/10.1109\/ICASSP.2018.8461871","key":"2025101108011115751_j_acss-2022-0019_ref_031","DOI":"10.1109\/ICASSP.2018.8461871"},{"doi-asserted-by":"crossref","unstructured":"[32] R. Shadiev, T.T. Wu, A. Sun, and Y.M. Huang, \u201cApplications of speech-to-text recognition and computer-aided translation for facilitating cross-cultural learning through a learning activity: issues and their solutions,\u201d Educational Technology Research and Development, vol. 66, no. 1, pp. 191\u2013214, Feb. 2018. 
https:\/\/doi.org\/10.1007\/s11423-017-9556-8","key":"2025101108011115751_j_acss-2022-0019_ref_032","DOI":"10.1007\/s11423-017-9556-8"}],"container-title":["Applied Computer Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.sciendo.com\/pdf\/10.2478\/acss-2022-0019","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T08:01:27Z","timestamp":1760169687000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.sciendo.com\/article\/10.2478\/acss-2022-0019"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,1]]},"references-count":32,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2023,1,24]]},"published-print":{"date-parts":[[2022,12,1]]}},"alternative-id":["10.2478\/acss-2022-0019"],"URL":"https:\/\/doi.org\/10.2478\/acss-2022-0019","relation":{},"ISSN":["2255-8691"],"issn-type":[{"type":"electronic","value":"2255-8691"}],"subject":[],"published":{"date-parts":[[2022,12,1]]}}}