{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T17:12:19Z","timestamp":1769706739784,"version":"3.49.0"},"reference-count":15,"publisher":"SAGE Publications","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IFS"],"published-print":{"date-parts":[[2023,7,2]]},"abstract":"<jats:p>Over the past decade, deep learning has enabled significant advances in the enhancement of noisy speech. Owing to the short-time stationarity of the speech signal, earlier speech enhancement (SE) methods estimated only the magnitude spectrum and reused the phase of the noisy mixture when reconstructing the speech. The performance of these approaches is limited because the phase also carries speech information. Later SE approaches were therefore developed to jointly estimate both magnitude and phase. Recently, complex-valued models such as the deep complex convolution recurrent network (DCCRN) have been proposed, but their computational cost is very high. In this work, we propose a Discrete Cosine Transform-based Densely Connected Convolutional Gated Recurrent Unit (DCTDCCGRU) model using a dilated dense block and stacked GRUs. The dense connectivity strengthens gradient propagation by concatenating features from previous layers at the input. The advantage of the dense block is that the dilated convolutions aid context aggregation at various resolutions, while the dense connectivity yields feature maps with more precise target information by passing them through multiple layers. To capture the correlation between neighboring noisy speech frames, a two-layer GRU is added in the bottleneck of the U-Net. The experimental findings demonstrate that the proposed model outperforms existing models in terms of STOI (short-time objective intelligibility), PESQ (perceptual evaluation of speech quality), and output SNR (signal-to-noise ratio).<\/jats:p>","DOI":"10.3233\/jifs-223951","type":"journal-article","created":{"date-parts":[[2023,5,9]],"date-time":"2023-05-09T15:45:31Z","timestamp":1683647131000},"page":"1195-1208","source":"Crossref","is-referenced-by-count":8,"title":["DCT based densely connected convolutional GRU for real-time speech enhancement"],"prefix":"10.1177","volume":"45","author":[{"given":"Chaitanya","family":"Jannu","sequence":"first","affiliation":[{"name":"School of Electronics Engineering, VIT-AP University, Amaravati, India"}]},{"given":"Sunny Dayal","family":"Vanambathina","sequence":"additional","affiliation":[{"name":"School of Electronics Engineering, VIT-AP University, Amaravati, India"}]}],"member":"179","reference":[{"issue":"1","key":"10.3233\/JIFS-223951_ref1","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1109\/T-C.1974.223784","article-title":"Discrete cosine transform","volume":"100","author":"Ahmed","year":"1974","journal-title":"IEEE Transactions on Computers"},{"issue":"4","key":"10.3233\/JIFS-223951_ref2","doi-asserted-by":"crossref","first-page":"943","DOI":"10.1121\/1.382599","article-title":"Image method for efficiently simulating small-room acoustics","volume":"65","author":"Allen","year":"1979","journal-title":"The Journal of the Acoustical Society of America"},{"issue":"5","key":"10.3233\/JIFS-223951_ref4","doi-asserted-by":"crossref","first-page":"1085","DOI":"10.1109\/TASLP.2017.2687829","article-title":"Features for masking-based monaural speech separation in reverberant conditions","volume":"25","author":"Delfarah","year":"2017","journal-title":"IEEE\/ACM Transactions on Audio, Speech, and Language Processing"},{"issue":"8","key":"10.3233\/JIFS-223951_ref16","doi-asserted-by":"crossref","first-page":"1256","DOI":"10.1109\/TASLP.2019.2915167","article-title":"Conv-TasNet: Surpassing ideal time\u2013frequency magnitude masking for speech separation","volume":"27","author":"Luo","year":"2019","journal-title":"IEEE\/ACM Transactions on Audio, Speech, and Language Processing"},{"issue":"11","key":"10.3233\/JIFS-223951_ref18","doi-asserted-by":"crossref","first-page":"1680","DOI":"10.1109\/LSP.2018.2871419","article-title":"A deep learning loss function based on the perceptual evaluation of the speech quality","volume":"25","author":"Martin-Donas","year":"2018","journal-title":"IEEE Signal Processing Letters"},{"issue":"10","key":"10.3233\/JIFS-223951_ref19","doi-asserted-by":"crossref","first-page":"e3","DOI":"10.23915\/distill.00003","article-title":"Deconvolution and checkerboard artifacts","volume":"1","author":"Odena","year":"2016","journal-title":"Distill"},{"issue":"4","key":"10.3233\/JIFS-223951_ref20","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1016\/j.specom.2010.12.003","article-title":"The importance of phase in speech enhancement","volume":"53","author":"Paliwal","year":"2011","journal-title":"Speech Communication"},{"key":"10.3233\/JIFS-223951_ref21","doi-asserted-by":"crossref","first-page":"1270","DOI":"10.1109\/TASLP.2021.3064421","article-title":"Dense CNN with self-attention for time-domain speech enhancement","volume":"29","author":"Pandey","year":"2021","journal-title":"IEEE\/ACM Transactions on Audio, Speech, and Language Processing"},{"issue":"11","key":"10.3233\/JIFS-223951_ref30","doi-asserted-by":"crossref","first-page":"1486","DOI":"10.1016\/j.specom.2006.09.003","article-title":"Binary and ratio time-frequency masks for robust speech recognition","volume":"48","author":"Srinivasan","year":"2006","journal-title":"Speech Communication"},{"issue":"3","key":"10.3233\/JIFS-223951_ref34","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1016\/0167-6393(93)90095-3","article-title":"Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems","volume":"12","author":"Varga","year":"1993","journal-title":"Speech Communication"},{"issue":"10","key":"10.3233\/JIFS-223951_ref35","doi-asserted-by":"crossref","first-page":"1702","DOI":"10.1109\/TASLP.2018.2842159","article-title":"Supervised speech separation based on deep learning: An overview","volume":"26","author":"Wang","year":"2018","journal-title":"IEEE\/ACM Transactions on Audio, Speech, and Language Processing"},{"issue":"12","key":"10.3233\/JIFS-223951_ref36","doi-asserted-by":"crossref","first-page":"1849","DOI":"10.1109\/TASLP.2014.2352935","article-title":"On training targets for supervised speech separation","volume":"22","author":"Wang","year":"2014","journal-title":"IEEE\/ACM Transactions on Audio, Speech, and Language Processing"},{"issue":"3","key":"10.3233\/JIFS-223951_ref37","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1109\/TASLP.2015.2512042","article-title":"Complex ratio masking for monaural speech separation","volume":"24","author":"Williamson","year":"2015","journal-title":"IEEE\/ACM Transactions on Audio, Speech, and Language Processing"},{"issue":"3","key":"10.3233\/JIFS-223951_ref38","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1109\/TASLP.2015.2512042","article-title":"Complex ratio masking for monaural speech separation","volume":"24","author":"Williamson","year":"2015","journal-title":"IEEE\/ACM Transactions on Audio, Speech, and Language Processing"},{"issue":"1","key":"10.3233\/JIFS-223951_ref39","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1109\/LSP.2013.2291240","article-title":"An experimental study on speech enhancement based on deep neural networks","volume":"21","author":"Xu","year":"2013","journal-title":"IEEE Signal Processing Letters"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/JIFS-223951","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T07:19:13Z","timestamp":1769671153000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/JIFS-223951"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,2]]},"references-count":15,"journal-issue":{"issue":"1"},"URL":"https:\/\/doi.org\/10.3233\/jifs-223951","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,2]]}}}