{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,14]],"date-time":"2025-10-14T00:42:56Z","timestamp":1760402576857,"version":"build-2065373602"},"reference-count":32,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2020,4,23]],"date-time":"2020-04-23T00:00:00Z","timestamp":1587600000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002531","name":"Kyungpook National University","doi-asserted-by":"publisher","award":["Research Fund, 2018"],"award-info":[{"award-number":["Research Fund, 2018"]}],"id":[{"id":"10.13039\/501100002531","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Applied Sciences"],"abstract":"<jats:p>In this paper, methods to estimate the number of basis vectors of the nonnegative matrix factorization (NMF) of automatic music transcription (AMT) systems are proposed. Previously, studies on NMF-based AMT have demonstrated that the number of basis vectors affects the performance and that the number of note events can be a good selection as the rank of NMF. However, many NMF-based AMT methods do not provide a method to estimate the appropriate number of basis vectors; instead, the number is assumed to be given in advance, even though the number of basis vectors significantly affects the algorithm\u2019s performance. Recently, based on Bayesian methods, certain estimation algorithms for the number of basis vectors have been proposed; however, they are not designed to be used as music transcription algorithms but are components of specific NMF methods and thus cannot be used generally as NMF-based transcription algorithms. Our proposed estimation algorithms are based on eigenvalue decomposition and Stein\u2019s unbiased risk estimator (SURE). 
Because the SURE method requires the variance of the undesired components as a priori knowledge, the proposed algorithms estimate the value using random matrix theory and first and second onset information in the input music signal. Experiments were then conducted for the AMT task using the MIDI-aligned piano sounds (MAPS) database, and these algorithms were compared with variational NMF, gamma process NMF, and NMF with automatic relevance determination algorithms. Based on experimental results, the conventional NMF-based transcription algorithm with the proposed rank estimation algorithms demonstrated F1 score improvements of 2\u20133% over these algorithms. While the performance advantages are not significantly large, the results are meaningful because the proposed algorithms are lightweight, are easy to combine with any other NMF methods that require an a priori rank parameter, and do not have setting parameters that considerably affect the performance.<\/jats:p>","DOI":"10.3390\/app10082911","type":"journal-article","created":{"date-parts":[[2020,4,23]],"date-time":"2020-04-23T10:46:22Z","timestamp":1587638782000},"page":"2911","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Estimating the Rank of a Nonnegative Matrix Factorization Model for Automatic Music Transcription Based on Stein\u2019s Unbiased Risk Estimator"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8220-192X","authenticated-orcid":false,"given":"Seokjin","family":"Lee","sequence":"first","affiliation":[{"name":"School of Electronics Engineering, Kyungpook National University, Daegu 41566, Korea"}]}],"member":"1968","published-online":{"date-parts":[[2020,4,23]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1066","DOI":"10.1109\/TASL.2006.885253","article-title":"Monaural sound source separation by nonnegative matrix factorization with temporal 
continuity and sparseness criteria","volume":"15","author":"Virtanen","year":"2007","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1038\/44565","article-title":"Learning the parts of objects by non-negative matrix factorization","volume":"401","author":"Lee","year":"1999","journal-title":"Nature"},{"key":"ref_3","unstructured":"Lee, D.D., and Seung, H.S. (2001). Algorithms for non-negative matrix factorization. Advances in Neural Information Processing Systems, MIT Press."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1016\/j.csda.2006.11.006","article-title":"Algorithms and applications for approximate nonnegative matrix factorization","volume":"52","author":"Berry","year":"2007","journal-title":"Comput. Stat. Data Anal."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Wang, H., Wang, M., Li, J., Song, L., and Hao, Y. (2019). A Novel Signal Separation Method Based on Improved Sparse Non-Negative Matrix Factorization. Entropy, 21.","DOI":"10.3390\/e21050445"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1109\/MSP.2018.2877582","article-title":"Nonnegative Matrix Factorization for Signal and Data Analytics: Identifiability, Algorithms, and Applications","volume":"36","author":"Fu","year":"2019","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"3909","DOI":"10.1109\/TGRS.2017.2683719","article-title":"Total variation regularized reweighted sparse nonnegative matrix factorization for hyperspectral unmixing","volume":"55","author":"He","year":"2017","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1109\/TCI.2017.2693967","article-title":"Distributed blind hyperspectral unmixing via joint sparsity and low-rank constrained non-negative matrix factorization","volume":"3","author":"Tsinos","year":"2017","journal-title":"IEEE Trans. Comput. Imaging"},{"key":"ref_9","unstructured":"Smaragdis, P., and Brown, J.C. (2003, January 19\u201322). Non-negative matrix factorization for polyphonic music transcription. Proceedings of the 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, USA."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Vincent, E., Bertin, N., and Badeau, R. (April, January 31). Harmonic and inharmonic nonnegative matrix factorization for polyphonic pitch transcription. Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA.","DOI":"10.1109\/ICASSP.2008.4517558"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1109\/MSP.2018.2869928","article-title":"Automatic music transcription: An overview","volume":"36","author":"Benetos","year":"2018","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1610","DOI":"10.1587\/transfun.E95.A.1610","article-title":"Polyphonic Music Transcription by Nonnegative Matrix Factorization with Harmonicity and Temporality Criteria","volume":"95","author":"Park","year":"2012","journal-title":"IEICE Trans. Fundam. Electron. Commun. Comput. Sci."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2009\/785152","article-title":"Bayesian inference for nonnegative matrix factorisation models","volume":"2009","author":"Cemgil","year":"2009","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_14","unstructured":"Hoffman, M.D., Blei, D.M., and Cook, P.R. (2010, January 21\u201324). 
Bayesian nonparametric matrix factorization for recorded music. Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1592","DOI":"10.1109\/TPAMI.2012.240","article-title":"Automatic relevance determination in nonnegative matrix factorization with the \u03b2-divergence","volume":"35","author":"Tan","year":"2012","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"5804","DOI":"10.1109\/TSP.2008.2005865","article-title":"Dimension estimation in noisy PCA with SURE and random matrix theory","volume":"56","author":"Ulfarsson","year":"2008","journal-title":"IEEE Trans. Signal Process."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1135","DOI":"10.1214\/aos\/1176345632","article-title":"Estimation of the mean of a multivariate normal distribution","volume":"9","author":"Stein","year":"1981","journal-title":"Ann. Stat."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2276","DOI":"10.1587\/transinf.2019EDL8049","article-title":"Estimation of the Matrix Rank of Harmonic Components of a Spectrogram in a Piano Music Signal Based on the Stein\u2019s Unbiased Risk Estimator and Median Filter","volume":"102","author":"Lee","year":"2019","journal-title":"IEICE Trans. Inf. Syst."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1087","DOI":"10.1109\/TNNLS.2012.2197827","article-title":"Online nonnegative matrix factorization with robust stochastic approximation","volume":"23","author":"Guan","year":"2012","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"980","DOI":"10.1587\/transfun.E96.A.980","article-title":"RLS-based on-line sparse nonnegative matrix factorization method for acoustic signal processing systems","volume":"96","author":"Lee","year":"2013","journal-title":"IEICE Trans. Fundam. Electron. 
Commun. Comput. Sci."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Ulfarsson, M.O., and Solo, V. (2013, January 26\u201331). Tuning parameter selection for nonnegative matrix factorization. Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada.","DOI":"10.1109\/ICASSP.2013.6638936"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Cichocki, A., Zdunek, R., Phan, A.H., and Amari, S.I. (2009). Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation, John Wiley & Sons.","DOI":"10.1002\/9780470747278"},{"key":"ref_23","unstructured":"Wang, B., and Plumbley, M.D. (2005, January 23\u201324). Musical audio stream separation by non-negative matrix factorization. Proceedings of the DMRN Summer Conference, Glasgow, UK."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"794","DOI":"10.1137\/16M1080999","article-title":"The nonnegative rank of a matrix: Hard problems, easy solutions","volume":"59","author":"Shitov","year":"2017","journal-title":"SIAM Rev."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1016\/0024-3795(93)90224-C","article-title":"Nonnegative Ranks, Decompositions, and Factorizations of Nonnegative Matrices","volume":"190","author":"Cohen","year":"1993","journal-title":"Linear Algebra Appl."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1109\/TSP.2008.2008212","article-title":"Generalized SURE for exponential families: Applications to regularization","volume":"57","author":"Eldar","year":"2008","journal-title":"IEEE Trans. Signal Process."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1382","DOI":"10.1016\/j.jmva.2005.08.003","article-title":"Eigenvalues of large sample covariance matrices of spiked population models","volume":"97","author":"Baik","year":"2006","journal-title":"J. Multivar. 
Anal."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Klapuri, A. (1999, January 15\u201319). Sound onset detection by applying psychoacoustic knowledge. Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, Phoenix, AZ, USA.","DOI":"10.1109\/ICASSP.1999.757494"},{"key":"ref_29","unstructured":"Dixon, S. (2006, January 18\u201320). Onset detection revisited. Proceedings of the 9th International Conference on Digital Audio Effects, Montreal, QC, Canada."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1643","DOI":"10.1109\/TASL.2009.2038819","article-title":"Multipitch estimation of piano sounds using a new probabilistic spectral smoothness principle","volume":"18","author":"Emiya","year":"2009","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_31","unstructured":"Sch\u00f6rkhuber, C., Klapuri, A., Holighaus, N., and D\u00f6rfler, M. (2014, January 27\u201329). A Matlab toolbox for efficient perfect reconstruction time-frequency transforms with log-frequency resolution. Proceedings of the Audio Engineering Society Conference: 53rd International Conference: Semantic Audio, London, UK."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1121\/1.396427","article-title":"Measurement of pitch by subharmonic summation","volume":"83","author":"Hermes","year":"1988","journal-title":"J. Acoust. Soc. 
Am."}],"container-title":["Applied Sciences"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2076-3417\/10\/8\/2911\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T13:45:17Z","timestamp":1760363117000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2076-3417\/10\/8\/2911"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,23]]},"references-count":32,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2020,4]]}},"alternative-id":["app10082911"],"URL":"https:\/\/doi.org\/10.3390\/app10082911","relation":{},"ISSN":["2076-3417"],"issn-type":[{"type":"electronic","value":"2076-3417"}],"subject":[],"published":{"date-parts":[[2020,4,23]]}}}