{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,29]],"date-time":"2025-09-29T08:11:33Z","timestamp":1759133493626},"reference-count":20,"publisher":"MIT Press","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computer Music Journal"],"published-print":{"date-parts":[[2012,12]]},"abstract":"<jats:p>In this work, a probabilistic model for multiple-instrument automatic music transcription is proposed. The model extends the shift-invariant probabilistic latent component analysis method, which is used for spectrogram factorization. Proposed extensions support the use of multiple spectral templates per pitch and per instrument source, as well as a time-varying pitch contribution for each source. Thus, this method can effectively be used for multiple-instrument automatic transcription. In addition, the shift-invariant aspect of the method can be exploited for detecting tuning changes and frequency modulations, as well as for visualizing pitch content. For note tracking and smoothing, pitch-wise hidden Markov models are used. For training, pitch templates from eight orchestral instruments were extracted, covering their complete note range. The transcription system was tested on multiple-instrument polyphonic recordings from the RWC database, a Disklavier data set, and the MIREX 2007 multi-F0 data set. Results demonstrate that the proposed method outperforms leading approaches from the transcription literature, using several error metrics.<\/jats:p>","DOI":"10.1162\/comj_a_00146","type":"journal-article","created":{"date-parts":[[2013,1,8]],"date-time":"2013-01-08T14:55:56Z","timestamp":1357656956000},"page":"81-94","source":"Crossref","is-referenced-by-count":45,"title":["A Shift-Invariant Latent Variable Model for Automatic Music Transcription"],"prefix":"10.1162","volume":"36","author":[{"given":"Emmanouil","family":"Benetos","sequence":"first","affiliation":[{"name":"Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London E1 4NS, UK. ,"}]},{"given":"Simon","family":"Dixon","sequence":"additional","affiliation":[{"name":"Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London E1 4NS, UK. ,"}]}],"member":"281","reference":[{"key":"p_1","first-page":"315","author":"Bay M.","year":"2009","journal-title":"10th International Society for Music Information Retrieval Conference"},{"key":"p_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2011.2162394"},{"key":"p_4","first-page":"19","author":"Benetos E.","year":"2011","journal-title":"8th Sound and Music Computing Conference"},{"key":"p_5","doi-asserted-by":"publisher","DOI":"10.1080\/09298211003695579"},{"key":"p_7","doi-asserted-by":"publisher","DOI":"10.1111\/j.2517-6161.1977.tb01600.x"},{"key":"p_8","first-page":"489","author":"Dessein A.","year":"2010","journal-title":"11th International Society for Music Information Retrieval Conference"},{"key":"p_10","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2009.2038819"},{"key":"p_11","first-page":"401","author":"Fuentes B.","year":"2011","journal-title":"IEEE International Conference on Audio, Speech and Signal Processing"},{"key":"p_12","first-page":"229","author":"Goto M.","year":"2003","journal-title":"International Conference on Music Information Retrieval"},{"key":"p_13","first-page":"21","author":"Grindlay G.","year":"2010","journal-title":"11th International Society for Music Information Retrieval Conference"},{"key":"p_14","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2011.2162395"},{"key":"p_15","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2006.885248"},{"key":"p_19","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1109\/ICASSP.2009.4959583","author":"Mysore G.","year":"2009","journal-title":"IEEE International Conference on Acoustics, Speech, and Signal Processing"},{"key":"p_20","author":"Poliner G.","year":"2007","journal-title":"EURASIP Journal on Advances in Signal Processing (8):154-162."},{"key":"p_21","doi-asserted-by":"publisher","DOI":"10.1109\/5.18626"},{"key":"p_23","doi-asserted-by":"publisher","DOI":"10.1162\/comj.2008.32.3.72"},{"key":"p_24","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2007.912998"},{"key":"p_25","first-page":"322","author":"Sch\u00f6rkhuber C.","year":"2010","journal-title":"7th Sound and Music Computing Conference"},{"key":"p_27","doi-asserted-by":"publisher","DOI":"10.1121\/1.3106529"},{"key":"p_29","doi-asserted-by":"crossref","first-page":"2069","DOI":"10.1109\/ICASSP.2008.4518048","author":"Smaragdis P.","year":"2008","journal-title":"IEEE International Conference on Acoustics, Speech, and Signal Processing"}],"container-title":["Computer Music Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/COMJ_a_00146","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,4]],"date-time":"2024-05-04T12:46:45Z","timestamp":1714826805000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/comj\/article\/36\/4\/81-94\/94504"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,12]]},"references-count":20,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2012,12]]}},"alternative-id":["10.1162\/COMJ_a_00146"],"URL":"https:\/\/doi.org\/10.1162\/comj_a_00146","relation":{},"ISSN":["0148-9267","1531-5169"],"issn-type":[{"value":"0148-9267","type":"print"},{"value":"1531-5169","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,12]]}}}