{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,29]],"date-time":"2025-11-29T08:01:18Z","timestamp":1764403278070},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,8]]},"abstract":"<jats:p>This paper targets the perceptual task of separating the different interacting voices, i.e., monophonic melodic streams, in a polyphonic musical piece. We target symbolic music, where notes are explicitly encoded, and model this task as a Multi-Trajectory Tracking (MTT) problem from discrete observations, i.e., notes in a pitch-time space. Our approach builds a graph from a musical piece, by creating one node for every note, and separates the melodic trajectories by predicting a link between two notes if they are consecutive in the same voice\/stream. This kind of local, greedy prediction is made possible by node embeddings created by a heterogeneous graph neural network that can capture inter- and intra-trajectory information. Furthermore, we propose a new regularization loss that encourages the output to respect the MTT premise of at most one incoming and one outgoing link for every node, favoring monophonic (voice) trajectories; this loss function might also be useful in other general MTT scenarios. Our approach does not use domain-specific heuristics, is scalable to longer sequences and a higher number of voices, and can handle complex cases such as voice inversions and overlaps. We reach new state-of-the-art results for the voice separation task on classical music of different styles. All code, data, and pretrained models are available on https:\/\/github.com\/manoskary\/vocsep_ijcai2023<\/jats:p>","DOI":"10.24963\/ijcai.2023\/430","type":"proceedings-article","created":{"date-parts":[[2023,8,11]],"date-time":"2023-08-11T08:31:30Z","timestamp":1691742690000},"page":"3866-3874","source":"Crossref","is-referenced-by-count":3,"title":["Musical Voice Separation as Link Prediction: Modeling a Musical Perception Task as a Multi-Trajectory Tracking Problem"],"prefix":"10.24963","author":[{"given":"Emmanouil","family":"Karystinaios","sequence":"first","affiliation":[{"name":"Johannes Kepler University Linz, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Francesco","family":"Foscarin","sequence":"additional","affiliation":[{"name":"Johannes Kepler University Linz, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gerhard","family":"Widmer","sequence":"additional","affiliation":[{"name":"Johannes Kepler University Linz, Austria"},{"name":"LIT AI Lab, Linz Institute of Technology, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"10584","event":{"number":"32","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"acronym":"IJCAI-2023","name":"Thirty-Second International Joint Conference on Artificial Intelligence {IJCAI-23}","start":{"date-parts":[[2023,8,19]]},"theme":"Artificial Intelligence","location":"Macau, SAR China","end":{"date-parts":[[2023,8,25]]}},"container-title":["Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2023,8,11]],"date-time":"2023-08-11T08:48:24Z","timestamp":1691743704000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2023\/430"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2023,8]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2023\/430","relation":{},"subject":[],"published":{"date-parts":[[2023,8]]}}}