{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,19]],"date-time":"2025-12-19T15:55:52Z","timestamp":1766159752253,"version":"build-2065373602"},"reference-count":276,"publisher":"Association for Computing Machinery (ACM)","issue":"4","funder":[{"name":"Queen Mary University of London\u2019s UKRI Centre for Doctoral Training in AI and Music","award":["EP\/S022694\/1"],"award-info":[{"award-number":["EP\/S022694\/1"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Comput. Surv."],"published-print":{"date-parts":[[2026,3,31]]},"abstract":"<jats:p>Research on generative systems in music has seen considerable attention and growth in recent years. A variety of attempts have been made to systematically evaluate such systems. We present an interdisciplinary review of the common evaluation targets, methodologies, and metrics for the evaluation of both system output and model use, covering subjective and objective approaches, qualitative and quantitative approaches, as well as empirical and computational methods. 
We examine the benefits and limitations of these approaches from a musicological, an engineering, and an HCI perspective.<\/jats:p>","DOI":"10.1145\/3769106","type":"journal-article","created":{"date-parts":[[2025,9,23]],"date-time":"2025-09-23T11:40:37Z","timestamp":1758627637000},"page":"1-36","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Survey on the Evaluation of Generative Models in Music"],"prefix":"10.1145","volume":"58","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6319-578X","authenticated-orcid":false,"given":"Alexander","family":"Lerch","sequence":"first","affiliation":[{"name":"Music, Georgia Institute of Technology","place":["Atlanta, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5454-8384","authenticated-orcid":false,"given":"Claire","family":"Arthur","sequence":"additional","affiliation":[{"name":"Music, Georgia Institute of Technology","place":["Atlanta, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1382-2914","authenticated-orcid":false,"given":"Nick","family":"Bryan-Kinns","sequence":"additional","affiliation":[{"name":"Creative Computing Institute, University of the Arts","place":["London, United Kingdom of Great Britain and Northern Ireland"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6895-2441","authenticated-orcid":false,"given":"Corey","family":"Ford","sequence":"additional","affiliation":[{"name":"Creative Computing Institute, University of the Arts","place":["London, United Kingdom of Great Britain and Northern Ireland"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-9889-5617","authenticated-orcid":false,"given":"Qianyi","family":"Sun","sequence":"additional","affiliation":[{"name":"Music, Georgia Institute of Technology","place":["Atlanta, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2487-2052","authenticated-orcid":false,"given":"Ashvala","family":"Vinay","sequence":"additional","affiliation":[{"name":"
Computing","place":["Atlanta, United States"]}]}],"member":"320","published-online":{"date-parts":[[2025,10,25]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1080\/09298215.2014.991738"},{"key":"e_1_3_2_3_2","unstructured":"A. Agostinelli T. I. Denk Z. Borsos J. Engel M. Verzetti A. Caillon Q. Huang A. Jansen A. Roberts M. Tagliasacchi et\u00a0al. 2023. MusicLM: Generating music from text. arXiv:2301.11325. Retrieved from https:\/\/arxiv.org\/abs\/2301.11325"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/2967506"},{"key":"e_1_3_2_5_2","volume-title":"Proc. ICML","author":"Alaa A.","year":"2022","unstructured":"A. Alaa, B. Van Breugel, E. S. Saveliev, and M. van der Schaar. 2022. How faithful is your synthetic data? Sample-level metrics for evaluating and auditing generative models. In Proc. ICML. PMLR, Baltimore."},{"issue":"3","key":"e_1_3_2_6_2","first-page":"48","article-title":"When to ask participants to think aloud: A comparative study of concurrent and retrospective think-aloud methods","volume":"6","author":"Alshammari T.","year":"2015","unstructured":"T. Alshammari, O. Alhadreti, and P. Mayhew. 2015. When to ask participants to think aloud: A comparative study of concurrent and retrospective think-aloud methods. International Journal of Human Computer Interaction 6, 3 (2015), 48\u201364.","journal-title":"International Journal of Human Computer Interaction"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300233"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3351148"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1186\/s13636-023-00279-6"},{"key":"e_1_3_2_10_2","volume-title":"Proc. ML Evaluation Standards Workshop at ICLR","author":"Banar B.","year":"2022","unstructured":"B. Banar and S. Colton. 2022. A quality-diversity-based evaluation strategy for symbolic music generation. In Proc. ML Evaluation Standards Workshop at ICLR. 
Online."},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-03789-4_2"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.3390\/fi15080260"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1080\/10447310802205776"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3600211.3604686"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","unstructured":"R. Batlle-Roca E. G\u00f3mez W. Liao X. Serra and Y. Mitsufuji. 2023. Transparency in Music-Generative AI: A Systematic Literature Review. Preprint. DOI:10.21203\/rs.3.rs-3708077\/v1","DOI":"10.21203\/rs.3.rs-3708077\/v1"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","unstructured":"R. Batlle-Roca W.-H. Liao X. Serra Y. Mitsufuji and E. G\u00f3mez. 2024. Towards assessing data replication in music generation with music similarity metrics on raw audio. Proc. ISMIR San Francisco. DOI:10.5281\/zenodo.14877501","DOI":"10.5281\/zenodo.14877501"},{"key":"e_1_3_2_17_2","volume-title":"Choosers: A Visual Programming Language for Nondeterministic Music Composition by Non-Programmers","author":"Bellingham M.","year":"2022","unstructured":"M. Bellingham. 2022. Choosers: A Visual Programming Language for Nondeterministic Music Composition by Non-Programmers. Ph.D. Dissertation. The Open University, Milton Keynes."},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1162\/leon_a_01959"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/2491500.2491502"},{"key":"e_1_3_2_20_2","volume-title":"Aesthetics and Psychobiology","author":"Berlyne D. E.","year":"1971","unstructured":"D. E. Berlyne. 1971. Aesthetics and Psychobiology. Appleton-Century-Crofts, New York."},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.procir.2024.01.098"},{"key":"e_1_3_2_22_2","volume-title":"Proc. ICLR","author":"Binkowski M.","year":"2018","unstructured":"M. Binkowski, D. J. Sutherland, M. Arbel, and A. Gretton. 2018. 
Demystifying MMD GANs. In Proc. ICLR. Vancouver."},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACV48630.2021.00158"},{"key":"e_1_3_2_24_2","volume-title":"Proc. PPIG","author":"Blackwell A. F.","year":"2000","unstructured":"A. F. Blackwell and T. R. G. Green. 2000. A cognitive dimensions questionnaire optimised for users. In Proc. PPIG."},{"key":"e_1_3_2_25_2","volume-title":"Funology: From Usability to Enjoyment (1st ed.)","author":"Blythe M. A.","year":"2004","unstructured":"M. A. Blythe, K. Overbeeke, A. F. Monk, and P. C. Wright. 2004. Funology: From Usability to Enjoyment (1st ed.). Springer, Dordrecht."},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(98)00055-1"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2021.103329"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/STCR51658.2021.9587927"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2023\/640"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1749-6632.2009.04843.x"},{"key":"e_1_3_2_31_2","volume-title":"Successful Qualitative Research: A Practical Guide for Beginners","author":"Braun V.","year":"2013","unstructured":"V. Braun and V. Clarke. 2013. Successful Qualitative Research: A Practical Guide for Beginners. SAGE Publications Ltd."},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1080\/2159676X.2019.1628806"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-70163-9"},{"key":"e_1_3_2_34_2","volume-title":"Proc. NeurIPS","author":"Brown T.","year":"2020","unstructured":"T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et\u00a0al. 2020. Language models are few-shot learners. In Proc. NeurIPS, Vol. 33. 
Online."},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICTAI.2018.00123"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/3636457"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1201\/9781003406273"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3635636.3656198"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-31360-8_10"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11633-023-1457-1"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/2804405"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1080\/15710880601007994"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1162\/LEON_a_01471"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3555578"},{"key":"e_1_3_2_45_2","volume-title":"Proc. USENIX Security Symp. (USENIX Security)","author":"Carlini N.","year":"2023","unstructured":"N. Carlini, J. Hayes, M. Nasr, M. Jagielski, V. Sehwag, F. Tram\u00e8r, B. Balle, D. Ippolito, and E. Wallace. 2023. Extracting training data from diffusion models. In Proc. USENIX Security Symp. (USENIX Security)."},{"key":"e_1_3_2_46_2","volume-title":"Proc. USENIX Security Symp. (USENIX Security)","author":"Carlini N.","year":"2021","unstructured":"N. Carlini, F. Tram\u00e8r, E. Wallace, M. Jagielski, A. Herbert-Voss, K. Lee, A. Roberts, T. Brown, D. Song, \u00da. Erlingsson, et\u00a0al. 2021. Extracting training data from large language models. In Proc. USENIX Security Symp. (USENIX Security)."},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.3389\/frai.2020.00014"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1080\/13600869.2012.646786"},{"key":"e_1_3_2_49_2","first-page":"19","volume-title":"Musical Excellence: Strategies and Techniques to Enhance Performance","author":"Chaffin R.","year":"2004","unstructured":"R. Chaffin and A. F. Lemieux. 2004. 
General perspectives on achieving musical excellence. In Musical Excellence: Strategies and Techniques to Enhance Performance. Aaron Williamon (Ed.), Oxford University Press, 19\u201339."},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/2317956.2318078"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP48485.2024.10447265"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSC.2020.00025"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","unstructured":"E. Cherry and C. Latulipe. 2014. Quantifying the creativity support of digital tools through the creativity support index. ACM Transactions on Computer-Human Interaction 21, 4 (2014), 1\u201325. DOI:10.1145\/2617588","DOI":"10.1145\/2617588"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1145\/3604930.3605705"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/3511808.3557235"},{"key":"e_1_3_2_56_2","volume-title":"Proc. ICLR","author":"Chu H.","year":"2016","unstructured":"H. Chu, R. Urtasun, and S. Fidler. 2016. Song from PI: A musically plausible network for pop music generation. In Proc. ICLR."},{"key":"e_1_3_2_57_2","unstructured":"Y. Chung P. Eu J. Lee K. Choi J. Nam and B. Sangbae Chon. 2025. KAD: No more FAD! An effective and efficient evaluation metric for audio generation. arXiv:2502.15602. Retrieved from https:\/\/arxiv.org\/abs\/2502.15602"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2022.118190"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1017\/S1355771808000332"},{"key":"e_1_3_2_60_2","series-title":"The computer music and digital audio series","volume-title":"The Algorithmic Composer","author":"Cope D.","year":"2000","unstructured":"D. Cope. 2000. The Algorithmic Composer. Number 16 in The computer music and digital audio series. 
A-R Ed, Middleton."},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.5555\/1213294"},{"key":"e_1_3_2_62_2","volume-title":"Proc. NeurIPS","author":"Copet J.","year":"2023","unstructured":"J. Copet, F. Kreuk, I. Gat, T. Remez, D. Kant, G. Synnaeve, Y. Adi, and A. Defossez. 2023. Simple and controllable music generation. In Proc. NeurIPS, Vol. 36."},{"key":"e_1_3_2_63_2","volume-title":"Flow: The Psychology of Optimal Experience","author":"Cs\u00edkszentmih\u00e1lyi M.","year":"1990","unstructured":"M. Cs\u00edkszentmih\u00e1lyi. 1990. Flow: The Psychology of Optimal Experience. Harper Collins, New York."},{"key":"e_1_3_2_64_2","volume-title":"Proc. ISMIR","author":"Dai S.","year":"2022","unstructured":"S. Dai, H. Yu, and R. B. Dannenberg. 2022. What is missing in deep music generation? A study of repetition and structure in popular music. In Proc. ISMIR."},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2012-476"},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.1109\/iV.2017.49"},{"key":"e_1_3_2_67_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSAI.2017.8248347"},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1162\/evco_a_00265"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR48806.2021.9413310"},{"key":"e_1_3_2_70_2","unstructured":"D. R. Desai and M. Riedl. 2024. Between copyright and computer science: The law and ethics of generative AI. Nw. J. Tech. & Intell. Prop. 22 (2024), 55. https:\/\/scholarlycommons.law.northwestern.edu\/njtip\/vol22\/iss1\/2"},{"key":"e_1_3_2_71_2","unstructured":"P. Dhariwal H. Jun C. Payne J. W. Kim A. Radford and I. Sutskever. 2020. Jukebox: A generative model for music. arXiv:2005.00341. Retrieved from https:\/\/arxiv.org\/abs\/2005.00341"},{"key":"e_1_3_2_72_2","volume-title":"Proc. NeurIPS","author":"Dieleman S.","year":"2018","unstructured":"S. Dieleman, A. van den Oord, and K. Simonyan. 2018. 
The challenge of realistic music generation: Modelling raw audio at scale. In Proc. NeurIPS, Vol. 31."},{"key":"e_1_3_2_73_2","volume-title":"Proc. ISMIR","author":"Donahue C.","year":"2019","unstructured":"C. Donahue, H. H. Mao, Y. E. Li, G. W. Cottrell, and J. McAuley. 2019. LakhNES: Improving multi-instrumental music generation with cross-domain pre-training. In Proc. ISMIR."},{"key":"e_1_3_2_74_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11312"},{"key":"e_1_3_2_75_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP49357.2023.10096975"},{"key":"e_1_3_2_76_2","first-page":"33","volume-title":"Representing Musical Structure","author":"Dowling W. Jay","year":"1991","unstructured":"W. Jay Dowling. 1991. Pitch structure. In Representing Musical Structure. P. Howell, R. West, and I. Cross (Eds.), Academic Press, London, 33\u201357."},{"key":"e_1_3_2_77_2","doi-asserted-by":"publisher","DOI":"10.1145\/1152215.1152254"},{"key":"e_1_3_2_78_2","article-title":"High fidelity neural audio compression","author":"D\u00e9fossez A.","year":"2023","unstructured":"A. D\u00e9fossez, J. Copet, G. Synnaeve, and Y. Adi. 2023. High fidelity neural audio compression. Transactions on Machine Learning Research (2023). Retrieved from https:\/\/openreview.net\/forum?id=ivCd8z8zR2","journal-title":"Transactions on Machine Learning Research"},{"key":"e_1_3_2_79_2","doi-asserted-by":"publisher","DOI":"10.1109\/UEMCON51285.2020.9298149"},{"key":"e_1_3_2_80_2","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2011.2109381"},{"key":"e_1_3_2_81_2","volume-title":"Proc. ICLR","author":"Engel J.","year":"2019","unstructured":"J. Engel, K. K. Agrawal, S. Chen, I. Gulrajani, C. Donahue, and A. Roberts. 2019. GANSynth: Adversarial neural audio synthesis. In Proc. ICLR."},{"key":"e_1_3_2_82_2","volume-title":"Proc. ICLR","author":"Engel J.","year":"2020","unstructured":"J. Engel, L. Hantrakul, C. Gu, and A. Roberts. 2020. DDSP: Differentiable digital signal processing. In Proc. 
ICLR."},{"key":"e_1_3_2_83_2","volume-title":"Proc. ICML","author":"Engel J.","year":"2017","unstructured":"J. Engel, C. Resnick, A. Roberts, S. Dieleman, M. Norouzi, D. Eck, and K. Simonyan. 2017. Neural audio synthesis of musical notes with wavenet autoencoders. In Proc. ICML. PMLR, Sydney."},{"key":"e_1_3_2_84_2","volume-title":"Proc. ISMIR","author":"Ens J.","year":"2019","unstructured":"J. Ens and P. Pasquier. 2019. Quantifying musical style: Ranking symbolic music based on similarity to a style. In Proc. ISMIR."},{"key":"e_1_3_2_85_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.adh4451"},{"key":"e_1_3_2_86_2","volume-title":"Proc. ICML","author":"Evans Z.","year":"2024","unstructured":"Z. Evans, C. J. Carr, J. Taylor, S. H. Hawley, and J. Pons. 2024. Fast timing-conditioned latent audio diffusion. In Proc. ICML. Retrieved from https:\/\/openreview.net\/forum?id=jOlO8t1xdx"},{"key":"e_1_3_2_87_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP49660.2025.10888461"},{"key":"e_1_3_2_88_2","doi-asserted-by":"publisher","DOI":"10.1613\/jair.3908"},{"key":"e_1_3_2_89_2","doi-asserted-by":"publisher","DOI":"10.1145\/1978942.1978965"},{"key":"e_1_3_2_90_2","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.4813334"},{"key":"e_1_3_2_91_2","volume-title":"Proc. ICMC","author":"Fiebrink R.","year":"2010","unstructured":"R. Fiebrink, D. Trueman, C. Britt, M. Nagai, K. Kaczmarek, M. Early, M. R. Daniel, A. Hege, and P. Cook. 2010. Toward understanding human-computer interaction in composing the instrument. In Proc. ICMC. ICMA."},{"key":"e_1_3_2_92_2","volume-title":"Proc. of Gen. AI and HCI Workshop, CHI","author":"Ford C.","year":"2022","unstructured":"C. Ford and N. Bryan-Kinns. 2022. Speculating on reflection and people\u2019s music Co-creation with AI. In Proc. of Gen. AI and HCI Workshop, CHI. 
ACM."},{"key":"e_1_3_2_93_2","doi-asserted-by":"publisher","DOI":"10.1145\/3544548.3581077"},{"key":"e_1_3_2_94_2","doi-asserted-by":"publisher","DOI":"10.1145\/3635636.3656185"},{"key":"e_1_3_2_95_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-74494-8_57"},{"key":"e_1_3_2_96_2","doi-asserted-by":"publisher","DOI":"10.1145\/3364998"},{"key":"e_1_3_2_97_2","doi-asserted-by":"publisher","DOI":"10.1145\/3313831.3376514"},{"key":"e_1_3_2_98_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-31727-9_10"},{"key":"e_1_3_2_99_2","unstructured":"S. Garcia-Valencia A. Betancourt and J. G. Lalinde-Pulido. 2021. A framework to compare music generative models using automatic evaluation metrics extended to rhythm. arXiv:2101.07669. Retrieved from https:\/\/arxiv.org\/abs\/2101.07669"},{"key":"e_1_3_2_100_2","unstructured":"N. R. Gaskins. 2022. Interrogating AI Bias through Digital Art. Blog post. Retrieved from https:\/\/just-tech.ssrc.org\/field-reviews\/interrogating-ai-bias-through-digital-art\/"},{"key":"e_1_3_2_101_2","doi-asserted-by":"publisher","DOI":"10.1145\/3458723"},{"key":"e_1_3_2_102_2","doi-asserted-by":"publisher","DOI":"10.1037\/a0025704"},{"key":"e_1_3_2_103_2","doi-asserted-by":"publisher","DOI":"10.1109\/MMSP53017.2021.9733502"},{"issue":"3","key":"e_1_3_2_104_2","first-page":"117","article-title":"Musical qualia, context, time and emotion","volume":"11","author":"Goguen J.","year":"2004","unstructured":"J. Goguen. 2004. Musical qualia, context, time and emotion. Journal of Consciousness Studies 11, 3-4 (2004), 117\u2013147.","journal-title":"Journal of Consciousness Studies"},{"key":"e_1_3_2_105_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2024.3381611"},{"key":"e_1_3_2_106_2","volume-title":"Proc. XML Conf.","author":"Good M.","year":"2001","unstructured":"M. Good. 2001. MusicXML: An Internet-Friendly Format for Sheet Music. In Proc. 
XML Conf."},{"key":"e_1_3_2_107_2","first-page":"29","volume-title":"The Encyclopedia of Central Banking","author":"Goodhart C.","year":"2015","unstructured":"C. Goodhart. 2015. Goodhart\u2019s law. In The Encyclopedia of Central Banking. L.-P. Rochon and S. Rossi (Eds.), Edward Elgar Publishing, 29\u201333."},{"issue":"25","key":"e_1_3_2_108_2","first-page":"723","article-title":"A kernel two-sample test","volume":"13","author":"Gretton A.","year":"2012","unstructured":"A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Sch\u00f6lkopf, and A. Smola. 2012. A kernel two-sample test. JMLR 13, 25 (2012), 723\u2013773.","journal-title":"JMLR"},{"key":"e_1_3_2_109_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP49660.2025.10887745"},{"key":"e_1_3_2_110_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP48485.2024.10446663"},{"key":"e_1_3_2_111_2","volume-title":"Proc. ICLR","author":"Gulrajani I.","year":"2019","unstructured":"I. Gulrajani, C. Raffel, and L. Metz. 2019. Towards GAN benchmarks which require generalization. In Proc. ICLR."},{"key":"e_1_3_2_112_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0283103"},{"key":"e_1_3_2_113_2","doi-asserted-by":"publisher","DOI":"10.1109\/WASPAA.2015.7336923"},{"key":"e_1_3_2_114_2","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.1418331"},{"key":"e_1_3_2_115_2","volume-title":"Proc. ICML","author":"Hadjeres G.","year":"2017","unstructured":"G. Hadjeres, F. Pachet, and F. Nielsen. 2017. DeepBach: A steerable model for Bach chorales generation. In Proc. ICML. PMLR, Sydney."},{"key":"e_1_3_2_116_2","volume-title":"Proc. ISMIR","author":"Hakimi S. H.","year":"2020","unstructured":"S. H. Hakimi, N. Bhonker, and R. El-Yaniv. 2020. BebopNet: Deep neural models for personalized jazz improvisations. In Proc. ISMIR. 
Online."},{"key":"e_1_3_2_117_2","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780199230143.003.0019"},{"key":"e_1_3_2_118_2","doi-asserted-by":"publisher","DOI":"10.1177\/154193120605000909"},{"key":"e_1_3_2_119_2","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300260"},{"issue":"248","key":"e_1_3_2_120_2","first-page":"1","article-title":"Towards the systematic reporting of the energy and carbon footprints of machine learning","volume":"21","author":"Henderson P.","year":"2020","unstructured":"P. Henderson, J. Hu, J. Romoff, E. Brunskill, D. Jurafsky, and J. Pineau. 2020. Towards the systematic reporting of the energy and carbon footprints of machine learning. JMLR 21, 248 (2020), 1\u201343.","journal-title":"JMLR"},{"key":"e_1_3_2_121_2","volume-title":"Workshop on Gen. AI and HCI at CHI","author":"Hernandez-Olivan C.","year":"2022","unstructured":"C. Hernandez-Olivan, J. A. Puyuelo, and J. R. Beltran. 2022. Subjective evaluation of deep learning models for symbolic music composition. In Workshop on Gen. AI and HCI at CHI. ACM."},{"key":"e_1_3_2_122_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11023-020-09549-0"},{"key":"e_1_3_2_123_2","doi-asserted-by":"publisher","DOI":"10.1109\/TAFFC.2017.2737984"},{"key":"e_1_3_2_124_2","doi-asserted-by":"publisher","DOI":"10.1145\/3108242"},{"key":"e_1_3_2_125_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2017.7952132"},{"key":"e_1_3_2_126_2","volume-title":"Experimental Music: Composition with an Electronic Computer","author":"Hiller L. A.","year":"1959","unstructured":"L. A. Hiller and L. M. Isaacson. 1959. Experimental Music: Composition with an Electronic Computer. McGraw-Hill."},{"key":"e_1_3_2_127_2","doi-asserted-by":"publisher","DOI":"10.1186\/s13636-015-0054-9"},{"key":"e_1_3_2_128_2","volume-title":"Proc. ISMIR","author":"Huang C.-Z. A.","year":"2020","unstructured":"C.-Z. A. Huang, H. V. Koops, E. Newton-Rex, M. Dinculescu, and C. J. Cai. 2020. 
AI song contest: Human-AI co-creation in songwriting. In Proc. ISMIR."},{"key":"e_1_3_2_129_2","volume-title":"Proc. ICLR","author":"Huang C.-Z. A.","year":"2019","unstructured":"C.-Z. A. Huang, A. Vaswani, J. Uszkoreit, I. Simon, C. Hawthorne, N. Shazeer, A. M. Dai, M. D. Hoffman, M. Dinculescu, and D. Eck. 2019. Music transformer: Generating music with long-term structure. In Proc. ICLR."},{"key":"e_1_3_2_130_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.findings-emnlp.148"},{"key":"e_1_3_2_131_2","unstructured":"Y. Huang Z. Novack K. Saito J. Shi S. Watanabe Y. Mitsufuji J. Thickstun and C. Donahue. 2025. Aligning text-to-music evaluation with human preferences. arXiv:2503.16669. Retrieved from https:\/\/arxiv.org\/abs\/2503.16669"},{"key":"e_1_3_2_132_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413671"},{"key":"e_1_3_2_133_2","volume-title":"Empirical Studies in End-User Computer-Generated Music Composition Systems","author":"Hunt S.","year":"2021","unstructured":"S. Hunt. 2021. Empirical Studies in End-User Computer-Generated Music Composition Systems. PhD Thesis. University of the West of England, Bristol."},{"key":"e_1_3_2_134_2","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/6575.001.0001"},{"key":"e_1_3_2_135_2","unstructured":"ITU-R. 2006. BS.1387:2006: Method for objective measurements of perceived audio quality."},{"key":"e_1_3_2_136_2","unstructured":"ITU-R. 2015. BS.1116-3: Methods for the subjective assessment of small impairments in audio systems."},{"key":"e_1_3_2_137_2","unstructured":"ITU-R. 2015. BS.1534-3: Method for the subjective assessment of intermediate quality level of audio systems."},{"key":"e_1_3_2_138_2","unstructured":"ITU-T. 1996. P.800: Methods for subjective determination of transmission quality."},{"key":"e_1_3_2_139_2","volume-title":"Proc. CVPR","author":"Jayasumana S.","year":"2023","unstructured":"S. Jayasumana, S. Ramalingam, A. Veit, D. Glasner, A. Chakrabarti, and S. Kumar. 2023. 
Rethinking FID: Towards a better evaluation metric for image generation. In Proc. CVPR."},{"key":"e_1_3_2_140_2","doi-asserted-by":"publisher","DOI":"10.1145\/3597493"},{"key":"e_1_3_2_141_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11063-020-10241-8"},{"key":"e_1_3_2_142_2","doi-asserted-by":"publisher","DOI":"10.1145\/3351095.3372829"},{"key":"e_1_3_2_143_2","doi-asserted-by":"publisher","DOI":"10.1007\/s12559-012-9156-1"},{"key":"e_1_3_2_144_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.plrev.2013.05.008"},{"key":"e_1_3_2_145_2","doi-asserted-by":"publisher","DOI":"10.1037\/aca0000403"},{"key":"e_1_3_2_146_2","volume-title":"Proc. CSMC","author":"Kalonaris S.","year":"2018","unstructured":"S. Kalonaris and A. Jordanous. 2018. Computational music aesthetics: A survey and some thoughts. In Proc. CSMC."},{"key":"e_1_3_2_147_2","doi-asserted-by":"publisher","DOI":"10.1145\/3546954"},{"key":"e_1_3_2_148_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.1985.1168376"},{"key":"e_1_3_2_149_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00453"},{"key":"e_1_3_2_150_2","doi-asserted-by":"publisher","DOI":"10.1080\/09298215.2012.745579"},{"key":"e_1_3_2_151_2","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-2219"},{"key":"e_1_3_2_152_2","unstructured":"M. Kinsella. 2024. Time to Face the Music: A.I. Music Copyright Infringement Battle Makes It to Court. Blog post. Retrieved from https:\/\/wjlta.com\/2024\/01\/24\/time-to-face-the-music-a-i-music-copyright-infringement-battle-makes-it-to-court\/"},{"key":"e_1_3_2_153_2","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2020.3030497"},{"key":"e_1_3_2_154_2","volume-title":"Proc. ICLR","author":"Kong Z.","year":"2020","unstructured":"Z. Kong, W. Ping, J. Huang, K. Zhao, and B. Catanzaro. 2020. DiffWave: A versatile diffusion model for audio synthesis. In Proc. 
ICLR."},{"key":"e_1_3_2_155_2","doi-asserted-by":"publisher","DOI":"10.1017\/9781316979839.008"},{"key":"e_1_3_2_156_2","doi-asserted-by":"publisher","DOI":"10.2307\/40285829"},{"key":"e_1_3_2_157_2","unstructured":"D. Kvak. 2022. Towards evaluation of autonomously generated musical compositions: A comprehensive survey. arXiv:2204.04756. Retrieved from https:\/\/arxiv.org\/abs\/2204.04756"},{"key":"e_1_3_2_158_2","volume-title":"Proc. NeurIPS","author":"Lam M. W. Y.","year":"2023","unstructured":"M. W. Y. Lam, Q. Tian, T. Li, Z. Yin, S. Feng, M. Tu, Y. Ji, R. Xia, M. Ma, X. Song, et\u00a0al. 2023. Efficient neural music generation. In Proc. NeurIPS, Vol. 36."},{"key":"e_1_3_2_159_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-89350-9_6"},{"issue":"134","key":"e_1_3_2_160_2","article-title":"AI and the Sound of Music","author":"Lee E.","year":"2024","unstructured":"E. Lee. 2024. AI and the Sound of Music. The Yale Law Journal Forum134 (2024). Retrieved from https:\/\/www.yalelawjournal.org\/forum\/ai-and-the-sound-of-music","journal-title":"The Yale Law Journal Forum"},{"key":"e_1_3_2_161_2","volume-title":"Software-Based Extraction of Objective Parameters from Music Performances","author":"Lerch A.","year":"2009","unstructured":"A. Lerch. 2009. Software-Based Extraction of Objective Parameters from Music Performances. GRIN Verlag, M\u00fcnchen."},{"key":"e_1_3_2_162_2","volume-title":"An Introduction to Audio Content Analysis: Music Information Retrieval Tasks and Applic. (2nd ed.)","author":"Lerch A.","year":"2023","unstructured":"A. Lerch. 2023. An Introduction to Audio Content Analysis: Music Information Retrieval Tasks and Applic. (2nd ed.). Wiley-IEEE Press, Hoboken."},{"key":"e_1_3_2_163_2","volume-title":"Proc. ISMIR","author":"Lerch A.","year":"2019","unstructured":"A. Lerch, C. Arthur, A. Pati, and S. Gururani. 2019. Music performance analysis: A survey. In Proc. 
ISMIR."},{"key":"e_1_3_2_164_2","doi-asserted-by":"publisher","DOI":"10.5334\/tismir.53"},{"key":"e_1_3_2_165_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52734.2025.01230"},{"key":"e_1_3_2_166_2","doi-asserted-by":"publisher","DOI":"10.3389\/fnbot.2023.1267561"},{"key":"e_1_3_2_167_2","first-page":"55","article-title":"A technique for the measurement of attitudes","volume":"22","author":"Likert R.","year":"1932","unstructured":"R. Likert. 1932. A technique for the measurement of attitudes. Archives of Psychology 22 140 (1932), 55.","journal-title":"Archives of Psychology"},{"key":"e_1_3_2_168_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP49660.2025.10890307"},{"key":"e_1_3_2_169_2","volume-title":"Proc. ICML","author":"Liu H.","year":"2023","unstructured":"H. Liu, Z. Chen, Y. Yuan, X. Mei, X. Liu, D. Mandic, W. Wang, and M. D. Plumbley. 2023. AudioLDM: Text-to-audio generation with latent diffusion models. In Proc. ICML, Vol. 202. PMLR. Retrieved from https:\/\/proceedings.mlr.press\/v202\/liu23f.html"},{"key":"e_1_3_2_170_2","doi-asserted-by":"publisher","DOI":"10.1121\/2.0001871"},{"key":"e_1_3_2_171_2","doi-asserted-by":"publisher","DOI":"10.1037\/aca0000629"},{"key":"e_1_3_2_172_2","doi-asserted-by":"publisher","DOI":"10.1145\/3313831.3376739"},{"key":"e_1_3_2_173_2","doi-asserted-by":"publisher","DOI":"10.1145\/3490099.3511159"},{"key":"e_1_3_2_174_2","unstructured":"T. Lu C.-M. Geist J. Melechovsky A. Roy and D. Herremans. 2025. MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection. arXiv:2505.20979. 
Retrieved from https:\/\/arxiv.org\/abs\/2505.20979"},{"key":"e_1_3_2_175_2","doi-asserted-by":"publisher","DOI":"10.1145\/3301019.3319996"},{"key":"e_1_3_2_176_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-07015-0_16"},{"key":"e_1_3_2_177_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2019\/196"},{"key":"e_1_3_2_178_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10618-022-00835-2"},{"key":"e_1_3_2_179_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-32003-6_50"},{"key":"e_1_3_2_180_2","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-36605-9_48"},{"key":"e_1_3_2_181_2","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2022-405"},{"key":"e_1_3_2_182_2","volume-title":"Proc. NeurIPS","author":"Manocha P.","year":"2021","unstructured":"P. Manocha, B. Xu, and A. Kumar. 2021. NORESQA: A framework for speech quality assessment using non-matching references. In Proc. NeurIPS. Online."},{"key":"e_1_3_2_183_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSC.2018.00077"},{"key":"e_1_3_2_184_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2020.3037506"},{"key":"e_1_3_2_185_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2018.2875349"},{"key":"e_1_3_2_186_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-25931-4"},{"key":"e_1_3_2_187_2","volume-title":"Proc. Constructive Machine Learning Workshop, NIPS","author":"Mogren O.","year":"2016","unstructured":"O. Mogren. 2016. C-RNN-GAN: Continuous recurrent neural networks with adversarial training. In Proc. Constructive Machine Learning Workshop, NIPS."},{"key":"e_1_3_2_188_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0089642"},{"key":"e_1_3_2_189_2","doi-asserted-by":"publisher","DOI":"10.1177\/102986490901300111"},{"key":"e_1_3_2_190_2","volume-title":"Proc. ISMIR","author":"Naruse D.","year":"2022","unstructured":"D. Naruse, T. Takahata, Y. Mukuta, and T. Harada. 2022. Pop music generation with controllable phrase lengths. In Proc. 
ISMIR."},{"key":"e_1_3_2_191_2","doi-asserted-by":"publisher","DOI":"10.5555\/2821575"},{"key":"e_1_3_2_192_2","unstructured":"Z. Ning H. Chen Y. Jiang C. Hao G. Ma S. Wang J. Yao and L. Xie. 2025. DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion. arXiv:2503.01183. Retrieved from https:\/\/arxiv.org\/abs\/2503.01183"},{"key":"e_1_3_2_193_2","doi-asserted-by":"publisher","DOI":"10.23919\/Eusipco47968.2020.9287799"},{"key":"e_1_3_2_194_2","volume-title":"Proc. Int. Workshop on Explainable AI for the Arts (XAIxArts) at ACM Creativity and Cognition","author":"Noel-Hirst A.","year":"2023","unstructured":"A. Noel-Hirst and N. Bryan-Kinns. 2023. An autoethnographic exploration of XAI in algorithmic composition. In Proc. Int. Workshop on Explainable AI for the Arts (XAIxArts) at ACM Creativity and Cognition."},{"key":"e_1_3_2_195_2","volume-title":"The Definition of User Experience (UX)","author":"Norman D.","year":"2016","unstructured":"D. Norman and J. Nielsen. 2016. The Definition of User Experience (UX). Technical Report. Retrieved from https:\/\/www.nngroup.com\/articles\/definition-user-experience\/"},{"key":"e_1_3_2_196_2","doi-asserted-by":"publisher","unstructured":"B. M. Oliver J. R. Pierce and C. E. Shannon. 1948. The philosophy of PCM. Proceedings of the IRE 36 11 (1948) 1324\u20131331. 
DOI:10.1109\/JRPROC.1948.231941","DOI":"10.1109\/JRPROC.1948.231941"},{"key":"e_1_3_2_197_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-018-3758-9"},{"key":"e_1_3_2_198_2","doi-asserted-by":"publisher","DOI":"10.5334\/tismir.90"},{"key":"e_1_3_2_199_2","doi-asserted-by":"publisher","DOI":"10.1080\/10447318.2022.2153320"},{"key":"e_1_3_2_200_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijhcs.2018.01.004"},{"key":"e_1_3_2_201_2","doi-asserted-by":"publisher","DOI":"10.5920\/JCMS.2017.12"},{"key":"e_1_3_2_202_2","doi-asserted-by":"publisher","DOI":"10.1007\/11553939_98"},{"key":"e_1_3_2_203_2","doi-asserted-by":"publisher","DOI":"10.22967\/HCIS.2022.12.038"},{"key":"e_1_3_2_204_2","volume-title":"Proc. ISMIR","author":"Pasini M.","year":"2022","unstructured":"M. Pasini and J. Schl\u00fcter. 2022. Musika! fast infinite waveform music generation. In Proc. ISMIR."},{"key":"e_1_3_2_205_2","doi-asserted-by":"publisher","DOI":"10.1145\/2930672"},{"key":"e_1_3_2_206_2","volume-title":"Proc. ISMIR","author":"Pati A.","year":"2019","unstructured":"A. Pati, A. Lerch, and G. Hadjeres. 2019. Learning to traverse latent spaces for musical score inpainting. In Proc. ISMIR."},{"key":"e_1_3_2_207_2","volume-title":"Proc. ML4MD, Extended Abstract","author":"Pati K. A.","year":"2019","unstructured":"K. A. Pati and A. Lerch. 2019. Latent space regularization for explicit control of musical attributes. In Proc. ML4MD, Extended Abstract."},{"key":"e_1_3_2_208_2","doi-asserted-by":"publisher","unstructured":"K. A. Pati and A. Lerch. 2020. Attribute-based regularization for latent spaces of variational auto-encoders. Neural Computing and Applications 33 (2020) 4429\u20134444. DOI:10.1007\/s00521-020-05270-2","DOI":"10.1007\/s00521-020-05270-2"},{"key":"e_1_3_2_209_2","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780199670000.003.0016"},{"key":"e_1_3_2_210_2","volume-title":"Proc. AISB","author":"Pease A.","year":"2011","unstructured":"A. Pease and S. Colton. 
2011. On impact and evaluation in computational creativity: A discussion of the Turing test and an alternative proposal. In Proc. AISB."},{"key":"e_1_3_2_211_2","doi-asserted-by":"publisher","DOI":"10.1145\/3597512.3597528"},{"key":"e_1_3_2_212_2","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511763205.005"},{"key":"e_1_3_2_213_2","volume-title":"Interaction Design: Beyond Human-Computer Interaction","author":"Preece J.","year":"2015","unstructured":"J. Preece, H. Sharp, and Y. Rogers. 2015. Interaction Design: Beyond Human-Computer Interaction. Wiley."},{"key":"e_1_3_2_214_2","doi-asserted-by":"publisher","DOI":"10.1145\/3613904.3641971"},{"key":"e_1_3_2_215_2","volume-title":"Proc. ICMPC","author":"Raman R.","year":"2016","unstructured":"R. Raman, K. Herndon, and W. J. Dowling. 2016. Effects of familiarity, key membership, and interval size on perceiving wrong notes in melodies. In Proc. ICMPC."},{"key":"e_1_3_2_216_2","volume-title":"Proc. ICML","author":"Ramesh A.","year":"2021","unstructured":"A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. Radford, M. Chen, and I. Sutskever. 2021. Zero-shot text-to-image generation. In Proc. ICML. PMLR, Vienna."},{"key":"e_1_3_2_217_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-73374-6_3"},{"key":"e_1_3_2_218_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP39728.2021.9414878"},{"key":"e_1_3_2_219_2","unstructured":"J. Retkowski J. St\u0119pniak and M. Modrzejewski. 2025. Frechet Music Distance: A Metric For Generative Symbolic Music Evaluation. AAAI Workshop on Artificial Intelligence for Music."},{"key":"e_1_3_2_220_2","doi-asserted-by":"publisher","DOI":"10.1145\/3698061.3726924"},{"key":"e_1_3_2_221_2","volume-title":"Proc. Explainable AI for the Arts Workshop 2025 (XAIxArts)","author":"Cox S. Rhys","year":"2025","unstructured":"S. Rhys Cox, H. B\u00f8jer Djern\u00e6s, and N. van Berkel. 2025. Reflecting human values in XAI: Emotional and reflective benefits in creativity support tools. 
In Proc. Explainable AI for the Arts Workshop 2025 (XAIxArts). ACM, New York."},{"key":"e_1_3_2_222_2","volume-title":"Proc. NeurIPS","author":"Richardson Eitan","year":"2018","unstructured":"Eitan Richardson and Yair Weiss. 2018. On GANs and GMMs. In Proc. NeurIPS, Vol. 31."},{"key":"e_1_3_2_223_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11023-007-9066-2"},{"key":"e_1_3_2_224_2","volume-title":"Proc. ICML","author":"Roberts A.","year":"2018","unstructured":"A. Roberts, J. Engel, C. Raffel, C. Hawthorne, and D. Eck. 2018. A hierarchical latent vector model for learning long-term structure in music. In Proc. ICML. PMLR, Stockholm."},{"key":"e_1_3_2_225_2","doi-asserted-by":"publisher","DOI":"10.5334\/tismir.104"},{"key":"e_1_3_2_226_2","volume-title":"Proc. NeurIPS","author":"Salimans T.","year":"2016","unstructured":"T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, X. Chen, and X. Chen. 2016. Improved techniques for training GANs. In Proc. NeurIPS."},{"key":"e_1_3_2_227_2","doi-asserted-by":"publisher","DOI":"10.3389\/fpsyg.2020.580111"},{"key":"e_1_3_2_228_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.acl-long.437"},{"key":"e_1_3_2_229_2","doi-asserted-by":"publisher","DOI":"10.3389\/fnins.2021.612379"},{"key":"e_1_3_2_230_2","doi-asserted-by":"publisher","DOI":"10.1145\/3461778.3462163"},{"key":"e_1_3_2_231_2","doi-asserted-by":"publisher","DOI":"10.1037\/xap0000447"},{"key":"e_1_3_2_232_2","doi-asserted-by":"publisher","DOI":"10.1093\/oso\/9780192845290.001.0001"},{"key":"e_1_3_2_233_2","article-title":"Black artists say A.I. shows bias, with algorithms erasing their history","author":"Small Z.","year":"2023","unstructured":"Z. Small. 2023. Black artists say A.I. shows bias, with algorithms erasing their history. New York Times (2023).","journal-title":"New York Times"},{"key":"e_1_3_2_234_2","volume-title":"Proc. Conf. 
on Human-Computer Interaction with Mobile Devices and Services","author":"Sonnenburg S.","year":"2007","unstructured":"S. Sonnenburg, M. L. Braun, C. S. Ong, S. Bengio, L. Bottou, G. Holmes, and Y. LeCun. 2007. The need for open source software in machine learning. In Proc. Conf. on Human-Computer Interaction with Mobile Devices and Services."},{"key":"e_1_3_2_235_2","volume-title":"Proc. NeurIPS","author":"Srinivasan R.","year":"2021","unstructured":"R. Srinivasan, E. Denton, J. Famularo, N. Rostamzadeh, F. Diaz, and B. Coleman. 2021. Artsheets for art datasets. In Proc. NeurIPS. Online."},{"key":"e_1_3_2_236_2","doi-asserted-by":"publisher","DOI":"10.1109\/CEC.2019.8790099"},{"key":"e_1_3_2_237_2","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1234-981X(199707)5:3<305::AID-EURO184>3.0.CO;2-4"},{"key":"e_1_3_2_238_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1355"},{"key":"e_1_3_2_239_2","volume-title":"Proc. of Gen. AI and HCI Workshop, CHI","author":"Sturm B.","year":"2022","unstructured":"B. Sturm. 2022. Generative AI helps one express things for which they may not have expressions (yet). In Proc. of Gen. AI and HCI Workshop, CHI. ACM."},{"key":"e_1_3_2_240_2","doi-asserted-by":"publisher","unstructured":"B. L. T. Sturm and O. Ben-Tal. 2017. Taking the models back to music practice: Evaluating generative transcription models built using deep learning. Journal of Creative Music Systems 2 1 (2017) 32\u201360. DOI:10.5920\/JCMS.2017.09","DOI":"10.5920\/JCMS.2017.09"},{"key":"e_1_3_2_241_2","doi-asserted-by":"publisher","DOI":"10.1080\/09298215.2018.1515233"},{"key":"e_1_3_2_242_2","doi-asserted-by":"publisher","unstructured":"Bob L. T. Sturm Ken D\u00e9guernel Rujing S. Huang Andr\u00e9 Holzapfel Oliver Bown Nick Collins Jonathan Sterne Laura C. Vila Luca Casini David C. Dalmazzo et\u00a0al. 2024. MusAIcology: AI Music and the Need for a New Kind of Music Studies. Preprint. 
DOI:10.31235\/osf.io\/9pz4x","DOI":"10.31235\/osf.io\/9pz4x"},{"key":"e_1_3_2_243_2","doi-asserted-by":"publisher","DOI":"10.3390\/arts8030115"},{"key":"e_1_3_2_244_2","doi-asserted-by":"publisher","DOI":"10.1145\/3411764.3445219"},{"key":"e_1_3_2_245_2","doi-asserted-by":"publisher","DOI":"10.1109\/INDICON.2015.7443304"},{"issue":"2","key":"e_1_3_2_246_2","first-page":"185","article-title":"Sounds of science: Copyright infringement in AI music generator outputs","volume":"29","author":"Sunray E.","year":"2021","unstructured":"E. Sunray. 2021. Sounds of science: Copyright infringement in AI music generator outputs. Catholic University Journal of Law and Technology 29, 2 (2021), 185\u2013218.","journal-title":"Catholic University Journal of Law and Technology"},{"key":"e_1_3_2_247_2","doi-asserted-by":"publisher","DOI":"10.1162\/014892601300126106"},{"key":"e_1_3_2_248_2","volume-title":"Proc. ISMIR","author":"Tan H. H.","year":"2020","unstructured":"H. H. Tan and D. Herremans. 2020. Music fadernets: Controllable music generation based on high-level features via low-level feature modelling. In Proc. ISMIR. Online."},{"key":"e_1_3_2_249_2","doi-asserted-by":"publisher","DOI":"10.5555\/1211585"},{"key":"e_1_3_2_250_2","unstructured":"The MIDI Assoc.2023. MIDI 2.0 Specification Overview."},{"key":"e_1_3_2_251_2","volume-title":"Proc. ICLR","author":"Theis L.","year":"2016","unstructured":"L. Theis, A. v. d. Oord, and M. Bethge. 2016. A note on the evaluation of generative models. In Proc. ICLR."},{"key":"e_1_3_2_252_2","volume-title":"Proc. NeurIPS.","author":"Thelle N. J. W.","year":"2022","unstructured":"N. J. W. Thelle and R. Fiebrink. 2022. How Do Musicians Experience Jamming With a Co-Creative \u201cAI\u201d?. In Proc. NeurIPS."},{"key":"e_1_3_2_253_2","volume-title":"A Geometry of Music: Harmony and Counterpoint in the Extended Common Practice","author":"Tymoczko D.","year":"2011","unstructured":"D. Tymoczko. 2011. 
A Geometry of Music: Harmony and Counterpoint in the Extended Common Practice. Oxford University Press, New York."},{"key":"e_1_3_2_254_2","volume-title":"Proc. NeurIPS","author":"Burg G. van den","year":"2021","unstructured":"G. van den Burg and C. Williams. 2021. On memorization in probabilistic deep generative models. In Proc. NeurIPS, Vol. 34. Online."},{"key":"e_1_3_2_255_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2009.932122"},{"key":"e_1_3_2_256_2","volume-title":"Proc. ISMIR","author":"Vinay A.","year":"2022","unstructured":"A. Vinay and A. Lerch. 2022. Evaluating generative audio systems and their metrics. In Proc. ISMIR."},{"key":"e_1_3_2_257_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSA.2005.858005"},{"key":"e_1_3_2_258_2","doi-asserted-by":"publisher","DOI":"10.1162\/014892602320582981"},{"key":"e_1_3_2_259_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-024-09418-2"},{"key":"e_1_3_2_260_2","doi-asserted-by":"publisher","DOI":"10.1145\/3648609"},{"key":"e_1_3_2_261_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-99-4761-4_32"},{"key":"e_1_3_2_262_2","doi-asserted-by":"publisher","DOI":"10.1631\/FITEE.2300359"},{"key":"e_1_3_2_263_2","volume-title":"Proc. SysMus","author":"Wolf A.","year":"2011","unstructured":"A. Wolf and D. M\u00fcllensiefen. 2011. The perception of similarity in court cases of melodic plagiarism and a review of measures of melodic similarity. In Proc. SysMus."},{"key":"e_1_3_2_264_2","volume-title":"Proc. ISMIR","author":"Wu S.-L.","year":"2020","unstructured":"S.-L. Wu and Y.-H. Yang. 2020. The jazz transformer on the front line: Exploring the shortcomings of AI-composed music through quantitative measures. In Proc. ISMIR."},{"key":"e_1_3_2_265_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP49357.2023.10095969"},{"key":"e_1_3_2_266_2","doi-asserted-by":"publisher","unstructured":"Z. Xiong W. Wang J. Yu Y. Lin and Z. Wang. 2023. 
A comprehensive survey for evaluation methodologies of AI-generated music. Preprint. DOI:10.48550\/ARXIV.2308.13736","DOI":"10.48550\/ARXIV.2308.13736"},{"key":"e_1_3_2_267_2","volume-title":"Proc. ISMIR","author":"Yang L.-C.","year":"2017","unstructured":"L.-C. Yang, S.-Y. Chou, and Y.-H. Yang. 2017. MidiNet: A convolutional generative adversarial network for symbolic-domain music generation. In Proc. ISMIR."},{"key":"e_1_3_2_268_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-018-3849-7"},{"key":"e_1_3_2_269_2","doi-asserted-by":"publisher","unstructured":"Y.-C. Yeh W.-Y. Hsiao S. Fukayama T. Kitahara B. Genchel H.-M. Liu H.-W. Dong Y. Chen T. Leong and Y.-H. Yang. 2021. Automatic melody harmonization with triad chords: A comparative study. Journal of New Music Research (JNMR) 50 1 (2021) 37\u201351. DOI:10.1080\/09298215.2021.1873392","DOI":"10.1080\/09298215.2021.1873392"},{"key":"e_1_3_2_270_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-72914-1_24"},{"key":"e_1_3_2_271_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-023-06309-w"},{"key":"e_1_3_2_272_2","doi-asserted-by":"crossref","unstructured":"R. Yuan H. Lin Y. Wang Z. Tian S. Wu T. Shen G. Zhang Y. Wu C. Liu Z. Zhou et\u00a0al. 2024. ChatMusician: Understanding and Generating Music Intrinsically with LLM. arXiv:2402.16153. Retrieved from https:\/\/arxiv.org\/abs\/2402.16153","DOI":"10.18653\/v1\/2024.findings-acl.373"},{"key":"e_1_3_2_273_2","doi-asserted-by":"publisher","DOI":"10.5334\/tismir.151"},{"key":"e_1_3_2_274_2","doi-asserted-by":"publisher","DOI":"10.3390\/electronics14061197"},{"key":"e_1_3_2_275_2","doi-asserted-by":"publisher","DOI":"10.23977\/jaip.2023.060802"},{"key":"e_1_3_2_276_2","volume-title":"Proc. ICMC","author":"Zimmermann D.","year":"1996","unstructured":"D. Zimmermann. 1996. Creativity versus determinism: Cognitive science and music theory as touchstones of automatic music composition. In Proc. ICMC. 
ICMA, Hong Kong."},{"key":"e_1_3_2_277_2","volume-title":"Human Behavior and the Principle of Least Effort; An Introduction to Human Ecology.","author":"Zipf G. K.","year":"1965","unstructured":"G. K. Zipf. 1965. Human Behavior and the Principle of Least Effort; An Introduction to Human Ecology. Hafner, New York."}],"container-title":["ACM Computing Surveys"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3769106","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,25]],"date-time":"2025-10-25T14:14:49Z","timestamp":1761401689000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3769106"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,25]]},"references-count":276,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2026,3,31]]}},"alternative-id":["10.1145\/3769106"],"URL":"https:\/\/doi.org\/10.1145\/3769106","relation":{},"ISSN":["0360-0300","1557-7341"],"issn-type":[{"type":"print","value":"0360-0300"},{"type":"electronic","value":"1557-7341"}],"subject":[],"published":{"date-parts":[[2025,10,25]]},"assertion":[{"value":"2024-06-26","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-12","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-10-25","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}