{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:57:38Z","timestamp":1750309058643,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":45,"publisher":"ACM","license":[{"start":{"date-parts":[[2023,8,30]],"date-time":"2023-08-30T00:00:00Z","timestamp":1693353600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000781","name":"European Research Council","doi-asserted-by":"publisher","award":["EP\/X023478\/1"],"award-info":[{"award-number":["EP\/X023478\/1"]}],"id":[{"id":"10.13039\/501100000781","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/S022694\/1"],"award-info":[{"award-number":["EP\/S022694\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,8,30]]},"DOI":"10.1145\/3616195.3616196","type":"proceedings-article","created":{"date-parts":[[2023,10,11]],"date-time":"2023-10-11T22:56:24Z","timestamp":1697064984000},"page":"116-123","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["FM Tone Transfer with Envelope Learning"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2214-9725","authenticated-orcid":false,"given":"Franco","family":"Caspe","sequence":"first","affiliation":[{"name":"Centre for Digital Music, Queen Mary University of London, United Kingdom"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9323-9069","authenticated-orcid":false,"given":"Andrew","family":"McPherson","sequence":"additional","affiliation":[{"name":"Dyson School of Design Engineering, Imperial College London, United Kingdom"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5691-8107","authenticated-orcid":false,"given":"Mark","family":"Sandler","sequence":"additional","affiliation":[{"name":"Centre for Digital Music, Queen Mary University of London, United Kingdom"}]}],"member":"320","published-online":{"date-parts":[[2023,10,11]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1080\/09298215.2011.647823"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1076\/jnmr.29.3.211.3092"},{"key":"e_1_3_2_1_3_1","unstructured":"ByteDance. 2023. Mawf. https:\/\/mawf.io\/.  ByteDance. 2023. Mawf. https:\/\/mawf.io\/."},{"key":"e_1_3_2_1_4_1","volume-title":"Tone Transfer: In-Browser Interactive Neural Audio Synthesis. In Joint Proceedings of the ACM IUI 2021 Workshops. ACM, College Station","author":"Carney Michelle","year":"2021","unstructured":"Michelle Carney , Chong Li , Edwin Toh , Nida Zada , Ping Yu , and Jesse Engel . 2021 . Tone Transfer: In-Browser Interactive Neural Audio Synthesis. In Joint Proceedings of the ACM IUI 2021 Workshops. ACM, College Station , United States,, 6\u00a0pages. Michelle Carney, Chong Li, Edwin Toh, Nida Zada, Ping Yu, and Jesse Engel. 2021. Tone Transfer: In-Browser Interactive Neural Audio Synthesis. In Joint Proceedings of the ACM IUI 2021 Workshops. ACM, College Station, United States,, 6\u00a0pages."},{"key":"e_1_3_2_1_5_1","volume-title":"Proceedings of the 23nd ISMIR Conference. International Society of Music Information Retrieval (ISMIR)","author":"Caspe Franco","year":"2022","unstructured":"Franco Caspe , Andrew McPherson , and Mark Sandler . 2022 . DDX7: Differentiable FM Synthesis of Musical Instrument Sounds . In Proceedings of the 23nd ISMIR Conference. International Society of Music Information Retrieval (ISMIR) , Bengaluru, India, 9\u00a0pages. Franco Caspe, Andrew McPherson, and Mark Sandler. 2022. DDX7: Differentiable FM Synthesis of Musical Instrument Sounds. In Proceedings of the 23nd ISMIR Conference. International Society of Music Information Retrieval (ISMIR), Bengaluru, India, 9\u00a0pages."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2022\/682"},{"key":"e_1_3_2_1_7_1","first-page":"7","article-title":"The Synthesis of Complex Audio Spectra by Means of Frequency Modulation","volume":"21","author":"Chowning M.","year":"1973","unstructured":"John\u00a0 M. Chowning . 1973 . The Synthesis of Complex Audio Spectra by Means of Frequency Modulation . JAES 21 , 7 (Sept. 1973), 526\u2013534. John\u00a0M. Chowning. 1973. The Synthesis of Complex Audio Spectra by Means of Frequency Modulation. JAES 21, 7 (Sept. 1973), 526\u2013534.","journal-title":"JAES"},{"key":"e_1_3_2_1_8_1","volume-title":"Proceedings of the International Conference on New Interfaces for Musical Expression,. NIME","author":"Dahlstedt Palle","year":"2017","unstructured":"Palle Dahlstedt . 2017 . Physical Interactions with Digital Strings - A Hybrid Approach to a Digital Keyboard Instrument . In Proceedings of the International Conference on New Interfaces for Musical Expression,. NIME , Aalborg University, Copenhagen, 115\u2013120. Palle Dahlstedt. 2017. Physical Interactions with Digital Strings - A Hybrid Approach to a Digital Keyboard Instrument. In Proceedings of the International Conference on New Interfaces for Musical Expression,. NIME, Aalborg University, Copenhagen, 115\u2013120."},{"key":"e_1_3_2_1_9_1","volume-title":"Proceedings of the 2001 International Computer Music Conference. Michigan Publishing, Havana, Cuba, 4\u00a0pages.","author":"Daudet L.","year":"2001","unstructured":"L. Daudet . 2001 . Transients Modelling by Pruned Wavelet Trees . In Proceedings of the 2001 International Computer Music Conference. Michigan Publishing, Havana, Cuba, 4\u00a0pages. L. Daudet. 2001. Transients Modelling by Pruned Wavelet Trees. In Proceedings of the 2001 International Computer Music Conference. Michigan Publishing, Havana, Cuba, 4\u00a0pages."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.1458024"},{"key":"e_1_3_2_1_11_1","volume-title":"17th International Conference on Digital Audio Effects (DAFx-14)","author":"Derrien Olivier","year":"2014","unstructured":"Olivier Derrien . 2014 . A Very Low Latency Pitch Tracker for Audio to Midi Conversion . In 17th International Conference on Digital Audio Effects (DAFx-14) . DAFX, Erlangen, Germany, 6\u00a0pages. Olivier Derrien. 2014. A Very Low Latency Pitch Tracker for Audio to Midi Conversion. In 17th International Conference on Digital Audio Effects (DAFx-14). DAFX, Erlangen, Germany, 6\u00a0pages."},{"key":"e_1_3_2_1_12_1","volume-title":"DDSP: Differentiable Digital Signal Processing. In 8th International Conference on Learning Representations. ICLR, Addis Ababa, Ethiopia, 19\u00a0pages.","author":"Engel Jesse","year":"2020","unstructured":"Jesse Engel , Lamtharn Hantrakul , Chenjie Gu , and Adam Roberts . 2020 . DDSP: Differentiable Digital Signal Processing. In 8th International Conference on Learning Representations. ICLR, Addis Ababa, Ethiopia, 19\u00a0pages. Jesse Engel, Lamtharn Hantrakul, Chenjie Gu, and Adam Roberts. 2020. DDSP: Differentiable Digital Signal Processing. In 8th International Conference on Learning Representations. ICLR, Addis Ababa, Ethiopia, 19\u00a0pages."},{"key":"e_1_3_2_1_13_1","volume-title":"Proceedings of the 34th International Conference on Machine Learning -","volume":"1077","author":"Engel Jesse","year":"2017","unstructured":"Jesse Engel , Cinjon Resnick , Adam Roberts , Sander Dieleman , Mohammad Norouzi , Douglas Eck , and Karen Simonyan . 2017 . Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders . In Proceedings of the 34th International Conference on Machine Learning - Volume 70(ICML\u201917). JMLR.org, Sydney, NSW, Australia, 1068\u2013 1077 . Jesse Engel, Cinjon Resnick, Adam Roberts, Sander Dieleman, Mohammad Norouzi, Douglas Eck, and Karen Simonyan. 2017. Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70(ICML\u201917). JMLR.org, Sydney, NSW, Australia, 1068\u20131077."},{"key":"e_1_3_2_1_14_1","volume-title":"Proceedings of the 18th Sound and Music Computing Conference, Vol.\u00a0abs\/2103","author":"Ganis Francesco","year":"2021","unstructured":"Francesco Ganis , Erik\u00a0Frej Knudesn , S\u00f8ren V.\u00a0K. Lyster , Robin Otterbein , David S\u00fcdholt , and Cumhur Erkut . 2021 . Real-Time Timbre Transfer and Sound Synthesis Using DDSP . In Proceedings of the 18th Sound and Music Computing Conference, Vol.\u00a0abs\/2103 .07220. Sound and Music Computing, Virtual, 11. Francesco Ganis, Erik\u00a0Frej Knudesn, S\u00f8ren V.\u00a0K. Lyster, Robin Otterbein, David S\u00fcdholt, and Cumhur Erkut. 2021. Real-Time Timbre Transfer and Sound Synthesis Using DDSP. In Proceedings of the 18th Sound and Music Computing Conference, Vol.\u00a0abs\/2103.07220. Sound and Music Computing, Virtual, 11."},{"key":"e_1_3_2_1_15_1","unstructured":"Pascal Gauthier. 2023. Dexed. https:\/\/asb2m10.github.io\/dexed\/.  Pascal Gauthier. 2023. Dexed. https:\/\/asb2m10.github.io\/dexed\/."},{"key":"e_1_3_2_1_16_1","volume-title":"Proceedings of the 22nd ISMIR Conference. International Society of Music Information Retrieval (ISMIR), Online, 8\u00a0pages.","author":"Hayes Ben","year":"2021","unstructured":"Ben Hayes , Charalampos Saitis , and Gy\u00f6rgy Fazekas . 2021 . Neural Waveshaping Synthesis . In Proceedings of the 22nd ISMIR Conference. International Society of Music Information Retrieval (ISMIR), Online, 8\u00a0pages. Ben Hayes, Charalampos Saitis, and Gy\u00f6rgy Fazekas. 2021. Neural Waveshaping Synthesis. In Proceedings of the 22nd ISMIR Conference. International Society of Music Information Retrieval (ISMIR), Online, 8\u00a0pages."},{"key":"e_1_3_2_1_17_1","volume-title":"International Conference on Learning Representations. ICLR","author":"Huang Sicong","year":"2019","unstructured":"Sicong Huang , Qiyang Li , Cem Anil , Xuchan Bao , Sageev Oore , and Roger\u00a0 B. Grosse . 2019 . TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer . In International Conference on Learning Representations. ICLR , New Orlearns, USA, 17\u00a0pages. Sicong Huang, Qiyang Li, Cem Anil, Xuchan Bao, Sageev Oore, and Roger\u00a0B. Grosse. 2019. TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer. In International Conference on Learning Representations. ICLR, New Orlearns, USA, 17\u00a0pages."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3077981.3078039"},{"key":"e_1_3_2_1_19_1","volume-title":"Proceedings of the 2008 Conference on New Interfaces for Musical Expression (NIME08)","author":"L\u00e4hdeoja Otso","year":"2008","unstructured":"Otso L\u00e4hdeoja . 2008 . An Approach to Instrument Augmentation : The Electric Guitar . In Proceedings of the 2008 Conference on New Interfaces for Musical Expression (NIME08) . NIME, Genoa, Italy, 4\u00a0pages. Otso L\u00e4hdeoja. 2008. An Approach to Instrument Augmentation : The Electric Guitar. In Proceedings of the 2008 Conference on New Interfaces for Musical Expression (NIME08). NIME, Genoa, Italy, 4\u00a0pages."},{"key":"e_1_3_2_1_20_1","volume-title":"Conference on Digital Audio Effects. DAFX","author":"Lazzarini Victor","year":"2007","unstructured":"Victor Lazzarini , Joseph Timoney , and Thomas Lysaght . 2007 . Adaptive FM Synthesis. In DAFX-07 the 10th Int . Conference on Digital Audio Effects. DAFX , Bordeaux, France, 6\u00a0pages. Victor Lazzarini, Joseph Timoney, and Thomas Lysaght. 2007. Adaptive FM Synthesis. In DAFX-07 the 10th Int. Conference on Digital Audio Effects. DAFX, Bordeaux, France, 6\u00a0pages."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1017\/S135577180200208X"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2018.2856090"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2007.323267"},{"key":"e_1_3_2_1_24_1","unstructured":"Google Magenta. 2023. DDSP-VST. https:\/\/magenta.tensorflow.org\/ddsp-vst.  Google Magenta. 2023. DDSP-VST. https:\/\/magenta.tensorflow.org\/ddsp-vst."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-47214-0_27"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1162\/comj_a_00565"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0167643"},{"key":"e_1_3_2_1_28_1","first-page":"9","article-title":"The Social Construction of the Early Electronic Music Synthesizer","volume":"4","author":"Pinch Trevor","year":"1998","unstructured":"Trevor Pinch and Frank Trocco . 1998 . The Social Construction of the Early Electronic Music Synthesizer . Icon 4 (1998), 9 \u2013 31 . jstor:23785956 Trevor Pinch and Frank Trocco. 1998. The Social Construction of the Early Electronic Music Synthesizer. Icon 4 (1998), 9\u201331. jstor:23785956","journal-title":"Icon"},{"key":"e_1_3_2_1_29_1","volume-title":"Proceedings of the International Computer Music Conference. Michigan Publishing","author":"P\u00f6pel Cornelius","year":"2005","unstructured":"Cornelius P\u00f6pel and Roger Dannenberg . 2005 . Audio Signal Driven Sound Synthesis . In Proceedings of the International Computer Music Conference. Michigan Publishing , Barcelona, Spain, 5\u00a0pages. Cornelius P\u00f6pel and Roger Dannenberg. 2005. Audio Signal Driven Sound Synthesis. In Proceedings of the International Computer Music Conference. Michigan Publishing, Barcelona, Spain, 5\u00a0pages."},{"key":"e_1_3_2_1_30_1","unstructured":"Qosmo. 2023. Neutone by Qosmo. https:\/\/neutone.space\/.  Qosmo. 2023. Neutone by Qosmo. https:\/\/neutone.space\/."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1080\/09298215.2011.642391"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10444-013-9337-9"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.2307\/3680788"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP43922.2022.9746940"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.21428\/92fbeb44.3d0e9e12"},{"key":"e_1_3_2_1_36_1","volume-title":"Accepted Papers at ICBINB","author":"Turian Joseph","year":"2020","unstructured":"Joseph Turian and Max Henry . 2020. I\u2019m Sorry for Your Loss: Spectrally-Based Audio Distances Are Bad at Pitch . In Accepted Papers at ICBINB 2020 . Curran Associates , Virtual , 16\u00a0pages. Joseph Turian and Max Henry. 2020. I\u2019m Sorry for Your Loss: Spectrally-Based Audio Distances Are Bad at Pitch. In Accepted Papers at ICBINB 2020. Curran Associates, Virtual, 16\u00a0pages."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.23919\/DAFx51585.2021.9768246"},{"key":"e_1_3_2_1_38_1","volume-title":"Proc. 9th ISCA Workshop on Speech Synthesis Workshop (SSW 9). International Speech Communication Association (ISCA)","author":"van\u00a0den Oord Aaron","year":"2016","unstructured":"Aaron van\u00a0den Oord , Sander Dieleman , Heiga Zen , Karen Simonyan , Oriol Vinyals , Alex Graves , Nal Kalchbrenner , Andrew Senior , and Koray Kavukcuoglu . 2016 . WaveNet: A Generative Model for Raw Audio . In Proc. 9th ISCA Workshop on Speech Synthesis Workshop (SSW 9). International Speech Communication Association (ISCA) , Sunnyvale, USA, 15\u00a0pages. Aaron van\u00a0den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. 2016. WaveNet: A Generative Model for Raw Audio. In Proc. 9th ISCA Workshop on Speech Synthesis Workshop (SSW 9). International Speech Communication Association (ISCA), Sunnyvale, USA, 15\u00a0pages."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSA.2005.858531"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1162\/014892600559317"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.3758\/BF03207341"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.21437\/SSW.2019-1"},{"volume-title":"Proceedings of the 1987 International Computer Music Conference. Michigan Publishing, Champaign\/Urbana","author":"Wessel D.","key":"e_1_3_2_1_43_1","unstructured":"D. Wessel , D. Bristow , and Z. Settel . 1987. Control of Phrasing and Articulation in Synthesis . In Proceedings of the 1987 International Computer Music Conference. Michigan Publishing, Champaign\/Urbana , Illinois, USA, 9\u00a0pages. D. Wessel, D. Bristow, and Z. Settel. 1987. Control of Phrasing and Articulation in Synthesis. In Proceedings of the 1987 International Computer Music Conference. Michigan Publishing, Champaign\/Urbana, Illinois, USA, 9\u00a0pages."},{"key":"e_1_3_2_1_44_1","volume-title":"International Conference on Learning Representations. ICLR, Virtual, 27\u00a0pages.","author":"Wu Yusong","year":"2022","unstructured":"Yusong Wu , Ethan Manilow , Yi Deng , Rigel Swavely , Kyle Kastner , Tim Cooijmans , Aaron Courville , Cheng- Zhi\u00a0Anna Huang , and Jesse Engel . 2022 . MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling . In International Conference on Learning Representations. ICLR, Virtual, 27\u00a0pages. Yusong Wu, Ethan Manilow, Yi Deng, Rigel Swavely, Kyle Kastner, Tim Cooijmans, Aaron Courville, Cheng-Zhi\u00a0Anna Huang, and Jesse Engel. 2022. MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling. In International Conference on Learning Representations. ICLR, Virtual, 27\u00a0pages."},{"key":"e_1_3_2_1_45_1","volume-title":"Proceedings of the 23nd ISMIR Conference. International Society of Music Information Retrieval (ISMIR)","author":"Da-Yi","year":"2022","unstructured":"Da-Yi Wu1, Wen-Yi Hsiao , Fu-Rong Yang , Oscar Friedman , Warren Jackson , Scott Bruzenak , Yi-Wen Liu , and Yi-Hsuan Yang . 2022 . DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation . In Proceedings of the 23nd ISMIR Conference. International Society of Music Information Retrieval (ISMIR) , Bengaluru, India, 8\u00a0pages. Da-Yi Wu1, Wen-Yi Hsiao, Fu-Rong Yang, Oscar Friedman, Warren Jackson, Scott Bruzenak, Yi-Wen Liu, and Yi-Hsuan Yang. 2022. DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation. In Proceedings of the 23nd ISMIR Conference. International Society of Music Information Retrieval (ISMIR), Bengaluru, India, 8\u00a0pages."}],"event":{"name":"AM '23: Audio Mostly 2023","acronym":"AM '23","location":"Edinburgh United Kingdom"},"container-title":["Proceedings of the 18th International Audio Mostly Conference"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3616195.3616196","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3616195.3616196","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:29:49Z","timestamp":1750285789000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3616195.3616196"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,30]]},"references-count":45,"alternative-id":["10.1145\/3616195.3616196","10.1145\/3616195"],"URL":"https:\/\/doi.org\/10.1145\/3616195.3616196","relation":{},"subject":[],"published":{"date-parts":[[2023,8,30]]},"assertion":[{"value":"2023-10-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}