{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T02:09:48Z","timestamp":1773281388867,"version":"3.50.1"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"2","funder":[{"name":"Institute for Information and Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government","award":["RS-2018-II180532, 50%"],"award-info":[{"award-number":["RS-2018-II180532, 50%"]}]},{"name":"National Research Foundation of Korea (NRF) grant funded by the Korea government","award":["RS-2024-00344323, 50%"],"award-info":[{"award-number":["RS-2024-00344323, 50%"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2026,3,31]]},"abstract":"<jats:p>Running multiple deep neural networks (DNNs) simultaneously on mobile devices introduces challenges due to constrained computing resources. Previous research has explored the use of heterogeneous processors for accelerating DNN inference but often overlooks thermal issues, which can degrade computing power. In this article, we propose Phoenix, a system specifically designed to enhance the performance of multi-instance DNNs in video applications by maximizing accuracy and ensuring the achievement of a required frame rate. Phoenix allocates DNN tasks to the most suitable hardware processors, understanding complex thermal dynamics through reinforcement learning, and postpones the onset of thermal throttling. Despite optimized task allocation, continuous inference of multiple DNNs can still lead to thermal throttling. To manage performance degradation, Phoenix employs a multi-exit network, adaptively executing inference tasks to ensure consistent frame rates. Phoenix minimizes accuracy loss from early exits by optimally generating and operating multi-exit networks. 
We evaluated Phoenix using two different benchmarks and a Virtual YouTuber streaming application. The results demonstrated that Phoenix effectively enhances device performance by delaying thermal throttling and achieving optimal accuracy while maintaining a consistent frame rate.<\/jats:p>","DOI":"10.1145\/3793860","type":"journal-article","created":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T21:16:22Z","timestamp":1770326182000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Phoenix: Thermal-Aware On-Device Inference of Multi-Instance DNNs for Mobile Video Applications"],"prefix":"10.1145","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7956-5271","authenticated-orcid":false,"given":"Seunghyeok","family":"Jeon","sequence":"first","affiliation":[{"name":"Computer Science and Engineering, Yonsei University","place":["Seodaemun-gu, Korea (the Republic of)"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5182-2667","authenticated-orcid":false,"given":"Jiwon","family":"Kim","sequence":"additional","affiliation":[{"name":"Uppsala University","place":["Uppsala, Sweden"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9035-2602","authenticated-orcid":false,"given":"Jeho","family":"Lee","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering, Yonsei University","place":["Seodaemun-gu, Korea (the Republic of)"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9060-5091","authenticated-orcid":false,"given":"Hojung","family":"Cha","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering, Yonsei University","place":["Seodaemun-gu, Korea (the Republic 
of)"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2026,3,6]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-35602-5_37"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/3372224.3380881"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3384419.3430721"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3372224.3419192"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3498361.3538948"},{"key":"e_1_3_1_7_2","first-page":"10","volume-title":"Proceedings of the 2023 USENIX Annual Technical Conference (USENIX ATC 23)","author":"Sung H.-H.","year":"2023","unstructured":"H.-H. Sung, J.-A. Chen, W. Niu, J. Guan, B. Ren, and X. Shen. 2023. Decentralized application-level adaptive scheduling for multi-instance DNNs on open mobile devices. In Proceedings of the 2023 USENIX Annual Technical Conference (USENIX ATC 23), Boston, MA, USA, 
10\u201312"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3241539.3241559"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/2500423.2505320"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/SEMI-THERM.2015.7100138"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/SMARTCOMP52413.2021.00021"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3458864.3468161"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2017.2743240"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.5555\/1622737.1622748"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3469116.3470012"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2016.7900006"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3487552.3487863"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.3390\/electronics9122106"},{"key":"e_1_3_1_19_2","unstructured":"2017. Google NNAPI. Retrieved from https:\/\/developer.android.com\/ndk\/guides\/neuralnetworks"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/MDAT.2017.2695958"},{"key":"e_1_3_1_21_2","first-page":"7","volume-title":"Proceedings of the Eighth IEEE\/ACM\/IFIP International Conference on Hardware\/Software Codesign and System Synthesis","author":"Jung W.","year":"2023","unstructured":"W. Jung, C. Kang, C. Yoon, D. Kim, and H. Cha. 2023. DevScope: A nonintrusive and online power analysis tool for smartphone hardware components. 
In Proceedings of the Eighth IEEE\/ACM\/IFIP International Conference on Hardware\/Software Codesign and System Synthesis, Tampere, Finland, 7\u201312."},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992698"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1146\/annurev-statistics-031219-041220"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-70093-9_50"},{"key":"e_1_3_1_25_2","first-page":"13","volume-title":"Proceedings of the Learning for Dynamics and Control. PMLR, Virtual Conference","author":"Fan J.","year":"2020","unstructured":"J. Fan, Z. Wang, Y. Xie, and Z. Yang. 2020. A theoretical analysis of deep Q-learning. In Proceedings of the Learning for Dynamics and Control. PMLR, Virtual Conference, 13\u201318."},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3498361.3538791"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3400302.3415698"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.5555\/3322706.3361996"},{"key":"e_1_3_1_29_2","unstructured":"M. Wistuba A. Rawat and T. Pedapati. 2019. A survey on neural architecture search. arXiv preprint arXiv:1905.01392."},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN52387.2021.9533875"},{"key":"e_1_3_1_31_2","unstructured":"VTUBER (Virtual YouTuber) Market size share growth and industry analysis by type (2D VTuber 3D VTuber) by application (livestreaming & performance digital contents & derivative others) and regional insight and forecast to 2031. 2025. Retrieved from https:\/\/www.businessresearchinsights.com\/market-reports\/VTuber-virtual-youtuber-market-109503"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW53098.2021.00265"},{"key":"e_1_3_1_33_2","unstructured":"MediaPipe Face Detection. 2025. 
Retrieved from https:\/\/github.com\/google\/mediapipe\/blob\/master\/docs\/solutions\/face_detection.md"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-97909-0_46"},{"key":"e_1_3_1_35_2","unstructured":"MediaPipe Tasks Face Landmark Detection. 2025. Retrieved from https:\/\/github.com\/googlesamples\/mediapipe\/tree\/main\/examples\/face_landmarker\/android"},{"key":"e_1_3_1_36_2","unstructured":"2025. Jocher G. Yolov5. Retrieved from https:\/\/github.com\/ultralytics\/yolov5"},{"key":"e_1_3_1_37_2","volume-title":"Proceedings of the International Conference on Machine Learning. PMLR","author":"Tan M.","year":"2019","unstructured":"M. Tan and Q. Le. 2019. Efficientnet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning. PMLR, Long Beach, California, USA"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581791.3596851"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2017.2770163"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2873210"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2855180"},{"issue":"5","key":"e_1_3_1_42_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3762655","article-title":"THERMOS: Thermally-Aware multi-objective scheduling of AI workloads on heterogeneous multi-chiplet PIM architectures","volume":"24","author":"Alish Kanani","year":"2025","unstructured":"Kanani Alish et al. 2025. THERMOS: Thermally-Aware multi-objective scheduling of AI workloads on heterogeneous multi-chiplet PIM architectures. 
ACM Transactions on Embedded Computing Systems 24, 5s (2025), 1\u201326.","journal-title":"ACM Transactions on Embedded Computing Systems"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3386359"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISLPED58423.2023.10244486"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581791.3596870"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/3576842.3582375"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.23919\/DATE56975.2023.10137095"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3793860","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T11:16:35Z","timestamp":1773227795000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3793860"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,6]]},"references-count":46,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,3,31]]}},"alternative-id":["10.1145\/3793860"],"URL":"https:\/\/doi.org\/10.1145\/3793860","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,6]]},"assertion":[{"value":"2025-05-13","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-01-20","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-03-06","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}