{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,9]],"date-time":"2026-05-09T14:27:18Z","timestamp":1778336838108,"version":"3.51.4"},"publisher-location":"New York, NY, USA","reference-count":35,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,6,27]],"date-time":"2022-06-27T00:00:00Z","timestamp":1656288000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62172439, 61702561, 62122095, 62072472, U19A2067"],"award-info":[{"award-number":["62172439, 61702561, 62122095, 62072472, U19A2067"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Key R&D Program of China","award":["2019YFA0706403"],"award-info":[{"award-number":["2019YFA0706403"]}]},{"name":"Natural Science Foundation of Hunan Province","award":["2020JJ5774, 2020JJ2050"],"award-info":[{"award-number":["2020JJ5774, 2020JJ2050"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,6,27]]},"DOI":"10.1145\/3498361.3538932","type":"proceedings-article","created":{"date-parts":[[2022,6,16]],"date-time":"2022-06-16T16:21:53Z","timestamp":1655396513000},"page":"209-221","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":70,"title":["CoDL"],"prefix":"10.1145","author":[{"given":"Fucheng","family":"Jia","sequence":"first","affiliation":[{"name":"Central South University"}]},{"given":"Deyu","family":"Zhang","sequence":"additional","affiliation":[{"name":"Central South University"}]},{"given":"Ting","family":"Cao","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]},{"given":"Shiqi","family":"Jiang","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]},{"given":"Yunxin","family":"Liu","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Ju","family":"Ren","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Yaoxue","family":"Zhang","sequence":"additional","affiliation":[{"name":"Tsinghua University and Central South University"}]}],"member":"320","published-online":{"date-parts":[[2022,6,27]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Snapdragon 855. 2021. https:\/\/www.qualcomm.com\/products\/snapdragon-855-mobile-platform  Snapdragon 855. 2021. https:\/\/www.qualcomm.com\/products\/snapdragon-855-mobile-platform"},{"key":"e_1_3_2_1_2_1","unstructured":"Snapdragon 865. 2021. https:\/\/www.qualcomm.com\/products\/snapdragon-865-5g-mobile-platform  Snapdragon 865. 2021. https:\/\/www.qualcomm.com\/products\/snapdragon-865-5g-mobile-platform"},{"key":"e_1_3_2_1_3_1","unstructured":"Snapdragon 888. 2021. https:\/\/www.qualcomm.com\/products\/snapdragon-888-5g-mobile-platform  Snapdragon 888. 2021. https:\/\/www.qualcomm.com\/products\/snapdragon-888-5g-mobile-platform"},{"key":"e_1_3_2_1_4_1","unstructured":"Kirin 990. 2021. https:\/\/www.hisilicon.com\/en\/products\/Kirin\/Kirin-flagship-chips\/Kirin-990-5G  Kirin 990. 2021. https:\/\/www.hisilicon.com\/en\/products\/Kirin\/Kirin-flagship-chips\/Kirin-990-5G"},{"key":"e_1_3_2_1_5_1","unstructured":"Jie An Haoyi Xiong Jiebo Luo Jun Huan and Jinwen Ma. 2019. Fast Universal Style Transfer for Artistic and Photorealistic Rendering. arXiv:1907.03118 [cs.CV]  Jie An Haoyi Xiong Jiebo Luo Jun Huan and Jinwen Ma. 2019. Fast Universal Style Transfer for Artistic and Photorealistic Rendering. arXiv:1907.03118 [cs.CV]"},{"key":"e_1_3_2_1_6_1","unstructured":"Ermao Cai Da-Cheng Juan Dimitrios Stamoulis and Diana Marculescu. 2017. NeuralPower: Predict and Deploy Energy-Efficient Convolutional Neural Networks. In ACML. 622--637.  Ermao Cai Da-Cheng Juan Dimitrios Stamoulis and Diana Marculescu. 2017. NeuralPower: Predict and Deploy Energy-Efficient Convolutional Neural Networks. In ACML. 622--637."},{"key":"e_1_3_2_1_7_1","volume-title":"TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In OSDI 18","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen , Thierry Moreau , Ziheng Jiang , Lianmin Zheng , Eddie Yan , Haichen Shen , Meghan Cowan , Leyuan Wang , Yuwei Hu , Luis Ceze , Carlos Guestrin , and Arvind Krishnamurthy . 2018 . TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In OSDI 18 . USENIX Association , Carlsbad, CA , 578--594. Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In OSDI 18. USENIX Association, Carlsbad, CA, 578--594."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Jiankang Deng Jia Guo Yuxiang Zhou Jinke Yu Irene Kotsia and Stefanos Zafeiriou. 2019. RetinaFace: Single-stage Dense Face Localisation in the Wild. arXiv:1905.00641 [cs.CV]  Jiankang Deng Jia Guo Yuxiang Zhou Jinke Yu Irene Kotsia and Stefanos Zafeiriou. 2019. RetinaFace: Single-stage Dense Face Localisation in the Wild. arXiv:1905.00641 [cs.CV]","DOI":"10.1109\/CVPR42600.2020.00525"},{"key":"e_1_3_2_1_9_1","unstructured":"Mali G76. 2021. https:\/\/developer.arm.com\/ip-products\/graphics-and-multimedia\/mali-gpus\/mali-g76-gpu  Mali G76. 2021. https:\/\/developer.arm.com\/ip-products\/graphics-and-multimedia\/mali-gpus\/mali-g76-gpu"},{"key":"e_1_3_2_1_10_1","unstructured":"Ling Huang Jinzhu Jia Bin Yu Byung-Gon Chun Petros Maniatis and Mayur Naik. 2010. Predicting Execution Time of Computer Programs Using Sparse Polynomial Regression.. In NIPS.  Ling Huang Jinzhu Jia Bin Yu Byung-Gon Chun Petros Maniatis and Mayur Naik. 2010. Predicting Execution Time of Computer Programs Using Sparse Polynomial Regression.. In NIPS."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3081333.3081360"},{"key":"e_1_3_2_1_12_1","volume-title":"Profiling and Optimizing Deep Learning Inference on Mobile GPUs. In APSys '20","author":"Jiang Shiqi","year":"2020","unstructured":"Shiqi Jiang , Lihao Ran , Ting Cao , Yusen Xu , and Yunxin Liu . 2020 . Profiling and Optimizing Deep Learning Inference on Mobile GPUs. In APSys '20 . Association for Computing Machinery, New York, NY, USA, 75--81. Shiqi Jiang, Lihao Ran, Ting Cao, Yusen Xu, and Yunxin Liu. 2020. Profiling and Optimizing Deep Learning Inference on Mobile GPUs. In APSys '20. Association for Computing Machinery, New York, NY, USA, 75--81."},{"key":"e_1_3_2_1_13_1","volume-title":"LaLaRAND: Flexible Layer-by-Layer CPU\/GPU Scheduling for Real-Time DNN Tasks. In 2021 IEEE Real-Time Systems Symposium (RTSS). 329--341","author":"Kang Woosung","year":"2021","unstructured":"Woosung Kang , Kilho Lee , Jinkyu Lee , Insik Shin , and Hoon Sung Chwa . 2021 . LaLaRAND: Flexible Layer-by-Layer CPU\/GPU Scheduling for Real-Time DNN Tasks. In 2021 IEEE Real-Time Systems Symposium (RTSS). 329--341 . Woosung Kang, Kilho Lee, Jinkyu Lee, Insik Shin, and Hoon Sung Chwa. 2021. LaLaRAND: Flexible Layer-by-Layer CPU\/GPU Scheduling for Real-Time DNN Tasks. In 2021 IEEE Real-Time Systems Symposium (RTSS). 329--341."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3093337.3037698"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3302424.3303950"},{"key":"e_1_3_2_1_16_1","volume-title":"DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices. In 2016 15th ACM\/IEEE International Conference on Information Processing in Sensor Networks (IPSN). 1--12","author":"Lane Nicholas D.","year":"2016","unstructured":"Nicholas D. Lane , Sourav Bhattacharya , Petko Georgiev , Claudio Forlivesi , Lei Jiao , Lorena Qendro , and Fahim Kawsar . 2016 . DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices. In 2016 15th ACM\/IEEE International Conference on Information Processing in Sensor Networks (IPSN). 1--12 . Nicholas D. Lane, Sourav Bhattacharya, Petko Georgiev, Claudio Forlivesi, Lei Jiao, Lorena Qendro, and Fahim Kawsar. 2016. DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices. In 2016 15th ACM\/IEEE International Conference on Information Processing in Sensor Networks (IPSN). 1--12."},{"key":"e_1_3_2_1_17_1","unstructured":"Tensorflow Lite. 2020. https:\/\/www.tensorflow.org\/lite\/  Tensorflow Lite. 2020. https:\/\/www.tensorflow.org\/lite\/"},{"key":"e_1_3_2_1_18_1","unstructured":"Tensorflow Lite. 2020. https:\/\/www.tensorflow.org\/lite\/  Tensorflow Lite. 2020. https:\/\/www.tensorflow.org\/lite\/"},{"key":"e_1_3_2_1_19_1","unstructured":"MACE. 2020. https:\/\/github.com\/XiaoMi\/mace  MACE. 2020. https:\/\/github.com\/XiaoMi\/mace"},{"key":"e_1_3_2_1_20_1","unstructured":"MNN. 2020. https:\/\/github.com\/alibaba\/MNN  MNN. 2020. https:\/\/github.com\/alibaba\/MNN"},{"key":"e_1_3_2_1_21_1","unstructured":"OpenCL. 2021. https:\/\/www.khronos.org\/opencl\/  OpenCL. 2021. https:\/\/www.khronos.org\/opencl\/"},{"key":"e_1_3_2_1_22_1","volume-title":"Paleo: A Performance Model for Deep Neural Networks. In ICLR.","author":"Qi Hang","year":"2017","unstructured":"Hang Qi , Evan R. Sparks , and Ameet Talwalkar . 2017 . Paleo: A Performance Model for Deep Neural Networks. In ICLR. Hang Qi, Evan R. Sparks, and Ameet Talwalkar. 2017. Paleo: A Performance Model for Deep Neural Networks. In ICLR."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"crossref","unstructured":"J. Redmon and A. Farhadi. 2017. YOLO9000: Better Faster Stronger. In CVPR 6517--6525.  J. Redmon and A. Farhadi. 2017. YOLO9000: Better Faster Stronger. In CVPR 6517--6525.","DOI":"10.1109\/CVPR.2017.690"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00474"},{"key":"e_1_3_2_1_25_1","volume-title":"Very Deep Convolutional Networks for Large-Scale Image Recognition. In International Conference on Learning Representations.","author":"Simonyan Karen","year":"2015","unstructured":"Karen Simonyan and Andrew Zisserman . 2015 . Very Deep Convolutional Networks for Large-Scale Image Recognition. In International Conference on Learning Representations. Karen Simonyan and Andrew Zisserman. 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. In International Conference on Learning Representations."},{"key":"e_1_3_2_1_26_1","volume-title":"Ting Cao, and Yunxin Liu.","author":"Tang Xiaohu","year":"2021","unstructured":"Xiaohu Tang , Shihao Han , Li Lyna Zhang , Ting Cao, and Yunxin Liu. 2021 . To Bridge Neural Network Design and Real-World Performance: A Behaviour Study for Neural Networks. In MLSys . https:\/\/www.microsoft.com\/en-us\/research\/publication\/to-bridge-neural-network-design-and-real-world-performance-a-behaviour-study-for-neural-networks\/ Xiaohu Tang, Shihao Han, Li Lyna Zhang, Ting Cao, and Yunxin Liu. 2021. To Bridge Neural Network Design and Real-World Performance: A Behaviour Study for Neural Networks. In MLSys. https:\/\/www.microsoft.com\/en-us\/research\/publication\/to-bridge-neural-network-design-and-real-world-performance-a-behaviour-study-for-neural-networks\/"},{"key":"e_1_3_2_1_27_1","unstructured":"TinyML. 2021. https:\/\/github.com\/BurnellLiu\/TinyML  TinyML. 2021. https:\/\/github.com\/BurnellLiu\/TinyML"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447993.3448625"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2873210"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2019.2944584"},{"key":"e_1_3_2_1_31_1","unstructured":"Bichen Wu Xiaoliang Dai Peizhao Zhang Yanghan Wang Fei Sun Yiming Wu Yuandong Tian Peter Vajda Yangqing Jia and Kurt Keutzer. 2019. FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search. In CVPR. 10726--10734.  Bichen Wu Xiaoliang Dai Peizhao Zhang Yanghan Wang Fei Sun Yiming Wu Yuandong Tian Peter Vajda Yangqing Jia and Kurt Keutzer. 2019. FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search. In CVPR. 10726--10734."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3384419.3430726"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458864.3467882"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2858384"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.525"}],"event":{"name":"MobiSys '22: The 20th Annual International Conference on Mobile Systems, Applications and Services","location":"Portland Oregon","acronym":"MobiSys '22","sponsor":["SIGMOBILE ACM Special Interest Group on Mobility of Systems, Users, Data and Computing","SIGOPS ACM Special Interest Group on Operating Systems"]},"container-title":["Proceedings of the 20th Annual International Conference on Mobile Systems, Applications and Services"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3498361.3538932","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3498361.3538932","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:10:04Z","timestamp":1750183804000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3498361.3538932"}},"subtitle":["efficient CPU-GPU co-execution for deep learning inference on mobile devices"],"short-title":[],"issued":{"date-parts":[[2022,6,27]]},"references-count":35,"alternative-id":["10.1145\/3498361.3538932","10.1145\/3498361"],"URL":"https:\/\/doi.org\/10.1145\/3498361.3538932","relation":{},"subject":[],"published":{"date-parts":[[2022,6,27]]},"assertion":[{"value":"2022-06-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}