{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,24]],"date-time":"2025-10-24T16:44:21Z","timestamp":1761324261018,"version":"3.41.0"},"reference-count":62,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2019,12,26]],"date-time":"2019-12-26T00:00:00Z","timestamp":1577318400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100000923","name":"Australian Research Council grant","doi-asserted-by":"crossref","award":["RG171010"],"award-info":[{"award-number":["RG171010"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100012659","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61802368, 61521092, 61432016, 61432018, 61332009, 61702485, and 61872043"],"award-info":[{"award-number":["61802368, 61521092, 61432016, 61432018, 61332009, 61702485, and 61872043"]}],"id":[{"id":"10.13039\/501100012659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2016YFB1000402"],"award-info":[{"award-number":["2016YFB1000402"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"name":"CCF-Tencent Open Research Fund"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2019,12,31]]},"abstract":"<jats:p>Deep Neural Networks (DNNs) are now increasingly adopted in a variety of Artificial Intelligence (AI) applications. Meanwhile, more and more DNNs are moving from the cloud to mobile devices, as emerging AI chips are integrated into mobile platforms. 
Therefore, a DNN model can be deployed in the cloud, on a mobile device, or through coordinated mobile-cloud processing, making it a significant challenge to select an optimal deployment strategy for specific objectives.<\/jats:p>\n          <jats:p>This article proposes a DNN tuning framework, DNNTune, that can provide layer-wise behavior analysis across a number of platforms. Using DNNTune, this article further selects 13 representative DNN models, including CNNs, LSTMs, and MLPs, together with three mobile devices ranging from low-end to high-end and two AI accelerator chips, and characterizes the DNN models on these devices to assist users in finding opportunities for coordinated mobile-cloud computing. Our experimental results demonstrate that DNNTune can find a coordinated deployment achieving up to 1.66\u00d7 speedup and 15\u00d7 energy savings compared with mobile-only and cloud-only deployments.<\/jats:p>","DOI":"10.1145\/3368305","type":"journal-article","created":{"date-parts":[[2019,12,26]],"date-time":"2019-12-26T21:05:46Z","timestamp":1577394346000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":24,"title":["DNNTune"],"prefix":"10.1145","volume":"16","author":[{"given":"Chunwei","family":"Xia","sequence":"first","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China and School of Computer Science and Technology, University of Chinese Academy of Sciences, Shijingshan District, Beijing, China"}]},{"given":"Jiacheng","family":"Zhao","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China and School of Computer Science and Technology, University of Chinese Academy of Sciences, Shijingshan District, Beijing, 
China"}]},{"given":"Huimin","family":"Cui","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China and School of Computer Science and Technology, University of Chinese Academy of Sciences, Shijingshan District, Beijing, China"}]},{"given":"Xiaobing","family":"Feng","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China and School of Computer Science and Technology, University of Chinese Academy of Sciences, Shijingshan District, Beijing, China"}]},{"given":"Jingling","family":"Xue","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering University of New South Wales, Sydney, Australia, NSW"}]}],"member":"320","published-online":{"date-parts":[[2019,12,26]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Guohui Wang. 2015. OpenCL-Z Android Official Webpage. Retrieved from http:\/\/web.guohuiwang.com\/software\/opencl_z_android."},{"key":"e_1_2_1_2_1","unstructured":"Qualcomm Technologies Inc. 2016. Snapdragon 820 Mobile Platform. Retrieved from https:\/\/www.qualcomm.com\/products\/snapdragon-820-mobile-platform."},{"key":"e_1_2_1_3_1","unstructured":"Apple Inc. 2017. The future is here: iPhone X. Retrieved from https:\/\/www.apple.com\/newsroom\/2017\/09\/the-future-is-here-iphone-x\/."},{"key":"e_1_2_1_4_1","unstructured":"WikiChip. 2017. Kirin 970\u2014HiSilicon. 
Retrieved from https:\/\/en.wikichip.org\/wiki\/hisilicon\/kirin\/970."},{"key":"e_1_2_1_5_1","unstructured":"Open Neural Network Exchange. 2017. Open Neural Network Exchange. Retrieved from https:\/\/github.com\/onnx\/onnx."},{"key":"e_1_2_1_6_1","unstructured":"Matt Humrick and Ryan Smith. 2017. The Qualcomm Snapdragon 835 Performance Preview. Retrieved from https:\/\/www.anandtech.com\/show\/11201\/qualcomm-snapdragon-835-performance-preview\/2."},{"key":"e_1_2_1_7_1","unstructured":"Michael Passingham. 2017. Snapdragon 835 Benchmarks Revealed: All you need to know about the new chip. Retrieved from https:\/\/www.trustedreviews.com\/news\/snapdragon-835-phones-processor-specs-speed-benchmark-chipset-cores-2944086."},{"key":"e_1_2_1_8_1","unstructured":"Gartner, Inc. 2018. Gartner Highlights 10 Uses for AI-Powered Smartphones. Retrieved from https:\/\/www.gartner.com\/en\/newsroom\/press-releases\/2018-03-20-gartner-highlights-10-uses-for-ai-powered-smartphones."},{"key":"e_1_2_1_9_1","unstructured":"HUAWEI Developer. 2018. HiAI Foundation. 
Retrieved from https:\/\/developer.huawei.com\/consumer\/cn\/hiai#Foundation."},{"key":"e_1_2_1_10_1","unstructured":"Amazon.com Inc. 2018. Huawei Honor 10. Retrieved from https:\/\/www.amazon.com\/Huawei-10-128GB-Factory-Unlocked-Smartphone\/dp\/B07D7GZBDW."},{"key":"e_1_2_1_11_1","unstructured":"Wikipedia. 2018. Kryo CPU. Retrieved from https:\/\/en.wikipedia.org\/wiki\/Kryo."},{"key":"e_1_2_1_12_1","unstructured":"Qualcomm Technologies Inc. and\/or its affiliated companies. 2018. List of Qualcomm Snapdragon systems-on-chip. Retrieved from https:\/\/en.wikipedia.org\/wiki\/List_of_Qualcomm_Snapdragon_systems-on-chip."},{"key":"e_1_2_1_13_1","unstructured":"MACE Developers. 2018. Mobile AI Compute Engine Documentation. Retrieved from https:\/\/mace.readthedocs.io\/en\/latest\/index.html."},{"key":"e_1_2_1_14_1","unstructured":"LineageOS Wiki. 2018. OnePlus 3. Retrieved from https:\/\/wiki.lineageos.org\/devices\/oneplus3."},{"key":"e_1_2_1_15_1","unstructured":"LineageOS Wiki. 2018. OnePlus 5t. Retrieved from https:\/\/wiki.lineageos.org\/devices\/dumpling."},{"key":"e_1_2_1_16_1","unstructured":"GSMArena.com. 2018. RedMi Note 4x. Retrieved from https:\/\/www.gsmarena.com\/xiaomi_redmi_note_4x-8580.php."},{"key":"e_1_2_1_17_1","unstructured":"Google Developers. 2018. Simpleperf. Retrieved from https:\/\/developer.android.com\/ndk\/guides\/simpleperf."},{"key":"e_1_2_1_18_1","unstructured":"Qualcomm Technologies Inc. 2018. Snapdragon Profiler. Retrieved from https:\/\/developer.qualcomm.com\/software\/snapdragon-profiler."},{"key":"e_1_2_1_19_1","unstructured":"Google Inc. 2018. Tensorflow Lite. Retrieved from https:\/\/www.tensorflow.org\/mobile\/tflite\/."},{"key":"e_1_2_1_20_1","unstructured":"Google Inc. 2018. TensorFlow Lite is for mobile and embedded devices. Retrieved from https:\/\/www.tensorflow.org\/lite\/."},{"key":"e_1_2_1_21_1","unstructured":"TestMy.net. 2018. TestMyNet: Internet Speed Test. Retrieved from https:\/\/testmy.net\/."},{"key":"e_1_2_1_22_1","unstructured":"Google Inc. 2019. Hosted models. Retrieved from https:\/\/www.tensorflow.org\/lite\/guide\/hosted_models."},{"key":"e_1_2_1_23_1","unstructured":"NVIDIA Corporation. 2019. Jetson TX2 Module. 
Retrieved from https:\/\/developer.nvidia.com\/embedded\/jetson-tx2."},{"key":"e_1_2_1_24_1","unstructured":"Google Inc. 2019. Model optimization. Retrieved from https:\/\/www.tensorflow.org\/lite\/performance\/model_optimization."},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the OSDI\u201916","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi, Paul Barham, Jianmin Chen, et al. 2016. TensorFlow: A system for large-scale machine learning. In Proceedings of the OSDI\u201916. 265--283."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the OSDI\u201918","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. TVM: An automated end-to-end optimizing compiler for deep learning. In Proceedings of the OSDI\u201918. USENIX Association, 578--594. Retrieved from https:\/\/www.usenix.org\/conference\/osdi18\/presentation\/chen."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1966445.1966473"},{"key":"e_1_2_1_29_1","volume-title":"Balasubramanian et al","author":"Cuervo Eduardo","year":"2010","unstructured":"Eduardo Cuervo, Balasubramanian et al. 2010. MAUI: Making smartphones last longer with code offload. In Proceedings of the MobiSys\u201910. 49--62."},{"key":"e_1_2_1_30_1","volume-title":"Dae-ki Cho et al","author":"Cuervo Eduardo","year":"2010","unstructured":"Eduardo Cuervo, Aruna Balasubramanian, Dae-ki Cho et al. 2010. MAUI: Making smartphones last longer with code offload. In Proceedings of the MobiSys\u201910. ACM, 49--62."},{"key":"e_1_2_1_31_1","volume-title":"Socher et al","author":"Deng J.","year":"2009","unstructured":"J. Deng, W. Dong, R. Socher et al. 2009. ImageNet: A large-scale hierarchical image database. In Proceedings of the CVPR\u201909."},{"key":"e_1_2_1_32_1","doi-asserted-by":"crossref","unstructured":"Mark Everingham Luc Gool Christopher K. Williams et al. [n.d.]. The Pascal visual object classes (VOC) challenge. Int. J. Comput. Vision 88 2 ([n.d.]) 303--338.","DOI":"10.1007\/s11263-009-0275-4"},{"key":"e_1_2_1_33_1","first-page":"93","article-title":"COMET: Code offload by migrating execution transparently","volume":"12","author":"Gordon Mark S.","year":"2012","unstructured":"Mark S. Gordon, Davoud Anoushe Jamshidi, Scott A. Mahlke et al. 2012. COMET: Code offload by migrating execution transparently. In Proceedings of the OSDI, Vol. 12. 93--106.","journal-title":"Proceedings of the OSDI"},{"key":"e_1_2_1_34_1","unstructured":"Ga\u00ebl Guennebaud Beno\u00eet Jacob et al. 2010. Eigen v3. Retrieved from http:\/\/eigen.tuxfamily.org."},{"key":"e_1_2_1_35_1","volume-title":"International Conference on Machine Learning. 1737--1746","author":"Gupta Suyog","year":"2015","unstructured":"Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. 2015. Deep learning with limited numerical precision. In International Conference on Machine Learning. 1737--1746."},{"key":"e_1_2_1_36_1","volume-title":"Dally","author":"Han Song","year":"2015","unstructured":"Song Han, Huizi Mao, and William J. Dally. 2015. Deep compression: Compressing deep neural network with pruning, trained quantization, and Huffman coding. arXiv preprint arXiv:1510.00149 (2015)."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2906388.2906396"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3204949.3204975"},{"key":"e_1_2_1_39_1","volume-title":"Ng","author":"Hannun Awni Y.","year":"2014","unstructured":"Awni Y. Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, and Andrew Y. Ng. 2014. Deep speech: Scaling up end-to-end speech recognition. CoRR abs\/1412.5567 (2014). Retrieved from http:\/\/arxiv.org\/abs\/1412.5567."},{"volume-title":"Proceedings of the CVPR\u201916","author":"He K.","key":"e_1_2_1_40_1","unstructured":"K. He, X. Zhang, S. Ren, and J. Sun. 2016. Deep residual learning for image recognition. In Proceedings of the CVPR\u201916. 770--778."},{"key":"e_1_2_1_41_1","unstructured":"Andrew G. Howard Menglong Zhu Bo Chen et al. 2017. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3081333.3081360"},{"key":"e_1_2_1_43_1","unstructured":"Forrest N. Iandola Matthew W. Moskewicz et al. 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and &lt;1 MB model size. 
arXiv preprint arXiv:1602.07360 (2016)."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3037697.3037698"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2019.00021"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2017.7975270"},{"key":"e_1_2_1_47_1","unstructured":"Yann LeCun and Corinna Cortes. 2010. MNIST handwritten digit database. Retrieved from http:\/\/yann.lecun.com\/exdb\/mnist\/."},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01264-9_8"},{"key":"e_1_2_1_49_1","first-page":"2","article-title":"Building a large annotated corpus of English: The Penn treebank","volume":"19","author":"Marcus Mitchell P.","year":"1993","unstructured":"Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. 1993. Building a large annotated corpus of English: The Penn treebank. Comput. Linguist. 19, 2 (June 1993), 313--330.","journal-title":"Comput. Linguist."},{"key":"e_1_2_1_50_1","unstructured":"Qualcomm Technologies Inc. 2017. Qualcomm-SnapdragonTM Mobile Platform OpenCL General Programming and Optimization."},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'18)","author":"Sandler Mark","year":"2018","unstructured":"Mark Sandler, Andrew G. Howard, Menglong Zhu, et al. 2018. 
Inverted residuals and linear bottlenecks: Mobile networks for classification, detection, and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'18)."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCBD.2016.029"},{"key":"e_1_2_1_53_1","volume-title":"Proceedings of the 3rd International Conference on Learning Representations (ICLR'15)","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations (ICLR'15). http:\/\/arxiv.org\/abs\/1409.1556."},{"key":"e_1_2_1_54_1","volume-title":"Raluca Ada Popa, et al","author":"Stoica Ion","year":"2017","unstructured":"Ion Stoica, Dawn Song, Raluca Ada Popa, et al. 2017. A Berkeley view of systems challenges for AI. arXiv preprint arXiv:1712.05855 (2017)."},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"e_1_2_1_56_1","volume-title":"Proceedings of the CVPR\u201915","author":"Szegedy Christian","year":"2015","unstructured":"Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, et al. 2015. Rethinking the inception architecture for computer vision. In Proceedings of the CVPR\u201915.
"},{"key":"e_1_2_1_57_1","volume-title":"Proceedings of the IISWC\u201918","author":"Turner J.","year":"2018","unstructured":"J. Turner, J. Cano, V. Radu, E. J. Crowley, M. O\u2019Boyle, and A. Storkey. 2018. Characterising across-stack optimisations for deep convolutional neural networks. In Proceedings of the IISWC\u201918. 101--110. DOI:https:\/\/doi.org\/10.1109\/IISWC.2018.8573503"},{"key":"e_1_2_1_58_1","volume-title":"Attention is all you need. CoRR abs\/1706.03762","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. CoRR abs\/1706.03762 (2017). Retrieved from http:\/\/arxiv.org\/abs\/1706.03762."},{"key":"e_1_2_1_59_1","volume-title":"CoRR abs\/1907.10701","author":"Wang Yu","year":"2019","unstructured":"Yu Wang, Gu-Yeon Wei, and David Brooks. 2019. Benchmarking TPU, GPU, and CPU platforms for deep learning. CoRR abs\/1907.10701 (2019). Retrieved from http:\/\/arxiv.org\/abs\/1907.10701."},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2019.00048"},{"key":"e_1_2_1_61_1","unstructured":"Mengwei Xu Jiawei Liu Yuanqiang Liu et al. 2018. When mobile apps going deep: An empirical study of mobile deep learning. arXiv preprint arXiv:1812.05448 (2018)."},{"key":"e_1_2_1_62_1","unstructured":"W. Zaremba I. Sutskever and O. Vinyals. 2014. Recurrent neural network regularization. ArXiv e-prints (Sept. 2014). Retrieved from arxiv:1409.2329"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3368305","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3368305","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:06Z","timestamp":1750197726000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3368305"}},"subtitle":["Automatic Benchmarking DNN Models for Mobile-cloud 
Computing"],"short-title":[],"issued":{"date-parts":[[2019,12,26]]},"references-count":62,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2019,12,31]]}},"alternative-id":["10.1145\/3368305"],"URL":"https:\/\/doi.org\/10.1145\/3368305","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2019,12,26]]},"assertion":[{"value":"2019-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-12-26","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}