{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T14:17:36Z","timestamp":1760710656577,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":21,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,6,24]],"date-time":"2021-06-24T00:00:00Z","timestamp":1624492800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"MSIT(Ministry of Science, ICT), Korea, under the High-Potential Individuals Global Training Program","award":["2020-0-01649"],"award-info":[{"award-number":["2020-0-01649"]}]},{"name":"Research Resettlement Fund for the new faculty of Seoul National University"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,6,25]]},"DOI":"10.1145\/3469116.3470014","type":"proceedings-article","created":{"date-parts":[[2021,6,24]],"date-time":"2021-06-24T10:10:05Z","timestamp":1624529405000},"page":"25-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["ParallelFusion"],"prefix":"10.1145","author":[{"given":"Jingyu","family":"Lee","sequence":"first","affiliation":[{"name":"Seoul National University, Seoul, Korea"}]},{"given":"Yunxin","family":"Liu","sequence":"additional","affiliation":[{"name":"Institute for AI Industry Research (AIR), Tsinghua University, Beijing, China"}]},{"given":"Youngki","family":"Lee","sequence":"additional","affiliation":[{"name":"Seoul National University, Seoul, Korea"}]}],"member":"320","published-online":{"date-parts":[[2021,6,24]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Low Overhead Instruction Latency Characterization for NVIDIA GPGPUs. In 2019 IEEE High Performance Extreme Computing Conference (HPEC). 1--8. 
https:\/\/doi.org\/10","author":"Arafa Yehia","year":"2019","unstructured":"Yehia Arafa, Abdel-Hameed A. Badawy, Gopinath Chennupati, Nandakishore Santhi, and Stephan Eidenbenz. 2019. Low Overhead Instruction Latency Characterization for NVIDIA GPGPUs. In 2019 IEEE High Performance Extreme Computing Conference (HPEC). 1--8. https:\/\/doi.org\/10.1109\/HPEC.2019.8916466"},{"key":"e_1_3_2_1_2_1","volume-title":"Rethinking Atrous Convolution for Semantic Image Segmentation. CoRR abs\/1706.05587","author":"Chen Liang-Chieh","year":"2017","unstructured":"Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam. 2017. Rethinking Atrous Convolution for Semantic Image Segmentation. CoRR abs\/1706.05587 (2017). arXiv:1706.05587 http:\/\/arxiv.org\/abs\/1706.05587"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/3291168.3291211"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/MNET.2018.1700268"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3190508.3190541"},{"key":"e_1_3_2_1_6_1","unstructured":"Google. Accessed on 20.12.2020. Tensorflow Lite. https:\/\/www.tensorflow.org\/lite\/"},{"key":"e_1_3_2_1_7_1","unstructured":"Google. Accessed on 20.12.2020. 
Tensorflow Serving. https:\/\/github.com\/tensorflow\/serving"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_1_9_1","volume-title":"Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861","author":"Howard Andrew G","year":"2017","unstructured":"Andrew G Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. 2017. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)."},{"key":"e_1_3_2_1_10_1","volume-title":"SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. CoRR abs\/1602.07360","author":"Iandola Forrest N.","year":"2016","unstructured":"Forrest N. Iandola, Matthew W. Moskewicz, Khalid Ashraf, Song Han, William J. Dally, and Kurt Keutzer. 2016. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. CoRR abs\/1602.07360 (2016). 
arXiv:1602.07360 http:\/\/arxiv.org\/abs\/1602.07360"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3409963.3410493"},{"key":"e_1_3_2_1_12_1","volume-title":"Proceedings of Machine Learning and Systems 2020","author":"Jiang Xiaotang","year":"2020","unstructured":"Xiaotang Jiang, Huan Wang, Yiliu Chen, Ziqi Wu, Lichuan Wang, Bin Zou, Yafeng Yang, Zongyang Cui, Yu Cai, Tianhang Yu, Chengfei Lyu, and Zhihua Wu. 2020. MNN: A Universal and Efficient Inference Engine. In Proceedings of Machine Learning and Systems 2020, MLSys 2020, Austin, TX, USA, March 2-4, 2020, Inderjit S. Dhillon, Dimitris S. Papailiopoulos, and Vivienne Sze (Eds.). mlsys.org. https:\/\/proceedings.mlsys.org\/book\/287.pdf"},{"key":"e_1_3_2_1_13_1","volume-title":"Darts: Differentiable architecture search. arXiv preprint arXiv:1806.09055","author":"Liu Hanxiao","year":"2018","unstructured":"Hanxiao Liu, Karen Simonyan, and Yiming Yang. 2018. Darts: Differentiable architecture search. arXiv preprint arXiv:1806.09055 (2018)."},{"key":"e_1_3_2_1_14_1","unstructured":"NVIDIA. Accessed on 20.12.2020. Tensor RT. 
https:\/\/developer.nvidia.com\/tensorrt"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00474"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.308"},{"key":"e_1_3_2_1_18_1","unstructured":"XiaoMi. Accessed on 20.12.2020. MACE. https:\/\/github.com\/XiaoMi\/mace"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3372224.3380881"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3372224.3419192"},{"key":"e_1_3_2_1_21_1","volume-title":"CoRR abs\/1904.07421","author":"Zha Sheng","year":"2019","unstructured":"Sheng Zha, Ziheng Jiang, Haibin Lin, and Zhi Zhang. 2019. Just-in-Time Dynamic-Batching. CoRR abs\/1904.07421 (2019). arXiv:1904.07421 http:\/\/arxiv.org\/abs\/1904.07421"},{"key":"e_1_3_2_1_22_1","volume-title":"Resnest: Split-attention networks. arXiv preprint arXiv:2004.08955","author":"Zhang Hang","year":"2020","unstructured":"Hang Zhang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Zhi Zhang, Haibin Lin, Yue Sun, Tong He, Jonas Mueller, R Manmatha, et al. 2020. Resnest: Split-attention networks. 
arXiv preprint arXiv:2004.08955 (2020)."}],"event":{"name":"MobiSys '21: The 19th Annual International Conference on Mobile Systems, Applications, and Services","sponsor":["SIGMOBILE ACM Special Interest Group on Mobility of Systems, Users, Data and Computing","SIGOPS ACM Special Interest Group on Operating Systems"],"location":"Virtual WI USA","acronym":"MobiSys '21"},"container-title":["Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3469116.3470014","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3469116.3470014","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:18:29Z","timestamp":1750191509000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3469116.3470014"}},"subtitle":["Towards Maximum Utilization of Mobile GPU for DNN Inference"],"short-title":[],"issued":{"date-parts":[[2021,6,24]]},"references-count":21,"alternative-id":["10.1145\/3469116.3470014","10.1145\/3469116"],"URL":"https:\/\/doi.org\/10.1145\/3469116.3470014","relation":{},"subject":[],"published":{"date-parts":[[2021,6,24]]},"assertion":[{"value":"2021-06-24","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}