{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:19:14Z","timestamp":1750220354507,"version":"3.41.0"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2022,5,31]],"date-time":"2022-05-31T00:00:00Z","timestamp":1653955200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"NSF","award":["CNS-1717657, CCF-1937435, CNS-1822085"],"award-info":[{"award-number":["CNS-1717657, CCF-1937435, CNS-1822085"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2022,5,31]]},"abstract":"<jats:p>In the past decade, Deep Neural Networks (DNNs), e.g., Convolutional Neural Networks, achieved human-level performance in vision tasks such as object classification and detection. However, DNNs are known to be computationally expensive and thus hard to be deployed in real-time and edge applications. Many previous works have focused on DNN model compression to obtain smaller parameter sizes and consequently, less computational cost. Such methods, however, often introduce noticeable accuracy degradation. In this work, we optimize a state-of-the-art DNN-based video detection framework\u2014Deep Feature Flow (DFF) from the cloud end using three proposed ideas. First, we propose Asynchronous DFF (ADFF) to asynchronously execute the neural networks. Second, we propose a Video-based Dynamic Scheduling (VDS) method that decides the detection frequency based on the magnitude of movement between video frames. Last, we propose Spatial Sparsity Inference, which only performs the inference on part of the video frame and thus reduces the computation cost. According to our experimental results, ADFF can reduce the bottleneck latency from 89 to 19 ms. VDS increases the detection accuracy by 0.6% mAP without increasing computation cost. And SSI further saves 0.2 ms with a 0.6% mAP degradation of detection accuracy.<\/jats:p>","DOI":"10.1145\/3484946","type":"journal-article","created":{"date-parts":[[2022,7,19]],"date-time":"2022-07-19T13:45:41Z","timestamp":1658238341000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Toward Efficient and Adaptive Design of Video Detection System with Deep Neural Networks"],"prefix":"10.1145","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8986-0696","authenticated-orcid":false,"given":"Jiachen","family":"Mao","sequence":"first","affiliation":[{"name":"Duke University, Durham, NC, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2744-9556","authenticated-orcid":false,"given":"Qing","family":"Yang","sequence":"additional","affiliation":[{"name":"Duke University, Durham, NC, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4990-1729","authenticated-orcid":false,"given":"Ang","family":"Li","sequence":"additional","affiliation":[{"name":"Duke University, Durham, NC, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6764-8782","authenticated-orcid":false,"given":"Kent W.","family":"Nixon","sequence":"additional","affiliation":[{"name":"Duke University, Durham, NC, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3228-6544","authenticated-orcid":false,"given":"Hai","family":"Li","sequence":"additional","affiliation":[{"name":"Duke University, Durham, NC, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1486-8412","authenticated-orcid":false,"given":"Yiran","family":"Chen","sequence":"additional","affiliation":[{"name":"Duke University, Durham, NC, United States"}]}],"member":"320","published-online":{"date-parts":[[2022,7,19]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"5998","volume-title":"Advances in Neural Information Processing Systems 30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani et\u00a0al. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Curran Associates, 5998\u20136008. Retrieved from http:\/\/papers.nips.cc\/paper\/7181-attention-is-all-you-need.pdf."},{"key":"e_1_3_1_3_2","unstructured":"Jacob Devlin Ming-Wei Chang Kenton Lee and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. Retrieved from http:\/\/arxiv.org\/abs\/1810.04805."},{"key":"e_1_3_1_4_2","first-page":"91","volume-title":"Advances in Neural Information Processing Systems","author":"Ren Shaoqing","year":"2015","unstructured":"Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems. MIT Press, 91\u201399."},{"key":"e_1_3_1_5_2","unstructured":"Yi Sun Ding Liang Xiaogang Wang and Xiaoou Tang. 2015. DeepID3: Face recognition with very deep neural networks. Retrieved from http:\/\/arxiv.org\/abs\/1502.00873."},{"key":"e_1_3_1_6_2","unstructured":"Barret Zoph and Quoc V. Le. 2016. Neural architecture search with reinforcement learning. Retrieved from http:\/\/arxiv.org\/abs\/1611.01578."},{"key":"e_1_3_1_7_2","unstructured":"Hanxiao Liu Karen Simonyan and Yiming Yang. 2018. DARTS: Differentiable architecture search. Retrieved from http:\/\/arxiv.org\/abs\/1806.09055."},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01234-2_48"},{"key":"e_1_3_1_9_2","first-page":"2074","volume-title":"Advances in Neural Information Processing Systems","author":"Wen Wei","year":"2016","unstructured":"Wei Wen, Chunpeng Wu, Yandan Wang, Yiran Chen, and Hai Li. 2016. Learning structured sparsity in deep neural networks. In Advances in Neural Information Processing Systems. MIT Press, 2074\u20132082."},{"key":"e_1_3_1_10_2","first-page":"10988","volume-title":"Advances in Neural Information Processing Systems 31","author":"Chen Patrick","year":"2018","unstructured":"Patrick Chen, Si Si, Yang Li, Ciprian Chelba, and Cho-Jui Hsieh. 2018. GroupReduce: Block-wise low-rank approximation for neural language model shrinking. In Advances in Neural Information Processing Systems 31, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (Eds.). Curran Associates, 10988\u201310998."},{"key":"e_1_3_1_11_2","unstructured":"Hsin-Pai Cheng Yuanjun Huang Xuyang Guo Yifei Huang Feng Yan Hai Li and Yiran Chen. 2018. Differentiable fine-grained quantization for deep neural network compression. Retrieved from http:\/\/arxiv.org\/abs\/1810.10351."},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2014.6855235"},{"key":"e_1_3_1_13_2","doi-asserted-by":"crossref","unstructured":"Yuhao Zhu Anand Samajdar Matthew Mattina and Paul N. Whatmough. 2018. Euphrates: Algorithm-SoC co-design for low-power mobile continuous vision. Retrieved from http:\/\/arxiv.org\/abs\/1803.11232.","DOI":"10.1109\/ISCA.2018.00052"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3316781.3317865"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3287624.3287642"},{"key":"e_1_3_1_16_2","doi-asserted-by":"crossref","unstructured":"Linghao Song Jiachen Mao Youwei Zhuo Xuehai Qian Hai Li and Yiran Chen. 2019. HyPar: Towards hybrid parallelism for deep learning accelerator array. Retrieved from http:\/\/arxiv.org\/abs\/1901.02067.","DOI":"10.1109\/HPCA.2019.00027"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASPDAC.2018.8297378"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.23919\/DATE.2017.7927211"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAD.2017.8203852"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAD.2017.8203791"},{"key":"e_1_3_1_21_2","unstructured":"Andrew G. Howard Menglong Zhu Bo Chen Dmitry Kalenichenko Weijun Wang Tobias Weyand Marco Andreetto and Hartwig Adam. 2017. MobileNets: Efficient convolutional neural networks for mobile vision applications. Retrieved from http:\/\/arxiv.org\/abs\/1704.04861."},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"e_1_3_1_23_2","first-page":"1097","volume-title":"Advances in Neural Information Processing Systems 25","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger (Eds.). Curran Associates, 1097\u20131105. Retrieved from http:\/\/papers.nips.cc\/paper\/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf."},{"key":"e_1_3_1_24_2","first-page":"265","volume-title":"Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201916)","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi et\u00a0al. 2016. Tensorflow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201916). 265\u2013283."},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2003.1221362"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.441"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.316"},{"key":"e_1_3_1_28_2","unstructured":"Anita Sellent Daniel Kondermann Stephan Simon Simon Baker Goksel Dedeoglu Oliver Erdler Phil Parsonage Christoph Unger and Wolfgang Niehsen. 2012. Optical flow estimation versus motion estimation. https:\/\/archiv.ub.uni-heidelberg.de\/volltextserver\/13641\/."},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11554-014-0423-0"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.5555\/1763974.1764031"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00753"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00908"},{"key":"e_1_3_1_33_2","first-page":"947","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems (NIPS\u201916)","author":"Figurnov Mikhail","year":"2016","unstructured":"Mikhail Figurnov et\u00a0al. 2016. PerforatedCNNs: Acceleration through elimination of redundant convolutions. In Proceedings of the International Conference on Neural Information Processing Systems (NIPS\u201916). 947\u2013955."},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_35_2","unstructured":"Tianqi Chen et\u00a0al. 2015. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. Retrieved from http:\/\/arxiv.org\/abs\/1512.01274."},{"key":"e_1_3_1_36_2","first-page":"303","volume-title":"Proceedings of the Internet Measurement Conference","author":"Guo Yihua","year":"2016","unstructured":"Yihua Guo, Feng Qian, Qi Alfred Chen, Zhuoqing Morley Mao, and Subhabrata Sen. 2016. Understanding on-device bufferbloat for cellular upload. In Proceedings of the Internet Measurement Conference. ACM, 303\u2013317."},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/1296907.1296909"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3241539.3241563"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3081333.3081360"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3210240.3210337"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3241539.3241559"},{"key":"e_1_3_1_42_2","article-title":"Noscope: Optimizing neural network queries over video at scale","author":"Kang Daniel","year":"2017","unstructured":"Daniel Kang, John Emmons, Firas Abuzaid, Peter Bailis, and Matei Zaharia. 2017. Noscope: Optimizing neural network queries over video at scale. Retrieved from https:\/\/arXiv:1703.02529.","journal-title":"Retrieved from https:\/\/arXiv:1703.02529"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3356250.3360044"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/2809695.2809711"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3300061.3300116"},{"key":"e_1_3_1_46_2","first-page":"379","volume-title":"Advances in Neural Information Processing Systems","author":"Dai Jifeng","year":"2016","unstructured":"Jifeng Dai, Yi Li, Kaiming He, and Jian Sun. 2016. R-fcn: Object detection via region-based fully convolutional networks. In Advances in Neural Information Processing Systems. MIT Press, 379\u2013387."},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_3_1_49_2","first-page":"533","volume-title":"Proceedings of the ACM\/IEEE 45th Annual International Symposium on Computer Architecture (ISCA\u201918)","author":"Buckler Mark","year":"2018","unstructured":"Mark Buckler, Philip Bedoukian, Suren Jayasuriya, and Adrian Sampson. 2018. EVA \\( ^2 \\) : Exploiting temporal redundancy in live computer vision. In Proceedings of the ACM\/IEEE 45th Annual International Symposium on Computer Architecture (ISCA\u201918). IEEE, 533\u2013546."}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3484946","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3484946","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3484946","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:17:15Z","timestamp":1750191435000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3484946"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,31]]},"references-count":48,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,5,31]]}},"alternative-id":["10.1145\/3484946"],"URL":"https:\/\/doi.org\/10.1145\/3484946","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2022,5,31]]},"assertion":[{"value":"2021-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-07-19","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}