{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,14]],"date-time":"2025-10-14T00:26:27Z","timestamp":1760401587220,"version":"build-2065373602"},"reference-count":36,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T00:00:00Z","timestamp":1760313600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>In edge-assisted low-latency video analytics, a critical challenge is balancing on-device inference latency against the high bandwidth costs and network delays of offloading. Ineffectively managing this trade-off degrades performance and hinders critical applications like autonomous systems. Existing solutions often rely on static partitioning or greedy algorithms that optimize for a single frame. These myopic approaches adapt poorly to dynamic network and workload conditions, leading to high long-term costs and significant frame drops. This paper introduces a novel partitioning technique driven by a Deep Reinforcement Learning (DRL) agent on a local device that learns to dynamically partition a video analytics Deep Neural Network (DNN). The agent learns a farsighted policy to dynamically select the optimal DNN split point for each frame by observing the holistic system state. By optimizing for a cumulative long-term reward, our method significantly outperforms competitor methods, demonstrably reducing overall system cost and latency while nearly eliminating frame drops in our real-world testbed evaluation. The primary limitation is the initial offline training phase required by the DRL agent. Future work will focus on extending this dynamic partitioning framework to multi-device and multi-edge environments.<\/jats:p>","DOI":"10.3390\/make7040117","type":"journal-article","created":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T08:10:31Z","timestamp":1760343031000},"page":"117","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Learning to Partition: Dynamic Deep Neural Network Model Partitioning for Edge-Assisted Low-Latency Video Analytics"],"prefix":"10.3390","volume":"7","author":[{"given":"Yan","family":"Lyu","sequence":"first","affiliation":[{"name":"School of Computer Science and Engineering, Southeast University, Nanjing 211189, China"}]},{"given":"Likai","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Southeast University, Nanjing 211189, China"}]},{"given":"Xuezhi","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, University of Science and Technology, Nanjing 210094, China"}]},{"given":"Zhiyu","family":"Fan","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Southeast University, Nanjing 211189, China"}]},{"given":"Jinchen","family":"Wang","sequence":"additional","affiliation":[{"name":"North Information Control Research Academy Group Co., Ltd., Nanjing 211153, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8584-0532","authenticated-orcid":false,"given":"Guanyu","family":"Gao","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, University of Science and Technology, Nanjing 210094, China"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Wang, X., and Gao, G. (2021, January 20\u201324). SmartEye: An Open Source Framework for Real-Time Video Analytics with Edge-Cloud Collaboration. Proceedings of the 29th ACM International Conference on Multimedia, Chengdu, China.","DOI":"10.1145\/3474085.3478330"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1109\/MC.2017.3641638","article-title":"Real-time video analytics: The killer app for edge computing","volume":"50","author":"Ananthanarayanan","year":"2017","journal-title":"Computer"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1655","DOI":"10.1109\/JPROC.2019.2921977","article-title":"Deep learning with edge computing: A review","volume":"107","author":"Chen","year":"2019","journal-title":"Proc. IEEE"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3527155","article-title":"Split computing and early exiting for deep learning applications: Survey and research challenges","volume":"55","author":"Matsubara","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1109\/MCOM.001.2000373","article-title":"Communication-computation trade-off in resource-constrained edge inference","volume":"58","author":"Shao","year":"2020","journal-title":"IEEE Commun. Mag."},{"key":"ref_7","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"615","DOI":"10.1145\/3093337.3037698","article-title":"Neurosurgeon: Collaborative intelligence between the cloud and mobile edge","volume":"45","author":"Kang","year":"2017","journal-title":"ACM SIGARCH Comput. Archit. News"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Hu, C., and Li, B. (2022, January 2\u20135). Distributed inference with deep learning models across heterogeneous edge devices. Proceedings of the IEEE INFOCOM 2022-IEEE Conference on Computer Communications, Virtual.","DOI":"10.1109\/INFOCOM48880.2022.9796896"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Mohammed, T., Joe-Wong, C., Babbar, R., and Di Francesco, M. (2022, January 2\u20135). Distributed inference acceleration with adaptive DNN partitioning and offloading. Proceedings of the IEEE INFOCOM 2020-IEEE Conference on Computer Communications, Virtual.","DOI":"10.1109\/INFOCOM41043.2020.9155237"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Li, H., Hu, C., Jiang, J., Wang, Z., Wen, Y., and Zhu, W. (2018, January 11\u201313). Jalad: Joint accuracy-and latency-aware deep structure decoupling for edge-cloud execution. Proceedings of the 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS), Singapore.","DOI":"10.1109\/PADSW.2018.8645013"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Shao, J., and Zhang, J. (2020, January 7\u201311). Bottlenet++: An end-to-end approach for feature compression in device-edge co-inference systems. Proceedings of the 2020 IEEE International Conference on Communications Workshops (ICC Workshops), Dublin, Ireland.","DOI":"10.1109\/ICCWorkshops49005.2020.9145068"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Laskaridis, S., Venieris, S.I., Almeida, M., Leontiadis, I., and Lane, N.D. (2020, January 21\u201325). SPINN: Synergistic progressive inference of neural networks over device and cloud. Proceedings of the 26th Annual International Conference on Mobile Computing and Networking, London, UK.","DOI":"10.1145\/3372224.3419194"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"3973","DOI":"10.1109\/TNSM.2021.3116665","article-title":"Joint optimization with DNN partitioning and resource allocation in mobile edge computing","volume":"18","author":"Dong","year":"2021","journal-title":"IEEE Trans. Netw. Serv. Manag."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"9511","DOI":"10.1109\/JIOT.2020.3010258","article-title":"Joint multiuser dnn partitioning and computational resource allocation for collaborative edge intelligence","volume":"8","author":"Tang","year":"2020","journal-title":"IEEE Internet Things J."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3630266","article-title":"Partnner: Platform-agnostic adaptive edge-cloud dnn partitioning for minimizing end-to-end latency","volume":"23","author":"Ghosh","year":"2024","journal-title":"ACM Trans. Embed. Comput. Syst."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Peng, S., Shen, Z., Zheng, Q., Hou, X., Jiang, D., Yuan, J., and Jin, J. (2025). APT-SAT: An Adaptive DNN Partitioning and Task Offloading Framework within Collaborative Satellite Computing Environments. IEEE Trans. Netw. Sci. Eng.","DOI":"10.1109\/TNSE.2025.3585287"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2914","DOI":"10.1109\/TNSM.2025.3561739","article-title":"Joint DNN Partitioning and Task Offloading Based on Attention Mechanism-Aided Reinforcement Learning","volume":"22","author":"Zhang","year":"2025","journal-title":"IEEE Trans. Netw. Serv. Manag."},{"key":"ref_19","first-page":"7","article-title":"Joint Architecture Design and Workload Partitioning for DNN Inference on Industrial IoT Clusters","volume":"23","author":"Fang","year":"2022","journal-title":"ACM Trans. Internet Technol. (TOIT)"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Hu, C., Bao, W., Wang, D., and Liu, F. (May, January 29). Dynamic adaptive DNN surgery for inference acceleration on the edge. Proceedings of the IEEE INFOCOM 2019-IEEE Conference on Computer Communications, Paris, France.","DOI":"10.1109\/INFOCOM.2019.8737614"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1109\/TNET.2020.3042320","article-title":"Coedge: Cooperative dnn inference with adaptive workload partitioning over heterogeneous edge devices","volume":"29","author":"Zeng","year":"2020","journal-title":"IEEE\/ACM Trans. Netw."},{"key":"ref_22","unstructured":"Xiao, Z., Xia, Z., Zheng, H., Zhao, B.Y., and Jiang, J. (2021, January 14\u201317). Towards performance clarity of edge video analytics. Proceedings of the 2021 IEEE\/ACM Symposium on Edge Computing (SEC), San Jose, CA, USA."},{"key":"ref_23","unstructured":"Du, K., Zhang, Q., Arapin, A., Wang, H., Xia, Z., and Jiang, J. (2022). Accmpeg: Optimizing video encoding for video analytics. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Chen, B., Yan, Z., and Nahrstedt, K. (2022, January 14\u201317). Context-aware image compression optimization for visual analytics offloading. Proceedings of the 13th ACM Multimedia Systems Conference, Athlone, Ireland.","DOI":"10.1145\/3524273.3528178"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Wang, X., Gao, G., Wu, X., Lyu, Y., and Wu, W. (2022, January 17). Dynamic DNN model selection and inference off loading for video analytics with edge-cloud collaboration. Proceedings of the 32nd Workshop on Network and Operating Systems Support for Digital Audio and Video, Athlone, Ireland.","DOI":"10.1145\/3534088.3534352"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Ran, X., Chen, H., Zhu, X., Liu, Z., and Chen, J. (2018, January 15\u201319). Deepdecision: A mobile deep learning framework for edge video analytics. Proceedings of the IEEE INFOCOM 2018-IEEE Conference on Computer Communications, Honolulu, HI, USA.","DOI":"10.1109\/INFOCOM.2018.8485905"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"5870","DOI":"10.1109\/TMC.2022.3189186","article-title":"EdgeAdaptor: Online Configuration Adaption, Model Selection and Resource Provisioning for Edge DNN Inference Serving at Scale","volume":"22","author":"Zhao","year":"2022","journal-title":"IEEE Trans. Mob. Comput."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"9083","DOI":"10.1109\/TMM.2024.3385678","article-title":"EdgeVision: Towards collaborative video analytics on distributed edges for performance maximization","volume":"26","author":"Gao","year":"2024","journal-title":"IEEE Transactions on Multimedia"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Dong, Y., and Gao, G. (2024, January 13\u201316). EdgeCam: A Distributed Camera Operating System for Inference Scheduling and Continuous Learning. Proceedings of the 2024 IEEE\/ACM Ninth International Conference on Internet-of-Things Design and Implementation (IoTDI), Hong Kong, China.","DOI":"10.1109\/IoTDI61053.2024.00028"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Jiang, J., Luo, Z., Hu, C., He, Z., Wang, Z., Xia, S., and Wu, C. (2021, January 7\u201310). Joint model and data adaptation for cloud inference serving. Proceedings of the 2021 IEEE Real-Time Systems Symposium (RTSS), Dortmund, Germany.","DOI":"10.1109\/RTSS52674.2021.00034"},{"key":"ref_31","unstructured":"Zhang, H., Ananthanarayanan, G., Bodik, P., Philipose, M., Bahl, P., and Freedman, M.J. (2017, January 27\u201329). Live video analytics at scale with approximation and delay-tolerance. Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, Boston, MA, USA."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Jiang, J., Ananthanarayanan, G., Bodik, P., Sen, S., and Stoica, I. (2018, January 20\u201325). Chameleon: Scalable adaptation of video analytics. Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, Budapest, Hungary.","DOI":"10.1145\/3230543.3230574"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Liu, J., and Gao, G. (2025, January 24\u201326). CSVA: Complexity-Driven and Semantic-Aware Video Analytics via Edge-Cloud Collaboration. Proceedings of the International Conference on Wireless Artificial Intelligent Computing Systems and Applications, Tokyo, Japan.","DOI":"10.1007\/978-981-96-8731-2_11"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"Imagenet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Liu, L., Li, H., and Gruteser, M. (2019, January 21\u201325). Edge assisted real-time object detection for mobile augmented reality. Proceedings of the 25th Annual International Conference on Mobile Computing and Networking, Los Cabos, Mexico.","DOI":"10.1145\/3300061.3300116"}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/4\/117\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T09:12:50Z","timestamp":1760346770000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/4\/117"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,13]]},"references-count":36,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["make7040117"],"URL":"https:\/\/doi.org\/10.3390\/make7040117","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,13]]}}}