{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,5,22]],"date-time":"2024-05-22T05:50:02Z","timestamp":1716357002014},"reference-count":55,"publisher":"Wiley","issue":"12","license":[{"start":{"date-parts":[[2022,9,15]],"date-time":"2022-09-15T00:00:00Z","timestamp":1663200000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Trans Emerging Tel Tech"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Internet of Things (IoT) edge intelligence has emerged by optimizing the deep learning (DL) models deployed on resource\u2010constraint devices for quick decision\u2010making. In addition, edge intelligence reduces network overload and latency by bringing intelligent analytics closer to the source. On the other hand, DL models need a lot of computing resources. As a result, they have high computational workloads and memory footprint, making it impractical to deploy and execute on IoT edge devices with limited capabilities. In addition, existing layer\u2010based partitioning methods generate many intermediate results, resulting in a huge memory footprint. In this article, we propose a framework to provide a comprehensive solution that enables the deployment of convolutional neural networks (CNNs) onto distributed IoT devices for faster inference and reduced memory footprint. This framework considers a pretrained YOLOv2 model, and a weight pruning technique is applied to the pre\u2010trained model to reduce the number of non\u2010contributing parameters. We use the fused layer partitioning method to vertically partition the fused layers of the CNN and then distribute the partition among the edge devices to process the input. 
In our experiment, we have considered multiple Raspberry Pi as edge devices. Raspberry Pi with a neural computing stick is a gateway device to combine the results from various edge devices and get the final output. Our proposed model achieved inference latency of 5 to <jats:inline-graphic xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"graphic\/ett4648-math-0001.png\" xlink:title=\"urn:x-wiley:ett:media:ett4648:ett4648-math-0001\" \/>7 seconds for <jats:inline-graphic xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"graphic\/ett4648-math-0002.png\" xlink:title=\"urn:x-wiley:ett:media:ett4648:ett4648-math-0002\" \/> to <jats:inline-graphic xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"graphic\/ett4648-math-0003.png\" xlink:title=\"urn:x-wiley:ett:media:ett4648:ett4648-math-0003\" \/> fused layer partitioning for five devices with a 9% improvement in memory footprint.<\/jats:p>","DOI":"10.1002\/ett.4648","type":"journal-article","created":{"date-parts":[[2022,9,15]],"date-time":"2022-09-15T11:00:26Z","timestamp":1663239626000},"update-policy":"http:\/\/dx.doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Memory optimization at Edge for Distributed Convolution Neural Network"],"prefix":"10.1002","volume":"33","author":[{"ORCID":"http:\/\/orcid.org\/0000-0001-9552-3047","authenticated-orcid":false,"given":"Soumyalatha","family":"Naveen","sequence":"first","affiliation":[{"name":"School of Computer Science and Engineering REVA University  Bangalore Karnataka India"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-2432-2552","authenticated-orcid":false,"given":"Manjunath R.","family":"Kounte","sequence":"additional","affiliation":[{"name":"School of Electronics and Communication Engineering REVA University  Bangalore Karnataka India"}]}],"member":"311","published-online":{"date-parts":[[2022,9,15]]},"reference":[{"key":"e_1_2_10_2_1","unstructured":"Cisco.Cisco annual internet report 
(2018\u20132023) white paper; cisco systems White Paper; January 2020."},{"key":"e_1_2_10_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2019.2918951"},{"key":"e_1_2_10_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2020.2984887"},{"key":"e_1_2_10_5_1","doi-asserted-by":"crossref","unstructured":"LiE ZhouZ ChenX.Edge intelligence: on\u2010demand deep learning model co\u2010inference with device\u2010edge synergy. Proceedings of the 2018 Workshop on Mobile Edge Communications; August 7 2018:31\u201036; Budapest Hungary.","DOI":"10.1145\/3229556.3229562"},{"key":"e_1_2_10_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2018.2858384"},{"key":"e_1_2_10_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10766-021-00712-3"},{"issue":"7","key":"e_1_2_10_8_1","first-page":"1665","article-title":"Model parallelism optimization for distributed inference via decoupled CNN structure","volume":"32","author":"Du J","year":"2020","journal-title":"IEEE Trans Parallel Distrib Syst"},{"key":"e_1_2_10_9_1","doi-asserted-by":"crossref","unstructured":"ZhangSQ LinJ ZhangQ.Adaptive distributed convolutional neural network inference at the network edge with ADCNN. Proceedings of the 49th International Conference on Parallel Processing\u2010ICPP; August 17 2020:1\u201011; Edmonton AB.","DOI":"10.1145\/3404397.3404473"},{"key":"e_1_2_10_10_1","doi-asserted-by":"crossref","unstructured":"MaoJ ChenX NixonKW KriegerC ChenY.MoDNN: local distributed mobile computing system for deep neural network. Proceedings of the Design Automation & Test in Europe Conference & Exhibition (DATE); 2017:1396\u20101401; Lausanne Switzerland.","DOI":"10.23919\/DATE.2017.7927211"},{"key":"e_1_2_10_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3358205"},{"key":"e_1_2_10_12_1","unstructured":"LiuZ SunM ZhouT HuangG DarrellT.Rethinking the value of network pruning. 
arXiv preprint arXiv:1810.05270 October 11 2018."},{"key":"e_1_2_10_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3131396"},{"key":"e_1_2_10_14_1","unstructured":"TaoZ LiQ.eSGD: communication efficient distributed deep learning on the edge. Proceedings of the USENIX Workshop on Hot Topics in Edge Computing (HotEdge 18); 2018; Boston MA."},{"key":"e_1_2_10_15_1","doi-asserted-by":"crossref","unstructured":"LangroudiHF KariaV GustafsonJL KudithipudiD.Adaptive posit: parameter aware numerical format for deep learning inference on the edge. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops; 2020:726\u2010727; Seattle WA.","DOI":"10.1109\/CVPRW50498.2020.00371"},{"key":"e_1_2_10_16_1","doi-asserted-by":"crossref","unstructured":"BenditkisD KerenA Mor\u2010YosefL AvidorT ShohamN Tal\u2010IsraelN.Distributed deep neural network training on edge devices. Proceedings of the 4th ACM\/IEEE Symposium on Edge Computing; November 7 2019:304\u2010306; Arlington Virginia.","DOI":"10.1145\/3318216.3363324"},{"key":"e_1_2_10_17_1","unstructured":"ShiW HouY ZhouS NiuZ ZhangY GengL.Improving device\u2010edge cooperative inference of deep learning via 2\u2010step pruning. Proceedings of the IEEE INFOCOM 2019\u2010IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS); Apr 29 2019:1\u20106; Paris France: IEEE."},{"key":"e_1_2_10_18_1","article-title":"Distributed deep learning model for intelligent video surveillance systems with edge computing","author":"Chen J","year":"2019","journal-title":"IEEE Trans Ind Inform"},{"key":"e_1_2_10_19_1","doi-asserted-by":"crossref","unstructured":"LiD SalonidisT DesaiNV ChuahMC.Deepcham: collaborative edge\u2010mediated adaptive deep learning for mobile object recognition. 
Proceedings of the 2016 IEEE\/ACM Symposium on Edge Computing (SEC); October 27 2016:64\u201076; Washington DC: IEEE.","DOI":"10.1109\/SEC.2016.38"},{"key":"e_1_2_10_20_1","unstructured":"JiangZ ChenT LiM.Efficient deep learning inference on edge devices. ACM SysML; 2018."},{"key":"e_1_2_10_21_1","doi-asserted-by":"crossref","unstructured":"DeyS MondalJ MukherjeeA.Offloaded execution of deep learning inference at edge: challenges and insights. Proceedings of the 2019 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops); March 11 2019:855\u2010861; Kyoto Japan: IEEE.","DOI":"10.1109\/PERCOMW.2019.8730817"},{"key":"e_1_2_10_22_1","doi-asserted-by":"crossref","unstructured":"GunarathneB PrabhathC PereraV GunasekaraK.Distributing deep learning inference on edge devices. Proceedings of the 16th International Conference on Emerging Networking Experiments and Technologies; November 23 2020:556\u2010557; Barcelona Spain.","DOI":"10.1145\/3386367.3431666"},{"key":"e_1_2_10_23_1","doi-asserted-by":"crossref","unstructured":"NaveenS KounteMR.Machine learning at resource constraint edge device using bonsai algorithm. Proceedings of the 2020 3rd International Conference on Advances in Electronics Computers and Communications (ICAECC); December 11 2020:1\u20106; Bangalore India: IEEE.","DOI":"10.1109\/ICAECC50550.2020.9339514"},{"key":"e_1_2_10_24_1","unstructured":"ZhuM GuptaS.To prune or not to prune: exploring the efficacy of pruning for model compression. arXiv preprint arXiv:1710.01878 October 5 2017."},{"key":"e_1_2_10_25_1","unstructured":"HanS MaoH DallyWJ.Deep compression: compressing deep neural networks with pruning trained quantization and Huffman coding. arXiv preprint arXiv:1510.00149 October 1 2015."},{"key":"e_1_2_10_26_1","unstructured":"ChengY WangD ZhouP ZhangT.A survey of model compression and acceleration for deep neural networks. 
arXiv preprint arXiv:1710.09282; October 3 2017."},{"key":"e_1_2_10_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3487045"},{"key":"e_1_2_10_28_1","doi-asserted-by":"crossref","unstructured":"GeS LuoZ ZhaoS JinX ZhangXY.Compressing deep neural networks for efficient visual inference. Proceedings of the 2017 IEEE International Conference on Multimedia and Expo (ICME); July 10 2017:667\u2010672; Hong Kong China: IEEE.","DOI":"10.1109\/ICME.2017.8019465"},{"key":"e_1_2_10_29_1","unstructured":"YeS FengX ZhangT et al.Progressive DNN compression: a key to achieve ultra\u2010high weight pruning and quantization rates using ADMM. arXiv preprint arXiv:1903.09769 March 23 2019."},{"key":"e_1_2_10_30_1","unstructured":"YuPH WuSS KloppJP ChenLG ChienSY.Joint pruning & quantization for extremely sparse neural networks. arXiv preprint arXiv:2010.01892; October 5 2020."},{"key":"e_1_2_10_31_1","doi-asserted-by":"publisher","DOI":"10.3389\/frai.2021.676564"},{"key":"e_1_2_10_32_1","unstructured":"ChinHH TsayRS WuHI.A high\u2010performance adaptive quantization approach for edge cnn applications. arXiv preprint arXiv:2107.08382 July 18 2021."},{"key":"e_1_2_10_33_1","doi-asserted-by":"crossref","unstructured":"XuY WangY ZhouA LinW XiongH.Deep neural network compression with single and multiple level quantization. Proceedings of the AAAI Conference on Artificial Intelligence; Vol. 32 April 29 2018:1; New Orleans Louisiana.","DOI":"10.1609\/aaai.v32i1.11663"},{"key":"e_1_2_10_34_1","doi-asserted-by":"crossref","unstructured":"DingC WangS LiuN XuK WangY LiangY.REQ\u2010YOLO: a resource\u2010aware efficient quantization framework for object detection on FPGAs. 
Proceedings of the 2019 ACM\/SIGDA International Symposium on Field\u2010Programmable Gate Arrays; February 20 2019:33\u201042; Seaside CA.","DOI":"10.1145\/3289602.3293904"},{"key":"e_1_2_10_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2886192"},{"key":"e_1_2_10_36_1","unstructured":"YeS ZhangT ZhangK et al.A unified framework of dnn weight pruning and weight clustering\/quantization using ADMM. arXiv preprint arXiv:1811.01907 November 5 2018."},{"key":"e_1_2_10_37_1","doi-asserted-by":"crossref","unstructured":"GuerraL DrummondT.Automatic pruning for quantized neural networks. Proceedings of the 2021 Digital Image Computing: Techniques and Applications (DICTA); 2021:01\u201008; Gold Coast Australia: IEEE.","DOI":"10.1109\/DICTA52665.2021.9647074"},{"key":"e_1_2_10_38_1","unstructured":"NishikawaJ IkegayaR.Filter pre\u2010pruning for improved fine\u2010tuning of quantized deep neural networks. arXiv preprint arXiv:2011.06751 November 13 2020."},{"key":"e_1_2_10_39_1","doi-asserted-by":"crossref","unstructured":"HuP PengX ZhuH AlyMM LinJ.Opq: compressing deep neural networks with one\u2010shot pruning\u2010quantization. Proceedings of the AAAI Conference on Artificial Intelligence; Vol. 35 May 18 2021:7780\u20107788; Pennsylvania State University.","DOI":"10.1609\/aaai.v35i9.16950"},{"key":"e_1_2_10_40_1","unstructured":"LiH KadavA DurdanovicI SametH GrafHP.Pruning filters for efficient convnets. arXiv preprint arXiv:1608.08710; August 31 2016."},{"key":"e_1_2_10_41_1","doi-asserted-by":"crossref","unstructured":"KimJ ChangS KwakN.PQK: model compression via pruning quantization and knowledge distillation. arXiv preprint arXiv:2106.14681 June 25 2021.","DOI":"10.21437\/Interspeech.2021-248"},{"key":"e_1_2_10_42_1","unstructured":"ZhouL WenH TeodorescuR DuDH.Distributing deep neural networks with containerized partitions at the edge. 
Proceedings of the 2nd USENIX Workshop on Hot Topics in Edge Computing (HotEdge 19); 2019."},{"key":"e_1_2_10_43_1","doi-asserted-by":"crossref","unstructured":"ZhouL SamavatianMH BachaA MajumdarS TeodorescuR.Adaptive parallel execution of deep neural networks on heterogeneous edge devices. Proceedings of the 4th ACM\/IEEE Symposium on Edge Computing; November 7 2019:195\u2010208; Arlington Virginia.","DOI":"10.1145\/3318216.3363312"},{"key":"e_1_2_10_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2021.3058532"},{"key":"e_1_2_10_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/TWC.2022.3156905"},{"key":"e_1_2_10_46_1","doi-asserted-by":"publisher","DOI":"10.3390\/info12070264"},{"key":"e_1_2_10_47_1","doi-asserted-by":"publisher","DOI":"10.1111\/mice.12449"},{"key":"e_1_2_10_48_1","doi-asserted-by":"publisher","DOI":"10.1111\/mice.12755"},{"key":"e_1_2_10_49_1","doi-asserted-by":"crossref","unstructured":"AlwaniM ChenH FerdmanM MilderP.Fused\u2010layer CNN accelerators. Proceedings of the 2016 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO); October 15 2016:1\u201012; Taipei Taiwan: IEEE.","DOI":"10.1109\/MICRO.2016.7783725"},{"key":"e_1_2_10_50_1","doi-asserted-by":"crossref","unstructured":"RedmonJ FarhadiA.YOLO9000: better faster stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2017:7263\u20107271; Honolulu HI.","DOI":"10.1109\/CVPR.2017.690"},{"key":"e_1_2_10_51_1","unstructured":"JosephR.YOLO: real\u2010time object detection. [Online].https:\/\/pjreddie.com\/darknet\/yolo\/"},{"key":"e_1_2_10_52_1","unstructured":"MaoH HanS PoolJ et al.Exploring the regularity of sparse structure in convolutional neural networks. arXiv preprint arXiv:1705.08922 May 24 2017."},{"key":"e_1_2_10_53_1","unstructured":"HanS PoolJ TranJ DallyW.Learning both weights and connections for efficient neural network. 
Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS); 2015:1135\u20101143; Montreal Canada."},{"key":"e_1_2_10_54_1","doi-asserted-by":"crossref","unstructured":"KnightA LeeBK.Performance analysis of network pruning for deep learning based age\u2010gender estimation. Proceedings of the 2020 International Conference on Computational Science and Computational Intelligence (CSCI); December 16 2020:1684\u20101687; Las Vegas NV: IEEE.","DOI":"10.1109\/CSCI51800.2020.00310"},{"key":"e_1_2_10_55_1","unstructured":"JacksonB.Exploring iterative pruning in deep convolutional. Stanford University CS230: deep learning Winter; 2018"},{"key":"e_1_2_10_56_1","doi-asserted-by":"publisher","DOI":"10.3390\/s21227543"}],"container-title":["Transactions on Emerging Telecommunications Technologies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/ett.4648","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/ett.4648","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/ett.4648","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,21]],"date-time":"2023-08-21T02:21:25Z","timestamp":1692584485000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/ett.4648"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,15]]},"references-count":55,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["10.1002\/ett.4648"],"URL":"http:\/\/dx.doi.org\/10.1002\/ett.4648","archive":["Portico"],"relation":{},"ISSN":["2161-3915","2161-3915"],"issn-type":[{"value":"2161-3915","type":"print"},{"value":"2161-3915","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,15]]},"assertion":[{"value":"2022-03-21","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-08-15","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-09-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}