{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T15:57:00Z","timestamp":1772726220781,"version":"3.50.1"},"reference-count":64,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2023,11,9]],"date-time":"2023-11-09T00:00:00Z","timestamp":1699488000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2023,11,30]]},"abstract":"<jats:p>We present SensiX++, a multi-tenant runtime for adaptive model execution with integrated MLOps on edge devices, e.g., a camera, a microphone, or IoT sensors. SensiX++ operates on two fundamental principles: highly modular componentisation to externalise data operations with clear abstractions and document-centric manifestation for system-wide orchestration. First, a data coordinator manages the lifecycle of sensors and serves models with correct data through automated transformations. Next, a resource-aware model server executes multiple models in isolation through model abstraction, pipeline automation, and feature sharing. An adaptive scheduler then orchestrates the best-effort executions of multiple models across heterogeneous accelerators, balancing latency and throughput. Finally, microservices with REST APIs serve synthesised model predictions, system statistics, and continuous deployment. Collectively, these components enable SensiX++ to serve multiple models efficiently with fine-grained control on edge devices while minimising data operation redundancy, managing data and device heterogeneity, and reducing resource contention. We benchmark SensiX++ with 10 different vision and acoustics models across various multi-tenant configurations on different edge accelerators (Jetson AGX and Coral TPU) designed for sensory devices. We report on the overall throughput and quantified benefits of various automation components of SensiX++ and demonstrate its efficacy in significantly reducing operational complexity and lowering the effort to deploy, upgrade, reconfigure, and serve embedded models on edge devices.<\/jats:p>","DOI":"10.1145\/3617507","type":"journal-article","created":{"date-parts":[[2023,9,7]],"date-time":"2023-09-07T11:29:12Z","timestamp":1694086152000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["SensiX++: Bringing MLOps and Multi-tenant Model Serving to Sensory Edge Devices"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5197-9840","authenticated-orcid":false,"given":"Chulhong","family":"Min","sequence":"first","affiliation":[{"name":"Nokia Bell Labs, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1475-3017","authenticated-orcid":false,"given":"Akhil","family":"Mathur","sequence":"additional","affiliation":[{"name":"Nokia Bell Labs, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7222-2145","authenticated-orcid":false,"given":"Utku G\u00fcnay","family":"Acer","sequence":"additional","affiliation":[{"name":"Nokia Bell Labs, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4444-6242","authenticated-orcid":false,"given":"Alessandro","family":"Montanari","sequence":"additional","affiliation":[{"name":"Nokia Bell Labs, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5057-9557","authenticated-orcid":false,"given":"Fahim","family":"Kawsar","sequence":"additional","affiliation":[{"name":"Nokia Bell Labs, UK"}]}],"member":"320","published-online":{"date-parts":[[2023,11,9]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"2021. BentoML. (2021). Retrieved August 8 2023 from https:\/\/www.bentoml.ai"},{"key":"e_1_3_2_3_2","unstructured":"2021. Coral Keyphrase Detector. (2021). Retrieved August 8 2023 from https:\/\/github.com\/google-coral\/project-keyword-spotter"},{"key":"e_1_3_2_4_2","unstructured":"2021. ElectrifAI. (2021). Retrieved August 8 2023 from https:\/\/electrifai.net"},{"key":"e_1_3_2_5_2","unstructured":"2021. Emotion Classification. (2021). Retrieved August 8 2023 from https:\/\/github.com\/Data-Science-kosta\/Speech-Emotion-Classification-with-PyTorch\/"},{"key":"e_1_3_2_6_2","unstructured":"2021. KubeFlow. (2021). Retrieved August 8 2023 from https:\/\/www.kubeflow.org"},{"key":"e_1_3_2_7_2","unstructured":"2021. LevelDB. (2021). Retrieved August 8 2023 from https:\/\/github.com\/google\/leveldb"},{"key":"e_1_3_2_8_2","unstructured":"2021. Michelangelo. (2021). Retrieved August 8 2023 from https:\/\/eng.uber.com\/michelangelo-machine-learning-platform\/"},{"key":"e_1_3_2_9_2","unstructured":"2021. SageMaker. (2021). Retrieved August 8 2023 from https:\/\/aws.amazon.com\/sagemaker\/"},{"key":"e_1_3_2_10_2","unstructured":"2021. YAMNet. (2021). Retrieved August 8 2023 from https:\/\/github.com\/tensorflow\/models\/tree\/master\/research\/audioset\/yamnet"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3534573"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/EAIS51927.2022.9787703"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3363347.3363363"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3132211.3134454"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3508396.3512870"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/3560905.3568512"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/2994551.2994564"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/2494091.2499576"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/FG.2018.00020"},{"key":"e_1_3_2_20_2","volume-title":"Proceedings of the 7th Biennial Conference on Innovative Data Systems Research (CIDR\u201915)","author":"Crankshaw Daniel","year":"2015","unstructured":"Daniel Crankshaw, Peter Bailis, Joseph E. Gonzalez, Haoyuan Li, Zhao Zhang, Michael J. Franklin, Ali Ghodsi, and Michael I. Jordan. 2015. The missing piece in complex analytics: Low latency, scalable model management and serving with velox. In Proceedings of the 7th Biennial Conference on Innovative Data Systems Research (CIDR\u201915). www.cidrdb.org. http:\/\/cidrdb.org\/cidr2015\/Papers\/CIDR15_Paper19u.pdf"},{"key":"e_1_3_2_21_2","first-page":"613","volume-title":"Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI\u201917)","author":"Crankshaw Daniel","year":"2017","unstructured":"Daniel Crankshaw, Xin Wang, Guilio Zhou, Michael J. Franklin, Joseph E. Gonzalez, and Ion Stoica. 2017. Clipper: A low-latency online prediction serving system. In Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI\u201917). USENIX Association, Boston, MA, 613\u2013627. https:\/\/www.usenix.org\/conference\/nsdi17\/technical-sessions\/presentation\/crankshaw"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3241539.3241559"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/2973750.2973777"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/2594368.2594383"},{"key":"e_1_3_2_25_2","unstructured":"Song Han Huizi Mao and William J. Dally. 2015. Deep compression: Compressing deep neural networks with pruning trained quantization and Huffman coding. https:\/\/arxiv.org\/abs\/1510.00149"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/2906388.2906396"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.1603.05027"},{"key":"e_1_3_2_28_2","doi-asserted-by":"crossref","unstructured":"Gao Huang Zhuang Liu Laurens van der Maaten and Kilian Q. Weinberger. 2018. Densely connected convolutional networks. https:\/\/arxiv.org\/abs\/1608.06993","DOI":"10.1109\/CVPR.2017.243"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2212.03332"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.632"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3498361.3538948"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3093337.3037698"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/MPRV.2018.03367740"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/AVSS.2018.8639121"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPSN.2016.7460664"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCOM.2010.5560598"},{"key":"e_1_3_2_37_2","unstructured":"Hao Li Asim Kadav Igor Durdanovic Hanan Samet and Hans Peter Graf. 2016. Pruning filters for efficient convNets. https:\/\/arxiv.org\/abs\/1608.08710"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3314404"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3210240.3210337"},{"key":"e_1_3_2_40_2","unstructured":"Zhuang Liu Mingjie Sun Tinghui Zhou Gao Huang and Trevor Darrell. 2018. Rethinking the value of network pruning. https:\/\/arxiv.org\/abs\/1810.05270"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3302506.3310398"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3081333.3081359"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPSN.2018.00048"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3341163.3347716"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMC.2022.3173914"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/3356250.3360043"},{"key":"e_1_3_2_47_2","volume-title":"Proceedings of KDD 2017","author":"Modi Akshay Naresh","year":"2017","unstructured":"Akshay Naresh Modi, Chiu Yuen Koo, Chuan Yu Foo, Clemens Mewald, Denis M. Baylor, Eric Breck, Heng-Tze Cheng, Jarek Wilkiewicz, Levent Koc, Lukasz Lew, Martin A. Zinkevich, Martin Wicke, Mustafa Ispir, Neoklis Polyzotis, Noah Fiedel, Salem Elie Haykal, Steven Whang, Sudip Roy, Sukriti Ramesh, Vihan Jain, Xin Zhang, and Zakaria Haque. 2017. TFX: A TensorFlow-based production-scale machine learning platform. In Proceedings of KDD 2017."},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/3341162.3349337"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1145\/3213526.3213532"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.14236\/ewic\/HCI2016.18"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/3384419.3430782"},{"key":"e_1_3_2_52_2","doi-asserted-by":"crossref","unstructured":"Arthur Moss Hyunjong Lee Lei Xun Chulhong Min Fahim Kawsar and Alessandro Montanari. 2022. Ultra-low power DNN accelerators for IoT: Resource characterization of the MAX78000. In Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems (SenSys\u201922) 934\u2013940.","DOI":"10.1145\/3560905.3568300"},{"key":"e_1_3_2_53_2","first-page":"20","volume-title":"NeurIPS Workshop on Systems for Machine Learning","author":"Narayanan Deepak","year":"2018","unstructured":"Deepak Narayanan, Keshav Santhanam, Amar Phanishayee, and Matei Zaharia. 2018. Accelerating deep learning workloads through efficient multi-model execution. In NeurIPS Workshop on Systems for Machine Learning. 20."},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.3390\/s16010115"},{"key":"e_1_3_2_55_2","article-title":"Challenges in deploying machine learning: A survey of case studies","volume":"2011","author":"Paleyes Andrei","year":"2020","unstructured":"Andrei Paleyes, Raoul-Gabriel Urma, and Neil D. Lawrence. 2020. Challenges in deploying machine learning: A survey of case studies. CoRR abs\/2011.09926 (2020). arxiv:2011.09926.https:\/\/arxiv.org\/abs\/2011.09926","journal-title":"CoRR"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/SCC53864.2021.00038"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.32"},{"key":"e_1_3_2_58_2","unstructured":"Joseph Redmon and Ali Farhadi. 2018. YOLOv3: An incremental improvement. https:\/\/arxiv.org\/abs\/1804.02767"},{"key":"e_1_3_2_59_2","doi-asserted-by":"crossref","unstructured":"Mark Sandler Andrew Howard Menglong Zhu Andrey Zhmoginov and Liang-Chieh Chen. 2019. MobileNetV2: Inverted residuals and linear bottlenecks. https:\/\/arxiv.org\/abs\/1801.04381","DOI":"10.1109\/CVPR.2018.00474"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2017.9"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.5555\/2969442.2969519"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.1512.00567"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCS.2017.226"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/3485730.3493453"},{"key":"e_1_3_2_65_2","unstructured":"Michael Zhu and Suyog Gupta. 2017. To prune or not to prune: Exploring the efficacy of pruning for model compression. https:\/\/arxiv.org\/abs\/1710.01878"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3617507","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3617507","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:45:58Z","timestamp":1750178758000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3617507"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,9]]},"references-count":64,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,11,30]]}},"alternative-id":["10.1145\/3617507"],"URL":"https:\/\/doi.org\/10.1145\/3617507","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,9]]},"assertion":[{"value":"2022-04-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-07-29","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-11-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}