{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,2]],"date-time":"2025-10-02T00:55:49Z","timestamp":1759366549039,"version":"build-2065373602"},"reference-count":25,"publisher":"Association for Computing Machinery (ACM)","issue":"5s","funder":[{"name":"NSF","award":["CNS-2211509 and CNS-2211508"],"award-info":[{"award-number":["CNS-2211509 and CNS-2211508"]}]},{"DOI":"10.13039\/100007270","name":"University of Michigan","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100007270","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2025,11,30]]},"abstract":"<jats:p>\n            Computationally efficient, camera-based, real-time human position tracking on low-end, edge devices would enable numerous applications, including privacy-preserving video redaction and analysis. Unfortunately, running most deep neural network based models in real time requires expensive hardware, making widespread deployment difficult, particularly on edge devices. Shifting inference to the cloud increases the attack surface, generally requiring that users trust cloud servers, and increases demands on wireless networks in deployment venues. Our goal is to determine the extreme to which edge video redaction efficiency can be taken, with a particular interest in enabling, for the first time, low-cost, real-time deployments with inexpensive commodity hardware. We present an efficient solution to the human detection (and redaction) problem based on singular value decomposition (SVD) background removal and describe a novel time-efficient and energy-efficient sensor-fusion algorithm that leverages human position information in real-world coordinates to enable real-time visual human detection and tracking at the edge. 
These ideas are evaluated using a prototype built from (resource-constrained) commodity hardware representative of commonly used low-cost IoT edge devices. The speed and accuracy of the system are evaluated via a deployment study, and it is compared with the most advanced relevant alternatives. The multi-modal system operates at a frame rate ranging from 20\u00a0FPS to 60\u00a0FPS, achieves a\n            <jats:italic toggle=\"yes\">wIoU<\/jats:italic>\n            <jats:sub>0.3<\/jats:sub>\n            score (see Section\u00a0\n            <jats:xref ref-type=\"sec\">5.4<\/jats:xref>\n            ) ranging from 0.71 to 0.79, and successfully performs complete redaction of privacy-sensitive pixels with a success rate of 91%\u201399% in human head regions and 77%\u201391% in upper body regions, depending on the number of individuals present in the field of view. These results demonstrate that it is possible to achieve adequate efficiency to enable real-time redaction on inexpensive, commodity edge hardware.\n          <\/jats:p>","DOI":"10.1145\/3762994","type":"journal-article","created":{"date-parts":[[2025,8,27]],"date-time":"2025-08-27T11:51:58Z","timestamp":1756295518000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Efficient Video Redaction at the Edge: Human Motion Tracking for Privacy Protection"],"prefix":"10.1145","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-7444-3162","authenticated-orcid":false,"given":"Haotian","family":"Qiao","sequence":"first","affiliation":[{"name":"University of Michigan-Ann Arbor","place":["Ann Arbor, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-0286-5582","authenticated-orcid":false,"given":"Vidya","family":"Srinivas","sequence":"additional","affiliation":[{"name":"University of Washington","place":["Seattle, United 
States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5315-5987","authenticated-orcid":false,"given":"Peter","family":"Dinda","sequence":"additional","affiliation":[{"name":"Northwestern University","place":["Evanston, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5428-9530","authenticated-orcid":false,"given":"Robert","family":"Dick","sequence":"additional","affiliation":[{"name":"University of Michigan-Ann Arbor","place":["Ann Arbor, United States"]}]}],"member":"320","published-online":{"date-parts":[[2025,9,26]]},"reference":[{"key":"e_1_3_4_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICUWB.2008.4653361"},{"key":"e_1_3_4_3_2","doi-asserted-by":"publisher","unstructured":"Yeong-Jun Cho. 2024. Weighted Intersection over Union (wIoU) for evaluating image segmentation. Pattern Recognition Letters 185 (2024) 101\u2013107. DOI:10.1016\/j.patrec.2024.07.011","DOI":"10.1016\/j.patrec.2024.07.011"},{"key":"e_1_3_4_4_2","doi-asserted-by":"publisher","unstructured":"Mathias Ciliberto Vitor Fortes Rey Alberto Calatroni Paul Lukowicz and Daniel Roggen. 2021. Opportunity ++: A Multimodal Dataset for Video- and Wearable Object and Ambient Sensors-based Human Activity Recognition. DOI:10.21227\/vd6r-db31","DOI":"10.21227\/vd6r-db31"},{"key":"e_1_3_4_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCOMM.2021.3070311"},{"key":"e_1_3_4_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW59228.2023.00019"},{"key":"e_1_3_4_7_2","doi-asserted-by":"publisher","DOI":"10.56553\/popets-2024-0146"},{"key":"e_1_3_4_8_2","unstructured":"Matthew Ishige Yasuhiro Yoshimura and Ryo Yonetani. 2024. Opt-in camera: Person identification in video via UWB localization and its application to opt-in systems. arXiv:2409.19891. Retrieved from https:\/\/arxiv.org\/abs\/2409.19891 (2024)."},{"key":"e_1_3_4_9_2","volume-title":"Ultralytics YOLO","author":"Jocher Glenn","year":"2023","unstructured":"Glenn Jocher, Ayush Chaurasia, and Jing Qiu. 2023. 
Ultralytics YOLO."},{"key":"e_1_3_4_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2018.2841825"},{"key":"e_1_3_4_11_2","doi-asserted-by":"publisher","unstructured":"Jun Ha Lee and Su Jeong You. 2024. Balancing privacy and accuracy: Exploring the impact of data anonymization on deep learning models in computer vision. IEEE Access 12 (2024) 8346\u20138358. DOI:10.1109\/ACCESS.2024.3352146","DOI":"10.1109\/ACCESS.2024.3352146"},{"key":"e_1_3_4_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2015.2419084"},{"key":"e_1_3_4_13_2","volume-title":"Proceedings of the Computer Vision for AR\/VR at IEEE Computer Vision and Pattern Recognition","author":"Lugaresi Camillo","year":"2019","unstructured":"Camillo Lugaresi, Jiuqiang Tang, Hadon Nash, Chris McClanahan, Esha Uboweja, Michael Hays, Fan Zhang, Chuo-Ling Chang, Ming Yong, Juhyun Lee, Wan-Teh Chang, Wei Hua, Manfred Georg, and Matthias Grundmann. 2019. MediaPipe: A framework for perceiving and processing reality. In Proceedings of the Computer Vision for AR\/VR at IEEE Computer Vision and Pattern Recognition. 
Retrieved from https:\/\/mixedreality.cs.cornell.edu\/s\/NewTitle_May1_MediaPipe_CVPR_CV4ARVR_Workshop_2019.pdf"},{"key":"e_1_3_4_14_2","doi-asserted-by":"publisher","DOI":"10.3390\/s24206732"},{"key":"e_1_3_4_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICL-GNSS.2015.7217140"},{"key":"e_1_3_4_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3057838"},{"key":"e_1_3_4_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/34.868684"},{"key":"e_1_3_4_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSEN.2019.2935634"},{"key":"e_1_3_4_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2015.01.041"},{"key":"e_1_3_4_20_2","unstructured":"Fabian Pedregosa Ga\u00ebl Varoquaux Alexandre Gramfort Vincent Michel Bertrand Thirion Olivier Grisel Mathieu Blondel Peter Prettenhofer Ron Weiss Vincent Dubourg Jake Vanderplas Alexandre Passos David Cournapeau Matthieu Brucher Matthieu Perrot and \u00c9douard Duchesnay. 2011. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12 (2011) 2825\u20132830."},{"key":"e_1_3_4_21_2","doi-asserted-by":"publisher","DOI":"10.3390\/s22041394"},{"key":"e_1_3_4_22_2","unstructured":"Jonathon Shlens. 2014. A Tutorial on principal component analysis. 
Retrieved from https:\/\/arxiv.org\/abs\/1404.1100"},{"key":"e_1_3_4_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.1999.784637"},{"key":"e_1_3_4_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/ITOEC49072.2020.9141707"},{"key":"e_1_3_4_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/34.888718"},{"key":"e_1_3_4_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2004.1333992"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3762994","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,1]],"date-time":"2025-10-01T17:13:13Z","timestamp":1759338793000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3762994"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,26]]},"references-count":25,"journal-issue":{"issue":"5s","published-print":{"date-parts":[[2025,11,30]]}},"alternative-id":["10.1145\/3762994"],"URL":"https:\/\/doi.org\/10.1145\/3762994","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2025,9,26]]},"assertion":[{"value":"2025-08-13","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-13","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-26","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}