{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,3]],"date-time":"2025-12-03T17:30:41Z","timestamp":1764783041327,"version":"3.46.0"},"reference-count":83,"publisher":"Association for Computing Machinery (ACM)","issue":"3","funder":[{"name":"China NSF grant","award":["62025204"],"award-info":[{"award-number":["62025204"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Meas. Anal. Comput. Syst."],"published-print":{"date-parts":[[2025,12]]},"abstract":"<jats:p>Machine learning (ML) models are increasingly integrated into modern mobile apps to enable personalized and intelligent services. These models typically rely on rich input features derived from historical user behaviors to capture user intents. However, as ML-driven services become more prevalent, recording necessary user behavior data imposes substantial storage cost on mobile apps, leading to lower system responsiveness and more app uninstalls. To address this storage bottleneck, we present AdaLog, a lightweight and adaptive system designed to improve the storage efficiency of user behavior log in ML-embedded mobile apps, without compromising model inference accuracy or latency. We identify two key inefficiencies in current industrial practices of user behavior log: (i) redundant logging of overlapping behavior data across different features and models, and (ii) sparse storage caused by storing behaviors with heterogeneous attribute descriptions in a single log file. To solve these issues, AdaLog first formulates the elimination of feature-level redundant data as a maximum weighted matching problem in hypergraphs, and proposes a hierarchical algorithm for efficient on-device deployment. Then, AdaLog employs a virtually hashed attribute design to distribute heterogeneous behaviors into a few log files with physically dense storage. 
Finally, to ensure scalability to dynamic user behavior patterns, AdaLog designs an incremental update mechanism to minimize the I\/O operations needed for adapting outdated behavior log. We implement a prototype of AdaLog and deploy it into popular mobile apps in collaboration with our industry partner. Evaluations on real-world user data show that AdaLog reduces behavior log size by 19% to 44% with minimal system overhead (only 2 seconds latency and 15 MB memory usage), providing a more efficient data foundation for broader adoption of on-device ML.<\/jats:p>","DOI":"10.1145\/3771575","type":"journal-article","created":{"date-parts":[[2025,12,2]],"date-time":"2025-12-02T20:07:03Z","timestamp":1764706023000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Optimizing Storage Overhead of User Behavior Log for ML-embedded Mobile Apps"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0333-6418","authenticated-orcid":false,"given":"Chen","family":"Gong","sequence":"first","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-0702-5093","authenticated-orcid":false,"given":"Yan","family":"Zhuang","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5094-5331","authenticated-orcid":false,"given":"Zhenzhe","family":"Zheng","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-2775-3566","authenticated-orcid":false,"given":"Yiliu","family":"Chen","sequence":"additional","affiliation":[{"name":"ByteDance, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-5554-6207","authenticated-orcid":false,"given":"Sheng","family":"Wang","sequence":"additional","affiliation":[{"name":"ByteDance, Hangzhou, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0965-9058","authenticated-orcid":false,"given":"Fan","family":"Wu","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6934-1685","authenticated-orcid":false,"given":"Guihai","family":"Chen","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}]}],"member":"320","published-online":{"date-parts":[[2025,12,2]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2025. App uninstall report -- 2025 edition. https:\/\/www.appsflyer.com\/resources\/reports\/app-uninstall-benchmarks\/."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1142473.1142548"},{"volume-title":"ACM Internet Measurement Conference (IMC). 658-672","author":"Almeida M\u00e1rio","key":"e_1_2_1_3_1","unstructured":"M\u00e1rio Almeida, Stefanos Laskaridis, Abhinav Mehrotra, Lukasz Dudziak, Ilias Leontiadis, and Nicholas D. Lane. 2021. Smart at what cost?: characterising mobile deep neural networks in the wild. In ACM Internet Measurement Conference (IMC). 658-672."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00453-015-0062-2"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2994551.2994564"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.dam.2017.11.029"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3089801.3089804"},{"key":"e_1_2_1_8_1","volume-title":"ACM Conference on Embedded Networked Sensor Systems (Sensys). 155-168","author":"Yu-Han Chen Tiffany","year":"2015","unstructured":"Tiffany Yu-Han Chen, Lenin Ravindranath, Shuo Deng, Paramvir Bahl, and Hari Balakrishnan. 2015. Glimpse: Continuous, real-time object recognition on mobile devices. In ACM Conference on Embedded Networked Sensor Systems (Sensys). 
155-168."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2988450.2988454"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2959100.2959190"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611973105.25"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/356571.356573"},{"key":"e_1_2_1_13_1","unstructured":"Apple Developer. 2023. Maximum build file sizes. https:\/\/developer.apple.com\/help\/app-store-connect\/reference\/maximum-build-file-sizes\/."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3241539.3241559"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4842-1766-5"},{"key":"e_1_2_1_16_1","unstructured":"Apache Software Foundation. 2025. The smallest fastest columnar storage for Hadoop workloads. https:\/\/orc.apache.org\/."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/3554821.3554842"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3446662"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2843948"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3711896.3736823"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2024.3365534"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3636534.3690701"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3543507.3583426"},{"key":"e_1_2_1_24_1","unstructured":"Google. [n.d.]. Android Developers: APK Expansion Files. https:\/\/developer.android.com\/google\/play\/expansion-files."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3241539.3241557"},{"key":"e_1_2_1_26_1","doi-asserted-by":"crossref","unstructured":"Peizhen Guo and Wenjun Hu. 2018. Potluck: Cross-Application Approximate Deduplication for Computation-Intensive Mobile Applications. 
In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). 271-284.","DOI":"10.1145\/3173162.3173185"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3442381.3449942"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.5523"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3498361.3538932"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447993.3483274"},{"key":"e_1_2_1_31_1","doi-asserted-by":"crossref","unstructured":"Se Young Jung Jeong-Whun Kim Hee Hwang Keehyuck Lee Rong-Min Baek Ho-Young Lee Sooyoung Yoo Wongeun Song and Jong Soo Han. 2019. Development of comprehensive personal health records integrating patient-generated health data directly from Samsung S-Health and Apple Health apps: retrospective cross-sectional observational study. JMIR mHealth and uHealth 7 5 (2019) e12691.","DOI":"10.2196\/12691"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/WCNC.2013.6555309"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3077136.3080838"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3302424.3303950"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO50266.2020.00090"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/s12532-009-0002-8"},{"volume-title":"Using SQLite. ''O'Reilly Media","author":"Kreibich Jay","key":"e_1_2_1_37_1","unstructured":"Jay Kreibich. 2010. Using SQLite. ''O'Reilly Media, Inc.''."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3589263"},{"key":"e_1_2_1_39_1","volume-title":"USENIX Symposium on Operating Systems Design and Implementation (OSDI). 817-831","author":"Lai Fan","year":"2023","unstructured":"Fan Lai, Wei Zhang, Rui Liu, William Tsai, Xiaohan Wei, Yuxi Hu, Sabin Devkota, Jianyu Huang, Jongsoo Park, Xing Liu, et al. 2023. 
{AdaEmbed}: Adaptive embedding for {Large-Scale} recommendation models. In USENIX Symposium on Operating Systems Design and Implementation (OSDI). 817-831."},{"key":"e_1_2_1_40_1","volume-title":"DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices. In International Conference on Information Processing in Sensor Networks (IPSN). 23:1-23:12","author":"Lane Nicholas D.","year":"2016","unstructured":"Nicholas D. Lane, Sourav Bhattacharya, Petko Georgiev, Claudio Forlivesi, Lei Jiao, Lorena Qendro, and Fahim Kawsar. 2016. DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices. In International Conference on Information Processing in Sensor Networks (IPSN). 23:1-23:12."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3636534.3649391"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3387514.3405874"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2020.3027656"},{"key":"e_1_2_1_44_1","first-page":"111","article-title":"Sparse indexing: Large scale, inline deduplication using sampling and locality","volume":"9","author":"Lillibridge Mark","year":"2009","unstructured":"Mark Lillibridge, Kave Eshghi, Deepavali Bhagwat, Vinay Deolalikar, Greg Trezis, Peter Camble, et al. 2009. Sparse indexing: Large scale, inline deduplication using sampling and locality. In Fast, Vol. 9. 111-123.","journal-title":"Fast"},{"key":"e_1_2_1_45_1","first-page":"87","article-title":"Awq: Activation-aware weight quantization for on-device llm compression and acceleration","volume":"6","author":"Lin Ji","year":"2024","unstructured":"Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, and Song Han. 2024. Awq: Activation-aware weight quantization for on-device llm compression and acceleration. 
Proceedings of Machine Learning and Systems (MLsys) 6 (2024), 87-100.","journal-title":"Proceedings of Machine Learning and Systems (MLsys)"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/COMPSAC.2015.103"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2973750.2973752"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3696410.3714796"},{"key":"e_1_2_1_49_1","volume-title":"Latency-Aware Online Continual Learning for Non-Stationary Data Streams. In IEEE INFOCOM 2025-IEEE Conference on Computer Communications (INFOCOM). 1-10","author":"Liu Haibo","year":"2025","unstructured":"Haibo Liu, Da Huo, Zhenzhe Zheng, and Fan Wu. 2025. Latency-Aware Online Continual Learning for Non-Stationary Data Streams. In IEEE INFOCOM 2025-IEEE Conference on Computer Communications (INFOCOM). 1-10."},{"key":"e_1_2_1_50_1","volume-title":"From Non-IID to IID: Mobility-aware Hierarchical Federated Learning with Client-Edge Association Control","author":"Liu Haibo","year":"2025","unstructured":"Haibo Liu, Zhenzhe Zheng, Fan Wu, and Guihai Chen. 2025. From Non-IID to IID: Mobility-aware Hierarchical Federated Learning with Client-Edge Association Control. IEEE Transactions on Mobile Computing (TMC) (2025)."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3210240.3210337"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jda.2008.04.001"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/3295500.3356156"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5954"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMC.2018.2850026"},{"key":"e_1_2_1_56_1","first-page":"283","article-title":"Adaptive String Dictionary Compression in In-Memory Column-Store Database Systems","volume":"14","author":"M\u00fcller Ingo","year":"2014","unstructured":"Ingo M\u00fcller, Cornelius Ratsch, Franz F\u00e4rber, et al. 2014. 
Adaptive String Dictionary Compression in In-Memory Column-Store Database Systems. In EDBT, Vol. 14. 283-294.","journal-title":"EDBT"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1002\/widm.53"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3373376.3378534"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1089\/tmj.2015.0106"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eap.2022.08.010"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOTEH.2019.8717652"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824044"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMC.2022.3161114"},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.5555\/1639537.1639542"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/234313.234346"},{"key":"e_1_2_1_66_1","volume-title":"Revenue models, in-app purchase, and the app performance: Evidence from Apple's App Store and Google Play. Electronic commerce research and applications 17","author":"Roma Paolo","year":"2016","unstructured":"Paolo Roma and Daniele Ragaglia. 2016. Revenue models, in-app purchase, and the app performance: Evidence from Apple's App Store and Google Play. Electronic commerce research and applications 17 (2016), 173-190."},{"key":"e_1_2_1_67_1","volume-title":"Md Kafil Uddin, and Tawfeeq Alsanoosy.","author":"Sarker Iqbal H","year":"2021","unstructured":"Iqbal H Sarker, Mohammed Moshiul Hoque, Md Kafil Uddin, and Tawfeeq Alsanoosy. 2021. Mobile data science and intelligent apps: concepts, AI-based modeling and research directions. 
Mobile Networks and Applications (2021), 285-303."},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1145\/3643832.3661880"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2020.2978833"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1145\/3575693.3575722"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMC.2018.2871147"},{"key":"e_1_2_1_72_1","volume-title":"International Conference on Mobile Computing and Networking (MobiCom). 215-228","author":"Ding Shaohua","year":"2021","unstructured":"Manni Wang, Shaohua Ding, Ting Cao, Yunxin Liu, and Fengyuan Xu. 2021. AsyMo: scalable and efficient deep-learning inference on asymmetric mobile CPUs. In International Conference on Mobile Computing and Networking (MobiCom). 215-228."},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1145\/3498361.3538928"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1145\/3636534.3649379"},{"key":"e_1_2_1_75_1","volume-title":"AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments. In International Conference on Mobile Computing and Networking (MobiCom). 28:1-28:17","author":"Wen Hao","year":"2023","unstructured":"Hao Wen, Yuanchun Li, Zunshuai Zhang, Shiqi Jiang, Xiaozhou Ye, Ye Ouyang, Yaqin Zhang, and Yunxin Liu. 2023. AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments. In International Conference on Mobile Computing and Networking (MobiCom). 28:1-28:17."},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1145\/3495243.3560545"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1145\/3308558.3313591"},{"key":"e_1_2_1_78_1","volume-title":"USENIX Symposium on Operating Systems Design and Implementation (OSDI). 645-661","author":"Yeo Hyunho","year":"2018","unstructured":"Hyunho Yeo, Youngmok Jung, Jaehong Kim, Jinwoo Shin, and Dongsu Han. 2018. 
Neural adaptive content-aware internet video delivery. In USENIX Symposium on Operating Systems Design and Implementation (OSDI). 645-661."},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.1145\/3636534.3649361"},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","DOI":"10.1145\/3495243.3517016"},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1145\/3603269.3604825"},{"key":"e_1_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.1145\/3269206.3271776"},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.3390\/fi11020032"}],"container-title":["Proceedings of the ACM on Measurement and Analysis of Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3771575","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,3]],"date-time":"2025-12-03T17:26:22Z","timestamp":1764782782000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3771575"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12]]},"references-count":83,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,12]]}},"alternative-id":["10.1145\/3771575"],"URL":"https:\/\/doi.org\/10.1145\/3771575","relation":{},"ISSN":["2476-1249"],"issn-type":[{"type":"electronic","value":"2476-1249"}],"subject":[],"published":{"date-parts":[[2025,12]]},"assertion":[{"value":"2025-12-02","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}