{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T08:54:45Z","timestamp":1775638485410,"version":"3.50.1"},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"9","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2024,5]]},"abstract":"<jats:p>\n            Matrix sketching, aimed at approximating a matrix\n            <jats:italic>A<\/jats:italic>\n            \u2208 R\n            <jats:sup>\n              <jats:italic>N\u00d7d<\/jats:italic>\n            <\/jats:sup>\n            consisting of vector streams of length\n            <jats:italic>N<\/jats:italic>\n            with a smaller sketching matrix\n            <jats:bold>\n              <jats:italic>B<\/jats:italic>\n            <\/jats:bold>\n            \u2208 R\n            <jats:italic>\n              <jats:sup>\u2113\u00d7d<\/jats:sup>\n              , \u2113\n            <\/jats:italic>\n            \u226a\n            <jats:italic>N<\/jats:italic>\n            , has garnered increasing attention in fields such as large-scale data analytics and machine learning. A well-known deterministic matrix sketching method is the FreqentDirections algorithm, which achieves the optimal [EQUATION] space bound and provides a covariance error guarantee of\n            <jats:italic>\u03b5<\/jats:italic>\n            = ||\n            <jats:bold>\n              <jats:italic>A<\/jats:italic>\n            <\/jats:bold>\n            <jats:sup>\u22a4<\/jats:sup>\n            <jats:bold>\n              <jats:italic>A<\/jats:italic>\n            <\/jats:bold>\n            -\n            <jats:bold>\n              <jats:italic>B<\/jats:italic>\n            <\/jats:bold>\n            <jats:sup>\u22a4<\/jats:sup>\n            <jats:italic>B<\/jats:italic>\n            ||\n            <jats:sub>2<\/jats:sub>\n            \/||\n            <jats:bold>\n              <jats:italic>A<\/jats:italic>\n            <\/jats:bold>\n            ||\n            <jats:sup>2<\/jats:sup>\n            <jats:sub>\n              <jats:italic>F.<\/jats:italic>\n            <\/jats:sub>\n            The matrix sketching problem becomes particularly interesting in the context of sliding windows, where the goal is to approximate the matrix\n            <jats:italic>\n              A\n              <jats:sub>W<\/jats:sub>\n            <\/jats:italic>\n            , formed by input vectors over the most recent\n            <jats:italic>N<\/jats:italic>\n            time units. However, despite recent efforts, whether achieving the optimal [EQUATION] space bound on sliding windows is possible has remained an open question.\n          <\/jats:p>\n          <jats:p>In this paper, we introduce the DS-FD algorithm, which achieves the optimal [EQUATION] space bound for matrix sketching over row-normalized, sequence-based sliding windows. We also present matching upper and lower space bounds for time-based and unnormalized sliding windows, demonstrating the generality and optimality of DS-FD across various sliding window models. This conclusively answers the open question regarding the optimal space bound for matrix sketching over sliding windows. We conduct extensive experiments with both synthetic and real-world datasets, validating our theoretical claims and thus confirming the correctness and effectiveness of our algorithm, both theoretically and empirically.<\/jats:p>","DOI":"10.14778\/3665844.3665847","type":"journal-article","created":{"date-parts":[[2024,8,6]],"date-time":"2024-08-06T22:19:07Z","timestamp":1722982747000},"page":"2149-2161","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Optimal Matrix Sketching over Sliding Windows"],"prefix":"10.14778","volume":"17","author":[{"given":"Hanyan","family":"Yin","sequence":"first","affiliation":[{"name":"Renmin University of China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dongxie","family":"Wen","sequence":"additional","affiliation":[{"name":"Renmin University of China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiajun","family":"Li","sequence":"additional","affiliation":[{"name":"Renmin University of China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhewei","family":"Wei","sequence":"additional","affiliation":[{"name":"Renmin University of China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiao","family":"Zhang","sequence":"additional","affiliation":[{"name":"Renmin University of China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zengfeng","family":"Huang","sequence":"additional","affiliation":[{"name":"Fudan University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Feifei","family":"Li","sequence":"additional","affiliation":[{"name":"Alibaba Group"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,8,6]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"A modified moving window dynamic PCA with fuzzy logic filter and application to fault detection. Chemometrics and intelligent laboratory systems 177","author":"Ammiche Mustapha","year":"2018","unstructured":"Mustapha Ammiche, Abdelmalek Kouadri, and Abderazak Bensmail. 2018. A modified moving window dynamic PCA with fuzzy logic filter and application to fault detection. Chemometrics and intelligent laboratory systems 177 (2018), 100--113."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. 286--296","author":"Arasu Arvind","year":"2004","unstructured":"Arvind Arasu and Gurmeet Singh Manku. 2004. Approximate counts and quantiles over sliding windows. In Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. 286--296."},{"key":"e_1_2_1_3_1","volume-title":"International Workshop on Approximation Algorithms for Combinatorial Optimization. Springer, 272--279","author":"Arora Sanjeev","year":"2006","unstructured":"Sanjeev Arora, Elad Hazan, and Satyen Kale. 2006. A fast random sampling algorithm for sparsifying matrices. In International Workshop on Approximation Algorithms for Combinatorial Optimization. Springer, 272--279."},{"key":"e_1_2_1_4_1","volume-title":"Krylov Methods are (nearly) Optimal for Low-Rank Approximation. arXiv preprint arXiv:2304.03191","author":"Bakshi Ainesh","year":"2023","unstructured":"Ainesh Bakshi and Shyam Narayanan. 2023. Krylov Methods are (nearly) Optimal for Low-Rank Approximation. arXiv preprint arXiv:2304.03191 (2023)."},{"key":"e_1_2_1_5_1","volume-title":"Brian Whitman, and Paul Lamere.","author":"Bertin-Mahieux Thierry","year":"2011","unstructured":"Thierry Bertin-Mahieux, Daniel PW Ellis, Brian Whitman, and Paul Lamere. 2011. The million song dataset. (2011)."},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","first-page":"687","DOI":"10.1137\/12086755X","article-title":"Near-optimal column-based matrix reconstruction","volume":"43","author":"Boutsidis Christos","year":"2014","unstructured":"Christos Boutsidis, Petros Drineas, and Malik Magdon-Ismail. 2014. Near-optimal column-based matrix reconstruction. SIAM J. Comput. 43, 2 (2014), 687--717.","journal-title":"SIAM J. Comput."},{"key":"e_1_2_1_7_1","volume-title":"2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS). IEEE, 517--528","author":"Braverman Vladimir","year":"2020","unstructured":"Vladimir Braverman, Petros Drineas, Cameron Musco, Christopher Musco, Jalaj Upadhyay, David P Woodruff, and Samson Zhou. 2020. Near optimal linear algebra in the online and sliding window models. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS). IEEE, 517--528."},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence. 4259--4265","author":"Chen Cheng","year":"2021","unstructured":"Cheng Chen, Luo Luo, Weinan Zhang, Yong Yu, and Yijiang Lian. 2021. Efficient and robust high-dimensional linear contextual bandits. In Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence. 4259--4265."},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3374750","article-title":"Real-time principal component analysis","volume":"1","author":"Chowdhury Ranak Roy","year":"2020","unstructured":"Ranak Roy Chowdhury, Muhammad Abdullah Adnan, and Rajesh K Gupta. 2020. Real-time principal component analysis. ACM Transactions on Data Science 1, 2 (2020), 1--36.","journal-title":"ACM Transactions on Data Science"},{"key":"e_1_2_1_10_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3019134","article-title":"Low-rank approximation and regression in input sparsity time","volume":"63","author":"Clarkson Kenneth L","year":"2017","unstructured":"Kenneth L Clarkson and David P Woodruff. 2017. Low-rank approximation and regression in input sparsity time. Journal of the ACM (JACM) 63, 6 (2017), 1--45.","journal-title":"Journal of the ACM (JACM)"},{"key":"e_1_2_1_11_1","volume-title":"Small summaries for big data","author":"Cormode Graham","unstructured":"Graham Cormode and Ke Yi. 2020. Small summaries for big data. Cambridge University Press."},{"key":"e_1_2_1_12_1","series-title":"SIAM journal on computing 31, 6","volume-title":"Maintaining stream statistics over sliding windows","author":"Datar Mayur","year":"2002","unstructured":"Mayur Datar, Aristides Gionis, Piotr Indyk, and Rajeev Motwani. 2002. Maintaining stream statistics over sliding windows. SIAM journal on computing 31, 6 (2002), 1794--1813."},{"key":"e_1_2_1_13_1","volume-title":"International Workshop on Approximation Algorithms for Combinatorial Optimization. Springer, 292--303","author":"Deshpande Amit","year":"2006","unstructured":"Amit Deshpande and Santosh Vempala. 2006. Adaptive sampling and fast low-rank matrix approximation. In International Workshop on Approximation Algorithms for Combinatorial Optimization. Springer, 292--303."},{"key":"e_1_2_1_14_1","volume-title":"Ridge Regression with Frequent Directions: Statistical and Optimization Perspectives. arXiv preprint arXiv:2011.03607","author":"Dickens Charlie","year":"2020","unstructured":"Charlie Dickens. 2020. Ridge Regression with Frequent Directions: Statistical and Optimization Perspectives. arXiv preprint arXiv:2011.03607 (2020)."},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1016\/j.ipl.2011.01.010","article-title":"A note on element-wise matrix sparsification via a matrix-valued Bernstein inequality","volume":"111","author":"Drineas Petros","year":"2011","unstructured":"Petros Drineas and Anastasios Zouzias. 2011. A note on element-wise matrix sparsification via a matrix-valued Bernstein inequality. Inform. Process. Lett. 111, 8 (2011), 385--389.","journal-title":"Inform. Process. Lett."},{"key":"e_1_2_1_16_1","volume-title":"Sketchy: Memory-efficient adaptive regularization with frequent directions. Advances in Neural Information Processing Systems 36","author":"Feinberg Vladimir","year":"2024","unstructured":"Vladimir Feinberg, Xinyi Chen, Y Jennifer Sun, Rohan Anil, and Elad Hazan. 2024. Sketchy: Memory-efficient adaptive regularization with frequent directions. Advances in Neural Information Processing Systems 36 (2024)."},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1145\/1083784.1083789","article-title":"Mining data streams: a review","volume":"34","author":"Gaber Mohamed Medhat","year":"2005","unstructured":"Mohamed Medhat Gaber, Arkady Zaslavsky, and Shonali Krishnaswamy. 2005. Mining data streams: a review. ACM Sigmod Record 34, 2 (2005), 18--26.","journal-title":"ACM Sigmod Record"},{"key":"e_1_2_1_18_1","volume-title":"International Conference on Algorithmic Learning Theory. Springer, 174--188","author":"Garivier Aur\u00e9lien","year":"2011","unstructured":"Aur\u00e9lien Garivier and Eric Moulines. 2011. On upper-confidence bound policies for switching bandit problems. In International Conference on Algorithmic Learning Theory. Springer, 174--188."},{"key":"e_1_2_1_19_1","doi-asserted-by":"crossref","first-page":"1762","DOI":"10.1137\/15M1009718","article-title":"Frequent directions: Simple and deterministic matrix sketching","volume":"45","author":"Ghashami Mina","year":"2016","unstructured":"Mina Ghashami, Edo Liberty, Jeff M Phillips, and David P Woodruff. 2016. Frequent directions: Simple and deterministic matrix sketching. SIAM J. Comput. 45, 5 (2016), 1762--1792.","journal-title":"SIAM J. Comput."},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. 290--297","author":"Lee Lap-Kei","year":"2006","unstructured":"Lap-Kei Lee and HF Ting. 2006. A simpler and more efficient deterministic scheme for finding frequent items over sliding windows. In Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. 290--297."},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. 581--588","author":"Liberty Edo","year":"2013","unstructured":"Edo Liberty. 2013. Simple and deterministic matrix sketching. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. 581--588."},{"key":"e_1_2_1_22_1","first-page":"1697","article-title":"Robust frequent directions with application in online learning","volume":"20","author":"Luo Luo","year":"2019","unstructured":"Luo Luo, Cheng Chen, Zhihua Zhang, Wu-Jun Li, and Tong Zhang. 2019. Robust frequent directions with application in online learning. The Journal of Machine Learning Research 20, 1 (2019), 1697--1737.","journal-title":"The Journal of Machine Learning Research"},{"key":"e_1_2_1_23_1","volume-title":"Finding repeated elements. Science of computer programming 2, 2","author":"Misra Jayadev","year":"1982","unstructured":"Jayadev Misra and David Gries. 1982. Finding repeated elements. Science of computer programming 2, 2 (1982), 143--152."},{"key":"e_1_2_1_24_1","volume-title":"Randomized block krylov methods for stronger and faster approximate singular value decomposition. Advances in neural information processing systems 28","author":"Musco Cameron","year":"2015","unstructured":"Cameron Musco and Christopher Musco. 2015. Randomized block krylov methods for stronger and faster approximate singular value decomposition. Advances in neural information processing systems 28 (2015)."},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","unstructured":"Shanmugavelayutham Muthukrishnan et al. 2005. Data streams: Algorithms and applications. Foundations and Trends\u00ae in Theoretical Computer Science 1 2 (2005) 117--236.","DOI":"10.1561\/0400000002"},{"key":"e_1_2_1_26_1","doi-asserted-by":"crossref","first-page":"2537","DOI":"10.1109\/TSG.2016.2559444","article-title":"Real-time multiple event detection and classification using moving window PCA","volume":"7","author":"Rafferty Mark","year":"2016","unstructured":"Mark Rafferty, Xueqin Liu, David M Laverty, and Sean McLoone. 2016. Real-time multiple event detection and classification using moving window PCA. IEEE Transactions on Smart Grid 7, 5 (2016), 2537--2548.","journal-title":"IEEE Transactions on Smart Grid"},{"key":"e_1_2_1_27_1","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1145\/1255443.1255449","article-title":"Sampling from large matrices: An approach through geometric functional analysis","volume":"54","author":"Rudelson Mark","year":"2007","unstructured":"Mark Rudelson and Roman Vershynin. 2007. Sampling from large matrices: An approach through geometric functional analysis. Journal of the ACM (JACM) 54, 4 (2007), 21--es.","journal-title":"Journal of the ACM (JACM)"},{"key":"e_1_2_1_28_1","volume-title":"Improved approximation algorithms for large matrices via random projections. In 2006 47th annual IEEE symposium on foundations of computer science (FOCS'06)","author":"Sarlos Tamas","unstructured":"Tamas Sarlos. 2006. Improved approximation algorithms for large matrices via random projections. In 2006 47th annual IEEE symposium on foundations of computer science (FOCS'06). IEEE, 143--152."},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1016\/j.jprocont.2017.03.004","article-title":"Fault detection using multiscale PCA-based moving window GLRT","volume":"54","author":"Sheriff M Ziyan","year":"2017","unstructured":"M Ziyan Sheriff, Majdi Mansouri, M Nazmul Karim, Hazem Nounou, and Mohamed Nounou. 2017. Fault detection using multiscale PCA-based moving window GLRT. Journal of Process Control 54 (2017), 47--64.","journal-title":"Journal of Process Control"},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the 2021 International Conference on Management of Data. 1623--1636","author":"Shi Benwei","year":"2021","unstructured":"Benwei Shi, Zhuoyue Zhao, Yanqing Peng, Feifei Li, and Jeff M Phillips. 2021. At-the-time and back-in-time persistent sketches. In Proceedings of the 2021 International Conference on Management of Data. 1623--1636."},{"key":"e_1_2_1_31_1","volume-title":"International Conference on Machine Learning. PMLR, 10465--10475","author":"Upadhyay Jalaj","year":"2021","unstructured":"Jalaj Upadhyay and Sarvagya Upadhyay. 2021. A framework for private matrix analysis in sliding window model. In International Conference on Machine Learning. PMLR, 10465--10475."},{"key":"e_1_2_1_32_1","volume-title":"The random projection method","author":"Vempala Santosh S","unstructured":"Santosh S Vempala. 2005. The random projection method. Vol. 65. American Mathematical Soc."},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the 2016 International Conference on Management of Data. 1465--1480","author":"Wei Zhewei","year":"2016","unstructured":"Zhewei Wei, Xuancheng Liu, Feifei Li, Shuo Shang, Xiaoyong Du, and Ji-Rong Wen. 2016. Matrix sketching over sliding windows. In Proceedings of the 2016 International Conference on Management of Data. 1465--1480."},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 26th annual international conference on machine learning. 1113--1120","author":"Weinberger Kilian","year":"2009","unstructured":"Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, and Josh Attenberg. 2009. Feature hashing for large scale multitask learning. In Proceedings of the 26th annual international conference on machine learning. 1113--1120."},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","unstructured":"David P Woodruff et al. 2014. Sketching as a tool for numerical linear algebra. Foundations and Trends\u00ae in Theoretical Computer Science 10 1--2 (2014) 1--157.","DOI":"10.1561\/0400000060"},{"key":"e_1_2_1_36_1","unstructured":"Hanyan Yin Dongxie Wen Jiajun Li Zhewei Wei Xiao Zhang Zengfeng Huang and Feifei Li. 2024. Optimal Matrix Sketching over Sliding Windows. arXiv:2405.07792 [cs.DB]"},{"key":"e_1_2_1_37_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3531053","article-title":"Persistent Summaries","volume":"47","author":"Zeng Tianjing","year":"2022","unstructured":"Tianjing Zeng, Zhewei Wei, Ge Luo, Ke Yi, Xiaoyong Du, and Ji-Rong Wen. 2022. Persistent Summaries. ACM Transactions on Database Systems (TODS) 47, 3 (2022), 1--42.","journal-title":"ACM Transactions on Database Systems (TODS)"},{"key":"e_1_2_1_38_1","volume-title":"2017 IEEE 33rd International Conference on Data Engineering (ICDE). IEEE, 833--844","author":"Zhang Haida","year":"2017","unstructured":"Haida Zhang, Zengfeng Huang, Zhewei Wei, Wenjie Zhang, and Xuemin Lin. 2017. Tracking matrix approximation over distributed sliding windows. In 2017 IEEE 33rd International Conference on Data Engineering (ICDE). IEEE, 833--844."},{"key":"e_1_2_1_39_1","volume-title":"2008 IEEE 24th International Conference on Data Engineering. IEEE, 1385--1387","author":"Zhang Linfeng","year":"2008","unstructured":"Linfeng Zhang and Yong Guan. 2008. Frequency estimation over sliding windows. In 2008 IEEE 24th International Conference on Data Engineering. IEEE, 1385--1387."},{"key":"e_1_2_1_40_1","volume-title":"Stream efficient learning. arXiv preprint arXiv:2305.02217","author":"Zhou Zhi-Hua","year":"2023","unstructured":"Zhi-Hua Zhou. 2023. Stream efficient learning. arXiv preprint arXiv:2305.02217 (2023)."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3665844.3665847","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,6]],"date-time":"2024-08-06T22:31:17Z","timestamp":1722983477000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3665844.3665847"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5]]},"references-count":40,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2024,5]]}},"alternative-id":["10.14778\/3665844.3665847"],"URL":"https:\/\/doi.org\/10.14778\/3665844.3665847","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2024,5]]},"assertion":[{"value":"2024-08-06","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}