{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,7]],"date-time":"2025-11-07T09:21:50Z","timestamp":1762507310304,"version":"3.41.0"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2018,1,30]],"date-time":"2018-01-30T00:00:00Z","timestamp":1517270400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"International Research Centres in Singapore Funding Initiative"},{"DOI":"10.13039\/501100001381","name":"National Research Foundation, Prime Minister's Office, Singapore","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001381","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2018,7,31]]},"abstract":"<jats:p>One critical deficiency of traditional online kernel learning methods is their unbounded and growing number of support vectors in the online learning process, making them inefficient and non-scalable for large-scale applications. Recent studies on scalable online kernel learning have attempted to overcome this shortcoming, e.g., by imposing a constant budget on the number of support vectors. Although they attempt to bound the number of support vectors at each online learning iteration, most of them fail to bound the number of support vectors for the final output hypothesis, which is often obtained by averaging the series of hypotheses over all the iterations. In this article, we propose a novel framework for bounded online kernel methods, named \u201cSparse Passive-Aggressive (SPA)\u201d learning, which is able to yield a final output kernel-based hypothesis with a bounded number of support vectors. Unlike the common budget maintenance strategy used by many existing budget online kernel learning approaches, the idea of our approach is to attain the bounded number of support vectors using an efficient stochastic sampling strategy that samples an incoming training example as a new support vector with a probability proportional to its loss suffered. We theoretically prove that SPA achieves an optimal mistake bound in expectation, and we empirically show that it outperforms various budget online kernel learning algorithms. Finally, in addition to general online kernel learning tasks, we also apply SPA to derive bounded online multiple-kernel learning algorithms, which can significantly improve the scalability of traditional Online Multiple-Kernel Classification (OMKC) algorithms while achieving satisfactory learning accuracy as compared with the existing unbounded OMKC algorithms.<\/jats:p>","DOI":"10.1145\/3156684","type":"journal-article","created":{"date-parts":[[2018,1,31]],"date-time":"2018-01-31T13:25:40Z","timestamp":1517405140000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":15,"title":["Sparse Passive-Aggressive Learning for Bounded Online Kernel Methods"],"prefix":"10.1145","volume":"9","author":[{"given":"Jing","family":"Lu","sequence":"first","affiliation":[{"name":"School of Information Systems, Singapore Management University, Singapore"}]},{"given":"Doyen","family":"Sahoo","sequence":"additional","affiliation":[{"name":"School of Information Systems, Singapore Management University, Singapore"}]},{"given":"Peilin","family":"Zhao","sequence":"additional","affiliation":[{"name":"School of Software Engineering, South China University of Technology, Guangzhou, China"}]},{"given":"Steven C. H.","family":"Hoi","sequence":"additional","affiliation":[{"name":"School of Information Systems, Singapore Management University, Singapore"}]}],"member":"320","published-online":{"date-parts":[[2018,1,30]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/1390681.1390721"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015424"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0925-2312(01)00643-9"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-007-5003-0"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2004.833339"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1961189.1961199"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2684822.2685305"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2835776.2835812"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1022627411411"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/1248547.1248566"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the Conference on Neural Information Processing Systems (NIPS\u201903)","volume":"2","author":"Crammer Koby","year":"2003"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1162\/jmlr.2003.3.4-5.951"},{"key":"e_1_2_1_13_1","unstructured":"Ofer Dekel. 2009. From online to batch learning with cutoff-averaging. In Adv. Neural Info. Process. Syst. 377--384. Ofer Dekel. 2009. From online to batch learning with cutoff-averaging. In Adv. Neural Info. Process. Syst. 377--384."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1137\/060666998"},{"key":"e_1_2_1_15_1","unstructured":"Ofer Dekel and Yoram Singer. 2005. Data-driven online to batch conversions. In Adv. Neural Info. Process. Syst. 267--274. Ofer Dekel and Yoram Singer. 2005. Data-driven online to batch conversions. In Adv. Neural Info. Process. Syst. 267--274."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1006\/jcss.1997.1504"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007662407062"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2021071"},{"key":"e_1_2_1_19_1","unstructured":"Thomas Hofmann Bernhard Sch\u00f6lkopf and Alexander J. Smola. 2008. Kernel methods in machine learning. Ann. Stat. (2008) 1171--1220. Thomas Hofmann Bernhard Sch\u00f6lkopf and Alexander J. Smola. 2008. Kernel methods in machine learning. Ann. Stat. (2008) 1171--1220."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-012-5319-2"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2627450"},{"volume-title":"Algorithmic Learning Theory","author":"Jin Rong","key":"e_1_2_1_22_1"},{"volume-title":"Proceedings of the Conference on Neural Information Processing Systems (NIPS\u201901)","author":"Kivinen Jyrki","key":"e_1_2_1_23_1"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/1005332.1005334"},{"volume-title":"Proceedings of the 2016 SIAM International Conference on Data Mining (SDM\u201916)","author":"Lu Jing","key":"e_1_2_1_25_1"},{"volume-title":"Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201910)","year":"2010","author":"Luo Jie","key":"e_1_2_1_26_1"},{"volume-title":"Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. 507--515","author":"Martins Andr\u00e9 F. T.","key":"e_1_2_1_27_1"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2010.5540137"},{"key":"e_1_2_1_29_1","first-page":"227","article-title":"Multi kernel learning with online-batch optimization","volume":"13","author":"Orabona Francesco","year":"2012","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.5555\/1577069.1755875"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1273496.1273594"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1037\/h0042519"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2623330.2623712"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.5555\/646257.685385"},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Bernhard Scholkopf and Alexander J. Smola. 2001. Learning with Kernels: Support Vector Machines Regularization Optimization and Beyond. MIT Press. Bernhard Scholkopf and Alexander J. Smola. 2001. Learning with Kernels: Support Vector Machines Regularization Optimization and Beyond. MIT Press.","DOI":"10.7551\/mitpress\/4175.001.0001"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1561\/2200000018"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1273496.1273598"},{"key":"e_1_2_1_38_1","unstructured":"Ohad Shamir and Tong Zhang. 2012. Stochastic gradient descent for non-smooth optimization: Convergence results and optimal averaging schemes. arXiv:1212.1824. Ohad Shamir and Tong Zhang. 2012. Stochastic gradient descent for non-smooth optimization: Convergence results and optimal averaging schemes. arXiv:1212.1824."},{"key":"e_1_2_1_39_1","doi-asserted-by":"crossref","unstructured":"John Shawe-Taylor and Nello Cristianini. 2004. Kernel Methods for Pattern Analysis. Cambridge University Press. John Shawe-Taylor and Nello Cristianini. 2004. Kernel Methods for Pattern Analysis. Cambridge University Press.","DOI":"10.1017\/CBO9780511809682"},{"key":"e_1_2_1_40_1","unstructured":"Alex J. Smola and Bernhard Sch\u00f6lkopf. 1998. Learning with Kernels. Citeseer. Alex J. Smola and Bernhard Sch\u00f6lkopf. 1998. Learning with Kernels. Citeseer."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.5555\/1248547.1248604"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2009.98"},{"key":"e_1_2_1_43_1","unstructured":"Vladimir Naumovich Vapnik and Vlamimir Vapnik. 1998. Statistical learning theory. Vol. 1. Wiley New York. Vladimir Naumovich Vapnik and Vlamimir Vapnik. 1998. Statistical learning theory. Vol. 1. Wiley New York."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553510"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.5555\/2503308.2503341"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972795.78"},{"volume-title":"Proceedings of the International Conference on Artificial Intelligence and Statistics. 908--915","year":"2010","author":"Wang Zhuang","key":"e_1_2_1_47_1"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.149"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-11-S11-S1"},{"volume-title":"Proceedings of the Conference on Artificial Intelligence, the Innovative Application of Artificial Intelligence (AAAI\u201912)","year":"2012","author":"Zhang Lijun","key":"e_1_2_1_50_1"},{"volume-title":"Proceedings of the 30th International Conference on Machine Learning (ICML\u201913)","year":"2013","author":"Zhang Lijun","key":"e_1_2_1_51_1"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2021051"},{"volume-title":"Proceedings of the International Conference on Machine Learning (ICML\u201912)","author":"Zhao Peilin","key":"e_1_2_1_53_1"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3156684","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3156684","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:38:44Z","timestamp":1750221524000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3156684"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,1,30]]},"references-count":53,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2018,7,31]]}},"alternative-id":["10.1145\/3156684"],"URL":"https:\/\/doi.org\/10.1145\/3156684","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"type":"print","value":"2157-6904"},{"type":"electronic","value":"2157-6912"}],"subject":[],"published":{"date-parts":[[2018,1,30]]},"assertion":[{"value":"2017-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-01-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}