{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T19:30:34Z","timestamp":1770751834287,"version":"3.50.0"},"reference-count":34,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2012,2,1]],"date-time":"2012-02-01T00:00:00Z","timestamp":1328054400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001868","name":"National Science Council Taiwan","doi-asserted-by":"publisher","award":["98-2221-E-002-136-MY3"],"award-info":[{"award-number":["98-2221-E-002-136-MY3"]}],"id":[{"id":"10.13039\/501100001868","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2012,2]]},"abstract":"<jats:p>Recent advances in linear classification have shown that for applications such as document classification, the training process can be extremely efficient. However, most of the existing training methods are designed by assuming that data can be stored in the computer memory. These methods cannot be easily applied to data larger than the memory capacity due to the random access to the disk. We propose and analyze a block minimization framework for data larger than the memory size. At each step a block of data is loaded from the disk and handled by certain learning methods. We investigate two implementations of the proposed framework for primal and dual SVMs, respectively. Because data cannot fit in memory, many design considerations are very different from those for traditional algorithms. We discuss and compare with existing approaches that are able to handle data larger than memory. 
Experiments using data sets 20 times larger than the memory demonstrate the effectiveness of the proposed method.<\/jats:p>","DOI":"10.1145\/2086737.2086743","type":"journal-article","created":{"date-parts":[[2012,1,31]],"date-time":"2012-01-31T14:49:20Z","timestamp":1328021360000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":31,"title":["Large Linear Classification When Data Cannot Fit in Memory"],"prefix":"10.1145","volume":"5","author":[{"given":"Hsiang-Fu","family":"Yu","sequence":"first","affiliation":[{"name":"National Taiwan University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cho-Jui","family":"Hsieh","sequence":"additional","affiliation":[{"name":"National Taiwan University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kai-Wei","family":"Chang","sequence":"additional","affiliation":[{"name":"National Taiwan University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chih-Jen","family":"Lin","sequence":"additional","affiliation":[{"name":"National Taiwan University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2012,2]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Bertsekas D. P. 1999. Nonlinear Programming 2nd Ed. Athena Scientific Belmont MA."},{"key":"e_1_2_1_2_1","unstructured":"Bottou L. 2007. Stochastic gradient descent examples. http:\/\/leon.bottou.org\/projects\/sgd."},{"key":"e_1_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Boyd S. and Vandenberghe L. 2004. Convex Optimization. 
Cambridge University Press.","DOI":"10.1017\/CBO9780511804441"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1018054314350"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1367497.1367554"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1961189.1961199"},{"key":"e_1_2_1_7_1","unstructured":"Chang E. Zhu K. Wang H. Bai H. Li J. Qiu Z. and Cui H. 2008. Parallelizing support vector machines on distributed computers. In Advances in Neural Information Processing Systems 20 J. Platt D. Koller Y. Singer and S. Roweis Eds. MIT Press Cambridge MA 257--264."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020517"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1013637720281"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/1390681.1442794"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1137\/S1052623400374379"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390208"},{"key":"e_1_2_1_13_1","volume-title":"Advances in Kernel Methods--Support Vector Learning","author":"Joachims T."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150429"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1401890.1401942"},{"key":"e_1_2_1_16_1","unstructured":"Langford J. Li L. and Strehl A. 2007. Vowpal Wabbit. 
https:\/\/github.com\/JohnLangford\/vowpal_wabbit\/wiki."},{"key":"e_1_2_1_17_1","first-page":"771","article-title":"Sparse online learning via truncated gradient","volume":"10","author":"Langford J.","year":"2009","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_2_1_18_1","unstructured":"Langford J. Smola A. and Zinkevich M. 2009b. Slow learners are fast. In Advances in Neural Information Processing Systems 22 Y. Bengio D. Schuurmans J. Lafferty C. K. I. Williams and A. Culotta Eds. 2331--2339."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772759"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00939948"},{"key":"e_1_2_1_21_1","unstructured":"Memisevic R. 2006. Dual optimization of conditional probability models. Tech. rep. Department of Computer Science University of Toronto."},{"key":"e_1_2_1_22_1","unstructured":"Morse Jr. K. G. 2005. Compression tools compared. Linux J."},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of Learning.","author":"P\u00e9rez-Cruz F."},{"key":"e_1_2_1_24_1","unstructured":"R\u00fcping S. 2000. mySVM---another one of those support vector machines. http:\/\/www-ai.cs.uni-dortmund.de\/SOFTWARE\/MYSVM\/."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1080\/10556780500140714"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10107-010-0420-4"},{"key":"e_1_2_1_27_1","unstructured":"Tong S. 2010. 
Lessons learned developing a practical large scale machine learning system. Google research blog. http:\/\/googleresearch.blogspot.com\/2010\/04\/lessons-learned-developing-practical.html."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/956750.956786"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1835804.1835910"},{"key":"e_1_2_1_30_1","volume-title":"JMLR Workshop and Conference Proceedings. To appear.","author":"Yu H.-F."},{"key":"e_1_2_1_31_1","volume-title":"Proc. IEEE. Submitted.","author":"Yuan G.-X."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1012498226479"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2009.29"},{"key":"e_1_2_1_34_1","unstructured":"Zinkevich M. Weimer M. Smola A. and Li L. 2010. Parallelized stochastic gradient descent. In Advances in Neural Information Processing Systems 23 J. Lafferty C. K. I. Williams J. Shawe-Taylor R. Zemel and A. Culotta Eds. 
2595--2603."}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2086737.2086743","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2086737.2086743","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T10:05:54Z","timestamp":1750241154000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2086737.2086743"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,2]]},"references-count":34,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2012,2]]}},"alternative-id":["10.1145\/2086737.2086743"],"URL":"https:\/\/doi.org\/10.1145\/2086737.2086743","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,2]]},"assertion":[{"value":"2011-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-02-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}