{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:07:25Z","timestamp":1750306045247,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":46,"publisher":"ACM","license":[{"start":{"date-parts":[[2017,6,26]],"date-time":"2017-06-26T00:00:00Z","timestamp":1498435200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"DARPA SIMPLEX through SPAWAR","award":["N66001-15-C-4041"],"award-info":[{"award-number":["N66001-15-C-4041"]}]},{"name":"DARPA GRAPHS","award":["N66001-14-1-4028"],"award-info":[{"award-number":["N66001-14-1-4028"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2017,6,26]]},"DOI":"10.1145\/3078597.3078607","type":"proceedings-article","created":{"date-parts":[[2017,6,23]],"date-time":"2017-06-23T12:46:51Z","timestamp":1498222011000},"page":"67-78","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["knor"],"prefix":"10.1145","author":[{"given":"Disa","family":"Mhembere","sequence":"first","affiliation":[{"name":"Johns Hopkins University, Baltimore, MD, USA"}]},{"given":"Da","family":"Zheng","sequence":"additional","affiliation":[{"name":"Johns Hopkins University, Baltimore, MD, USA"}]},{"given":"Carey E.","family":"Priebe","sequence":"additional","affiliation":[{"name":"Johns Hopkins University, Baltimore, MD, USA"}]},{"given":"Joshua T.","family":"Vogelstein","sequence":"additional","affiliation":[{"name":"Johns Hopkins University, Baltimore, MD, USA"}]},{"given":"Randal","family":"Burns","sequence":"additional","affiliation":[{"name":"Johns Hopkins University, Baltimore, MD, 
USA"}]}],"member":"320","published-online":{"date-parts":[[2017,6,26]]},"reference":[{"key":"e_1_3_2_1_1_1","first-page":"332","volume-title":"Algorithmica","author":"Abello J.","year":"1998","unstructured":"J. Abello , A. L. Buchsbaum , and J. R. Westbrook . A functional approach to external graph algorithms . In Algorithmica , pages 332 -- 343 . Springer-Verlag , 1998 . J. Abello, A. L. Buchsbaum, and J. R. Westbrook. A functional approach to external graph algorithms. In Algorithmica, pages 332--343. Springer-Verlag, 1998."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/Co-HPC.2014.4"},{"key":"e_1_3_2_1_3_1","first-page":"35","volume-title":"Proceedings of KDD cup and workshop","volume":"2007","author":"Bennett J.","year":"2007","unstructured":"J. Bennett and S. Lanning . The netflix prize . In Proceedings of KDD cup and workshop , volume 2007 , page 35 , 2007 . J. Bennett and S. Lanning. The netflix prize. In Proceedings of KDD cup and workshop, volume 2007, page 35, 2007."},{"key":"e_1_3_2_1_4_1","volume-title":"Covariate assisted spectral clustering. arXiv preprint arXiv:1411.2158","author":"Binkiewicz N.","year":"2014","unstructured":"N. Binkiewicz , J. T. Vogelstein , and K. Rohe . Covariate assisted spectral clustering. arXiv preprint arXiv:1411.2158 , 2014 . N. Binkiewicz, J. T. Vogelstein, and K. Rohe. Covariate assisted spectral clustering. arXiv preprint arXiv:1411.2158, 2014."},{"key":"e_1_3_2_1_5_1","volume-title":"Mlpack: A scalable c++ machine learning library. Journal of Machine Learning Research, 14(Mar):801--805","author":"Curtin R. R.","year":"2013","unstructured":"R. R. Curtin , J. R. Cline , N. P. Slagle , W. B. March , P. Ram , N. A. Mehta , and A. G. Gray . Mlpack: A scalable c++ machine learning library. Journal of Machine Learning Research, 14(Mar):801--805 , 2013 . R. R. Curtin, J. R. Cline, N. P. Slagle, W. B. March, P. Ram, N. A. Mehta, and A. G. Gray. Mlpack: A scalable c++ machine learning library. 
Journal of Machine Learning Research, 14(Mar):801--805, 2013."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1242572.1242610"},{"key":"e_1_3_2_1_7_1","volume-title":"Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation -","volume":"6","author":"Dean J.","year":"2004","unstructured":"J. Dean and S. Ghemawat . MapReduce: Simplified data processing on large clusters . In Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation - Volume 6 , 2004 . J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation - Volume 6, 2004."},{"key":"e_1_3_2_1_8_1","series-title":"Series B (methodological)","first-page":"1","volume-title":"Maximum likelihood from incomplete data via the em algorithm. Journal of the royal statistical society","author":"Dempster A. P.","year":"1977","unstructured":"A. P. Dempster , N. M. Laird , and D. B. Rubin . Maximum likelihood from incomplete data via the em algorithm. Journal of the royal statistical society . Series B (methodological) , pages 1 -- 38 , 1977 . A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the em algorithm. Journal of the royal statistical society. Series B (methodological), pages 1--38, 1977."},{"key":"e_1_3_2_1_9_1","first-page":"579","volume-title":"Proceedings of the 32nd International Conference on Machine Learning (ICML-15)","author":"Ding Y.","year":"2015","unstructured":"Y. Ding , Y. Zhao , X. Shen , M. Musuvathi , and T. Mytkowicz . Yinyang k-means: A drop-in replacement of the classic k-means with consistent speedup . In Proceedings of the 32nd International Conference on Machine Learning (ICML-15) , pages 579 -- 587 , 2015 . Y. Ding, Y. Zhao, X. Shen, M. Musuvathi, and T. Mytkowicz. Yinyang k-means: A drop-in replacement of the classic k-means with consistent speedup. 
In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pages 579--587, 2015."},{"key":"e_1_3_2_1_10_1","volume-title":"Pattern classification and scene analysis","author":"Duda R. O.","year":"1973","unstructured":"R. O. Duda , P. E. Hart , Pattern classification and scene analysis , volume 3 . Wiley New York , 1973 . R. O. Duda, P. E. Hart, et al. Pattern classification and scene analysis, volume 3. Wiley New York, 1973."},{"key":"e_1_3_2_1_11_1","first-page":"147","volume-title":"ICML","volume":"3","author":"Elkan C.","year":"2003","unstructured":"C. Elkan . Using the triangle inequality to accelerate k-means . In ICML , volume 3 , pages 147 -- 153 , 2003 . C. Elkan. Using the triangle inequality to accelerate k-means. In ICML, volume 3, pages 147--153, 2003."},{"key":"e_1_3_2_1_12_1","volume-title":"USA","author":"Forum M. P.","year":"1994","unstructured":"M. P. Forum . Mpi : A message-passing interface standard. Technical report, Knoxville, TN , USA , 1994 . M. P. Forum. Mpi: A message-passing interface standard. Technical report, Knoxville, TN, USA, 1994."},{"key":"e_1_3_2_1_13_1","unstructured":"Friendster graph. https:\/\/archive.org\/download\/friendster-dataset-201107 Accessed 4\/18\/2014.  Friendster graph. https:\/\/archive.org\/download\/friendster-dataset-201107 Accessed 4\/18\/2014."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.5555\/795665.796479"},{"key":"e_1_3_2_1_15_1","volume-title":"Brown and Comp.","author":"Gauss C. F.","year":"1857","unstructured":"C. F. Gauss . Theory of the motion of the heavenly bodies moving about the sun in conic sections: a translation of Carl Frdr. Gauss\" Theoria motus\": With an appendix. By Ch. H. Davis. Little , Brown and Comp. , 1857 . C. F. Gauss. Theory of the motion of the heavenly bodies moving about the sun in conic sections: a translation of Carl Frdr. Gauss\" Theoria motus\": With an appendix. By Ch. H. Davis. 
Little, Brown and Comp., 1857."},{"key":"e_1_3_2_1_16_1","unstructured":"h2o. h2o. http:\/\/h2o.ai\/ 2005--2015.  h2o. h2o. http:\/\/h2o.ai\/ 2005--2015."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.18637\/jss.v050.i10"},{"key":"e_1_3_2_1_18_1","unstructured":"A. Inc. Amazon web services.  A. Inc. Amazon web services."},{"key":"e_1_3_2_1_19_1","unstructured":"G. Inc. Google cloud platform.  G. Inc. Google cloud platform."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1038\/ng1435"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/355841.355847"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1982.1056489"},{"key":"e_1_3_2_1_23_1","volume-title":"Graphlab: A new framework for parallel machine learning. arXiv preprint arXiv:1408.2041","author":"Low Y.","year":"2014","unstructured":"Y. Low , J. E. Gonzalez , A. Kyrola , D. Bickson , C. E. Guestrin , and J. Hellerstein . Graphlab: A new framework for parallel machine learning. arXiv preprint arXiv:1408.2041 , 2014 . Y. Low, J. E. Gonzalez, A. Kyrola, D. Bickson, C. E. Guestrin, and J. Hellerstein. Graphlab: A new framework for parallel machine learning. arXiv preprint arXiv:1408.2041, 2014."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.parco.2015.03.004"},{"key":"e_1_3_2_1_25_1","volume-title":"Community detection and classification in hierarchical stochastic blockmodels. arXiv preprint arXiv:1503.02115","author":"Lyzinski V.","year":"2015","unstructured":"V. Lyzinski , M. Tang , A. Athreya , Y. Park , and C. E. Priebe . Community detection and classification in hierarchical stochastic blockmodels. arXiv preprint arXiv:1503.02115 , 2015 . V. Lyzinski, M. Tang, A. Athreya, Y. Park, and C. E. Priebe. Community detection and classification in hierarchical stochastic blockmodels. 
arXiv preprint arXiv:1503.02115, 2015."},{"key":"e_1_3_2_1_26_1","volume-title":"Natick","author":"AB.","year":"2010","unstructured":"MATL AB. version 7.10.0 (R2010a). The MathWorks Inc ., Natick , Massachusetts , 2010 . MATLAB. version 7.10.0 (R2010a). The MathWorks Inc., Natick, Massachusetts, 2010."},{"key":"e_1_3_2_1_27_1","volume-title":"15th Workshop on Hot Topics in Operating Systems (HotOS XV)","author":"McSherry F.","year":"2015","unstructured":"F. McSherry , M. Isard , and D. G. Murray . Scalability! but at what cost ? In 15th Workshop on Hot Topics in Operating Systems (HotOS XV) , 2015 . F. McSherry, M. Isard, and D. G. Murray. Scalability! but at what cost? In 15th Workshop on Hot Topics in Operating Systems (HotOS XV), 2015."},{"key":"e_1_3_2_1_28_1","volume-title":"et al. Mllib: Machine learning in apache spark. arXiv preprint arXiv:1505.06807","author":"Meng X.","year":"2015","unstructured":"X. Meng , J. Bradley , B. Yavuz , E. Sparks , S. Venkataraman , D. Liu , J. Freeman , D. Tsai , M. Amde , S. Owen , et al. Mllib: Machine learning in apache spark. arXiv preprint arXiv:1505.06807 , 2015 . X. Meng, J. Bradley, B. Yavuz, E. Sparks, S. Venkataraman, D. Liu, J. Freeman, D. Tsai, M. Amde, S. Owen, et al. Mllib: Machine learning in apache spark. arXiv preprint arXiv:1505.06807, 2015."},{"key":"e_1_3_2_1_29_1","volume-title":"Mahout in action","author":"Owen S.","year":"2011","unstructured":"S. Owen , R. Anil , T. Dunning , and E. Friedman . Mahout in action . Manning Shelter Island , 2011 . S. Owen, R. Anil, T. Dunning, and E. Friedman. Mahout in action. Manning Shelter Island, 2011."},{"key":"e_1_3_2_1_30_1","volume-title":"Population structure and eigenanalysis","author":"Patterson N.","year":"2006","unstructured":"N. Patterson , A. L. Price , and D. Reich . Population structure and eigenanalysis . 2006 . N. Patterson, A. L. Price, and D. Reich. Population structure and eigenanalysis. 
2006."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2010.34"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078195"},{"volume-title":"R: A Language and Environment for Statistical Computing","author":"Core Team R","key":"e_1_3_2_1_33_1","unstructured":"R Core Team . R: A Language and Environment for Statistical Computing . R Foundation for Statistical Computing , Vienna, Austria , R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria,"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/0-387-25465-X_15"},{"key":"e_1_3_2_1_35_1","unstructured":"Scalability! but at what cost? http:\/\/www.frankmcsherry.org\/graph\/scalability\/ cost\/2015\/01\/15\/COST.html Accessed 9\/3\/2016.  Scalability! but at what cost? http:\/\/www.frankmcsherry.org\/graph\/scalability\/ cost\/2015\/01\/15\/COST.html Accessed 9\/3\/2016."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772862"},{"key":"e_1_3_2_1_37_1","first-page":"2375","volume-title":"Advances in neural information processing systems","author":"Shindler M.","year":"2011","unstructured":"M. Shindler , A. Wong , and A. W. Meyerson . Fast and accurate k-means for large datasets . In Advances in neural information processing systems , pages 2375 -- 2383 , 2011 . M. Shindler, A. Wong, and A. W. Meyerson. Fast and accurate k-means for large datasets. In Advances in neural information processing systems, pages 2375--2383, 2011."},{"key":"e_1_3_2_1_38_1","volume-title":"The anatomy of the facebook social graph. arXiv preprint arXiv:1111.4503","author":"Ugander J.","year":"2011","unstructured":"J. Ugander , B. Karrer , L. Backstrom , and C. Marlow . The anatomy of the facebook social graph. arXiv preprint arXiv:1111.4503 , 2011 . J. Ugander, B. Karrer, L. Backstrom, and C. Marlow. The anatomy of the facebook social graph. 
arXiv preprint arXiv:1111.4503, 2011."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.1250298"},{"key":"e_1_3_2_1_40_1","volume-title":"Semi-supervised k-means++. arXiv preprint arXiv:1602.00360","author":"Yoder J.","year":"2016","unstructured":"J. Yoder and C. E. Priebe . Semi-supervised k-means++. arXiv preprint arXiv:1602.00360 , 2016 . J. Yoder and C. E. Priebe. Semi-supervised k-means++. arXiv preprint arXiv:1602.00360, 2016."},{"key":"e_1_3_2_1_41_1","first-page":"2","volume-title":"Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation","author":"Zaharia M.","year":"2012","unstructured":"M. Zaharia , M. Chowdhury , T. Das , A. Dave , J. Ma , M. McCauley , M. J. Franklin , S. Shenker , and I. Stoica . Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing . In Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation , pages 2 -- 2 . USENIX Association , 2012 . M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, pages 2--2. USENIX Association, 2012."},{"key":"e_1_3_2_1_42_1","first-page":"10","article-title":"Spark: Cluster computing with working sets","volume":"10","author":"Zaharia M.","year":"2010","unstructured":"M. Zaharia , M. Chowdhury , M. J. Franklin , S. Shenker , and I. Stoica . Spark: Cluster computing with working sets . HotCloud , 10 : 10 -- 10 , 2010 . M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: Cluster computing with working sets. 
HotCloud, 10:10--10, 2010.","journal-title":"HotCloud"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-10665-1_71"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503225"},{"key":"e_1_3_2_1_45_1","volume-title":"13th USENIX Conference on File and Storage Technologies (FAST 15)","author":"Zheng D.","year":"2015","unstructured":"D. Zheng , D. Mhembere , R. Burns , J. Vogelstein , C. E. Priebe , and A. S. Szalay . FlashGraph: Processing billion-node graphs on an array of commodity SSDs . In 13th USENIX Conference on File and Storage Technologies (FAST 15) , 2015 . D. Zheng, D. Mhembere, R. Burns, J. Vogelstein, C. E. Priebe, and A. S. Szalay. FlashGraph: Processing billion-node graphs on an array of commodity SSDs. In 13th USENIX Conference on File and Storage Technologies (FAST 15), 2015."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-10665-1_62"}],"event":{"name":"HPDC '17: The 26th International Symposium on High-Performance Parallel and Distributed Computing","sponsor":["University of Arizona University of Arizona","SIGARCH ACM Special Interest Group on Computer Architecture","SIGHPC ACM Special Interest Group on High Performance Computing, Special Interest Group on High Performance Computing"],"location":"Washington DC USA","acronym":"HPDC '17"},"container-title":["Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed 
Computing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3078597.3078607","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3078597.3078607","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:03:15Z","timestamp":1750215795000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3078597.3078607"}},"subtitle":["A NUMA-Optimized In-Memory, Distributed and Semi-External-Memory k-means Library"],"short-title":[],"issued":{"date-parts":[[2017,6,26]]},"references-count":46,"alternative-id":["10.1145\/3078597.3078607","10.1145\/3078597"],"URL":"https:\/\/doi.org\/10.1145\/3078597.3078607","relation":{},"subject":[],"published":{"date-parts":[[2017,6,26]]},"assertion":[{"value":"2017-06-26","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}