{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,10]],"date-time":"2025-11-10T13:35:43Z","timestamp":1762781743739,"version":"3.41.0"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2013,1,1]],"date-time":"2013-01-01T00:00:00Z","timestamp":1356998400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2013,1]]},"abstract":"<jats:p>Data mining is a new field of computer science with a wide range of applications. Its goal is to extract knowledge from massive datasets in a human-understandable structure, for example, the decision trees. In this article we present an innovative, high-performance, system-level architecture for the Classification And Regression Tree (CART) algorithm, one of the most important and widely used algorithms in the data mining area. Our proposed architecture exploits parallelism at the decision variable level, and was fully implemented and evaluated on a modern high-performance reconfigurable platform, the Convey HC-1 server, that features four FPGAs and a multicore processor. Our FPGA-based implementation was integrated with the widely used \u201crpart\u201d software library of the R project in order to provide the first fully functional reconfigurable system that can handle real-world large databases. The proposed system, named HC-CART system, achieves a performance speedup of up to two orders of magnitude compared to well-known single-threaded data mining software platforms, such as WEKA and the R platform. It also outperforms similar hardware systems which implement parts of the complete application by an order of magnitude. 
Finally, we show that the HC-CART system offers higher performance speedup than some other proposed parallel software implementations of decision tree construction algorithms.<\/jats:p>","DOI":"10.1145\/2400682.2400706","type":"journal-article","created":{"date-parts":[[2013,1,22]],"date-time":"2013-01-22T15:28:56Z","timestamp":1358868536000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":32,"title":["HC-CART"],"prefix":"10.1145","volume":"9","author":[{"given":"Grigorios","family":"Chrysos","sequence":"first","affiliation":[{"name":"Technical University of Crete, Greece"}]},{"given":"Panagiotis","family":"Dagritzikos","sequence":"additional","affiliation":[{"name":"Technical University of Crete, Greece"}]},{"given":"Ioannis","family":"Papaefstathiou","sequence":"additional","affiliation":[{"name":"Technical University of Crete, Greece"}]},{"given":"Apostolos","family":"Dollas","sequence":"additional","affiliation":[{"name":"Technical University of Crete, Greece"}]}],"member":"320","published-online":{"date-parts":[[2013,1,20]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Amado N. Gama J. and Silva F. 2001. Parallel implementation of decision tree learning algorithms. In Progress in Artificial Intelligence 34--52.","DOI":"10.1007\/3-540-45329-6_4"},{"key":"e_1_2_1_2_1","first-page":"230","article-title":"Comparative analysis of serial decision tree classification algorithms","volume":"3","author":"Anyanwu M. N.","year":"2009","unstructured":"Anyanwu, M. N. and Shiva, S. 2009. Comparative analysis of serial decision tree classification algorithms. Int. J. 
Comput. Sci. Secur. 3, 3, 230--240.","journal-title":"Int. J. Comput. Sci. Secur."},{"key":"e_1_2_1_3_1","unstructured":"Apache. Apache hadoop. http:\/\/hadoop.apache.org\/."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2005.31"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.5555\/1170135.1170441"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Bekkerman R. Bilenko M. and Langford J. 2011. Scaling Up Machine Learning: Parallel and Distributed Approaches. Cambridge University Press.","DOI":"10.1017\/CBO9781139042918"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/1756006.1756034"},{"key":"e_1_2_1_8_1","volume-title":"CART: Classification and Regression Trees","author":"Breiman L.","year":"1983","unstructured":"Breiman, L., Friedman, J., Stone, C. J., and Olshen, R. A. 1983. CART: Classification and Regression Trees. Wadsworth Press."},{"key":"e_1_2_1_9_1","unstructured":"Celis S. and Musicant D. R. 2002. WEKA-Parallel: Machine learning in parallel. Tech. rep. CS-TR Carleton College."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2011.82"},{"key":"e_1_2_1_11_1","unstructured":"Convey Computer Corporation. http:\/\/www.conveycomputer.com\/."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/360276.360311"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1565694.1565702"},{"key":"e_1_2_1_14_1","unstructured":"Frank A. 
and Asuncion A. 2010. UCI machine learning repository. http:\/\/archive.ics.uci.edu\/ml."},{"key":"e_1_2_1_15_1","unstructured":"Gehrke J. Ramakrishnan R. and Ganti V. 1998. RainForest: A framework for fast decision tree."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1656274.1656278"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the International Conference on Rough Set and Knowledge Technology.","volume":"6401","author":"He Q.","unstructured":"He, Q., Tan, Q., Ma, X.-D., and Shi, Z.-Z. 2010. The high-activity parallel implementation of data preprocessing based on mapreduce. In Proceedings of the International Conference on Rough Set and Knowledge Technology. Vol. 6401, 646--654."},{"volume-title":"Proceedings of the International Parallel Processing Symposium. 573--579","author":"Joshi M. V.","key":"e_1_2_1_18_1","unstructured":"Joshi, M. V., Karypis, G., and Kumar, V. 1998. ScalParC: A new scalable and efficient parallel classification algorithm for mining large datasets. In Proceedings of the International Parallel Processing Symposium. 
573--579."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1477942.1477949"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1542275.1542331"},{"key":"e_1_2_1_21_1","volume-title":"SLIQ: A fast scalable classifier for data mining. In Advances in Database Technology","author":"Mehta M.","year":"1996","unstructured":"Mehta, M., Agrawal, R., and Rissanen, J. 1996. SLIQ: A fast scalable classifier for data mining. In Advances in Database Technology. Springer, 18--32."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150531"},{"key":"e_1_2_1_23_1","first-page":"431","article-title":"Data mining and knowledge discovery","volume":"11","author":"Mikut R.","year":"2011","unstructured":"Mikut, R. and Reischl, M. 2011. Data mining and knowledge discovery. Wiley Interdis. Rev. 11, 431--445.","journal-title":"Wiley Interdis. Rev."},{"volume-title":"Proceedings of the Design, Automation and Test in Europe Conference and Exhibition. 45","author":"Narayanan R.","key":"e_1_2_1_24_1","unstructured":"Narayanan, R., Honbo, D., Memik, G., Choudhary, A., and Zambreno, J. 2007. An fpga implementation of decision tree classification. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition. 45."},{"volume-title":"Proc. VLDB Endow. 1426--1437","author":"Panda B.","key":"e_1_2_1_25_1","unstructured":"Panda, B., Herbach, J., Basu, S., and Bayardo, R. 2009. 
Planet: Massively parallel learning of tree ensembles with mapreduce. Proc. VLDB Endow. 1426--1437."},{"key":"e_1_2_1_26_1","unstructured":"Quinlan J. R. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann."},{"volume-title":"Proceedings of the International Conference on Field Programmable Technology. 337--340","author":"Papadonikolakis M.","key":"e_1_2_1_27_1","unstructured":"Papadonikolakis, M. and Bouganis, C. 2008. A scalable fpga architecture for non-linear svm training. In Proceedings of the International Conference on Field Programmable Technology. 337--340."},{"volume-title":"Proceedings of the International Conference on Field Programmable Technology. 388--391","author":"Papadonikolakis M.","key":"e_1_2_1_28_1","unstructured":"Papadonikolakis, M., Bouganis, C., and Constantinides, G. 2009. Performance comparison of gpu and fpga architectures for the svm training problem. In Proceedings of the International Conference on Field Programmable Technology. 388--391."},{"key":"e_1_2_1_29_1","unstructured":"R Project. http:\/\/www.r-project.org\/."},{"key":"e_1_2_1_30_1","unstructured":"Rexer Analytics. 
http:\/\/www.rexeranalytics.com."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11554-007-0055-8"},{"key":"e_1_2_1_32_1","unstructured":"Salford Systems. http:\/\/www.salford-systems.com\/"},{"volume-title":"Proceedings of the 22nd International Conference on Very Large Databases. 544--555","author":"Shafer J.","key":"e_1_2_1_33_1","unstructured":"Shafer, J., Agrawal, R., and Mehta, M. 1996. SPRINT: A scalable parallel classifier for data mining. In Proceedings of the 22nd International Conference on Very Large Databases. 544--555."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1723112.1723129"},{"volume-title":"Proceedings of the International Conference on Parallel Processing.","author":"Srivastava A.","key":"e_1_2_1_35_1","unstructured":"Srivastava, A., Han, E., Kumar, V., and Singh, V. 1998. Parallel formulations of decision-tree classification algorithms. In Proceedings of the International Conference on Parallel Processing."},{"volume-title":"Proceedings of the International Conference on Field Programmable Logic and Applications. 143--148","author":"Sun S.","key":"e_1_2_1_36_1","unstructured":"Sun, S. and Zambreno, J. 2008. Mining association rules with systolic trees. In Proceedings of the International Conference on Field Programmable Logic and Applications. 
143--148."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ReConFig.2008.80"},{"key":"e_1_2_1_38_1","unstructured":"Therneau T. M. and Grambsch P. M. 1997. An Introduction to Recursive Partitioning Using the RPART Routines. Mayo Foundation."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1963405.1963461"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2007.40"},{"key":"e_1_2_1_41_1","unstructured":"Wiley. Wiley interdisciplinary review. http:\/\/wires.wiley.com\/WileyCDA\/."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-007-0114-2"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0219622006002258"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2287016.2287027"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2400682.2400706","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2400682.2400706","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T08:18:52Z","timestamp":1750234732000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2400682.2400706"}},"subtitle":["A parallel system implementation of data mining classification and regression tree (CART) algorithm on a multi-FPGA 
system"],"short-title":[],"issued":{"date-parts":[[2013,1]]},"references-count":44,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2013,1]]}},"alternative-id":["10.1145\/2400682.2400706"],"URL":"https:\/\/doi.org\/10.1145\/2400682.2400706","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2013,1]]},"assertion":[{"value":"2012-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-01-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}