{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T14:57:13Z","timestamp":1767193033530},"reference-count":31,"publisher":"MIT Press - Journals","issue":"4","content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,3,23]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>In this letter, we compare the representational power of random forests, binary decision diagrams (BDDs), and neural networks in terms of the number of nodes. We assume that an axis-aligned function on a single variable is assigned to each edge in random forests and BDDs, and the activation functions of neural networks are sigmoid, rectified linear unit, or similar functions. Based on existing studies, we show that for any random forest, there exists an equivalent depth-3 neural network with a linear number of nodes. We also show that for any BDD with balanced width, there exists an equivalent shallow depth neural network with a polynomial number of nodes. These results suggest that even shallow neural networks have the same or higher representation power than deep random forests and deep BDDs. We also show that in some cases, an exponential number of nodes are required to express a given random forest by a random forest with a much fewer number of trees, which suggests that many trees are required for random forests to represent some specific knowledge efficiently.<\/jats:p>","DOI":"10.1162\/neco_a_01486","type":"journal-article","created":{"date-parts":[[2022,3,2]],"date-time":"2022-03-02T00:51:25Z","timestamp":1646182285000},"page":"1019-1044","update-policy":"http:\/\/dx.doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":9,"title":["Comparison of the Representational Power of Random Forests, Binary Decision Diagrams, and Neural Networks"],"prefix":"10.1162","volume":"34","author":[{"given":"So","family":"Kumano","sequence":"first","affiliation":[{"name":"Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan soir22@icloud.com"}]},{"given":"Tatsuya","family":"Akutsu","sequence":"additional","affiliation":[{"name":"Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan takutsu@kuicr.kyoto-u.ac.jp"}]}],"member":"281","published-online":{"date-parts":[[2022,3,23]]},"reference":[{"key":"2022032817110803000_B1","doi-asserted-by":"publisher","first-page":"449","DOI":"10.1111\/j.1467-8640.2010.00366.x","article-title":"Decision trees do not generalize to new variations","volume":"26","author":"Bengio","year":"2010","journal-title":"Computational Intelligence"},{"key":"2022032817110803000_B2","first-page":"1","volume-title":"Large-scale kernel machines","author":"Bengio","year":"2007"},{"key":"2022032817110803000_B3","doi-asserted-by":"publisher","first-page":"347","DOI":"10.1007\/s13171-018-0133-y","article-title":"Neural random forests","volume":"81","author":"Biau","year":"2019","journal-title":"Sankhya A"},{"key":"2022032817110803000_B4","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Machine Learning"},{"key":"2022032817110803000_B5","author":"Breiman","year":"1984","journal-title":"Classification and regression trees"},{"key":"2022032817110803000_B6","doi-asserted-by":"publisher","first-page":"677","DOI":"10.1109\/TC.1986.1676819","article-title":"Graph-based algorithms for Boolean function manipulation","volume":"C-35","author":"Bryant","year":"1986","journal-title":"IEEE Transactions on Computers"},{"key":"2022032817110803000_B7","author":"Chatziafratis","year":"2020","journal-title":"Depth-width trade-offs for ReLU networks via Sharkovsky's theorem"},{"key":"2022032817110803000_B8","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1006\/jcom.1999.0519","article-title":"Complexity lower bounds for approximation algebraic computation trees","volume":"15","author":"Cucker","year":"1999","journal-title":"Journal of Complexity"},{"key":"2022032817110803000_B9","doi-asserted-by":"publisher","first-page":"443","DOI":"10.1016\/j.eswa.2019.07.012","article-title":"Measuring the shattering coefficient of decision tree models","volume":"137","author":"de Mello","year":"2019","journal-title":"Expert Systems with Applications"},{"key":"2022032817110803000_B10","first-page":"666","volume-title":"Advances in neural information processing systems","author":"Delalleau","year":"2011"},{"key":"2022032817110803000_B11","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1007\/s000370050010","article-title":"An exponential lower bound on the size of algebraic decision trees for Max.","volume":"7","author":"Grigoriev","year":"1998","journal-title":"Computational Complexity"},{"key":"2022032817110803000_B12","first-page":"2596","article-title":"Complexity of linear regions in deep networks.","author":"Hanin","year":"2019","journal-title":"Proceedings of the 36th International Conference on Machine Learning"},{"key":"2022032817110803000_B13","first-page":"278","article-title":"Random decision forests","author":"Ho","year":"1995","journal-title":"Proceedings of Third International Conference on Document Analysis and Recognition"},{"key":"2022032817110803000_B14","doi-asserted-by":"publisher","first-page":"359","DOI":"10.1016\/0893-6080(89)90020-8","article-title":"Multilayer feedforward networks are universal approximators","volume":"2","author":"Hornik","year":"1989","journal-title":"Neural Networks"},{"key":"2022032817110803000_B15","first-page":"195","article-title":"Pessimistic decision tree pruning based on tree size","author":"Mansour","year":"1997","journal-title":"Proceedings of the 14th International Conference on Machine Learning"},{"key":"2022032817110803000_B16","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-58940-9","author":"Meinel","year":"1998","journal-title":"Algorithms and data structures in VLSI-design: OBDD\u2014Foundations and applications"},{"key":"2022032817110803000_B17","first-page":"2924","volume-title":"Advances in neural information processing systems","author":"Montufar","year":"2014"},{"key":"2022032817110803000_B18","first-page":"807","article-title":"Rectified linear units improve restricted Boltzmann machines","author":"Nair","year":"2010","journal-title":"Proceedings of the 27th International Conference on Machine Learning"},{"key":"2022032817110803000_B19","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1007\/978-3-642-31537-4_13","article-title":"How many trees in a random forest?","author":"Oshiro","year":"2012","journal-title":"Proceedings of International Workshop on Machine Learning and Data Mining in Pattern Recognition"},{"key":"2022032817110803000_B20","first-page":"391","article-title":"Learning despite concept variation by finding structure in attribute-based data","author":"Perez","year":"1996","journal-title":"Proceedings of the 13th International Conference on Machine Learning"},{"key":"2022032817110803000_B21","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1007\/s11227-006-0010-7","article-title":"Binary decision diagrams and neural networks","volume":"39","author":"Prasad","year":"2007","journal-title":"Journal of Supercomputing"},{"key":"2022032817110803000_B22","first-page":"2847","article-title":"On the expressive power of deep neural networks","author":"Raghu","year":"2017","journal-title":"Proceedings of the 34th International Conference on Machine Learning"},{"key":"2022032817110803000_B23","doi-asserted-by":"publisher","first-page":"1605","DOI":"10.1109\/5.58346","article-title":"Entropy nets: From decision trees to neural networks","volume":"78","author":"Sethi","year":"1990","journal-title":"Proceedings of the IEEE"},{"key":"2022032817110803000_B24","author":"Shalev-Shwartz","year":"2014","journal-title":"Understanding machine learning: From theory to algorithms"},{"key":"2022032817110803000_B25","doi-asserted-by":"publisher","first-page":"1816","DOI":"10.1109\/TNNLS.2013.2296046","article-title":"Deep networks are effective encoders of period- icity","volume":"25","author":"Szymanski","year":"2014","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"2022032817110803000_B26","author":"Telgarsky","year":"2015","journal-title":"Representation benefits of deep feedforward networks"},{"key":"2022032817110803000_B27","first-page":"207","article-title":"Parametric exponential linear unit for deep convolutional neural networks","author":"Trottier","year":"2017","journal-title":"Proceedings of the 16th IEEE International Conference on Machine Learning and Applications"},{"key":"2022032817110803000_B28","first-page":"312","article-title":"Global data analysis and the fragmentation problem in decision tree induction","author":"Vilalta","year":"1997","journal-title":"Proceedings of the 9th European Conference on Machine Learning"},{"key":"2022032817110803000_B29","author":"Xu","year":"2018","journal-title":"Fast OBDD reordering using neural message passing on hypergraph"},{"key":"2022032817110803000_B30","doi-asserted-by":"publisher","first-page":"378","DOI":"10.1109\/TNNLS.2014.2385837","article-title":"VC-dimension of univariate decision trees","volume":"26","author":"Yildiz","year":"2015","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"2022032817110803000_B31","author":"Zhou","year":"2021","journal-title":"Trees, forests, chickens, and eggs: When and why to prune trees in a random forest"}],"container-title":["Neural Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/34\/4\/1019\/2003062\/neco_a_01486.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/34\/4\/1019\/2003062\/neco_a_01486.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,3,28]],"date-time":"2022-03-28T17:11:38Z","timestamp":1648487498000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/34\/4\/1019\/109661\/Comparison-of-the-Representational-Power-of-Random"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,23]]},"references-count":31,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2022,3,23]]},"published-print":{"date-parts":[[2022,3,23]]}},"URL":"https:\/\/doi.org\/10.1162\/neco_a_01486","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"value":"0899-7667","type":"print"},{"value":"1530-888X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,4]]},"published":{"date-parts":[[2022,3,23]]}}}